Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2023

Classification

Articles 1 - 30 of 48

Full-Text Articles in Physical Sciences and Mathematics

Comparison Of Support Vector Machine (SVM), K-Nearest Neighbor (K-NN), And Stochastic Gradient Descent (SGD) For Classifying Corn Leaf Disease Based On Histogram Of Oriented Gradients (HOG) Feature Extraction, Firdaus Solihin, Muhammad Syarief, Eka Mala Sari Rochman, Aeri Rachmad Dec 2023

Elinvo (Electronics, Informatics, and Vocational Education)

Image classification involves categorizing an image's pixels into specific classes based on their unique characteristics. It has diverse applications in everyday life. One such application is the classification of diseases on corn leaves. Corn is a widely consumed staple food in Indonesia, and healthy corn plants are crucial for meeting market demands. Currently, disease identification in corn plants relies on manual checks, which are time-consuming and less effective. This research aims to automate disease identification on corn leaves using the Support Vector Machine (SVM), K-Nearest Neighbor (K-NN) with K=2, and Stochastic Gradient Descent (SGD) algorithms. The classification process utilizes the …


Classification Of Beef And Pork Images Based On Color Features And Pseudo Nearest Neighbor Rule, Ahmad Awaluddin Baiti, Muhammad Fachrie, Saucha Diwandari Dec 2023

Elinvo (Electronics, Informatics, and Vocational Education)

This research is motivated by the need for halal foods in Muslim society with the purpose of avoiding non-halal foods, such as pork, that are sold in the market. Although beef and pork basically have different characteristics, not all Muslims know the differences. Moreover, people nowadays sell beef mixed with pork to obtain more profits. Hence, this paper proposed the implementation of the Pseudo-Nearest Neighbor Rule (PNNR) in classifying images of beef and pork slices based on color features. Based on the image dataset that has been collected, the very significant difference that can be identified visually between beef and …


Differentiation Of Human, Dog, And Cat Hair Fibers Using DART TOFMS And Machine Learning, Laura Ahumada, Erin R. Mcclure-Price, Chad Kwong, Edgard O. Espinoza, John Santerre Dec 2023

SMU Data Science Review

Hair is found in over 90% of crime scenes and has long been analyzed as trace evidence. However, recent reviews of traditional hair fiber analysis techniques, primarily morphological examination, have cast doubt on its reliability. To address these concerns, this study employed machine learning algorithms, specifically Linear Discriminant Analysis (LDA) and Random Forest, on Direct Analysis in Real Time time-of-flight mass spectra collected from human, cat, and dog hair samples. The objective was to develop a chemistry- and statistics-based classification method for unbiased taxonomic identification of hair. The results of the study showed that LDA and Random Forest were highly …


Habitat Heterogeneity In Nebraska Streams And Distribution Prediction For Tier-1 Cyprinids Using Multi-Scale Modeling Of Fluvial And Landscape Features, Connor P. Hart Dec 2023

School of Natural Resources: Dissertations, Theses, and Student Research

Multiscale environmental processes determine in-stream habitat conditions which drive species distributions. Habitat constitutes the physical template upon which ecological processes occur and species conduct life stage activities. Habitat heterogeneity promotes biodiversity of aquatic systems. Stream classification informs freshwater conservation by providing a useful framework to account for habitat heterogeneity, often based on landscape regions of similar environmental processes. A greater understanding of landscape-based classification frameworks as means to classify stream systems may improve understanding of drivers of biodiversity. Using Nebraska as a case study, on a statewide scale, objectives were 1) to characterize habitat availability for several at-risk fish species, …


Towards Long-Term Fairness In Sequential Decision Making, Yaowei Hu Dec 2023

Graduate Theses and Dissertations

With the development of artificial intelligence, automated decision-making systems are increasingly integrated into various applications, such as hiring, loans, education, recommendation systems, and more. These machine learning algorithms are expected to facilitate faster, more accurate, and impartial decision-making compared to human judgments. Nevertheless, these expectations are not always met in practice due to biased training data, leading to discriminatory outcomes. In contemporary society, countering discrimination has become a consensus among people, leading the EU and the US to enact laws and regulations that prohibit discrimination based on factors such as gender, age, race, and religion. Consequently, addressing algorithmic discrimination has …


Adversarially Reweighted Sequence Anomaly Detection With Limited Log Data, Kevin Vulcano Dec 2023

All Graduate Theses and Dissertations, Fall 2023 to Present

In the realm of safeguarding digital systems, the ability to detect anomalies in log sequences is paramount, with applications spanning cybersecurity, network surveillance, and financial transaction monitoring. This thesis presents AdvSVDD, a sophisticated deep learning model designed for sequence anomaly detection. Built upon the foundation of Deep Support Vector Data Description (Deep SVDD), AdvSVDD stands out by incorporating Adversarial Reweighted Learning (ARL) to enhance its performance, particularly when confronted with limited training data. By leveraging the Deep SVDD technique to map normal log sequences into a hypersphere and harnessing the amplification effects of Adversarial Reweighted Learning, AdvSVDD demonstrates remarkable efficacy …


Classification Of Chronic Pain Using fMRI Data: Unveiling Brain Activity Patterns For Diagnosis, Rejula V, Anitha J, Belfin Robinson Oct 2023

Turkish Journal of Electrical Engineering and Computer Sciences

Millions of people throughout the world suffer from the complicated and crippling condition of chronic pain. It can be brought on by several underlying disorders or injuries and is defined as pain that persists for more than three months. Machine learning is a potent technology that can be applied in functional magnetic resonance imaging (fMRI) chronic pain research to better understand the brain processes behind pain and to create prediction models for pain-related outcomes. Data (fMRI and T1-weighted images) from 76 participants were included (30 with chronic pain and 46 healthy controls). The raw data were preprocessed using fMRIprep …


Deep Feature Extraction, Dimensionality Reduction, And Classification Of Medical Images Using Combined Deep Learning Architectures, Autoencoder, And Multiple Machine Learning Models, Ahmet Hidayet Kiraz, Fatime Oumar Djibrillah, Mehmet Emin Yüksel Oct 2023

Turkish Journal of Electrical Engineering and Computer Sciences

Accurate analysis and classification of medical images are essential factors in clinical decision-making and patient care. A novel comparative approach for medical image classification is proposed in this study. This new approach involves several steps: deep feature extraction, which extracts the informative features from medical images; concatenation, which concatenates the extracted deep features to form a robust feature vector; dimensionality reduction with autoencoder, which reduces the dimensionality of the feature vector by transforming it into a different feature space with a lower dimension; and finally, these features obtained from all these steps were fed into multiple machine learning classifiers (SVM, …


Cognitive Digital Modelling For Hyperspectral Image Classification Using Transfer Learning Model, Mohammad Shabaz, Mukesh Soni Oct 2023

Turkish Journal of Electrical Engineering and Computer Sciences

Deep convolutional neural networks can fully use the intrinsic relationship between features and improve the separability of hyperspectral images, which has received extensive attention in recent years. However, the need for a large number of labelled samples to train deep network models limits the application of such methods. The idea of transfer learning is introduced into remote sensing image classification to reduce the number of labelled samples needed. In particular, the situation in which each class in the target picture only has one labelled sample is investigated. In the target domain, the number of training samples is enlarged by …


Stepwise Dynamic Nearest Neighbor (SDNN): A New Algorithm For Classification, Deniz Karabaş, Derya Birant, Pelin Yildirim Taşer Sep 2023

Turkish Journal of Electrical Engineering and Computer Sciences

Although the standard k-nearest neighbor (KNN) algorithm has been used widely for classification in many different fields, it suffers from various limitations that abate its classification ability, such as being influenced by the distribution of instances, ignoring distances between the test instance and its neighbors during classification, and building a single/weak learner. This paper proposes a novel algorithm, called stepwise dynamic nearest neighbor (SDNN), which can effectively handle these problems. Instead of using a fixed parameter k like KNN, it uses a dynamic neighborhood strategy according to the data distribution and implements a new voting mechanism, called stepwise voting. Experimental …
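One of the KNN limitations named above, ignoring the distances between the test instance and its neighbors, can be illustrated with a small sketch. This is not the SDNN algorithm or its stepwise voting, whose details are not given in this excerpt. It only contrasts plain majority voting with a generic inverse-distance-weighted vote. The function names and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def k_nearest(train, query, k):
    """Return the k training items closest to the query (Euclidean distance)."""
    return sorted(train, key=lambda item: math.dist(item[0], query))[:k]


def knn_majority(train, query, k):
    """Standard KNN: every neighbor gets one vote, regardless of distance."""
    neighbors = k_nearest(train, query, k)
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


def knn_distance_weighted(train, query, k, eps=1e-9):
    """Generic alternative: each neighbor votes with weight 1 / distance."""
    weights = defaultdict(float)
    for point, label in k_nearest(train, query, k):
        weights[label] += 1.0 / (math.dist(point, query) + eps)
    return max(weights, key=weights.get)


# Hypothetical toy data: one "a" point close to the query, two "b" points far away.
train = [((0.0, 0.0), "a"), ((3.0, 0.0), "b"), ((3.2, 0.0), "b")]
query = (0.5, 0.0)

majority_label = knn_majority(train, query, k=3)
weighted_label = knn_distance_weighted(train, query, k=3)
```

With k=3 the two distant "b" points outvote the single nearby "a" point under majority voting, while the distance-weighted vote favors "a". The fixed k is also what a dynamic neighborhood strategy would replace.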


A Machine Learning Approach For Dyslexia Detection Using Turkish Audio Records, Tuğberk Taş, Muhammed Abdullah Bülbül, Abas Haşimoğlu, Yavuz Meral, Yasin Çalişkan, Gunay Budagova, Mücahid Kutlu Sep 2023

Turkish Journal of Electrical Engineering and Computer Sciences

Dyslexia is a learning disorder, characterized by impairment in the ability to read, spell, and decode letters. It is vital to detect dyslexia in earlier stages to reduce its effects. However, diagnosing dyslexia is a time-consuming and costly process. In this paper, we propose a machine-learning model that predicts whether a Turkish-speaking child has dyslexia using his/her audio records. Therefore, our model can be easily used by smart phones and work as a warning system such that children who are likely to be dyslexic according to our model can seek an examination by experts. In order to train and evaluate, …


Reu-Deim Classification Of Hispanic Voters In Hispanic Groups Using Name And Zip Code Data In Palm Beach, Florida, Kamila Soto-Ortiz Sep 2023

Beyond: Undergraduate Research Journal

When it comes to registering to vote, Hispanic voters can only register as “Hispanic” in the “Race/Ethnicity” category, causing difficulties when analyzing voting trends amongst the Hispanic community. Motivated by the recent observation that not all Hispanic Groups vote the same, the goal is to create a model that can identify a voter’s likely Hispanic Group from the information provided in the public Florida voter file. This is accomplished using name and zip code data for all voters in Palm Beach, Florida. This paper will explore the model implemented, its findings, and its limitations. Palm Beach, Florida, is met with low confidence …


Compatibility Of Clique Clustering Algorithm With Dimensionality Reduction, Uğur Madran, Duygu Soyoğlu Sep 2023

Applied Mathematics & Information Sciences

In our previous work, we introduced a clustering algorithm based on clique formation. Cliques, the obtained clusters, are constructed by choosing the most dense complete subgraphs by using similarity values between instances. The clique algorithm successfully reduces the number of instances in a data set without substantially changing the accuracy rate. In this current work, we focused on reducing the number of features. For this purpose, the effect of the clique clustering algorithm on dimensionality reduction has been analyzed. We propose a novel algorithm for support vector machine classification by combining these two techniques and applying different strategies by differentiating …


Fine-Grained In-Context Permission Classification For Android Apps Using Control-Flow Graph Embedding, Vikas Kumar Malviya, Naing Tun Yan, Chee Wei Leow, Ailys Xynyn Tee, Lwin Khin Shar, Lingxiao Jiang Sep 2023

Research Collection School Of Computing and Information Systems

Android is the most popular operating system for mobile devices nowadays. Permissions are a very important part of Android security architecture. Apps frequently need the users’ permission, but many of them only ask for it once (when the user uses the app for the first time) and then they keep and abuse the given permissions. The desire to enhance Android permission security and the protection of users’ private data is the driving factor behind our approach, which explores fine-grained, context-sensitive permission usage analysis and thereby identifies misuses in Android apps. In this work, we propose an approach for classifying the fine-grained permission uses for each …


Intrusion Detection: Machine Learning Techniques For Software Defined Networks, Jacob S. Rodriguez Aug 2023

Masters Theses

In recent years, software defined networking (SDN) has gained popularity as a novel approach towards network management and architecture. Compared to traditional network architectures, this software-based approach offers greater flexibility, programmability, and automation. However, despite the advantages of this system, there still remains the possibility that it could be compromised. As we continue to explore new approaches to network management, we must also develop new ways of protecting those systems from threats. Throughout this paper, I will describe and test a network intrusion detection system (NIDS), and how it can be implemented within a software defined network. This system will …


Improving XRD Analysis With Machine Learning, Rachel E. Drapeau Aug 2023

Theses and Dissertations

X-ray diffraction analysis (XRD) is an inexpensive method to quantify the relative proportions of mineral phases in a rock or soil sample. However, the analytical software available for XRD requires extensive user input to choose phases to include in the analysis. Consequently, analysis accuracy depends greatly on the experience of the analyst, especially as the number of phases in a sample increases (Raven & Self, 2017; Omotoso, 2006). The purpose of this project is to test whether incorporating machine learning methods into XRD software can improve the accuracy of analyses by assisting in the phase-picking process. In order to provide …


Comparative Study Of Supervised Classification Techniques With A Modified KNN Algorithm, Noah Owusu Aug 2023

Open Access Theses & Dissertations

The goal of classification is to develop a model that can be used to accurately assign new observations to labeled classes based on the patterns learned from the training data. The K-nearest neighbors (KNN) algorithm is a popular and widely used algorithm for classification; however, its performance can be adversely affected by the presence of outliers in a dataset. In this study, we modified the existing KNN algorithm to alleviate the effect of outliers in a dataset, thereby improving its performance. We compared the performances of the Modified KNN method and the Existing KNN algorithm …


Mathematics Behind Machine Learning, Rim Hammoud Aug 2023

Electronic Theses, Projects, and Dissertations

Artificial intelligence (AI) is a broad field of study that involves developing intelligent machines that can perform tasks that typically require human intelligence. Machine learning (ML) is often used as a tool to help create AI systems. The goal of ML is to create models that can learn and improve to make predictions or decisions based on given data. The goal of this thesis is to build a clear and rigorous exposition of the mathematical underpinnings of support vector machines (SVM), a popular platform used in ML. As we will explore later on in the thesis, SVM can be implemented …


Deep Learning-Based Diagnosis Of Disease Activity In Patients With Graves’ Orbitopathy Using Orbital SPECT/CT, Ni Yao, Longxi Li, Zhengyuan Gao, Chen Zhao, Yanting Li, Chuang Han, Jiaofen Nan, Zelin Zhu, Yi Xiao, Fubao Zhu, Min Zhao, Weihua Zhou Jul 2023

Michigan Tech Publications

Purpose: Orbital [99mTc]TcDTPA single-photon emission computed tomography (SPECT)/CT is an important method for assessing inflammatory activity in patients with Graves’ orbitopathy (GO). However, interpreting the results requires substantial physician workload. We aim to propose an automated method called GO-Net to detect inflammatory activity in patients with GO. Materials and methods: GO-Net had two stages: (1) a semantic V-Net segmentation network (SV-Net) that extracts extraocular muscles (EOMs) in orbital CT images and (2) a convolutional neural network (CNN) that uses SPECT/CT images and the segmentation results to classify inflammatory activity. A total of 956 eyes from 478 patients with GO …


A Practical Framework For Early Detection Of Diabetes Using Ensemble Machine Learning Models, Qusay Saihood, Emrullah Sonuç Jul 2023

Turkish Journal of Electrical Engineering and Computer Sciences

The diagnosis of diabetes, a prevalent global health condition, is crucial for preventing severe complications. In recent years, there has been a growing effort to develop intelligent diagnostic systems for diabetes utilizing machine learning (ML) algorithms. Despite these efforts, achieving high accuracy rates using such systems remains a significant challenge. Recent advancements in ensemble ML methods offer promising opportunities for early detection of diabetes, as they are known to be faster and more cost-effective than traditional approaches. Therefore, this study proposes a practical framework for diagnosing diabetes that involves three stages. The data preprocessing stage encompasses several crucial tasks, including …


Sarcasm Detection In English And Arabic Tweets Using Transformer Models, Rishik Lad Jun 2023

Computer Science Senior Theses

This thesis describes our approach toward the detection of sarcasm and its various types in English and Arabic Tweets through methods in deep learning. There are five problems we attempted: (1) detection of sarcasm in English Tweets, (2) detection of sarcasm in Arabic Tweets, (3) determining the type of sarcastic speech subcategory for English Tweets, (4) determining which of two semantically equivalent English Tweets is sarcastic, and (5) determining which of two semantically equivalent Arabic Tweets is sarcastic. All tasks were framed as classification problems, and our contributions are threefold: (a) we developed an English binary classifier system with RoBERTa, …


Why Softmax? Because It Is The Only Consistent Approach To Probability-Based Classification, Anatole Lokshin, Vladik Kreinovich Jun 2023

Departmental Technical Reports (CS)

In many practical problems, the most effective classification techniques are based on deep learning. In this approach, once the neural network generates values corresponding to different classes, these values are transformed into probabilities by using the softmax formula. Researchers tried other transformations, but they did not work as well as softmax. A natural question is: why is softmax so effective? In this paper, we provide a possible explanation for this effectiveness: namely, we prove that softmax is the only consistent approach to probability-based classification. In precise terms, it is the only approach for which two reasonable probability-based ideas -- Least …
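For readers unfamiliar with the transformation mentioned in this abstract, the standard softmax formula maps scores z_1, ..., z_n to probabilities p_i = exp(z_i) / (exp(z_1) + ... + exp(z_n)). The sketch below is a minimal illustration of that textbook formula only, not of the paper's consistency proof. The input scores are made-up values, and subtracting the maximum score is a common numerical-stability step that leaves the result unchanged.

```python
import math


def softmax(scores):
    """Map raw class scores to probabilities that are positive and sum to 1."""
    m = max(scores)  # shift by the maximum to avoid overflow in exp()
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical network outputs for three classes.
probs = softmax([2.0, 1.0, 0.1])

# The predicted class is the one with the highest probability.
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

Because only differences between scores matter, adding the same constant to every score gives the same probabilities.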


Using Deep Neural Networks To Classify Astronomical Images, Andrew D. Macpherson May 2023

Honors Projects

As the quantity of astronomical data available continues to exceed the resources available for analysis, recent advances in artificial intelligence encourage the development of automated classification tools. This paper lays out a framework for constructing a deep neural network capable of classifying individual astronomical images by describing techniques to extract and label these objects from large images.


Enhancing Health Tweet Classification: An Evaluation Of Transformer-Based Models For Comprehensive Analysis, Foram Pankajbhai Patel May 2023

Computer Science and Engineering Theses

The task of health tweet classification entails identifying whether a given tweet is health-related or not. While existing research in this area has made significant progress in classifying tweets into specific sub-domains of health, such as mental health, COVID-19, or specific diseases, there is a need for a more comprehensive approach that considers a broader range of health-related topics. This thesis addresses this need by proposing a diverse and comprehensive dataset that includes various existing health-related datasets, data collected through a keyword-based approach, and manually annotated data. However, the use of health-related keywords in a figurative or non-health context poses …


A Classification Of Tensors In ECSK Theory, Joshua James Leiter May 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

You might have heard of Einstein’s theory of General relativity (GR): it is the one where mass and energy curve the fabric of spacetime to create gravity. This is the major theory which allows communication through satellites and our GPS to work too! Wormholes have interested me, but there are some issues about forming them in GR. Interestingly enough, elementary particles are also characterized by their spin in the standard model. However, intrinsic spin is nowhere geometrically coupled to the geometry of spacetime in Einstein’s theory. Later, Élie Cartan, Dennis Sciama, and Tom Kibble all fleshed out adding different aspects …


Deep Learning With Attention Mechanisms In Breast Ultrasound Image Segmentation And Classification, Meng Xu May 2023

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Breast cancer is a great threat to women’s health. Breast ultrasound (BUS) imaging is commonly used in the early detection of breast cancer as a portable, valuable, and widely available diagnosis tool. Automated BUS image analysis can assist radiologists in making accurate and fast decisions. Generally, automated BUS image analysis includes BUS image segmentation and classification. BUS image segmentation automatically extracts tumor regions from a BUS image. BUS image classification automatically classifies breast tumors into benign or malignant categories. Multi-task learning accomplishes segmentation and classification simultaneously, which makes it more appealing and practical than either individual task. Deep neural …


Convolutional Neural Networks Analysis Reveals Three Possible Sources Of Bronze Age Writings Between Greece And India, Shruti Daggumati, Peter Z. Revesz Apr 2023

School of Computing: Faculty Publications

This paper analyzes the relationships among eight ancient scripts from between Greece and India. We used convolutional neural networks combined with support vector machines to give a numerical rating of the similarity between pairs of signs (one sign from each of two different scripts). Two scripts that had a one-to-one matching of their signs were determined to be related. The result of the analysis is the finding of the following three groups, which are listed in chronological order: (1) Sumerian pictograms, the Indus Valley script, and the proto-Elamite script; (2) Cretan hieroglyphs and Linear B; and (3) the Phoenician, Greek, …


Domain Specific Analysis Of Privacy Practices And Concerns In The Mobile Application Market, Fahimeh Ebrahimi Meymand Apr 2023

LSU Doctoral Dissertations

Mobile applications (apps) constantly demand access to sensitive user information in exchange for more personalized services. These mostly unjustified data collection tactics have raised major privacy concerns among mobile app users. Existing research on mobile app privacy aims to identify these concerns, expose apps with malicious data collection practices, assess the quality of apps' privacy policies, and propose automated solutions for privacy leak detection and prevention. However, existing solutions are generic, frequently missing the contextual characteristics of different application domains. To address these limitations, in this dissertation, we study privacy in the app store at a domain level. Our objective is to …


Modeling The Probability Of A Successful Stolen Base Attempt In Major League Baseball, Cade Stanley Apr 2023

Senior Theses

In Major League Baseball (MLB), the outcome of a stolen base attempt has important implications. Success moves the runner closer to scoring, while failure records an out and removes the runner from the basepaths altogether. Therefore, it is important that the decision by a coach or player to steal a base is well-informed. In this thesis, I explore a statistical approach to making this decision. I train logistic regression and random forest models, using data about the game situation and about the runner, pitcher, and catcher involved in the stolen base attempt, to estimate the probability that a stolen base …


Integrating And Optimizing Genomic, Weather, And Secondary Trait Data For Multiclass Classification, Vamsi Manthena, Diego Jarquín, Reka Howard Mar 2023

Department of Statistics: Faculty Publications

Modern plant breeding programs collect several data types such as weather, images, and secondary or associated traits besides the main trait (e.g., grain yield). Genomic data is high-dimensional and often over-crowds smaller data types when naively combined to explain the response variable. There is a need to develop methods able to effectively combine different data types of differing sizes to improve predictions. Additionally, in the face of changing climate conditions, there is a need to develop methods able to effectively combine weather information with genotype data to predict the performance of lines better. In this work, we develop a novel …