Full-Text Articles in Statistical Models
Research On Chinese Data Sovereignty Policy Based On LDA Model And Policy Instruments, Han Qiao, Junru Xu
Bulletin of Chinese Academy of Sciences (Chinese Version)
Data sovereignty has become an important component of national sovereignty in the dual context of the digital economy development and the overall national security concept. Major countries and regions are actively carrying out data sovereignty strategic deployment and engaging in fierce competition in data resources, data technology, and data rules. This work adopts the policy text analysis method to study China’s data sovereignty policy, and employs the LDA model and policy instruments to quantitatively analyze the process evolution and thematic characteristics of China’s data sovereignty policy. Drawing on these findings, this study comprehensively considers the global data sovereignty policy and …
Predicting Crop Yield Using Remote Sensing Data, Mary Row, Jung-Han Kimn, Hossein Moradi
SDSU Data Science Symposium
Accurate crop yield predictions can help farmers adjust their farming practices to optimize their harvest. Remote sensing is an inexpensive way to collect massive amounts of data that can be used to predict crop yield. This study used linear regression and spatial linear models to predict soybean yield with data from Landsat 8 OLI. Each model was built using only spectral bands of the satellite, only vegetation indices, and both spectral bands and vegetation indices. All analysis was based on data collected from two fields in South Dakota from the 2019 and 2021 …
Making Sense Of Making Parole In New York, Alexandra Mcglinchy
Dissertations, Theses, and Capstone Projects
For many individuals incarcerated in New York, the initial step toward freedom begins with an interview with the Board of Parole. This process, however, is frequently a complex and challenging one, characterized by repeated denials and extended incarcerations. The disparity in outcomes – where one individual may receive over 20 denials and another is granted parole on their first attempt – highlights the ambiguity and inconsistency in the parole decision-making process. This project aims to clarify the factors that influence parole decisions by concentrating on measurable variables. These include age, race, duration of sentence served, proportion of sentence served, type …
Modeling Of COVID-19 Clinical Outcomes In Mexico: An Analysis Of Demographic, Clinical, And Chronic Disease Factors, Livia Clarete
Dissertations, Theses, and Capstone Projects
This study explores COVID-19 clinical outcomes in Mexico, focusing on demographic, clinical, and chronic disease variables to develop predictive models. In the binary classification task, the AdaBoost classifier distinguishes survivors from non-survivors, with age, sex, ethnicity, and chronic medical conditions influencing outcomes. In multiclass classification, the gradient boosting classifier categorizes patients into outcome groups.
Demographic variables, especially age, are crucial for predicting COVID-19 outcomes for both the binary and multiclass classification tasks. Clinical information about previous conditions, including chronic diseases, also holds relevance, especially diabetes, immunocompromise, and cardiovascular diseases. These insights inform public health measures and healthcare strategies, emphasizing …
Model Selection Through Cross-Validation For Supervised Learning Tasks With Manifold Data, Derek Brown
The Journal of Purdue Undergraduate Research
No abstract provided.
Machine Learning Approaches For Cyberbullying Detection, Roland Fiagbe
Data Science and Data Mining
Cyberbullying refers to the act of bullying using electronic means and the internet. In recent years, it has been identified as a major problem among young people and even adults. It can negatively impact one’s emotions and lead to adverse outcomes such as depression, anxiety, harassment, and suicide. This has created a need for machine learning techniques that automatically detect and prevent cyberbullying on social media platforms. In this study, we analyze the combination of some Natural Language Processing (NLP) algorithms (such as Bag-of-Words and TF-IDF) with some popular machine learning …
Predicting Superconducting Critical Temperature Using Regression Analysis, Roland Fiagbe
Data Science and Data Mining
This project fits a regression model to predict the superconducting critical temperature from variables extracted from the superconductor’s chemical formula. Combined with stepwise variable selection, the regression model yields a reasonable predictive model with a lower prediction error (MSE). Variables derived from atomic radius, valence, atomic mass, and thermal conductivity contributed the most to the predictive model.
A Bayesian Inversion For Emissions And Export Productivity Across The End-Cretaceous Boundary, Alexander A. Cox
Dartmouth College Master’s Theses
The end-Cretaceous mass extinction was marked by both the Chicxulub impact and the ongoing emplacement of the Deccan Traps flood basalt province. Both of these events perturbed the environment by the emission of climate-active volatiles, primarily CO2 and SO2. To understand the mechanism of extinction, we must disentangle the timing, duration, and intensity of volcanic and meteoritic environmental forcings. In this thesis, we used a parallel Markov chain Monte Carlo approach to invert for the aforementioned volatile emissions, export productivity, and remineralization from 67 to 65 million years ago using the LOSCAR (Long-term Ocean-atmosphere-Sediment CArbon cycle Reservoir) model. The parallel …
Reducing Food Scarcity: The Benefits Of Urban Farming, S.A. Claudell, Emilio Mejia
Journal of Nonprofit Innovation
Urban farming can enhance the lives of communities and help reduce food scarcity. This paper presents a conceptual prototype of an efficient urban farming community that can be scaled for a single apartment building or an entire community across all global geoeconomic regions, including densely populated cities and rural, developing towns and communities. When deployed in coordination with smart crop choices, local farm support, and efficient transportation, the result is not just sustainability but also increased fresh produce accessibility, optimized nutritional value, elimination of ‘forever chemicals’, reduced transportation costs, and global environmental benefits.
Imagine Doris, who is …
Differentiation Of Human, Dog, And Cat Hair Fibers Using DART TOFMS And Machine Learning, Laura Ahumada, Erin R. Mcclure-Price, Chad Kwong, Edgard O. Espinoza, John Santerre
SMU Data Science Review
Hair is found in over 90% of crime scenes and has long been analyzed as trace evidence. However, recent reviews of traditional hair fiber analysis techniques, primarily morphological examination, have cast doubt on its reliability. To address these concerns, this study employed machine learning algorithms, specifically Linear Discriminant Analysis (LDA) and Random Forest, on Direct Analysis in Real Time time-of-flight mass spectra collected from human, cat, and dog hair samples. The objective was to develop a chemistry- and statistics-based classification method for unbiased taxonomic identification of hair. The results of the study showed that LDA and Random Forest were highly …
Exploration And Statistical Modeling Of Profit, Caleb Gibson
Undergraduate Honors Theses
For any company involved in sales, maximization of profit is the driving force that guides all decision-making. Many factors can influence how profitable a company can be, including external factors like changes in inflation or consumer demand and internal factors like pricing and product cost. By understanding specific trends in its own internal data, a company can readily identify problem areas or potential growth opportunities to help increase profitability.
In this discussion, we use an extensive data set to examine how a company might analyze their own data to identify potential changes the company might investigate to drive better performance. Based …
The Impacts Of The COVID-19 Pandemic On Mental Health Across Different Genders And Sexualities, Jiale Zhu, Jonas Katona
Undergraduate Research Journal for the Human Sciences
Current studies report an increase in psychological distress as a result of the COVID-19 pandemic. This study is interested in examining mental health disparities and how the COVID-19 pandemic has disproportionately impacted marginalized groups—and more specifically, those identified by sex, gender, and sexuality—compared with the general population. This study also considers the effects and ramifications of different policy measures taken during the course of the pandemic. We perform exploratory data modeling and analysis on several important and publicly available datasets taken during the pandemic on mental health and COVID-19 infection data across various identity groups to look for significant disparities, …
The Double Edged Sword Of The Pandemic: Exploring Associations Between COVID-19 And Social Isolation In The USA, Alexander Fulk
Annual Symposium on Biomathematics and Ecology Education and Research
No abstract provided.
Mathematical Modeling Of The Impact Of Lobbying On Climate Policy, Andrew Jacoby, Claire Hannah, James Hutchinson, Jasmine Narehood, Aditi Ghosh, Padmanabhan Seshaiyer
Annual Symposium on Biomathematics and Ecology Education and Research
No abstract provided.
Deep Q-Learning Framework For Quantitative Climate Change Adaptation Policy For Florida Road Network Due To Extreme Precipitation, Orhun Aydin
I-GUIDE Forum
Climate change-induced extreme weather and increasing population are increasing the pressure on the global aging road networks. Adaptation requires designing interventions and alterations to the road networks that consider future dynamics of flooding and increased traffic due to the growing population. This paper introduces a reinforcement learning approach to designing interventions for Florida's road network under future traffic and climate projections. Three climate models and a tide and surge model are used to create flooding and coastal inundation projections, respectively. The optimal sequence of decisions for adapting Florida's road network to minimize flooding-related disruptions is solved by using a graph-based …
Using Geographic Information To Explore Player-Specific Movement And Its Effects On Play Success In The NFL, Hayley Horn, Eric Laigaie, Alexander Lopez, Shravan Reddy
SMU Data Science Review
American Football is a billion-dollar industry in the United States. The analytical aspect of the sport is an ever-growing domain, with open-source competitions like the NFL Big Data Bowl accelerating this growth. With the amount of player movement during each play, tracking data can prove valuable in many areas of football analytics. While concussion detection, catch recognition, and completion percentage prediction are all existing use cases for this data, player-specific movement attributes, such as speed and agility, may be helpful in predicting play success. This research calculates player-specific speed and agility attributes from tracking data and supplements them with descriptive …
Forecasting COVID-19 With Temporal Hierarchies And Ensemble Methods, Li Shandross
Masters Theses
Infectious disease forecasting efforts underwent rapid growth during the COVID-19 pandemic, providing guidance for pandemic response and about potential future trends. Yet despite their importance, short-term forecasting models often struggled to produce accurate real-time predictions of this complex and rapidly changing system. This gap in accuracy persisted into the pandemic and warrants the exploration and testing of new methods to glean fresh insights.
In this work, we examined the application of the temporal hierarchical forecasting (THieF) methodology to probabilistic forecasts of COVID-19 incident hospital admissions in the United States. THieF is an innovative forecasting technique that aggregates time-series data into …
Addressing The Impact Of Time-Dependent Social Groupings On Animal Survival And Recapture Rates In Mark-Recapture Studies, Alexandru M. Draghici
Electronic Thesis and Dissertation Repository
Mark-recapture (MR) models typically assume that individuals under study have independent survival and recapture outcomes. One such model of interest is known as the Cormack-Jolly-Seber (CJS) model. In this dissertation, we conduct three major research projects focused on studying the impact of violating the independence assumption in MR models along with presenting extensions which relax the independence assumption. In the first project, we conduct a simulation study to address the impact of failing to account for pair-bonded animals having correlated recapture and survival fates on the CJS model. We examined the impact of correlation on the likelihood ratio test (LRT), …
Analytical Approach For Monitoring The Behavior Of Patients With Pancreatic Adenocarcinoma At Different Stages As A Function Of Time, Aditya Chakaborty Dr, Chris P. Tsokos Dr
Biology and Medicine Through Mathematics Conference
No abstract provided.
Movie Recommender System Using Matrix Factorization, Roland Fiagbe
Data Science and Data Mining
Recommendation systems are a popular and beneficial field that can help people make informed decisions automatically. This technique assists users in selecting relevant information from an overwhelming amount of available data. When it comes to movie recommendations, two common methods are collaborative filtering, which compares similarities between users, and content-based filtering, which takes a user’s specific preferences into account. However, our study focuses on the collaborative filtering approach, specifically matrix factorization. Various similarity metrics are used to identify user similarities for recommendation purposes. Our project aims to predict movie ratings for unwatched movies using the MovieLens rating dataset. We developed …
A Probabilistic Exploration Of Food Supplementation And Assistance, Logan Mattingly
Honors College Theses
Food insecurity is a stark threat that affects households throughout our country. Dietary insufficiency manifests itself in ways that affect health and public safety. According to researchers, individuals who suffer from food insecurity have a higher risk of aggression, anxiety, suicidal ideation, and depression. These problems fall disproportionately on lower-income households. In this work, an exploratory analysis within these data sets will be performed to examine the socio-economic, biographical, nutritional, and geographical principal components of food insecurity among survey participants and how the US Supplemental Nutrition Assistance Program (SNAP) affects …
Multidimensional Investigation Of Tennessee’s Urban Forest, Jillian L. Gorrell
Doctoral Dissertations
Preserving existing trees in urban areas and properly cultivating urban forest conservation and management opportunities is valuable to the ever-growing urban environment and necessary for creating optimal experiences and educational tools to meet the needs of increasing urban populations. This dissertation contains studies investigating several facets of the urban forest, including environmental effects of deforestation and urbanization, tree equity, and urban forest facility management and accessibility. Community education and outreach at arboreta about the importance of the tree canopy can help promote environmental stewardship. A digital questionnaire was electronically distributed to representatives of arboreta certified through the Tennessee Division of …
Time Series Analysis Of Longitudinally Collected Standard Autoperimetry Data In Glaucoma Patients, Carlyn Childress
Honors College Theses
Glaucoma is a group of eye diseases in which damage gradually occurs to the optic nerve, often leading to partial or complete loss of vision. Glaucoma is the second leading cause of blindness, and it has no cure. Early detection and tracking of its progression are key to managing its effects. Ordinary Least Squares Regression (OLSR), the most commonly used methodology for tracking glaucoma progression, is inappropriate because the longitudinally collected perimetry data from glaucoma patients appear to be temporally correlated. Time series models that account for temporal correlation are better methods to analyze Mean …
Employee Attrition: Analyzing Factors Influencing Job Satisfaction Of IBM Data Scientists, Graham Nash
Symposium of Student Scholars
Employee attrition is a relevant issue that every employer must consider when gauging the effectiveness of their employees. Whether or not an employee chooses to leave their job can depend on a multitude of factors. As a result, employers need to develop methods for measuring attrition based on several characteristics of their employees. Factors like age, years with the company, department, level of education, job role, and even marital status are all considered by employers to assist in predicting employee attrition. This project will be analyzing a …
Bridging The Chasm Between Fundamental, Momentum, And Quantitative Investing, Allen Hoskins, Jeff Reed, Robert Slater
SMU Data Science Review
A chasm exists between the active public equity investment management industry's fundamental, momentum, and quantitative styles. In this study, the researchers explore ways to bridge this gap by leveraging domain knowledge, fundamental analysis, momentum, crowdsourcing, and data science methods. This research also seeks to test the developed tools and strategies during the volatile time period of 2020 and 2021.
Comparison Of Sampling Methods For Predicting Wine Quality Based On Physicochemical Properties, Robert Burigo, Scott Frazier, Eli Kravez, Nibhrat Lohia
SMU Data Science Review
Numerous studies have used the physicochemical properties of wine to predict quality. Given the nature of these properties, the data is inherently skewed. Previous works have focused on a handful of sampling techniques to balance the data. This research compares multiple sampling techniques in predicting the target with limited data. For this purpose, an ensemble model is used to evaluate the different techniques. This research found no evidence that specific oversampling methods improve a random forest classifier for a multi-class problem.
Self-Learning Algorithms For Intrusion Detection And Prevention Systems (IDPS), Juan E. Nunez, Roger W. Tchegui Donfack, Rohit Rohit, Hayley Horn
SMU Data Science Review
Today, there is an increased risk to data privacy and information security due to cyberattacks that compromise data reliability and accessibility. New machine learning models are needed to detect and prevent these cyberattacks. One application of these models is cybersecurity threat detection and prevention systems that can create a baseline of a network's traffic patterns to detect anomalies without needing pre-labeled data, thus enabling the identification of abnormal network events as threats. This research explored algorithms that can help automate anomaly detection on an enterprise network using Canadian Institute for Cybersecurity data. This study demonstrates that Neural Networks with Bayesian …
Analyzing Relationships With Machine Learning, Oscar Ko
Dissertations, Theses, and Capstone Projects
Procedurally, this project aims to take a dataset, analyze it, and offer insights to the audience in an easy-to-digest format. Conceptually, this project will seek to explore questions like: “Do couples that meet through online dating or dating apps have higher or lower quality relationships?”, “Can any features in this dataset help predict how a subject would rate their relationship quality?”, and “What other insights can I derive from using machine learning for exploratory analysis?” The intended audience for this project is anyone interested in romantic relationships or machine learning.
The dataset is from a Stanford University survey, “How Couples …
Classification Of Adult Income Using Decision Tree, Roland Fiagbe
Data Science and Data Mining
The decision tree is a commonly used data mining methodology for performing classification tasks. It is a tree-based supervised machine learning algorithm that classifies or makes predictions by following a path determined by how previous questions are answered. Generally, the decision tree algorithm categorizes data into branch-like segments that develop into a tree that contains a root, nodes, and leaves. This project seeks to explore the decision tree methodology and apply it to the Adult Income dataset from the UCI Machine Learning Repository, to determine whether a person makes over 50K per year and determine the necessary factors that improve …
Application Of Sentiment Analysis And Machine Learning Techniques To Predict Daily Cryptocurrency Price Returns, Edward Wu
CMC Senior Theses
This paper examines the effects of social media sentiment relating to Bitcoin on the daily price returns of Bitcoin and other popular cryptocurrencies by utilizing sentiment analysis and machine learning techniques to predict daily price returns. Many investors think that social media sentiment affects cryptocurrency prices. However, the results of this paper find that social media sentiment relating to Bitcoin does not add significant predictive value to forecasting daily price returns for each of the six cryptocurrencies used for analysis and that machine learning models that do not assume linearity between the current day price return and previous daily price …