Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 8 of 8
Full-Text Articles in Physical Sciences and Mathematics
Applications Of Machine Learning Algorithms In Materials Science And Bioinformatics, Mohammed Quazi
Mathematics & Statistics ETDs
The piezoelectric response has been a measure of interest in density functional theory (DFT) for micro-electromechanical systems (MEMS) since the inception of MEMS technology. Piezoelectric-based MEMS devices find wide applications in automobiles, mobile phones, healthcare devices, and silicon chips for computers, to name a few. The piezoelectric properties of doped aluminum nitride (AlN) have been under investigation in materials science for piezoelectric thin films because of the material's wide range of device applications. In this research, high-throughput ab initio simulations of 23 AlN alloys are generated using rigorous DFT calculations.
This research is the first to report strong enhancements of piezoelectric properties …
Generating A Dataset For Comparing Linear Vs. Non-Linear Prediction Methods In Education Research, Jack Mauro, Elena Martinez, Anna Bargagliotti
Honors Thesis
Machine learning is often used to build predictive models by extracting patterns from large data sets. Such techniques are increasingly being utilized to predict outcomes in the social sciences. One such application is predicting student acceptance and success in academia. Using these tools for education-related data analysis may enable the evaluation of programs, resources, and curriculum. Currently, research is needed to examine application, admissions, and retention data in order to address equity in college computer science programs. However, most student-level data sets contain sensitive data that cannot be made public. To …
Causalmodels: An R Library For Estimating Causal Effects, Joshua Wolff Anderson
Computational and Data Sciences (MS) Theses
Free and open source software for statistical modeling and machine learning has advanced productivity in data science significantly. Packages such as SciPy in Python and caret in R provide fundamental tools for statistical modeling and machine learning in the two most popular programming languages used by data scientists. Unfortunately, comparably robust tools for causal inference are limited. The tools that exist in R lack consistent and standardized methodologies and inputs. R lacks a comprehensive package that offers traditional causal inference methods such as standardization, IP weighting, G-estimation, outcome regression, and propensity matching in one common package. …
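As an illustration of one method named in the abstract above, the following is a minimal sketch of IP weighting in Python with NumPy. It is not the causalmodels package or its API; the function name `ipw_ate`, the simulated data, and the single binary confounder are assumptions made purely for illustration.

```python
import numpy as np

def ipw_ate(treatment, outcome, confounder):
    """Hypothetical helper: IP-weighted estimate of the average treatment
    effect, assuming one discrete confounder and a binary treatment."""
    treatment = np.asarray(treatment)
    outcome = np.asarray(outcome, dtype=float)
    confounder = np.asarray(confounder)

    # Propensity score P(treatment = 1 | confounder), estimated as the
    # observed treatment frequency within each confounder stratum.
    propensity = np.empty(len(treatment), dtype=float)
    for level in np.unique(confounder):
        mask = confounder == level
        propensity[mask] = treatment[mask].mean()

    # Weight each unit by the inverse probability of the treatment it received.
    weights = np.where(treatment == 1, 1.0 / propensity, 1.0 / (1.0 - propensity))

    # Normalized weighted means of the outcome in each treatment arm.
    treated = treatment == 1
    mean_treated = np.average(outcome[treated], weights=weights[treated])
    mean_control = np.average(outcome[~treated], weights=weights[~treated])
    return mean_treated - mean_control

# Simulated data (an assumption): the true treatment effect is 2.0, and the
# confounder raises both the chance of treatment and the outcome.
rng = np.random.default_rng(0)
n = 20000
confounder = rng.integers(0, 2, size=n)
treatment = rng.binomial(1, np.where(confounder == 1, 0.8, 0.2))
outcome = 2.0 * treatment + 3.0 * confounder + rng.normal(size=n)

naive = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()
ate = ipw_ate(treatment, outcome, confounder)
print(f"naive difference: {naive:.2f}, IP-weighted estimate: {ate:.2f}")
```

In this simulated setting the naive difference in means is biased upward by the confounder, while the weighted estimate should land close to the true effect of 2.0.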
Mixture Models In Machine Learning, Soumyabrata Pal
Doctoral Dissertations
Modeling with mixtures is a powerful method in the statistical toolkit that can be used for representing the presence of sub-populations within an overall population. In many applications ranging from financial models to genetics, a mixture model is used to fit the data. The primary difficulty in learning mixture models is that the observed data set does not identify the sub-population to which an individual observation belongs. Although mixture models have been studied for more than a century, their theoretical guarantees remain unknown for several important settings. In this thesis, we look at three groups of problems. The first part …
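To make the latent-membership difficulty described above concrete, here is a minimal sketch of the standard expectation-maximization (EM) procedure for a two-component, one-dimensional Gaussian mixture. It is a textbook technique, not necessarily one studied in the dissertation; the function name `em_two_gaussians`, the initialization, and the simulated data are assumptions for illustration.

```python
import numpy as np

def em_two_gaussians(x, n_iter=200):
    """Hypothetical helper: fit a two-component 1-D Gaussian mixture by EM."""
    x = np.asarray(x, dtype=float)
    # Simple initialization (an assumption): means at the quartiles,
    # a shared spread, and equal mixing weights.
    mu = np.percentile(x, [25, 75])
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])

    for _ in range(n_iter):
        # E-step: posterior probability that each point came from each
        # component, since the true membership is never observed.
        dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (
            sigma * np.sqrt(2.0 * np.pi)
        )
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: re-estimate the parameters from the soft assignments.
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

    return pi, mu, sigma

# Simulated data (an assumption): 30% of points from N(-2, 1), 70% from N(3, 1).
# The labels are discarded before fitting, mirroring the unobserved membership.
rng = np.random.default_rng(1)
labels = rng.random(5000) < 0.3
x = np.where(labels, rng.normal(-2.0, 1.0, 5000), rng.normal(3.0, 1.0, 5000))

pi, mu, sigma = em_two_gaussians(x)
print("mixing weights:", pi.round(2), "means:", mu.round(2), "std devs:", sigma.round(2))
```

With components this well separated the estimates should recover the generating parameters closely; the open theoretical questions the abstract refers to concern harder settings than this one.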
Split Classification Model For Complex Clustered Data, Katherine Gerot
Honors Theses
Classification in high-dimensional data has generated tremendous interest in a multitude of fields. Data in higher dimensions often tend to reside in a non-Euclidean metric space. This prevents Euclidean-based classification methodologies, such as regression, from reliably modeling the data. Many proposed models rely on computationally complex embedding to convert the data to a more usable format. Others, namely the Support Vector Machine, rely on kernel manipulation to implicitly describe the "feature space" to arrive at a non-linear decision boundary. The proposed methodology in this paper seeks to classify complex data in a relatively computationally simple and explainable manner.
A Monte Carlo Simulation Of Rat Choice Behavior With Interdependent Outcomes, Michelle A. Frankot
Graduate Theses, Dissertations, and Problem Reports
Preclinical behavioral neuroscience often uses choice paradigms to capture psychiatric symptoms. In particular, the subfield of operant research produces nested datasets with many discrete choices in a session. The standard analytic practice is to aggregate choice into a continuous variable and analyze using ANOVA or linear regression. However, choice data often have multiple interdependent outcomes of interest, violating an assumption of general linear models. The aim of the current study was to quantify the accuracy of linear mixed-effects regression (LMER) for analyzing data from a 4-choice operant task called the Rodent Gambling Task (RGT), which measures decision-making in the context …
Finding The Best Predictors For Foot Traffic In Us Seafood Restaurants, Isabel Paige Beaulieu
Honors Theses and Capstones
COVID-19 caused state- and nationwide lockdowns, which altered human foot traffic, especially in restaurants. The seafood sector in particular suffered greatly: illegal fishing increased, its goods are perishable, it is seasonal in some places, and imports and exports were slowed. Foot traffic data helps business owners know how much to order, how many employees to schedule, etc. One issue is that the data is very expensive, hard to get, and not available until months after it is recorded. Our goal is to not only find covariates that …
Statistical Theory For Specialized Linear Regression Adjustment Methods Compared To Multiple Linear Regression In The Presence And Absence Of Interaction Effects, Leon Su
Theses and Dissertations--Statistics
When building models to investigate outcomes and variables of interest, researchers often want to adjust for other variables. These adjustments are performed in a variety of ways. In this work, we will consider four approaches to adjustment utilized by researchers in various fields. We will compare the efficacy of these methods to what we call the "true model method", fitting a multiple linear regression model in which adjustment variables are model covariates. Our goal is to show that these adjustment methods have inferior performance to the true model method by comparing model parameter estimates, power, type I error, …