Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 16 of 16

Full-Text Articles in Physical Sciences and Mathematics

Identifying Key Activity Indicators In Rats' Neuronal Data Using Lasso Regularized Logistic Regression, Avery Woods May 2023

Honors Theses

This thesis aims to identify timestamps of rats’ neuronal activity that best determine behavior using a machine learning model. Neuronal data are complex and high-dimensional, and identifying the most informative features is crucial for understanding the underlying neuronal processes. The Lasso regularization technique is employed to select the features of the data most relevant to the model’s prediction. The results of this study provide insights into the key activity indicators that are associated with specific behaviors or cognitive processes in rats, as well as the effect that stress can have on neuronal activity and behavior. Ultimately, it was …


UConn Baseball Batting Order Optimization, Gavin Rublewski May 2023

Honors Scholar Theses

Challenging conventional wisdom is at the very core of baseball analytics. Using data and statistical analysis, the sets of rules by which coaches make decisions can be justified, or possibly refuted. One of those sets of rules relates to the construction of a batting order. Through data collection, data adjustment, the construction of a baseball simulator, and the use of a Monte Carlo Simulation, I have assessed thousands of possible batting orders to determine the roster-specific strategies that lead to optimal run production for the 2023 UConn baseball team. This paper details a repeatable process in which basic player statistics …
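
As a rough illustration of the Monte Carlo idea described in this abstract (not the author's simulator), the sketch below estimates average runs per game for a batting order by simulating many games. Everything in it is an assumption made for illustration: each batter is reduced to a single hypothetical on-base probability, every successful plate appearance is treated as a single that advances each runner one base, and the function names are invented here.

```python
import random


def simulate_half_inning(order, start, rng):
    """Simulate one half-inning; return (runs scored, index of next batter)."""
    outs, runs = 0, 0
    bases = [False, False, False]  # first, second, third
    i = start
    while outs < 3:
        p_on_base = order[i % len(order)]
        i += 1
        if rng.random() < p_on_base:
            # Assumption: every success is a single; runners advance one base.
            if bases[2]:
                runs += 1
            bases = [True, bases[0], bases[1]]
        else:
            outs += 1
    return runs, i % len(order)


def mean_runs_per_game(order, n_games=2000, seed=0):
    """Monte Carlo estimate of runs per nine-inning game for one batting order."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_games):
        batter = 0
        for _ in range(9):
            runs, batter = simulate_half_inning(order, batter, rng)
            total += runs
    return total / n_games


# Hypothetical on-base probabilities for nine players, best hitters first.
lineup = [0.40, 0.38, 0.36, 0.34, 0.32, 0.30, 0.28, 0.26, 0.24]
reversed_lineup = list(reversed(lineup))

print("best-first order:", mean_runs_per_game(lineup))
print("worst-first order:", mean_runs_per_game(reversed_lineup))
```

Comparing the two estimates shows how candidate orders could be ranked; a realistic simulator would need extra-base hits, walks, baserunning, and player-specific statistics.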


Behaviors For Which Deinonychosaurs Used Their Feet, Alexander King Dec 2022

Honors Projects

This paper seeks to show for what purpose deinonychosaurs used their feet. Fowler et al. (2011) showed that the feet of D. antirrhopus were closest in function to those of accipitrids, finding that they were built more for grasping prey than for running.

I answered this question by using 2D images of the feet of three modern birds (Buteo jamaicensis, Phasianus colchicus, and Gallus gallus domesticus), one eudromaeosaur (Deinonychus antirrhopus), and one troodontid (Borogovia gracilicrus). I used ImageJ to apply 73 landmarks to each foot, capturing the variation between species in the metatarsals and pedal phalanges. These data were then uploaded to the software …


The forestecology R Package For Fitting And Assessing Neighborhood Models Of The Effect Of Interspecific Competition On The Growth Of Trees, Albert Y. Kim, David N. Allen, Simon P. Couch Nov 2021

Statistical and Data Sciences: Faculty Publications

Neighborhood competition models are powerful tools to measure the effect of interspecific competition. Statistical methods to ease the application of these models are currently lacking. We present the forestecology package providing methods to (a) specify neighborhood competition models, (b) evaluate the effect of competitor species identity using permutation tests, and (c) measure model performance using spatial cross-validation. Following Allen and Kim (PLoS One, 15, 2020, e0229930), we implement a Bayesian linear regression neighborhood competition model. We demonstrate the package's functionality using data from the Smithsonian Conservation Biology Institute's large forest dynamics plot, part of the ForestGEO global network of research …
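
The permutation tests mentioned in this abstract rest on a general idea that can be sketched briefly. The example below is a generic two-group permutation test written in Python, not the forestecology API or the authors' procedure: the function name, the labels "A" and "B", and the growth values are all hypothetical. It shuffles group labels many times and asks how often the shuffled difference in means is at least as large as the observed one.

```python
import random


def permutation_p_value(values, labels, n_perm=999, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Assumes `labels` contains only "A" and "B", with at least one of each.
    """
    rng = random.Random(seed)

    def abs_mean_diff(lbls):
        a = [v for v, l in zip(values, lbls) if l == "A"]
        b = [v for v, l in zip(values, lbls) if l == "B"]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = abs_mean_diff(labels)
    shuffled = list(labels)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between label and value
        if abs_mean_diff(shuffled) >= observed:
            count += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (count + 1) / (n_perm + 1)


# Hypothetical growth values for trees next to competitor species A or B.
growth = [1.0, 1.1, 0.9, 1.0, 1.2, 5.0, 5.1, 4.9, 5.0, 5.2]
species = ["A"] * 5 + ["B"] * 5
print(permutation_p_value(growth, species))
```

A small p-value suggests that the label (here, competitor identity) carries information about the response; the package itself applies the idea within a fitted competition model rather than to raw group means.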


Development Of Machine Learning Tutorials For R, John Pintar Jan 2020

All Undergraduate Theses and Capstone Projects

Machine learning (ML) techniques developed in computer science have revolutionized nearly every sector of industry. Despite the prevalence and usefulness of ML, students outside of computer science rarely receive training in ML. Students frequently receive training in statistical analysis, often using the software package R, which is free, open source, and has additional downloadable modules. A popular module is the ML package caret, which contains 238 different ML algorithms, each with 0-9 hyperparameters. caret is powerful, flexible, and provides consistent syntax across algorithms. In the hands of an experienced practitioner, this tunability is welcomed and can increase accuracy. However, when …


Factor Analysis Of Mixed Data (FAMD) And Multiple Linear Regression In R, Nestor Pereira Dec 2019

Dissertations

Previous projects statistically analyzed the factors that affect scores in the subjects of Mathematics and Portuguese for several groups of secondary school students in Portugal.

This project seeks a model, hypothesized to be a multiple linear regression, to predict a student's final score (the dependent variable G3) from features divided into two groups. The first group comprises the features, or predictors, of the final score that relate most closely to student performance, that is, variables such as study time or past failures. The second …


Data Mining And Machine Learning To Improve Northern Florida’s Foster Care System, Daniel Oldham, Nathan Foster, Mihhail Berezovski Jun 2019

Beyond: Undergraduate Research Journal

The purpose of this research project is to use statistical analysis, data mining, and machine learning techniques to determine identifiable factors in child welfare service records that could lead to a child entering the foster care system multiple times. This would allow us to accurately predict a case’s outcome based on these factors. We were provided with eight years of data in the form of multiple spreadsheets from Partnership for Strong Families (PSF), a child welfare services organization based in Gainesville, Florida, which is contracted by the Florida Department for Children and Families (DCF). This data contained a …


CS+Sociology: Global Inequality Lab 2, Elin Waring, Janet Michello May 2019

Open Educational Resources

These materials include background for the instructor and a lab that engages students in an analysis of global inequality while learning and using the R language (a programming language for statistics). Students ultimately write a function to access country-level data from the CIA World Factbook.


CS+Sociology: Global Inequality Lab 1, Elin Waring, Janet Michello May 2019

Open Educational Resources

These materials include background for the instructor and a lab that engages students in an analysis of global inequality while learning and using the R language (a programming language for statistics). Students obtain data on the US and two other countries (one more developed and one less developed).


Introductory R For Water Resources - Fall 2019 - University Of North Carolina At Chapel Hill, David Gorelick, Gregory Characklis Jan 2019

All ECSTATIC Materials

This is all course material for R for Researchers, a one-credit course taught at UNC Chapel Hill in Fall 2019 to introduce upperclassmen and graduate students to the R programming language and apply learned skills in basic water resources applications, as well as other (semi-related) topics of interest to students.

Lecture notes were distributed before (as a subset of full lecture notes) and after lectures, and lectures involved collaborative coding exercises with students in class without any PowerPoint material. Course material here includes:

Syllabus: rough schedule and description of lectures

Lectures: pdf lecture notes with embedded code, including …


Considering The Non-Programming Geographer's Perspective When Designing Extracurricular Introductory Computer Programming Workshops, Thomas R Etherington Dec 2018

Journal of Spatial Information Science

Computer programming is becoming an increasingly important scientific skill, but geographers are not necessarily receiving this training as part of their formal education. While there are efforts to promote and support extracurricular introductory computer programming workshops, there remain questions about how best to deliver these workshops. Therefore, as part of a recent introductory programming extracurricular workshop I organized for non-programming geographers, I tried to understand more about their perceptions of computer programming. I identify that one of the most important aspects for geographers learning to program is training that is domain specific to ensure that the …


Data Scientist’s Analysis Toolbox: Comparison Of Python, R, And SAS Performance, Jim Brittain, Mariana Cendon, Jennifer Nizzi, John Pleis Jul 2018

SMU Data Science Review

A quantitative analysis will be performed on experiments utilizing three different tools used for Data Science. The analysis will include replication of analysis along with comparisons of code length, output, and results. Qualitative data will supplement the quantitative findings. The conclusion will provide data-supported guidance on the correct tool to use for common situations in the field of Data Science.


Analysis Of 2016-17 Major League Soccer Season Data Using Poisson Regression With R, Ian D. Campbell May 2018

Undergraduate Theses and Capstone Projects

To the outside observer, soccer is chaotic with no given pattern or scheme to follow, a random conglomeration of passes and shots that goes on for 90 minutes. Yet, what if there were a pattern to the chaos, or a way to describe the events that occur in the game quantifiably? Sports statistics is a critical part of baseball and a variety of today’s other sports, but we see very little statistical and data analysis done on soccer. Of this research, there have been looks into the effect of possession time on the outcome of a game, the difference …


Anomalydetection: Implementation Of Augmented Network Log Anomaly Detection Procedures, Robert J. Gutierrez, Bradley C. Boehmke, Kenneth W. Bauer, Cade M. Saie, Trevor J. Bihl Aug 2017

Faculty Publications

As the number of cyber-attacks continues to grow on a daily basis, so does the delay in threat detection. For instance, in 2015, the Office of Personnel Management discovered that approximately 21.5 million individual records of Federal employees and contractors had been stolen. On average, the time between an attack and its discovery is more than 200 days. In the case of the OPM breach, the attack had been going on for almost a year. Currently, cyber analysts inspect numerous potential incidents on a daily basis, but have neither the time nor the resources available to perform such a task. …


gpusvcalibration: A R Package For Fast Stochastic Volatility Model Calibration Using GPUs, Matthew Dixon, Sabbir Ahmed Khan, Mohammad Zubair Jan 2014

Business Analytics and Information Systems

In this paper we describe the gpusvcalibration R package for accelerating stochastic volatility model calibration on GPUs. The package is designed for use with existing CRAN packages for optimization such as DEoptim and nloptr. Stochastic volatility models are used extensively across the capital markets for pricing and risk management of exchange traded financial options. However, there are many challenges to calibration, including comparative assessment of the robustness of different models and optimization routines. For example, we observe that when fitted to sub-minute level midmarket quotes, models require frequent calibration every few minutes and the quality of the fit is routine …


Basic R Matrix Operations, Joseph Hilbe Aug 2011

Joseph M Hilbe

No abstract provided.