Minimizing The Perceived Financial Burden Due To Cancer, 2018 Southern Methodist University
Minimizing The Perceived Financial Burden Due To Cancer, Hassan Azhar, Zoheb Allam, Gino Varghese, Daniel W. Engels, Sajiny John
SMU Data Science Review
In this paper, we present a regression model that predicts the perceived financial burden that a cancer patient experiences in the treatment and management of the disease. Cancer patients do not fully understand the burden associated with the cost of cancer, and their lack of understanding can increase the difficulties associated with living with the disease, in particular coping with the cost. The relationship between demographic characteristics and financial burden was examined in order to better understand the characteristics of a cancer patient and their burden, while all subsets regression was used to determine the best predictors of financial burden. Age ...
Cryptocurrency Price Prediction Using Tweet Volumes And Sentiment Analysis, 2018 Southern Methodist University
Cryptocurrency Price Prediction Using Tweet Volumes And Sentiment Analysis, Jethin Abraham, Daniel Higdon, John Nelson, Juan Ibarra
SMU Data Science Review
In this paper, we present a method for predicting changes in Bitcoin and Ethereum prices utilizing Twitter data and Google Trends data. Bitcoin and Ethereum, the two largest cryptocurrencies in terms of market capitalization, represent over \$160 billion in combined value. However, both Bitcoin and Ethereum have experienced significant price swings on both daily and long term valuations. Twitter is increasingly used as a news source influencing purchase decisions by informing users of the currency and its increasing popularity. As a result, quickly understanding the impact of tweets on price direction can provide a purchasing and selling advantage to ...
EFVS Effects On Pilot Performance, 2018 Purdue University
EFVS Effects On Pilot Performance, Michael Campbell, Nsikak Udo-Imeh, Steven J. Landry
The Summer Undergraduate Research Fellowship (SURF) Symposium
Flight tests have been conducted at Purdue University using a computer-based flying simulator in an attempt to determine and measure the effects of Enhanced Flight Vision Systems (EFVS) on the performance of pilots during landing. Knowledge of these effects could help guide future design and implementation of EFVS in modern commercial aircraft, and further increase pilots’ ability to control the aircraft in low-visibility conditions. The problem that has faced researchers in the past has revolved around the difficulty in interpreting the data which is generated by these tests. The difficulty in making a generalized conclusion based on the large amount ...
Pretrial Release And Failure-To-Appear In McLean County, IL, 2018 Illinois State University
Pretrial Release And Failure-To-Appear In McLean County, IL, Jonathan Monsma
Stevenson Center for Community and Economic Development: Student Research
Actuarial risk assessment tools increasingly have been employed in jurisdictions across the U.S. to assist courts in the decision of whether someone charged with a crime should be detained or released prior to their trial. These tools should be continually monitored and researched by independent third parties to ensure that they are being administered properly and used in the way most likely to provide socially optimal results. McLean County, Illinois began using the Public Safety Assessment-Court™ (PSA-Court or simply PSA) risk assessment tool in 2016. This study culls data from the McLean County ...
Data Center Application Security: Lateral Movement Detection Of Malware Using Behavioral Models, 2018 Southern Methodist University, Dallas, Texas
Data Center Application Security: Lateral Movement Detection Of Malware Using Behavioral Models, Harinder Pal Singh Bhasin, Elizabeth Ramsdell, Albert Alva, Rajiv Sreedhar, Medha Bhadkamkar
SMU Data Science Review
Data center security traditionally is implemented at the external network access points, i.e., the perimeter of the data center network, and focuses on preventing malicious software from entering the data center. However, these defenses do not cover all possible entry points for malicious software, and they are not 100% effective at preventing infiltration through the connection points. Therefore, security is required within the data center to detect malicious software activity including its lateral movement within the data center. In this paper, we present a machine learning-based network traffic analysis approach to detect the lateral movement of malicious software within ...
Predicting Game Day Outcomes In National Football League Games, 2018 Southern Methodist University
Predicting Game Day Outcomes In National Football League Games, Josh Klein, Anna Frowein, Chris Irwin
SMU Data Science Review
In this paper, we present a model for predicting the game day outcomes of National Football League games. Three of the most popular sources for game day predictions are analyzed for comparison. Player data and outcomes from previous games are used, but we also incorporate several weather factors into our models. Over 1,700 games were incorporated, and three separate models were created using simple regression, principal component analysis, and a recursive model. We also discuss the ethicality of using data science techniques by individuals with the knowledge in order to gain an advantage over a population lacking this specialized ...
Examining Multimorbidities Using Association Rule Learning, 2018 Brigham Young University
Examining Multimorbidities Using Association Rule Learning, Kaylee Dudley
Undergraduate Honors Theses
All insurance companies, regardless of the kind of insurance they offer, do their best to predict the future by comparing current to historical information. Any statistically significant correlation, regardless of expectations and hidden factors, can help to actuarially model future behavior. Using deidentified data from over 6 million health insurance policies over one year, we looked for any significant groupings of medical issues. The medical issues are defined based on the commercial “Episode Treatment Groups” (ETGs) classification, and our claims contain 347 different ETGs. We performed different kinds of analysis, including Bayesian posterior cluster analysis, k-means cluster analysis, and association ...
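As a rough illustration of the association rule learning named in the abstract above, the standard rule metrics (support, confidence, and lift) can be computed directly from sets of co-occurring conditions. The sketch below is not the thesis's analysis: the condition labels are hypothetical stand-ins rather than actual ETG codes, the five toy policies are invented for illustration, and `rule_metrics` is a helper name introduced here.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n_a = sum(1 for t in transactions if a <= t)         # records containing the antecedent
    n_c = sum(1 for t in transactions if c <= t)         # records containing the consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # records containing both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical toy data: each set is the group of conditions on one policy.
policies = [
    {"diabetes", "hypertension"},
    {"diabetes", "hypertension", "obesity"},
    {"hypertension"},
    {"asthma"},
    {"diabetes", "obesity"},
]

support, confidence, lift = rule_metrics(policies, {"diabetes"}, {"hypertension"})
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

A lift above 1 suggests the two conditions co-occur more often than independence would predict; with only five toy records that reading carries no statistical weight.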
Text Analytics Approach To Extract Course Improvement Suggestions From Students’ Feedback, 2018 Singapore Management University
Text Analytics Approach To Extract Course Improvement Suggestions From Students’ Feedback, Swapna Gottipati, Venky Shankararaman, Jeff Rongsheng Lin
Research Collection School Of Information Systems
In academic institutions, it is normal practice that at the end of each term, students are required to complete a questionnaire that is designed to gather students’ perceptions of the instructor and their learning experience in the course. Students’ feedback includes numerical answers to Likert scale questions and textual comments to open-ended questions. Within the textual comments given by the students are embedded suggestions. A suggestion can be explicit or implicit. Any suggestion provides useful pointers on how the instructor can further enhance the student learning experience. However, it is tedious to manually go through all the qualitative comments and ...
Understanding Natural Keyboard Typing Using Convolutional Neural Networks On Mobile Sensor Data, 2018 Southern Methodist University
Understanding Natural Keyboard Typing Using Convolutional Neural Networks On Mobile Sensor Data, Travis Siems
Computer Science and Engineering Theses and Dissertations
Mobile phones and other devices with embedded sensors are becoming increasingly ubiquitous. Audio and motion sensor data may be able to detect information that we did not think possible. Some researchers have created models that can predict computer keyboard typing from a nearby mobile device; however, certain limitations to their experiment setup and methods compelled us to be skeptical of the models’ realistic prediction capability. We investigate the possibility of understanding natural keyboard typing from mobile phones by performing a well-designed data collection experiment that encourages natural typing and interactions. This data collection helps capture realistic vulnerabilities of the security ...
Under The Influence, 2018 Bryant University
Under The Influence, Leonardo Cavicchio
Honors Projects in Mathematics
The purpose of this Honors Capstone, entitled Under the Influence, is to assess the validity of claims concerning the possible influence of roommates on one another's alcohol use on college campuses. This will be done by examining data collected in a prior study conducted over a two-year period. This analysis will focus on how alcohol consumption changes in correlation with the personality factors of roommates over an extended period of time. This secondary analysis of de-identified data will focus on primary and secondary subquestions. The primary question that will be addressed with the data set collected from the University of ...
A Convolutional Neural Network Model For Species Classification Of Camera Trap Images, 2018 Boise State University
A Convolutional Neural Network Model For Species Classification Of Camera Trap Images, Annie Casey
Mathematics Undergraduate Theses
The overall purpose of this study was to automate the manual process of tagging species found in camera trap images using machine learning. The basic design of this study was to implement a Convolutional Neural Network model in Python, using the Keras and TensorFlow modules, that learns to recognize patterns in images in order to classify which species is in a given image and to label it accordingly. Results of the analysis highlight the importance of a large sample size, the degree of accuracy according to various arguments in the model, effectiveness of multiple layers that include Max Pooling, and ...
Default Priors For The Intercept Parameter In Logistic Regressions, 2018 The University Of Michigan
Default Priors For The Intercept Parameter In Logistic Regressions, Philip S. Boonstra, Ryan P. Barbaro, Ananda Sen
The University of Michigan Department of Biostatistics Working Paper Series
In logistic regression, separation refers to the situation in which a linear combination of predictors perfectly discriminates the binary outcome. Because finite-valued maximum likelihood parameter estimates do not exist under separation, Bayesian regressions with informative shrinkage of the regression coefficients offer a suitable alternative. Little attention has been given to whether and how to shrink the intercept parameter. Based upon classical studies of separation, we argue that efficiency in estimating regression coefficients may vary with the intercept prior. We adapt alternative prior distributions for the intercept that downweight implausibly extreme regions of the parameter space, rendering less sensitivity to separation ...
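The claim that no finite maximum likelihood estimate exists under separation can be checked numerically. The sketch below makes several simplifying assumptions that are not taken from the paper: a single predictor, no intercept, four invented data points, and a Normal(0, 2.5) prior on the slope chosen purely for illustration (the paper's proposed intercept priors are not reproduced here).

```python
import math


def log_likelihood(beta, xs, ys):
    """Bernoulli log-likelihood of the no-intercept model p = sigmoid(beta * x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = beta * x
        # log(sigmoid(z)), written in a numerically stable form
        log_p = -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))
        log_q = log_p - z  # log(1 - sigmoid(z)) = log(sigmoid(z)) - z
        total += log_p if y == 1 else log_q
    return total


def penalized(beta, xs, ys, scale=2.5):
    """Log-likelihood plus an illustrative Normal(0, scale) log-prior (up to a constant)."""
    return log_likelihood(beta, xs, ys) - 0.5 * (beta / scale) ** 2


xs = [-2.0, -1.0, 1.0, 2.0]
separated = [0, 0, 1, 1]    # the sign of x perfectly predicts y
overlapping = [0, 1, 0, 1]  # no value of beta classifies every point correctly

# Under separation the log-likelihood keeps rising toward 0 as beta grows.
for beta in (1.0, 5.0, 25.0):
    print(beta, log_likelihood(beta, xs, separated))

# With the shrinkage prior, a grid search finds an interior (finite) maximizer.
grid = [i / 10 for i in range(0, 501)]
best = max(grid, key=lambda b: penalized(b, xs, separated))
print("penalized maximizer on separated data:", best)
```

On the overlapping data the unpenalized log-likelihood peaks at a finite slope and then falls, which is the ordinary well-behaved case.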
Building A Better Risk Prevention Model, 2018 Houston County Schools
Building A Better Risk Prevention Model, Steven Hornyak
National Youth-At-Risk Conference Savannah
This presentation chronicles the work of Houston County Schools in developing a risk prevention model built on more than ten years of longitudinal student data. In its second year of implementation, Houston At-Risk Profiles (HARP) has proven effective in identifying those students most in need of support and linking them to interventions and supports that lead to improved outcomes and significantly reduce the risk of failure.
Comparing Various Machine Learning Statistical Methods Using Variable Differentials To Predict College Basketball, 2018 The University of Akron
Comparing Various Machine Learning Statistical Methods Using Variable Differentials To Predict College Basketball, Nicholas Bennett
Honors Research Projects
The purpose of this Senior Honors Project is to research, study, and demonstrate newfound knowledge of various machine learning statistical techniques that are not covered in the University of Akron’s statistics major curriculum. This report will be an overview of three machine-learning methods that were used to predict NCAA Basketball results, specifically, the March Madness tournament. The variables used for these methods, models, and tests will include numerous variables kept throughout the season for each team, along with a couple variables that are used by the selection committee when tournament teams are being picked. The end goal is to ...
Exploring Quantitative Timed Up And Go Sensor Data With Statistical Learning Techniques, 2018 University of Windsor
Exploring Quantitative Timed Up And Go Sensor Data With Statistical Learning Techniques, Anthony Wright
Injuries and hospitalizations due to accidental falls among seniors represent a major expense for the Canadian public health system. It is highly desirable to be able to predict risk of falls for senior individuals in order to place them in prevention programs. Recently, sensor technologies have been used to predict risk of falls and levels of frailty of individuals. A commonly used test for assessing risk of falls is known as QTUG (Quantitative 'Timed Up and Go'). The QTUG data often consist of a small set of survey answers about the individuals' historic variables (e.g., number of falls in ...
Penalized Mixed-Effects Ordinal Response Models For High-Dimensional Genomic Data In Twins And Families, 2018 Virginia Commonwealth University
Penalized Mixed-Effects Ordinal Response Models For High-Dimensional Genomic Data In Twins And Families, Amanda E. Gentry
Theses and Dissertations
The Brisbane Longitudinal Twin Study (BLTS) was being conducted in Australia and was funded by the US National Institute on Drug Abuse (NIDA). Adolescent twins were sampled as a part of this study and surveyed about their substance use as part of the Pathways to Cannabis Use, Abuse and Dependence project. The methods developed in this dissertation were designed for the purpose of analyzing a subset of the Pathways data that includes demographics, cannabis use metrics, personality measures, and imputed genotypes (SNPs) for 493 complete twin pairs (986 subjects). The primary goal was to determine what combination of SNPs and ...
Campus Climate Sexual Assault Survey (2015) Analysis, 2018 The University of Akron
Campus Climate Sexual Assault Survey (2015) Analysis, Felicia Rosin
Honors Research Projects
The issue of sexual assault has garnered widespread attention in recent years, as is evident from the growing number of high-profile cases and mainstream social movements. With this increasingly bright spotlight, it is no surprise that The University of Akron has interest in improving the sexual violence education programs offered to students. In 2015, the university conducted a survey to gather information on the campus climate surrounding sexual assault. This project examines the gathered data more deeply in an attempt to pinpoint areas that require the university’s attention. The analysis covers topics identified by Dean of ...
Using Data Analytics For Discovering Library Resource Insights – Case From Singapore Management University, 2017 Singapore Management University
Using Data Analytics For Discovering Library Resource Insights – Case From Singapore Management University, Ning Lu, Rui Song, Dina Heng, Swapna Gottipati, Chee Hsien Aaron (Zheng Zhixian) Tay, Aaron Tay
Research Collection School Of Information Systems
Library resources are critical in supporting teaching, research and learning processes. Several universities have employed online platforms and infrastructure to enable online services for students, faculty and staff. To provide efficient services by understanding and predicting user needs, libraries are looking into the area of data analytics. Library analytics at Singapore Management University is a project committed to providing an interface for data-intensive project collaboration, while supporting one of the library’s key pillars on its commitment to collaborate on initiatives with SMU Communities and external groups. In this paper, we study the transaction logs for user behavior analysis ...
Data-Adaptive Kernel Support Vector Machine, 2017 The University of Western Ontario
Data-Adaptive Kernel Support Vector Machine, Xin Liu
Electronic Thesis and Dissertation Repository
In this thesis, we propose the data-adaptive kernel Support Vector Machine (SVM), a new method with a data-driven scaling kernel function based on real data sets. This two-stage approach of kernel function scaling can enhance the accuracy of a support vector machine, especially when the data are imbalanced. Following the standard SVM procedure in the first stage, the proposed method locally adapts the kernel function to data locations based on the skewness of the class outcomes. In the second stage, the decision rule is constructed with the data-adaptive kernel function and is used as the classifier. This process enlarges ...
Data Envelopment Analysis Using glpkAPI In R, 2017 Portland State University
Data Envelopment Analysis Using glpkAPI In R, Konrad Miziolek, Jordan Beary, Shreyas Vasanth, Surekha Chanamolu, Rudraxi Mitra
Engineering and Technology Management Student Projects
The work done here is primarily a wrapper function written to shield the end-user from some of the more difficult-to-use glpkAPI functionality. The user, when prompted, selects the .mod file configuration appropriate to the task (for example, output-oriented CRS) and the data file, as a .dat. The function then loads the required glpkAPI library and carries the model forward. It allocates the problem and workspace, reads the model file and data file the user selects, builds the problem, and solves it. The function returns primal values and, if dual = TRUE is selected, also returns dual weights.