Cointegration And Statistical Arbitrage Of Precious Metals, 2021 University of Arkansas, Fayetteville
Cointegration And Statistical Arbitrage Of Precious Metals, Judge Van Horn
Finance Undergraduate Honors Theses
When talking about financial instruments, correlation is often thrown around as a measure of the relation between two securities. An often more useful or tradeable measure is cointegration. Cointegration is the measure of two securities' tendency to revert to an average price over time. In other words, cointegration ignores directionality and only cares about the distance between two securities. For a mean reversion strategy such as statistical arbitrage, cointegration proves to be a far more reliable statistical measure than correlation, though it still has its own problems. One thing to consider is ...
Applying Emotional Analysis For Automated Content Moderation, 2021 University of Arkansas, Fayetteville
Applying Emotional Analysis For Automated Content Moderation, John Shelnutt
Computer Science and Computer Engineering Undergraduate Honors Theses
The purpose of this project is to explore the effectiveness of emotional analysis as a means to automatically moderate content or flag content for manual moderation, reducing the workload of human moderators who handle toxic content online. In this context, toxic content is defined as content that features excessive negativity, rudeness, or malice. This often features offensive language or slurs. The work involved in this project included creating a simple website that imitates a social media site or forum with a feed of user-submitted text posts, implementing an emotional analysis algorithm from a word emotions dataset, designing ...
Use Of Linear Discriminant Analysis In Song Classification: Modeling Based On Wilco Albums, 2021 University of Mississippi
Use Of Linear Discriminant Analysis In Song Classification: Modeling Based On Wilco Albums, Caroline Pollard
Music recommender algorithms are a relatively new area of study. Although these algorithms serve a variety of functions, they primarily help advertise and suggest music to users on music streaming services. This thesis explores the use of linear discriminant analysis in music categorization for the purpose of serving as a cheaper and simpler content-based recommender algorithm. The use of linear discriminant analysis was tested by creating linear discriminant functions that classify Wilco’s songs into their respective albums, specifically A.M., Yankee Hotel Foxtrot, and Sky Blue Sky. Four sample songs were chosen from each album, and song ...
How Risk-Related Statistics, As Reported In News And Social Media, Are Linked To The Use Of The Public Transit System, 2021 University of Southern Maine
How Risk-Related Statistics, As Reported In News And Social Media, Are Linked To The Use Of The Public Transit System, Prashiddhi Pokhrel
Thinking Matters Symposium
Due to the pandemic, people have started relying more on television, social media, and other news outlets for guidance. Moreover, with the increasing amount of news, data, and information, there is also an increase in the amount of misleading statistics. People’s opinions and decisions significantly depend on the data, statistics, and information that they are exposed to, as well as their sources. For this project, we want to look at how information and its sources are affecting the decisions made by the general public about the use of the Portland Transit System. It is very important to know ...
Conjunction Of Factors Impacting The 2019-2020 Flu Season In The US, 2021 University of Minnesota - Morris
Conjunction Of Factors Impacting The 2019-2020 Flu Season In The US, Yichen Wang
Undergraduate Research Symposium 2021
The 2019-2020 flu season is regarded as one of the most serious in decades. Previous researchers usually studied the effects of different factors on seasonal flu separately instead of their combined impact, so we wanted to find how multiple factors combine to affect the spread of influenza in the 2019-2020 flu season in America. We chose types of virus (A and B), environmental factors (temperature, precipitation, relative humidity), population density, and influenza vaccination status for different age groups, all statewide data with monthly information from Sep. 2019 to May 2020. By principal component analysis, we could see the ...
Geometric Representation Learning, 2021 University of Massachusetts Amherst
Geometric Representation Learning, Luke Vilnis
Vector embedding models are a cornerstone of modern machine learning methods for knowledge representation and reasoning. These methods aim to turn semantic questions into geometric questions by learning representations of concepts and other domain objects in a lower-dimensional vector space. In that spirit, this work advocates for density- and region-based representation learning. Embedding domain elements as geometric objects beyond a single point enables us to naturally represent breadth and polysemy, make asymmetric comparisons, and answer complex queries, and it provides a strong inductive bias when labeled data is scarce. We present a model for word representation using Gaussian densities, enabling asymmetric entailment ...
NetSci High: Bringing Agency To Diverse Teens Through The Science Of Connected Systems, 2021 New York Hall of Science
NetSci High: Bringing Agency To Diverse Teens Through The Science Of Connected Systems, Stephen M. Uzzo, Catherine B. Cramer, Hiroki Sayama, Russell Faux
Northeast Journal of Complex Systems (NEJCS)
This paper follows NetSci High, a decade-long initiative to inspire teams of teenage researchers to develop, execute and disseminate original research in network science. The project introduced high school students to the computer-based analysis of networks, and instilled in the participants the habits of mind to deepen inquiry in connected systems and statistics, and to sustain interest in continuing to study and pursue careers in fields involving network analysis. Goals of NetSci High ranged from proximal learning outcomes (e.g., increasing high school student competencies in computing and improving student attitudes toward computing) to highly distal (e.g., preparing students ...
Analysis And Publication Profile Of Indonesian Scientific Work In 2020 Based On The Scopus Database, 2021 Universitas Negeri Yogyakarta, Yogyakarta, Indonesia
Analysis And Publication Profile Of Indonesian Scientific Work In 2020 Based On The Scopus Database, Akbar Iskandar, Nico Djundharto Djajasinga, Andi Dirga Noegraha, Erwin Gatot, Ansari Saleh Ahmar
Library Philosophy and Practice (e-journal)
This research was conducted to identify and describe the profile of publications in Indonesia in 2020. This research used bibliometric methods. The data in this research were collected by searching the Scopus database with the keywords AFFILCOUNTRY “Indonesia” and PUBYEAR “2020”, excluding any AFFILCOUNTRY other than “Indonesia”. Data were then analyzed based on author affiliation, subject, document type, source type, source title, and language. The results of the research indicated that the development of Indonesian scientific publications was dominated by article types (50.69%) and conference papers (45.83%) with the subject area of publication dominated ...
Nondominant Hand Computer Mouse Training And The Bilateral Transfer Effect To The Dominant Hand, 2021 Iowa State University
Nondominant Hand Computer Mouse Training And The Bilateral Transfer Effect To The Dominant Hand, Drew Schweiger, Richard T. Stone, Ulrike Genschel
Industrial and Manufacturing Systems Engineering Publications
This study explored the effects of training computer mouse use in the nondominant hand on clicking performance of the dominant and nondominant hands. Computer mouse use is a daily operation in the workplace and requires minute hand and wrist movements developed and refined through practice and training for many years. Our study had eleven right-handed computer mouse users train their nondominant hand for 15 min a day, five days per week, for six weeks. This study found improved performance with the computer mouse in the dominant hand following nondominant hand training because of the bilateral transfer effect of training. Additionally ...
The Mean-Reverting 4/2 Stochastic Volatility Model: Properties And Financial Applications, 2021 The University of Western Ontario
The Mean-Reverting 4/2 Stochastic Volatility Model: Properties And Financial Applications, Zhenxian Gong
Electronic Thesis and Dissertation Repository
Financial markets and instruments are continuously evolving, displaying new and more refined stylized facts. This requires regular reviews and empirical evaluations of advanced models. There is evidence in the literature that supports stochastic volatility models over constant volatility models in capturing stylized facts such as the "smile" and "skew" present in implied volatility surfaces. In this thesis, we target commodity and volatility index markets, and develop a novel stochastic volatility model that incorporates a mean-reverting property and a 4/2 stochastic volatility process. Commodities and volatility indexes have been shown to be mean-reverting, which means their prices tend to revert to their long term ...
SARS-CoV-2 Pandemic Analytical Overview With Machine Learning Predictability, 2021 Southern Methodist University
SARS-CoV-2 Pandemic Analytical Overview With Machine Learning Predictability, Anthony Tanaydin, Jingchen Liang, Daniel W. Engels
SMU Data Science Review
Understanding diagnostic tests and examining important features of novel coronavirus (COVID-19) infection are essential steps for controlling the current pandemic of 2020. In this paper, we study the relationship between clinical diagnosis and analytical features of patient blood panels from the US, Mexico, and Brazil. Our analysis confirms that among adults, the risk of severe illness from COVID-19 increases with pre-existing conditions such as diabetes and immunosuppression. More than eight months into the pandemic, more data have become available indicating that more young adults were getting infected. In addition, we expand on the definition of COVID-19 test and discuss ...
Bias Of Rank Correlation Under A Mixture Model, 2021 Georgia Southern University
Bias Of Rank Correlation Under A Mixture Model, Russell Land
Electronic Theses and Dissertations
This thesis project will analyze the bias in mixture models when contaminated data is present. Specifically, we will analyze the relationship between the bias and the mixing proportion, p, for the rank correlation methods Spearman’s Rho and Kendall’s Tau. We will first look at the history of the two non-parametric rank correlation methods and the sample and population definitions will be introduced. Copulas will be introduced to show a few ways we can define these correlation methods. After that, mixture models will be defined and the main theorem will be stated and proved. As an example, we will ...
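As a rough illustration of the two rank correlation methods named in this abstract, the following minimal sketch computes the sample versions of Spearman's rho and Kendall's tau (the tau-a form) in plain Python and shows how a single contaminated observation shifts both. The function names and the toy data are assumptions made for this example; the thesis's mixture model, mixing proportion p, and main theorem are not reproduced here.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    score = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        score += (product > 0) - (product < 0)
    n = len(x)
    return score / (n * (n - 1) / 2)


# Hypothetical toy data: a monotone relationship, then one contaminated point.
x = [1, 2, 3, 4, 5]
y_clean = [1, 4, 9, 16, 25]
y_contaminated = [1, 4, 9, 16, 0]

print(spearman_rho(x, y_clean), kendall_tau(x, y_clean))
print(spearman_rho(x, y_contaminated), kendall_tau(x, y_contaminated))
```

On the clean data both measures equal 1. After the last observation is replaced, Spearman's rho drops to 0 and Kendall's tau to 0.2, which gives a small-sample feel for how contamination can pull the two measures by different amounts.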
Analysis And Implementation Of The Maximum Likelihood Expectation Maximization Algorithm For FIND, 2020 University of New Hampshire
Analysis And Implementation Of The Maximum Likelihood Expectation Maximization Algorithm For FIND, Angus Boyd Jameson
Student Research Projects
This thesis presents an organized explanation and breakdown of the Maximum Likelihood Expectation Maximization image reconstruction algorithm. This background research was used to develop a means of implementing the algorithm into the imaging code for UNH's Field Deployable Imaging Neutron Detector to improve its ability to resolve complex neutron sources. This thesis provides an overview of this implementation scheme and includes the results of a couple of reconstruction tests for the algorithm. A discussion is given on the current state of the algorithm and its integration with the neutron detector system, and suggestions are given for how the work ...
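For readers unfamiliar with the algorithm named in this abstract, here is a minimal sketch of the generic textbook MLEM multiplicative update, written in plain Python. It is not the detector's imaging code: the function name `mlem`, the 3x3 system matrix, and the noise-free measurements are all hypothetical values chosen for illustration.

```python
def mlem(system_matrix, measured, iterations=200):
    """Generic MLEM update for measured ~ Poisson(system_matrix @ image).

    system_matrix[i][j] is the assumed probability that an emission from
    image element j is recorded in detector bin i.
    """
    n_bins, n_pixels = len(system_matrix), len(system_matrix[0])
    # Sensitivity of each image element: the column sums of the system matrix.
    sensitivity = [sum(system_matrix[i][j] for i in range(n_bins))
                   for j in range(n_pixels)]
    image = [1.0] * n_pixels  # strictly positive starting estimate
    for _ in range(iterations):
        # Forward-project the current estimate into detector space.
        projection = [sum(system_matrix[i][j] * image[j] for j in range(n_pixels))
                      for i in range(n_bins)]
        # Ratio of measured counts to predicted counts, bin by bin.
        ratio = [measured[i] / projection[i] if projection[i] > 0 else 0.0
                 for i in range(n_bins)]
        # Back-project the ratios and apply the multiplicative correction.
        image = [image[j] / sensitivity[j]
                 * sum(system_matrix[i][j] * ratio[i] for i in range(n_bins))
                 for j in range(n_pixels)]
    return image


# Hypothetical toy problem: three source elements, three detector bins.
A = [[0.8, 0.2, 0.0],
     [0.1, 0.7, 0.2],
     [0.1, 0.1, 0.8]]
true_image = [2.0, 5.0, 3.0]
y = [sum(A[i][j] * true_image[j] for j in range(3)) for i in range(3)]

estimate = mlem(A, y)
print(estimate)
```

Because the update is multiplicative, a positive starting image stays non-negative, and each iteration preserves the total measured counts. On this noise-free toy problem the estimate approaches the assumed source intensities; with real, noisy data the iteration count usually has to be chosen with more care.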
Bayesian Semi-Supervised Keyphrase Extraction And Jackknife Empirical Likelihood For Assessing Heterogeneity In Meta-Analysis, 2020 Southern Methodist University
Bayesian Semi-Supervised Keyphrase Extraction And Jackknife Empirical Likelihood For Assessing Heterogeneity In Meta-Analysis, Guanshen Wang
Statistical Science Theses and Dissertations
This dissertation investigates: (1) A Bayesian Semi-supervised Approach to Keyphrase Extraction with Only Positive and Unlabeled Data, (2) Jackknife Empirical Likelihood Confidence Intervals for Assessing Heterogeneity in Meta-analysis of Rare Binary Events.
In the big data era, people are blessed with a huge amount of information. However, the availability of information may also pose great challenges. One big challenge is how to extract useful yet succinct information in an automated fashion. As one of the first few efforts, keyphrase extraction methods summarize an article by identifying a list of keyphrases. Many existing keyphrase extraction methods focus on the unsupervised setting ...
Improved Statistical Methods For Time-Series And Lifetime Data, 2020 Southern Methodist University
Improved Statistical Methods For Time-Series And Lifetime Data, Xiaojie Zhu
Statistical Science Theses and Dissertations
In this dissertation, improved statistical methods for time-series and lifetime data are developed. First, an improved trend test for time series data is presented. Then, robust parametric estimation methods based on system lifetime data with known system signatures are developed.
In the first part of this dissertation, we consider a test for the monotonic trend in time series data proposed by Brillinger (1989). It has been shown that when there are highly correlated residuals or short record lengths, Brillinger’s test procedure tends to have a significance level much higher than the nominal level. This could be related to the discrepancy ...
Confirmative Evaluation: New CIPP Evaluation Model, 2020 Wayne State University
Confirmative Evaluation: New CIPP Evaluation Model, Tia L. Finney
Journal of Modern Applied Statistical Methods
Struggling trainees often require a substantial investment of time, effort, and resources from medical educators. An emergent challenge involves developing effective ways to accurately identify struggling students and better understand the primary causal factors underlying their poor performance. Identifying the potential reasons for poor performance in medical school is a key first step in developing suitable remediation plans. The SOM Modified Program is a remediation program that aims to ensure academic success for medical students. The purpose of this study is to determine the impact of modifying the CIPP evaluation model by adding a confirmative evaluation step to the model ...
Quantifying The Simultaneous Effect Of Socio-Economic Predictors And Build Environment On Spatial Crime Trends, 2020 University of Arkansas, Fayetteville
Quantifying The Simultaneous Effect Of Socio-Economic Predictors And Build Environment On Spatial Crime Trends, Alfieri Daniel Ek
Theses and Dissertations
Proper allocation of law enforcement agencies falls under the umbrella of risk terrain modeling (Caplan et al., 2011, 2015; Drawve, 2016) that primarily focuses on crime prediction and prevention by spatially aggregating response and predictor variables of interest. Although mental health incidents demand resource allocation from law enforcement agencies and the city, relatively less emphasis has been placed on building spatial models for mental health incident events. Analyzing spatial mental health events in Little Rock, AR from 2015 to 2018, we found evidence of spatial heterogeneity via Moran’s I statistic. A spatial modeling framework is then built using generalized linear ...
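Moran's I, the statistic this abstract mentions, can be written as I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where w_ij are spatial weights and W is their total. The sketch below implements that standard formula in plain Python. The four-region layout, the adjacency weights, and the counts are hypothetical and have no connection to the Little Rock data.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations over all ordered pairs (i, j).
    cross = sum(weights[i][j] * deviations[i] * deviations[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in deviations)
    return (n / total_weight) * cross / variance_sum


# Hypothetical layout: four regions in a row, each adjacent to its neighbours.
adjacency = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]

clustered = [1, 1, 5, 5]      # similar counts sit next to each other
alternating = [1, 5, 1, 5]    # neighbours always differ

print(morans_i(clustered, adjacency))
print(morans_i(alternating, adjacency))
```

The clustered layout gives a positive value (1/3 here) and the alternating layout gives -1, matching the usual reading of positive values as spatial clustering and negative values as dispersion. Assessing significance normally requires a permutation test or an analytical variance, which this sketch leaves out.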
Comparative Evaluation Of Statistical Dependence Measures, 2020 University of Arkansas, Fayetteville
Comparative Evaluation Of Statistical Dependence Measures, Eman Abdel Rahman Ibrahim
Theses and Dissertations
Measuring and testing dependence between random variables is of great importance in many scientific fields. In the case of linearly correlated variables, Pearson’s correlation coefficient is a commonly used measure of the correlation strength. In the case of nonlinear correlation, several innovative measures have been proposed, such as distance-based correlation, rank-based correlations, and information theory-based correlation. This thesis focuses on the statistical comparison of several important correlations, including Spearman’s correlation, mutual information, maximal information coefficient, biweight midcorrelation, distance correlation, and copula correlation, under various simulation settings such as correlative patterns and the level of random noise. Furthermore, we ...
Developing A Tourism Opportunity Index Regarding The Prospective Of Overtourism In Nepal, 2020 Missouri State University
Developing A Tourism Opportunity Index Regarding The Prospective Of Overtourism In Nepal, Susan Phuyal
MSU Graduate Theses
This research explores Nepal's overtourism scenario based on the capacity of a locality to manage sustainable tourism practices. Environmental degradation, local infrastructure degradation, negative tourist experience, and local resident responses regarding visitors are the four main variables used in this study to analyze overtourism. To analyze the case study of overtourism, we select the three top tourist cities of Nepal (Kathmandu, Pokhara, and Chitwan) based on the number of annual visitors. Nepal's case analysis of overtourism conditions reviews the overall threat of overtourism and establishes a metric by which tourism can be viewed as potentially detrimental ...
Incorporating Shear Resistance Into Debris Flow Triggering Model Statistics, 2020 California Polytechnic State University, San Luis Obispo
Incorporating Shear Resistance Into Debris Flow Triggering Model Statistics, Noah J. Lyman
Several regions of the Western United States utilize statistical binary classification models to predict and manage debris flow initiation probability after wildfires. As the occurrence of wildfires and high-intensity rainfall events increases, so does the frequency with which development occurs in the steep and mountainous terrain where these events arise. This resulting intersection brings with it an increasing need to derive improved results from existing models, or develop new models, to reduce the economic and human impacts that debris flows may bring. Any development or change to these models could also theoretically increase the ease of collection, processing, and ...