Group-Lasso Estimation In High-Dimensional Factor Models With Structural Breaks, 2018 University of Windsor

#### Group-Lasso Estimation In High-Dimensional Factor Models With Structural Breaks, Yujie Song

*Major Papers*

In this major paper, we study the influence of structural breaks in a financial market model with high-dimensional data. We present a model capable of detecting changes in factor loadings, determining the number of factors, and detecting the break date. We consider both the known and the unknown break-date cases and identify the type of instability. For the unknown break-date case, we propose a group-LASSO estimator to determine the number of pre- and post-break factors, the break date, and the existence of instability in the factor loadings when the number of factors is constant. We ...
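To illustrate the group-LASSO idea named in this abstract, the sketch below shows the standard group soft-thresholding (proximal) step, which shrinks a whole block of coefficients toward zero and sets it exactly to zero once its Euclidean norm falls below the penalty level. It is not the estimator from the paper; the function name `group_soft_threshold` and the example values are hypothetical.

```python
import math


def group_soft_threshold(block, penalty):
    """Proximal operator of penalty * ||block||_2 (illustrative only)."""
    norm = math.sqrt(sum(b * b for b in block))
    if norm <= penalty:
        # The whole group is removed at once, which is what makes the
        # penalty select or drop coefficients group-wise.
        return [0.0] * len(block)
    scale = 1.0 - penalty / norm
    return [scale * b for b in block]


# A block with a large norm is shrunk but kept; a small one is zeroed.
large = group_soft_threshold([3.0, 4.0], 2.5)
small = group_soft_threshold([0.3, 0.4], 1.0)
```

Here `large` keeps its direction with half its length, while `small` is set to zero as a group.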

Estimation In High-Dimensional Factor Models With Structural Instabilities, 2018 University of Windsor

#### Estimation In High-Dimensional Factor Models With Structural Instabilities, Wen Gao

*Major Papers*

In this major paper, we use high-dimensional models to analyze macroeconomic data that are influenced by a break point. In particular, we detect the break point and study how the number of factors and the factor loadings change under structural instability.

Concretely, we propose two factor models that describe the pre- and post-break periods. We then treat the break point as either known or unknown. In both situations, we derive shrinkage estimators by minimizing a penalized least squares function and calculate the estimators of the numbers of pre- and post-break factors ...

Snakebite Dynamics Of Colombia: Effects Of Precipitation Seasonality Of Incidence, 2018 Illinois State University

#### Snakebite Dynamics Of Colombia: Effects Of Precipitation Seasonality Of Incidence, Carlos Cruz

*Annual Symposium on Biomathematics and Ecology: Education and Research*

No abstract provided.

Identifying Treatment Effects In The Presence Of Confounded Types, 2018 Iowa State University

#### Identifying Treatment Effects In The Presence Of Confounded Types, Desire Kedagni

*Economics Working Papers*

In this paper, I consider identification of treatment effects when the treatment is endogenous. The use of instrumental variables is a popular solution to deal with endogeneity, but this may give misleading answers when the instrument is invalid. I show that when the instrument is invalid due to correlation with the first stage unobserved heterogeneity, a second (also possibly invalid) instrument makes it possible to partially identify not only the local average treatment effect but also the entire potential outcomes distributions for compliers. I exploit the fact that the distribution of the observed outcome in each group defined by the treatment and ...

Overcoming Small Data Limitations In Heart Disease Prediction By Using Surrogate Data, 2018 Southern Methodist University

#### Overcoming Small Data Limitations In Heart Disease Prediction By Using Surrogate Data, Alfeo Sabay, Laurie Harris, Vivek Bejugama, Karen Jaceldo-Siegl

*SMU Data Science Review*

In this paper, we present a heart disease prediction use case showing how synthetic data can be used to address privacy concerns and overcome constraints inherent in small medical research data sets. While advanced machine learning algorithms, such as neural network models, can be implemented to improve prediction accuracy, these require very large data sets, which are often not available in medical or clinical research. We examine the use of surrogate data sets composed of synthetic observations for modeling heart disease prediction. We generate surrogate data, based on the characteristics of original observations, and compare prediction accuracy results achieved from ...

Minimizing The Perceived Financial Burden Due To Cancer, 2018 Southern Methodist University

#### Minimizing The Perceived Financial Burden Due To Cancer, Hassan Azhar, Zoheb Allam, Gino Varghese, Daniel W. Engels, Sajiny John

*SMU Data Science Review*

In this paper, we present a regression model that predicts the perceived financial burden that a cancer patient experiences in the treatment and management of the disease. Cancer patients do not fully understand the burden associated with the cost of cancer, and their lack of understanding can increase the difficulties associated with living with the disease, in particular coping with the cost. The relationship between demographic characteristics and financial burden was examined in order to better understand the characteristics of a cancer patient and their burden, while all-subsets regression was used to determine the best predictors of financial burden. Age ...

Yelp's Review Filtering Algorithm, 2018 Southern Methodist University

#### Yelp's Review Filtering Algorithm, Yao Yao, Ivelin Angelov, Jack Rasmus-Vorrath, Mooyoung Lee, Daniel W. Engels

*SMU Data Science Review*

In this paper, we present an analysis of features influencing Yelp's proprietary review filtering algorithm. Classifying or misclassifying reviews as *recommended* or *non-recommended* affects average ratings, consumer decisions, and ultimately, business revenue. Our analysis involves systematically sampling and scraping Yelp restaurant reviews. Features are extracted from review metadata and engineered from metrics and scores generated using text classifiers and sentiment analysis. The coefficients of a multivariate logistic regression model were interpreted as quantifications of the relative importance of features in classifying reviews as recommended or non-recommended. The model classified review recommendations with an accuracy of 78%. We found that ...

Cryptocurrency Price Prediction Using Tweet Volumes And Sentiment Analysis, 2018 Southern Methodist University

#### Cryptocurrency Price Prediction Using Tweet Volumes And Sentiment Analysis, Jethin Abraham, Daniel Higdon, John Nelson, Juan Ibarra

*SMU Data Science Review*

In this paper, we present a method for predicting changes in Bitcoin and Ethereum prices utilizing Twitter data and Google Trends data. Bitcoin and Ethereum, the two largest cryptocurrencies in terms of market capitalization, represent over \$160 billion in combined value. However, both Bitcoin and Ethereum have experienced significant price swings in both daily and long-term valuations. Twitter is increasingly used as a news source influencing purchase decisions by informing users of the currency and its increasing popularity. As a result, quickly understanding the impact of tweets on price direction can provide a purchasing and selling advantage to ...

Coastal Wetland Dynamics Under Sea-Level Rise And Wetland Restoration In The Northern Gulf Of Mexico Using Bayesian Multilevel Models And A Web Tool, 2018 The University of Southern Mississippi

#### Coastal Wetland Dynamics Under Sea-Level Rise And Wetland Restoration In The Northern Gulf Of Mexico Using Bayesian Multilevel Models And A Web Tool, Tyler Hardy

*Master's Theses*

There is currently no modeling framework that predicts, with uncertainties accounted for, how relative sea-level rise (SLR) combined with restoration activities affects coastal wetland landscapes across the entire northern Gulf of Mexico (NGOM). I developed such a framework, Bayesian multi-level models, to study the spatial pattern of wetland loss in the NGOM, driven by relative SLR, vegetation productivity, tidal range, coastal slope, and wave height, all interacting with river-borne sediment availability, indicated by hydrological regimes. These interactions have not been comprehensively investigated before. I further modified this model to assess the efficacy of restoration projects from ...

A Bayesian State-Space Model Using Age-At-Harvest Data For Estimating The Population Of Black Bears (Ursus Americanus) In Wisconsin, 2018 University of Illinois at Urbana-Champaign

#### A Bayesian State-Space Model Using Age-At-Harvest Data For Estimating The Population Of Black Bears (Ursus Americanus) In Wisconsin, Maximilian L. Allen, Andrew S. Norton, Glenn Stauffer, Nathan M. Roberts, Yanshi Luo, Qing Li, David Macfarland, Timothy R. Van Deelen

*Industrial and Manufacturing Systems Engineering Publications*

Population estimation is essential for the conservation and management of fish and wildlife, but accurate estimates are often difficult or expensive to obtain for cryptic species across large geographical scales. Accurate statistical models with manageable financial costs and field efforts are needed for hunted populations and using age-at-harvest data may be the most practical foundation for these models. Several rigorous statistical approaches that use age-at-harvest and other data to accurately estimate populations have recently been developed, but these are often dependent on (a) accurate prior knowledge about demographic parameters of the population, (b) auxiliary data, and (c) initial population size ...

Testing Hypotheses Of Covariance Structure In Multivariate Data, 2018 NOVA University of Lisbon

#### Testing Hypotheses Of Covariance Structure In Multivariate Data, Miguel Fonseca, Arkadiusz Koziol, Roman Zmyslony

*Electronic Journal of Linear Algebra*

This paper gives a new approach for testing hypotheses on the structure of covariance matrices in double multivariate data. It is proved that the ratio of the positive and negative parts of best unbiased estimators (BUE) provides an F-test for independence of block variables in double multivariate models.

Application Of Jordan Algebra For Testing Hypotheses About Structure Of Mean Vector In Model With Block Compound Symmetric Covariance Structure, 2018 Faculty of Mathematics, Computer Science and Econometrics, University of Zielona Góra

#### Application Of Jordan Algebra For Testing Hypotheses About Structure Of Mean Vector In Model With Block Compound Symmetric Covariance Structure, Roman Zmyślony, Ivan Zezula, Arkadiusz Kozioł

*Electronic Journal of Linear Algebra*

In this article, the authors derive a test for the structure of the mean vector in a model with block compound symmetric covariance structure for two-level multivariate observations. One possible structure is the so-called structured mean vector, whose components remain constant over sites or over time points, so that the mean vector is of the form $\boldsymbol{1}_{u}\otimes\boldsymbol{\mu}$ with $\boldsymbol{\mu}=(\mu_1,\mu_2,\ldots,\mu_m)'\in\mathbb{R}^m$. This hypothesis is tested against the alternative of an unstructured mean vector, which can change over sites or over time points.
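As a small worked instance of the structured mean vector, assume $u=2$ sites and $m=3$ components (values chosen only for illustration). The Kronecker product then stacks the same $\boldsymbol{\mu}$ once per site:

$$
\boldsymbol{1}_{2}\otimes\boldsymbol{\mu}
= \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes
  \begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix}
= (\mu_1,\mu_2,\mu_3,\mu_1,\mu_2,\mu_3)'
$$

Under the unstructured alternative, the second block of three entries would be free to differ from the first.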

Applications Of Game Theory, Tableau, Analytics, And R To Fashion Design, 2018 Atlanta University Center

#### Applications Of Game Theory, Tableau, Analytics, And R To Fashion Design, Aisha Asiri

*Electronic Theses & Dissertations Collection for Atlanta University & Clark Atlanta University*

This thesis presents various models to the fashion industry to predict the profits for some products. To determine the expected performance of each product in 2016, we used tools of game theory to help us identify the expected value. We went further and performed a simple linear regression and used scatter plots to help us predict further the performance of the products of Prada. We used tools of game theory, analytics, and statistics to help us predict the performance of some of Prada's products. We also used the Tableau platform to visualize an overview of the products' performances. All ...

Deep Machine Learning For Mechanical Performance And Failure Prediction, 2018 Penn State University

#### Deep Machine Learning For Mechanical Performance And Failure Prediction, Elijah Reber, Nickolas D. Winovich, Guang Lin

*The Summer Undergraduate Research Fellowship (SURF) Symposium*

Deep learning has provided opportunities for advancement in many fields. One such opportunity is being able to accurately predict real world events. Ensuring proper motor function and being able to predict energy output is a valuable asset for owners of wind turbines. In this paper, we look at how effective a deep neural network is at predicting the failure or energy output of a wind turbine. A data set was obtained that contained sensor data from 17 wind turbines over 13 months, measuring numerous variables, such as spindle speed and blade position and whether or not the wind turbine experienced ...

EFVS Effects On Pilot Performance, 2018 Purdue University

#### EFVS Effects On Pilot Performance, Michael Campbell, Nsikak Udo-Imeh, Steven J. Landry

*The Summer Undergraduate Research Fellowship (SURF) Symposium*

Flight tests have been conducted at Purdue University using a computer-based flying simulator in an attempt to determine and measure the effects of Enhanced Flight Vision Systems (EFVS) on the performance of pilots during landing. Knowledge of these effects could help guide future design and implementation of EFVS in modern commercial aircraft, and further increase pilots’ ability to control the aircraft in low-visibility conditions. The problem that has faced researchers in the past has revolved around the difficulty in interpreting the data which is generated by these tests. The difficulty in making a generalized conclusion based on the large amount ...

Wald Confidence Intervals For A Single Poisson Parameter And Binomial Misclassification Parameter When The Data Is Subject To Misclassification, 2018 Stephen F Austin State University

#### Wald Confidence Intervals For A Single Poisson Parameter And Binomial Misclassification Parameter When The Data Is Subject To Misclassification, Nishantha Janith Chandrasena Poddiwala Hewage

*Electronic Theses and Dissertations*

This thesis is based on a Poisson model that uses both error-free data and error-prone data subject to misclassification in the form of false-negative and false-positive counts. We present maximum likelihood estimators (MLEs), Fisher information, and Wald statistics for the Poisson rate parameter and the two misclassification parameters. Next, we invert the Wald statistics to get asymptotic confidence intervals for the Poisson rate parameter and the false-negative rate parameter. The coverage and width properties for various sample sizes and parameter configurations are studied via a simulation study. Finally, we apply the MLEs and confidence intervals to one real data set and another ...
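For orientation, inverting a Wald statistic in the simplest setting, a Poisson rate with no misclassification, gives the interval $\hat\lambda \pm z\sqrt{\hat\lambda/n}$. The sketch below implements only that textbook case, not the thesis's misclassification model; the function name `wald_ci_poisson_rate` and the sample counts are hypothetical.

```python
import math
from statistics import NormalDist


def wald_ci_poisson_rate(counts, level=0.95):
    """Wald interval for a Poisson rate from i.i.d. counts.

    Simplifying assumption: counts are error-free, so this ignores the
    false-negative and false-positive structure the thesis studies.
    """
    n = len(counts)
    lam_hat = sum(counts) / n                # MLE of the rate
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(lam_hat / n)  # z times the estimated standard error
    return lam_hat - half_width, lam_hat + half_width


lower, upper = wald_ci_poisson_rate([3, 5, 4, 6, 2])
```

With these five counts the estimate is 4 and the 95% interval is roughly (2.25, 5.75).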

Generalized Spatiotemporal Modeling And Causal Inference For Assessing Treatment Effects For Multiple Groups For Ordinal Outcome, 2018 University of Louisville

#### Generalized Spatiotemporal Modeling And Causal Inference For Assessing Treatment Effects For Multiple Groups For Ordinal Outcome, Soutik Ghosal

*Electronic Theses and Dissertations*

This dissertation consists of three projects and can be categorized into two broad research areas: generalized spatiotemporal modeling and causal inference based on observational data. In the first project, I introduce a Bayesian hierarchical mixed effect hurdle model with a nested random effect structure to model the count of primary care providers and understand their spatial and temporal variation. This study further enables us to identify the health professional shortage areas and the possible impacting factors. In the second project, I have unified popular parametric and nonparametric propensity score-based methods to assess the treatment effect of multiple groups for ordinal ...

Distribution Of A Sum Of Random Variables When The Sample Size Is A Poisson Distribution, 2018 East Tennessee State University

#### Distribution Of A Sum Of Random Variables When The Sample Size Is A Poisson Distribution, Mark Pfister

*Electronic Theses and Dissertations*

A probability distribution is a statistical function that describes the probability of possible outcomes in an experiment or occurrence. There are many different probability distributions that give the probability of an event happening, given some sample size *n*. An important question in statistics is to determine the distribution of the sum of independent random variables when the sample size *n* is fixed. For example, it is known that the sum of *n* independent Bernoulli random variables with success probability *p* has a Binomial distribution with parameters *n* and *p*. However, this is not true when the sample size is not ...
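The abstract is cut off before stating its result, so the following is not taken from the thesis. As a standard fact (Poisson thinning), if the number of terms $N$ is Poisson with mean $\lambda$ and each term is an independent Bernoulli($p$) variable, the sum is Poisson with mean $\lambda p$. The sketch below checks this numerically by mixing Binomial probabilities over $N$; `lam`, `p`, and the truncation point `n_max` are illustrative choices.

```python
import math


def poisson_pmf(k, lam):
    # Computed in log space to avoid overflow in the factorial.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def random_size_sum_pmf(k, lam, p, n_max=100):
    """P(S = k) where S is a sum of N Bernoulli(p) terms, N ~ Poisson(lam).

    The infinite sum over N is truncated at n_max (an assumption that is
    harmless when lam is much smaller than n_max).
    """
    return sum(
        poisson_pmf(n, lam) * binomial_pmf(k, n, p) for n in range(k, n_max + 1)
    )


lam, p = 4.0, 0.3
# Largest difference between the mixture and Poisson(lam * p) for k = 0..10.
max_gap = max(
    abs(random_size_sum_pmf(k, lam, p) - poisson_pmf(k, lam * p))
    for k in range(11)
)
```

The gap is at the level of floating-point rounding, consistent with the two distributions being identical.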

Bayesian Sparse Propensity Score Estimation For Unit Nonresponse, 2018 Iowa State University

#### Bayesian Sparse Propensity Score Estimation For Unit Nonresponse, Hejian Sang, Gyuhyeong Goh, Jae Kwang Kim

*Statistics Preprints*

Nonresponse weighting adjustment using propensity score is a popular method for handling unit nonresponse. However, including all available auxiliary variables into the propensity model can lead to inefficient and inconsistent estimation, especially with high-dimensional covariates. In this paper, a new Bayesian method using the Spike-and-Slab prior is proposed for sparse propensity score estimation. The proposed method is not based on any model assumption on the outcome variable and is computationally efficient. Instead of doing model selection and parameter estimation separately as in many frequentist methods, the proposed method simultaneously selects the sparse response probability model and provides consistent parameter ...

Fuel Flow Reduction Impact Analysis Of Drag Reducing Film Applied To Aircraft Wings, 2018 Southern Methodist University

#### Fuel Flow Reduction Impact Analysis Of Drag Reducing Film Applied To Aircraft Wings, Damon Resnick, Chris Donlan, Nimish Sakalle, Cody Pinkerman

*SMU Data Science Review*

In this paper, we present an analysis of flight data in order to determine whether the application of the Edge Aerodynamix Conformal Vortex Generator (CVG), applied to the wings of aircraft, reduces fuel flow during cruising conditions of flight. The CVG is a special treatment and film applied to the wings of an aircraft to protect the wings and reduce the non-laminar flow of air around the wings during flight. It is thought that by reducing the non-laminar flow or vortices around and directly behind the wings that an aircraft will move more smoothly through the air and provide a ...