At The Interface Of Algebra And Statistics, 2020 The Graduate Center, City University of New York
At The Interface Of Algebra And Statistics, Tai-Danae Bradley
All Dissertations, Theses, and Capstone Projects
This thesis takes inspiration from quantum physics to investigate mathematical structure that lies at the interface of algebra and statistics. The starting point is a passage from classical probability theory to quantum probability theory. The quantum version of a probability distribution is a density operator, the quantum version of marginalizing is an operation called the partial trace, and the quantum version of a marginal probability distribution is a reduced density operator. Every joint probability distribution on a finite set can be modeled as a rank one density operator. By applying the partial trace, we obtain reduced density operators whose diagonals ...
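The classical-to-quantum passage described in this abstract can be checked numerically on a toy case. The sketch below is illustrative only and is not taken from the thesis: the 2x2 joint distribution and the function name `reduced_density_x` are made up for the example. It encodes a joint distribution p(x, y) as the unit vector with amplitudes sqrt(p(x, y)), forms the corresponding rank-one density operator implicitly, applies the partial trace over Y, and compares the diagonal of the result with the classical marginal on X.

```python
import math

def reduced_density_x(joint):
    """Partial trace over Y of the rank-one density operator |psi><psi|,
    where psi[x][y] = sqrt(p(x, y)) and joint[x][y] = p(x, y).
    Entry (i, j) of the result is sum_y psi[i][y] * psi[j][y]."""
    psi = [[math.sqrt(p) for p in row] for row in joint]
    n = len(psi)
    return [[sum(a * b for a, b in zip(psi[i], psi[j])) for j in range(n)]
            for i in range(n)]

# Hypothetical joint distribution on {0, 1} x {0, 1}; entries sum to 1.
joint = [[0.1, 0.2],
         [0.3, 0.4]]

rho_x = reduced_density_x(joint)
diagonal = [rho_x[i][i] for i in range(len(rho_x))]
marginal_x = [sum(row) for row in joint]  # classical marginal on X

print(diagonal)    # approximately [0.3, 0.7]
print(marginal_x)  # approximately [0.3, 0.7]
```

The off-diagonal entries of `rho_x` are generally nonzero, so the reduced operator holds more than the marginal alone.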
Waiting-Time Paradox In 1922, 2020 University at Buffalo
Waiting-Time Paradox In 1922, Naoki Masuda, Takayuki Hiraoka
Northeast Journal of Complex Systems (NEJCS)
We present an English translation and discussion of an essay that a Japanese physicist, Torahiko Terada, wrote in 1922. In the essay, he described the waiting-time paradox, also called the bus paradox, which is a known mathematical phenomenon in queuing theory, stochastic processes, and modern temporal network analysis. He also observed and analyzed data on Tokyo City trams to verify the relevance of the waiting-time paradox to busy passengers in Tokyo at the time. This essay seems to be one of the earliest documentations of the waiting-time paradox in a sufficiently scientific manner.
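The paradox itself is easy to reproduce with a short calculation. The sketch below is not from the essay or the article: the headway values are invented, and `mean_wait` is a hypothetical helper. It assumes a passenger arrives at a uniformly random moment, in which case the mean wait is the sum of squared headways divided by twice the sum of headways, which exceeds half the average headway whenever the headways vary.

```python
def mean_wait(headways):
    """Mean waiting time for a passenger arriving at a uniformly random
    moment, given the gaps (headways) between consecutive trams.
    A gap of length h is hit with probability proportional to h, and the
    mean wait inside it is h / 2, giving sum(h^2) / (2 * sum(h))."""
    return sum(h * h for h in headways) / (2 * sum(headways))

# Hypothetical headways in minutes; both schedules average 5 minutes.
regular = [5, 5, 5, 5]
bunched = [2, 2, 2, 14]

print(mean_wait(regular))  # 2.5, half the average headway
print(mean_wait(bunched))  # 5.2, longer than the naive 2.5
```

Both schedules run four trams in 20 minutes, yet the bunched schedule more than doubles the typical wait because a random arrival is most likely to land in the long gap.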
Statistical Models And Analysis Of Univariate And Multivariate Degradation Data, 2020 Southern Methodist University
Statistical Models And Analysis Of Univariate And Multivariate Degradation Data, Lochana Palayangoda
Statistical Science Theses and Dissertations
For degradation data in reliability analysis, estimation of the first-passage time (FPT) distribution to a threshold provides valuable information on reliability characteristics. Recently, Balakrishnan and Qin (2019; Applied Stochastic Models in Business and Industry, 35:571-590) studied a nonparametric method to approximate the FPT distribution of such degradation processes if the underlying process type is unknown. In this thesis, we propose improved techniques based on saddlepoint approximation, which enhance upon their suggested methods. Numerical examples and Monte Carlo simulation studies are used to illustrate the advantages of the proposed techniques. Limitations of the improved techniques are discussed and some possible ...
Gait Characterization Using Computer Vision Video Analysis, 2020 College of William and Mary
Gait Characterization Using Computer Vision Video Analysis, Martha T. Gizaw
Undergraduate Honors Theses
The World Health Organization reports that falls are the second-leading cause of accidental death among senior adults around the world. Currently, a research team at William & Mary’s Department of Kinesiology & Health Sciences attempts to recognize and correct aging-related factors that can result in falling. To meet this goal, the members of that team videotape walking tests to examine individual gait parameters of older subjects. However, they undergo a slow, laborious process of analyzing video frame by video frame to obtain such parameters. This project uses computer vision software to reconstruct walking models from residents of an independent living retirement ...
Exact Distribution Of Linkage Disequilibrium In The Presence Of Mutation, Selection, Or Minor Allele Frequency Filtering, 2020 University of California, Davis
Exact Distribution Of Linkage Disequilibrium In The Presence Of Mutation, Selection, Or Minor Allele Frequency Filtering, Jiayi Qu, Stephen D. Kachman, Dorian Garrick, Rohan L. Fernando, Hao Cheng
Faculty Publications, Department of Statistics
Linkage disequilibrium (LD), often expressed in terms of the squared correlation (r^2) between allelic values at two loci, is an important concept in many branches of genetics and genomics. Genetic drift and recombination have opposite effects on LD, and thus r^2 will keep changing until the effects of these two forces are counterbalanced. Several approximations have been used to determine the expected value of r^2 at equilibrium in the presence or absence of mutation. In this paper, we propose a probability-based approach to compute the exact distribution of allele frequencies at two loci in a finite population at any generation ...
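For readers unfamiliar with the statistic, the squared correlation between two biallelic loci can be computed directly from haplotype frequencies. The sketch below uses the standard textbook definition, not the exact-distribution method proposed in the paper. The function name `ld_r2` and the example frequencies are assumptions made for illustration.

```python
def ld_r2(p_ab, p_aB, p_Ab, p_AB):
    """Squared correlation r^2 between two biallelic loci, computed from
    the four haplotype frequencies (which must sum to 1).
    D = p_AB - p_A * p_B, and r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B))."""
    p_A = p_Ab + p_AB   # frequency of allele A at the first locus
    p_B = p_aB + p_AB   # frequency of allele B at the second locus
    d = p_AB - p_A * p_B
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))

# Hypothetical haplotype frequencies.
print(ld_r2(0.5, 0.0, 0.0, 0.5))      # 1.0: alleles always travel together
print(ld_r2(0.25, 0.25, 0.25, 0.25))  # 0.0: loci are independent
```

The function is undefined when either locus is fixed (an allele frequency of 0 or 1), because the denominator is then zero.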
RMSE-Minimizing Confidence Intervals For The Binomial Parameter, 2020 William & Mary
RMSE-Minimizing Confidence Intervals For The Binomial Parameter, Kexin Feng
Undergraduate Honors Theses
Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed Bernoulli($p$) random variables with unknown parameter $p$ satisfying $0 < p < 1$. Let $X = \sum_{i=1}^{n} X_i$ be the number of successes in the $n$ mutually independent Bernoulli trials. The maximum likelihood estimator of $p$ is $\hat{p} = X/n$. For fixed $n$ and $\alpha$, there are $n + 1$ distinct $100(1 - \alpha)\%$ confidence intervals associated with $X = 0, 1, 2, \ldots, n$. Currently there is no known exact confidence interval for $p$. Our goal is to construct the confidence interval for $p$ whose actual coverage is closest to the stated coverage, using the root mean squared error, RMSE, to measure the difference between the actual coverage and the stated coverage. The approximate confidence interval for $p$ developed here minimizes the RMSE for a sample size $n$ and a significance level $\alpha$.
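To make "actual coverage" concrete, the sketch below computes it exactly for a given interval procedure and then summarizes the gap from the stated coverage as an RMSE. Several choices here are assumptions and not the thesis's construction. The Wald interval is only a familiar stand-in for the interval developed in the thesis. The RMSE is approximated by averaging over an evenly spaced grid of p values. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def wald_interval(x, n, alpha):
    """Textbook Wald interval, used only as a stand-in procedure."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)

def actual_coverage(p, n, alpha, interval=wald_interval):
    """Exact probability that the interval built from X ~ Binomial(n, p)
    contains the true p: sum the binomial pmf over covering outcomes."""
    total = 0.0
    for x in range(n + 1):
        lower, upper = interval(x, n, alpha)
        if lower <= p <= upper:
            total += math.comb(n, x) * p**x * (1 - p)**(n - x)
    return total

def coverage_rmse(n, alpha, grid_size=999, interval=wald_interval):
    """RMSE between actual and stated coverage, averaged over an evenly
    spaced grid of p in (0, 1). The grid is an assumption of this sketch."""
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    squared = [(actual_coverage(p, n, alpha, interval) - (1 - alpha)) ** 2
               for p in grid]
    return math.sqrt(sum(squared) / len(squared))

print(actual_coverage(0.5, 10, 0.05))  # 0.890625, below the stated 0.95
print(coverage_rmse(10, 0.05))
```

With n = 10 and p = 0.5, only the outcomes X = 3 through 7 produce Wald intervals that contain p, so the actual coverage is 912/1024. Passing a different function as `interval` lets the same code compare other procedures.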
Effects Of Quantitative Literacy On Healthcare Decision-Making: An Aural Context, 2020 Lafayette College
Effects Of Quantitative Literacy On Healthcare Decision-Making: An Aural Context, Robert G. Root, Sonia Bhala
We propose a relationship between sensory modality, numerical formatting, and performance on a survey simulating healthcare decision-making. We examine the current literature on aural health literacy, and specifically aural literacy coupled with health numeracy. We then create a survey instrument called the Bhala test for this purpose and demonstrate that it is moderately internally consistent and provides results that correlate with the NUMi assessment, a widely accepted measure of health numeracy. The quantitative information provided in the Bhala test has two treatments, percentage and natural frequency formats, in an effort to determine which format is easier for subjects to use ...
Association Between Baseline Abundance Of Peptoniphilus, A Gram-Positive Anaerobic Coccus, And Wound Healing Outcomes Of DFUs, Kyung R. Min, Adriana Galvis, Katherine L. Baquerizo Nole, Rohita Sinha, Jennifer Clarke, Robert S. Kirsner, Dragana Ajdic
Faculty Publications, Department of Statistics
Diabetic foot ulcers (DFUs) lead to nearly 100,000 lower limb amputations annually in the United States. DFUs are colonized by complex microbial communities, and infection is one of the most common reasons for diabetes-related hospitalizations and amputations. In this study, we examined how DFU microbiomes respond to initial sharp debridement and off-loading and how the initial composition associates with 4-week healing outcomes. We employed 16S rRNA next-generation sequencing to perform microbial profiling on 50 samples collected from 10 patients with vascularized neuropathic DFUs. Debrided wound samples were obtained at initial visit and after one ...
Accuracy Of AVS Life Expectancy Reports, 2020 The University of Akron
Accuracy Of AVS Life Expectancy Reports, Ariya Aghababa
Williams Honors College, Honors Research Projects
Use insurance company data to predict the trends in life insurance life expectancy reports. Also, use the data to predict what impairments could potentially decrease or increase an insured's life expectancy based on reports created by various actuaries at life settlement companies.
Evaluating An Ordinal Output Using Data Modeling, Algorithmic Modeling, And Numerical Analysis, 2020 Murray State University
Evaluating An Ordinal Output Using Data Modeling, Algorithmic Modeling, And Numerical Analysis, Martin Keagan Wynne Brown
Murray State Theses and Dissertations
Data and algorithmic modeling are two different approaches used in predictive analytics. The models discussed from these two approaches include the proportional odds logit model (POLR), the vector generalized linear model (VGLM), the classification and regression tree model (CART), and the random forests model (RF). Patterns in the data were analyzed using trigonometric polynomial approximations and Fast Fourier Transforms. Predictive modeling is used frequently in statistics and data science to find the relationship between the explanatory (input) variables and a response (output) variable. Both approaches prove advantageous in different cases depending on the data set. In our case, the data ...
Aggregate Loss Model With Poisson-Tweedie Loss Frequency, 2020 Wilfrid Laurier University
Aggregate Loss Model With Poisson-Tweedie Loss Frequency, Si Chen
Theses and Dissertations (Comprehensive)
The aggregate loss model has applications in various areas such as financial risk management and actuarial science. The aggregate loss is the sum of all random losses that occur in a period, and it is governed by both the loss severity and the loss frequency. While the impact of the loss severity on aggregate loss is well studied, less attention has been paid to the influence of loss frequency on aggregate loss, which motivates our study. In this thesis, we enrich the aggregate loss framework by introducing the Poisson-Tweedie distribution as a candidate for modelling loss frequency, prove the closedness of Poisson-Tweedie ...
Identifying Customer Churn In After-Market Operations Using Machine Learning Algorithms, 2019 Southern Methodist University
Identifying Customer Churn In After-Market Operations Using Machine Learning Algorithms, Vitaly Briker, Richard Farrow, William Trevino, Brent Allen
SMU Data Science Review
This paper presents a comparative study on machine learning methods as they are applied to product associations, future purchase predictions, and predictions of customer churn in aftermarket operations. Association rules are used to help identify patterns across products and find correlations in customer purchase behaviour. Studying customer behaviour as it pertains to Recency, Frequency, and Monetary Value (RFM) helps inform customer segmentation and identifies customers with a propensity to churn. Lastly, Flowserve’s customer purchase history enables the establishment of churn thresholds for each customer group and assists in constructing a model to predict future churners. The aim of this model ...
Ordinal Hyperplane Loss, 2019 Kennesaw State University
Ordinal Hyperplane Loss, Bob Vanderheyden
Analytics and Data Science Dissertations
This research presents the development of a new framework for analyzing ordered class data, commonly called “ordinal class” data. The focus of the work is the development of classifiers (predictive models) that predict classes from available data. Ratings scales, medical classification scales, socio-economic scales, meaningful groupings of continuous data, facial emotional intensity and facial age estimation are examples of ordinal data for which data scientists may be asked to develop predictive classifiers. It is possible to treat ordinal classification like any other classification problem that has more than two classes. Specifying a model with this strategy does not fully utilize ...
Murine Gut Microbiota Is Defined By Host Genetics And Modulates Variation Of Metabolic Traits, 2019 University of Nebraska-Lincoln
Murine Gut Microbiota Is Defined By Host Genetics And Modulates Variation Of Metabolic Traits, Autumn M. McKnite, Maria Elisa Perez-Munoz, Lu Lu, Evan G. Williams, Simon Brewer, Penelope A. Andreux, John W. M. Bastiaansen, Xusheng Wang, Stephen D. Kachman, Johan Auwerx, Robert W. Williams, Andrew K. Benson, Daniel A. Peterson, Daniel C. Ciobanu
The gastrointestinal tract harbors a complex and diverse microbiota that has an important role in host metabolism. Microbial diversity is influenced by a combination of environmental and host genetic factors and is associated with several polygenic diseases. In this study we combined next-generation sequencing, genetic mapping, and a set of physiological traits of the BXD mouse population to explore genetic factors that explain differences in gut microbiota and its impact on metabolic traits. Molecular profiling of the gut microbiota revealed important quantitative differences in microbial composition among BXD strains. These differences in gut microbial composition are influenced by host-genetics, which ...
Machine Learning In Support Of Electric Distribution Asset Failure Prediction, 2019 Southern Methodist University
Machine Learning In Support Of Electric Distribution Asset Failure Prediction, Robert D. Flamenbaum, Thomas Pompo, Christopher Havenstein, Jade Thiemsuwan
SMU Data Science Review
In this paper, we present novel approaches to predicting asset failure in the electric distribution system. Failures in overhead power lines and their associated equipment, in particular, pose significant financial and environmental threats to electric utilities. Electric device failure furthermore poses a burden on customers and can pose serious risk to life and livelihood. Working with asset data acquired from an electric utility in Southern California, and incorporating environmental and geospatial data from around the region, we applied a Random Forest methodology to predict which overhead distribution lines are most vulnerable to failure. Our results provide evidence ...
Is Corequisite Developmental Math Effective At East Tennessee State University?, 2019 East Tennessee State University
Is Corequisite Developmental Math Effective At East Tennessee State University?, Christine Padden
Electronic Theses and Dissertations
This thesis looks at the corequisite developmental math program at East Tennessee State University (ETSU) and compares its effectiveness to that of the previous developmental math program by comparing student outcomes in MATH 1530. MATH 1530 is a non-calculus-based statistics and probability course that satisfies most majors’ general education math requirements. ETSU sees approximately 1,000 students a year pass through MATH 1530, which is around 6.7% of the total enrollment at ETSU. We are interested in the last five years of the developmental math program before it was changed to corequisite developmental math and the first five ...
Mathematics Versus Statistics, 2019 Valparaiso University
Mathematics Versus Statistics, Mindy B. Capaldi
Journal of Humanistic Mathematics
Mathematics and statistics are both important and useful subjects, but the former has maintained prominence in the American education system. On the other hand, statistics is more prevalent in daily life and is an increasingly marketable subject to know. This article gives a personal history of one mathematician’s bumpy road to learning and teaching statistics. Additionally, arguments for how and why to include statistics in the K-12 and college curricula are provided.
Some Recent Developments On Pareto-Optimal Reinsurance, 2019 The University of Western Ontario
Some Recent Developments On Pareto-Optimal Reinsurance, Wenjun Jiang
Electronic Thesis and Dissertation Repository
This thesis focuses on developing Pareto-optimal reinsurance policies that consider the interests of both the insurer and the reinsurer. Optimal insurance/reinsurance design has been extensively studied in the actuarial science literature, although in early years most studies concentrated on optimizing the insurer’s interests. However, as early as the 1960s, Borch argued that “an agreement which is quite attractive to one party may not be acceptable to its counterparty” and he pioneered the study on “fair” risk sharing between the insurer and the reinsurer. Quite recently, the question of how to strike a balance in risk sharing between an ...
CS + Sociology: Using Big Data To Identify And Understand Educational Inequality In America (1), 2019 CUNY Lehman College
CS + Sociology: Using Big Data To Identify And Understand Educational Inequality In America (1), Joseph Cleary, Elin Waring
Open Educational Resources
This is the first of two lessons/labs for teaching and learning computer science and sociology. Either can be used on its own, or they can be used in sequence, in which case this one should be used first.
Students will develop CS skills and behaviors including but not limited to: learning what an API is, learning how to access and utilize data on an API, and developing their R coding skills and knowledge. Students will also learn basic, but important, sociological principles such as how poverty is related to educational opportunities in America. Although prior knowledge of CS and ...
Development Of A School Boredom Proneness Scale For Children, 2019 James Madison University
Development Of A School Boredom Proneness Scale For Children, Taylor Carrington
One common phrase heard from students is, “I’m bored.” However, there is no real understanding of what this actually means. In this study, elementary-age students were asked to respond to a newly developed School Boredom Proneness Scale (SBPS) including questions relating to a five-factor model of boredom. Students were also asked to rate how often they become bored at school and how bored they seem compared to classmates. In addition to student responses, parents and teachers were asked to rate how bored they thought the student was, and teachers were additionally asked to rate students’ level of work completion ...