Open Access. Powered by Scholars. Published by Universities.®

Mathematics Commons

Articles 1 - 30 of 30

Full-Text Articles in Mathematics

Contrastive Learning, With Application To Forensic Identification Of Source, Cole Ryan Patten Jan 2024

Contrastive Learning, With Application To Forensic Identification Of Source, Cole Ryan Patten

Electronic Theses and Dissertations

Forensic identification of source problems often fall under the category of verification problems, where recent advances in deep learning have been made by contrastive learning methods. Many forensic identification of source problems deal with a scarcity of data, an issue addressed by few-shot learning. In this work, we make specific what makes a neural network a contrastive network. We then consider the use of contrastive neural networks for few-shot learning classification problems and compare them to other statistical and deep learning methods. Our findings indicate similar performance between models trained by contrastive loss and models trained by cross-entropy loss. We …
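
As a concrete reference point for the loss family discussed above, here is a minimal sketch of the classic margin-based pairwise contrastive loss; it is a generic member of that family, not necessarily the thesis's exact formulation, and all values are illustrative:

```python
def contrastive_loss(distance, same_source, margin=1.0):
    """Classic margin-based pairwise contrastive loss on an embedding
    distance: same-source pairs are pulled together, different-source
    pairs are pushed at least `margin` apart.  (A generic member of the
    loss family discussed above, not the thesis's exact formulation.)"""
    if same_source:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2

print(round(contrastive_loss(0.2, True), 3))   # close same-source pair: small loss
print(round(contrastive_loss(0.2, False), 3))  # close different-source pair: penalized
print(contrastive_loss(1.5, False))            # already past the margin: 0.0
```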


Classification In Supervised Statistical Learning With The New Weighted Newton-Raphson Method, Toma Debnath Jan 2024

Classification In Supervised Statistical Learning With The New Weighted Newton-Raphson Method, Toma Debnath

Electronic Theses and Dissertations

In this thesis, the Weighted Newton-Raphson Method (WNRM), an innovative optimization technique, is introduced in statistical supervised learning for categorization and applied to a diabetes predictive model to find maximum likelihood estimates. The iterative optimization method solves nonlinear systems of equations with singular Jacobian matrices and is a modification of the ordinary Newton-Raphson algorithm. The quadratic convergence of the WNRM and its high efficiency for optimizing nonlinear likelihood functions whenever singularities occur in the Jacobian allow for easy inclusion into classical categorization and generalized linear models such as the Logistic Regression model in supervised learning. The WNRM is thoroughly investigated …
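
The weighted update itself is not reproduced here, but the ordinary Newton-Raphson iteration it modifies can be sketched for a one-parameter logistic model; the data and function names below are invented for illustration:

```python
import math

def newton_logistic(x, y, iters=25):
    """Plain Newton-Raphson MLE for a one-parameter logistic model
    P(y=1 | x) = 1 / (1 + exp(-beta*x)).  The thesis's WNRM modifies
    this update when the information (Jacobian) becomes singular; only
    the unweighted baseline is shown here for orientation."""
    beta = 0.0
    for _ in range(iters):
        p = [1 / (1 + math.exp(-beta * xi)) for xi in x]
        score = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, x))
        info = sum(pi * (1 - pi) * xi * xi for pi, xi in zip(p, x))
        if info == 0:            # singular case: plain Newton-Raphson stalls
            break
        beta += score / info     # Newton step: beta <- beta + I^{-1} * score
    return beta

# invented, non-separable toy data (larger x tends to give y = 1)
x = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
y = [0, 0, 1, 0, 1, 1]
print(f"beta_hat = {newton_logistic(x, y):.3f}")
```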


Investigation Of The Gamma Hurdle Model For A Single Population Mean, Alissa Jacobs Jan 2022

Investigation Of The Gamma Hurdle Model For A Single Population Mean, Alissa Jacobs

Electronic Theses and Dissertations

A common issue in some statistical inference problems is dealing with a high frequency of zeroes in a sample of data. For many distributions such as the gamma, optimal inference procedures do not allow for zeroes to be present. In practice, however, it is natural to observe real data sets where nonnegative distributions would make sense to model but zeroes naturally occur. One example of this is in the analysis of cost in insurance claim studies. One common approach to deal with the presence of zeroes is using a hurdle model. Most of the literature on hurdle models focuses …


Reinforcement Learning: Low Discrepancy Action Selection For Continuous States And Actions, Jedidiah Lindborg Jan 2022

Reinforcement Learning: Low Discrepancy Action Selection For Continuous States And Actions, Jedidiah Lindborg

Electronic Theses and Dissertations

In reinforcement learning the process of selecting an action during the exploration or exploitation stage is difficult to optimize. The purpose of this thesis is to create an action selection process for an agent by employing a low discrepancy action selection (LDAS) method. This should allow the agent to quickly determine the utility of its actions by prioritizing actions that are dissimilar to ones that it has already picked. In this way the learning process should be faster for the agent and result in more optimal policies.
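
One simple way to realize "prioritizing actions that are dissimilar to ones already picked" is a greedy max-min dispersion rule over candidate actions. The sketch below is an illustrative stand-in for the thesis's LDAS construction, not its actual method; the 2-D action space and all parameters are invented:

```python
import random

def pick_dispersed_action(history, rng, n_candidates=64):
    """Among random candidate actions in the unit square, return the one
    whose nearest previously-tried action is farthest away (greedy
    max-min dispersion).  Illustrative stand-in for a low-discrepancy
    action selection rule."""
    candidates = [(rng.random(), rng.random()) for _ in range(n_candidates)]
    if not history:
        return candidates[0]
    def nearest_sq_dist(a):
        return min((a[0] - h[0]) ** 2 + (a[1] - h[1]) ** 2 for h in history)
    return max(candidates, key=nearest_sq_dist)

rng = random.Random(0)
history = []
for _ in range(5):
    history.append(pick_dispersed_action(history, rng))
for a in history:
    print(f"({a[0]:.2f}, {a[1]:.2f})")
```

Each successive action lands far from all earlier ones, spreading exploration over the action space faster than uniform random picks.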


Graph Realizability And Factor Properties Based On Degree Sequence, Daniel John Jan 2022

Graph Realizability And Factor Properties Based On Degree Sequence, Daniel John

Electronic Theses and Dissertations

A graph is a structure consisting of a set of vertices and edges. Graph construction has been a focus of research for a long time, and generating graphs has proven helpful in complex networks and artificial intelligence.

A significant problem that has been a focus of research is whether a given sequence of integers is graphical. Havel and Hakimi stated necessary and sufficient conditions for a degree sequence to be graphic with different properties. In our work, we have proved the sufficiency of the requirements by generating algorithms and providing constructive proof.

Given a degree sequence, one crucial problem is …


Applying Deep Learning To The Ice Cream Vendor Problem: An Extension Of The Newsvendor Problem, Gaffar Solihu Aug 2021

Applying Deep Learning To The Ice Cream Vendor Problem: An Extension Of The Newsvendor Problem, Gaffar Solihu

Electronic Theses and Dissertations

The Newsvendor problem is a classical supply chain problem used to develop strategies for inventory optimization. The goal of the newsvendor problem is to predict the optimal order quantity of a product to meet an uncertain demand in the future, given that the demand distribution itself is known. The Ice Cream Vendor Problem extends the classical newsvendor problem to an uncertain demand with unknown distribution, albeit a distribution that is known to depend on exogenous features. The goal is thus to estimate the order quantity that minimizes the total cost when demand does not follow any known statistical distribution. The …


Zeta Function Regularization And Its Relationship To Number Theory, Stephen Wang May 2021

Zeta Function Regularization And Its Relationship To Number Theory, Stephen Wang

Electronic Theses and Dissertations

While the "path integral" formulation of quantum mechanics is both highly intuitive and far reaching, the path integrals themselves often fail to converge in the usual sense. Richard Feynman developed regularization as a solution, such that regularized path integrals could be calculated and analyzed within a strictly physics context. Over the past 50 years, mathematicians and physicists have retroactively introduced schemes for achieving mathematical rigor in the study and application of regularized path integrals. One such scheme was introduced in 2007 by the mathematicians Klaus Kirsten and Paul Loya. In this thesis, we reproduce the Kirsten and Loya approach to …


Comparison Of Software Packages For Detecting Differentially Expressed Genes From Single-Sample Rna-Seq Data, Rong Zhou Jan 2021

Comparison Of Software Packages For Detecting Differentially Expressed Genes From Single-Sample Rna-Seq Data, Rong Zhou

Electronic Theses and Dissertations

RNA-sequencing (RNA-seq) has rapidly become the tool of choice in many genome-wide transcriptomic studies. It provides a way to understand the RNA environment of cells in different physiological or pathological states to determine how cells respond to these changes. RNA-seq provides quantitative information about the abundance of different RNA species present in a given sample. If the difference or change observed in the read counts or expression level between two experimental conditions is statistically significant, the gene is declared as differentially expressed. A large number of methods for detecting differentially expressed genes (DEGs) with RNA-seq have been developed, such as the methods …


Artificial Neural Network Models For Pattern Discovery From Ecg Time Series, Mehakpreet Kaur Jan 2020

Artificial Neural Network Models For Pattern Discovery From Ecg Time Series, Mehakpreet Kaur

Electronic Theses and Dissertations

Artificial Neural Network (ANN) models have recently become de facto models for deep learning with a wide range of applications spanning from scientific fields such as computer vision, physics, biology, medicine to social life (suggesting preferred movies, shopping lists, etc.). Due to advancements in computer technology and the increased practice of Artificial Intelligence (AI) in medicine and biological research, ANNs have been extensively applied not only to provide quick information about diseases, but also to make diagnostics accurate and cost-effective. We propose an ANN-based model to analyze a patient's electrocardiogram (ECG) data and produce accurate diagnostics regarding possible heart diseases …


An Epidemiological Model With Simultaneous Recoveries, Ariel B. Farber Jun 2019

An Epidemiological Model With Simultaneous Recoveries, Ariel B. Farber

Electronic Theses and Dissertations

Epidemiological models are an essential tool in understanding how infection spreads throughout a population. Exploring the effects of varying parameters provides insight into the driving forces of an outbreak. In this thesis, an SIS (susceptible-infectious-susceptible) model is built by partnering simulation methods, differential equations, and transition matrices with the intent to describe how simultaneous recoveries influence the spread of a disease in a well-mixed population. Individuals in the model transition between only two states: an individual is either susceptible (able to be infected) or infectious (able to infect others). Events in this model (infections and recoveries) occur by way …
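
The textbook SIS baseline that this kind of model builds on can be sketched as a one-line Euler integration; the simultaneous-recovery mechanism the thesis studies is not modeled here, and beta, gamma, and the step sizes are illustrative:

```python
def sis_infected_fraction(beta, gamma, i0=0.01, dt=0.01, steps=5000):
    """Euler integration of the classical well-mixed SIS model
        di/dt = beta * s * i - gamma * i,   s = 1 - i.
    Baseline sketch only; the thesis's simultaneous recoveries are
    NOT modeled here."""
    i = i0
    for _ in range(steps):
        i += dt * (beta * (1.0 - i) * i - gamma * i)
    return i

# beta=2, gamma=1: endemic equilibrium i* = 1 - gamma/beta = 0.5
print(round(sis_infected_fraction(2.0, 1.0), 3))  # → 0.5
# beta=0.5, gamma=1 (R0 < 1): the infection dies out
print(round(sis_infected_fraction(0.5, 1.0), 6))
```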


Generalizations Of The Arcsine Distribution, Rebecca Rasnick May 2019

Generalizations Of The Arcsine Distribution, Rebecca Rasnick

Electronic Theses and Dissertations

The arcsine distribution looks at the fraction of time one player is winning in a fair coin toss game and has been studied for over a hundred years. There has been little further work on how the distribution changes when the coin tosses are not fair or when a player has already won the initial coin tosses or, equivalently, starts with a lead. This thesis will first cover a proof of the arcsine distribution. Then, we explore how the distribution changes when the coin is unfair. Finally, we will explore the distribution when one person has won the first …
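
The fair-coin baseline can be checked by simulation against the arcsine CDF F(x) = (2/pi) arcsin(sqrt(x)); the sample sizes below are arbitrary, and "winning" is taken here as being strictly ahead:

```python
import math, random

def lead_fraction(n, rng):
    """Fraction of the n steps on which player A is strictly ahead
    in a fair +/-1 coin-toss game."""
    s, ahead = 0, 0
    for _ in range(n):
        s += rng.choice((1, -1))
        if s > 0:
            ahead += 1
    return ahead / n

rng = random.Random(0)
fracs = sorted(lead_fraction(500, rng) for _ in range(2000))

# Limiting arcsine law: P(fraction <= x) -> (2/pi) * asin(sqrt(x))
for x in (0.1, 0.5, 0.9):
    empirical = sum(f <= x for f in fracs) / len(fracs)
    arcsine = (2 / math.pi) * math.asin(math.sqrt(x))
    print(f"x={x}: empirical {empirical:.3f}  arcsine {arcsine:.3f}")
```

The empirical CDF piles up near 0 and 1, matching the U-shaped arcsine density: one player tends to lead most of the game.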


Development Of A Data-Driven Patient Engagement Score Using Finite Mixture Models, Eric Bae Jan 2019

Development Of A Data-Driven Patient Engagement Score Using Finite Mixture Models, Eric Bae

Electronic Theses and Dissertations

Patient activation measure (PAM), licensed by Insignia Health, is widely adopted by health care providers to assess an individual's knowledge, skill, and confidence for managing one's health and healthcare. Multiple studies corroborate the effectiveness of the activation measure in predicting most health behaviors, including preventive behaviors, healthy behaviors, self-management behaviors, and health information seeking. However, PAM is heavily dependent on subjective patient-reported data, which are often incomplete. The purpose of this study is to develop an objective statistical …


Cramer Type Moderate Deviations For Random Fields And Mutual Information Estimation For Mixed-Pair Random Variables, Aleksandr Beknazaryan Jan 2019

Cramer Type Moderate Deviations For Random Fields And Mutual Information Estimation For Mixed-Pair Random Variables, Aleksandr Beknazaryan

Electronic Theses and Dissertations

In this dissertation we first study Cramer type moderate deviations for partial sums of random fields by applying the conjugate method. In 1938 Cramer published his results on large deviations of sums of i.i.d. random variables, after which a great deal of research has been done on establishing Cramer type moderate and large deviation theorems for different types of random variables and for various statistics. In particular, results have been obtained for independent non-identically distributed random variables. We then construct estimators of the mutual information between two mixed-pair random variables. The estimates enjoy a central limit theorem under some …


Wald Confidence Intervals For A Single Poisson Parameter And Binomial Misclassification Parameter When The Data Is Subject To Misclassification, Nishantha Janith Chandrasena Poddiwala Hewage Aug 2018

Wald Confidence Intervals For A Single Poisson Parameter And Binomial Misclassification Parameter When The Data Is Subject To Misclassification, Nishantha Janith Chandrasena Poddiwala Hewage

Electronic Theses and Dissertations

This thesis is based on a Poisson model that uses both error-free data and error-prone data subject to misclassification in the form of false-negative and false-positive counts. We present maximum likelihood estimators (MLEs), Fisher's Information, and Wald statistics for the Poisson rate parameter and the two misclassification parameters. Next, we invert the Wald statistics to get asymptotic confidence intervals for the Poisson rate parameter and the false-negative rate parameter. The coverage and width properties for various sample sizes and parameter configurations are studied via a simulation study. Finally, we apply the MLEs and confidence intervals to one real data set and another realistic …
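
For orientation, the error-free base case of a Wald interval for a Poisson rate can be sketched as follows; the misclassification extension the thesis develops is not reproduced, and the counts are invented:

```python
import math
from statistics import NormalDist

def poisson_wald_ci(total_count, n, alpha=0.05):
    """Wald interval for a Poisson rate from an error-free sample of n
    unit-time counts: lambda_hat = X/n, Var(lambda_hat) ~ lambda_hat/n.
    Baseline sketch only; the thesis extends this to counts subject to
    false-negative/false-positive misclassification."""
    lam_hat = total_count / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * math.sqrt(lam_hat / n)
    return lam_hat - half, lam_hat + half

# invented example: 115 events observed over 50 unit-time intervals
lo, hi = poisson_wald_ci(115, 50)
print(f"lambda_hat = {115 / 50:.2f}, 95% Wald CI = ({lo:.3f}, {hi:.3f})")
```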


The Expected Number Of Patterns In A Random Generated Permutation On [N] = {1,2,...,N}, Evelyn Fokuoh Aug 2018

The Expected Number Of Patterns In A Random Generated Permutation On [N] = {1,2,...,N}, Evelyn Fokuoh

Electronic Theses and Dissertations

Previous work by Flaxman (2004) and Biers-Ariel et al. (2018) focused on the number of distinct words embedded in a string of words of length n. In this thesis, we will extend this work to permutations, focusing on the maximum number of distinct permutations contained in a permutation on [n] = {1,2,...,n} and on the expected number of distinct permutations contained in a random permutation on [n]. We further consider the problem where repetitions of subsequences arise as a result of the occurrence of (Type A and/or Type B) replications. Our method of enumerating the Type A replications causes double …


Re-Evaluating Performance Measurement: New Mathematical Methods To Address Common Performance Measurement Challenges, Jordan David Benis May 2018

Re-Evaluating Performance Measurement: New Mathematical Methods To Address Common Performance Measurement Challenges, Jordan David Benis

Electronic Theses and Dissertations

Performance Measurement is an essential discipline for any business. Robust and reliable performance metrics for people, processes, and technologies enable a business to identify and address deficiencies to improve performance and profitability. The complexity of modern operating environments presents real challenges to developing equitable and accurate performance metrics. This thesis explores and develops two new methods to address common challenges encountered in businesses across the world. The first method addresses the challenge of estimating the relative complexity of various tasks by utilizing the Pearson Correlation Coefficient to identify potentially overweighted and underweighted tasks. The second method addresses the …
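
The first method's building block, the Pearson correlation between task weights and observed effort, can be computed directly; the weights and times below are invented for illustration:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient, the statistic used above to flag
    tasks whose assigned weights are out of line with observed effort."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# hypothetical task weights vs. measured completion times
weights = [1.0, 2.0, 3.0, 4.0, 5.0]
times   = [1.1, 1.9, 3.2, 3.8, 5.1]
print(round(pearson_r(weights, times), 3))  # → 0.995
```

A task whose weight correlates poorly with its measured effort is a candidate for reweighting.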


Multi Self-Adapting Particle Swarm Optimization Algorithm (Msapso)., Gerhard Koch May 2018

Multi Self-Adapting Particle Swarm Optimization Algorithm (Msapso)., Gerhard Koch

Electronic Theses and Dissertations

The performance and stability of the Particle Swarm Optimization algorithm depends on parameters that are typically tuned manually or adapted based on knowledge from empirical parameter studies. Such parameter selection is ineffectual when faced with a broad range of problem types, which often hinders the adoption of PSO for real-world problems. This dissertation develops a dynamic self-optimization approach for the respective parameters (inertia weight, social and cognitive coefficients). The effects of self-adaptation on the optimal balance between superior performance (convergence) and robustness (divergence) of the algorithm are investigated with regard to both simple and complex benchmark functions. This work …


Statistical Algorithms And Bioinformatics Tools Development For Computational Analysis Of High-Throughput Transcriptomic Data, Adam Mcdermaid Jan 2018

Statistical Algorithms And Bioinformatics Tools Development For Computational Analysis Of High-Throughput Transcriptomic Data, Adam Mcdermaid

Electronic Theses and Dissertations

Next-Generation Sequencing technologies allow for a substantial increase in the amount of data available for various biological studies. In order to effectively and efficiently analyze this data, computational approaches combining mathematics, statistics, computer science, and biology are implemented. Even with the substantial efforts devoted to development of these approaches, numerous issues and pitfalls remain. One of these issues is mapping uncertainty, in which read alignment results are biased due to the inherent difficulties associated with accurately aligning RNA-Sequencing reads. GeneQC is an alignment quality control tool that provides insight into the severity of mapping uncertainty in each annotated gene from …


The Impact Of Data Sovereignty On American Indian Self-Determination: A Framework Proof Of Concept Using Data Science, Joseph Carver Robertson Jan 2018

The Impact Of Data Sovereignty On American Indian Self-Determination: A Framework Proof Of Concept Using Data Science, Joseph Carver Robertson

Electronic Theses and Dissertations

The Data Sovereignty Initiative is a collection of ideas that was designed to create SMART solutions for tribal communities. This concept was to develop a horizontal governance framework to create a strategic act of sovereignty using data science. The core concept of this idea was to present data sovereignty as a way for tribal communities to take ownership of data in order to affect policy and strategic decisions that are data driven in nature. The case studies in this manuscript were developed around statistical theories of spatial statistics, exploratory data analysis, and machine learning. And although these case studies are …


Old English Character Recognition Using Neural Networks, Sattajit Sutradhar Jan 2018

Old English Character Recognition Using Neural Networks, Sattajit Sutradhar

Electronic Theses and Dissertations

Character recognition has been capturing the interest of researchers since the beginning of the twentieth century. While Optical Character Recognition for printed material is very robust and widespread nowadays, the recognition of handwritten materials lags behind. In our digital era more and more historical, handwritten documents are digitized and made available to the general public. However, these digital copies of handwritten materials lack the automatic content recognition feature of their printed counterparts. We are proposing a practical, accurate, and computationally efficient method for Old English character recognition from manuscript images. Our method relies on a modern machine learning …


Peptide Identification: Refining A Bayesian Stochastic Model, Theophilus Barnabas Kobina Acquah May 2017

Peptide Identification: Refining A Bayesian Stochastic Model, Theophilus Barnabas Kobina Acquah

Electronic Theses and Dissertations

Notwithstanding the challenges associated with peptide identification, a variety of methods have been explored over the years. The complexity, size, and computational demands of peptide-based data sets call for further inquiry in this sphere. By relying on prior information about the average relative abundances of bond cleavages and the prior probability of any specific amino acid sequence, we refine an already developed Bayesian approach to identifying peptides. The likelihood function is improved by adding additional ions to the model, and its size is driven by two overall goodness-of-fit measures. In the face of the complexities associated …


Newsvendor Models With Monte Carlo Sampling, Ijeoma W. Ekwegh Aug 2016

Newsvendor Models With Monte Carlo Sampling, Ijeoma W. Ekwegh

Electronic Theses and Dissertations

The newsvendor model is used in solving inventory problems in which demand is random. In this thesis, we will focus on a method of using Monte Carlo sampling to estimate the order quantity that will either maximize revenue or minimize cost given that demand is uncertain. Given data, the Monte Carlo approach will be used to sample demand over scenarios and to estimate the probability density function. A bootstrapping process yields an empirical distribution for the order quantity that will maximize the expected profit. Finally, this method will be used …
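
The resampling recipe described above can be sketched as follows; the price, cost, and demand history are invented, and the thesis's exact bootstrap procedure may differ:

```python
import random

def mc_newsvendor(demand_data, price, cost, n_boot=5000, seed=0):
    """Bootstrap/Monte Carlo order quantity: resample demand scenarios
    from the observed data and return the candidate q with the highest
    average profit, where unsold stock is lost.  A generic sketch of
    the approach described above, not the thesis's exact procedure."""
    rng = random.Random(seed)
    scenarios = [rng.choice(demand_data) for _ in range(n_boot)]
    best_q, best_profit = None, float("-inf")
    for q in range(min(demand_data), max(demand_data) + 1):
        # profit per scenario: sell min(q, D) units, pay for q units
        profit = sum(price * min(q, d) - cost * q for d in scenarios) / n_boot
        if profit > best_profit:
            best_q, best_profit = q, profit
    return best_q

# invented demand history; critical fractile (price - cost)/price = 0.65
demands = [12, 15, 9, 20, 14, 17, 11, 13, 16, 18]
print(mc_newsvendor(demands, price=10.0, cost=3.5))  # → 16
```

The bootstrap answer agrees with the classical solution, the 65th percentile of the empirical demand distribution.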


Multilevel Models For Longitudinal Data, Aastha Khatiwada Aug 2016

Multilevel Models For Longitudinal Data, Aastha Khatiwada

Electronic Theses and Dissertations

Longitudinal data arise when individuals are measured several times during an observation period and thus the data for each individual are not independent. There are several ways of analyzing longitudinal data when different treatments are compared. Multilevel models are used to analyze data that are clustered in some way. In this work, multilevel models are used to analyze longitudinal data from a case study. Results from other more commonly used methods are compared to multilevel models. Also, a comparison of output between two software packages, SAS and R, is done. Finally a method consisting of fitting individual models for each …


Takens Theorem With Singular Spectrum Analysis Applied To Noisy Time Series, Thomas K. Torku May 2016

Takens Theorem With Singular Spectrum Analysis Applied To Noisy Time Series, Thomas K. Torku

Electronic Theses and Dissertations

The evolution of big data has led to financial time series becoming increasingly complex, noisy, non-stationary and nonlinear. Takens theorem can be used to analyze and forecast nonlinear time series, but even small amounts of noise can hopelessly corrupt a Takens approach. In contrast, Singular Spectrum Analysis is an excellent tool for both forecasting and noise reduction. Fortunately, it is possible to combine the Takens approach with Singular Spectrum Analysis (SSA), and in fact, estimation of key parameters in Takens theorem is performed with Singular Spectrum Analysis. In this thesis, we combine the denoising abilities of SSA with the Takens …


Identifying Data Centers From Satellite Imagery, Adam Buskirk Jan 2016

Identifying Data Centers From Satellite Imagery, Adam Buskirk

Electronic Theses and Dissertations

We develop two different descriptors which can be utilized to describe satellite imagery. The first, the differential-magnitude and radius descriptor, describes a scene by computing the directional gradient of the scene with respect to a vector field whose solutions are circles around a pixel to be described, and then counts pixels in a descriptor matrix according to the magnitude of this gradient and the distance at which this magnitude occurs. The second, the radial Fourier descriptor, extracts from the scene a sequence of annuloid sectors, and uses this to approximate the behavior of the image on a circle around the …


Predicting Intraday Financial Market Dynamics Using Takens' Vectors; Incorporating Causality Testing And Machine Learning Techniques, Abubakar-Sadiq Bouda Abdulai Dec 2015

Predicting Intraday Financial Market Dynamics Using Takens' Vectors; Incorporating Causality Testing And Machine Learning Techniques, Abubakar-Sadiq Bouda Abdulai

Electronic Theses and Dissertations

Traditional approaches to predicting financial market dynamics tend to be linear and stationary, whereas financial time series data is increasingly nonlinear and non-stationary. Lately, advances in dynamical systems theory have enabled the extraction of complex dynamics from time series data. These developments include theory of time delay embedding and phase space reconstruction of dynamical systems from a scalar time series. In this thesis, a time delay embedding approach for predicting intraday stock or stock index movement is developed. The approach combines methods of nonlinear time series analysis with those of causality testing, theory of dynamical systems and machine learning (artificial …


Are Highly Dispersed Variables More Extreme? The Case Of Distributions With Compact Support, Benedict E. Adjogah May 2014

Are Highly Dispersed Variables More Extreme? The Case Of Distributions With Compact Support, Benedict E. Adjogah

Electronic Theses and Dissertations

We consider discrete and continuous symmetric random variables X taking values in [0, 1], and thus having expected value 1/2. The main thrust of this investigation is to study the correlation between the variance Var(X) of X and the value of the expected maximum E(Mn) = E(max(X1,...,Xn)) of n independent and identically distributed random variables X1,X2,...,Xn, each distributed as X. Many special cases are studied, some leading to very interesting alternating sums, and some progress is made towards a general theory.
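
The central quantity E(Mn) = E[max(X1,...,Xn)] can be estimated by simulation. The two example distributions below, the uniform and the symmetric two-point law with maximal variance, are chosen only to illustrate the dispersion question; sample sizes are arbitrary:

```python
import random

def expected_max(sampler, n, trials=20000, seed=0):
    """Monte Carlo estimate of E(M_n) = E[max(X_1, ..., X_n)] for
    i.i.d. draws from `sampler` (illustrative parameters throughout)."""
    rng = random.Random(seed)
    return sum(max(sampler(rng) for _ in range(n)) for _ in range(trials)) / trials

n = 5
# Uniform[0,1]: Var = 1/12, exact E(M_n) = n/(n+1)
u = expected_max(lambda r: r.random(), n)
# symmetric two-point law on {0,1}: Var = 1/4 (maximal), exact E(M_n) = 1 - 2**-n
b = expected_max(lambda r: float(r.random() < 0.5), n)
print(f"uniform   : {u:.3f}  (exact {n / (n + 1):.3f})")
print(f"two-point : {b:.3f}  (exact {1 - 2 ** -n:.3f})")
```

Here the more dispersed two-point variable has the larger expected maximum, consistent with the question posed in the title.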


Level Crossing Times In Mathematical Finance, Ofosuhene Osei May 2013

Level Crossing Times In Mathematical Finance, Ofosuhene Osei

Electronic Theses and Dissertations

Level crossing times and their applications in finance are of importance, given certain threshold levels that represent the "desirable" or "sell" values of a stock. In this thesis, we make use of Wald's lemmas and various deep results from renewal theory, in the context of finance, in modelling the growth of a portfolio of stocks. Several models are employed.


Estimation Of Standardized Mortality Ratio In Epidemiological Studies, Bingxia Wang Jan 2002

Estimation Of Standardized Mortality Ratio In Epidemiological Studies, Bingxia Wang

Electronic Theses and Dissertations

In epidemiological studies, we are often interested in comparing the mortality rate of a certain cohort to that of a standard population. A standard computational statistic in this regard is the Standardized Mortality Ratio (SMR) (Breslow and Day, 1987), given by SMR = O/E, where O is the number of deaths observed in the study cohort from a specified cause and E is the expected number calculated from the standard population. In occupational epidemiology, the SMR is the most common measure of risk. It is a comparative statistic. It is frequently based on a comparison of the number O in the cohort with the expected value …
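
The statistic itself is a one-line computation; the interval below is the common textbook log-scale Wald interval treating O as Poisson, shown only to make SMR = O/E concrete, not the estimator developed in the thesis, and the counts are invented:

```python
import math
from statistics import NormalDist

def smr_with_ci(observed, expected, alpha=0.05):
    """SMR = O/E with a log-scale Wald interval treating O as Poisson,
    using Var(log SMR) ~ 1/O.  (A common textbook interval, not the
    estimator developed in the thesis.)"""
    smr = observed / expected
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z / math.sqrt(observed)
    return smr, smr * math.exp(-half), smr * math.exp(half)

# invented example: O = 30 deaths observed, E = 20 expected
smr, lo, hi = smr_with_ci(30, 20.0)
print(f"SMR = {smr:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

An SMR above 1 with a confidence interval excluding 1 suggests excess mortality in the cohort relative to the standard population.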


Reliability Studies Of The Skew Normal Distribution, Nicole Dawn Brown Jan 2001

Reliability Studies Of The Skew Normal Distribution, Nicole Dawn Brown

Electronic Theses and Dissertations

It has been observed in various practical applications that data do not conform to the normal distribution, which is symmetric with no skewness. The skew normal distribution proposed by Azzalini (1985) is appropriate for the analysis of data that are unimodal but exhibit some skewness. The skew normal distribution includes the normal distribution as a special case where the skewness parameter is zero. In this thesis, we study the structural properties of the skew normal distribution, with an emphasis on the reliability properties of the model. More specifically, we obtain the failure rate, the mean residual life function, and the reliability …
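
Azzalini's density can be written down directly; the integral check below is a generic sanity sketch, not the thesis's reliability computations:

```python
import math

def skew_normal_pdf(x, alpha):
    """Azzalini (1985) skew normal density
        f(x; alpha) = 2 * phi(x) * Phi(alpha * x),
    with phi/Phi the standard normal pdf/cdf; alpha = 0 recovers N(0,1)."""
    phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(alpha * x / math.sqrt(2)))
    return 2 * phi * Phi

# sanity check: the density integrates to ~1 for any skewness alpha
for alpha in (0.0, 2.0, -5.0):
    area = sum(skew_normal_pdf(i * 0.01, alpha) for i in range(-1000, 1001)) * 0.01
    print(f"alpha = {alpha}: integral ~ {area:.4f}")
```

Reliability quantities such as the failure rate f(x)/(1 - F(x)) can then be built from this density and its numerically integrated CDF.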