Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

Statistics and Probability

2015

Full-Text Articles in Physical Sciences and Mathematics

Applying Bayesian Machine Learning Methods To Theoretical Surface Science, Shane Carr Dec 2015

McKelvey School of Engineering Theses & Dissertations

Machine learning is a rapidly evolving field in computer science with increasingly many applications to other domains. In this thesis, I present a Bayesian machine learning approach to solving a problem in theoretical surface science: calculating the preferred active site on a catalyst surface for a given adsorbate molecule. I formulate the problem as a low-dimensional objective function. I show how the objective function can be approximated into a certain confidence interval using just one iteration of the self-consistent field (SCF) loop in density functional theory (DFT). I then use Bayesian optimization to perform a global search for the solution. …


Recent Advances In Accumulating Priority Queues, Na Li Dec 2015

Electronic Thesis and Dissertation Repository

This thesis extends the theory underlying the Accumulating Priority Queue (APQ) in three directions. In the first, we present a multi-class multi-server accumulating priority queue with Poisson arrivals and heterogeneous services. The waiting time distributions for different classes have been derived. A conservation law for systems with heterogeneous servers has been studied. We also investigate an optimization problem to find the optimal level of heterogeneity in the multi-server system. Numerical investigations through simulation are carried out to validate the model.

We next focus on a queueing system with Poisson arrivals, generally distributed service times and nonlinear priority accumulation functions. We …


To Hydrate Or Chlorinate: A Regression Analysis Of The Levels Of Chlorine In The Public Water Supply, Drew A. Doyle Dec 2015

HIM 1990-2015

Public water supplies contain disease-causing microorganisms in the water or distribution ducts. In order to kill off these pathogens, a disinfectant, such as chlorine, is added to the water. Chlorine is the most widely used disinfectant in all U.S. water treatment facilities. Chlorine is known to be one of the most powerful disinfectants to restrict harmful pathogens from reaching the consumer. In the interest of obtaining a better understanding of what variables affect the levels of chlorine in the water, this thesis will analyze a particular set of water samples randomly collected from locations in Orange County, Florida. Thirty water …


Calorimetry And Body Composition Research In Broilers And Broiler Breeders, Justina Victoria Caldas Cueva Dec 2015

Graduate Theses and Dissertations

Indirect calorimetry to study heat production (HP) and dual energy X-ray absorptiometry (DEXA) for body composition (BC) are powerful techniques to study the dynamics of energy and protein utilization in poultry. The first two chapters present the BC (dry matter, lean, protein, fat, bone mineral, calcium, and phosphorus) of modern broilers from 1 – 60 d of age analyzed by chemical analysis and DEXA. DEXA has been validated for precision, standardized for position, and equations and validations developed for chickens under two different feeding levels. These equations are unique to the machine and software in use. Research in broilers …


Objective Bayesian Analysis On The Quantile Regression, Shiyi Tu Dec 2015

All Dissertations

The dissertation consists of two distinct but related research projects. First, we study the Bayesian analysis of two-piece location-scale models, which contain several well-known sub-distributions, such as the asymmetric Laplace distribution, the skewed normal distribution, and the skewed Student-t distribution. The use of two-piece location-scale models is an attractive method to model non-symmetric data. From a practical point of view, a prior with some objective information may be more reasonable due to the lack of prior information in many applied situations. It has been shown that several commonly used objective priors, such as the Jeffreys prior, result …


Meta-Analysis Of Lapatinib Plus Capecitabine Versus Capecitabine In The Treatment Of HER2 Positive Breast Cancer, Lynda Smith Dec 2015

Culminating Projects in Applied Statistics

BACKGROUND:

Breast cancer is the most common type of cancer in women despite advances in research and detection methods. Approximately 25 to 30 percent of newly diagnosed cases of breast cancer will overexpress HER2, human epidermal growth factor receptor 2, and are at a greater risk for disease progression and poorer clinical outcomes. The traditional treatment is associated with irreversible cardiac dysfunction. An alternative treatment involving lapatinib plus capecitabine has been reported in some randomized controlled clinical trials comparing treatment outcomes. To quantify the effectiveness of lapatinib plus capecitabine combination therapy versus capecitabine monotherapy in treating metastatic breast cancer, a …


Rank Based Procedures For Ordered Alternative Models, Yuanyuan Shao Dec 2015

Dissertations

The ordered alternatives in a one-way layout with k ordered treatment levels are appropriate for many applications, especially in psychology and medicine. There is extensive literature in this area, and many parametric and nonparametric approaches have been introduced. The Abelson-Tukey (AT) test is a frequently used parametric method. Its coefficients provide an ideal way of combining means for the purpose of detecting a monotonic relationship between the independent and dependent variables. The AT method, though, is not robust. Furthermore, our initial empirical studies show that it is not more powerful than the Jonckheere-Terpstra (JT) and the Hettmansperger-Norton (HN) nonparametric tests …


A Statistical Model For The Prediction Of Dissolved Oxygen Dynamics And The Potential For Hypoxia In The Mississippi Sound And Bight, Andreas Moshogianis Dec 2015

Master's Theses

Hypoxia events occur when dissolved oxygen concentrations fall below the minimum threshold (dissolved oxygen concentrations < 2 mg O₂ L⁻¹) necessary to avoid respiratory distress among aquatic organisms. In the Mississippi Sound and Bight, hypoxia is most prevalent from late spring through late summer. Since hypoxia events can have dramatic effects on coastal fisheries, the spatial and temporal magnitude of hypoxia presents a clear threat to the productive fisheries in the northern Gulf of Mexico. Long-term hydrographic data were collected from eight sampling stations on a monthly basis from January 2009 to December 2011 along a cross-shelf transect from the mouth of …


Probabilistic Graphical Modeling On Big Data, Ming-Hua Chung Dec 2015

Graduate Theses and Dissertations

The rise of Big Data in recent years brings many challenges to modern statistical analysis and modeling. In toxicogenomics, the advancement of high-throughput screening technologies facilitates the generation of massive amounts of biological data, a big data phenomenon in biomedical science. Yet researchers still rely heavily on keyword searches and/or literature reviews to navigate the databases, and analyses are often done on a rather small scale. As a result, the rich information of a database has not been fully utilized, particularly the information embedded in the interactions between data points, which is largely ignored and buried. For the past …


Predicting Intraday Financial Market Dynamics Using Takens' Vectors; Incorporating Causality Testing And Machine Learning Techniques, Abubakar-Sadiq Bouda Abdulai Dec 2015

Electronic Theses and Dissertations

Traditional approaches to predicting financial market dynamics tend to be linear and stationary, whereas financial time series data is increasingly nonlinear and non-stationary. Lately, advances in dynamical systems theory have enabled the extraction of complex dynamics from time series data. These developments include theory of time delay embedding and phase space reconstruction of dynamical systems from a scalar time series. In this thesis, a time delay embedding approach for predicting intraday stock or stock index movement is developed. The approach combines methods of nonlinear time series analysis with those of causality testing, theory of dynamical systems and machine learning (artificial …


Macrobenthic Communities In The Northern Gulf Of Mexico Hypoxic Zone: Testing The Pearson-Rosenberg Model, Shivakumar Shivarudrappa Dec 2015

Dissertations

The Pearson and Rosenberg (P-R) conceptual model of macrobenthic succession was used to assess the impact of hypoxia (dissolved oxygen [DO] ≤ 2 mg/L) on the macrobenthic community on the continental shelf of the northern Gulf of Mexico for the first time. The model uses a stress-response relationship between environmental parameters and the macrobenthic community to determine the ecological condition of the benthic habitat. The ecological significance of dissolved oxygen in a benthic habitat is well understood. In addition, the annual recurrence of bottom-water hypoxia on the Louisiana/Texas shelf during summer months is well documented.

The P-R model illustrates the decreasing …


Niche-Based Modeling Of Japanese Stiltgrass (Microstegium Vimineum) Using Presence-Only Information, Nathan Bush Nov 2015

Masters Theses

The Connecticut River watershed is experiencing a rapid invasion of aggressive non-native plant species, which threaten watershed function and structure. Volunteer-based monitoring programs such as the University of Massachusetts’ OutSmart Invasives Species Project, Early Detection Distribution Mapping System (EDDMapS) and the Invasive Plant Atlas of New England (IPANE) have gathered valuable invasive plant data. These programs provide a unique opportunity for researchers to model invasive plant species utilizing citizen-sourced data. This study took advantage of these large data sources to model invasive plant distribution and to determine environmental and biophysical predictors that are most influential in dispersion, and to identify …


Estimation Problems In Complex Field Studies With Deep Interactions: Time-To-Event And Local Regression Models For Environmental Effects On Vital Rates, Krzysztof M. Sakrejda Nov 2015

Doctoral Dissertations

Field studies that measure vital rates in context over extended time periods are a cornerstone of our understanding of population processes. These studies inform us about the relationship between biological process and environmental noise in an irreplaceable way. These data sets bring "big data" and "big model" challenges, which limit the application of standard software (e.g., BUGS). The environmental sensitivity of vital rates is also expected to exhibit interactions and non-linearity, which typically result in difficult model selection questions in large data sets. Finally, long-term ecological data sets often contain complex temporal structure. In commonly applied discrete-time models complex temporal …


Wind Power Capacity Value Metrics And Variability: A Study In New England, Frederick W. Letson Nov 2015

Doctoral Dissertations

Capacity value is the contribution of a power plant to the ability of the power system to meet high demand. As wind power penetration in New England, and worldwide, increases so does the importance of identifying the capacity contribution made by wind power plants. It is critical to accurately characterize the capacity value of these wind power plants and the variability of the capacity value over the long term. This is important in order to avoid the cost of keeping extra power plants operational while still being able to cover the demand for power reliably. This capacity value calculation is …


Variable Selection In Single Index Varying Coefficient Models With Lasso, Peng Wang Nov 2015

Doctoral Dissertations

The single index varying coefficient model is a very attractive statistical model due to its ability to reduce dimensions and its ease of interpretation. There are many theoretical studies and practical applications of it, but typically without variable selection, and no public software is available for fitting it. Here we propose a new algorithm to fit the single index varying coefficient model and to carry out variable selection in the index part with LASSO. The core idea is a two-step scheme which alternates between estimating coefficient functions and selecting-and-estimating the single index. Both in simulation and in application to a geoscience dataset, we …


Threat Analysis, Countermeasures And Design Strategies For Secure Computation In Nanometer CMOS Regime, Raghavan Kumar Nov 2015

Doctoral Dissertations

Advancements in CMOS technologies have led to an era of the Internet of Things (IoT), where devices have the ability to communicate with each other in addition to their computational power. As more and more sensitive data is processed by embedded devices, the trend towards lightweight and efficient cryptographic primitives has gained significant momentum. Achieving perfect security in silicon is extremely difficult, as traditional cryptographic implementations are vulnerable to various active and passive attacks. There is also a threat in the form of "hardware Trojans" inserted into the supply chain by untrusted third-party manufacturers for economic incentives. Apart …


Physical Activity Classification With Conditional Random Fields, Evan L. Ray Nov 2015

Doctoral Dissertations

In this thesis we develop methods for classifying physical activity using accelerometer recordings. We cast this as a problem of classification in time series with moderate to high dimensional observations at each time point. Specifically, we observe a vector of summary statistics of the accelerometer signal at each point in time, and we wish to use these observations to estimate the type and intensity of physical activity the individual engaged in as it changes over time. Our methods are based on Conditional Random Fields, which allow us to capture temporal dependence in an individual’s physical activity type without requiring us …


Analysis Of Rheumatoid Arthritis Data Using Logistic Regression And Penalized Approach, Wei Chen Nov 2015

USF Tampa Graduate Theses and Dissertations

In this paper, a clinical dataset for a rheumatoid arthritis (RA) medicine with an ordinal response is selected to study this new medicine. The dataset has four features: sex, age, treatment, and preliminary. Sex is a binary categorical variable, with 1 indicating male and 0 indicating female. Age is the numerical age of the patient. Treatment is a binary categorical variable, with 1 indicating has RA and 0 indicating does not have RA. Preliminary is a five-class categorical variable indicating the patient’s RA severity status before taking the medication. The response Y is a five-class ordinal variable that shows …


Ensemble Learning Method On Machine Maintenance Data, Xiaochuang Zhao Nov 2015

USF Tampa Graduate Theses and Dissertations

In industry, many companies are facing the explosion of big data. With this much information stored, companies want to make sense of the data and use it to make better decisions, especially for future prediction. A lot of money can be saved and huge revenue can be generated with the power of big data. When building statistical learning models for prediction, companies aim to build models that are efficient and highly accurate. After the learning models have been developed for production, new data will be generated. With the updated data, the …


A Novel Method For Assessing Co-Monotonicity: An Interplay Between Mathematics And Statistics With Applications, Danang T. Qoyyimi Nov 2015

Electronic Thesis and Dissertation Repository

Numerous problems in econometrics, insurance, reliability engineering, and statistics rely on the assumption that certain functions are monotonic, which may or may not be true in real life scenarios. To satisfy this requirement, from the theoretical point of view, researchers frequently model the underlying phenomena using parametric and semi-parametric families of functions, thus effectively specifying the required shapes of the functions. To tackle these problems in a non-parametric way, when the shape cannot be specified explicitly but only estimated approximately, we suggest indices for measuring the lack of monotonicity in functions. We investigate properties of these indices and offer convenient …


Telecom Data Analysis, Sai Roopak Sarva, Anudeep Masetty, Vinay Reddy Kondam Oct 2015

All Capstone Projects

The telecommunications industry regularly uses data analytics in fields such as customer analysis and network optimization. For financial analysis, such as identifying risks that could negatively impact an entity’s financial performance, communications service providers have traditionally used statistical sampling techniques that cover only short time periods and a limited subset of data.

Given the massive number of transactions processed by telecommunications companies and the costs and complexity involved in their operations, data analytics offers a valuable opportunity for enhancing the frameworks and procedures they adopt to drive profitability and minimize unnecessary downside risk.


Probabilistic Reasoning In Cosmology, Yann Benétreau-Dupin Sep 2015

Electronic Thesis and Dissertation Repository

Cosmology raises novel philosophical questions regarding the use of probabilities in inference. This work aims to identify and assess lines of argument and problematic principles in probabilistic reasoning in cosmology.

The first, second, and third papers deal with the intersection of two distinct problems: accounting for selection effects, and representing ignorance or indifference in probabilistic inferences. These two problems meet in the cosmology literature when anthropic considerations are used to predict cosmological parameters by conditionalizing the distribution of, e.g., the cosmological constant on the number of observers it allows for. However, uniform probability distributions usually appealed to in such arguments …


Bayesian Inference On Longitudinal Semi-Continuous Substance Abuse/Dependence Symptoms Data, Dongyuan Xing Sep 2015

USF Tampa Graduate Theses and Dissertations

Substance use data, such as data on alcohol drinking, often contain a high proportion of zeros. In studies examining alcohol consumption in college students, for instance, many students may not drink in the studied period, resulting in a number of zeros. Zero-inflated continuous data, also called semi-continuous data, typically consist of a mixture of a degenerate distribution at the origin (zero) and a right-skewed, continuous distribution for the positive values. Ignoring the extreme non-normality in semi-continuous data may lead to substantially biased estimates and inference. Longitudinal or repeated measures of semi-continuous data present special challenges in statistical inference because of …


Per-Contact Infectivity Of HCV Associated With Injection Exposures In A Prospective Cohort Of Young Injection Drug Users In San Francisco, CA (UFO Study), Yuridia Leyva Sep 2015

Mathematics & Statistics ETDs

Sharing needles and ancillary injection drug equipment places injection drug users (IDU) at risk for Hepatitis C Virus (HCV), a highly infectious blood-borne virus. A limited number of studies have analyzed the per-contact infectivity of HCV associated with the use of previously-used needles, but per-contact infectivity of ancillary injecting equipment has not been previously investigated. Our goal is to estimate the per-contact infectivity of HCV associated with (1) injecting with another person's previously-used needle, classified as receptive needle sharing (RNS), and (2) using another person's previously-used ancillary injecting equipment, such as cookers to melt drugs and cottons to strain impurities …


On The Estimation Of Intracluster Correlation For Time-To-Event Outcomes In Cluster Randomized Trials, Sumeet Kalia Aug 2015

Electronic Thesis and Dissertation Repository

Cluster randomized trials (CRTs) involve the random assignment of intact social units rather than independent subjects to intervention groups. Time-to-event outcomes often are endpoints in CRTs where the intracluster correlation coefficient (ICC) serves as a descriptive parameter to assess the similarity among outcomes in a cluster. However, estimating the ICC in CRTs with time-to-event outcomes is a challenge due to the presence of censored observations. The ICC is estimated for two CRTs using the censoring indicators and observed outcomes.

A simulation study explores the effect of administrative censoring on estimating the ICC. Results show that the ICC estimators derived from …


The Impact Of Panama Canal Expansion On The U.S. Gateway Ports’ Attractiveness To The Discretionary Cargo Shippers, Jie Xu Aug 2015

World Maritime University Dissertations

No abstract provided.


The Optimization Research Of Southeast Asian Container Liner Routes Of SITC Company, Sheng Sheng Aug 2015

World Maritime University Dissertations

No abstract provided.


The Analysis Of BDTI In Tanker Transport Market, Zhisen Wang Aug 2015

World Maritime University Dissertations

No abstract provided.


Research On Port Network Layout From The Perspective Of Sea Ports And Dry Ports Linked Development Under The Background Of “OBOR”, Yameng Guo Aug 2015

World Maritime University Dissertations

No abstract provided.


Research On Liner Shipping Schedule Recovery, Xiaye Tang Aug 2015

World Maritime University Dissertations

No abstract provided.