Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2015

Statistics and Probability

Articles 1 - 30 of 597

Full-Text Articles in Physical Sciences and Mathematics

A Data Science Course For Undergraduates: Thinking With Data, Benjamin Baumer Dec 2015

Mathematics Sciences: Faculty Publications

Data science is an emerging interdisciplinary field that combines elements of mathematics, statistics, computer science, and knowledge in a particular application domain for the purpose of extracting meaningful information from the increasingly sophisticated array of data available in many settings. These data tend to be nontraditional, in the sense that they are often live, large, complex, and/or messy. A first course in statistics at the undergraduate level typically introduces students to a variety of techniques to analyze small, neat, and clean datasets. However, whether they pursue more formal training in statistics or not, many of these students will end up …


System-Wide Prediction Of General, All-Cause, Preventable Hospital Readmissions, Ken Musselman, Brandon Pope, Steve Witz, Zhiyi Tian, Lingsong Zhang, Linda Leon, Ann Davis Dec 2015

RCHE Publications

Existing studies of hospital readmissions typically focus on specific diagnoses, age groups, discharge dispositions, payer classes, or hospitals, and often use small samples. It is not clear how predictive models generated from such studies generalize across diseases, hospitals, or time periods. In this study, a logistic regression model of readmission risk within 30 days based on hospital administrative data was constructed and validated across hospitals and time periods. The hospitals included both general and specialty hospitals such as long-term care, women’s, and children’s hospitals. The administrative data included information on patients’ demographics, diagnoses, procedures, and discharge disposition. Derivation and validation …


Statistical Handling Of Medical Data - An Ethical Perspective, Ajay Kumar Bansal Dr Dec 2015

COBRA Preprint Series

Medical science is a delicate subject, and the clinical data generated from medical trials must be reliable and of good quality. Not only is the quality of the generated data important; its management is also crucial and must be handled very carefully. In this paper, the ethical aspect of the statistical handling of such data is discussed.

Every profession has a set of norms to follow to achieve its objectives. These norms are called professional ethics, which reflect the essence of human behaviour. In the same way, the field of medical research is expected to follow ethical norms to obtain reliable …


Semi-Parametric Estimation And Inference For The Mean Outcome Of The Single Time-Point Intervention In A Causally Connected Population, Oleg Sofrygin, Mark J. Van Der Laan Dec 2015

U.C. Berkeley Division of Biostatistics Working Paper Series

We study the framework for semi-parametric estimation and statistical inference for the sample average treatment-specific mean effects in observational settings where data are collected on a single network of connected units (e.g., in the presence of interference or spillover). Despite recent advances, many of the current statistical methods rely on estimation techniques that assume a particular parametric model for the outcome, even though some of the most important statistical assumptions required by these models are most likely violated in the observational network settings, often resulting in invalid and anti-conservative statistical inference. In this manuscript, we rely on the recent methodological …


Statistical Estimation Of White Matter Microstructure From Conventional MRI, Leah Suttner, Amanda Mejia, Blake Dewey, Pascal Sati, Daniel S. Reich, Russell T. Shinohara Dec 2015

UPenn Biostatistics Working Papers

Diffusion tensor imaging (DTI) has become the predominant modality for studying white matter integrity in multiple sclerosis (MS) and other neurological disorders. Unfortunately, the use of DTI-based biomarkers in large multi-center studies is hindered by systematic biases that confound the study of disease-related changes. Furthermore, the site-to-site variability in multi-center studies is significantly higher for DTI than that for conventional MRI-based markers. In our study, we apply the Quantitative MR Estimation Employing Normalization (QuEEN) model to estimate the four DTI measures: MD, FA, RD, and AD. QuEEN uses a voxel-wise generalized additive regression model to relate the normalized intensities of …


Correction Of Verification Bias Using Log-Linear Models For A Single Binary-Scale Diagnostic Tests, Haresh Rochani, Hani M. Samawi, Robert L. Vogel, Jingjing Yin Dec 2015

Biostatistics Faculty Publications

In diagnostic medicine, the test that determines the true disease status without error is referred to as the gold standard. Even when a gold standard exists, it is extremely difficult to verify each patient because of cost-effectiveness issues and the invasive nature of the procedures. In practice, some of the patients with test results are not selected for verification of the disease status, which results in verification bias for diagnostic tests. The ability of the diagnostic test to correctly identify the patients with and without the disease can be evaluated by measures such as sensitivity, specificity and predictive …


Applying Bayesian Machine Learning Methods To Theoretical Surface Science, Shane Carr Dec 2015

McKelvey School of Engineering Theses & Dissertations

Machine learning is a rapidly evolving field in computer science with increasingly many applications to other domains. In this thesis, I present a Bayesian machine learning approach to solving a problem in theoretical surface science: calculating the preferred active site on a catalyst surface for a given adsorbate molecule. I formulate the problem as a low-dimensional objective function. I show how the objective function can be approximated into a certain confidence interval using just one iteration of the self-consistent field (SCF) loop in density functional theory (DFT). I then use Bayesian optimization to perform a global search for the solution. …


A Generally Efficient Targeted Minimum Loss Based Estimator, Mark J. Van Der Laan Dec 2015

U.C. Berkeley Division of Biostatistics Working Paper Series

Suppose we observe n independent and identically distributed observations of a finite dimensional bounded random variable. This article is concerned with the construction of an efficient targeted minimum loss-based estimator (TMLE) of a pathwise differentiable target parameter based on a realistic statistical model.

The canonical gradient of the target parameter at a particular data distribution will depend on the data distribution through an infinite dimensional nuisance parameter which can be defined as the minimizer of the expectation of a loss function (e.g., log-likelihood loss). For many models and target parameters the nuisance parameter can be split up in two components, …


Characterizing The Statistical Distribution Of Organic Carbon And Extractable Phosphorus At A Regional Scale, John J. Brejda, David W. Meek, Douglas L. Karlen Dec 2015

Douglas L Karlen

Greater awareness of potential environmental problems has created the need to monitor total organic carbon (TOC) and extractable phosphorus (P) concentrations at a regional scale. The probability distribution of these soil properties can have a significant effect on the power of statistical tests and the quality of inferences applied to these properties. The objectives of this study were to: (1) evaluate the probability distribution of TOC and extractable P at the regional scale in three Major Land Resource Areas (MLRA), and (2) identify appropriate transformations that will result in a normal distribution. Both TOC and extractable P were non-normally distributed …


Corn-Soybean And Alternative Cropping Systems Effects On NO3-N Leaching Losses In Subsurface Drainage Water, Rameshwar S. Kanwar, Richard M. Cruse, Mohammadreza Ghaffarzadeh, Allah Bakhsh, Douglas Karlen, Theodore B. Bailey Dec 2015

Douglas L Karlen

Alternative cropping systems can improve resource use efficiency, increase corn grain yield, and help reduce negative impacts on the environment. A 6-yr (1993 to 1998) field study was conducted at Iowa State University’s Northeastern Research Center near Nashua, Iowa, to evaluate the effects of non-traditional cropping systems [strip intercropping (STR): corn (Zea mays L.)/soybean (Glycine max L.)/oats (Avena sativa L.); alfalfa rotation (ROT): 3-yr (1993 to 1995) alfalfa (Medicago sativa L.) followed by corn in 1996, soybean in 1997, and oats in 1998] and traditional cropping systems [corn after soybean (CS) and soybean after corn (SC)] on the flow …


Cropping System Effects On NO3-N Loss With Subsurface Drainage Water, Allah Bakhsh, Rameshwar S. Kanwar, Theodore B. Bailey, Cynthia A. Cambardella, Douglas Karlen, Thomas S. Colvin Dec 2015

Douglas L Karlen

An appropriate combination of tillage and nitrogen management practices will be necessary to develop sustainable farming practices. A six–year (1993–1998) field study was conducted on subsurface–drained Clyde–Kenyon–Floyd soils to quantify the impact of two tillage systems (chisel plow vs. no tillage) and two N fertilizer management practices (preplant single application vs. late–spring soil test based application) on nitrate–nitrogen (NO3–N) leaching loss with subsurface drain discharge from corn (Zea mays L.)–soybean (Glycine max L.) rotation plots. Preplant injected urea ammonium nitrate solution (UAN) fertilizer was applied at the rate of 110 kg ha⁻¹ to chisel plow and no–till corn plots, …


Evaluation Of Animal Model Research, Kenneth J. Shapiro Dec 2015

Kenneth J. Shapiro, PhD

It is argued that a concept of evaluation of animal models that is broader and more useful than validation is available. Productive generativity refers to the degree to which a model furthers understanding and leads to more-effective treatment interventions. Results of the application of this novel evaluative frame to several animal models of eating disorders show that this animal-based research has not been productive. The question of the relation between clinic and animal laboratory is discussed.


Estimated Probability Of Becoming Alcohol Dependent: Extending A Multiparametric Approach, Olga A. Vsevolozhskaya, James C. Anthony Dec 2015

Biostatistics Presentations

Background: United States (US) epidemiological studies suggest that for every 5-8 who start drinking alcoholic beverages, at least one drinker will develop an alcohol dependence (AD) syndrome within the first 10 years after onset of drinking (Lopez-Quintero et al., 2011; Wagner & Anthony, 2002). Recently, we described a multiparametric functional analysis approach for new research to estimate these transition probabilities with a one-dimensional function (1D; Vsevolozhskaya & Anthony, 2015). Here, we demonstrate extension of this analysis to two-dimensional (2D) functions that combine information about number of recent drinking days and number of drinks on the typical drinking day.

Methods: Data …


Inequality In Treatment Benefits: Can We Determine If A New Treatment Benefits The Many Or The Few?, Emily Huang, Ethan Fang, Daniel Hanley, Michael Rosenblum Dec 2015

Johns Hopkins University, Dept. of Biostatistics Working Papers

The primary analysis in many randomized controlled trials focuses on the average treatment effect and does not address whether treatment benefits are widespread or limited to a select few. This problem affects many disease areas, since it stems from how randomized trials, often the gold standard for evaluating treatments, are designed and analyzed. Our goal is to learn about the fraction who benefit from a treatment, based on randomized trial data. We consider the case where the outcome is ordinal, with binary outcomes as a special case. In general, the fraction who benefit is a non-identifiable parameter, and the best …
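For the binary special case mentioned in this abstract, one standard way to see why the fraction who benefit is not identifiable is through the Fréchet-Hoeffding bounds: a randomized trial identifies only the two marginal success probabilities, not the joint distribution of each person's outcomes under treatment and control. The sketch below is illustrative only and is not taken from the paper; the function name and its inputs are hypothetical, and "benefit" is assumed to mean success under treatment and failure under control.

```python
from typing import Tuple


def benefit_fraction_bounds(p_treat: float, p_control: float) -> Tuple[float, float]:
    """Bounds on P(Y1 = 1, Y0 = 0) given only the marginal success rates.

    p_treat   -- P(Y1 = 1), success probability under treatment (assumed input)
    p_control -- P(Y0 = 1), success probability under control (assumed input)
    """
    for p in (p_treat, p_control):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # Lower bound: the average treatment effect, floored at zero.
    lower = max(0.0, p_treat - p_control)
    # Upper bound: cannot exceed the share succeeding under treatment
    # or the share failing under control.
    upper = min(p_treat, 1.0 - p_control)
    return lower, upper
```

With success probabilities of 0.6 under treatment and 0.4 under control, for example, the bounds are roughly 0.2 and 0.6, so the same average treatment effect is compatible with very different fractions who benefit.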


Recent Advances In Accumulating Priority Queues, Na Li Dec 2015

Electronic Thesis and Dissertation Repository

This thesis extends the theory underlying the Accumulating Priority Queue (APQ) in three directions. In the first, we present a multi-class multi-server accumulating priority queue with Poisson arrivals and heterogeneous services. The waiting time distributions for different classes have been derived. A conservation law for systems with heterogeneous servers has been studied. We also investigate an optimization problem to find the optimal level of heterogeneity in the multi-server system. Numerical investigations through simulation are carried out to validate the model.

We next focus on a queueing system with Poisson arrivals, generally distributed service times and nonlinear priority accumulation functions. We …


To Hydrate Or Chlorinate: A Regression Analysis Of The Levels Of Chlorine In The Public Water Supply, Drew A. Doyle Dec 2015

HIM 1990-2015

Public water supplies contain disease-causing microorganisms in the water or distribution ducts. In order to kill off these pathogens, a disinfectant, such as chlorine, is added to the water. Chlorine is the most widely used disinfectant in all U.S. water treatment facilities. Chlorine is known to be one of the most powerful disinfectants to restrict harmful pathogens from reaching the consumer. In the interest of obtaining a better understanding of what variables affect the levels of chlorine in the water, this thesis will analyze a particular set of water samples randomly collected from locations in Orange County, Florida. Thirty water …


Combating Anti-Statistical Thinking Using Simulation-Based Methods Throughout The Undergraduate Curriculum, Nathan L. Tintle, Beth Chance, George Cobb, Soma Roy, Todd Swanson, Jill Vanderstoep Dec 2015

Faculty Work Comprehensive List

The use of simulation-based methods for introducing inference is growing in popularity for the Stat 101 course, due in part to increasing evidence of the methods’ ability to improve students’ statistical thinking. This impact comes from simulation-based methods (a) clearly presenting the overarching logic of inference, (b) strengthening ties between statistics and probability/mathematical concepts, (c) encouraging a focus on the entire research process, (d) facilitating student thinking about advanced statistical concepts, (e) allowing more time to explore, do, and talk about real research and messy data, and (f) acting as a firmer foundation on which to build statistical intuition. Thus, …


An Optimal Reinsurance Contract From Insurer's And Reinsurer's Viewpoints, Ali P. Bazaz, Amir T. Payandeh Najafabadi Dec 2015

Applications and Applied Mathematics: An International Journal (AAM)

This article constructs two classes of appropriate reinsurance contracts from both an insurer’s and a reinsurer’s viewpoints. The first class, say C, has been constructed by minimizing the conditional tail expectation (CTE) of an insurer’s random risk. An optimal reinsurance contract has then been obtained by estimating the reinsurance’s random risk using the Bayesian estimation method. The second class of reinsurance contracts, say C*, is obtained by minimizing a convex combination of the CTE of both the insurer’s and reinsurer’s random risks. These two approaches consider both the insurer’s and reinsurer’s viewpoints to establish an optimal reinsurance contract. …


Discrete Grüss Type Inequality On Fractional Calculus, Elvan Akin, Serkan Asliyuce, Ayse Feza Guvenilir, Billur Kaymakcalan Dec 2015

Mathematics and Statistics Faculty Research & Creative Works

We give a discrete Grüss type inequality on fractional calculus.


An M/G/1 Queue With Server Breakdown And Multiple Working Vacation, S. P. Bala Murugan, K. Santhi Dec 2015

Applications and Applied Mathematics: An International Journal (AAM)

This paper deals with the steady state behavior of an M/G/1 multiple working vacation queue with server breakdown. The server works with different service times rather than completely stopping service during a vacation. Both service times in a vacation period and in a regular service period are assumed to be generally distributed random variables. The system may break down at random and repair time is arbitrary. Further, just after completion of a customer’s service the server may take a multiple working vacation. Supplementary variable technique is employed to find the probability generating function for the number of customers in the system. …


Analysis Of Repairable M[X]/(G1,G2)/1 - Feedback Retrial G-Queue With Balking And Starting Failures Under At Most J Vacations, P. Rajadurai, M. C. Saravanarajan, V. M. Chandrasekaran Dec 2015

Applications and Applied Mathematics: An International Journal (AAM)

In this paper, we discuss the steady state analysis of a batch arrival feedback retrial queue with two types of service and negative customers. If an arriving batch of positive customers finds the server free, one of the customers from the batch enters the service area and the rest join the orbit. A negative customer arriving during the service time of a positive customer removes the positive customer in service, and the interrupted positive customer either enters the orbit or leaves the system. If the orbit is empty at the service completion of each type …


Stability Condition Of A Retrial Queueing System With Abandoned And Feedback Customers, Amina A. Bouchentouf, Abbes Rabhi, Lahcene Yahiaoui Dec 2015

Applications and Applied Mathematics: An International Journal (AAM)

This paper deals with the stability of a retrial queueing system with two orbits, abandoned and feedback customers. Two independent Poisson streams of customers arrive at the system and flow into a single-server service system. An arriving customer of type i, i = 1, 2, is handled by the server if it is free; otherwise, it is blocked and routed to a separate type-i retrial (orbit) queue that attempts to re-dispatch its jobs at its specific Poisson rate. The customer in the orbit either attempts service again after a random time or gives up receiving service and leaves the system …


Correcting For Measurement Error In Latent Variables Used As Predictors, Lynne Steuerle Schofield Dec 2015

Mathematics & Statistics Faculty Works

This paper represents a methodological-substantive synergy. A new model, the Mixed Effects Structural Equations (MESE) model which combines structural equations modeling and item response theory, is introduced to attend to measurement error bias when using several latent variables as predictors in generalized linear models. The paper investigates racial and gender disparities in STEM retention in higher education. Using the MESE model with 1997 National Longitudinal Survey of Youth data, I find prior mathematics proficiency and personality have been previously underestimated in the STEM retention literature. Pre-college mathematics proficiency and personality explain large portions of the racial and gender gaps. The …


Calorimetry And Body Composition Research In Broilers And Broiler Breeders, Justina Victoria Caldas Cueva Dec 2015

Graduate Theses and Dissertations

Indirect calorimetry to study heat production (HP) and dual energy X-ray absorptiometry (DEXA) for body composition (BC) are powerful techniques to study the dynamics of energy and protein utilization in poultry. The first two chapters present the BC (dry matter, lean, protein, fat, bone mineral, calcium, and phosphorus) of modern broilers from 1 to 60 d of age analyzed by chemical analysis and DEXA. DEXA has been validated for precision, standardized for position, and equations and validations developed for chickens under two different feeding levels. These equations are unique to the machine and software in use. Research in broilers …


Factors Impacting Transgender Patients’ Discomfort With Their Family Physicians: A Respondent-Driven Sampling Survey, Greta R. Bauer, Xuchen Zong, Ayden I. Scheim, Rebecca Hammond, Amardeep Thind Dec 2015

Epidemiology and Biostatistics Publications

BACKGROUND: Representing approximately 0.5% of the population, transgender (trans) persons in Canada depend on family physicians for both general and transition-related care. However, physicians receive little to no training on this patient population, and trans patients are often profoundly uncomfortable and may avoid health care. This study examined factors associated with patient discomfort discussing trans health issues with a family physician in Ontario, Canada.

METHODS: 433 trans people age 16 and over were surveyed using respondent-driven sampling for the Trans PULSE Project; 356 had a family physician. Weighted logistic regression models were fit to produce prevalence risk ratios (PRRs) via …


Objective Bayesian Analysis On The Quantile Regression, Shiyi Tu Dec 2015

All Dissertations

The dissertation consists of two distinct but related research projects. First of all, we study the Bayesian analysis on the two-piece location-scale models, which contain several well-known sub-distributions, such as the asymmetric Laplace distribution, the skewed normal distribution, and the skewed Student-t distribution. The use of two-piece location-scale models is an attractive method to model non-symmetric data. From a practical point of view, a prior with some objective information may be more reasonable due to the lack of prior information in many applied situations. It has been shown that several commonly used objective priors, such as the Jeffreys prior, result …


Meta-Analysis Of Lapatinib Plus Capecitabine Versus Capecitabine In The Treatment Of HER2 Positive Breast Cancer, Lynda Smith Dec 2015

Culminating Projects in Applied Statistics

BACKGROUND: Breast cancer is the most common type of cancer in women despite advances in research and detection methods. Approximately 25 to 30 percent of newly diagnosed cases of breast cancer will overexpress HER2, human epidermal growth factor receptor 2, and are at a greater risk for disease progression and poorer clinical outcomes. The traditional treatment is associated with irreversible cardiac dysfunction. An alternative treatment involving lapatinib plus capecitabine has been reported in some randomized controlled clinical trials comparing treatment outcomes. To quantify the effectiveness of lapatinib plus capecitabine combination therapy versus capecitabine monotherapy in treating metastatic breast cancer, a …


Rank Based Procedures For Ordered Alternative Models, Yuanyuan Shao Dec 2015

Dissertations

The ordered alternatives in a one-way layout with k ordered treatment levels are appropriate for many applications, especially in psychology and medicine. There is extensive literature in this area, and many parametric and nonparametric approaches have been introduced. The Abelson-Tukey (AT) test is a frequently used parametric method. Its coefficients provide an ideal way of combining means for the purpose of detecting a monotonic relationship between the independent and dependent variables. The AT method, though, is not robust. Furthermore, our initial empirical studies show that it is not more powerful than the Jonckheere-Terpstra (JT) and the Hettmansperger-Norton (HN) nonparametric tests …


A Statistical Model For The Prediction Of Dissolved Oxygen Dynamics And The Potential For Hypoxia In The Mississippi Sound And Bight, Andreas Moshogianis Dec 2015

Master's Theses

Hypoxia events occur when dissolved oxygen concentrations fall below the minimum threshold (dissolved oxygen concentrations < 2 mg O₂ L⁻¹) necessary to avoid respiratory distress among aquatic organisms. In the Mississippi Sound and Bight, hypoxia is most prevalent from late spring through late summer. Since hypoxia events can have dramatic effects on coastal fisheries, the spatial and temporal magnitude of hypoxia presents a clear threat to the productive fisheries in the northern Gulf of Mexico. Long-term hydrographic data were collected from eight sampling stations on a monthly basis from January 2009 to December 2011 along a cross-shelf transect from the mouth of …


Probabilistic Graphical Modeling On Big Data, Ming-Hua Chung Dec 2015

Graduate Theses and Dissertations

The rise of Big Data in recent years brings many challenges to modern statistical analysis and modeling. In toxicogenomics, the advancement of high-throughput screening technologies facilitates the generation of massive amounts of biological data, a big data phenomenon in biomedical science. Yet, researchers still rely heavily on keyword search and/or literature review to navigate the databases, and analyses are often done on a rather small scale. As a result, the rich information of a database has not been fully utilized, particularly the information embedded in the interactions between data points, which is largely ignored and buried. For the past …