Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Statistics and Probability

Theses/Dissertations

2015

Articles 1 - 21 of 21

Full-Text Articles in Life Sciences

Calorimetry And Body Composition Research In Broilers And Broiler Breeders, Justina Victoria Caldas Cueva Dec 2015

Graduate Theses and Dissertations

Indirect calorimetry to study heat production (HP) and dual energy X-ray absorptiometry (DEXA) for body composition (BC) are powerful techniques for studying the dynamics of energy and protein utilization in poultry. The first two chapters present the BC (dry matter, lean, protein, fat, bone mineral, calcium, and phosphorus) of modern broilers from 1–60 d of age analyzed by chemical analysis and DEXA. DEXA has been validated for precision, standardized for position, and equations and validations developed for chickens under two different feeding levels. These equations are unique to the machine and software in use. Research in broilers …


Macrobenthic Communities In The Northern Gulf Of Mexico Hypoxic Zone: Testing The Pearson-Rosenberg Model, Shivakumar Shivarudrappa Dec 2015

Dissertations

The Pearson and Rosenberg (P-R) conceptual model of macrobenthic succession was used for the first time to assess the impact of hypoxia (dissolved oxygen [DO] ≤ 2 mg/L) on the macrobenthic community on the continental shelf of the northern Gulf of Mexico. The model uses a stress-response relationship between environmental parameters and the macrobenthic community to determine the ecological condition of the benthic habitat. The ecological significance of dissolved oxygen in a benthic habitat is well understood. In addition, the annual recurrence of bottom-water hypoxia on the Louisiana/Texas shelf during summer months is well documented.

The P-R model illustrates the decreasing …


Estimation Problems In Complex Field Studies With Deep Interactions: Time-To-Event And Local Regression Models For Environmental Effects On Vital Rates, Krzysztof M. Sakrejda Nov 2015

Doctoral Dissertations

Field studies that measure vital rates in context over extended time periods are a cornerstone of our understanding of population processes. These studies inform us about the relationship between biological process and environmental noise in an irreplaceable way. These data sets bring "big data" and "big model" challenges, which limit the application of standard software (e.g., BUGS). The environmental sensitivity of vital rates is also expected to exhibit interactions and non-linearity, which typically result in difficult model selection questions in large data sets. Finally, long-term ecological data sets often contain complex temporal structure. In commonly applied discrete-time models complex temporal …


Individual Tree Measurements From Three-Dimensional Point Clouds, Elias Ayrey Aug 2015

Electronic Theses and Dissertations

This study develops and tests novel methodologies for measuring the attributes of individual trees from three-dimensional point clouds generated from an aerial platform. Recently, advancements in technology have allowed for the acquisition of very high resolution three-dimensional point clouds that can be used to map the forest in a virtual environment. These point clouds can be interpreted to produce valuable forest attributes across entire landscapes with minimal field labor, which can then aid forest managers in their planning and decision making.

Biometrics derived from point clouds are often generated on a plot level, with estimates spanning many meters (rather than …


Nanoscaled Cellulose And Its Carbonaceous Material: Application And Local Structure Investigation, Yujie Meng Aug 2015

Doctoral Dissertations

In this dissertation, the three-dimensional morphology, size distribution, and crystal structure of cellulose nanocrystals (CNCs) were statistically and quantitatively investigated. A lognormal distribution was identified as the most likely for the CNC size distribution. Height and width dimensions were shown to decrease toward the ends from the midpoint of individual CNCs, implying a spindle-like shape. XRD analysis of crystallite size, together with TEM and AFM measurements, revealed that the cross-sections of individual switchgrass CNCs were either rectangular or elliptical in shape, with a lateral element length of approximately 3–5 nm. A sponge-like carbon aerogel from microfibril cellulose with high porosity, ultra-low density, …


Germline Mutation Detection In Next Generation Sequencing Data And Tp53 Mutation Carrier Probability Estimation For Li-Fraumeni Syndrome, Gang Peng Aug 2015

Dissertations & Theses (Open Access)

Next generation sequencing technology has been widely used in genomic analysis, but its application has been compromised by missed true variants, especially when these variants are rare. We proposed a family-based variant calling method, FamSeq, integrating Mendelian transmission information with de novo mutation and sequencing data to improve variant calling accuracy. We investigated the factors affecting the improvement from family-based variant calling in simulated data and validated it in real sequencing data. In both simulated and real data, FamSeq works better than the single-individual-based method.

In FamSeq, we implemented four different methods for the Mendelian genetic …


Computational Modeling Of Rna-Small Molecule And Rna-Protein Interactions, Lu Chen Aug 2015

Dissertations & Theses (Open Access)

The past decade has witnessed an era of RNA biology; despite considerable discoveries, challenges remain when one aims to screen for RNA-interacting small molecules or RNA-interacting proteins. These challenges imply an immediate need for cost-efficient yet predictive computational tools capable of generating insightful hypotheses to discover novel RNA-interacting small molecules or RNA-interacting proteins. Thus, we implemented novel computational models in this dissertation to predict RNA-ligand interactions (Chapter 1) and RNA-protein interactions (Chapter 2).

Targeting RNA has not garnered interest comparable to targeting proteins, and is restricted by a lack of computational tools for structure-based drug design. To test the potential …


Using Capture-Mark-Recapture Techniques To Estimate Detection Probabilities & Fidelity Of Expression For The Critically Endangered James Spinymussel (Pleurobema Collina)., Alaina C. Esposito May 2015

Masters Theses, 2010-2019

The critically endangered James Spinymussel (Pleurobema collina) is a species of freshwater mussel endemic to Virginia’s James and Dan River basins. In the last 20 years, P. collina has experienced a substantial decline in numbers and currently occupies approximately 10% of its original habitat; however, little is known about this species that could assist in its conservation. A 230-meter reach of transitional habitat in Swift Run was selected for repeat observations to estimate detection probabilities using a Capture-Mark-Recapture framework. In June 2014, visual scouting began to locate and tag P. collina (including other mussels in the community) with PIT …


Annotation Tools For Multivariate Gene Set Testing Of Non-Model Organisms, Russell K. Banks May 2015

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Microarray chip technology enables researchers to obtain measures of gene activity for essentially all genes in an organism. After grouping genes into biologically meaningful sets, researchers employ certain statistical tests to identify which gene sets (biological processes) show different levels of activity across different treatment groups. The idea is to identify which biological processes are significantly affected by a certain treatment/condition in a given organism.

Non-model organisms (such as sheep) are not widely studied so gene set membership information is not always readily accessible. This thesis work utilizes two microarray studies involving sheep to provide researchers with working examples of …


Genetics Of Obesity In Starr County, Texas Mexican Americans, Heather M. Highland May 2015

Dissertations & Theses (Open Access)

Currently, over two-thirds of Americans are classified as overweight or obese. Obesity increases risk for many other diseases including type 2 diabetes, heart disease, stroke, and cancer, making obesity the largest public health problem in America and most other Westernized nations. Hispanics have a higher rate of both obesity and type 2 diabetes, making them a particularly interesting population in which to study obesity. For the last 33 years, the Starr County Health Studies has collected an array of phenotypes and biological samples from residents of Starr County, along the Texas-Mexico border. This study includes 825 subjects who were not known …


Summary Of Survival Analysis With Sas Procedures., Derek Duane Childers 1990- May 2015

Electronic Theses and Dissertations

The research conducted for this thesis was performed to summarize some of the most commonly used survival analysis techniques as well as to create one macro that will provide the solutions for these techniques. Some of the techniques that this thesis focuses on are survival and hazard functions, mean and median survival times, life table, log rank test, proportional hazards/model building, and competing risk. To further analyze these survival analysis techniques I will use the Bone Marrow Transplantation for Leukemia dataset. This trial consists of either acute myelocytic leukemia (AML 99 patients) or acute lymphoblastic leukemia (ALL 38 patients). There …
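One of the techniques listed above, the survival function, is commonly estimated with the Kaplan-Meier product-limit method. The thesis works with SAS procedures and a macro; the sketch below swaps in plain Python purely for illustration and is not the author's code. The function name `kaplan_meier` and its arguments are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate (illustrative sketch).

    times:  follow-up time for each subject
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)           # everyone leaving the risk set at t
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Events and censored subjects both leave the risk set after t.
        at_risk -= exits[t]
    return curve
```

Called on made-up toy data (not the Bone Marrow Transplantation dataset), `kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])` returns one survival estimate for each of the three distinct event times.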


Optcluster : An R Package For Determining The Optimal Clustering Algorithm And Optimal Number Of Clusters., Michael N. Sekula May 2015

Electronic Theses and Dissertations

Determining the best clustering algorithm and ideal number of clusters for a particular dataset is a fundamental difficulty in unsupervised clustering analysis. In biological research, data generated from Next Generation Sequencing technology and microarray gene expression data are becoming more and more common, so new tools and resources are needed to group such high dimensional data using clustering analysis. Different clustering algorithms can group data very differently. Therefore, there is a need to determine the best groupings in a given dataset using the most suitable clustering algorithm for that data. This paper presents the R package optCluster as an efficient …


Failing To Replicate: Hypothesis Testing As A Crucial Key To Make Direct Replications More Credible And Predictable, Pedro Fernando Mateu Bullón May 2015

Dissertations

Theory cannot be fully validated unless the original results have been replicated, resulting in conclusion consistency. Replications are the strongest source to verify research findings and knowledge claims. Sciences such as medicine, chemistry, physics, genetics, and biology are considered successful because their knowledge claims are buttressed by a large set of replications of original studies. Unfortunately, in the social sciences many attempts to replicate fail, and thus there is a continuing need for replication studies to confirm facts, expand knowledge to gain new understanding, and verify hypotheses. Two plausible explanations for the failure to replicate in the social sciences could …


Zero-Inflated Models To Identify Transcription Factor Binding Sites In Chip-Seq Experiments, Sameera Dhananjaya Viswakula Apr 2015

Mathematics & Statistics Theses & Dissertations

It is essential to determine the protein-DNA binding sites to understand many biological processes. A transcription factor is a particular type of protein that binds to DNA and controls gene regulation in living organisms. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is considered the gold standard in locating these binding sites, and programs used to identify DNA-transcription factor binding sites are known as peak-callers. ChIP-seq data are known to exhibit considerable background noise and other biases. In this study, we propose a negative binomial model (NB), a zero-inflated Poisson model (ZIP) and a zero-inflated negative binomial model (ZINB) for peak-calling. …
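To make the zero-inflated Poisson idea concrete, here is a minimal sketch of the standard ZIP probability mass function: with probability `pi` a count is a structural zero, and otherwise it follows a Poisson distribution with mean `lam`. This is the textbook definition only, not the author's peak-caller; the function and parameter names are assumptions chosen for illustration.

```python
import math


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson probability of observing count k.

    lam: mean of the Poisson component (assumed name)
    pi:  probability of a structural zero (assumed name)
    """
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from both the structural-zero and Poisson components.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson
```

Setting `pi = 0` recovers the ordinary Poisson distribution, which is one way to see why the zero-inflated form suits count data with more zeros than a Poisson model alone would predict.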


Statistics In The Billera-Holmes-Vogtmann Treespace, Grady S. Weyenberg Jan 2015

Theses and Dissertations--Statistics

This dissertation is an effort to adapt two classical non-parametric statistical techniques, kernel density estimation (KDE) and principal components analysis (PCA), to the Billera-Holmes-Vogtmann (BHV) metric space for phylogenetic trees. This adaption gives a more general framework for developing and testing various hypotheses about apparent differences or similarities between sets of phylogenetic trees than currently exists.

For example, while the majority of gene histories found in a clade of organisms are expected to be generated by a common evolutionary process, numerous other coexisting processes (e.g. horizontal gene transfers, gene duplication and subsequent neofunctionalization) will cause some genes to exhibit a …


A Meta-Analysis Of Association Between One-Carbon Metabolism Gene Polymorphisms And Risk Of Prostate Cancer, Mahmood Tazari Jan 2015

Walden Dissertations and Doctoral Studies

Prostate cancer is the most common cancer among men. The purpose of this quantitative meta-analysis was to examine one-carbon metabolism gene polymorphisms in a group of genes to determine their association with prostate cancer risk. The genetic epidemiology theory provided the framework for the study. The data collected were from published articles. From over 2,800 individual studies, 20 articles were retained for results and data abstraction, following title and abstract screening and, in the second phase, full-text screening. The data were analyzed by a meta-analysis statistical method, combining the results from selected studies to estimate the overall association. …


High-Throughput Data Analysis: Application To Micronuclei Frequency And T-Cell Receptor Sequencing, Mateusz Makowski Jan 2015

Theses and Dissertations

The advent of high-throughput sequencing has brought about the creation of an unprecedented amount of research data. Analytical methodology has not been able to keep pace with the plethora of data being produced. Two assays are considered, ImmunoSEQ and the cytokinesis-block micronucleus (CBMN) assay, both of which produce count data and have few methods available for their analysis.

ImmunoSEQ is a sequencing assay that measures the beta T-cell receptor (TCR) repertoire. The ImmunoSEQ assay was used to describe the TCR repertoires of patients that have undergone hematopoietic stem cell transplantation (HSCT). Several different methods for spectratype analysis were extended to the TCR …


Proof-Of-Concept Of Environmental Dna Tools For Atlantic Sturgeon Management, Jameson Hinkle Jan 2015

Theses and Dissertations

The Atlantic Sturgeon (Acipenser oxyrinchus oxyrinchus, Mitchell) is an anadromous species that spawns in tidal freshwater rivers from Canada to Florida. Overfishing, river sedimentation and alteration of the river bottom have decreased Atlantic Sturgeon populations, and NOAA lists the species as endangered. Ecologists sometimes find it difficult to locate individuals of a species that is rare, endangered or invasive. Less invasive methods that can better resolve the presence of cryptic species are needed. Environmental DNA (eDNA) is a non-invasive means of detecting rare, endangered, or invasive species by isolating nuclear or mitochondrial DNA (mtDNA) from the …


Graph-Based Regularization In Machine Learning: Discovering Driver Modules In Biological Networks, Xi Gao Jan 2015

Theses and Dissertations

Curiosity of human nature drives us to explore the origins of what makes each of us different. From ancient legends and mythology, through Mendel's laws and the Punnett square, to modern genetic research, we carry on this old but eternal question. Thanks to the technological revolution, today's scientists try to answer this question using easily measurable gene expression and other profiling data. However, the exploration can easily get lost in data of growing volume, dimension, noise and complexity. This dissertation is aimed at developing new machine learning methods that take data from different classes as input, augment them with knowledge of feature relationships, …


A Model For Determining Drivers Of Phenology In Western United States Rangelands, Joseph R. St. Peter Jan 2015

Graduate Student Theses, Dissertations, & Professional Papers

Plant phenology has long been used as an indicator of climate. Recent changes in plant phenology are evidence of the influence of climate change. Modeling plant phenology has become an effective tool to understand the impacts of climate change. Using machine learning techniques I developed a modeling process for accurately predicting phenology across a diverse landscape. This model uses individual site data to set site specific climate thresholds for plant phenology. This model also identifies the limiting factors to vegetation phenology for rangelands in the western United States. NDVI remotely sensed data was used to quantify land surface phenology and …


Evaluation Of The Signature Molecular Descriptor With Blosum62 And An All-Atom Description For Use In Sequence Alignment Of Proteins, Lindsay M. Aichinger Jan 2015

Williams Honors College, Honors Research Projects

This Honors Project focused on a few aspects of this topic. The second is comparing the molecular signature kernels to three of the BLOSUM matrices (30, 62, and 90) to test the accuracy of the mathematical model. The kernel matrix was manipulated in order to improve the relationship by focusing on side groups and also by changing how the structure was represented in the matrix by increasing the initial height distance from the central atom (Height 1 and Height 2 included).

There were multiple design constraints for this project. The first was the comparison with the BLOSUM matrices (30, 62, …