Physical Sciences and Mathematics | Open Access Articles

Inferential Methods For High-Throughput Methylation Data, Maria Capparuccini Nov 2010

Theses and Dissertations

The role of abnormal DNA methylation in the progression of disease is a growing area of research that relies upon the establishment of sound statistical methods. The common method for declaring there is differential methylation between two groups at a given CpG site, as summarized by the difference between proportions methylated $\delta\beta = \beta_1 - \beta_2$, has been through use of a Filtered Two Sample t-test, using the recommended filter of 0.17 (Bibikova et al., 2006b). In this dissertation, we performed a re-analysis of the data used in recommending the threshold by fitting a mixed-effects ANOVA model. It was determined that the 0.17 filter …
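One way to picture a filtered two-sample t-test is sketched below. It is a minimal illustration, not the procedure from the dissertation or from Bibikova et al.: it assumes that a CpG site is declared differentially methylated only when the absolute difference in mean proportions methylated reaches the 0.17 filter and a Welch-type t statistic exceeds a critical value. The function name, the `t_crit` parameter and the sample values are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def filtered_t_test(beta1, beta2, delta_filter=0.17, t_crit=2.0):
    """Illustrative filtered two-sample t-test for one CpG site.

    beta1, beta2: proportions methylated in each group (at least 2 values each).
    delta_filter: minimum absolute difference in means (0.17 in the abstract).
    t_crit: critical value for |t|; an assumed placeholder, not a tabulated one.
    """
    delta_beta = mean(beta1) - mean(beta2)
    # Welch-type standard error; no equal-variance assumption is made.
    se = sqrt(variance(beta1) / len(beta1) + variance(beta2) / len(beta2))
    if se == 0:
        raise ValueError("both groups have zero variance; t is undefined")
    t_stat = delta_beta / se
    passes_filter = abs(delta_beta) >= delta_filter
    significant = abs(t_stat) >= t_crit
    # Assumption: both conditions must hold to declare differential methylation.
    return delta_beta, t_stat, passes_filter and significant


# Made-up beta values for two groups at a single site.
group1 = [0.80, 0.85, 0.78, 0.82]
group2 = [0.40, 0.45, 0.38, 0.42]
delta, t_stat, declared = filtered_t_test(group1, group2)
print(round(delta, 3), declared)
```

In practice the critical value would come from a t distribution with the appropriate degrees of freedom and a multiplicity adjustment; the fixed `t_crit` here only keeps the sketch dependency-free.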

Power And Sample Size For Three-Level Cluster Designs, Tina Cunningham Nov 2010

Over the past few decades, Cluster Randomized Trials (CRT) have become a design of choice in many research areas. One of the most critical issues in planning a CRT is to ensure that the study design is sensitive enough to capture the intervention effect. The assessment of power and sample size in such studies is often faced with many challenges due to several methodological difficulties. While studies on power and sample size for cluster designs with one and two levels are abundant, the evaluation of required sample size for three-level designs has been generally overlooked. First, the nesting effect introduces …

Stereotype Logit Models For High Dimensional Data, Andre Williams Oct 2010

Gene expression studies are of growing importance in the field of medicine. In fact, subtypes within the same disease have been shown to have differing gene expression profiles (Golub et al., 1999). Often, researchers are interested in differentiating a disease by a categorical classification indicative of disease progression. For example, it may be of interest to identify genes that are associated with progression and to accurately predict the state of progression using gene expression data. One challenge when modeling microarray gene expression data is that there are more genes (variables) than there are observations. In addition, the genes usually demonstrate …

An Inferential Framework For Network Hypothesis Tests: With Applications To Biological Networks, Phillip Yates Jun 2010

The analysis of weighted co-expression gene sets is gaining momentum in systems biology. In addition to substantial research directed toward inferring co-expression networks on the basis of microarray/high-throughput sequencing data, inferential methods are being developed to compare gene networks across one or more phenotypes. Common gene set hypothesis testing procedures are mostly confined to comparing average gene/node transcription levels between one or more groups and make limited use of additional network features, e.g., edges induced by significant partial correlations. Ignoring the gene set architecture disregards relevant network topological comparisons and can result in familiar n<

An Empirical Approach To Evaluating Sufficient Similarity: Utilization Of Euclidean Distance As A Similarity Measure, Scott Marshall May 2010

Individuals are exposed to chemical mixtures while carrying out everyday tasks, with unknown risk associated with exposure. Given the number of resulting mixtures it is not economically feasible to identify or characterize all possible mixtures. When complete dose-response data are not available on a (candidate) mixture of concern, EPA guidelines define a similar mixture based on chemical composition, component proportions and expert biological judgment (EPA, 1986, 2000). Current work in this literature is by Feder et al. (2009), evaluating sufficient similarity in exposure to disinfection by-products of water purification using multivariate statistical techniques and traditional hypothesis testing. The work of …

Cost And Accuracy Comparisons In Medical Testing Using Sequential Testing Strategies, Anwar Ahmed May 2010

The practice of sequential testing is followed by the evaluation of accuracy, but often not by the evaluation of cost. This research described and compared three sequential testing strategies: believe the negative (BN), believe the positive (BP) and believe the extreme (BE), the latter being a less-examined strategy. All three strategies were used to combine results of two medical tests to diagnose a disease or medical condition. Descriptions of these strategies were provided in terms of accuracy (using the maximum receiver operating characteristic curve, or MROC) and cost of testing (defined as the proportion of subjects who need two tests to …
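The BN and BP strategies can be illustrated with a short sketch. It relies on the usual textbook reading of the two rules (under BN the second test is given only after a positive first test and both must be positive; under BP the second test is given only after a negative first test and either positive suffices) and on the strong assumption that the two tests are conditionally independent given disease status. Neither the independence assumption nor the numbers below come from the dissertation, and the BE strategy is omitted because it needs two thresholds on the first test.

```python
def believe_the_negative(se1, sp1, se2, sp2, prevalence):
    """BN: a second test follows only a positive first test; both must be positive."""
    sensitivity = se1 * se2
    specificity = sp1 + (1.0 - sp1) * sp2
    # Cost: proportion of subjects who are positive on test 1 and so need test 2.
    cost = prevalence * se1 + (1.0 - prevalence) * (1.0 - sp1)
    return sensitivity, specificity, cost


def believe_the_positive(se1, sp1, se2, sp2, prevalence):
    """BP: a second test follows only a negative first test; either positive counts."""
    sensitivity = se1 + (1.0 - se1) * se2
    specificity = sp1 * sp2
    # Cost: proportion of subjects who are negative on test 1 and so need test 2.
    cost = prevalence * (1.0 - se1) + (1.0 - prevalence) * sp1
    return sensitivity, specificity, cost


# Hypothetical test characteristics and disease prevalence.
params = dict(se1=0.9, sp1=0.8, se2=0.8, sp2=0.9, prevalence=0.1)
for name, rule in [("BN", believe_the_negative), ("BP", believe_the_positive)]:
    sens, spec, cost = rule(**params)
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f} cost={cost:.2f}")
```

Under these assumptions BN trades sensitivity for specificity and BP does the reverse, while the cost of each depends on how often the first test returns the result that triggers the second.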

A Numerical Method For Estimating The Variance Of Age At Maximum Growth Rate In Growth Models, Semhar Ogbagaber Apr 2010

Most studies on maturation and body composition using the Fels Longitudinal data mention peak height velocity (PHV) as an important outcome measure. The PHV is often derived from growth models such as the triple logistic model fitted to the stature (height) data. The age at PHV is sometimes ordinalized to designate an individual as an early, average or late maturer. In theory, age at PHV is the age at which the rate of growth reaches the maximum. Theoretically, for a well-behaved growth function, this could be obtained by setting the second derivative of the growth function to zero and …
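The idea of locating the age at PHV by setting the second derivative of the growth function to zero can be sketched numerically. The example below swaps in a single logistic curve rather than the triple logistic model named in the abstract, uses a central finite difference for the second derivative, and finds the sign change by bisection. All parameter values and function names are hypothetical; for this curve the true answer is the `midpoint` parameter, which makes the result easy to check.

```python
from math import exp


def logistic_height(age, base=150.0, gain=20.0, rate=1.2, midpoint=13.5):
    """Single logistic growth curve (a stand-in, not the triple logistic model)."""
    return base + gain / (1.0 + exp(-rate * (age - midpoint)))


def second_derivative(f, t, h=1e-3):
    """Central finite-difference approximation of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)


def age_at_peak_velocity(f, lo, hi, tol=1e-6):
    """Bisect for the age where f'' changes sign from positive to negative."""
    if second_derivative(f, lo) <= 0 or second_derivative(f, hi) >= 0:
        raise ValueError("bracket must run from accelerating to decelerating growth")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if second_derivative(f, mid) > 0:
            lo = mid  # growth still accelerating: the peak lies later
        else:
            hi = mid  # growth decelerating: the peak lies earlier
    return 0.5 * (lo + hi)


age_phv = age_at_peak_velocity(logistic_height, 8.0, 18.0)
print(f"estimated age at PHV: {age_phv:.3f}")
```

Bisection is chosen here only because it needs no derivative of the second derivative; a fitted model with several logistic components may have more than one inflection point, so the bracket would have to isolate the adolescent one.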

Bayesian And Frequentist Approaches For The Analysis Of Multiple Endpoints Data Resulting From Exposure To Multiple Health Stressors, Epiphanie Nyirabahizi Mar 2010

In risk analysis, benchmark dose (BMD) methodology is used to quantify the risk associated with exposure to stressors such as environmental chemicals. It consists of fitting a mathematical model to the exposure data; the BMD is the dose expected to result in a pre-specified response, or benchmark response (BMR). Most available exposure data are from single-chemical exposure, but living organisms are exposed to multiple sources of hazards. Furthermore, in some studies, researchers may observe multiple endpoints on one subject. Statistical approaches to address the multiple-endpoints problem can be partitioned into a dimension reduction group and a dimension preservative group. …
