Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 11 of 11

Full-Text Articles in Physical Sciences and Mathematics

Deephtlv: A Deep Learning Framework For Detecting Human T-Lymphotrophic Virus 1 Integration Sites, Johnathan Jia, Johnathan Jia May 2023

Deephtlv: A Deep Learning Framework For Detecting Human T-Lymphotrophic Virus 1 Integration Sites, Johnathan Jia, Johnathan Jia

Dissertations & Theses (Open Access)

In the 1980s, researchers found the first human oncogenic retrovirus called human T-lymphotrophic virus type 1 (HTLV-1). Since then, HTLV-1 has been identified as the causative agent behind several diseases such as adult T-cell leukemia/lymphoma (ATL) and a HTLV-1 associated myelopathy or tropical spastic paraparesis (HAM/TSP). As part of its normal replication cycle, the genome is converted into DNA and integrated into the genome. With several hundreds to thousands of unique viral integration sites (VISs) distributed with indeterminate preference throughout the genome, detection of HTLV-1 VISs is a challenging task. Experimental studies typically use molecular biology …


Development Of Graphical Models And Statistical Physics Motivated Approaches To Genomic Investigations, Yashwanth Lagisetty Aug 2022

Development Of Graphical Models And Statistical Physics Motivated Approaches To Genomic Investigations, Yashwanth Lagisetty

Dissertations & Theses (Open Access)

Identifying genes involved in disease pathology has been a goal of genomic research since the early days of the field. However, as technology improves and the body of research grows, we are faced with more questions than answers. Among these is the pressing matter of our incomplete understanding of the genetic underpinnings of complex diseases. Many hypotheses offer explanations as to why direct and independent analyses of variants, as done in genome-wide association studies (GWAS), may not fully elucidate disease genetics. These range from pointing out flaws in statistical testing to invoking the complex dynamics of epigenetic processes. In the …


Mixture Model Approaches To Integrative Analysis Of Multi-Omics Data And Spatially Correlated Genomic Data, Ziqiao Wang May 2021

Mixture Model Approaches To Integrative Analysis Of Multi-Omics Data And Spatially Correlated Genomic Data, Ziqiao Wang

Dissertations & Theses (Open Access)

Integrative genomic data analysis is a powerful tool to study the complex biological processes behind a disease. Statistical methods can model the interrelationships of the involved gene activities through jointly analyzing multiple types of genomic data from different platforms (vertical integration), or improve the power of a study through aggregating the same type of genomic data across studies (horizontal integration). In this dissertation, we propose statistical methods and strategies for integrative multi-omics data in association analysis of disease phenotypes, with an emphasis on cancer applications.

We develop a new strategy based on horizontal integration by leveraging publicly available datasets into …


Biases And Blind-Spots In Genome-Wide Crispr-Cas9 Knockout Screens, Merve Dede May 2021

Biases And Blind-Spots In Genome-Wide Crispr-Cas9 Knockout Screens, Merve Dede

Dissertations & Theses (Open Access)

Adaptation of the bacterial CRISPR-Cas9 system to mammalian cells revolutionized the field of functional genomics, enabling genome-scale genetic perturbations to study essential genes, whose loss of function results in a severe fitness defect. There are two types of essential genes in a cell. Core essential genes are absolutely required for growth and proliferation in every cell type. On the other hand, context-dependent essential genes become essential in an environmental or genetic context. The concept of context-dependent gene essentiality is particularly important in cancer, since killing cancer cells selectively without harming surrounding healthy tissue remains a major challenge. The toxicity of …


Statistical Methods For Resolving Intratumor Heterogeneity With Single-Cell Dna Sequencing, Alexander Davis Aug 2020

Statistical Methods For Resolving Intratumor Heterogeneity With Single-Cell Dna Sequencing, Alexander Davis

Dissertations & Theses (Open Access)

Tumor cells have heterogeneous genotypes, which drives progression and treatment resistance. Such genetic intratumor heterogeneity plays a role in the process of clonal evolution that underlies tumor progression and treatment resistance. Single-cell DNA sequencing is a promising experimental method for studying intratumor heterogeneity, but brings unique statistical challenges in interpreting the resulting data. Researchers lack methods to determine whether sufficiently many cells have been sampled from a tumor. In addition, there are no proven computational methods for determining the ploidy of a cell, a necessary step in the determination of copy number. In this work, software for calculating probabilities from …


Statistical Methods For Two Problems In Cancer Research: Analysis Of Rna-Seq Data From Archival Samples And Characterization Of Onset Of Multiple Primary Cancers, Jialu Li May 2017

Statistical Methods For Two Problems In Cancer Research: Analysis Of Rna-Seq Data From Archival Samples And Characterization Of Onset Of Multiple Primary Cancers, Jialu Li

Dissertations & Theses (Open Access)

My dissertation is focused on quantitative methodology development and application for two important topics in translational and clinical cancer research.

The first topic was motivated by the challenge of applying transcriptome sequencing (RNA-seq) to formalin-fixation and paraffin-embedding (FFPE) tumor samples for reliable diagnostic development. We designed a biospecimen study to directly compare gene expression results from different protocols to prepare libraries for RNA-seq from human breast cancer tissues, with randomization to fresh-frozen (FF) or FFPE conditions. To comprehensively evaluate the FFPE RNA-seq data quality for expression profiling, we developed multiple computational methods for assessment, such as the uniformity and continuity …


Germline Mutation Detection In Next Generation Sequencing Data And Tp53 Mutation Carrier Probability Estimation For Li-Fraumeni Syndrome, Gang Peng Aug 2015

Germline Mutation Detection In Next Generation Sequencing Data And Tp53 Mutation Carrier Probability Estimation For Li-Fraumeni Syndrome, Gang Peng

Dissertations & Theses (Open Access)

Next generation sequencing technology has been widely used in genomic analysis, but its application has been compromised by the missing true variants, especially when these variants are rare. We proposed a family-based variant calling method, FamSeq, integrating Mendelian transmission information with de novo mutation and sequencing data to improve the variant calling accuracy. We investigated the factors impacting the improvement of family-based variant calling in simulation data and validated it in real sequencing data. In both simulation and real data, FamSeq works better than the single individual based method.

In FamSeq, we implemented four different methods for the Mendelian genetic …


Genetics Of Obesity In Starr County, Texas Mexican Americans, Heather M. Highland May 2015

Genetics Of Obesity In Starr County, Texas Mexican Americans, Heather M. Highland

Dissertations & Theses (Open Access)

Currently, over two-thirds of Americans are classified as over-weight or obese. Obesity increases risk for many other diseases including type 2 diabetes, heart disease, stroke, and cancer, making obesity the largest public health problem in America and most other Westernized nations. Hispanics have a higher rate of both obesity and type 2 diabetes, making them a particularly interesting population in which to study obesity. For the last 33 years, the Starr County Health Studies has collected an array of phenotypes and biological samples from residents of Starr County, along Texas-Mexico border. This study includes 825 subjects who were not known …


Genetic Predictors Of Metabolic Side Effects Of Diuretic Therapy, Jorge L. Del Aguila Aug 2014

Genetic Predictors Of Metabolic Side Effects Of Diuretic Therapy, Jorge L. Del Aguila

Dissertations & Theses (Open Access)

Thiazide diuretics are a recommended first-line monotherapy for hypertension (i.e.SBP>140 mmHg or DBP>90 mmHg). Even so, diuretics are associated with adverse metabolic side effects, such as hyperlipidemia, hyperglycemia and hypokalemia which increase the risk of developing type II diabetes. This thesis used three analytical strategies to identify and quantify genetic factors that contribute to the development of adverse metabolic effects due to thiazide diuretic treatment. I performed a genome-wide association study (GWAS) and meta-analysis of the change in fasting plasma glucose and triglycerides in response to HCTZ from two different clinical trials: the Pharmacogenomic Evaluation of Antihypertensive Responses …


The Association Between The Il-1 Pathway, Isaac C. Wun May 2014

The Association Between The Il-1 Pathway, Isaac C. Wun

Dissertations & Theses (Open Access)

Cutaneous malignant melanoma (CMM) is a potentially lethal malignancy that warrants attention and further research, as it is known to that there is an increasing rate of incidence in theUnited States, and it is also known that exposure to UV light is its most crucial risk factor, and family history of melanoma is also an important risk factor. Melanoma is an aggressive and lethal cancer in humans. There are an estimated new 132,000 melanoma cases annually worldwide, and the trend has doubled in the past 20 years. However, attempts to treat melanoma have encountered considerable resistance and remained ineffective. The …


Gene By Bmi Interactions Influencing C-Reactive Protein Levels In European-Americans, Sarah Tudor Aug 2011

Gene By Bmi Interactions Influencing C-Reactive Protein Levels In European-Americans, Sarah Tudor

Dissertations & Theses (Open Access)

C-Reactive Protein (CRP) is a biomarker indicating tissue damage, inflammation, and infection. High-sensitivity CRP (hsCRP) is an emerging biomarker often used to estimate an individual’s risk for future coronary heart disease (CHD). hsCRP levels falling below 1.00 mg/l indicate a low risk for developing CHD, levels ranging between 1.00 mg/l and 3.00 mg/l indicate an elevated risk, and levels exceeding 3.00 mg/l indicate high risk. Multiple Genome-Wide Association Studies (GWAS) have identified a number of genetic polymorphisms which influence CRP levels. SNPs implicated in such studies have been found in or near genes of interest including: CRP, APOE, APOC, IL-6, …