Open Access. Powered by Scholars. Published by Universities.®

Genomics Commons

Articles 1 - 30 of 111

Full-Text Articles in Genomics

Model-Based Deep Autoencoders For Clustering Single-Cell RNA Sequencing Data With Side Information, Xiang Lin Dec 2023

Dissertations

Clustering analysis has been conducted extensively in single-cell RNA sequencing (scRNA-seq) studies. scRNA-seq can profile the activity of tens of thousands of genes within a single cell, and thousands or tens of thousands of cells can be captured simultaneously in a typical scRNA-seq experiment. Biologists would like to cluster these cells to explore and elucidate cell types or subtypes. Numerous methods have been designed for clustering scRNA-seq data. Yet single-cell technologies have developed so fast in the past few years that existing methods have not kept up with these rapid changes and fail to reach their full potential. For instance, besides profiling transcription …


Motif-Cluster: A Spatial Clustering Package For Repetitive Motif Binding Patterns, Mengyuan Zhou Nov 2023

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Previous genome-wide analyses of transcription factor binding sites (TFBSs) have overlooked the importance of ranking potentially significant regulatory regions, especially those with repetitive binding within a local region. Identifying these homogeneous binding sites is critical because they can amplify the binding affinity and regulatory activity of transcription factors, affecting gene expression and cellular functions. To address this issue, we developed an open-source tool, Motif-Cluster, that prioritizes and visualizes transcription factor regulatory regions by incorporating the idea of local motif clusters. Motif-Cluster can rank the significant transcription factor regulatory regions without the need for experimental …


Convolutional Neural Network-Based Gene Prediction Using Buffalograss As A Model System, Michael Morikone Nov 2023

Complex Biosystems PhD Program: Dissertations

Gene prediction has seen little algorithmic improvement since the first gene-prediction algorithms were developed thirty years ago. Rather than iteratively improving the underlying algorithms by adopting better-performing models, most current approaches update existing tools by incorporating increasing amounts of extrinsic data. Genes are traditionally predicted using Hidden Markov Models (HMMs). These HMMs are constrained by strict assumptions about the independence of genes that do not always hold true. To address this, a Convolutional Neural Network (CNN) …


DeepHTLV: A Deep Learning Framework For Detecting Human T-Lymphotrophic Virus 1 Integration Sites, Johnathan Jia May 2023

Dissertations & Theses (Open Access)

In the 1980s, researchers found the first human oncogenic retrovirus, human T-lymphotrophic virus type 1 (HTLV-1). Since then, HTLV-1 has been identified as the causative agent of several diseases, such as adult T-cell leukemia/lymphoma (ATL) and HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP). As part of its normal replication cycle, the viral genome is converted into DNA and integrated into the host genome. With several hundreds to thousands of unique viral integration sites (VISs) distributed with indeterminate preference throughout the genome, detection of HTLV-1 VISs is a challenging task. Experimental studies typically use molecular biology …


Complete Genome Sequences Of Chop, DelRio, And GrandSlam, Three Gordonia Phages Isolated From Soil In Central Arkansas, Heidi N. Mathes, Elijah I. Christenson, John H. Crum, Emme M. Edmondson, Kassidy E. Gray, Luke W. Lawson, Lauren E. Lee, Michael P. Lee, Jackson A. Lipscomb, Morgan E. Masengale, Hannah G. Matthews, Charles M. McClain 4th, Tuesday N. Melton, Trace H. Morrow, Alexis M. Perry, David R. Rainwater, Grace E. Renois, Maryann F. Rettig, Duncan C. Troup, Allie J. Wilson, Nathan Reyna, Ruth Plymale Apr 2023

Articles

Chop, DelRio, and GrandSlam are phage with a Siphoviridae morphotype isolated from soil in Arkansas using the host Gordonia terrae 3612. All three are temperate, and their genomes share at least 96% nucleotide identity. These phage are assigned to cluster DI based on gene content similarity to other sequenced actinobacteriophage.


Physiological And Transcriptomic Responses Of Two Artemisia Californica Populations To Drought: Implications For Restoring Drought-Resilient Native Communities, Hagop S. Atamian Dr., Jennifer L. Funk Apr 2023

Biology, Chemistry, and Environmental Sciences Faculty Articles and Research

As climate change brings drier and more variable rainfall patterns to many arid and semi-arid regions, land managers must re-assemble appropriate plant communities for these conditions. Transcriptome sequencing can elucidate the molecular mechanisms underlying plant responses to changing environmental conditions, potentially enhancing our ability to screen suitable genotypes and species for restoration. We examined physiological and morphological traits and transcriptome sequences of coastal and inland populations of California sagebrush (Artemisia californica), a critical shrub used to restore coastal sage scrub vegetation communities, grown under low and high rainfall environments. The populations are located approximately 36 km apart but …


Intellectual Disability Related To De Novo Germline Loss Of The Distal End Of The P-Arm Of Chromosome 17: A Case Report, Eden Pope, Matthew Huertas, Amar Paul, Braden Cunningham, Matthew Jennings, Ryan Perry, Stephanie Chavez, John A. Kriak, Kyle B. Bills, David W. Sant Feb 2023

Annual Research Symposium

Hypothesis/Purpose: In this report we present the case of a 20-year-old female with congenital intellectual disability, stunted growth, and hypothyroidism. Comparative genomic hybridization (CGH) revealed a loss of 17p13.3, and the deletion was not present in either parent. This deletion has not previously been characterized, but mutations on the p-arm of chromosome 17 are responsible for Miller-Dieker Syndrome and Isolated Lissencephaly Sequence, both of which share symptoms with the patient.

Methods: Peripheral blood mononuclear cells (PBMCs) were used for karyotyping and comparative genomic hybridization (CGH). Bioinformatic analysis was carried out using the Genome Data Viewer (ncbi.nlm.nih.gov/genome/gdv).

Results: Karyotype was …


Presentation Of Paired P- And Q-Arm Mosaic Deletions On Chromosome 18 Associated With Neuropsychiatric Symptoms, Jackson Nielsen, Laura Minor, John Dougherty Jr., Paige Moore, Kailee Edwards, Brandon Burrell, Jameson Williams, John A. Kriak, David W. Sant, Kyle B. Bills Feb 2023

Annual Research Symposium

No abstract provided.


Extracting High-Molecular Weight DNA From Cyanobacteria Using Promega's Wizard® HMW DNA Extraction Kit With A Modified Protocol, Metis, Megan A. Hept, Lesley H. Greene Jan 2023

Chemistry & Biochemistry Faculty Publications

High-molecular-weight (HMW) DNA with little to no fragmentation and high purity, as needed for long-read sequencing, is difficult to extract from cyanobacterial species. Here we describe a modified extraction method using Promega's Wizard® HMW DNA Extraction Kit to acquire HMW DNA from cyanobacterial species. The kit protocol used is “3.D. Isolating HMW DNA from Gram-Positive and Gram-Negative Bacteria”. During a key step in the protocol, the lingering remnants of the mucilage layer of the cyanobacterial species are removed, preventing them from sticking to the DNA pellet produced. This customized modification …


Development Of Graphical Models And Statistical Physics Motivated Approaches To Genomic Investigations, Yashwanth Lagisetty Aug 2022

Dissertations & Theses (Open Access)

Identifying genes involved in disease pathology has been a goal of genomic research since the early days of the field. However, as technology improves and the body of research grows, we are faced with more questions than answers. Among these is the pressing matter of our incomplete understanding of the genetic underpinnings of complex diseases. Many hypotheses offer explanations as to why direct and independent analyses of variants, as done in genome-wide association studies (GWAS), may not fully elucidate disease genetics. These range from pointing out flaws in statistical testing to invoking the complex dynamics of epigenetic processes. In the …


Halodash: The Deep And Shallow History Of Aquatic Life's Passages Between Marine And Freshwater Habitats, Eric T. Schultz, Lisa Park Boush May 2022

EEB Articles

This series of papers highlights research into how biological exchanges between salty and freshwater habitats have transformed the biosphere. Life in the ocean and in freshwaters have long been intertwined; multiple major branches of the tree of life originated in the oceans and then adapted to and diversified in freshwaters. Similar exchanges continue to this day, including some species that continually migrate between marine and fresh waters. The series addresses key themes of transitions, transformations, and current threats with a series of questions: When did major colonizations of fresh waters happen? What physiographic changes facilitated transitions? What organismal characteristics facilitate …


Computational Methods To Analyze Next-Generation Sequencing Data In Genomics And Metagenomics, Saidi Wang Jan 2022

Electronic Theses and Dissertations, 2020-

This thesis focuses on two important computational problems in genomics and metagenomics, using publicly available next-generation sequencing data. One is about gene regulation, for which we explore how distal regulatory elements may interact with the proximal regulatory elements. The other is about metagenomics, in which we study how to reconstruct bacterial strain genomes from shotgun reads. Studying gene regulation, especially distal gene regulation, is important because regulatory elements, including those in distal regulatory regions, orchestrate when, where and how much a gene is activated under every experimental condition. Their dysfunction results in various types of diseases. Moreover, the current …


Genomics Of Postprandial Lipidomics In The Genetics Of Lipid-Lowering Drugs And Diet Network Study, Marguerite R. Irvin, May E. Montasser, Tobias Kind, Sili Fan, Dinesh K. Barupal, Amit Patki, Rikki M. Tanner, Nicole D. Armstrong, Kathleen A. Ryan, Steven A. Claas, Jeffrey R. O’Connell, Hemant K. Tiwari, Donna K. Arnett Nov 2021

Epidemiology and Environmental Health Faculty Publications

Postprandial lipemia (PPL) is an important risk factor for cardiovascular disease. Inter-individual variation in the dietary response to a meal is known to be influenced by genetic factors, yet genes that dictate variation in postprandial lipids are not completely characterized. Genetic studies of the plasma lipidome can help to better understand postprandial metabolism by isolating lipid molecular species which are more closely related to the genome. We measured the plasma lipidome at fasting and 6 h after a standardized high-fat meal in 668 participants from the Genetics of Lipid-Lowering Drugs and Diet Network study (GOLDN) using ultra-performance liquid chromatography coupled …


Mixture Model Approaches To Integrative Analysis Of Multi-Omics Data And Spatially Correlated Genomic Data, Ziqiao Wang May 2021

Dissertations & Theses (Open Access)

Integrative genomic data analysis is a powerful tool to study the complex biological processes behind a disease. Statistical methods can model the interrelationships of the involved gene activities through jointly analyzing multiple types of genomic data from different platforms (vertical integration), or improve the power of a study through aggregating the same type of genomic data across studies (horizontal integration). In this dissertation, we propose statistical methods and strategies for integrative multi-omics data in association analysis of disease phenotypes, with an emphasis on cancer applications.

We develop a new strategy based on horizontal integration by leveraging publicly available datasets into …


An Ensemble Of The iCluster Method To Analyze Longitudinal lncRNA Expression Data For Psoriasis Patients, Suyan Tian, Chi Wang Apr 2021

Internal Medicine Faculty Publications

BACKGROUND: Psoriasis is an immune-mediated, inflammatory disorder of the skin with chronic inflammation and hyper-proliferation of the epidermis. Since psoriasis has genetic components and the diseased tissue of psoriasis is very easily accessible, it is natural to use high-throughput technologies to characterize psoriasis and thus seek targeted therapies. Transcriptional profiles change correspondingly after an intervention. Unlike cross-sectional gene expression data, longitudinal gene expression data can capture the dynamic changes and thus facilitate causal inference.

METHODS: Using the iCluster method as a building block, an ensemble method was proposed and applied to a longitudinal gene expression dataset for psoriasis, with the …


Nanopore Guided Regional Assembly, Eleni Adam, Desh Ranjan, Harold Riethman Apr 2021

College of Sciences Posters

The telomeres are the “caps” of the chromosomes, and their vital role is to protect them. Possible telomere dysfunction caused by telomere rearrangements can be fatal for the cell and result in age-related diseases, including cancer. The telomeres and subtelomeres are regions that are hard to investigate. The current technology cannot provide their complete sequence; instead, the DNA is given in multiple pieces. Current methods of assembling the pieces of these regions are not accurate enough due to the regions’ high variability and complex repeated patterns. We propose a hybrid assembly method, NPGREAT, which utilizes two of the latest …


Machine Learning Approaches For The Prediction Of Bone Mineral Density By Using Genomic And Phenotypic Data Of 5130 Older Men, Qing Wu, Fatma Nasoz, Jongyun Jung, Bibek Bhattarai, Mira V. Han, Robert A. Greenes, Kenneth G. Saag Feb 2021

School of Medicine Faculty Publications

The study aimed to utilize machine learning (ML) approaches and genomic data to develop a prediction model for bone mineral density (BMD) and identify the best modeling approach for BMD prediction. The genomic and phenotypic data from the Osteoporotic Fractures in Men Study (n = 5130) were analyzed. A genetic risk score (GRS) was calculated from 1103 associated SNPs for each participant after a comprehensive genotype imputation. Data were normalized and divided into a training set (80%) and a validation set (20%) for analysis. Random forest, gradient boosting, neural network, and linear regression were used to develop BMD prediction models separately. Ten-fold …


Deep Learning For Multi-Tissue Cancer Classification Of Gene Expressions, Tarek Khorshed Jan 2021

Theses and Dissertations

We contribute to saving the lives of cancer patients through early detection and diagnosis, since one of the major challenges in cancer treatment is that patients are diagnosed at very late stages, when appropriate medical interventions become less effective and full curative treatment is no longer achievable. Cancer classification using gene expressions is extremely challenging given the complexity and high dimensionality of the data. Current classification methods typically rely on samples collected from a single tissue type and perform gene feature selection as a prerequisite step to avoid processing the full set of genes. These methods fall short in taking advantage …


Statistical Methods In Genetic Studies, Cheng Gao Jan 2021

Dissertations, Master's Theses and Master's Reports

This dissertation includes three chapters. Each chapter is briefly described below.

In Chapter 1, we proposed a new method, called MF-TOWmuT, for genome-wide association studies with multiple genetic variants and multiple phenotypes using family samples. MF-TOWmuT uses a kinship matrix to account for sample relatedness. It is worth mentioning that in simulations, we considered hidden polygenic effects and varied the proportion of variance contributed by them to generate phenotypes. Simulation studies show that MF-TOWmuT can preserve the type I error rates and is more powerful than several existing methods in different simulation scenarios. MF-TOWmuT is also quite …


Analysis Of Subtelomeric REXTAL Assemblies Using QUAST, Tunazzina Islam, Desh Ranjan, Mohammad Zubair, Eleanor Young, Ming Xiao, Harold Riethman Jan 2021

Computer Science Faculty Publications

Genomic regions of high segmental duplication content and/or structural variation have led to gaps and misassemblies in the human reference sequence, and are refractory to assembly from whole-genome short-read datasets. Human subtelomere regions are highly enriched in both segmental duplication content and structural variations, and as a consequence are both impossible to assemble accurately and highly variable from individual to individual. Recently, we developed a pipeline for improved region-specific assembly called Regional Extension of Assemblies Using Linked-Reads (REXTAL). In this study, we evaluate REXTAL and genome-wide assembly (Supernova) approaches on 10X Genomics linked-reads data sets partitioned and barcoded using the …


Statistical Approaches Of Gene Set Analysis With Quantitative Trait Loci For High-Throughput Genomic Studies, Samarendra Das Dec 2020

Electronic Theses and Dissertations

Recently, gene set analysis has become the first choice for gaining insights into the underlying complex biology of diseases through high-throughput genomic studies, such as microarrays, bulk RNA sequencing, and single-cell RNA sequencing. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. However, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. Hence, a comprehensive overview of the available gene set analysis approaches used for different high-throughput genomic studies is provided. The analysis of gene sets is usually carried out based on …


Carbon Metabolism In Cave Subaerial Biofilms, Victoria E. Frazier Dec 2020

Masters Theses

Subaerial biofilms (SABs) grow at the interface between the atmosphere and rock surfaces in terrestrial and subterranean environments around the world. Multi-colored SABs colonizing relatively dry and nutrient-limited cave surfaces are known to contain microbes putatively involved in chemolithoautotrophic processes using inorganic carbon like carbon dioxide (CO2) or methane (CH4). However, the importance of CO2 and CH4 to SAB biomass production has not been quantified, the environmental conditions influencing biomass production and diversity have not been thoroughly evaluated, and stable carbon and nitrogen isotope compositions have yet to be determined from epigenic cave SABs. …


Development Of A DNA Methylation Multiplex Assay For Body Fluid Identification And Age Determination, Quentin Gauthier Nov 2020

FIU Electronic Theses and Dissertations

For forensic laboratories, determinations of the body fluid origin of samples collected at a crime scene are typically presumptive and often destructive. However, given that in certain cases the presence of DNA is not in dispute and the primary concern is rather where the DNA came from, new methodologies are needed. Epigenetic modifications, such as DNA methylation, affect gene expression in every cell of every mammal. These DNA methylation patterns are typically observed as the addition of a methyl group to the 5-carbon of a cytosine followed by a guanine (CpG). Methylation patterns have been observed to change in response …


Statistical Methods For Resolving Intratumor Heterogeneity With Single-Cell DNA Sequencing, Alexander Davis Aug 2020

Dissertations & Theses (Open Access)

Tumor cells have heterogeneous genotypes, which drives progression and treatment resistance. Such genetic intratumor heterogeneity plays a role in the process of clonal evolution that underlies tumor progression and treatment resistance. Single-cell DNA sequencing is a promising experimental method for studying intratumor heterogeneity, but brings unique statistical challenges in interpreting the resulting data. Researchers lack methods to determine whether sufficiently many cells have been sampled from a tumor. In addition, there are no proven computational methods for determining the ploidy of a cell, a necessary step in the determination of copy number. In this work, software for calculating probabilities from …


Machine Learning With Digital Signal Processing For Rapid And Accurate Alignment-Free Genome Analysis: From Methodological Design To A COVID-19 Case Study, Gurjit Singh Randhawa Jun 2020

Electronic Thesis and Dissertation Repository

In the field of bioinformatics, taxonomic classification is the scientific practice of identifying, naming, and grouping of organisms based on their similarities and differences. The problem of taxonomic classification is of immense importance considering that nearly 86% of existing species on Earth and 91% of marine species remain unclassified. Due to the magnitude of the datasets, the need exists for an approach and software tool that is scalable enough to handle large datasets and can be used for rapid sequence comparison and analysis. We propose ML-DSP, a stand-alone alignment-free software tool that uses Machine Learning and Digital Signal Processing to …


Statistical Inference Of Adaptation At Multiple Genomic Scales Using Supervised Classification And A Hidden Markov Model, Lauren A. Sugden May 2020

Biology and Medicine Through Mathematics Conference

No abstract provided.


Genetic Studies Of Wildlife, Brittaney L. Buchanan Apr 2020

School of Natural Resources: Dissertations, Theses, and Student Research

Genetic techniques are being more frequently used to understand the biology and management of wildlife species. The wild turkey is one species of genetic interest because the correct identification of individuals to the subspecies level is difficult using traditional methods. Currently phenotypic differences in plumage, especially the upper tail coverts, are used to assign individuals to subspecies. To hunters wanting to complete a “grand slam,” identification of birds’ subspecies is important. This study focuses on the five extant subspecies: Eastern (M. g. silvestris), Osceola (M. g. osceola), Rio Grande (M. g. intermedia), Merriam’s ( …


Machine Learning Prediction Of Glioblastoma Patient One-Year Survival, Andrew Du '20, Warren McGee, Jane Y. Wu Jan 2020

Student Publications & Research

Glioblastoma (GBM) is a grade IV astrocytoma formed primarily from cancerous astrocytes and sustained by intense angiogenesis. GBM often causes non-specific symptoms, creating difficulty for diagnosis. This study aimed to utilize machine learning techniques to provide an accurate one-year survival prognosis for GBM patients using clinical and genomic data from the Chinese Glioma Genome Atlas. Logistic regression (LR), support vector machines (SVM), random forest (RF), and ensemble models were used to identify and select predictors for GBM survival and to classify patients into those with an overall survival (OS) of less than one year and one year or greater. With …


Modeling Gene Expression With Differential Equations, Madison Kuduk Jan 2020

Capstone Showcase

Gene expression is the process by which the information stored in DNA is converted into a functional gene product, such as protein. The two main functions that make up the process of gene expression are transcription and translation. Transcription and translation are controlled by the number of mRNA and protein in the cell. Gene expression can be represented as a system of first order differential equations for the rate of change of mRNA and proteins. These equations involve transcription, translation, degradation and feedback loops. In this paper, I investigate a system of first order differential equations to model gene expression proposed by Hunt, Laplace, Miller and Pham in …
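
As a rough illustration of the kind of system this abstract describes, the sketch below integrates a generic two-variable model: mRNA is produced by transcription and degraded, and protein is produced by translation from mRNA and degraded, with an optional negative feedback term in which protein represses transcription. This is a common textbook-style form, not the specific model investigated in the paper; the function name `simulate`, the parameter names, and all parameter values are assumptions chosen only for illustration.

```python
def simulate(k_tx=2.0, k_tl=5.0, d_m=0.2, d_p=0.05,
             K=None, n=2, t_end=400.0, dt=0.01):
    """Integrate a minimal gene-expression model with forward Euler.

    dm/dt = k_tx * r(p) - d_m * m     (transcription, mRNA degradation)
    dp/dt = k_tl * m    - d_p * p     (translation, protein degradation)

    r(p) = 1 when K is None (no feedback); otherwise a Hill-type
    repression term 1 / (1 + (p / K)**n). All names and values here are
    illustrative assumptions, not taken from the paper.
    """
    m, p = 0.0, 0.0  # start with no mRNA and no protein
    steps = int(round(t_end / dt))
    for _ in range(steps):
        repression = 1.0 if K is None else 1.0 / (1.0 + (p / K) ** n)
        dm = k_tx * repression - d_m * m
        dp = k_tl * m - d_p * p
        m += dt * dm
        p += dt * dp
    return m, p


if __name__ == "__main__":
    # Without feedback the steady state is m* = k_tx / d_m and
    # p* = k_tl * m* / d_p, which the simulation should approach.
    print("no feedback:  ", simulate())
    print("with feedback:", simulate(K=100.0))
```

Forward Euler is used only to keep the sketch dependency-free; a stiff or more detailed model would normally call for an adaptive ODE solver.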


Characterization Of Bacterial Communities In Biscayne Bay Through Genomic Analysis, Eric Fortman Dec 2019

HCNSO Student Theses and Dissertations

Biscayne Bay is a shallow oligotrophic estuary in Southeast Florida. Channelization of rivers and dredging of canals have greatly altered the historical flow of fresh water into the bay. This, coupled with the rise of sprawling urban and suburban development, has greatly increased the nutrient load in the bay. This study examined the bacterial community at 14 stations throughout Biscayne Bay: 6 stations located at the mouths of canals; 1 upstream-canal station; 6 stations in the center of the bay; and 1 ocean-influenced station located near the entrance to the bay. One-liter surface water samples were …