Open Access. Powered by Scholars. Published by Universities.®

Bioinformatics Commons

Theses/Dissertations

2011

Articles 1 - 24 of 24

Full-Text Articles in Bioinformatics

Multivariate Models And Algorithms For Systems Biology, Lipi Rani Acharya Dec 2011

University of New Orleans Theses and Dissertations

Rapid advances in high-throughput data acquisition technologies, such as microarrays and next-generation sequencing, have enabled scientists to interrogate the expression levels of tens of thousands of genes simultaneously. However, challenges remain in developing effective computational methods for analyzing data generated from such platforms. In this dissertation, we address some of these challenges. We divide our work into two parts. In the first part, we present a suite of multivariate approaches for the reliable discovery of gene clusters, often interpreted as pathway components, from molecular profiling data with replicated measurements. We translate our goal into learning an optimal correlation structure from replicated complete and incomplete …


A Method For Representing Contextualized Information (MERCI) To Improve Situational Awareness Among Electronic Message Brokering System Dashboard Users, Arunkumar Srinivasan Dec 2011

Dissertations & Theses (Open Access)

Electronic health information brokering systems are of interest to public health informatics because they emphasize how data can be effectively shared and utilized across healthcare institutions and among providers so as to improve the quality of care, increase efficiency, and reduce costs (Lumpkin, 2002). In the domain of public health (PH) specifically, where complete and timely reporting of data is critical for all epidemiological and disease surveillance activities (Langmuir, 1976), it is imperative to ensure proper functioning of the electronic information exchange infrastructure. Receiving multiple types of data, in various formats from numerous sources, and triaging them to the appropriate …


Database Methods For Copy Number Variant Analysis Of One Hundred Disease Associated Genes In Human Congenital Heart Disease, Maureen E. Tuffnell Oct 2011

Master's Theses (2009 -)

Human genetic variation occurs more commonly than was recognized after the completion of the Human Genome Sequencing Project in 2003. Submicroscopic human DNA analysis has revealed copy number variation (CNV) as the deletion or duplication of a genomic region potentially affecting gene dosage. Advanced genetic research now includes the study of CNVs in diseased subject groups compared to in-house controls or published online datasets of control CNV data. Research labs choose from different bioinformatic algorithms to make the copy number calls. Solutions for further processing the copy number data into quantifiable form require collaboration with data analysts and include …


Enumerating Alternate Optimal Flux Distributions For Metabolic Reconstructions, Umaporn Siangphoe Aug 2011

Theses and Dissertations

Metabolites consumed and produced by microorganisms for mass and energy conservation may cause changes in a microorganism’s environment. Microorganisms cannot tolerate a particular environment for long periods, and they may leave it to find a new environment that sustains life. Essentially, organisms need to maintain their metabolic processes to survive in the new environment. Experimental studies are limited in how closely they can explore cell function and regulation, so they provide insufficient information to explain how metabolic processes are expressed in an organism's environment. Consequently, mathematical modeling and computer simulations have been conducted to combine all possible cellular metabolic …


Computational Pipeline For Human Transcriptome Quantification Using RNA-Seq Data, Guorong Xu Aug 2011

University of New Orleans Theses and Dissertations

The main theme of this thesis research is the development of a computational pipeline for processing next-generation RNA sequencing (RNA-seq) data. RNA-seq experiments generate tens of millions of short reads for each DNA/RNA sample. The alignment of a large volume of short reads to a reference genome is a key step in NGS data analysis. Although storing alignment information in the Sequence Alignment/Map (SAM) or Binary SAM (BAM) format is now standard, biomedical researchers still have difficulty accessing useful information. To help biomedical researchers conveniently access essential information from NGS data files in SAM/BAM format, we have …
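As a rough illustration of what accessing information in a SAM file involves, the sketch below parses the 11 mandatory tab-separated fields defined by the SAM specification and counts mapped reads per reference sequence. It is not the pipeline described in the thesis; the names `SamRecord`, `parse_sam_line`, and `count_mapped_reads`, as well as the sample reads, are hypothetical and exist only for this example.

```python
from typing import Dict, Iterable, NamedTuple


class SamRecord(NamedTuple):
    """The 11 mandatory fields of a SAM alignment line."""
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str


def parse_sam_line(line: str) -> SamRecord:
    """Parse one alignment line; optional tags after field 11 are ignored."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    return SamRecord(f[0], int(f[1]), f[2], int(f[3]), int(f[4]),
                     f[5], f[6], int(f[7]), int(f[8]), f[9], f[10])


def count_mapped_reads(lines: Iterable[str]) -> Dict[str, int]:
    """Count mapped reads per reference, skipping '@' header lines."""
    counts: Dict[str, int] = {}
    for line in lines:
        if line.startswith("@"):
            continue
        rec = parse_sam_line(line)
        if rec.flag & 0x4:  # FLAG bit 0x4 marks an unmapped read
            continue
        counts[rec.rname] = counts.get(rec.rname, 0) + 1
    return counts


# Made-up example input: one header line and three reads (one unmapped).
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t16\tchr1\t200\t60\t4M\t*\t0\t0\tTTGA\tIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",
]
mapped_counts = count_mapped_reads(example_sam)
```

BAM files are the compressed binary form of the same records, so in practice they would be read with a dedicated library rather than by splitting text lines.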


The Evolution And Mechanics Of Translational Control In Plants, Justin N. Vaughn Aug 2011

Doctoral Dissertations

The expression of numerous plant mRNAs is attenuated by RNA sequence elements located in the 5' and 3' untranslated regions (UTRs). For example, in plants and many higher eukaryotes, roughly 35% of genes encode mRNAs that contain one or more upstream open reading frames (uORFs) in the 5' UTR. For this dissertation I have analyzed the pattern of conservation of such mRNA sequence elements. In the first set of studies, I have taken a comparative transcriptomics approach to address which RNA sequence elements are conserved between various families of angiosperm plants. Such conservation indicates an element's fundamental importance to plant …


Predicting Yeast Synthetic Lethal Genetic Interactions Using Protein Domains, Bo Li Aug 2011

All Dissertations

Synthetic lethal genetic interaction (SLGI) is an important biological phenomenon. Such interactions are of interest because they can be used to predict the function of unknown proteins and to find drug targets or drug combinations. High-throughput biological experiments enhance the ability to identify genetic interactions, but the large number of protein pairs still makes genome-wide identification of genetic interactions an overwhelming task. Computational prediction of SLGIs is promising but still hampered by the unclear molecular mechanism of SLGIs.

Protein domains with conserved functions serve as the building blocks of proteins. The genetic interaction that occurs between a pair of …


Heuristics For Scaling Up Distributed Protein Docking, Prachi Pradeep Jul 2011

Master's Theses (2009 -)

Docking is a computational technique that predicts the interaction between a protein and a potential drug compound. Virtual screening is a tool that employs docking to investigate huge libraries of compounds and predict potential drug molecules that bind favorably to the protein of interest. One such commercially available library contains about 13 million compounds. It would take approximately 400 years of CPU time to examine this library! As an alternative, a high-performance computing application with a distributed docking strategy is needed, one that can efficiently predict the favorable compounds and can eventually be scaled to huge libraries. …
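The figures quoted in the abstract (about 13 million compounds, about 400 CPU-years) can be turned into a per-compound cost and an idealized wall-clock estimate. The sketch below does only that arithmetic; the function names and the 10,000-core example are assumptions for illustration, and perfect linear scaling is assumed, which real distributed docking runs will not achieve.

```python
DAYS_PER_YEAR = 365.25
MINUTES_PER_DAY = 24 * 60


def minutes_per_compound(total_cpu_years: float, n_compounds: int) -> float:
    """Average CPU minutes spent docking a single compound."""
    return total_cpu_years * DAYS_PER_YEAR * MINUTES_PER_DAY / n_compounds


def ideal_wall_clock_days(total_cpu_years: float, n_cores: int) -> float:
    """Wall-clock days if the work split perfectly across n_cores (an idealization)."""
    return total_cpu_years * DAYS_PER_YEAR / n_cores


# Figures from the abstract: ~400 CPU-years for ~13 million compounds.
per_compound = minutes_per_compound(400, 13_000_000)
# Hypothetical cluster size, chosen only for illustration.
days_on_10k_cores = ideal_wall_clock_days(400, 10_000)

print(f"{per_compound:.1f} CPU minutes per compound")
print(f"{days_on_10k_cores:.2f} days on 10,000 cores (ideal scaling)")
```

Under these assumptions each compound costs roughly a quarter of an hour of CPU time, which is why spreading the library over many cores is the natural way to make the screen tractable.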


Fast Program For Sequence Alignment Using Partition Function Posterior Probabilities, Meera Prasad May 2011

Theses

The key requirements of a good sequence alignment tool are high accuracy and fast execution. The existing Probalign program is a highly accurate tool for sequence alignment of both proteins and nucleotides. However, its execution time is fairly high. The focus is therefore to reduce the running time of the existing version of Probalign while maintaining its current accuracy level.

The thesis conducts a detailed analysis of the performance of Probalign to bring down the running time of the existing code. A modified version of Probalign, Version 1.4, is released. A new program for sequence alignment with faster computation is …


Genomic And Molecular Analysis Of The Exopolysaccharide Production In The Bacterium Thauera Aminoaromatica MZ1T, Ke Jiang May 2011

Doctoral Dissertations

Thauera aminoaromatica MZ1T is an exopolysaccharide (EPS)-producing Gram-negative bacterium isolated from the wastewater treatment plant of a major industrial chemical manufacturer as the causal agent for poor sludge dewatering. It shares common features with other known Thauera spp. (e.g., Thauera aromatica and Thauera selenatis), being capable of degrading aromatic compounds anaerobically and using acetate and succinate as carbon sources. It is unique among the Thauera spp. in its production of abundant EPS, which results in viscous bulking and poor sludge dewaterability. In this respect, it is similar to Azoarcus sp. EbN1 and BH72. Thaueran is the proposed …


Combining Bioinformatics And Chemical Biological Approaches To The Study Of Signaling Pathways In Parasitic Nematodes, William R. De Martini May 2011

Theses, Dissertations and Culminating Projects

Lymphatic filariasis (elephantiasis) is a disfiguring disease caused by thread-like nematodes. This disease affects the lives of over 120 million people, and over one billion people are at risk for infection in endemic regions. Drugs used to treat this disease suffer from toxicity and emerging resistance, and new therapies need to be identified. Our laboratory has been studying the filarial parasite Brugia malayi (B. malayi), one of the causative agents of this disease. The laboratory is focused on the study of critical protein kinase signaling pathways, necessary for parasite protective anti-oxidative responses, as potential therapeutic targets. We have previously determined, …


Functional Cloning And Characterization Of Antibiotic Resistance Genes From The Chicken Gut Microflora, Wei Zhou May 2011

Masters Theses

A recent study using human fecal samples in conjunction with a culture-independent approach revealed immense diversity of antibiotic resistance (AR) genes in the human gut microflora. We hypothesize that food animal gut microflora also contain diverse and novel AR genes which could contribute to the emergence and transmission of AR in pathogens important in animal and human health. To test this, we examined the AR reservoir in chicken gut microflora using a metagenomic, functional cloning method. Total genomic DNA was extracted from individual cecal contents of two free-range chickens and two conventionally raised chickens. The DNAs were physically sheared into …


Predictive Bioinformatic Methods For Analyzing Genes And Proteins, Shaolei Teng May 2011

All Dissertations

Since large amounts of biological data are generated using various high-throughput technologies, efficient computational methods are important for understanding the biological meanings behind the complex data. Machine learning is particularly appealing for biological knowledge discovery. Tissue-specific gene expression and protein sumoylation play essential roles in the cell and are implicated in many human diseases. Protein destabilization is a common mechanism by which mutations cause human diseases. In this study, machine learning approaches were developed for predicting human tissue-specific genes, protein sumoylation sites and protein stability changes upon single amino acid substitutions. Relevant biological features were selected for input vector encoding, …


Ab Initio Protein Structure Prediction Algorithms, Maciej Kicinski Apr 2011

Master's Projects

Genes that encode novel proteins are constantly being discovered and added to databases, but the speed with which their structures are being determined is not keeping up with this rate of discovery. Currently, homology and threading methods perform the best for protein structure prediction, but they are not appropriate to use for all proteins. Still, the best way to determine a protein's structure is through biological experimentation. This research looks into possible methods and relations that pertain to ab initio protein structure prediction. The study includes the use of positional and transitional probabilities of amino acids obtained from a non-redundant …


Bioinformatics, Thermodynamics And Kinetics Analysis Of An All Alpha Helical Protein With A Greek-Key Topology, Hai Li Apr 2011

Chemistry & Biochemistry Theses & Dissertations

Computational and experimental studies focusing on the role of conserved residues for folding and stability are an active and promising area of research. To further expand our understanding, we present the results of a bioinformatics analysis of the death domain superfamily. The death domain superfamily fold consists of six α-helices arranged in a Greek-key topology, which is shared by the all β-sheet immunoglobulin and mixed α/β-plait superfamilies. Our sequence and structural studies have identified a group of conserved hydrophobic residues and corresponding long-range interactions, which we propose are important in the formation and stabilization of the hydrophobic core and native …


RNA Secondary Structure Prediction Tool, Meenakshee Mali Apr 2011

Master's Projects

Ribonucleic acid (RNA) is one of the major macromolecules essential to all forms of life. Apart from its important role in protein synthesis, it performs several other important functions such as gene regulation, catalysis of biochemical reactions, and modification of other RNAs. In some viruses, RNA instead of DNA serves as the carrier of genetic information. RNA is an interesting subject of research in the scientific community. It has led to important biological discoveries. One of the major problems researchers are trying to solve is the RNA structure prediction problem. It has been found that the structure of RNA is …


Aminormotiffinder - A Graph Grammar Based Tool To Effectively Search A Minor Motifs In 3D RNA Molecules, Ankur Malhotra Jan 2011

Theses

RNA motifs are three-dimensional folds that play an important role in RNA folding and its interaction with other molecules. They have a modular structure and are composed of conserved building blocks that depend on the sequence. Their automated in silico identification remains a challenging task. Existing motif identification tools do not correctly identify motifs with large structural variations. Here a “graph rewriting” based method is proposed to identify motifs in real three-dimensional structures. The unique encoding of A Minor Searcher takes into consideration the non-canonical base pairs and also multipairing of RNA structural motifs. The accuracy is demonstrated by …


Prediction Of Ribonucleic Acid Secondary Structures Using A Heuristic Backtracking Search, Christopher Roman Cuellar Jan 2011

Open Access Theses & Dissertations

Ribonucleic acid (RNA) is essential for all forms of life. RNA is made up of a long chain of nucleotide bases: Guanine (G), Uracil (U), Cytosine (C), and Adenine (A). An RNA strand can fold on itself to allow G-C, A-U, and G-U bases to form hydrogen bonds; the resulting set of pairs is known as a secondary structure. Knowing the secondary structure of an RNA chain is very important because it will allow researchers to better understand its specific functions. RNA will create secondary structures that tend to minimize their free energy. RNA secondary structure prediction is the attempt to predict physical folding …
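To make the pairing rules concrete, the sketch below counts the maximum number of non-crossing G-C, A-U, and G-U pairs a sequence can form, using Nussinov-style dynamic programming. This is a classic simplification swapped in for illustration: it is neither the heuristic backtracking search of the thesis nor a free-energy model. The function name `max_base_pairs` and the minimum hairpin loop length of 3 are assumptions of this example.

```python
# Pairs named in the abstract: G-C, A-U, and the wobble pair G-U.
ALLOWED_PAIRS = {("G", "C"), ("C", "G"), ("A", "U"),
                 ("U", "A"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs (Nussinov-style recurrence).

    min_loop is the smallest number of unpaired bases allowed between
    two paired positions (an assumed hairpin constraint).
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: position j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in ALLOWED_PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


hairpin_pairs = max_base_pairs("GGGAAAUCC")
```

Maximizing pair count is only a crude proxy for minimizing free energy, since it ignores stacking and loop penalties, but it shows how the folding problem decomposes into smaller subsequences.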


Distributional Properties Of Inversions And Segmentation Algorithms For RNA Sequences, Sameera Dhananjaya Viswakula Jan 2011

Open Access Theses & Dissertations

Ribonucleic acid (RNA) is a long single-stranded molecule made up of four types of nucleotide bases: Adenine (A), Cytosine (C), Guanine (G) and Uracil (U). It folds back on itself and forms C-G and A-U complementary base pairs. The set of such hydrogen-bonded pairs in an RNA molecule is called its secondary structure. Knowing the secondary structure of RNA is useful for understanding its biological function. Prediction of RNA secondary structure from the nucleotide sequence has been an important bioinformatics problem for over two decades.

The work in this thesis is motivated by the need to improve the secondary structure …


Computational Methods Of Hidden Markov Models With Respect To CpG Island Prediction In DNA Sequences, Roberto Angel Ortega Jan 2011

Open Access Theses & Dissertations

Hidden Markov models (HMMs) are a specific case of Markov models where, contrary to Markov chains, the observer is unaware of what state the model was in when a symbol is observed. Like Markov chains, HMMs assume that the future state of a sequence depends only on the current state of the sequence. The parameters associated with HMMs are transition and emission probabilities: transition probabilities give the probability of moving from one state to another, and emission probabilities give the probability of observing a symbol given that it came from a specific state.

The structure of …
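The roles of transition and emission probabilities can be seen in a small two-state model. The sketch below runs Viterbi decoding, a standard HMM technique for recovering the most likely hidden state path; the abstract does not say which algorithms the thesis uses. The state names, every probability value, and the function name `viterbi` are made up for illustration and are not taken from the thesis.

```python
import math
from typing import List

STATES = ("island", "background")

# Illustrative parameters only; real models estimate these from data.
START = {"island": 0.5, "background": 0.5}
TRANS = {
    "island": {"island": 0.9, "background": 0.1},
    "background": {"island": 0.1, "background": 0.9},
}
EMIT = {
    "island": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "background": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def viterbi(seq: str) -> List[str]:
    """Most likely hidden state path for seq, computed in log space."""
    if not seq:
        return []
    # Log-probability of the best path ending in each state after the first symbol.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for symbol in seq[1:]:
        new_score = {}
        pointers = {}
        for s in STATES:
            # Best predecessor state for arriving in state s.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][symbol]))
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


cg_rich_path = viterbi("CGCGCGCG")
at_rich_path = viterbi("ATATATAT")
```

Working in log space avoids numerical underflow on long DNA sequences, where products of many small probabilities would otherwise round to zero.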


Modeling And Quantitative Analysis Of White Matter Fiber Tracts In Diffusion Tensor Imaging, Xuwei Liang Jan 2011

University of Kentucky Doctoral Dissertations

Diffusion tensor imaging (DTI) is a structural magnetic resonance imaging (MRI) technique that records the incoherent motion of water molecules and has been used to detect microstructural white matter alterations in clinical studies exploring certain brain disorders. A variety of DTI-based techniques for detecting brain disorders and facilitating clinical group analysis have been developed in the past few years. However, two crucial issues have a great impact on the performance of those algorithms. One is that brain neural pathways form complicated 3D structures that cannot be approximated appropriately or accurately by simple 2D structures, …


Analysis Of Differential Gene Expression And Alternative Splicing In The Liver And Gastrointestinal Tract In The Lactating Rat, Antony Thomas Athippozhy Jan 2011

University of Kentucky Doctoral Dissertations

Rat exon microarrays were utilized to detect changes in mRNA expression and alternative splicing in the liver, duodenum, jejunum, and ileum of the lactating rat when compared to age-matched virgin controls. Analysis of data at the level of gene expression revealed differential expression of genes involved in cholesterol biosynthesis in each tissue examined, suggesting increased Sterol Regulatory Element Binding Protein activity. We also detected decreased mRNA from components of the T-cell signaling pathway in the jejunum and ileum. We characterized expression of solute carrier and adenosine triphosphate binding cassette proteins. In addition to characterizing genes by pathway, we have also …


Automated Classification Of The Narrative Of Medical Reports Using Natural Language Processing, Ira J. Goldstein Jan 2011

Legacy Theses & Dissertations (2009 - 2024)

In this dissertation we present three topics critical to the document level classification of the narrative in medical reports: the use of preferred terminology in light of the presence of synonymous terms, the less than optimal performance of classification systems when presented with a non-uniform distribution of classes, and the problems associated with scarcity of labeled data when presented with an imbalance of classes in the data sets.


Computational Tool For Automated Large-Scale GPIomic Analysis, Juan Clemente Aguilar Jan 2011

Open Access Theses & Dissertations

Liquid chromatography-tandem mass spectrometry (LC-MS/MS or MS/MS) is the most efficient tool today for the identification of glycosylphosphatidylinositol (GPI) molecules. The amount of data produced in each MS/MS experiment is a major bottleneck in high-throughput GPIomic (the entire collection of free and protein-linked GPIs) projects. Efficient computational tools can significantly reduce the time spent analyzing MS/MS data; however, at present no tool automatically interprets these data to annotate GPI structures. We propose a library-based tool to identify GPI structures by matching fragment peaks in the spectra with data derived from a theoretical database of GPI structures that …