Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Articles 1 - 16 of 16

Full-Text Articles in Physical Sciences and Mathematics

Growth Of Purple Sulfur Bacteria Allochromatium Vinosum On Solid Phase Metal Sulfides As Sulfur And Electron Sources, Hugo Alarcon Aug 2023

Open Access Theses & Dissertations

Purple sulfur bacteria (PSB) are photosynthetic microorganisms known for their vital roles in geochemical cycles, especially the sulfur cycle, within anoxic photic environments. PSB are also key contributors to the nitrogen, carbon, and oxygen cycles. This study focuses on the autotrophic growth of Allochromatium vinosum, a model strain of PSB that utilizes solid-phase metal sulfides (MS) as both sulfur and electron donors. Through characterizing the growth profiles of A. vinosum on pyrite (FeS2), nickel sulfide (NiS), and iron monosulfide (FeS) nanoparticles, respectively, and investigating the bacteria-MS interaction mechanisms, this work expands our current knowledge of the metabolic capabilities and flexibility …


Statistical Analysis Of Genetic Sequence Variants In Whole Exome Sequencing Data From Patients With Prostate Cancer, Kelvin Ofori-Minta Aug 2021

Open Access Theses & Dissertations

A single variation in the genetic sequence within the DNA of an organism can have beneficial, detrimental, or neutral effects. More often than not, these effects are detrimental rather than beneficial. Many biomedical and bioinformatics studies have been conducted to determine the genetic cause of prostate cancer (PrCa), which is still the second leading cause of cancer-related death among men in the United States, and an appreciable effort in statistical bioinformatics research has been directed towards this aim. Through statistical analyses of a set of whole exome sequencing data from patients with PrCa obtained via The Cancer Genome …


Gene Selection And Classification In High-Throughput Biological Data With Integrated Machine Learning Algorithms And Bioinformatics Approaches, Abhijeet R Patil May 2021

Open Access Theses & Dissertations

With the rise of high-throughput technologies in biomedical research, large volumes of expression profiling, methylation profiling, and RNA-sequencing data are being generated. These high-dimensional data have a large number of features with a small number of samples, a characteristic called the "curse of dimensionality." The selection of optimal features, which largely affects the performance of classification algorithms in machine learning models, has led to challenging problems in bioinformatics analyses of such high-dimensional datasets. In this work, I focus on the design of two-stage frameworks of feature selection and classification and their applications in multiple sets of colorectal cancer data. The first …


Toward Automated Region Detection & Parcellation Of Rat Brain Tissue Images, Alexandro Arnal Jan 2020

Open Access Theses & Dissertations

People who analyze images of biological tissue often rely on segmentation of structures as a preliminary step. In particular, laboratories studying the rat brain manually delineate brain regions to position scientific findings on a brain atlas to propose hypotheses about the rat brain, and ultimately, the human brain. Our work intersects with the preliminary step of delineating regions in images of brain tissue via computational methods.

We investigate pixel-wise classification or segmentation of brain regions using ten histological images of brain tissue sections stained for Nissl substance, and two deep learning models: U-Net and Tile2Vec. Our goal is to assess …


Combination Of Resampling Based Lasso Feature Selection And Ensembles Of Regularized Regression Models, Abhijeet R. Patil Jan 2019

Open Access Theses & Dissertations

In high-dimensional data, the performance of various classifiers is largely dependent on the selection of important features. Most of the individual classifiers using existing feature selection (FS) methods do not perform well for highly correlated data. Obtaining important features using the FS method and selecting the best performing classifier is a challenging task in high-throughput data. In this research, we propose a combination of resampling-based least absolute shrinkage and selection operator (LASSO) feature selection (RLFS) and ensembles of regularized regression models (ERRM) capable of handling data with high correlation structures. The ERRM boosts the prediction accuracy with …


Integrated Statistical And Machine Learning Algorithms For Predicting And Classifying G Protein-Coupled Receptors, Fredrick Ayivor Jan 2018

Open Access Theses & Dissertations

G protein-coupled receptors (GPCRs) are transmembrane proteins with important functions in signal transduction and often serve as drug targets. With increasing availability of protein sequence information, there is much interest in computationally predicting GPCRs and classifying them according to their biological roles. Such predictions are cost-efficient and can be valuable guides for designing wet lab experiments to help elucidate signaling pathways and expedite drug discovery. There are existing computational tools for GPCR prediction that involve principal component analysis (PCA), intimate sorting (IS), support vector machine, and random forest (RF) techniques using various sequence-derived features. While accuracies of over 90% …


Label-Free Raman Imaging To Monitor Breast Tumor Signatures, John Ciubuc Jan 2017

Open Access Theses & Dissertations

Methods built on Raman spectroscopy have shown major potential in describing and discriminating between malignant and benign specimens. Accurate, real-time medical diagnosis stands to improve substantially through this vibrational optical method. Not only is acquisition of data possible in milliseconds and analysis in minutes, but Raman also allows concurrent detection and monitoring of all biological components. Besides validating a significant Raman signature distinction between non-tumorigenic (MCF-10A) and tumorigenic (MCF-7) breast epithelial cells, this study reveals a label-free method to assess overexpression of epidermal growth factor receptors (EGFR) in tumor cells. EGFR overexpression gives rise to Raman features associated with phosphorylated threonine and serine, and …


Computational Methods For Prediction And Classification Of G Protein-Coupled Receptors, Khodeza Begum Jan 2017

Open Access Theses & Dissertations

G protein-coupled receptors (GPCRs) constitute the largest group of membrane receptor proteins in eukaryotes. Due to their significant roles in many physiological processes such as vision, smell, and inflammation, GPCRs are the targets of many prescribed drugs. However, the functional and structural diversity of GPCRs has kept their prediction and classification based on amino acid sequence data a challenging bioinformatics problem. As existing computational methods to predict and classify GPCRs are focused on mammalian (mostly human) data, the ultimate goal of our project is to establish an ensemble approach and implement web-based software that can be used reliably …


Assessing Accuracies And Improving Efficiency For Segmentation-Based RNA Secondary Structure Prediction Methods, Gerardo A. Cardenas Jan 2016

Open Access Theses & Dissertations

RNA secondary structure prediction has become an important area of interest in biology and medicine because it helps in understanding the mechanisms of many biological processes such as gene regulation and viral replication, and in designing RNA-based therapies to treat various diseases such as cancers and AIDS. Different thermodynamics-based computational algorithms for RNA structure prediction exist, and have been used to help understand the disease mechanisms and design treatments. However, most of these computational tools that can predict complex pseudoknot structures have a sequence length limitation of a few hundred nucleotide bases due to their high demands on computer resources. Yet, …


Identifying Non-Classical Active Sites As A Tool For Enzyme Inhibition, Marisol Serrano Jan 2016

Open Access Theses & Dissertations

Chagas disease, caused by the parasite Trypanosoma cruzi, is an endemic life-threatening disease that affects mainly the heart. It remains the leading cause of heart failure in Latin American countries. Since current treatments against this parasite are highly toxic and somewhat ineffective, novel and more efficacious types of interventions are desired. Cruzain, identified as the major cathepsin for T. cruzi, plays a major role in the parasite's life cycle, making this enzyme a very attractive target for the discovery of potential trypanocidal drugs. The recombinant cruzain is synthesized as a zymogenic pro-protein (PCZN) which possesses a pro domain and a catalytic domain. In this …


Improved Efficiency Of RNA Secondary Structure Prediction Using Distributed Computing, Gerardo A. Cardenas Jan 2013

Open Access Theses & Dissertations

The rapidly growing amounts of available biomolecular sequence data, which may represent information from small gene fragments to large complete genomes, have led to a great need for powerful computational resources for data analysis and storage. With the decoding of the human and other genomes, RNA secondary structure prediction has become an important area of interest in biology and medicine because it helps in understanding the mechanisms of many biological processes such as gene regulation and viral replication, and in designing RNA-based therapies to treat various diseases. Due to the complexity of their algorithms, many existing and upcoming computational …


Automatic Elucidation Of GPI Molecular Structures With Grid Computing Technology, Juan Clemente Aguilar Bonavides Jan 2013

Open Access Theses & Dissertations

Glycosylphosphatidylinositol (GPI)-anchored proteins are involved in many biological processes and are of medical importance. The identification and analysis of the entire collection of free and protein-linked GPIs within an organism (i.e., GPIomics) requires highly sensitive instruments. At present, liquid chromatography-tandem mass spectrometry (LC-MS/MS or -MSn) is the most efficient laboratory technique for these tasks. As a typical MSn experiment produces hundreds of thousands of spectra, the data analysis creates a major bottleneck in high-throughput GPIomic projects. Yet, no computational tool for characterizing the chemical structures of GPI is available to date. We propose a library-search algorithm to …


Secondary Structure Prediction Of Long RNA Sequences Based On Inversion Excursions And A Modularized MapReduce Framework, Daniel Tesfai Yehdego Jan 2012

Open Access Theses & Dissertations

Ribonucleic acid (RNA) molecules and their secondary structures play important roles in many biological processes including gene expression and regulation. The genomes of many viruses are also RNA molecules. Since secondary structures are crucial for RNA functionality, computational predictions of RNA secondary structures have been widely studied. However, the tremendous demands on computer memory and computing time for complex secondary structures limit existing thermodynamically based structure prediction algorithms to handling only short RNA sequences of a few hundred bases. One approach to overcoming this limitation is to first cut long RNA sequences into shorter, non-overlapping …


Prediction Of Ribonucleic Acid Secondary Structures Using A Heuristic Backtracking Search, Christopher Roman Cuellar Jan 2011

Open Access Theses & Dissertations

Ribonucleic acid (RNA) is essential for all forms of life. RNA is made up of a long chain of nucleotide bases: Guanine (G), Uracil (U), Cytosine (C), and Adenine (A). An RNA strand can fold on itself to allow G-C, A-U, and G-U bases to form hydrogen bonds; the result is known as a secondary structure. Knowing the secondary structure of an RNA chain is very important because it allows researchers to better understand its specific functions. RNA forms secondary structures that tend to minimize its free energy. RNA secondary structure prediction is the attempt to predict physical folding …


Distributional Properties Of Inversions And Segmentation Algorithms For RNA Sequences, Sameera Dhananjaya Viswakula Jan 2011

Open Access Theses & Dissertations

Ribonucleic acid (RNA) is a long single-stranded molecule made up of four types of nucleotide bases: Adenine (A), Cytosine (C), Guanine (G), and Uracil (U). It folds back on itself and forms C-G and A-U complementary base pairs. The set of such hydrogen-bonded pairs in an RNA molecule is called its secondary structure. Knowing the secondary structure of RNA is useful for understanding its biological function. Prediction of RNA secondary structure from the nucleotide sequence has been an important bioinformatics problem for over two decades.
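
As an illustration only (not taken from the thesis), the definition of a secondary structure as a set of hydrogen-bonded base pairs can be sketched in Python. The function names, the example sequence, and the pair set are hypothetical. The dot-bracket rendering assumes nested pairs with no pseudoknots.

```python
# Complementary pairs named in the abstract: C-G and A-U (in either orientation).
COMPLEMENTARY = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def is_valid_structure(sequence, pairs):
    """Check that every (i, j) pair is in range, complementary, and that
    no base takes part in more than one pair."""
    used = set()
    for i, j in pairs:
        if not (0 <= i < j < len(sequence)):
            return False
        if (sequence[i], sequence[j]) not in COMPLEMENTARY:
            return False
        if i in used or j in used:
            return False
        used.update((i, j))
    return True


def to_dot_bracket(sequence, pairs):
    """Render a pair set as dot-bracket notation (assumes nested pairs)."""
    symbols = ["."] * len(sequence)
    for i, j in pairs:
        symbols[i], symbols[j] = "(", ")"
    return "".join(symbols)


# Hypothetical hairpin: three G-C pairs closing a four-base loop.
sequence = "GGGAAAUCCC"
pairs = {(0, 9), (1, 8), (2, 7)}
structure = to_dot_bracket(sequence, pairs)
```

Here `structure` marks paired bases with parentheses and unpaired bases with dots, a common plain-text way to write a secondary structure.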

The work in this thesis is motivated by the need to improve the secondary structure …


Computational Methods Of Hidden Markov Models With Respect To CpG Island Prediction In DNA Sequences, Roberto Angel Ortega Jan 2011

Open Access Theses & Dissertations

Hidden Markov models (HMMs) are a specific case of Markov models where, contrary to Markov chains, the observer is unaware of what state the model is in when a symbol is observed. Like Markov chains, HMMs assume that the future state of a sequence is dependent only on the current state of the sequence. The parameters associated with HMMs are transition and emission probabilities, where transition probabilities are the probabilities of transitioning from one state to another, and emission probabilities are the probabilities of observing a symbol given that it came from a specific state.
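
To make the roles of the transition and emission probabilities concrete, here is a minimal Python sketch of the standard forward algorithm, which sums over all hidden state paths to obtain the probability of an observed sequence. The two-state model and every probability value below are assumptions chosen for illustration. They are not parameters from the thesis.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    """Return P(observations) under the HMM (observations must be non-empty)."""
    # alpha[s]: probability of the symbols seen so far, ending in state s.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical two-state model: inside or outside a CpG island.
states = ("island", "background")
start_p = {"island": 0.5, "background": 0.5}
trans_p = {
    "island": {"island": 0.9, "background": 0.1},
    "background": {"island": 0.1, "background": 0.9},
}
emit_p = {
    "island": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "background": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}

likelihood = forward("CGCG", states, start_p, trans_p, emit_p)
```

The observer sees only the symbols, so the hidden state enters the calculation solely through the sums over `trans_p` and `emit_p`.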

The structure of …