Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Open Access. Powered by Scholars. Published by Universities.®
- Publication Type
Articles 1 - 6 of 6
Full-Text Articles in Physical Sciences and Mathematics
Spaceflight And The Differential Gene Expression Of Human Stem Cell-Derived Cardiomyocytes, Eugenie Zhu
Spaceflight And The Differential Gene Expression Of Human Stem Cell-Derived Cardiomyocytes, Eugenie Zhu
Master's Projects
The National Aeronautics and Space Administration (NASA) has performed many experiments on the International Space Station (ISS) to further understand how conditions in space can affect life on Earth. This project analyzed GLDS-258, a gene set from NASA’s GeneLab repository which examines the impact of microgravity on human induced pluripotent stem-cell-derived cardiomyocytes (hiPSC-CMs). While many datasets have been run through NASA’s RNA-Seq Consensus Pipeline (RCP) to study differential gene expression in space, a Homo sapiens dataset has yet to be analyzed using the RCP. The aim of this project was to run the first Homo sapiens dataset, GLDS-258, through the …
Poriferal Vision, Saketh Saxena
Poriferal Vision, Saketh Saxena
Master's Projects
Sponges provide nourishment as well as a habitat for various aquatic organisms. Anatomically, sponges are made up of soft tissue with a silica based exoskeleton which serves both as support and protection for the underlying tissue. The exoskeleton persists after the tissue decomposes, and microscopic parts of the exoskeleton break away to form spicules. Oceanographic studies have shown that the density of the sponge spicules is a good indicator of the sponge population in an area. This measure can be used to study sponge population dynamics over time. The spicule density is measured by imaging spicules from samples of water …
Predicting Pancreatic Cancer Using Support Vector Machine, Akshay Bodkhe
Predicting Pancreatic Cancer Using Support Vector Machine, Akshay Bodkhe
Master's Projects
This report presents an approach to predict pancreatic cancer using Support Vector Machine Classification algorithm. The research objective of this project it to predict pancreatic cancer on just genomic, just clinical and combination of genomic and clinical data. We have used real genomic data having 22,763 samples and 154 features per sample. We have also created Synthetic Clinical data having 400 samples and 7 features per sample in order to predict accuracy of just clinical data. To validate the hypothesis, we have combined synthetic clinical data with subset of features from real genomic data. In our results, we observed that …
Word Sense Disambiguation In Biomedical Ontologies With Term Co-Occurrence Analysis And Document Clustering, Bill Andreopoulos, Dimitra Alexopoulou, Michael Schroeder
Word Sense Disambiguation In Biomedical Ontologies With Term Co-Occurrence Analysis And Document Clustering, Bill Andreopoulos, Dimitra Alexopoulou, Michael Schroeder
Faculty Publications, Computer Science
With more and more genomes being sequenced, a lot of effort is devoted to their annotation with terms from controlled vocabularies such as the GeneOntology. Manual annotation based on relevant literature is tedious, but automation of this process is difficult. One particularly challenging problem is word sense disambiguation. Terms such as |development| can refer to developmental biology or to the more general sense. Here, we present two approaches to address this problem by using term co-occurrences and document clustering. To evaluate our method we defined a corpus of 331 documents on development and developmental biology. Term co-occurrence analysis achieves an …
Finding Molecular Complexes Through Multiple Layer Clustering Of Protein Interaction Networks, Bill Andreopoulos, Aijun An, Xiangji Huang, Xiaogang Wang
Finding Molecular Complexes Through Multiple Layer Clustering Of Protein Interaction Networks, Bill Andreopoulos, Aijun An, Xiangji Huang, Xiaogang Wang
Faculty Publications, Computer Science
Clustering protein-protein interaction networks (PINs) helps to identify complexes that guide the cell machinery. Clustering algorithms often create a flat clustering, without considering the layered structure of PINs. We propose the MULIC clustering algorithm that produces layered clusters. We applied MULIC to five PINs. Clusters correlate with known MIPS protein complexes. For example, a cluster of 79 proteins overlaps with a known complex of 88 proteins. Proteins in top cluster layers tend to be more representative of complexes than proteins in bottom layers. Lab work on finding unknown complexes or determining drug effects can be guided by top layer proteins.
Bi-Level Clustering Of Mixed Categorical And Numerical Biomedical Data, Bill Andreopoulos, Aijun An, Xiaogang Wang
Bi-Level Clustering Of Mixed Categorical And Numerical Biomedical Data, Bill Andreopoulos, Aijun An, Xiaogang Wang
Faculty Publications, Computer Science
Biomedical data sets often have mixed categorical and numerical types, where the former represent semantic information on the objects and the latter represent experimental results. We present the BILCOM algorithm for |Bi-Level Clustering of Mixed categorical and numerical data types|. BILCOM performs a pseudo-Bayesian process, where the prior is categorical clustering. BILCOM partitions biomedical data sets of mixed types, such as hepatitis, thyroid disease and yeast gene expression data with Gene Ontology annotations, more accurately than if using one type alone.