Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

San Jose State University

Bioinformatics

Articles 1 - 6 of 6

Full-Text Articles in Physical Sciences and Mathematics

Spaceflight And The Differential Gene Expression Of Human Stem Cell-Derived Cardiomyocytes, Eugenie Zhu May 2021

Master's Projects

The National Aeronautics and Space Administration (NASA) has performed many experiments on the International Space Station (ISS) to further understand how conditions in space can affect life on Earth. This project analyzed GLDS-258, a gene set from NASA’s GeneLab repository which examines the impact of microgravity on human induced pluripotent stem-cell-derived cardiomyocytes (hiPSC-CMs). While many datasets have been run through NASA’s RNA-Seq Consensus Pipeline (RCP) to study differential gene expression in space, a Homo sapiens dataset has yet to be analyzed using the RCP. The aim of this project was to run the first Homo sapiens dataset, GLDS-258, through the …


Poriferal Vision, Saketh Saxena May 2019

Master's Projects

Sponges provide nourishment as well as a habitat for various aquatic organisms. Anatomically, sponges are made up of soft tissue with a silica-based exoskeleton, which serves as both support and protection for the underlying tissue. The exoskeleton persists after the tissue decomposes, and microscopic parts of the exoskeleton break away to form spicules. Oceanographic studies have shown that the density of sponge spicules is a good indicator of the sponge population in an area. This measure can be used to study sponge population dynamics over time. The spicule density is measured by imaging spicules from samples of water …


Predicting Pancreatic Cancer Using Support Vector Machine, Akshay Bodkhe May 2017

Master's Projects

This report presents an approach to predicting pancreatic cancer using the Support Vector Machine classification algorithm. The research objective of this project is to predict pancreatic cancer from genomic data alone, from clinical data alone, and from a combination of genomic and clinical data. We used real genomic data with 22,763 samples and 154 features per sample. We also created synthetic clinical data with 400 samples and 7 features per sample in order to assess the prediction accuracy of clinical data alone. To validate the hypothesis, we combined the synthetic clinical data with a subset of features from the real genomic data. In our results, we observed that …


Word Sense Disambiguation In Biomedical Ontologies With Term Co-Occurrence Analysis And Document Clustering, Bill Andreopoulos, Dimitra Alexopoulou, Michael Schroeder Sep 2008

Faculty Publications, Computer Science

With more and more genomes being sequenced, a lot of effort is devoted to their annotation with terms from controlled vocabularies such as the GeneOntology. Manual annotation based on relevant literature is tedious, but automation of this process is difficult. One particularly challenging problem is word sense disambiguation. Terms such as "development" can refer to developmental biology or to the more general sense. Here, we present two approaches to address this problem by using term co-occurrences and document clustering. To evaluate our method we defined a corpus of 331 documents on development and developmental biology. Term co-occurrence analysis achieves an …
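
As a rough illustration of the term co-occurrence idea described in this abstract, the following Python sketch picks the sense whose associated terms appear most often in a piece of text. It is not the authors' method: the sense labels, the term lists in `SENSE_TERMS`, and the function names are all hypothetical, and a real system would derive co-occurring terms from an annotated corpus rather than hard-code them.

```python
import re

# Hypothetical co-occurrence term lists for two senses of "development".
# These lists are illustrative assumptions, not taken from the paper.
SENSE_TERMS = {
    "developmental_biology": {"embryo", "morphogenesis", "differentiation", "larva"},
    "general": {"software", "method", "tool", "drug"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def disambiguate(text, sense_terms=SENSE_TERMS):
    """Return (best_sense, scores), scoring each sense by how many tokens
    in the text belong to that sense's co-occurrence term list."""
    tokens = tokenize(text)
    scores = {
        sense: sum(1 for token in tokens if token in terms)
        for sense, terms in sense_terms.items()
    }
    # On a tie, max() returns the first sense in dictionary order.
    best_sense = max(scores, key=scores.get)
    return best_sense, scores


if __name__ == "__main__":
    sentence = "Development of the embryo requires cell differentiation."
    print(disambiguate(sentence))
```

Counting raw matches is the simplest possible scoring rule; weighting terms by how strongly they co-occur with each sense would be a natural refinement.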


Finding Molecular Complexes Through Multiple Layer Clustering Of Protein Interaction Networks, Bill Andreopoulos, Aijun An, Xiangji Huang, Xiaogang Wang Jan 2007

Faculty Publications, Computer Science

Clustering protein-protein interaction networks (PINs) helps to identify complexes that guide the cell machinery. Clustering algorithms often create a flat clustering, without considering the layered structure of PINs. We propose the MULIC clustering algorithm that produces layered clusters. We applied MULIC to five PINs. Clusters correlate with known MIPS protein complexes. For example, a cluster of 79 proteins overlaps with a known complex of 88 proteins. Proteins in top cluster layers tend to be more representative of complexes than proteins in bottom layers. Lab work on finding unknown complexes or determining drug effects can be guided by top layer proteins.
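
The abstract reports that a cluster of 79 proteins overlaps with a known complex of 88 proteins, but it does not say how the overlap was scored. One common way to quantify agreement between a predicted cluster and a reference complex is with set-based measures; the sketch below is an assumption for illustration only, and the function name `overlap_stats` and the protein identifiers are hypothetical.

```python
def overlap_stats(cluster, reference_complex):
    """Compare a predicted cluster with a reference complex using set overlap.

    Returns the number of shared proteins, the Jaccard index, and the
    fraction of each set that is covered by the shared proteins.
    """
    cluster = set(cluster)
    reference_complex = set(reference_complex)
    shared = cluster & reference_complex
    union = cluster | reference_complex
    return {
        "shared": len(shared),
        "jaccard": len(shared) / len(union) if union else 0.0,
        "cluster_coverage": len(shared) / len(cluster) if cluster else 0.0,
        "complex_coverage": (
            len(shared) / len(reference_complex) if reference_complex else 0.0
        ),
    }


if __name__ == "__main__":
    # Toy identifiers, not real data from the paper.
    predicted = {"P1", "P2", "P3"}
    known = {"P2", "P3", "P4", "P5"}
    print(overlap_stats(predicted, known))
```

Reporting coverage in both directions matters because a small cluster can sit entirely inside a large complex while still recovering only a fraction of it.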


Bi-Level Clustering Of Mixed Categorical And Numerical Biomedical Data, Bill Andreopoulos, Aijun An, Xiaogang Wang Jun 2006

Faculty Publications, Computer Science

Biomedical data sets often have mixed categorical and numerical types, where the former represent semantic information on the objects and the latter represent experimental results. We present the BILCOM algorithm for "Bi-Level Clustering of Mixed categorical and numerical data types". BILCOM performs a pseudo-Bayesian process, where the prior is categorical clustering. BILCOM partitions biomedical data sets of mixed types, such as hepatitis, thyroid disease and yeast gene expression data with Gene Ontology annotations, more accurately than if using one type alone.