Full-Text Articles in Computational Biology
Do Metabolic Networks Follow A Power Law? A PSAMM Analysis, Ryan Geib, Lubos Thoma, Ying Zhang
Senior Honors Projects
Inspired by the landmark paper “Emergence of Scaling in Random Networks” by Barabási and Albert, the field of network science has focused heavily on the power law distribution in recent years. This distribution has been used to model everything from the popularity of sites on the World Wide Web to the number of citations received by a scientific paper. Its defining feature is that many nodes (websites or papers) have few connections (internet links or citations), while a few “hubs” are connected to many nodes. These properties lead to two very important observed effects: the …
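The "many nodes with few connections, few hubs" pattern described in this abstract can be illustrated with a toy simulation. The sketch below is not the authors' PSAMM analysis. It is a minimal preferential-attachment model in the spirit of Barabási and Albert, and the function names, parameters, and the exponent estimator (a commonly used approximate maximum-likelihood formula for discrete power laws) are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def preferential_attachment_degrees(n_nodes, links_per_node, seed=0):
    """Grow a network in which each new node links to existing nodes
    with probability proportional to their current degree.
    Returns the list of node degrees. (Hypothetical helper.)"""
    rng = random.Random(seed)
    # Start from a small fully connected core of (links_per_node + 1) nodes.
    core = links_per_node + 1
    degree = [links_per_node] * core
    # Each node appears in `endpoints` once per link it has, so a uniform
    # draw from this list is a degree-proportional draw.
    endpoints = [node for node in range(core) for _ in range(links_per_node)]
    for new_node in range(core, n_nodes):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        degree.append(links_per_node)
        for target in targets:
            degree[target] += 1
            endpoints.extend((target, new_node))
    return degree


def estimate_exponent(degrees, k_min):
    """Approximate maximum-likelihood exponent for the tail k >= k_min,
    using the continuous approximation with a 0.5 shift on k_min.
    This is an assumed estimator, not one taken from the abstract."""
    tail = [k for k in degrees if k >= k_min]
    return 1.0 + len(tail) / sum(math.log(k / (k_min - 0.5)) for k in tail)


if __name__ == "__main__":
    degrees = preferential_attachment_degrees(2000, 2, seed=1)
    histogram = Counter(degrees)
    # Low degrees dominate the histogram, while a handful of hubs sit far out.
    for k in sorted(histogram)[:5]:
        print(f"degree {k}: {histogram[k]} nodes")
    print("largest hub degree:", max(degrees))
    print("estimated exponent:", round(estimate_exponent(degrees, k_min=5), 2))
```

A real metabolic network would replace the simulated degree list with degrees computed from the reconstructed model; whether a power law actually fits those degrees is the question the project itself addresses.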
Incorporating Pathway Information Into Feature Selection Towards Better Performed Gene Signatures, Suyan Tian, Chi Wang, Bing Wang
Biostatistics Faculty Publications
To analyze gene expression data with sophisticated grouping structures and to extract hidden patterns from such data, feature selection is of critical importance. It is well known that genes do not function in isolation but rather work together within various metabolic, regulatory, and signaling pathways. If the biological knowledge contained within these pathways is taken into account, the resulting method is a pathway-based algorithm. Studies have demonstrated that a pathway-based method usually outperforms its gene-based counterpart, in which no biological knowledge is considered. In this article, pathway-based feature selection is first divided into three major categories, namely, pathway-level selection, …
Computational Analysis Of Large-Scale Trends And Dynamics In Eukaryotic Protein Family Evolution, Joseph Boehm Ahrens
FIU Electronic Theses and Dissertations
The myriad protein-coding genes found in present-day eukaryotes arose from a combination of speciation and gene duplication events, spanning more than one billion years of evolution. Notably, as these proteins evolved, the individual residues at each site in their amino acid sequences were replaced at markedly different rates. The relationship between protein structure, protein function, and site-specific rates of amino acid replacement is a topic of ongoing research. Additionally, there is much interest in the different evolutionary constraints imposed on sequences related by speciation (orthologs) versus sequences related by gene duplication (paralogs). A principal aim of this dissertation is to …
Feature Selection For Longitudinal Data By Using Sign Averages To Summarize Gene Expression Values Over Time, Suyan Tian, Chi Wang
Biostatistics Faculty Publications
With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods dealing with gene expression profiles across time points has not kept up with the explosion of such data. The feature selection process is of critical importance for longitudinal microarray data. In this study, we proposed aggregating a gene’s expression values across time into a single value using the sign average method, thereby reducing a longitudinal feature selection process to a classic one. Regularized logistic regression models with pseudogenes (i.e., the sign average of genes across time as predictors) …
Unified Methods For Feature Selection In Large-Scale Genomic Studies With Censored Survival Outcomes, Lauren Spirko-Burns, Karthik Devarajan
COBRA Preprint Series
One of the major goals in large-scale genomic studies is to identify genes with a prognostic impact on time-to-event outcomes, which provide insight into the disease process. With rapid developments in high-throughput genomic technologies in the past two decades, the scientific community is able to monitor the expression levels of tens of thousands of genes and proteins, resulting in enormous data sets where the number of genomic features is far greater than the number of subjects. Methods based on univariate Cox regression are often used to select genomic features related to survival outcome; however, the Cox model assumes proportional hazards …
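For reference, the proportional hazards assumption named in this abstract can be written out explicitly. The notation below (baseline hazard $h_0$, coefficient vector $\beta$, covariate vectors $x$, $x_1$, $x_2$) is standard textbook notation assumed here, not notation taken from the preprint.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\frac{h(t \mid x_1)}{h(t \mid x_2)} = \exp\!\left(\beta^{\top}(x_1 - x_2)\right)
```

The hazard ratio on the right does not depend on $t$, so under this model the hazard curves of two subjects stay proportional over time and cannot cross.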