Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 2 of 2
Full-Text Articles in Statistical Models
A Xgboost Risk Model Via Feature Selection And Bayesian Hyper-Parameter Optimization, Yan Wang, Sherry Ni
Published and Grey Literature from PhD Candidates
This paper explores models based on the extreme gradient boosting (XGBoost) approach for business risk classification. Feature selection (FS) algorithms and hyper-parameter optimization are considered simultaneously during model training. The five most commonly used FS methods, including weight by Gini, weight by Chi-square, hierarchical variable clustering, weight by correlation, and weight by information, are applied to alleviate the effect of redundant features. Two hyper-parameter optimization approaches, random search (RS) and the Bayesian tree-structured Parzen Estimator (TPE), are applied in XGBoost. The effects of different FS and hyper-parameter optimization methods on model performance are investigated by the Wilcoxon Signed Rank …
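As a rough illustration of the random search (RS) idea named in the abstract above, the sketch below samples hyper-parameter settings at random and keeps the best-scoring one. It is not the authors' implementation: the search ranges are assumptions, `toy_score` is a hypothetical stand-in for a cross-validated model score, and no XGBoost model is actually trained.

```python
import random

# Hypothetical search space. The names mirror common XGBoost settings,
# but the ranges are illustrative assumptions, not taken from the paper.
SEARCH_SPACE = {
    "max_depth": lambda rng: rng.randint(2, 10),
    "learning_rate": lambda rng: 10 ** rng.uniform(-3, 0),  # log-uniform
    "subsample": lambda rng: rng.uniform(0.5, 1.0),
}


def toy_score(params):
    """Stand-in for a cross-validated score (higher is better).

    A real study would train and evaluate a model here instead.
    """
    return (
        -abs(params["max_depth"] - 6)
        - abs(params["learning_rate"] - 0.1)
        - abs(params["subsample"] - 0.8)
    )


def random_search(score_fn, space, n_trials, seed=0):
    """Draw n_trials random settings and return the best one with its score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: draw(rng) for name, draw in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Calling `random_search(toy_score, SEARCH_SPACE, n_trials=50)` returns the best sampled setting and its score. TPE differs in that it uses the scores of earlier trials to decide where to sample next, rather than drawing every trial independently.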
Unified Methods For Feature Selection In Large-Scale Genomic Studies With Censored Survival Outcomes, Lauren Spirko-Burns, Karthik Devarajan
COBRA Preprint Series
One of the major goals in large-scale genomic studies is to identify genes with a prognostic impact on time-to-event outcomes, which provide insight into the disease process. With rapid developments in high-throughput genomic technologies in the past two decades, the scientific community is able to monitor the expression levels of tens of thousands of genes and proteins, resulting in enormous data sets where the number of genomic features is far greater than the number of subjects. Methods based on univariate Cox regression are often used to select genomic features related to survival outcome; however, the Cox model assumes proportional hazards …
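To make the univariate Cox screening mentioned above concrete, here is a minimal sketch that ranks features by the score test for a zero coefficient in a one-covariate Cox model. The abstract does not say which test statistic is used, so the score test is an assumption chosen for simplicity. The function names are hypothetical, and the code assumes no tied event times.

```python
def cox_score_statistic(times, events, x):
    """Score test statistic for beta = 0 in a one-covariate Cox model.

    times:  follow-up times (assumed to have no ties)
    events: 1 if the event was observed, 0 if censored
    x:      covariate values, e.g. one gene's expression levels
    Under the null hypothesis the statistic is approximately chi-square
    with 1 degree of freedom.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    score, info = 0.0, 0.0
    for pos, i in enumerate(order):
        if not events[i]:
            continue  # censored subjects contribute only through risk sets
        risk = [x[j] for j in order[pos:]]  # subjects still at risk
        mean = sum(risk) / len(risk)
        var = sum((v - mean) ** 2 for v in risk) / len(risk)
        score += x[i] - mean  # observed minus expected covariate value
        info += var           # information contributed by this event
    return score ** 2 / info if info > 0 else 0.0


def rank_features(times, events, features):
    """Return feature names sorted by decreasing score statistic."""
    stats = {
        name: cox_score_statistic(times, events, values)
        for name, values in features.items()
    }
    return sorted(stats, key=stats.get, reverse=True)
```

Because each feature is tested on its own, this kind of screening scales to tens of thousands of features, but it inherits the proportional hazards assumption that the abstract flags as a limitation.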