Full-Text Articles in Medicine and Health Sciences
Subject Level Clustering Using A Negative Binomial Model For Small Transcriptomic Studies., Qian Li, Janelle R. Noel-Macdonnell, Devin C. Koestler, Ellen L. Goode, Brooke L. Fridley
Manuscripts, Articles, Book Chapters and Other Papers
BACKGROUND: Unsupervised clustering is one of the most widely applied methods in the analysis of high-throughput 'omics data. A variety of unsupervised model-based (parametric) and non-parametric clustering methods have been proposed for RNA-seq count data, most of which perform well for large samples, e.g. N ≥ 500. A common issue when analyzing small samples of RNA-seq count data is that the data follow an over-dispersed distribution, so a Negative Binomial likelihood model is often used. We have therefore developed a Negative Binomial model-based (NBMB) clustering approach for application to RNA-seq studies.
RESULTS: We have developed a Negative …
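As an illustration of the over-dispersion point in the abstract above, the following minimal sketch compares Poisson and Negative Binomial log-likelihoods on a small set of made-up counts. It is not the NBMB method from the paper: the counts, the function names, and the method-of-moments dispersion estimate are all assumptions chosen for illustration.

```python
import math


def poisson_logpmf(k, mu):
    """Log-probability of count k under a Poisson with mean mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def nb_logpmf(k, mu, size):
    """Log-probability of count k under a Negative Binomial with mean mu
    and dispersion parameter size, so that variance = mu + mu**2 / size."""
    return (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )


def compare_fits(counts):
    """Return sample mean, sample variance, and both log-likelihoods.

    The NB dispersion is estimated by the method of moments (an
    illustrative shortcut, not a maximum-likelihood fit).
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    if var <= mean:
        raise ValueError("counts are not over-dispersed; NB moment fit undefined")
    size = mean ** 2 / (var - mean)
    ll_pois = sum(poisson_logpmf(c, mean) for c in counts)
    ll_nb = sum(nb_logpmf(c, mean, size) for c in counts)
    return mean, var, ll_pois, ll_nb


# Hypothetical read counts for one gene across ten samples (made-up data).
example_counts = [0, 3, 12, 1, 45, 7, 0, 88, 5, 19]
mean, var, ll_pois, ll_nb = compare_fits(example_counts)
print(f"mean={mean:.1f} variance={var:.1f}")
print(f"Poisson log-lik={ll_pois:.1f}  NB log-lik={ll_nb:.1f}")
```

When the sample variance greatly exceeds the mean, as in these made-up counts, the Negative Binomial log-likelihood is much higher than the Poisson one, which is the usual motivation for NB-based models of RNA-seq counts.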
A Comparison Of A Multistate Inpatient EHR Database To The HCUP Nationwide Inpatient Sample., Jonathan P Deshazo, Mark A Hoffman
Manuscripts, Articles, Book Chapters and Other Papers
BACKGROUND: The growing availability of electronic health records (EHRs) in the US could provide researchers with a more detailed and clinically relevant alternative to using claims-based data.
METHODS: In this study we compared a very large EHR database (Health Facts©) to a well-established population estimate (Nationwide Inpatient Sample). Weighted comparisons were made using t-values and relative differences across diagnoses and procedures for the year 2010.
RESULTS: The two databases have a similar distribution pattern across all data elements, with 24 of 50 data elements being statistically similar between the two data sources. In general, differences that were found are consistent …
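The abstract does not give the formulas behind its "t-value and relative difference" comparisons. The sketch below shows one common way such quantities are computed for two independent weighted estimates. The formulas, function names, and numbers are assumptions for illustration, not the authors' procedure.

```python
import math


def relative_difference(estimate, reference):
    """Difference between an estimate and a reference value,
    expressed as a fraction of the reference."""
    return (estimate - reference) / reference


def t_value(est_a, se_a, est_b, se_b):
    """t-statistic for the difference of two independent estimates,
    given each estimate's standard error."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Hypothetical percentages of discharges with some diagnosis (made-up numbers).
ehr_pct, ehr_se = 5.2, 0.3   # assumed EHR database estimate and standard error
nis_pct, nis_se = 4.8, 0.2   # assumed reference sample estimate and standard error

print(f"relative difference: {relative_difference(ehr_pct, nis_pct):.3f}")
print(f"t-value: {t_value(ehr_pct, ehr_se, nis_pct, nis_se):.3f}")
```

Under this convention, a data element would be called statistically similar when the absolute t-value falls below a chosen critical value. The threshold used in the study is not stated in the truncated abstract.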