Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Articles 1 - 2 of 2

Full-Text Articles in Physical Sciences and Mathematics

Semiparametric And Nonparametric Methods For Evaluating Risk Prediction Markers In Case-Control Studies, Ying Huang, Margaret Pepe Jul 2008


UW Biostatistics Working Paper Series

The performance of a well-calibrated risk model, Risk(Y)=P(D=1|Y), can be characterized by the population distribution of Risk(Y) and displayed with the predictiveness curve. Better performance is characterized by a wider distribution of Risk(Y), since this corresponds to better risk stratification in the sense that more subjects are identified as being at low or high risk for the outcome D=1. Although methods have been developed to estimate predictiveness curves from cohort studies, most studies that evaluate novel risk prediction markers employ case-control designs. Here we develop semiparametric and nonparametric methods that accommodate case-control data and assume a priori knowledge of P(D=1). Large and …
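
To make the idea of a predictiveness curve under case-control sampling concrete, the following is a minimal sketch and not the authors' estimator. It assumes the calibrated risks Risk(Y) are already available for each sampled subject, and it recovers the population distribution of Risk(Y) by weighting the empirical case and control distributions with the known prevalence P(D=1). The function names `population_risk_cdf` and `predictiveness_curve` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def population_risk_cdf(case_risks, control_risks, prevalence):
    """Return F(r) = P(Risk(Y) <= r) in the population.

    Assumption: the population distribution is the prevalence-weighted
    mixture of the empirical distributions among cases and controls.
    """
    cases = sorted(case_risks)
    controls = sorted(control_risks)

    def cdf(r):
        f_case = bisect_right(cases, r) / len(cases)
        f_control = bisect_right(controls, r) / len(controls)
        return prevalence * f_case + (1 - prevalence) * f_control

    return cdf


def predictiveness_curve(case_risks, control_risks, prevalence, percentiles):
    """Return (v, R(v)) pairs, where R(v) is the smallest observed risk r
    whose population CDF value F(r) is at least v."""
    cdf = population_risk_cdf(case_risks, control_risks, prevalence)
    grid = sorted(set(case_risks) | set(control_risks))
    curve = []
    for v in percentiles:
        # Small tolerance guards against floating-point rounding in F(r).
        risk_at_v = next((r for r in grid if cdf(r) >= v - 1e-12), grid[-1])
        curve.append((v, risk_at_v))
    return curve


# Illustrative, made-up risks for two cases and four controls.
example_cases = [0.6, 0.8]
example_controls = [0.1, 0.2, 0.3, 0.4]
example_curve = predictiveness_curve(
    example_cases, example_controls, prevalence=0.2, percentiles=[0.5, 0.9]
)
```

A curve that stays near 0 for most percentiles and rises steeply near the top reflects the wider distribution of Risk(Y) that the abstract associates with better risk stratification.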


Data Mining Methods For Malware Detection, Muazzam Siddiqui Jan 2008


Electronic Theses and Dissertations

This research investigates the use of data mining methods for detecting malware (malicious programs) and proposes a framework as an alternative to traditional signature-based detection. Traditional signature-based approaches fail for new and unknown malware, for which signatures are not yet available. We present a data mining framework to detect malicious programs. We collected, analyzed and processed several thousand malicious and clean programs to identify the best features and to build models that can classify a given program as either malware or clean. Our research is closely related to information retrieval …