Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Articles 1 - 7 of 7

Full-Text Articles in Computer Engineering

A Knowledge-Based Clinical Toxicology Consultant For Diagnosing Multiple Exposures, Joel D. Schipper, Douglas D. Dankel II, A. Antonio Arroyo, Jay L. Schauben May 2013

Publications

Objective: This paper presents continued research toward the development of a knowledge-based system for the diagnosis of human toxic exposures. In particular, this research focuses on the challenging task of diagnosing exposures to multiple toxins. Although only 10% of toxic exposures in the United States involve multiple toxins, multiple exposures account for more than half of all toxin-related fatalities. Using simple medical mathematics, we seek to produce a practical decision support system capable of supplying useful information to aid in the diagnosis of complex cases involving multiple unknown substances.

Methods: The system is automatically trained using data mining …


Rank Based Anomaly Detection Algorithms, Huaming Huang May 2013

Electrical Engineering and Computer Science - Dissertations

Anomaly or outlier detection problems are of considerable importance, arising frequently in diverse real-world applications such as finance and cyber-security. Several algorithms have been formulated for such problems, usually based on formulating a problem-dependent heuristic or distance metric. This dissertation proposes anomaly detection algorithms that exploit the notion of "rank," expressing relative outlierness of different points in the relevant space, and exploiting asymmetry in nearest neighbor relations between points: a data point is "more anomalous" if it is not the nearest neighbor of its nearest neighbors. Although rank is computed using distance, it is a more robust and higher level …


Data Mining The Harness Track And Predicting Outcomes, Robert P. Schumaker Apr 2013

Journal of International Technology and Information Management

This paper presents the S&C Racing system, which uses Support Vector Regression (SVR) to predict harness race finishes, and analyzes it on fifteen months of data from Northfield Park. We find that our system outperforms the most common betting strategies, wagering on the favorites and the mathematical arbitrage Dr. Z system, in five of the seven wager types tested. This work suggests that an informational inequality exists within the harness racing market that is not apparent to domain experts.


Predicting SQL Injection And Cross Site Scripting Vulnerabilities Through Mining Input Sanitization Patterns, Lwin Khin Shar, Hee Beng Kuan Tan Apr 2013

Research Collection School Of Computing and Information Systems

Context: SQL injection (SQLI) and cross site scripting (XSS) have been the two most common and serious web application vulnerabilities for the past decade. To mitigate these two security threats, many vulnerability detection approaches based on static and dynamic taint analysis techniques have been proposed. Alternatively, there are also vulnerability prediction approaches based on machine learning techniques, which showed that static code attributes such as code complexity measures are cheap and useful predictors. However, current prediction approaches target general vulnerabilities, and most of them locate vulnerable code only at software component or file levels. Some approaches also involve process attributes that …


An Efficient Algorithm To Solve High-Dimensional Data Clustering: Candidate Subspace Clustering Algorithm, Chin-Chieh Kao Jan 2013

Theses Digitization Project

For this project, a comprehensive literature review on high-dimensional data clustering is conducted and a novel density-algorithm to perform high-dimensional data clustering is developed.


A Rule Induction Algorithm For Knowledge Discovery And Classification, Ömer Akgöbek Jan 2013

Turkish Journal of Electrical Engineering and Computer Sciences

Classification and rule induction are key topics in the fields of decision making and knowledge discovery. The objective of this study is to present a new algorithm developed for automatic knowledge acquisition in data mining. The proposed algorithm has been named RES-2 (Rule Extraction System). It aims at eliminating the pitfalls and disadvantages of the techniques and algorithms currently in use. The proposed algorithm makes use of the direct rule extraction approach, rather than the decision tree. For this purpose, it uses a set of examples to induce general rules. In this study, 15 datasets consisting of multiclass values with …


A Window Of Opportunity: Assessing Behavioural Scoring, Kenneth Kennedy, Brian Mac Namee, Sarah Jane Delany, Michael O'Sullivan, Neil Watson Jan 2013

Articles

After credit has been granted, lenders use behavioural scoring to assess the likelihood of default occurring during some specific outcome period. This assessment is based on customers’ repayment performance over a given fixed period. Often the outcome period and fixed performance period are arbitrarily selected, causing instability in making predictions. Behavioural scoring has failed to receive the same attention from researchers as application scoring. The bias for application scoring research can be attributed, in part, to the large volume of data required for behavioural scoring studies. Furthermore, the commercial sensitivities associated with such a large pool of customer data often …