Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Classifying Imbalanced Financial Fraud Data Utilizing Enhanced Random Forest Algorithm, Charles Gardner Dec 2020

Master of Science in Computer Science Theses

Imbalanced datasets pose a distinctive challenge for machine learning, requiring specialized approaches to correctly classify the minority class. Financial fraud detection involves highly imbalanced datasets, with class ratios as extreme as 0.01% fraudulent to 99.99% regular transactions. In financial fraud detection it is essential to identify all frauds, even if the precision of some classifications is low. I developed a random forest assembly that separates fraudulent transactions into tiers of precision. With this approach, 96% of fraudulent transactions are identified, an 8% increase in recall compared to standard approaches. 59% of fraud classifications' precision increases by 10% …
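
To illustrate what "tiers of precision" and recall mean in this setting, here is a minimal sketch in plain Python. It is not the author's method: the fraud scores, the two cutoffs, and the helper names (`assign_tier`, `tier_report`) are all hypothetical, chosen only to show how flagged transactions can be grouped by confidence and how per-tier precision and overall recall are computed.

```python
from typing import List, Tuple


def assign_tier(score: float, thresholds: List[float]) -> int:
    """Return the index of the first cutoff the score meets.

    `thresholds` must be sorted from highest to lowest. Tier 0 is the
    most confident tier; a return value of len(thresholds) means the
    transaction is not flagged at all.
    """
    for tier, cutoff in enumerate(thresholds):
        if score >= cutoff:
            return tier
    return len(thresholds)


def tier_report(
    scores: List[float], labels: List[int], thresholds: List[float]
) -> Tuple[List[float], float]:
    """Compute precision within each tier and recall over all flagged tiers.

    `labels` uses 1 for fraud and 0 for a regular transaction.
    """
    n_tiers = len(thresholds)
    true_pos = [0] * n_tiers
    false_pos = [0] * n_tiers
    for score, label in zip(scores, labels):
        tier = assign_tier(score, thresholds)
        if tier < n_tiers:  # the transaction was flagged in some tier
            if label == 1:
                true_pos[tier] += 1
            else:
                false_pos[tier] += 1

    precisions = []
    for tp, fp in zip(true_pos, false_pos):
        flagged = tp + fp
        precisions.append(tp / flagged if flagged else 0.0)

    total_fraud = sum(labels)
    recall = sum(true_pos) / total_fraud if total_fraud else 0.0
    return precisions, recall


# Hypothetical scores (e.g. the share of trees voting "fraud") and labels.
scores = [0.95, 0.92, 0.70, 0.65, 0.60, 0.30, 0.10, 0.05]
labels = [1, 1, 1, 0, 1, 0, 1, 0]
precisions, recall = tier_report(scores, labels, thresholds=[0.9, 0.5])
```

In this toy data the high-confidence tier contains only frauds, while the lower tier trades some precision for additional recall, which is the kind of trade-off the abstract describes.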


Comparing Variable Importance In Prediction Of Silence Behaviours Between Random Forest And Conditional Inference Forest Models., Stephen Barrett Dr, Geraldine Gray Dr, Colm Mcguinness Dr, Michael Knoll Dr. Oct 2020

Articles

This paper explores the variable importance metrics of Random Forests based on Conditional Inference Trees (CIT) and on classical Classification And Regression Trees (CART). It compares the variable importance rankings of both algorithms and highlights why CIT should be used when dealing with data at different levels of aggregation. The models analysed explored the role of cultural factors at the individual and societal levels when predicting Organisational Silence behaviours.


Machine Learning Approaches For Improving Prediction Performance Of Structure-Activity Relationship Models, Gabriel Idakwo Aug 2020

Dissertations

In silico bioactivity prediction studies are designed to complement in vivo and in vitro efforts to assess the activity and properties of small molecules. In silico methods such as Quantitative Structure-Activity/Property Relationship (QSAR) modelling are used to correlate the structure of a molecule with its biological properties in drug design and toxicological studies. In this body of work, I started with two in-depth reviews of the application of machine-learning-based approaches and feature reduction methods to QSAR, and then investigated solutions to three common challenges faced in machine-learning-based QSAR studies.

First, to improve the prediction accuracy of learning …


Analysis On Suicidal Ideation Among Adolescents (12-17 Years) In The USA, Himani Raturi Jul 2020

Electronic Theses, Projects, and Dissertations

Suicide is one of the leading health concerns among adolescents in the United States, and the prevalence of suicidal ideation (SI) is quite high, with ~20-30% of adolescents reporting it at some point. Though suicide prevention has grown and developed, there is limited research on the ability to identify adolescents who might be at risk for SI. The objective of the project is to identify adolescents with SI using machine learning.

The project shows statistics from different articles on adolescents in the U.S. For this study, adolescent data was taken from NSDUH 2018. Moreover, detailed …


An Application Of Machine Learning To Explore Relationships Between Factors Of Organisational Silence And Culture, With Specific Focus On Predicting Silence Behaviours, Stephen Barrett Dr May 2020

Articles

Research indicates that there are many individual reasons why people do not speak up when confronted with situations that may concern them within their working environment. One of the areas that requires more focused research is the role culture plays in why a person may remain silent when such situations arise. The purpose of this study is to use data science techniques to explore the patterns in a data set that would lead a person to engage in organisational silence. The main research question the thesis asks is: Is Machine Learning a tool that Social Scientists can use with respect …