Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Keyword
- Adversarial Attacks (1)
- Adversarial Defenses (1)
- Bias-variance decomposition (1)
- Binary Classification (1)
- Class Imbalance Problem (1)
- Credit scoring (1)
- Downstream analysis effects (1)
- Empirical analysis (1)
- Investment decision (1)
- Logistic Regression (1)
- Machine learning (1)
- Neural Network (1)
- Peer-to-peer lending (1)
- Penalized Log-likelihood Function (1)
- Profit Scoring; Unbanked; Underbanked; Alternative Credit Data; Likelihood Ratio Test; Unscorables (1)
- Profit scoring (1)
- Security Analytics (1)
- Theoretical framework (1)
Articles 1 - 5 of 5
Full-Text Articles in Physical Sciences and Mathematics
Quantitatively Motivated Model Development Framework: Downstream Analysis Effects Of Normalization Strategies, Jessica M. Rudd
Doctor of Data Science and Analytics Dissertations
Through a review of epistemological frameworks in the social sciences, the history of frameworks in statistics, and the current state of research, we establish that there appears to be no consistent, quantitatively motivated model development framework in data science, and that the downstream analysis effects of various modeling choices are not uniformly documented. Examples are provided which illustrate that analytic choices, even if justifiable and statistically valid, have a downstream analysis effect on model results. This study proposes a unified model development framework that allows researchers to make statistically motivated modeling choices within the development pipeline. Additionally, a simulation study is …
Attack And Defense In Security Analytics, Yiyun Zhou
Doctor of Data Science and Analytics Dissertations
Awareness of security problems has grown due to various kinds of global threats. Security analytics is the process of using streaming data acquisition, collection, and artificial intelligence algorithms for security monitoring and threat disclosure. In this dissertation work, we use practical data-driven security analytics to identify potential threats and explore the robustness of machine learning models. We focus on two aspects: (1) Security Analytics: use machine learning and statistical analytics tools to identify and resolve real-life threats, such as cybersecurity incidents and abnormal activities. (2) Analytic Security: explore the security issues of the machine learning …
Data-Driven Investment Decisions In P2p Lending: Strategies Of Integrating Credit Scoring And Profit Scoring, Yan Wang
Doctor of Data Science and Analytics Dissertations
In this dissertation, we develop and discuss several loan evaluation methods to guide investment decisions in peer-to-peer (P2P) lending. In evaluating loans, credit scoring and profit scoring are two widely used approaches. Credit scoring aims at minimizing risk, while profit scoring aims at maximizing profit. This dissertation addresses the strengths and weaknesses of each scoring method by integrating them in various ways in order to provide optimal investment suggestions for different investors. Before developing the methods for loan evaluation at the individual level, we applied the state-of-the-art method called the Long Short-Term Memory (LSTM) …
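The contrast between credit scoring and profit scoring can be illustrated with a minimal sketch. It is not the dissertation's method: the three loans, their default probabilities, and the payoff figures are hypothetical values chosen only to show that ranking by risk and ranking by expected profit can disagree. The helper `expected_profit` is likewise an assumption introduced for illustration.

```python
# Hypothetical loans: every number below is made up for illustration.
loans = [
    {"id": "A", "p_default": 0.02, "profit_if_repaid": 30.0, "loss_if_default": 1000.0},
    {"id": "B", "p_default": 0.10, "profit_if_repaid": 200.0, "loss_if_default": 1000.0},
    {"id": "C", "p_default": 0.30, "profit_if_repaid": 300.0, "loss_if_default": 1000.0},
]


def expected_profit(loan):
    """Probability-weighted payoff of a single loan (illustrative helper)."""
    p = loan["p_default"]
    return (1.0 - p) * loan["profit_if_repaid"] - p * loan["loss_if_default"]


# Credit-scoring view: prefer the loans least likely to default.
by_credit = sorted(loans, key=lambda loan: loan["p_default"])

# Profit-scoring view: prefer the loans with the highest expected profit.
by_profit = sorted(loans, key=expected_profit, reverse=True)

credit_order = [loan["id"] for loan in by_credit]
profit_order = [loan["id"] for loan in by_profit]
```

With these made-up numbers, the safest loan (A) is not the most profitable one in expectation (B), which is the kind of tension that motivates combining the two scores.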
A Credit Analysis Of The Unbanked And Underbanked: An Argument For Alternative Data, Edwin Baidoo
Doctor of Data Science and Analytics Dissertations
The purpose of this study is to ascertain the statistical and economic significance of non-traditional credit data for individuals who do not have sufficient economic data, collectively known as the unbanked and underbanked. The consequences of not having sufficient economic information often determine whether unbanked and underbanked individuals will pay a higher price for credit or be denied entirely. In terms of regulation, there is a strong interest in credit models that will inform policies on how to gradually move sections of the unbanked and underbanked population into the general financial network.
In Chapter 2 of the dissertation, I establish the …
A Novel Penalized Log-Likelihood Function For Class Imbalance Problem, Lili Zhang
Doctor of Data Science and Analytics Dissertations
The log-likelihood function is the optimization objective in the maximum likelihood method for estimating models (e.g., logistic regression, neural networks). However, its formulation is based on assumptions that the target classes are equally distributed and the overall accuracy is maximized, which do not apply to class imbalance problems (e.g., fraud detection, rare disease diagnosis, customer conversion prediction, cybersecurity, predictive maintenance). When trained on imbalanced data, the resulting models tend to be biased towards the majority class (i.e., the non-event), which can cause substantial losses in practice. One strategy for mitigating such bias is to penalize the misclassification costs of observations differently …
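The general idea of penalizing observations differently can be sketched with a standard class-weighted Bernoulli log-likelihood. This is a generic textbook weighting, not the novel penalized function the dissertation proposes (the abstract is truncated before describing it). The function name `weighted_log_likelihood`, the weights `w_event` and `w_nonevent`, and the toy data are assumptions made for illustration.

```python
import math


def weighted_log_likelihood(y_true, p_pred, w_event=1.0, w_nonevent=1.0):
    """Class-weighted Bernoulli log-likelihood (illustrative, not the author's).

    y_true: iterable of 0/1 labels, where 1 marks the rare event.
    p_pred: iterable of predicted event probabilities.
    With both weights equal to 1 this is the ordinary log-likelihood.
    """
    eps = 1e-12  # keep log() away from 0
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            total += w_event * math.log(p)
        else:
            total += w_nonevent * math.log(1.0 - p)
    return total


# Toy imbalanced sample: four non-events, one event, and a model that
# predicts a low event probability for every observation.
y = [0, 0, 0, 0, 1]
p = [0.1, 0.1, 0.1, 0.1, 0.1]

plain = weighted_log_likelihood(y, p)
weighted = weighted_log_likelihood(y, p, w_event=4.0)
```

Raising `w_event` makes the missed event cost more, so `weighted` is lower than `plain` for the same predictions, and an optimizer maximizing the weighted objective is pushed harder to fit the minority class.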