Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 4 of 4
Full-Text Articles in Physical Sciences and Mathematics
Phishing Detection Using Natural Language Processing And Machine Learning, Apurv Mittal, Dr Daniel Engels, Harsha Kommanapalli, Ravi Sivaraman, Taifur Chowdhury
SMU Data Science Review
Phishing emails are a primary mode of entry for attackers into an organization. A successful phishing attempt leads to unauthorized access to sensitive information and systems. However, automatically identifying phishing emails is often difficult, since many of them have composite features, such as body text and metadata, that are nearly indistinguishable from those of valid emails. This paper presents DARTH, a novel machine learning-based framework that characterizes and combines multiple models, one for each composite feature, to enable accurate identification of phishing emails. The framework analyzes each composite feature independently, taking a multi-faceted approach using Natural Language …
Using Natural Language Processing To Increase Modularity And Interpretability Of Automated Essay Evaluation And Student Feedback, Chris Roche, Nathan Deinlein, Darryl Dawkins, Faizan Javed
SMU Data Science Review
For English teachers and students who are dissatisfied with the one-size-fits-all approach of current Automated Essay Scoring (AES) systems, this research uses Natural Language Processing (NLP) techniques that focus on configurability and interpretability. Unlike traditional AES models, which are designed to provide an overall score based on pre-trained criteria, this tool allows teachers to tailor feedback to specific focus areas. The tool implements a user interface that serves as a customizable rubric. Students' essays are entered into the tool, either by the student or by the teacher, through the application's user interface. Based on the rubric settings, the tool …
Stock Forecasts With LSTM And Web Sentiment, Michael Burgess, Faizan Javed, Nnenna Okpara, Chance Robinson
SMU Data Science Review
Traditional time-series techniques, such as auto-regressive and moving average models, can struggle when applied to stock data because of the randomness inherent in the markets. In this study, Long Short-Term Memory Recurrent Neural Networks, or LSTMs, are applied to pricing data along with sentiment scores derived from web sources such as Twitter and financial media outlets. The project team used this approach to complement the technical indicators observed at the end of each trading day for three stocks from the NASDAQ stock exchange over a 12-year span. A common benchmark to assess model performance on time series …
Web Page Multiclass Classification, Brian Gaither, Antonio Debouse, Catherine Huang
SMU Data Science Review
As the internet age evolves, the volume of content hosted on the Web is rapidly expanding. With this ever-expanding content, accurately categorizing web pages remains a challenge across many use cases. This paper proposes a variation on the text preprocessing pipeline in which noun phrase extraction is performed first, followed by lemmatization, contraction expansion, removal of special characters, removal of extra white space, lowercasing, and removal of stop words. The first step of noun phrase extraction is aimed at reducing the set of terms to those that best describe what the web pages are about …
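The preprocessing order listed in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the tiny contraction and stop word tables, and the hook parameters `extract_noun_phrases` and `lemmatize` are all hypothetical. Noun phrase extraction and lemmatization normally need an NLP library, so they are left here as pluggable callables that default to doing nothing.

```python
import re
from typing import Callable, List

# Tiny illustrative tables (assumptions); a real pipeline would use fuller lists.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is", "don't": "do not"}
STOP_WORDS = {"a", "an", "the", "is", "it", "of", "and", "to", "do", "not"}

# Match any known contraction as a whole word, ignoring case.
_CONTRACTION_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in CONTRACTIONS) + r")\b",
    re.IGNORECASE,
)


def identity(text: str) -> str:
    """Placeholder hook that returns its input unchanged."""
    return text


def expand_contractions(text: str) -> str:
    """Replace each known contraction with its expanded form."""
    return _CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)


def preprocess(
    text: str,
    extract_noun_phrases: Callable[[str], str] = identity,  # hypothetical hook
    lemmatize: Callable[[str], str] = identity,  # hypothetical hook
) -> List[str]:
    """Apply the steps in the order the abstract lists them."""
    text = extract_noun_phrases(text)            # 1. noun phrase extraction
    text = lemmatize(text)                       # 2. lemmatization
    text = expand_contractions(text)             # 3. contraction expansion
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)  # 4. remove special characters
    text = re.sub(r"\s+", " ", text).strip()     # 5. remove extra white space
    text = text.lower()                          # 6. lowercase
    return [tok for tok in text.split() if tok not in STOP_WORDS]  # 7. stop words


if __name__ == "__main__":
    print(preprocess("It's   a GREAT page, don't miss it!"))
```

With the default hooks, only the five cleaning steps take effect. A caller could pass library-backed functions for the first two steps without changing the rest of the pipeline.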