Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine Learning

Nova Southeastern University

Articles 1 - 8 of 8

Full-Text Articles in Physical Sciences and Mathematics

Increasing Code Completion Accuracy In Pythia Models For Non-Standard Python Libraries, David Buksbaum Jan 2023

CCE Theses and Dissertations

Contemporary software development with modern programming languages leverages Integrated Development Environments, smart text editors, and similar tooling with code completion capabilities to increase the efficiency of software developers. Recent code completion research has shown that the combination of natural language processing with recurrent neural networks configured with long short-term memory can improve the accuracy of code completion predictions over prior models. It is well known that the accuracy of predictive systems based on training data is correlated to the quality and the quantity of the training data. This dissertation demonstrates that by expanding the training data set to include more …


Machine Learning Methods For Septic Shock Prediction, Aiman A. Darwiche Jan 2018

CCE Theses and Dissertations

Sepsis is a life-threatening organ dysfunction caused by a dysregulated body response to infection. Sepsis is difficult to detect at an early stage, and when not detected early, is difficult to treat and results in high mortality rates. Developing improved methods for identifying patients at high risk of suffering septic shock has been the focus of much research in recent years. Building on this body of literature, this dissertation develops an improved method for septic shock prediction. Using data from the MIMIC-III database, an ensemble classifier is trained to identify high-risk patients. A robust prediction model …


Probabilistic Clustering Ensemble Evaluation For Intrusion Detection, Steven M. Mcelwee Jan 2018

CCE Theses and Dissertations

Intrusion detection is the practice of examining information from computers and networks to identify cyberattacks. It is an important topic in practice, since the frequency and consequences of cyberattacks continue to increase and affect organizations. It is important for research, since many problems exist for intrusion detection systems. Intrusion detection systems monitor large volumes of data and frequently generate false positives. This results in additional effort for security analysts to review and interpret alerts. After long hours spent reviewing alerts, security analysts become fatigued and make bad decisions. There is currently no approach to intrusion detection that reduces the workload …


Pulsar Search Using Supervised Machine Learning, John M. Ford Jan 2017

CCE Theses and Dissertations

Pulsars are rapidly rotating neutron stars which emit a strong beam of energy through mechanisms that are not entirely clear to physicists. These very dense stars are used by astrophysicists to study many basic physical phenomena, such as the behavior of plasmas in extremely dense environments, behavior of pulsar-black hole pairs, and tests of general relativity. Many of these tasks require information to answer the scientific questions posed by physicists. In order to provide more pulsars to study, there are several large-scale pulsar surveys underway, which are generating a huge backlog of unprocessed data. Searching for pulsars is a very …


Improved Detection For Advanced Polymorphic Malware, James B. Fraley Jan 2017

CCE Theses and Dissertations

Malicious software (malware) attacks across the internet are increasing at an alarming rate. Cyber-attacks have become increasingly sophisticated and targeted. These targeted attacks are aimed at compromising networks, stealing personal financial information, removing sensitive data, or disrupting operations. Current malware detection approaches work well for previously known signatures. However, malware developers use techniques that mutate and change software properties (signatures) to evade detection. Polymorphic malware is practically undetectable with signature-based defensive technologies. Today’s effective detection rate for polymorphic malware ranges from 68.75% to 81.25%. New techniques are needed to improve malware detection rates. Improved detection …


Performance Envelopes Of Adaptive Ensemble Data Stream Classifiers, Stefan Joe-Yen Jan 2017

CCE Theses and Dissertations

This dissertation documents a study of the performance characteristics of algorithms designed to mitigate the effects of concept drift on online machine learning. Several supervised binary classifiers were evaluated on their performance when applied to an input data stream with a non-stationary class distribution. The selected classifiers included ensembles that combine the contributions of their member algorithms to improve overall performance. These ensembles adapt to changing class definitions, known as “concept drift,” often present in real-world situations, by adjusting the relative contributions of their members. Three stream classification algorithms and three adaptive ensemble algorithms were compared to determine the capabilities …


Evaluation Of Supervised Machine Learning For Classifying Video Traffic, Farrell R. Taylor Jan 2016

CCE Theses and Dissertations

Operational deployment of machine-learning-based classifiers in real-world networks has become an important area of research to support automated real-time quality of service decisions by Internet service providers (ISPs) and, more generally, network administrators. As the Internet has evolved, multimedia applications, such as voice over Internet protocol (VoIP), gaming, and video streaming, have become commonplace. These traffic types are sensitive to network perturbations, e.g., jitter and delay. Automated quality of service (QoS) capabilities offer a degree of relief by prioritizing network traffic without human intervention; however, they rely on the integration of real-time traffic classification to identify applications. Accordingly, …


Using Diversity Ensembles With Time Limits To Handle Concept Drift, Robert M. Van Camp Jan 2016

CCE Theses and Dissertations

While traditional supervised learning focuses on static datasets, an increasing amount of data comes in the form of streams, where data is continuous and typically processed only once. A common problem with data streams is that the underlying concept we are trying to learn can be constantly evolving. This concept drift has been of interest to researchers over the last few years, and there is a need for improved machine learning algorithms that are capable of dealing with concept drift. A promising approach involves using an ensemble of a diverse set of classifiers. The constituent classifiers are re-trained when a concept …