Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Institution
- San Jose State University (6)
- University of Arkansas, Fayetteville (6)
- University of Massachusetts Amherst (4)
- Air Force Institute of Technology (2)
- Kennesaw State University (2)
- Minnesota State University, Mankato (2)
- California Polytechnic State University, San Luis Obispo (1)
- Clemson University (1)
- Edith Cowan University (1)
- James Madison University (1)
- La Salle University (1)
- Marshall University (1)
- Old Dominion University (1)
- Rochester Institute of Technology (1)
- University of Louisville (1)
- University of New Mexico (1)
- Washington University in St. Louis (1)
- Western Michigan University (1)
- Publication
- Master's Projects (6)
- Graduate Theses and Dissertations (5)
- Doctoral Dissertations (4)
- All Graduate Theses, Dissertations, and Other Capstone Projects (2)
- Master of Science in Computer Science Theses (2)
- Theses and Dissertations (2)
- All Dissertations (1)
- Computer Science ETDs (1)
- Computer Science Theses & Dissertations (1)
- Computer Science and Computer Engineering Undergraduate Honors Theses (1)
- Dissertations (1)
- Economic Crime Forensics Capstones (1)
- Electronic Theses and Dissertations (1)
- Master's Theses (1)
- Masters Theses, 2020-current (1)
- Senior Honors Papers / Undergraduate Theses (1)
- Theses (1)
- Theses, Dissertations and Capstones (1)
- Theses: Doctorates and Masters (1)
Articles 1 - 30 of 34
Full-Text Articles in Physical Sciences and Mathematics
Towards Robust Long-Form Text Generation Systems, Kalpesh Krishna
Doctoral Dissertations
Text generation is an important emerging AI technology that has seen significant research advances in recent years. Due to its closeness to how humans communicate, mastering text generation technology can unlock several important applications such as intelligent chat-bots, creative writing assistance, or newer applications like task-agnostic few-shot learning. Most recently, the rapid scaling of large language models (LLMs) has resulted in systems like ChatGPT, capable of generating fluent, coherent and human-like text. However, despite their remarkable capabilities, LLMs still suffer from several limitations, particularly when generating long-form text. In particular, (1) long-form generated text is filled with factual inconsistencies to …
Quantifying And Enhancing The Security Of Federated Learning, Virat Vishnu Shejwalkar
Doctoral Dissertations
Federated learning is an emerging distributed learning paradigm that allows multiple users to collaboratively train a joint machine learning model without having to share their private data with any third party. Due to many of its attractive properties, federated learning has received significant attention from academia as well as industry and now powers major applications, e.g., Google's Gboard and Assistant, Apple's Siri, Owkin's health diagnostics, etc. However, federated learning is yet to see widespread adoption due to a number of challenges. One such challenge is its susceptibility to poisoning by malicious users who aim to manipulate the joint machine learning …
Cyber Attack Surface Mapping For Offensive Security Testing, Douglas Everson
All Dissertations
Security testing consists of automated processes, like Dynamic Application Security Testing (DAST) and Static Application Security Testing (SAST), as well as manual offensive security testing, like Penetration Testing and Red Teaming. This nonautomated testing is frequently time-constrained and difficult to scale. Previous literature suggests that most research is spent in support of improving fully automated processes or in finding specific vulnerabilities, with little time spent improving the interpretation of the scanned attack surface critical to nonautomated testing. In this work, agglomerative hierarchical clustering is used to compress the Internet-facing hosts of 13 representative companies as collected by the Shodan search …
On Phishing: Proposing A Host-Based Multi-Layer Passive/Active Anti-Phishing Approach Combating Counterfeit Websites, Wesam Harbi Fadheel
Dissertations
Phishing is the starting point of most cyberattacks, mainly categorized as Email, Websites, Social Networks, Phone calls (Vishing), and SMS messaging (Smishing). Phishing refers to an attempt to collect sensitive data, typically in the form of usernames, passwords, credit card numbers, bank account information, etc., or other crucial facts, intending to use or sell the information obtained. Similar to how a fisherman uses bait to catch a fish, an attacker will pose as a trustworthy source to attract and deceive the victim.
This study explores the efficacy of host-side APT (Anti-Phishing Techniques) based on Website features like Lexical, Host-Based, or Content-Based …
Unlocking User Identity: A Study On Mouse Dynamics In Dual Gaming Environments For Continuous Authentication, Marcho Setiawan Handoko
All Graduate Theses, Dissertations, and Other Capstone Projects
With the surge in information management technology reliance and the looming presence of cyber threats, user authentication has become paramount in computer security. Traditional static or one-time authentication has its limitations, prompting the emergence of continuous authentication as a frontline approach for enhanced security. Continuous authentication taps into behavior-based metrics for ongoing user identity validation, predominantly utilizing machine learning techniques to continually model user behaviors. This study elucidates the potential of mouse movement dynamics as a key metric for continuous authentication. By examining mouse movement patterns across two contrasting gaming scenarios - the high-intensity "Team Fortress" and the low-intensity strategic …
Divide-And-Conquer Distributed Learning: Privacy-Preserving Offloading Of Neural Network Computations, Lewis C.L. Brown
Graduate Theses and Dissertations
Machine learning has become a highly utilized technology to perform decision making on high dimensional data. As dataset sizes have become increasingly large so too have the neural networks to learn the complex patterns hidden within. This expansion has continued to the degree that it may be infeasible to train a model from a singular device due to computational or memory limitations of underlying hardware. Purpose built computing clusters for training large models are commonplace while access to networks of heterogeneous devices is still typically more accessible. In addition, with the rise of 5G networks, computation at the edge becoming …
Dataset Evaluation For Data Trading Using Expected Loss And Homomorphic Encryption, Minsung Joo
Senior Honors Papers / Undergraduate Theses
Supervised machine learning suffers from the "garbage-in garbage-out" phenomenon where the performance of a model is limited by the quality of the data. While a myriad of data is collected every second, there is no general rigorous method of evaluating the quality of a given dataset. This hinders fair pricing of data in scenarios where a buyer may look to buy data for use with machine learning. In this work, I propose using the expected loss corresponding to a dataset as a measure of its quality, relying on Bayesian methods for uncertainty quantification. Furthermore, I present a secure multi-party computation …
A Predictive Model To Predict Cyberattack Using Self-Normalizing Neural Networks, Oluwapelumi Eniodunmo
Theses, Dissertations and Capstones
Cyberattack is a never-ending war that has greatly threatened secured information systems. The development of automated and intelligent systems provides more computing power to hackers to steal information, destroy data or system resources, and has raised global security issues. Statistical and Data mining tools have received continuous research and improvements. These tools have been adopted to create sophisticated intrusion detection systems that help information systems mitigate and defend against cyberattacks. However, the advancement in technology and accessibility of information makes more identifiable elements that can be used to gain unauthorized access to systems and resources. Data mining and classification tools …
Faking Sensor Noise Information, Justin Chang
Master's Projects
Noise residue detection in digital images has recently been used as a method to classify images based on source camera model type. The meteoric rise in the popularity of using Neural Network models has also been used in conjunction with the concept of noise residuals to classify source camera models. However, many papers gloss over the details on the methods of obtaining noise residuals and instead rely on the self-learning aspect of deep neural networks to implicitly discover this themselves. For this project I propose a method of obtaining noise residuals (“noiseprints”) and denoising an image, as well as …
Privacy-Preserving Cloud-Assisted Data Analytics, Wei Bao
Graduate Theses and Dissertations
Nowadays industries are collecting a massive and exponentially growing amount of data that can be utilized to extract useful insights for improving various aspects of our life. Data analytics (e.g., via the use of machine learning) has been extensively applied to make important decisions in various real world applications. However, it is challenging for resource-limited clients to analyze their data in an efficient way when its scale is large. Additionally, the data resources are increasingly distributed among different owners. Nonetheless, users' data may contain private information that needs to be protected.
Cloud computing has become more and more popular in …
Achieving Differential Privacy And Fairness In Machine Learning, Depeng Xu
Graduate Theses and Dissertations
Machine learning algorithms are used to make decisions in various applications, such as recruiting, lending and policing. These algorithms rely on large amounts of sensitive individual information to work properly. Hence, there are sociological concerns about machine learning algorithms on matters like privacy and fairness. Currently, many studies only focus on protecting individual privacy or ensuring fairness of algorithms separately without taking consideration of their connection. However, there are new challenges arising in privacy preserving and fairness-aware machine learning. On one hand, there is fairness within the private model, i.e., how to meet both privacy and fairness requirements simultaneously in …
A Methodology For Detecting Credit Card Fraud, Kayode Ayorinde
All Graduate Theses, Dissertations, and Other Capstone Projects
Fraud detection is relevant to many industries such as banking, retail, financial services, healthcare, etc. As we know, fraud detection is a set of activities undertaken to prevent money or property from being obtained under false pretenses. With an unlimited and growing number of ways fraudsters commit fraud, detecting online fraud is difficult to achieve. This research work aims to examine feasible ways to identify fraudulent credit card activities that negatively impact financial institutions. In the United States, an average of U.S. consumers lost a median of $429 from credit card fraud in 2017, …
Improving A Wireless Localization System Via Machine Learning Techniques And Security Protocols, Zachary Yorio
Masters Theses, 2020-current
The recent advancements made in Internet of Things (IoT) devices have brought forth new opportunities for technologies and systems to be integrated into our everyday life. In this work, we investigate how edge nodes can effectively utilize 802.11 wireless beacon frames being broadcast from pre-existing access points in a building to achieve room-level localization. We explain the needed hardware and software for this system and demonstrate a proof of concept with experimental data analysis. Improvements to localization accuracy are shown via machine learning by implementing the random forest algorithm. Using this algorithm, historical data can train the model and make …
The Limits Of Location Privacy In Mobile Devices, Keen Yuun Sung
Doctoral Dissertations
Mobile phones are widely adopted by users across the world today. However, the privacy implications of persistent connectivity are not well understood. This dissertation focuses on one important concern of mobile phone users: location privacy. I approach this problem from the perspective of three adversaries that users are exposed to via smartphone apps: the mobile advertiser, the app developer, and the cellular service provider. First, I quantify the proportion of mobile users who use location permissive apps and are able to be tracked through their advertising identifier, and demonstrate a mark and recapture attack that allows continued tracking of users …
Superb: Superior Behavior-Based Anomaly Detection Defining Authorized Users' Traffic Patterns, Daniel Karasek
Master of Science in Computer Science Theses
Network anomalies are correlated to activities that deviate from regular behavior patterns in a network, and they are undetectable until their actions are defined as malicious. Current work in network anomaly detection includes network-based and host-based intrusion detection systems. However, network anomaly detection schemes can suffer from high false detection rates due to the base rate fallacy. When the detection rate is less than the false positive rate, which is found in network anomaly detection schemes working with live data, a high false detection rate can occur. To overcome such a drawback, this paper proposes a superior behavior-based anomaly detection …
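The base rate effect mentioned in this abstract can be illustrated with Bayes' theorem. The sketch below is not taken from the thesis: the function name `alert_precision` and the example rates are hypothetical, chosen only to show how a low prior probability of attack can make most alerts false even when the detector itself looks accurate.

```python
def alert_precision(detection_rate, false_positive_rate, base_rate):
    """Return P(malicious | alert) using Bayes' theorem.

    detection_rate      -- P(alert | malicious), the true positive rate
    false_positive_rate -- P(alert | benign)
    base_rate           -- P(malicious), the prior share of malicious events
    """
    true_alerts = detection_rate * base_rate
    false_alerts = false_positive_rate * (1.0 - base_rate)
    total = true_alerts + false_alerts
    return true_alerts / total if total > 0 else 0.0


# Hypothetical numbers: 99% detection rate, 1% false positive rate,
# and 1 malicious event per 1,000 observed events.
precision = alert_precision(0.99, 0.01, 0.001)
print(f"P(malicious | alert) = {precision:.3f}")
```

With these assumed rates, only about 9% of alerts correspond to real malicious events, which is the kind of high false detection rate the abstract attributes to the base rate fallacy.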
Data Mining Of Chinese Social Networks: Factors That Indicate Post Deletion, Meisam Navaki Arefi
Computer Science ETDs
Widespread Chinese social media applications such as Sina Weibo (Chinese Twitter), the most popular social network in China, are widely known for monitoring and deleting posts to conform to Chinese government requirements. Censorship of Chinese social media is a complex process that involves many factors. There are multiple stakeholders and many different interests: economic, political, legal, personal, etc., which means that there is not a single strategy dictated by a single government authority. Moreover, sometimes Chinese social media do not follow the directives of government, out of concern that they are more strictly censoring than their competitors.
One crucial question …
Countering Cybersecurity Vulnerabilities In The Power System, Fengli Zhang
Graduate Theses and Dissertations
Security vulnerabilities in software pose an important threat to power grid security, which can be exploited by attackers if not properly addressed. Every month, many vulnerabilities are discovered and all the vulnerabilities must be remediated in a timely manner to reduce the chance of being exploited by attackers. In current practice, security operators have to manually analyze each vulnerability present in their assets and determine the remediation actions in a short time period, which involves a tremendous amount of human resources for electric utilities. To solve this problem, we propose a machine learning-based automation framework to automate vulnerability analysis and …
Intelligent Log Analysis For Anomaly Detection, Steven Yen
Master's Projects
Computer logs are a rich source of information that can be analyzed to detect various issues. The large volumes of logs limit the effectiveness of manual approaches to log analysis. The earliest automated log analysis tools take a rule-based approach, which can only detect known issues with existing rules. On the other hand, anomaly detection approaches can detect new or unknown issues. This is achieved by looking for unusual behavior different from the norm, often utilizing machine learning (ML) or deep learning (DL) models. In this project, we evaluated various ML and DL techniques used for log anomaly detection. We …
Machine Learning Versus Deep Learning For Malware Detection, Parth Jain
Master's Projects
It is often claimed that the primary advantage of deep learning is that such models can continue to learn as more data is available, provided that sufficient computing power is available for training. In contrast, for other forms of machine learning it is claimed that models “saturate,” in the sense that no additional learning can occur beyond some point, regardless of the amount of data or computing power available. In this research, we compare the accuracy of deep learning to other forms of machine learning for malware detection, as a function of the training dataset size. We experiment with a …
Multifamily Malware Models, Samanvitha Basole
Master's Projects
When training a machine learning model, there is likely to be a tradeoff between the accuracy of the model and the generality of the dataset. Previous research has shown that if we train a model to detect one specific malware family, we obtain stronger results as compared to a case where we train a single model on multiple diverse families. During the detection phase, it would be more efficient to have a single model that could detect multiple families, rather than having to score each sample against multiple models. In this research, we conduct experiments to quantify the relationship between …
Evaluating Machine Learning Techniques For Smart Home Device Classification, Angelito E. Aragon Jr.
Theses and Dissertations
Smart devices in the Internet of Things (IoT) have transformed the management of personal and industrial spaces. Leveraging inexpensive computing, smart devices enable remote sensing and automated control over a diverse range of processes. Even as IoT devices provide numerous benefits, it is vital that their emerging security implications are studied. IoT device design typically focuses on cost efficiency and time to market, leading to limited built-in encryption, questionable supply chains, and poor data security. In a 2017 report, the United States Government Accountability Office recommended that the Department of Defense investigate the risks IoT devices pose to operations security, …
Confidence Inference In Defensive Cyber Operator Decision Making, Graig S. Ganitano
Theses and Dissertations
Cyber defense analysts face the challenge of validating machine generated alerts regarding network-based security threats. Operations tempo and systematic manpower issues have increased the importance of these individual analyst decisions, since they typically are not reviewed or changed. Analysts may not always be confident in their decisions. If confidence can be accurately assessed, then analyst decisions made under low confidence can be independently reviewed and analysts can be offered decision assistance or additional training. This work investigates the utility of using neurophysiological and behavioral correlates of decision confidence to train machine learning models to infer confidence in analyst decisions. Electroencephalography …
The Benefits Of Artificial Intelligence In Cybersecurity, Ricardo Calderon
Economic Crime Forensics Capstones
Cyberthreats have increased extensively during the last decade. Cybercriminals have become more sophisticated. Current security controls are not enough to defend networks from the number of highly skilled cybercriminals. Cybercriminals have learned how to evade the most sophisticated tools, such as Intrusion Detection and Prevention Systems (IDPS), and botnets are almost invisible to current tools. Fortunately, the application of Artificial Intelligence (AI) may increase the detection rate of IDPS systems, and Machine Learning (ML) techniques are able to mine data to detect botnets’ sources. However, the implementation of AI may bring other risks, and cybersecurity experts need to find a …
Learning-Based Analysis On The Exploitability Of Security Vulnerabilities, Adam Bliss
Computer Science and Computer Engineering Undergraduate Honors Theses
The purpose of this thesis is to develop a tool that uses machine learning techniques to make predictions about whether or not a given vulnerability will be exploited. Such a tool could help organizations such as electric utilities to prioritize their security patching operations. Three different models, based on a deep neural network, a random forest, and a support vector machine respectively, are designed and implemented. Training data for these models is compiled from a variety of sources, including the National Vulnerability Database published by NIST and the Exploit Database published by Offensive Security. Extensive experiments are conducted, including testing …
Malware Image Classification Using Machine Learning With Local Binary Pattern, Jhu-Sin Luo, Dan Lo
Master of Science in Computer Science Theses
Malware classification is a critical part of cybersecurity. Traditional methodologies for malware classification typically use static analysis and dynamic analysis to identify malware. In this paper, a malware classification methodology based on the malware's binary image and on extracting local binary pattern (LBP) features is proposed. First, malware images are reorganized into 3 by 3 grids, which are mainly used to extract LBP features. Second, the LBP is implemented on the malware images to extract features, in that it is useful in pattern or texture classification. Finally, TensorFlow, a library for machine learning, is applied to classify malware images with …
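The LBP step described in this abstract can be sketched as follows. This is the basic textbook 3 by 3 local binary pattern, not the thesis's implementation: the helper names `lbp_code` and `lbp_histogram`, the clockwise neighbour ordering, and the `>=` comparison are assumptions made for illustration, and the image is assumed to be a grayscale grid given as a list of rows.

```python
# Neighbour positions in a 3x3 patch, clockwise from the top-left corner.
# The ordering is an assumed convention; any fixed order works.
NEIGHBOURS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def lbp_code(patch):
    """Return the 8-bit LBP code of a 3x3 patch (list of three rows)."""
    center = patch[1][1]
    code = 0
    for bit, (r, c) in enumerate(NEIGHBOURS):
        # Set the bit when the neighbour is at least as bright as the center.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Return a 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

In a pipeline like the one the abstract outlines, a histogram of this kind would serve as the feature vector passed to a classifier; how the thesis builds and feeds its features is not stated in the excerpt.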
Applying Machine Learning To Advance Cyber Security: Network Based Intrusion Detection Systems, Hassan Hadi Latheeth Al-Maksousy
Computer Science Theses & Dissertations
Many new devices, such as phones and tablets as well as traditional computer systems, rely on wireless connections to the Internet and are susceptible to attacks. Two important types of attacks are the use of malware and exploiting Internet protocol vulnerabilities in devices and network systems. These attacks form a threat on many levels and therefore any approach to dealing with these nefarious attacks will take several methods to counter. In this research, we utilize machine learning to detect and classify malware, visualize, detect and classify worms, as well as detect deauthentication attacks, a form of Denial of Service (DoS). …
Dynamic Adversarial Mining - Effectively Applying Machine Learning In Adversarial Non-Stationary Environments., Tegjyot Singh Sethi
Electronic Theses and Dissertations
While understanding of machine learning and data mining is still in its budding stages, the engineering applications of the same have found immense acceptance and success. Cybersecurity applications such as intrusion detection systems, spam filtering, and CAPTCHA authentication have all begun adopting machine learning as a viable technique to deal with large scale adversarial activity. However, the naive usage of machine learning in an adversarial setting is prone to reverse engineering and evasion attacks, as most of these techniques were designed primarily for a static setting. The security domain is a dynamic landscape, with an ongoing, never-ending arms race …
Problems In Graph-Structured Modeling And Learning, James Atwood
Doctoral Dissertations
This thesis investigates three problems in graph-structured modeling and learning. We first present a method for efficiently generating large instances from nonlinear preferential attachment models of network structure. This is followed by a description of diffusion-convolutional neural networks, a new model for graph-structured data which is able to outperform probabilistic relational models and kernel-on-graph methods at node classification tasks. We conclude with an optimal privacy-protection method for users of online services that remains effective when users have poor knowledge of an adversary's behavior.
Image Spam Detection, Aneri Chavda
Master's Projects
Email is one of the most common forms of digital communication. Spam can be defined as unsolicited bulk email, while image spam includes spam text embedded inside images. Image spam is used by spammers so as to evade text-based spam filters and hence it poses a threat to email based communication. In this research, we analyze image spam detection methods based on various combinations of image processing and machine learning techniques.
Malware Detection Using The Index Of Coincidence, Bhavna Gurnani
Master's Projects
In this research, we apply the Index of Coincidence (IC) to problems in malware analysis. The IC, which is often used in cryptanalysis of classic ciphers, is a technique for measuring the repeat rate in a string of symbols. A score based on the IC is applied to a variety of challenging malware families. We find that this relatively simple IC score performs surprisingly well, with superior results in comparison to various machine learning based scores, at least in some cases.
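As a rough illustration of the repeat-rate measure named in this abstract, the sketch below computes the textbook Index of Coincidence: the probability that two symbols drawn without replacement from a sequence are equal. The function name `index_of_coincidence` is hypothetical, and the abstract does not say which symbols (bytes, opcodes, or something else) or which scoring scheme the project uses, so this should not be read as the author's method.

```python
from collections import Counter


def index_of_coincidence(symbols):
    """Return the probability that two distinct positions hold the same symbol.

    `symbols` can be any sequence of hashable items, such as a string,
    a bytes object, or a list of opcode names.
    """
    n = len(symbols)
    if n < 2:
        return 0.0  # undefined for fewer than two symbols; 0.0 by convention here
    counts = Counter(symbols)
    # Count ordered pairs of equal symbols, then divide by all ordered pairs.
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
```

A sequence with heavy repetition scores close to 1, while a sequence whose symbols are all distinct scores 0, so the value gives a cheap single-number summary of how repetitive a sample is.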