Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Machine learning

Theses/Dissertations

Articles 31 - 60 of 148

Full-Text Articles in Computer Engineering

Digitalization Of Construction Project Requirements Using Natural Language Processing (NLP) Techniques, Fahad Ul Hassan May 2022

All Dissertations

Contract documents are a critical legal component of a construction project that specify all wishes and expectations of the owner toward the design, construction, and handover of a project. A single contract package, especially of a design-build (DB) project, comprises hundreds of documents including thousands of requirements. Precise comprehension and management of the requirements are critical to ensure that all important explicit and implicit requirements of the project scope are captured, managed, and completed. Since requirements are mainly written in a natural human language, the current manual methods impose a significant burden on practitioners to process and restructure them into …


Data-Driven Framework For Understanding & Modeling Ride-Sourcing Transportation Systems, Bishoy Kelleny May 2022

Civil & Environmental Engineering Theses & Dissertations

Ride-sourcing transportation services offered by transportation network companies (TNCs) like Uber and Lyft are disrupting the transportation landscape. The growing demand for these services, along with their potential short- and long-term impacts on the environment, society, and infrastructure, emphasizes the need to further understand the ride-sourcing system. There have not been sufficient data to fully understand the system and integrate it within regional multimodal transportation frameworks. This can be attributed to commercial and competition reasons, given the technology-enabled and innovative nature of the system. Recently, in 2019, the City of Chicago released extensive and complete ride-sourcing trip-level data for …


Machine Learning Assisted Discovery Of Shape Memory Polymers And Their Thermomechanical Modeling, Cheng Yan Apr 2022

LSU Doctoral Dissertations

As a new class of smart materials, shape memory polymers (SMPs) are gaining great attention in both academia and industry. One challenge is that the chemical space is huge while human intelligence is limited, so the discovery of new SMPs becomes more and more difficult. In this dissertation, by adopting a series of machine learning (ML) methods, two frameworks are established for discovering new thermoset shape memory polymers (TSMPs). Specifically, one of them is performed by a combination of four methods, i.e., the most recently proposed linear notation BigSMILES, supplementing the existing dataset by reasonable approximation, a mixed dimension (1D …


Classification Of Electropherograms Using Machine Learning For Parkinson’s Disease, Soroush Dehghan Jan 2022

Electronic Theses and Dissertations

Parkinson’s disease (PD) is a neurodegenerative movement disorder that progresses gradually over time. The onset of symptoms in people with PD varies from case to case and depends on the progression of the disease in each patient. PD symptoms develop gradually and increasingly impair the patient’s movements over time. An early diagnosis of PD could improve the outcomes of treatments and could potentially delay the progression of this disorder, which makes discovering a new diagnostic method valuable. In this study, I investigate the feasibility of using a machine learning (ML) approach to classify PD …


Design, Development And Benchmarking Of Machine Learning Algorithms In Biomedical Applications, Qi Sun Jan 2022

Theses and Dissertations--Computer Science

Machine learning algorithms are becoming the most effective methods for knowledge discovery from high dimensional datasets. Machine learning seeks to construct predictive models through the analysis of large-scale heterogeneous data. While machine learning has been widely used in many domains including computer vision, natural language processing, and product recommendation, its application in biomedical science for clinical diagnosis and treatment is only emerging. However, the wealth of data in the biomedical domain offers not only challenges but also opportunities for machine learning. In this dissertation, we focus on three biomedical applications from vastly different domains to understand the opportunities and challenges …


Dynamic Instance-Wise Decision-Making For Machine Learning, Yasitha Warahena Liyanage Jan 2022

Legacy Theses & Dissertations (2009 - 2024)

In a typical supervised machine learning setting, the predictions on all test instances are based on a common subset of features discovered during model training. However, using a different subset of features that are most informative for each test instance individually may improve not only the quality of prediction but also the overall interpretability of the model. To this end, in this dissertation, we study the problem of optimizing the trade-off between instance-level sparsity and the quality of prediction using a dynamic instance-wise decision-making approach. Specifically, this approach sequentially reviews features one at a time for each data instance given …


Few-Shot Malware Detection Using A Novel Adversarial Reprogramming Model, Ekula Praveen Kumar Jan 2022

Browse all Theses and Dissertations

The increasing sophistication of malware has made detecting and defending against new strains a major challenge for cybersecurity. One promising approach to this problem is using machine learning techniques that extract representative features and train classification models to detect malware at an early stage. However, training such machine learning-based malware detection models is a significant challenge because it requires a large number of high-quality labeled data samples, which are very costly to obtain in real-world scenarios. In other words, training machine learning models for malware detection requires the capability to learn from only a few labeled examples. To address …


Detecting User Emotions From Audio Conversations With The Smart Assistants, Sunanda Guha Jan 2022

MSU Graduate Theses

With the proliferation of smart home devices like Google Home or Amazon Alexa, significant research endeavors are being carried out to improve the user experience while interacting with these smart assistants. One dimension of this endeavor is ongoing research on successful emotion detection from short voice commands used in smart home environments. Alongside facial expression and body language, speech plays a pivotal role in the classification of emotions in smart home applications. Upon successful implementation of accurate emotion recognition, the smart devices will be able to intelligently and empathetically suggest appropriate actions based on the …


Statistics-Based Anomaly Detection And Correction Method For Amazon Customer Reviews, Ishani Chatterjee Dec 2021

Dissertations

People nowadays use the Internet to project their assessments, impressions, ideas, and observations about various subjects or products on numerous social networking sites. These sites serve as a great source of information for data analytics, sentiment analysis, natural language processing, etc. The most critical challenge is interpreting this data and capturing the sentiment behind these expressions. Sentiment analysis involves analyzing, processing, and drawing inferences from subjective texts that express views. Companies use sentiment analysis to understand public opinions, perform market research, analyze brand reputation, recognize customer experiences, and study social media influence. According to the different needs for aspect granularity, …


On Resource-Efficiency And Performance Optimization In Big Data Computing And Networking Using Machine Learning, Wuji Liu Dec 2021

Dissertations

Due to the rapid transition from traditional experiment-based approaches to large-scale, computationally intensive simulations, next-generation scientific applications typically involve complex numerical modeling and extreme-scale simulations. Such model-based simulations oftentimes generate colossal amounts of data, which must be transferred over high-performance network (HPN) infrastructures to remote sites and analyzed against experimental or observation data on high-performance computing (HPC) facilities. Optimizing the performance of both data transfer in HPN and simulation-based model development on HPC is critical to enabling and accelerating knowledge discovery and scientific innovation. However, such processes generally involve an enormous set of attributes including domain-specific model parameters, network transport …


Detecting Malware In Memory With Memory Object Relationships, Demarcus M. Thomas Sr. Dec 2021

Theses and Dissertations

Malware is a growing concern that affects not only large businesses but also ordinary consumers. As a result, there is a need to develop tools that can identify the malicious activities of malware authors. A useful technique to achieve this is memory forensics. Memory forensics is the study of volatile data and its structures in Random Access Memory (RAM). It can be utilized to pinpoint what actions have occurred on a computer system.

This dissertation utilizes memory forensics to extract relationships between objects and supervised machine learning as a novel method for identifying malicious processes in a system …


Network Management, Optimization And Security With Machine Learning Applications In Wireless Networks, Mariam Nabil Dec 2021

Theses and Dissertations

Wireless communication networks are emerging fast with a lot of challenges and ambitions. Requirements that are expected to be delivered by modern wireless networks are complex, multi-dimensional, and sometimes contradictory. In this thesis, we investigate several types of emerging wireless networks and tackle some challenges of these various networks. We focus on three main challenges: Resource Optimization, Network Management, and Cyber Security. We present multiple views of these three aspects and propose solutions to probable scenarios. The first challenge (Resource Optimization) is studied in Wireless Powered Communication Networks (WPCNs). WPCNs are considered a very promising approach towards sustainable, …


Deepfakes Generated By Generative Adversarial Networks, Olympia A. Paul Nov 2021

Honors College Theses

Deep learning is a type of Artificial Intelligence (AI) that mimics the workings of the human brain in processing data for tasks such as speech recognition, visual object recognition, object detection, language translation, and decision making. A generative adversarial network (GAN) is a special type of deep learning model, designed by Goodfellow et al. (2014), that is commonly built on convolutional neural networks (CNNs). Given a training set, a GAN can generate new data with the same characteristics as the training set, and this is often what we refer to as deepfakes. A CNN takes an input …


Benchmarking Small-Dataset Structure-Activity-Relationship Models For Prediction Of Wnt Signaling Inhibition, Mahtab Kokabi Oct 2021

Masters Theses

Quantitative structure-activity relationship (QSAR) models based on machine learning algorithms are powerful tools to expedite drug discovery processes and therapeutics development. Given the cost in acquiring large-sized training datasets, it is useful to examine if QSAR analysis can reasonably predict drug activity with only a small-sized dataset (size < 100) and benchmark these small-dataset QSAR models in application-specific studies. To this end, here we present a systematic benchmarking study on small-dataset QSAR models built for prediction of effective Wnt signaling inhibitors, which are essential to therapeutics development in prevalent human diseases (e.g., cancer). Specifically, we examined a total of 72 two-dimensional (2D) QSAR models based on 4 best-performing algorithms, 6 commonly used molecular fingerprints, and 3 typical fingerprint lengths. We trained these models using a training dataset (56 compounds), benchmarked their performance on 4 figures-of-merit (FOMs), and examined their prediction accuracy using an external validation dataset (14 compounds). Our data show that the model performance is maximized when: 1) molecular fingerprints are selected to provide sufficient, unique, and not overly detailed representations of the chemical structures of drug compounds; 2) algorithms are selected to reduce the number of false predictions due to class imbalance in the dataset; and 3) models are selected to reach balanced performance on all 4 FOMs. These results may provide general guidelines in developing high-performance small-dataset QSAR models for drug activity prediction.
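
The counts quoted in this abstract can be checked with a short sketch. The abstract does not name the algorithms, fingerprints, or fingerprint lengths, so the labels below are hypothetical placeholders; only the counts (4, 6, 3, 56, and 14) come from the text.

```python
from itertools import product

# Placeholder labels (assumptions): the abstract gives only how many of each
# were used, not which ones.
algorithms = [f"algorithm_{i}" for i in range(1, 5)]      # 4 algorithms
fingerprints = [f"fingerprint_{i}" for i in range(1, 7)]  # 6 molecular fingerprints
lengths = [f"length_{i}" for i in range(1, 4)]            # 3 fingerprint lengths

# One candidate model per (algorithm, fingerprint, length) combination.
model_grid = list(product(algorithms, fingerprints, lengths))
n_models = len(model_grid)

# Dataset sizes stated in the abstract.
n_train, n_validation = 56, 14
n_total = n_train + n_validation
train_fraction = n_train / n_total

print(n_models, n_total, train_fraction)
```

Under these assumptions the grid has 4 × 6 × 3 = 72 entries, matching the 72 models reported, and the 70 compounds are split 80/20 between training and external validation, consistent with the "size < 100" small-dataset setting.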


Data-Driven Learning For Robot Physical Intelligence, Leidi Zhao Aug 2021

Dissertations

Physical intelligence, which emphasizes physical capabilities such as dexterous manipulation and dynamic mobility, is essential for robots to physically coexist with humans. Much research on robot physical intelligence has achieved success on hyper robot motor capabilities, but mostly through heavily case-specific engineering. Meanwhile, in terms of robots acquiring skills in a ubiquitous manner, robot learning from human demonstration (LfD) has achieved great progress, but still has limitations in handling dynamic skills and compound actions. In this dissertation, a composite learning scheme which goes beyond LfD and integrates robot learning from human definition, demonstration, and evaluation is proposed. This method tackles …


Machine Learning For Analog/Mixed-Signal Integrated Circuit Design Automation, Weidong Cao Aug 2021

McKelvey School of Engineering Theses & Dissertations

Analog/mixed-signal (AMS) integrated circuits (ICs) play an essential role in electronic systems by processing analog signals and performing data conversion to bridge the analog physical world and our digital information world. Their ubiquity powers diverse applications ranging from smart devices and autonomous cars to crucial infrastructures. Despite such critical importance, conventional design strategies of AMS circuits still follow an expensive and time-consuming manual process and are unable to meet the exponentially growing productivity demands from industry and satisfy the rapidly changing design specifications from many emerging applications. Design automation of AMS ICs is thus the key to tackling these challenges and has been …


Privacy-Preserving Cloud-Assisted Data Analytics, Wei Bao Jul 2021

Graduate Theses and Dissertations

Nowadays, industries are collecting a massive and exponentially growing amount of data that can be utilized to extract useful insights for improving various aspects of our lives. Data analytics (e.g., via the use of machine learning) has been extensively applied to make important decisions in various real-world applications. However, it is challenging for resource-limited clients to analyze their data in an efficient way when its scale is large. Additionally, the data resources are increasingly distributed among different owners. Moreover, users' data may contain private information that needs to be protected.

Cloud computing has become more and more popular in …


Off-Chain Transaction Routing In Payment Channel Networks: A Machine Learning Approach, Heba Kadry Jun 2021

Theses and Dissertations

Blockchain is a foundational technology that has the potential to create new prospects for our economic and social systems. However, the scalability problem limits the capability to deliver a target throughput and latency, compared to the traditional financial systems, with increasing workload. Layer-two is a collective term for solutions designed to help solve the scalability problem by handling transactions off the main chain, also known as layer one. These solutions have the capability to achieve high throughput, fast settlement, and cost efficiency without sacrificing network security. For example, bidirectional payment channels are utilized to allow the execution of fast transactions between …
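
The idea of a bidirectional payment channel mentioned in this abstract can be illustrated with a toy model. This is a minimal sketch and not the routing method of the thesis: the class name, fields, and party labels are assumptions made for illustration, and real channels additionally rely on signed state updates and on-chain dispute handling, which are omitted here.

```python
from dataclasses import dataclass


@dataclass
class PaymentChannel:
    """Toy two-party channel (hypothetical model, for illustration only)."""

    balance_a: int    # funds party "a" locked into the channel on layer one
    balance_b: int    # funds party "b" locked into the channel on layer one
    updates: int = 0  # number of off-chain balance updates so far

    @property
    def capacity(self) -> int:
        # Total locked funds; off-chain payments never change this value.
        return self.balance_a + self.balance_b

    def pay(self, sender: str, amount: int) -> None:
        """Shift `amount` from `sender` to the other party, off the main chain."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if sender == "a":
            if amount > self.balance_a:
                raise ValueError("insufficient balance for party a")
            self.balance_a -= amount
            self.balance_b += amount
        elif sender == "b":
            if amount > self.balance_b:
                raise ValueError("insufficient balance for party b")
            self.balance_b -= amount
            self.balance_a += amount
        else:
            raise ValueError("sender must be 'a' or 'b'")
        self.updates += 1


channel = PaymentChannel(balance_a=10, balance_b=5)
channel.pay("a", 3)  # a -> b
channel.pay("b", 1)  # b -> a: payments can flow in both directions
print(channel.balance_a, channel.balance_b, channel.capacity, channel.updates)
```

In this sketch only the opening deposits and the final balances would need to touch layer one; the intermediate updates stay between the two parties, which is the property that lets layer-two solutions trade on-chain transactions for fast local bookkeeping.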


Data Mining Of Unstructured Textual Information In Transportation Safety Domain: Exploring Methods, Opportunities And Limitations, Keneth Morgan Kwayu Jun 2021

Dissertations

The unprecedented increase in volume and influx of structured and unstructured data has overwhelmed conventional data management system capabilities in organizing, analyzing, and procuring useful information in a timely fashion. Structured data sources have a pre-defined pattern that makes data preprocessing and information retrieval tasks relatively easy for the current technologies that have been designed to handle structured and repeatable data. Unlike structured data, unstructured data usually exists in an unorganized format that offers no or little insight unless indexed and stored in an organized fashion. The inherent format of unstructured data exacerbates difficulties in data preprocessing and information extraction. …


Impact Assessment, Detection, And Mitigation Of False Data Attacks In Electrical Power Systems, Sagnik Basumallik May 2021

Dissertations - ALL

The global energy market has seen a massive increase in investment and capital flow in the last few decades. This has completely transformed the way power grids operate: legacy systems are now being replaced by advanced smart grid infrastructures that attest to better connectivity and increased reliability. One popular example is the extensive deployment of phasor measurement units (PMUs), which constantly provide time-synchronized phasor measurements at a high resolution compared to conventional meters. This enables system operators to monitor in real time the vast electrical network spanning thousands of miles. However, a targeted cyber attack on …


RedAI: A Machine Learning Approach To Cyber Threat Intelligence, Luke Noel May 2021

Masters Theses, 2020-current

The world is continually demanding more effective and intelligent solutions and strategies to combat adversary groups across the cyber defense landscape. Cyber Threat Intelligence (CTI) is a field within the domain of cyber security that allows for organizations to utilize threat intelligence and serves as a tool for organizations to proactively harden their defense posture. However, there is a large volume of CTI and it is often a daunting task for organizations to effectively consume, utilize, and apply it to their defense strategies. In this thesis we develop a machine learning solution, named RedAI, to investigate whether open-source intelligence (OSINT) …


Human Fatigue Predictions In Complex Aviation Crew Operational Impact Conditions, Suresh Rangan May 2021

Doctoral Dissertations

In the last decade, several regulatory frameworks across the world in all modes of transportation have brought fatigue and its risk management in operations to the forefront. Of all transportation modes, air travel has been the safest. Still, as part of continuous improvement efforts, regulators are insisting that operators adopt strong fatigue science and its foundational principles to reinforce safety risk assessment and management. Fatigue risk management is a data-driven system that finds a realistic balance between safety and productivity in an organization. This work discusses the effects of mathematical modeling of fatigue and its …


Multi-Style Explainable Matrix Factorization Techniques For Recommender Systems, Olurotimi Nugbepo Seton May 2021

Electronic Theses and Dissertations

Black-box recommender system models are machine learning models that generate personalized recommendations without explaining to the user how the recommendations were generated or giving them a way to correct wrong assumptions made about them by the model. However, compared to white-box models, which are transparent and scrutable, black-box models are generally more accurate. Recent research has shown that accuracy alone is not sufficient for user satisfaction. One such black-box model is Matrix Factorization, a state-of-the-art recommendation technique that is widely used due to its ability to deal with sparse data sets and to produce accurate recommendations. Recent …


Machine Learning Approaches For Lung Cancer Diagnosis, Ahmed Mahmoud Ahmed Shaffie May 2021

Electronic Theses and Dissertations

The enormity of changes and development in the field of medical imaging technology is hard to fathom: it not only represents the technique and process of constructing visual representations of the inside of the body for medical analysis and revealing the internal structure of different organs under the skin, but also provides a noninvasive way to diagnose various diseases and suggests efficient ways to treat them. While data surrounding all of our lives are stored and collected to be ready for analysis by data scientists, medical images are considered a rich source that could provide …


An Inside Vs. Outside Classification System For Wi-Fi IoT Devices, Paul Gralla Apr 2021

Dartmouth College Undergraduate Theses

We are entering an era in which Smart Devices are increasingly integrated into our daily lives. Everyday objects are gaining computational power to interact with their environments and communicate with each other and the world via the Internet. While the integration of such devices offers many potential benefits to their users, it also gives rise to a unique set of challenges. One of those challenges is to detect whether a device belongs to one’s own ecosystem, or to a neighbor – or represents an unexpected adversary. An important part of determining whether a device is friend or adversary is to …


A Tiered Recommender System For Cost-Effective Cloud Instance Selection, Xusheng Ai Jan 2021

University of the Pacific Theses and Dissertations

Cloud computing has greatly impacted the scientific community and end users. By leveraging cloud computing, small research institutions and undergraduate colleges are able to reduce costs and achieve research goals without purchasing and maintaining all the hardware and software. In addition, cloud computing allows researchers to access resources as their teams require and allows real-time collaboration with team members across the globe. Nowadays, however, users are easily overwhelmed by the wide range of cloud servers and instances. Due to differences between the cloud server platforms and between instances within the platform, users find it difficult to identify the right …


IoT Malicious Traffic Classification Using Machine Learning, Michael Austin Jan 2021

Graduate Theses, Dissertations, and Problem Reports

Although desktops and laptops have historically composed the bulk of botnet nodes, Internet of Things (IoT) devices have become more recent targets. Lightbulbs, outdoor cameras, watches, and many other small items are connected to WiFi and each other; and few have well-developed security or hardening. Research on botnets typically leverages honeypots, PCAPs, and network traffic analysis tools to develop detection models. The research questions addressed in this Problem Report are: (1) What machine learning algorithm performs the best in a binary classification task for a representative dataset of malicious and benign IoT traffic; and (2) What features have the most …


Unobtrusive Assessment Of Student Engagement Levels In Online Classroom Environment Using Emotion Analysis, Sasirekha Anbusegaran Jan 2021

Electronic Theses and Dissertations

Measuring student engagement has emerged as a significant factor in the process of learning and a good indicator of the knowledge retention capacity of the student. As synchronous online classes have become more prevalent in recent years, gauging a student's attention level is more critical in validating the progress of every student in an online classroom environment. This paper details the study on profiling the student attentiveness to different gradients of engagement level using multiple machine learning models. Results from the high accuracy model and the confidence score obtained from the cloud-based computer vision platform - Amazon Rekognition were then …


Texture-Driven Image Clustering In Laser Powder Bed Fusion, Alexander H. Groeger Jan 2021

Browse all Theses and Dissertations

The additive manufacturing (AM) field is striving to identify anomalies in laser powder bed fusion (LPBF) using multi-sensor in-process monitoring paired with machine learning (ML). In-process monitoring can reveal the presence of anomalies, but creating an ML classifier requires labeled data. The present work approaches this problem by printing hundreds of Inconel-718 coupons with different processing parameters to capture a wide range of process monitoring imagery with multiple sensor types. Afterwards, the process monitoring images are encoded into feature vectors and clustered to isolate groups in each sensor modality. Four texture representations were learned by training two convolutional neural network …


Analysis Of Classifier Weaknesses Based On Patterns And Corrective Methods, Nicholas Skapura Jan 2021

Browse all Theses and Dissertations

Classification is an important branch of machine learning that impacts many areas of modern life. Many classification algorithms (classifiers for short) have been developed. They have highly different levels of sophistication and classification accuracy. Classification problems often have highly different levels of hardness and complexity. Practitioners of classification modeling need better understanding of those algorithms in order to select the optimal algorithm for given classification problems. Researchers of classification need new insight on how given classifiers are weak and how they can be improved by correcting their classification errors. This dissertation introduces new tools and concepts to analyze classifier weakness …