Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Software Engineering

Machine learning

Articles 31 - 60 of 65

Full-Text Articles in Physical Sciences and Mathematics

Development Of Fully Balanced SSFP And Computer Vision Applications For MRI-Assisted Radiosurgery (MARS), Jeremiah Sanders May 2020

Dissertations & Theses (Open Access)

Prostate cancer is the second most common cancer in men and the second-leading cause of cancer death in men. Brachytherapy is a highly effective treatment option for prostate cancer and is the most cost-effective initial treatment among all therapeutic options for patients with low- to intermediate-risk prostate cancer. In low-dose-rate (LDR) brachytherapy, verifying the location of the radioactive seeds within the prostate and in relation to critical normal structures after seed implantation is essential to ensuring positive treatment outcomes.

One current gap in knowledge is how to simultaneously image the prostate, surrounding anatomy, and radioactive seeds within the …


Applying Imitation And Reinforcement Learning To Sparse Reward Environments, Haven Brown May 2020

Computer Science and Computer Engineering Undergraduate Honors Theses

The focus of this project was to shorten the time it takes to train reinforcement learning agents to perform better than humans in a sparse reward environment. Finding a general purpose solution to this problem is essential to creating agents in the future capable of managing large systems or performing a series of tasks before receiving feedback. The goal of this project was to create a transition function between an imitation learning algorithm (also referred to as a behavioral cloning algorithm) and a reinforcement learning algorithm. The goal of this approach was to allow an agent to first learn to …


Treatment Effects Of Modafinil For Cocaine Use Disorders: A Retrospective Analysis Of Aggregated Clinical Trial Data From Three Cocaine Treatment Studies, Daniel Ruskin Mar 2020

Honors Scholar Theses

Approximately 913,000 individuals in the United States meet the diagnostic criteria for cocaine use disorder (CUD). The widespread usage of cocaine, along with the negative cardiac and neurological effects associated with the drug, has made cocaine one of the top three drugs associated with overdose deaths in the United States. This epidemic has brought cocaine dependency into the public spotlight and has prompted extensive research into treatment strategies. However, at the time of writing, no drugs have been approved by the United States Food and Drug Administration (FDA) for use in treating CUD. The purpose of this study is to …


Graph Classification With Kernels, Embeddings And Convolutional Neural Networks, Monica Golahalli Seenappa, Katerina Potika, Petros Potikas Mar 2020

Faculty Publications, Computer Science

In the graph classification problem, we are given a family of graphs and a set of categories, and we aim to classify every graph in the family into one of the given categories. Earlier approaches, such as graph kernels and graph embedding techniques, have focused on extracting certain features by processing the entire graph. However, real-world graphs are complex and noisy, and these traditional approaches are computationally intensive. With the introduction of the deep learning framework, there have been numerous attempts to create more efficient classification approaches. We modify a kernel graph convolutional neural network approach that extracts subgraphs (patches) …


Are The Code Snippets What We Are Searching For? A Benchmark And An Empirical Study On Code Search With Natural-Language Queries, Shuhan Yan, Hang Yu, Yuting Chen, Beijun Shen Feb 2020

Research Collection School Of Computing and Information Systems

Code search methods, especially those that allow programmers to raise queries in a natural language, play an important role in software development. They help to improve programmers' productivity by returning sample code snippets from the Internet and/or source-code repositories for their natural-language queries. Meanwhile, there are many code search methods in the literature that support natural-language queries. Difficulties exist in recognizing the strengths and weaknesses of each method and choosing the right one for different usage scenarios, because (1) the implementations of those methods and the datasets for evaluating them are usually not publicly available, and (2) some methods leverage …


Learning-Guided Network Fuzzing For Testing Cyber-Physical System Defences, Yuqi Chen, Christopher M. Poskitt, Jun Sun, Sridhar Adepu, Fan Zhang Jan 2020

Research Collection School Of Computing and Information Systems

The threat of attack faced by cyber-physical systems (CPSs), especially when they play a critical role in automating public infrastructure, has motivated research into a wide variety of attack defence mechanisms. Assessing their effectiveness is challenging, however, as realistic sets of attacks to test them against are not always available. In this paper, we propose smart fuzzing, an automated, machine learning guided technique for systematically finding 'test suites' of CPS network attacks, without requiring any knowledge of the system's control programs or physical processes. Our approach uses predictive machine learning models and metaheuristic search algorithms to guide the fuzzing of …


Computer Vision Gesture Recognition For Rock Paper Scissors, Nicholas Hunter Jan 2020

Senior Independent Study Theses

This project implements a human versus computer game of rock-paper-scissors using machine learning and computer vision. Players' hand gestures are detected using single images with the YOLOv3 object detection system. This provides a generalized detection method which can recognize player moves without the need for a special background or lighting setup. Additionally, past moves are examined in context to predict the most probable next move of the system’s opponent. In this way, the system achieves higher win rates against human opponents than by using a purely random strategy.
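
The abstract does not say how past moves are used to predict the opponent's next move. As a minimal sketch of one possible approach, and not the thesis's actual method, the example below counts which move has followed each previous move (a first-order frequency model) and plays the move that beats the most likely next one. The class name `NextMovePredictor` and its methods are hypothetical, introduced only for illustration.

```python
import random
from collections import Counter, defaultdict

MOVES = ("rock", "paper", "scissors")
# Maps each move to the move that defeats it.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


class NextMovePredictor:
    """Hypothetical first-order model: counts which move followed each move."""

    def __init__(self, seed=None):
        self.transitions = defaultdict(Counter)  # previous move -> counts of next move
        self.last_move = None
        self.rng = random.Random(seed)

    def observe(self, move):
        """Record the opponent's latest move."""
        if move not in MOVES:
            raise ValueError(f"unknown move: {move!r}")
        if self.last_move is not None:
            self.transitions[self.last_move][move] += 1
        self.last_move = move

    def predict(self):
        """Return the opponent's most frequent follow-up to their last move."""
        counts = self.transitions.get(self.last_move)
        if not counts:
            # No history for this context yet: fall back to a random guess.
            return self.rng.choice(MOVES)
        return counts.most_common(1)[0][0]

    def counter_move(self):
        """Return the move that beats the predicted opponent move."""
        return BEATS[self.predict()]


predictor = NextMovePredictor(seed=0)
for move in ["rock", "paper", "rock", "paper", "rock"]:
    predictor.observe(move)
print(predictor.counter_move())
```

In this sketch the predictor falls back to a random move when it has no history for the current context, so it starts out equivalent to the purely random baseline and can only depart from it once the opponent's moves show a pattern.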


Renewable Energy Integration In Distribution System With Artificial Intelligence, Yi Gu Jan 2020

Electronic Theses and Dissertations

With increasing attention to renewable energy development in distribution power systems, artificial intelligence (AI) can play an indispensable role. In this thesis, a series of artificial intelligence-based methods are studied and implemented to further enhance the performance of power system operation and control.

Due to the large volume of heterogeneous data provided by both the customer and the grid side, a big data visualization platform is built to surface hidden, useful knowledge for smart grid (SG) operation, control, and situation awareness. An open-source cluster-computing framework, Apache Spark, is used to discover big data …


Development Of Machine Learning Tutorials For R, John Pintar Jan 2020

All Undergraduate Theses and Capstone Projects

Machine learning (ML) techniques developed in computer science have revolutionized nearly every sector of industry. Despite the prevalence and usefulness of ML, students outside of computer science rarely receive training in ML. Students frequently receive training in statistical analysis, often using the software package R, which is free, open source, and has additional downloadable modules. A popular module is the ML package caret, which contains 238 different ML algorithms, each with 0-9 hyperparameters. caret is powerful, flexible, and provides consistent syntax across algorithms. In the hands of an experienced practitioner, this tunability is welcomed and can increase accuracy. However, when …


Countering Cybersecurity Vulnerabilities In The Power System, Fengli Zhang Dec 2019

Graduate Theses and Dissertations

Security vulnerabilities in software pose an important threat to power grid security, which can be exploited by attackers if not properly addressed. Every month, many vulnerabilities are discovered and all the vulnerabilities must be remediated in a timely manner to reduce the chance of being exploited by attackers. In current practice, security operators have to manually analyze each vulnerability present in their assets and determine the remediation actions in a short time period, which involves a tremendous amount of human resources for electric utilities. To solve this problem, we propose a machine learning-based automation framework to automate vulnerability analysis and …


ML4IoT: A Framework To Orchestrate Machine Learning Workflows On Internet Of Things Data, Jose Miguel Alves, Leonardo Honorio, Miriam A M Capretz Oct 2019

Electrical and Computer Engineering Publications

Internet of Things (IoT) applications generate vast amounts of real-time data. Temporal analysis of these data series to discover behavioural patterns may lead to qualified knowledge affecting a broad range of industries. Hence, the use of machine learning (ML) algorithms over IoT data has the potential to improve safety, economy, and performance in critical processes. However, creating ML workflows at scale is a challenging task that depends upon both production and specialized skills. Such tasks require investigation, understanding, selection, and implementation of specific ML workflows, which often lead to bottlenecks, production issues, and code management complexity and even then may …


How Does Machine Learning Change Software Development Practices?, Zhiyuan Wan, Xin Xia, David Lo, Gail C. Murphy Aug 2019

Research Collection School Of Computing and Information Systems

Adding an ability for a system to learn inherently adds uncertainty into the system. Given the rising popularity of incorporating machine learning into systems, we wondered how the addition alters software development practices. We performed a mixture of qualitative and quantitative studies with 14 interviewees and 342 survey respondents from 26 countries across four continents to elicit significant differences between the development of machine learning systems and the development of non-machine-learning systems. Our study uncovers significant differences in various aspects of software engineering (e.g., requirements, design, testing, and process) and work characteristics (e.g., skill variety, problem solving and task identity). …


DeepReview: Automatic Code Review Using Deep Multi-Instance Learning, Hengyi Li, Shuting Shi, Ferdian Thung, Xuan Huo, Bowen Xu, Ming Li, David Lo Apr 2019

Research Collection School Of Computing and Information Systems

Code review, an inspection of code changes in order to identify and fix defects before integration, is essential in Software Quality Assurance (SQA). Code review is a time-consuming task since the reviewers need to understand, analyze, and provide comments manually. To alleviate the burden on reviewers, automatic code review is needed. However, this task has not been well studied before. To bridge this research gap, in this paper, we formalize automatic code review as a multi-instance learning task in which each change consisting of multiple hunks is regarded as a bag, and each hunk is described as an instance. We propose …


Learning To Map The Visual And Auditory World, Tawfiq Salem Jan 2019

Theses and Dissertations--Computer Science

The appearance of the world varies dramatically not only from place to place but also from hour to hour and month to month. Billions of images that capture this complex relationship are uploaded to social-media websites every day and often are associated with precise time and location metadata. This rich source of data can be beneficial to improve our understanding of the globe. In this work, we propose a general framework that uses these publicly available images for constructing dense maps of different ground-level attributes from overhead imagery. In particular, we use well-defined probabilistic models and a weakly-supervised, multi-task training …


Cleaver: Classification Of Everyday Activities Via Ensemble Recognizers, Samantha Hsu Dec 2018

Master's Theses

Physical activity can have immediate and long-term benefits on health and reduce the risk for chronic diseases. Valid measures of physical activity are needed in order to improve our understanding of the exact relationship between physical activity and health. Activity monitors have become a standard for measuring physical activity; accelerometers in particular are widely used in research and consumer products because they are objective, inexpensive, and practical. Previous studies have experimented with different monitor placements and classification methods. However, the majority of these methods were developed using data collected in controlled, laboratory-based settings, which is not reliably representative of real …


A Model-Based AI-Driven Test Generation System, Dionny Santiago Nov 2018

FIU Electronic Theses and Dissertations

Achieving high software quality today involves manual analysis, test planning, documentation of testing strategy and test cases, and development of automated test scripts to support regression testing. This thesis is motivated by the opportunity to bridge the gap between current test automation and true test automation by investigating learning-based solutions to software testing. We present an approach that combines a trainable web component classifier, a test case description language, and a trainable test generation and execution system that can learn to generate new test cases. Training data was collected and hand-labeled across 7 systems, 95 web pages, and 17,360 elements. …


Identifying Elderlies At Risk Of Becoming More Depressed With Internet-Of-Things, Jiajue Ou, Huiguang Liang, Hwee Xian Tan Jul 2018

Research Collection School Of Computing and Information Systems

Depression in the elderly is common and dangerous. Current methods to monitor elderly depression, however, are costly, time-consuming and inefficient. In this paper, we present a novel depression-monitoring system that infers an elderly person's changes in depression level based on his/her activity patterns, extracted from wireless sensor data. To do so, we build predictive models to learn the relationship between depression level changes and behaviors using historical data. We also deploy the system for a group of elderly people, in their homes, and run the experiments for more than one year. Our experimental study gives encouraging results, suggesting that our IoT system …


Advanced Malware Detection For Android Platform, Ke Xu Jun 2018

Dissertations and Theses Collection (Open Access)

In the first quarter of 2018, 75.66% of smartphone sales were devices running Android. Due to its popularity, cyber-criminals have increasingly targeted this ecosystem. Malware running on Android severely violates end users' security and privacy, allowing many attacks such as defeating two-factor authentication of mobile banking applications, capturing real-time voice calls, and leaking sensitive information. In this dissertation, I describe the pieces of work that I have done to effectively detect malware on the Android platform, i.e., an ICC-based malware detection system (ICCDetector), a multi-layer malware detection system (DeepRefiner), and a self-evolving and scalable malware detection system (DroidEvolver) …


Learning From Mutants: Using Code Mutation To Learn And Monitor Invariants Of A Cyber-Physical System, Yuqi Chen, Christopher M. Poskitt, Jun Sun May 2018

Research Collection School Of Computing and Information Systems

Cyber-physical systems (CPS) consist of sensors, actuators, and controllers all communicating over a network; if any subset becomes compromised, an attacker could cause significant damage. With access to data logs and a model of the CPS, the physical effects of an attack could potentially be detected before any damage is done. Manually building a model that is accurate enough in practice, however, is extremely difficult. In this paper, we propose a novel approach for constructing models of CPS automatically, by applying supervised machine learning to data traces obtained after systematically seeding their software components with faults ("mutants"). We demonstrate the …


Anomaly Detection For A Water Treatment System Using Unsupervised Machine Learning, Jun Inoue, Yoriyuki Yamagata, Yuqi Chen, Christopher M. Poskitt, Jun Sun Nov 2017

Research Collection School Of Computing and Information Systems

In this paper, we propose and evaluate the application of unsupervised machine learning to anomaly detection for a Cyber-Physical System (CPS). We compare two methods: Deep Neural Networks (DNN) adapted to time series data generated by a CPS, and one-class Support Vector Machines (SVM). These methods are evaluated against data from the Secure Water Treatment (SWaT) testbed, a scaled-down but fully operational raw water purification plant. For both methods, we first train detectors using a log generated by SWaT operating under normal conditions. Then, we evaluate the performance of both methods using a log generated by SWaT operating under 36 …


Inferring Spread Of Readers’ Emotion Affected By Online News, Agus Sulistya, Ferdian Thung, David Lo Sep 2017

Research Collection School Of Computing and Information Systems

Depending on the reader, a news article may be viewed from many different perspectives, thus triggering different (and possibly contradicting) emotions. In this paper, we formulate a problem of predicting readers’ emotion distribution affected by a news article. Our approach analyzes affective annotations provided by readers of news articles taken from a non-English online news site. We create a new corpus from the annotated articles, and build a domain-specific emotion lexicon and word embedding features. We finally construct a multi-target regression model from a set of features extracted from online news articles. Our experiments show that by combining lexicon and …


SugarMate: Non-Intrusive Blood Glucose Monitoring With Smartphones, Weixi Gu, Yuxun Zhou, Zimu Zhou, Xi Liu, Han Zou, Pei Zhang, Costas J. Spanos, Lin Zhang Sep 2017

Research Collection School Of Computing and Information Systems

Inferring abnormal glucose events such as hyperglycemia and hypoglycemia is crucial for the health of both diabetic patients and non-diabetic people. However, regular blood glucose monitoring can be invasive and inconvenient in everyday life. We present SugarMate, a first smartphone-based blood glucose inference system as a temporary alternative to continuous blood glucose monitors (CGM) when they are uncomfortable or inconvenient to wear. In addition to the records of food, drug and insulin intake, it leverages smartphone sensors to measure physical activities and sleep quality automatically. Provided with the imbalanced and often limited measurements, a challenge of SugarMate is the inference …


Employing Smartwatch For Enhanced Password Authentication, Bing Chang, Ximing Liu, Yingjiu Li, Pingjian Wang, Wen-Tao Zhu, Zhan Wang Jun 2017

Research Collection School Of Computing and Information Systems

This paper presents an enhanced password authentication scheme by systematically exploiting the motion sensors in a smartwatch. We extract unique features from the sensor data when a smartwatch bearer types his/her password (or PIN), and train certain machine learning classifiers using these features. We then implement smartwatch-aided password authentication using the classifiers. Our scheme is user-friendly since it does not require users to perform any additional actions when typing passwords or PINs other than wearing smartwatches. We conduct a user study involving 51 participants on the developed prototype so as to evaluate its feasibility and performance. Experimental results show that …


K-Mer Analysis Pipeline For Classification Of DNA Sequences From Metagenomic Samples, Russell Kaehler Jan 2017

Graduate Student Theses, Dissertations, & Professional Papers

Biological sequence datasets are increasing at a prodigious rate. The volume of data in these datasets surpasses what is observed in many other fields of science. New developments wherein metagenomic DNA from complex bacterial communities is recovered and sequenced are producing a new kind of data known as metagenomic data, which is comprised of DNA fragments from many genomes. Developing a utility to analyze such metagenomic data and predict the sample class from which it originated has many possible implications for ecological and medical applications. Within this document is a description of a series of analytical techniques used to process …


Collective Personalized Change Classification With Multiobjective Search, Xin Xia, David Lo, Xinyu Wang, Xiaohu Yang Dec 2016

Research Collection School Of Computing and Information Systems

Many change classification techniques have been proposed to identify defect-prone changes. These techniques consider all developers' historical change data to build a global prediction model. In practice, since developers have their own coding preferences and behavioral patterns, which cause different defect patterns, a separate change classification model for each developer can help to improve performance. Jiang, Tan, and Kim refer to this problem as personalized change classification, and they propose PCC+ to solve this problem. A software project has a number of developers; for a developer, building a prediction model not only based on his/her change data, but also on …


Predicting Changes To Source Code, Justin James Roll Apr 2016

Master's Theses

Organizations typically use issue tracking systems (ITS) such as Jira to plan software releases and assign requirements to developers. Organizations typically also use source control management (SCM) repositories such as Git to track historical changes to a code-base. These ITS and SCM repositories contain valuable data that remains largely untapped. As developers churn through an organization, it becomes expensive for developers to spend time determining which software artifact must be modified to implement a requirement. In this work we created, developed, tested and evaluated a tool called Class Change Predictor, otherwise known as CCP, for predicting which class will implement …


Algorithmic Music Composition And Accompaniment Using Neural Networks, Daniel Wilton Risdon Jan 2016

Senior Projects Spring 2016

The goal of this project was to use neural networks as a tool for live music performance. Specifically, the intention was to adapt a preexisting neural network code library to work in Max, a visual programming language commonly used to create instruments and effects for electronic music and audio processing. This was done using ConvNetJS, a JavaScript library created by Andrej Karpathy.

Several neural network models were trained using a range of different training data, including music from various genres. The resulting neural network-based instruments were used to play brief pieces of music, which they used as input to create …


Energy Forecasting For Event Venues: Big Data And Prediction Accuracy, Katarina Grolinger, Alexandra L'Heureux, Miriam A M Capretz, Luke Seewald Dec 2015

Electrical and Computer Engineering Publications

Advances in sensor technologies and the proliferation of smart meters have resulted in an explosion of energy-related data sets. These Big Data have created opportunities for development of new energy services and a promise of better energy management and conservation. Sensor-based energy forecasting has been researched in the context of office buildings, schools, and residential buildings. This paper investigates sensor-based forecasting in the context of event-organizing venues, which present an especially difficult scenario due to large variations in consumption caused by the hosted events. Moreover, the significance of the data set size, specifically the impact of temporal granularity, on energy …


Energy Cost Forecasting For Event Venues, Katarina Grolinger, Andrea Zagar, Miriam A M Capretz, Luke Seewald Jan 2015

Electrical and Computer Engineering Publications

Electricity price, consumption, and demand forecasting has been a topic of research interest for a long time. The proliferation of smart meters has created new opportunities in energy prediction. This paper investigates energy cost forecasting in the context of entertainment event-organizing venues, which poses significant difficulty due to fluctuations in energy demand and wholesale electricity prices. The objective is to predict the overall cost of energy consumed during an entertainment event. Predictions are carried out separately for each event category and feature selection is used to select the most effective combination of event attributes for each category. Three machine learning …


Improving Structural Features Prediction In Protein Structure Modeling, Ashraf Yaseen Jul 2014

Computer Science Theses & Dissertations

Proteins play a vital role in the biological activities of all living species. In nature, a protein folds into a specific and energetically favorable three-dimensional structure which is critical to its biological function. Hence, there has been a great effort by researchers in both experimentally determining and computationally predicting the structures of proteins.

The current experimental methods of protein structure determination are complicated, time-consuming, and expensive. On the other hand, the sequencing of proteins is fast, simple, and relatively less expensive. Thus, the gap between the number of known sequences and the determined structures is growing, and is expected to …