Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Machine Learning

Articles 1 - 30 of 66

Full-Text Articles in Computer Engineering

Identifying Hourly Traffic Patterns With Python Deep Learning, Christopher L. Leavitt Jun 2019

Computer Engineering

This project was designed to explore and analyze the usefulness of applying machine learning models to data collected by parking sensors at a major metro shopping mall. By examining patterns in the rates at which customers enter and exit parking garages on the campus of the Bellevue Collection shopping mall in Bellevue, Washington, a recurrent neural network that uses data points from the previous hours will be trained to forecast future trends.
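As a rough illustration of how "data points from the previous hours" can be turned into training examples for a forecasting model, the sketch below slices an hourly series into (history, next value) pairs. The function name, window length, and counts are assumptions made for this example; they are not taken from the project.

```python
from typing import List, Tuple


def make_windows(counts: List[float], window: int) -> List[Tuple[List[float], float]]:
    """Pair each run of `window` consecutive hourly counts with the next hour's count."""
    if window <= 0:
        raise ValueError("window must be positive")
    pairs = []
    for start in range(len(counts) - window):
        history = counts[start:start + window]   # model input: previous hours
        target = counts[start + window]          # model output: the following hour
        pairs.append((history, target))
    return pairs


# Hypothetical hourly garage entry counts (made-up numbers, not project data).
hourly_entries = [12, 30, 55, 80, 64, 41]
for history, target in make_windows(hourly_entries, window=3):
    print(history, "->", target)
```

Pairs of this shape are what a recurrent network (or any other sequence model) would typically be fit on; the model itself is left out here.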


Investigating The Use Of Bayesian Networks For Small Dataset Problems, Anastacia Maria MacAllister Jan 2019

Anastacia MacAllister

Benefits associated with machine learning are extensive. Industry is increasingly beginning to recognize the wealth of information stored in the data they are collecting. To sort through and analyze all of this data, specialized tools are required to come up with actionable strategies. Often this is done with supervised machine learning algorithms. While these algorithms can be extremely powerful data analysis tools, they require considerable understanding, expertise, and a significant amount of data to use. Selecting the appropriate data analysis method is important to coming up with valid strategies based on the collected data. In addition, a characteristic of machine ...


Comparative Study Of Sentiment Analysis With Product Reviews Using Machine Learning And Lexicon-Based Approaches, Heidi Nguyen, Aravind Veluchamy, Mamadou Diop, Rashed Iqbal Jan 2019

SMU Data Science Review

In this paper, we present a comparative study of text sentiment classification models using term frequency-inverse document frequency (TF-IDF) vectorization in both supervised machine learning and lexicon-based techniques. There have been multiple promising machine learning and lexicon-based techniques, but the relative merit of each approach on specific types of problems is not well understood. In order to offer researchers comprehensive insights, we compare a total of six algorithms to each other. The three machine learning algorithms are: Logistic Regression (LR), Support Vector Machine (SVM), and Gradient Boosting. The three lexicon-based algorithms are: Valence Aware Dictionary and Sentiment Reasoner (VADER), Pattern ...
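For readers unfamiliar with the vectorization step named in this abstract, the following is a minimal sketch of textbook TF-IDF weighting (term frequency times log of inverse document frequency). It is not the paper's pipeline, and libraries commonly use smoothed or normalized variants; the function name and the toy reviews are assumptions for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each term by (count / doc length) * log(N / number of docs containing it)."""
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term at most once per document
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Toy, made-up product reviews, already tokenized.
reviews = [["good", "phone"], ["bad", "phone"]]
for vector in tf_idf(reviews):
    print(vector)
```

A term that appears in every document (here "phone") gets weight zero, which is why TF-IDF emphasizes the words that distinguish one review from another.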


Improving VIX Futures Forecasts Using Machine Learning Methods, James Hosker, Slobodan Djurdjevic, Hieu Nguyen, Robert Slater Jan 2019

SMU Data Science Review

The problem of forecasting market volatility is a difficult task for most fund managers. Volatility forecasts are used for risk management, alpha (risk) trading, and the reduction of trading friction. Improving the forecasts of future market volatility assists fund managers in adding or reducing risk in their portfolios as well as in increasing hedges to protect their portfolios in anticipation of a market sell-off event. Our analysis compares three existing financial models that forecast future market volatility using the Chicago Board Options Exchange Volatility Index (VIX) to six machine/deep learning supervised regression methods. This analysis determines which models provide ...


An Investigation Of Three Subjective Rating Scales Of Mental Workload In Third Level Education, Nha Vu Thanh Nguyen Jan 2019

Dissertations

Mental Workload assessment in educational settings is still recognized as an open research problem. Although its application is useful for instructional design, it is still unclear how it can be formally shaped and which factors compose it. This paper is aimed at investigating a set of features believed to shape the construct of mental workload and aggregating them together in models trained with supervised machine learning techniques. In detail, multiple linear regression and decision trees have been chosen for training models with features extracted respectively from the NASA Task Load Index and the Workload Profile, well-known self-reporting instruments for assessing ...


Predicting Customer Retention Of An App-Based Business Using Supervised Machine Learning, Jeswin Jose Jan 2019

Dissertations

Identification of retainable customers is essential for the functioning and growth of any business. Effective identification of retainable customers can help the business to identify the reasons for retention and plan its marketing strategies accordingly. This research is aimed at developing a machine learning model that can precisely predict the retainable customers from the total customer data of an e-learning business. Building predictive models that can efficiently classify imbalanced data is a major challenge in data mining and machine learning. Most machine learning algorithms deliver suboptimal performance when introduced to an imbalanced dataset. A variety ...


Crude Oil Prices Forecasting: Time Series Vs. SVR Models, Xin James He Dec 2018

Journal of International Technology and Information Management

This research explores weekly crude oil price data from the U.S. Energy Information Administration over the period 2009 - 2017 to test forecasting accuracy by comparing time series models such as simple exponential smoothing (SES), moving average (MA), and autoregressive integrated moving average (ARIMA) against machine learning support vector regression (SVR) models. The main purpose of this research is to determine which model provides the best forecasting results for crude oil prices in light of the importance of crude oil price forecasting and its implications to the economy. While SVR is often considered the best forecasting model in ...
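Two of the baseline time series models named in this abstract are simple enough to sketch directly. The code below is a minimal, assumed formulation of a one-step moving average forecast and of simple exponential smoothing; the smoothing factor, window, and prices are invented for illustration and are not the study's settings or data.

```python
from typing import List


def moving_average_forecast(series: List[float], window: int) -> float:
    """Forecast the next value as the mean of the last `window` observations."""
    if not 0 < window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


def ses_forecast(series: List[float], alpha: float) -> float:
    """Simple exponential smoothing: level = alpha * value + (1 - alpha) * level."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialize the level with the first observation
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level  # the final level is the one-step-ahead forecast


# Hypothetical weekly prices in USD per barrel (made-up numbers).
prices = [52.0, 53.5, 51.0, 54.0, 55.5]
print(moving_average_forecast(prices, window=3))
print(ses_forecast(prices, alpha=0.4))
```

A larger alpha makes the SES forecast track recent prices more closely, while a smaller alpha smooths out short-term swings.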


Evaluating Load Adjusted Learning Strategies For Client Service Levels Prediction From Cloud-Hosted Video Servers, Ruairí De Fréin, Obinna Izima, Mark Davis Dec 2018

Conference papers

Network managers who succeed in improving the accuracy of client video service level predictions, where the video is deployed in a cloud infrastructure, will have the ability to deliver responsive, SLA-compliant service to their customers. Meeting up-time guarantees, achieving rapid first-call resolution, and minimizing time-to-recovery after video service outages will maintain customer loyalty.

To date, regression-based models have been applied to generate these predictions for client machines using the kernel metrics of a server cluster. The effect of time-varying loads on cloud-hosted video servers, which arise due to dynamic user requests, has not been leveraged to improve prediction ...


Localization Using Convolutional Neural Networks, Shannon D. Fong Dec 2018

Computer Engineering

With the increased accessibility of powerful GPUs, the ability to develop machine learning algorithms has increased significantly. Coupled with open source deep learning frameworks, average users are now able to experiment with convolutional neural networks (CNNs) to solve novel problems. This project sought to train a CNN capable of classifying between various locations within a building. A single continuous video was taken while standing at each desired location so that every class in the neural network was represented by a single video. Each location was given a number to be used for classification and the video was subsequently titled locX. These ...


Automatic Identification Of Animals In The Wild: A Comparative Study Between C-Capsule Networks And Deep Convolutional Neural Networks, Joel Kamdem Teto, Ying Xie Nov 2018

Master of Science in Computer Science Theses

The evolution of machine learning and computer vision in technology has driven a lot of improvements and innovation into several domains. We see it being applied for credit decisions, insurance quotes, malware detection, fraud detection, email composition, and any other area having enough information to allow the machine to learn patterns. Over the years the number of sensors, cameras, and cognitive pieces of equipment placed in the wilderness has been growing exponentially. However, the resources (human) to leverage these data into something meaningful are not improving at the same rate. For instance, a team of scientist volunteers took 8.4 ...


On The Feasibility Of Profiling, Forecasting And Authenticating Internet Usage Based On Privacy Preserving Netflow Logs, Soheil Sarmadi Nov 2018

Graduate Theses and Dissertations

Understanding Internet user behavior and Internet usage patterns is fundamental in developing future access networks and services that meet technical as well as Internet user needs. User behavior is routinely studied and measured, but with different methods depending on the research discipline of the investigator, and these disciplines rarely cross. We tackle this challenge by developing frameworks that use Internet usage statistics as the main features in understanding Internet user behaviors, with the purpose of finding a complete picture of the user behavior and working towards a unified analysis methodology. In this dissertation we collected Internet usage statistics via ...


A Comprehensive Framework To Replicate Process-Level Concurrency Faults, Supat Rattanasuksun Nov 2018

Computer Science and Engineering: Theses, Dissertations, and Student Research

Concurrency faults are one of the most damaging types of faults that can affect the dependability of today’s computer systems. Currently, concurrency faults such as process-level races, order violations, and atomicity violations represent the largest class of faults that has been reported to various Linux bug repositories. Clearly, existing approaches for testing such faults during software development processes are not adequate as these faults escape in-house testing efforts and are discovered during deployment and must be debugged.

The main reason concurrency faults are hard to test is that the conditions that allow these to occur can be difficult to ...


Real-Time Intrusion Detection Using Multidimensional Sequence-To-Sequence Machine Learning And Adaptive Stream Processing, Gobinath Loganathan Aug 2018

Electronic Thesis and Dissertation Repository

A network intrusion is any unauthorized activity on a computer network. There are host-based and network-based Intrusion Detection Systems (IDSs), each of which can use signature-based or anomaly-based detection methods. An anomalous network behavior can be defined as an intentional violation of the expected sequence of packets. In a real-time network-based IDS, incoming packets are treated as a stream of data. A stream processor takes any stream of data or events and extracts interesting patterns on the fly. This representation allows applying statistical anomaly detection using sequence prediction algorithms as well as using a stream processor to perform signature-based ...


Deep Neural Network Architectures For Modulation Classification Using Principal Component Analysis, Sharan Ramjee, Shengtai Ju, Diyu Yang, Aly El Gamal Aug 2018

The Summer Undergraduate Research Fellowship (SURF) Symposium

In this work, we investigate the application of Principal Component Analysis to the task of wireless signal modulation recognition using deep neural network architectures. Sampling signals at the Nyquist rate, which is often very high, requires a large amount of energy and space to collect and store the samples. Moreover, the time taken to train neural networks for the task of modulation classification is large due to the large number of samples. These problems can be drastically reduced using Principal Component Analysis, which is a technique that allows us to reduce the dimensionality or number of features of the samples ...


Hierarchical Bayesian Data Fusion Using Autoencoders, Yevgeniy Vladimirovich Reznichenko Jul 2018

Master's Theses (2009 -)

In this thesis, a novel method for tracker fusion is proposed and evaluated for vision-based tracking. This work combines three distinct popular techniques into a recursive Bayesian estimation algorithm. First, semi-supervised learning approaches are used to partition data and to train a deep neural network that is capable of capturing normal visual tracking operation and is able to detect anomalous data. We compare various methods by examining their respective receiver operating characteristic (ROC) curves, which represent the trade-off between specificity and sensitivity for various detection threshold levels. Next, we incorporate the trained neural networks into an existing data ...
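The ROC comparison mentioned in this abstract can be made concrete with a small sketch. Assuming a detector that outputs an anomaly score per sample and ground-truth labels (1 = anomalous, 0 = normal), each detection threshold yields one (false positive rate, true positive rate) point, where the true positive rate is sensitivity and the false positive rate is 1 minus specificity. The function name and data are illustrative assumptions, not the thesis's evaluation code.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        # A sample is flagged as anomalous when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical anomaly scores and labels (made-up values).
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
for fpr, tpr in roc_points(scores, labels):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold moves along the curve toward the upper right: more anomalies are caught, at the cost of more false alarms.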


Machine Learning Models For Context-Aware Recommender Systems, Yogesh Jhamb Jun 2018

Engineering Ph.D. Theses

The mass adoption of the internet has resulted in the exponential growth of products and services on the world wide web. An individual consumer, faced with this data deluge, is expected to make reasonable choices saving time and money. Organizations are facing increased competition, and they are looking for innovative ways to increase revenue and customer loyalty. A business wants to target the right product or service to an individual consumer, and this drives personalized recommendation. Recommender systems, designed to provide personalized recommendations, initially focused only on the user-item interaction. However, these systems evolved to provide context-aware recommendations. Context-aware ...


Machine Learning For Omics Data Analysis, Ameni Trabelsi May 2018

Electronic Theses and Dissertations

In proteomics and metabolomics, to quantify the changes of abundance levels of biomolecules in a biological system, multiple sample analysis steps are involved. The steps include mass spectrum deconvolution and peak list alignment. Each analysis step introduces a certain degree of technical variation in the abundance levels (i.e. peak areas) of those molecules. Some analysis steps introduce technical variations that affect the peak areas of all molecules equally, while others affect the peak areas of a subset of molecules to varying degrees. To correct these technical variations, some existing normalization methods simply scale the peak areas of all molecules ...


Acceleration Of K-Nearest Neighbor And SRAD Algorithms Using Intel FPGA SDK For OpenCL, Liyuan Liu Mar 2018

Electronic Theses and Dissertations

Field Programmable Gate Arrays (FPGAs) have been widely used for accelerating machine learning algorithms. However, the high design cost and time for implementing FPGA-based accelerators using traditional HDL-based design methodologies have discouraged users from designing FPGA-based accelerators. In recent years, a new CAD tool called Intel FPGA SDK for OpenCL (IFSO) has allowed fast and efficient design of FPGA-based hardware accelerators from high-level specifications such as OpenCL. Even software engineers with basic hardware design knowledge could design FPGA-based accelerators. In this thesis, IFSO has been used to explore acceleration of the k-Nearest-Neighbour (kNN) algorithm and Speckle Reducing Anisotropic Diffusion (SRAD) simulation ...
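For context on what is being accelerated, here is a plain software sketch of kNN classification: compute the distance from a query to every training point, keep the k closest, and take a majority vote. This is a reference formulation in Python only, not the thesis's OpenCL or FPGA design; the function name, Euclidean metric, and sample points are assumptions for illustration.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def knn_predict(train: List[Tuple[Sequence[float], str]],
                query: Sequence[float], k: int) -> str:
    """Return the majority label among the k training points nearest to `query`."""
    if not 0 < k <= len(train):
        raise ValueError("k must be between 1 and len(train)")
    # Distance to every training point; this all-pairs step is the costly part.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D training points with made-up labels.
training = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"),
            ((5.0, 5.0), "b"), ((6.0, 5.0), "b")]
print(knn_predict(training, (0.2, 0.1), k=3))
```

The distance computations are independent of one another, which is one reason kNN is a common candidate for parallel hardware.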


Predicting The Vote Using Legislative Speech, Aditya Budhwar Mar 2018

Master's Theses and Project Reports

As most dedicated observers of voting bodies like the U.S. Supreme Court can attest, it is possible to guess vote outcomes based on statements made during deliberations or questioning by the voting members. In most forms of representative democracy, citizens can actively petition or lobby their representatives, and that often means understanding their intentions to vote for or against an issue of interest. In some U.S. state legislatures, professional lobby groups and dedicated press members are highly informed and engaged, but the process is basically closed to ordinary citizens because they do not have enough background and familiarity ...


Recent Advances In Randomized Methods For Big Data Optimization, Jie Liu Jan 2018

Theses and Dissertations

In this thesis, we discuss and develop randomized algorithms for big data problems. In particular, we study the finite-sum optimization with newly emerged variance-reduction optimization methods (Chapter 2), explore the efficiency of second-order information applied to both convex and non-convex finite-sum objectives (Chapter 3) and employ the fast first-order method in power system problems (Chapter 4). In Chapter 2, we propose two variance-reduced gradient algorithms: mS2GD and SARAH. mS2GD incorporates a mini-batching scheme for improving the theoretical complexity and practical performance of SVRG/S2GD, aiming to minimize a strongly convex function represented as the sum of an average of ...


Use Of Adaptive Mobile Applications To Improve Mindfulness, Wiehan Boshoff Jan 2018

Browse all Theses and Dissertations

Mindfulness is the state of retaining awareness of what is happening at the current point in time. It has been used in multiple forms to reduce stress, anxiety, and even depression. Promoting Mindfulness can be done in various ways, but current research shows a trend towards preferential usage of breathing exercises over other methods to reach a mindful state. Studies have showcased that breathing can be used as a tool to promote brain control, specifically in the auditory cortex region. Research pertaining to disorders such as Tinnitus, the phantom awareness of sound, could potentially benefit from using these brain control ...


Investigating The Use Of Bayesian Networks For Small Dataset Problems, Anastacia Maria MacAllister Jan 2018

Graduate Theses and Dissertations

Benefits associated with machine learning are extensive. Industry is increasingly beginning to recognize the wealth of information stored in the data they are collecting. To sort through and analyze all of this data, specialized tools are required to come up with actionable strategies. Often this is done with supervised machine learning algorithms. While these algorithms can be extremely powerful data analysis tools, they require considerable understanding, expertise, and a significant amount of data to use. Selecting the appropriate data analysis method is important to coming up with valid strategies based on the collected data. In addition, a characteristic of machine ...


Can Machine Learning Beat Physics At Modeling Car Crashes?, Gavin Byrne Jan 2018

Dissertations

This study aimed to look at a traditional method used for measuring the severity and principal direction of force of a car crash and see if it could be improved on using machine learning models. The data used was publicly available from the NHTSA database and included descriptions of the vehicle, test and sensors as well as the accelerometer data over the period of the crashes. The models built were SVM classifiers and multinomial regression models. Although the SVM and regression models were built successfully and gave higher levels of accuracy than the momentum models in terms of the severity ...


Machine Learning Techniques Implementation In Power Optimization, Data Processing, And Bio-Medical Applications, Khalid Khairullah Mezied Al-Jabery Jan 2018

Doctoral Dissertations

The rapid progress and development in machine-learning algorithms has become a key factor in determining the future of humanity. These algorithms and techniques were utilized to solve a wide spectrum of problems extending from data mining and knowledge discovery to unsupervised learning and optimization. This dissertation consists of two study areas. The first area investigates the use of reinforcement learning and adaptive critic design algorithms in the field of power grid control. The second area in this dissertation, consisting of three papers, focuses on developing and applying clustering algorithms on biomedical data. The first paper presents a novel modelling approach for ...


Towards Privacy-Aware Mobile-Based Continuous Authentication Systems, Mohammad Al-Rubaie Jan 2018

Graduate Theses and Dissertations

User authentication is used to verify the identity of individuals attempting to gain access to a certain system. It traditionally refers to the initial authentication using knowledge factors (e.g. passwords), or ownership factors (e.g. smart cards). However, initial authentication cannot protect the computer (or smartphone) if it is left unattended after the initial login. Thus, continuous authentication was proposed to complement initial authentication by transparently and continuously testing the users' behavior against the stored profile (machine learning model).

Since continuous authentication utilizes users' behavioral data to build machine learning models, certain privacy and security concerns have to be addressed before ...


Dynamic And System Agnostic Malware Detection Via Machine Learning, Michael Sgroi, Doug Jacobson Jan 2018

Creative Components

This paper discusses malware detection in personal computers. Current malware detection solutions are static. Antiviruses rely on lists of malicious signatures that are then used in file scanning. These antiviruses are also very dependent on the operating system, requiring different solutions for different systems. This paper presents a solution that detects malware based on runtime attributes. It also emphasizes that these attributes are easily accessible and fairly generic, meaning that the solution functions across systems and without specialized information. The attributes are used in a machine learning system that makes it flexible for retraining if necessary, but capable of handling new ...


Genealogy Extraction And Tree Generation From Free Form Text, Timothy Sui-Tim Chu Dec 2017

Master's Theses and Project Reports

Genealogical records play a crucial role in helping people to discover their lineage and to understand where they come from. They provide a way for people to celebrate their heritage and to possibly reconnect with family they had never considered. However, genealogical records are hard to come by for ordinary people since their information is not always well established in known databases. There often is free form text that describes a person’s life, but this must be manually read in order to extract the relevant genealogical information. In addition, multiple texts may have to be read in order to ...


Demand Side Management In Smart Grid Using Big Data Analytics, Sidhant Chatterjee Dec 2017

All Graduate Plan B and other Reports

Smart Grids are the next generation electrical grid system that utilizes smart metering devices and sensors to manage the grid operations. Grid management includes the prediction of load and classification of the load patterns and consumer usage behaviors. These predictions can be performed using machine learning methods which are often supervised. Supervised machine learning signifies that the algorithm trains the model to efficiently predict decisions based on the previously available data.

Smart grids are employed with numerous smart meters that send user statistics to a central server. The data can be accumulated and processed using data mining and machine ...


Computer-Aided Detection Of Pathologically Enlarged Lymph Nodes On Non-Contrast CT In Cervical Cancer Patients For Low-Resource Settings, Brian M. Anderson, Laurence E. Court, Ann Klopp, Stephen F. Kry, Jennifer Johnson, Erik Cressman, Arvind Rao, Jinzhong Yang Aug 2017

UT GSBS Dissertations and Theses (Open Access)

Cervical cancer causes approximately 266,000 deaths each year, and 70% of the burden occurs in Low- and Middle-Income Countries (LMICs). Radiation therapy is the primary modality for treatment of locally advanced cervical cancer cases. In the absence of high quality diagnostic imaging needed to identify nodal metastasis, many LMIC sites treat standard pelvic fields, failing to include node metastasis outside of the field and/or to boost lymph nodes in the abdomen and pelvis. The first goal of this project was to create a program which automatically identifies positive cervical cancer lymph nodes on ...


Operating System Identification By IPv6 Communication Using Machine Learning Ensembles, Adrian Ordorica Aug 2017

Theses and Dissertations

Operating system (OS) identification tools, sometimes called fingerprinting tools, are essential for the reconnaissance phase of penetration testing. While OS identification is traditionally performed by passive or active tools that use fingerprint databases, very little work has focused on using machine learning techniques. Moreover, significantly more work has focused on IPv4 than IPv6. We introduce a collaborative neural network ensemble that uses a unique voting system and a random forest ensemble to deliver accurate predictions. This approach uses IPv6 features as well as packet metadata features for OS identification. Our experiment shows that our approach is valid and we achieve ...