Building Recommendation Systems, 2019 The University of Akron
Building Recommendation Systems, Orion Davis
Williams Honors College, Honors Research Projects
Recommendation systems are pieces of software that suggest new items to a user. There are many moving parts to these systems including data, the actual recommendation model, processing data and finally displaying data. This project explores the role each part plays in the overall system and how to develop a recommendation system for beer from scratch. This project highlights the algorithm behind the recommendations and a user facing Android application.
Optimaztion Of Fantasy Basketball Lineups Via Machine Learning, 2019 Liberty University
Optimaztion Of Fantasy Basketball Lineups Via Machine Learning, James Earl
Senior Honors Theses
Machine learning is providing a way to glean never before known insights from the data that gets recorded every day. This paper examines the application of machine learning to the novel field of Daily Fantasy Basketball. The particularities of the fantasy basketball ruleset and playstyle are discussed, and then the results of a data science case study are reviewed. The data set consists of player performance statistics as well as Fantasy Points, implied team total, DvP, and player status. The end goal is to evaluate how accurately the computer can predict a player’s fantasy performance based off a chosen feature …
A Hott Approach To Computational Effects, 2019 The College of Wooster
A Hott Approach To Computational Effects, Phillip A. Wells
Senior Independent Study Theses
A computational effect is any mutation of real-world state that occurs as the result of a computation. We develop a model for describing computational effects within homotopy type theory, a branch of mathematics separate from other foundations such as set theory. Such a model allows us to describe programs as total functions over values while preserving information about the effects those programs induce.
Vertex Coloring With Forbidden Subgraphs, 2019 Wilfrid Laurier University
Vertex Coloring With Forbidden Subgraphs, Yingjun Dai
Theses and Dissertations (Comprehensive)
Given a set $L$ of graphs, a graph $G$ is $L$-free if $G$ does not contain any graph in $L$ as induced subgraph. A $hole$ is an induced cycle of length at least $4$. A $hole$-$twin$ is a graph obtained by adding a vertex adjacent to three consecutive vertices in a $hole$. Hole-twins are closely related to the characterization of the line graphs in terms of forbidden subgraphs.
By using {\it clique-width} and {\it perfect graphs} theory, we show that ($claw$,$4K_1$,$hole$-$twin$)-free graphs and ($4K_1$,$hole$-$twin$,$5$-$wheel$)-free graphs are either perfect or have bounded clique-width. And thus the coloring of them can be …
Large Scale Online Multiple Kernel Regression With Application To Time-Series Prediction, 2019 Singapore Management University
Large Scale Online Multiple Kernel Regression With Application To Time-Series Prediction, Doyen Sahoo, Steven C. H. Hoi, Bin Lin
Research Collection School Of Computing and Information Systems
Kernel-based regression represents an important family of learning techniques for solving challenging regression tasks with non-linear patterns. Despite being studied extensively, most of the existing work suffers from two major drawbacks as follows: (i) they are often designed for solving regression tasks in a batch learning setting, making them not only computationally inefficient and but also poorly scalable in real-world applications where data arrives sequentially; and (ii) they usually assume that a fixed kernel function is given prior to the learning task, which could result in poor performance if the chosen kernel is inappropriate. To overcome these drawbacks, this work …
Quantifying Human Biological Age: A Machine Learning Approach, 2019 West Virginia University
Quantifying Human Biological Age: A Machine Learning Approach, Syed Ashiqur Rahman
Graduate Theses, Dissertations, and Problem Reports
Quantifying human biological age is an important and difficult challenge. Different biomarkers and numerous approaches have been studied for biological age prediction, each with its advantages and limitations. In this work, we first introduce a new anthropometric measure (called Surface-based Body Shape Index, SBSI) that accounts for both body shape and body size, and evaluate its performance as a predictor of all-cause mortality. We analyzed data from the National Health and Human Nutrition Examination Survey (NHANES). Based on the analysis, we introduce a new body shape index constructed from four important anthropometric determinants of body shape and body size: body …
Analyzing Satisfiability And Refutability In Selected Constraint Systems, 2019 West Virginia University
Analyzing Satisfiability And Refutability In Selected Constraint Systems, Piotr Jerzy Wojciechowski
Graduate Theses, Dissertations, and Problem Reports
This dissertation is concerned with the satisfiability and refutability problems for several constraint systems. We examine both Boolean constraint systems, in which each variable is limited to the values true and false, and polyhedral constraint systems, in which each variable is limited to the set of real numbers R in the case of linear polyhedral systems or the set of integers Z in the case of integer polyhedral systems. An important aspect of our research is that we focus on providing certificates. That is, we provide satisfying assignments or easily checkable proofs of infeasibility depending on whether the instance …
Object-Based Supervised Machine Learning Regional-Scale Land-Cover Classification Using High Resolution Remotely Sensed Data, 2019 West Virginia University
Object-Based Supervised Machine Learning Regional-Scale Land-Cover Classification Using High Resolution Remotely Sensed Data, Christopher A. Ramezan
Graduate Theses, Dissertations, and Problem Reports
High spatial resolution (HR) (1m – 5m) remotely sensed data in conjunction with supervised machine learning classification are commonly used to construct land-cover classifications. Despite the increasing availability of HR data, most studies investigating HR remotely sensed data and associated classification methods employ relatively small study areas. This work therefore drew on a 2,609 km2, regional-scale study in northeastern West Virginia, USA, to investigates a number of core aspects of HR land-cover supervised classification using machine learning. Issues explored include training sample selection, cross-validation parameter tuning, the choice of machine learning algorithm, training sample set size, and feature selection. A …
The Global Disinformation Order: 2019 Global Inventory Of Organised Social Media Manipulation, 2019 University of Oxford
The Global Disinformation Order: 2019 Global Inventory Of Organised Social Media Manipulation, Samantha Bradshaw, Philip N. Howard
Copyright, Fair Use, Scholarly Communication, etc.
Executive Summary
Over the past three years, we have monitored the global organization of social media manipulation by governments and political parties. Our 2019 report analyses the trends of computational propaganda and the evolving tools, capacities, strategies, and resources.
1. Evidence of organized social media manipulation campaigns which have taken place in 70 countries, up from 48 countries in 2018 and 28 countries in 2017. In each country, there is at least one political party or government agency using social media to shape public attitudes domestically.
2.Social media has become co-opted by many authoritarian regimes. In 26 countries, computational propaganda …
Separability And Vertex Ordering Of Graphs, 2019 Wilfrid Laurier University
Separability And Vertex Ordering Of Graphs, Elizabeth Gorbonos
Theses and Dissertations (Comprehensive)
Many graph optimization problems, such as finding an optimal coloring, or a largest clique, can be solved by a divide-and-conquer approach. One such well-known technique is decomposition by clique separators where a graph is decomposed into special induced subgraphs along their clique separators. While the most common practice of this method employs minimal clique separators, in this work we study other variations as well. We strive to characterize their structure and in particular the bound on the number of atoms. In fact, we strengthen the known bounds for the general clique cutset decomposition and the minimal clique separator decomposition. Graph …
High Dimensional Outlier Detection, 2019 University of Montana
High Dimensional Outlier Detection, Omid Khormali
Graduate Student Theses, Dissertations, & Professional Papers
In statistics and data science, outliers are data points that differ greatly from other observations in a data set. They are important attributes of the data because they can dramatically influence patterns and relationships manifested by non-outliers. It is therefore very important to detect and adequately deal with outliers. Recently, a novel algorithm, the ROMA algorithm, has been proposed [11]. In this paper, we propose a modification of the ROMA algorithm that reduces its computational complexity from $O(n^2 m)$ to $O((n/(2^m-o(1)))^2 m)$ where $n$ is the number of data points and $m$ is the dimension of the space. And as …
Automatic Program Rewriting In Non-Ground Answer Set Programs, 2018 University of Nebraska at Omaha
Automatic Program Rewriting In Non-Ground Answer Set Programs, Nicholas Hippen, Yuliya Lierler
Yuliya Lierler
Strong Equivalence And Program's Structure In Arguing Essential Equivalence Between First-Order Logic Programs, 2018 Department of Compter Science
Strong Equivalence And Program's Structure In Arguing Essential Equivalence Between First-Order Logic Programs, Yuliya Lierler
Yuliya Lierler
Complexity Results For Fourier-Motzkin Elimination, 2018 The University of Western Ontario
Complexity Results For Fourier-Motzkin Elimination, Delaram Talaashrafi
Electronic Thesis and Dissertation Repository
In this thesis, we propose a new method for removing all the redundant inequalities generated by Fourier-Motzkin elimination. This method is based on Kohler’s work and an improved version of Balas’ work. Moreover, this method only uses arithmetic operations on matrices. Algebraic complexity estimates and experimental results show that our method outperforms alternative approaches based on linear programming.
Reinforcement Learning For Collective Multi-Agent Decision Making, 2018 Singapore Management University
Reinforcement Learning For Collective Multi-Agent Decision Making, Duc Thien Nguyen
Dissertations and Theses Collection (Open Access)
In this thesis, we study reinforcement learning algorithms to collectively optimize decentralized policy in a large population of autonomous agents. We notice one of the main bottlenecks in large multi-agent system is the size of the joint trajectory of agents which quickly increases with the number of participating agents. Furthermore, the noiseof actions concurrently executed by different agents in a large system makes it difficult for each agent to estimate the value of its own actions, which is well-known as the multi-agent credit assignment problem. We propose a compact representation for multi-agent systems using the aggregate counts to address …
Data Center Holistic Demand Response Algorithm To Smooth Microgrid Tie-Line Power Fluctuation, 2018 Singapore Management University
Data Center Holistic Demand Response Algorithm To Smooth Microgrid Tie-Line Power Fluctuation, Ting Yang, Yingjie Zhao, Haibo Pen, Zhaoxia Wang
Research Collection School Of Computing and Information Systems
With the rapid development of cloud computing, artificial intelligence technologies and big data applications, data centers have become widely deployed. High density IT equipment in data centers consumes a lot of electrical power, and makes data center a hungry monster of energy consumption. To solve this problem, renewable energy is increasingly integrated into data center power provisioning systems. Compared to the traditional power supply methods, renewable energy has its unique characteristics, such as intermittency and randomness. When renewable energy supplies power to the data center industrial park, this kind of power supply not only has negative effects on the normal …
Sequence Pattern Mining With Variables, 2018 Air Force Institute of Technology
Sequence Pattern Mining With Variables, James S. Okolica, Gilbert L. Peterson, Robert F. Mills, Michael R. Grimaila
Faculty Publications
Sequence pattern mining (SPM) seeks to find multiple items that commonly occur together in a specific order. One common assumption is that all of the relevant differences between items are captured through creating distinct items, e.g., if color matters then the same item in two different colors would have two items created, one for each color. In some domains, that is unrealistic. This paper makes two contributions. The first extends SPM algorithms to allow item differentiation through attribute variables for domains with large numbers of items, e.g, by having one item with a variable with a color attribute rather than …
Constrained K-Means Clustering Validation Study, 2018 Southwestern Oklahoma State University
Constrained K-Means Clustering Validation Study, Nicholas Mcdaniel, Stephen Burgess, Jeremy Evert
Student Research
Machine Learning (ML) is a growing topic within Computer Science with applications in many fields. One open problem in ML is data separation, or data clustering. Our project is a validation study of, “Constrained K-means Clustering with Background Knowledge" by Wagstaff et. al. Our data validates the finding by Wagstaff et. al., which shows that a modified k-means clustering approach can outperform more general unsupervised learning algorithms when some domain information about the problem is available. Our data suggests that k-means clustering augmented with domain information can be a time efficient means for segmenting data sets. Our validation study focused …
Criticality Assessments For Improving Algorithmic Robustness, 2018 University of New Mexico
Criticality Assessments For Improving Algorithmic Robustness, Thomas B. Jones
Computer Science ETDs
Though computational models typically assume all program steps execute flawlessly, that does not imply all steps are equally important if a failure should occur. In the "Constrained Reliability Allocation" problem, sufficient resources are guaranteed for operations that prompt eventual program termination on failure, but those operations that only cause output errors are given a limited budget of some vital resource, insufficient to ensure correct operation for each of them.
In this dissertation, I present a novel representation of failures based on a combination of their timing and location combined with criticality assessments---a method used to predict the behavior of systems …
A Multi-Task Approach To Incremental Dialogue State Tracking, 2018 Technological University Dublin
A Multi-Task Approach To Incremental Dialogue State Tracking, Anh Duong Trinh, Robert J. Ross, John D. Kelleher
Conference papers
Incrementality is a fundamental feature of language in real world use. To this point, however, the vast majority of work in automated dialogue processing has focused on language as turn based. In this paper we explore the challenge of incremental dialogue state tracking through the development and analysis of a multi-task approach to incremental dialogue state tracking. We present the design of our incremental dialogue state tracker in detail and provide evaluation against the well known Dialogue State Tracking Challenge 2 (DSTC2) dataset. In addition to a standard evaluation of the tracker, we also provide an analysis of the Incrementality …