Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Theory and Algorithms

Full-Text Articles in Databases and Information Systems

Implementing The CMS+ Sports Rankings Algorithm In A JavaFX Environment, Luke Welch May 2022

Industrial Engineering Undergraduate Honors Theses

Every year, sports teams and athletes get cut from championship opportunities because of their rank. While this reality is easier to swallow if a team or athlete is distant from the cut, it is much harder when they are right on the edge. Many times, it leaves fans and athletes wondering, “Why wasn’t I ranked higher? What factors went into the ranking? Are the rankings based on opinion alone?” These are fair questions that deserve an answer. Many times, sports rankings are derived from opinion polls. Other times, they are derived from a combination of opinion polls and measured performance. …


Learning Transferable Perturbations For Image Captioning, Hanjie Wu, Yongtuo Liu, Hongmin Cai, Shengfeng He May 2022

Research Collection School Of Computing and Information Systems

Present studies have discovered that state-of-the-art deep learning models can be attacked by small but well-designed perturbations. Existing attack algorithms for the image captioning task are time-consuming, and their generated adversarial examples cannot transfer well to other models. To generate adversarial examples faster and stronger, we propose to learn the perturbations by a generative model that is governed by three novel loss functions. Image feature distortion loss is designed to maximize the encoded image feature distance between original images and the corresponding adversarial examples at the image domain, and local-global mismatching loss is introduced to separate the mapping encoding representation …


The Effect Of Using The Gamification Strategy On Academic Achievement And Motivation Towards Learning Problem-Solving Skills In Computer And Information Technology Course Among Tenth Grade Female Students, Mazyunah Almutairi, Prof. Ahmad Almassaad Feb 2022

International Journal for Research in Education

This study aimed to identify the effect of using the gamification strategy on academic achievement and motivation towards learning problem-solving skills in a computer and information technology course. A quasi-experimental method was adopted. The study population included tenth-grade female students in Al-Badi’ah schools in Riyadh. The sample consisted of 54 students divided into two equal groups: a control group and an experimental group. The study tools comprised an achievement test and a motivation scale. The results showed that there were statistically significant differences between the two groups in the academic achievement test in favor of the experimental group, with a large effect …


Algorithm-Based Fault Tolerance At Scale, Joshua Dennis Booth Jan 2022

Summer Community of Scholars (RCEU and HCR) Project Proposals

No abstract provided.


Predicting League Of Legends Ranked Games Outcome, Ngoc Linh Chi Nguyen Jan 2022

Senior Projects Spring 2022

League of Legends (LoL) is one of the most popular multiplayer online battle arena (MOBA) games in the world. For LoL, the most competitive way to evaluate a player’s skill level, below the professional Esports level, is competitive ranked games. These ranked games utilize a matchmaking system based on the players’ ranks to form fair teams for each game. However, a ranked game's outcome cannot necessarily be predicted using just players’ ranks; a significant number of different variables impact a ranked game depending on how well each team plays. In this paper, I propose a method to …


Machine Learning In Requirements Elicitation: A Literature Review, Cheligeer Cheligeer, Jingwei Huang, Guosong Wu, Nadia Bhuiyan, Yuan Xu, Yong Zeng Jan 2022

Engineering Management & Systems Engineering Faculty Publications

A growing trend in requirements elicitation is the use of machine learning (ML) techniques to automate the cumbersome requirement handling process. This literature review summarizes and analyzes studies that incorporate ML and natural language processing (NLP) into demand elicitation. We answer the following research questions: (1) What requirement elicitation activities are supported by ML? (2) What data sources are used to build ML-based requirement solutions? (3) What technologies, algorithms, and tools are used to build ML-based requirement elicitation? (4) How to construct an ML-based requirements elicitation method? (5) What are the available tools to support ML-based requirements elicitation methodology? Keywords …


Fair And Diverse Group Formation Based On Multidimensional Features, Mohammed Saad A Alqahtani Dec 2021

Graduate Theses and Dissertations

The goal of group formation is to build a team to accomplish a specific task. Algorithms are being developed to improve the effectiveness of the teams so formed and the efficiency of the group selection process. However, there is concern that team formation algorithms could be biased against minorities due to the algorithms themselves or the data on which they are trained. Hence, it is essential to build fair team formation systems that incorporate demographic information into the process of building the group. Although there has been extensive work on modeling individuals’ expertise for expert recommendation and/or team formation, there has been …


Context-Aware Graph Convolutional Network For Dynamic Origin-Destination Prediction, Juan Nathaniel, Baihua Zheng Dec 2021

Research Collection School Of Computing and Information Systems

A robust Origin-Destination (OD) prediction is key to urban mobility. A good forecasting model can reduce operational risks and improve service availability, among many other upsides. Here, we examine the use of Graph Convolutional Network (GCN) and its hybrid Markov-Chain (GCN-MC) variant to perform a context-aware OD prediction based on a large-scale public transportation dataset in Singapore. Compared with the baseline Markov-Chain algorithm and GCN, the proposed hybrid GCN-MC model improves the prediction accuracy by 37% and 12% respectively. Lastly, the addition of temporal and historical contextual information further improves the performance of the proposed hybrid model by 4–12%.


Transfer-Learned Pruned Deep Convolutional Neural Networks For Efficient Plant Classification In Resource-Constrained Environments, Martinson Ofori Nov 2021

Masters Theses & Doctoral Dissertations

Traditional means of on-farm weed control mostly rely on manual labor. This process is time-consuming, costly, and contributes to major yield losses. Further, the conventional application of chemical weed control can be economically and environmentally inefficient. Site-specific weed management (SSWM) counteracts this by reducing the amount of chemical application with localized spraying of weed species. To solve this using computer vision, precision agriculture researchers have used remote sensing weed maps, but this has been largely ineffective for early season weed control due to problems such as solar reflectance and cloud cover in satellite imagery. With the current advances in artificial …


Information Extraction And Classification On Journal Papers, Lei Yu Nov 2021

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The importance of journals for diffusing the results of scientific research has increased considerably. In the digital era, Portable Document Format (PDF) became the established format of electronic journal articles. This structured form, combined with a regular and wide dissemination, spread scientific advancements easily and quickly. However, the rapidly increasing number of published scientific articles requires more time and effort for systematic literature reviews, searches, and screening. The comprehension and extraction of useful information from the digital documents is also a challenging task, due to the complex structure of PDF.

To help a soil science team from the United States …


On A Multistage Discrete Stochastic Optimization Problem With Stochastic Constraints And Nested Sampling, Thuy Anh Ta, Tien Mai, Fabian Bastin, Pierre L'Ecuyer Nov 2021

Research Collection School Of Computing and Information Systems

We consider a multistage stochastic discrete program in which constraints on any stage might involve expectations that cannot be computed easily and are approximated by simulation. We study a sample average approximation (SAA) approach that uses nested sampling, in which at each stage, a number of scenarios are examined and a number of simulation replications are performed for each scenario to estimate the next-stage constraints. This approach provides an approximate solution to the multistage problem. To establish the consistency of the SAA approach, we first consider a two-stage problem and show that in the second-stage problem, given a scenario, the …


Expediting The Accuracy-Improving Process Of SVMs For Class Imbalance Learning, Bin Cao, Yuqi Liu, Chenyu Hou, Jing Fan, Baihua Zheng, Jianwei Jin Nov 2021

Research Collection School Of Computing and Information Systems

To improve the classification performance of support vector machines (SVMs) on imbalanced datasets, cost-sensitive learning methods have been proposed, e.g., DEC (Different Error Costs) and FSVM-CIL (Fuzzy SVM for Class Imbalance Learning). They relocate the hyperplane by adjusting the costs associated with misclassifying samples. However, the error costs are determined either empirically or by performing an exhaustive search in the parameter space. Neither strategy can guarantee effectiveness and efficiency simultaneously. In this paper, we propose ATEC, a solution that can efficiently find a preferable hyperplane by automatically tuning the error cost for between-class samples. ATEC distinguishes itself from all …


Unified And Incremental SimRank: Index-Free Approximation With Scheduled Principle, Fanwei Zhu, Yuan Fang, Kai Zhang, Kevin C.-C. Chang, Hongtai Cao, Zhen Jiang, Minghui Wu Sep 2021

Research Collection School Of Computing and Information Systems

SimRank is a popular link-based similarity measure on graphs. It enables a variety of applications with different modes of querying (e.g., single-pair, single-source and all-pair modes). In this paper, we propose UISim, a unified and incremental framework for all SimRank modes based on a scheduled approximation principle. UISim processes queries with incremental and prioritized exploration of the entire computation space, and thus allows flexible tradeoff of time and accuracy. On the other hand, it creates and shares common “building blocks” for online computation without relying on indexes, and thus is efficient to handle both static and dynamic graphs. Our experiments …
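For readers unfamiliar with the measure, the following is a minimal sketch of the textbook SimRank recurrence computed by naive fixed-point iteration. It is not UISim and does none of the scheduling or incremental work the abstract describes; the decay factor of 0.6, the iteration count, and the toy graph are assumptions chosen only for illustration.

```python
from itertools import product


def simrank(in_neighbors, decay=0.6, iterations=10):
    """Naive all-pairs SimRank by fixed-point iteration.

    in_neighbors maps each node to the list of nodes that point to it.
    The decay factor and iteration count are illustrative assumptions.
    """
    nodes = list(in_neighbors)
    # Start from the identity: a node is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # A node with no in-neighbors is similar to nothing else.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, damped by decay.
            total = sum(sim[(i, j)] for i in ia for j in ib)
            new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Hypothetical toy graph: one node points to two others.
graph = {"univ": [], "prof_a": ["univ"], "prof_b": ["univ"]}
scores = simrank(graph)
```

Reading `scores[(a, b)]` for one pair corresponds to the single-pair mode, a fixed `a` to the single-source mode, and the full dictionary to the all-pair mode. The naive iteration touches every pair of nodes in every round, which is the cost that approximation frameworks aim to avoid.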


Quantum Computing For Supply Chain Finance, Paul R. Griffin, Ritesh Sampat Sep 2021

Research Collection School Of Computing and Information Systems

Applying quantum computing to real world applications to assess the potential efficacy is a daunting task for non-quantum specialists. This paper shows an implementation of two quantum optimization algorithms applied to trade finance portfolios and compares the selections to those chosen by experienced underwriters and a classical optimizer. The method used is to map the financial risk and returns for a trade finance portfolio to an optimization function of a quantum algorithm developed in a Qiskit tutorial. The results show that whilst there is no advantage seen by using the quantum algorithms, the performance of the quantum algorithms …


Multilateration Index., Chip Lynch Aug 2021

Electronic Theses and Dissertations

We present an alternative method for pre-processing and storing point data, particularly for Geospatial points, by storing multilateration distances to fixed points rather than coordinates such as Latitude and Longitude. We explore the use of this data to improve query performance for some distance related queries such as nearest neighbor and query-within-radius (i.e. “find all points in a set P within distance d of query point q”). Further, we discuss the problem of “Network Adequacy” common to medical and communications businesses, to analyze questions such as “are at least 90% of patients living within 50 miles of a covered emergency …
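The idea of storing distances to fixed points can be illustrated with a small sketch of a query-within-radius filter. This is not the thesis's implementation: it assumes planar Euclidean distance rather than geodesic distance, and the function names, anchor positions, and sample points are all hypothetical. It relies only on the triangle inequality, which lets stored anchor distances rule out points without computing their distance to the query.

```python
import math


def dist(p, q):
    """Planar Euclidean distance (an assumption; real geospatial data is geodesic)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def build_index(points, anchors):
    """Store, for every point, its distance to each fixed anchor."""
    return [tuple(dist(p, a) for a in anchors) for p in points]


def within_radius(points, index, anchors, q, d):
    """Return all points within distance d of q, pruning with anchor distances."""
    q_dists = [dist(q, a) for a in anchors]
    hits = []
    for p, p_dists in zip(points, index):
        # Triangle inequality: |dist(p, a) - dist(q, a)| <= dist(p, q).
        # If the gap at any anchor exceeds d, p cannot be within d of q.
        if any(abs(pd - qd) > d for pd, qd in zip(p_dists, q_dists)):
            continue
        # Surviving candidates are verified with an exact distance check.
        if dist(p, q) <= d:
            hits.append(p)
    return hits


# Hypothetical data: four points and three anchors.
points = [(0, 0), (3, 4), (10, 10), (1, 1)]
anchors = [(0, 0), (10, 0), (0, 10)]
index = build_index(points, anchors)
nearby = within_radius(points, index, anchors, q=(0, 0), d=6)
```

The filter never discards a true match, so the result equals a brute-force scan; the saving comes from skipping exact distance computations for pruned points. In this sketch the scan is still linear, whereas a practical index would sort or otherwise organize the stored anchor distances.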


Design And Development Of Techniques To Ensure Integrity In Fog Computing Based Databases, Abdulwahab Fahad S. Alazeb Jul 2021

Graduate Theses and Dissertations

The advancement of information technology in coming years will bring significant changes to the way sensitive data is processed. But the volume of generated data is rapidly growing worldwide. Technologies such as cloud computing, fog computing, and the Internet of things (IoT) will offer business service providers and consumers opportunities to obtain effective and efficient services as well as enhance their experiences and services; increased availability and higher-quality services via real-time data processing augment the potential for technology to add value to everyday experiences. This improves the quality and ease of human life. As promising as these technological innovations are, they are prone …


Promoting Diversity In Academic Research Communities Through Multivariate Expert Recommendation, Omar Salman Jul 2021

Graduate Theses and Dissertations

Expert recommendation is the process of identifying individuals who have the appropriate knowledge and skills to achieve a specific task. It has been widely used in the educational environment mainly in the hiring process, paper-reviewer assignment, and assembling conference program committees. In this research, we highlight the problem of diversity and fair representation of underrepresented groups in expertise recommendation, factors that current expertise recommendation systems rarely consider. We introduce a novel way to model experts in academia by considering demographic attributes in addition to skills. We use the h-index score to quantify skills for a researcher and we identify five …


Counting And Sampling Small Structures In Graph And Hypergraph Data Streams, Themistoklis Haris Jun 2021

Dartmouth College Undergraduate Theses

In this thesis, we explore the problem of approximating the number of elementary substructures called simplices in large k-uniform hypergraphs. The hypergraphs are assumed to be too large to be stored in memory, so we adopt a data stream model, where the hypergraph is defined by a sequence of hyperedges.

First we propose an algorithm that (ε, δ)-estimates the number of simplices using O(m^(1+1/k) / T) bits of space. In addition, we prove that no constant-pass streaming algorithm can (ε, δ)-approximate the number of simplices using less than O(m^(1+1/k) / T) bits of space. Thus …


Self-Adaptive Graph Traversal On GPUs, Mo Sha, Yuchen Li, Kian-Lee Tan Jun 2021

Research Collection School Of Computing and Information Systems

GPU’s massive computing power offers unprecedented opportunities to enable large graph analysis. Existing studies proposed various preprocessing approaches that convert the input graphs into dedicated structures for GPU-based optimizations. However, these dedicated approaches incur significant preprocessing costs as well as weak programmability to build general graph applications. In this paper, we introduce SAGE, a self-adaptive graph traversal on GPUs, which is free from preprocessing and operates on ubiquitous graph representations directly. We propose Tiled Partitioning and Resident Tile Stealing to fully exploit the computing power of GPUs in a runtime and self-adaptive manner. We also propose Sampling-based Reordering to further …


Machine Learning Approaches To Dribble Hand-Off Action Classification With SportVU NBA Player Coordinate Data, Dembe Stephanos May 2021

Electronic Theses and Dissertations

Recently, strategies of National Basketball Association teams have evolved with the skillsets of players and the emergence of advanced analytics. One of the most effective actions in dynamic offensive strategies in basketball is the dribble hand-off (DHO). This thesis proposes an architecture for a classification pipeline for detecting DHOs in an accurate and automated manner. This pipeline consists of a combination of player tracking data and event labels, a rule set to identify candidate actions, manually reviewing game recordings to label the candidates, and embedding player trajectories into hexbin cell paths before passing the completed training set to the classification …


A Deep Analysis And Algorithmic Approach To Solving Complex Fitness Issues In Collegiate Student Athletes, Holly N. Puckett Apr 2021

Honors College Theses

Sports are not simply an entertainment source. For many, they create a sense of community, support, and trust among fans and athletes alike. To sustain the sense of community that sports provide, athletes must be properly cared for so that they can perform at the highest level possible. Thus, their fitness and health must be monitored continuously. In a professional sense, one can expect individualized attention to athletes daily due to an abundance of funding and resources. However, when looking at college communities and student athletes within them, the number of athletes per athletic trainer increases due to both …


SQL Injection & Web Application Security: A Python-Based Network Traffic Detection Model, Nyki Anderson Apr 2021

Cybersecurity Undergraduate Research Showcase

The Internet of Things (IoT) presents a great many challenges in cybersecurity as the world grows more and more digitally dependent. Personally identifiable information (PII) (i.e., names, addresses, emails, credit card numbers) is stored in databases across websites the world over. The greatest threat to privacy, according to the Open Worldwide Application Security Project (OWASP), is SQL injection attacks (SQLIA) [1]. In these sorts of attacks, hackers use malicious statements entered into forms, search bars, and other browser input mediums to trick the web application server into divulging database assets. A proposed technique against such exploitation is convolutional neural network …


A Fully Dynamic Algorithm For K-Regret Minimizing Sets, Yanhao Wang, Yuchen Li, Raymond Chi-Wing Wong, Kian-Lee Tan Apr 2021

Research Collection School Of Computing and Information Systems

Selecting a small set of representatives from a large database is important in many applications such as multi-criteria decision making, web search, and recommendation. The k-regret minimizing set (k-RMS) problem was recently proposed for representative tuple discovery. Specifically, for a large database P of tuples with multiple numerical attributes, the k-RMS problem returns a size-r subset Q of P such that, for any possible ranking function, the score of the top-ranked tuple in Q is not much worse than the score of the kth-ranked tuple in P. Although the k-RMS problem has been extensively studied in the literature, existing methods …
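To make the k-RMS objective concrete, the sketch below evaluates how good a given subset Q is, rather than computing one. It assumes linear ranking functions with non-negative weights and estimates the worst case by random sampling, so the returned value is only a lower bound on the true maximum regret. The function names, sample count, and data are hypothetical and are not taken from the paper.

```python
import random


def k_regret_ratio(P, Q, k, utility):
    """Regret of Q against the k-th best tuple of P under one linear utility."""
    def score(t):
        return sum(w * x for w, x in zip(utility, t))

    kth_best = sorted((score(p) for p in P), reverse=True)[k - 1]
    best_in_q = max(score(q) for q in Q)
    if kth_best <= 0:
        return 0.0
    # Zero regret whenever Q's best tuple matches or beats the k-th best of P.
    return max(0.0, 1.0 - best_in_q / kth_best)


def max_k_regret_ratio(P, Q, k, num_utilities=1000, seed=0):
    """Estimate the worst-case regret over randomly sampled linear utilities."""
    rng = random.Random(seed)
    dim = len(P[0])
    worst = 0.0
    for _ in range(num_utilities):
        u = [rng.random() for _ in range(dim)]
        worst = max(worst, k_regret_ratio(P, Q, k, u))
    return worst


# Hypothetical database of three tuples with two numerical attributes.
P = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)]
Q = [(0.7, 0.7)]
regret_top1 = k_regret_ratio(P, Q, k=1, utility=(1.0, 0.0))
regret_top2 = k_regret_ratio(P, Q, k=2, utility=(1.0, 0.0))
```

Here the single compromise tuple loses about 30% against the best tuple when only the first attribute matters, but has no regret against the second-best tuple, which shows how relaxing from k = 1 to k = 2 makes a small representative set easier to find.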


Improving Multi-Hop Knowledge Base Question Answering By Learning Intermediate Supervision Signals, Gaole He, Yunshi Lan, Jing Jiang, Wayne Xin Zhao, Ji Rong Wen Mar 2021

Research Collection School Of Computing and Information Systems

Multi-hop Knowledge Base Question Answering (KBQA) aims to find the answer entities that are multiple hops away in the Knowledge Base (KB) from the entities in the question. A major challenge is the lack of supervision signals at intermediate steps. Therefore, multi-hop KBQA algorithms can only receive the feedback from the final answer, which makes the learning unstable or ineffective. To address this challenge, we propose a novel teacher-student approach for the multi-hop KBQA task. In our approach, the student network aims to find the correct answer to the query, while the teacher network tries to learn intermediate supervision signals …


Unsupervised Data Mining Technique For Clustering Library In Indonesia, Robbi Rahim, Joseph Teguh Santoso, Sri Jumini, Gita Widi Bhawika, Daniel Susilo, Danny Wibowo Feb 2021

Library Philosophy and Practice (e-journal)

Organizing school libraries not only preserves library materials but also helps students and teachers complete tasks in the teaching process, in support of the national development goal of improving community welfare by producing quality and competitive human resources. The purpose of this study is to analyze the Unsupervised Learning technique in conducting cluster mapping of the number of libraries at education levels in Indonesia. The data source was obtained from the Ministry of Education and Culture and was processed by the Central Statistics Agency (abbreviated as BPS) with url: bps.go.id/. The data consisted of 34 records where the attribute …


Visual Analysis Of Discrimination In Machine Learning, Qianwen Wang, Zhenghua Xu, Zhutian Chen, Yong Wang, Shixia Liu, Huamin Qu Feb 2021

Research Collection School Of Computing and Information Systems

The growing use of automated decision-making in critical applications, such as crime prediction and college admission, has raised questions about fairness in machine learning. How can we decide whether different treatments are reasonable or discriminatory? In this paper, we investigate discrimination in machine learning from a visual analytics perspective and propose an interactive visualization tool, DiscriLens, to support a more comprehensive analysis. To reveal detailed information on algorithmic discrimination, DiscriLens identifies a collection of potentially discriminatory itemsets based on causal modeling and classification rules mining. By combining an extended Euler diagram with a matrix-based visualization, we develop a novel set …


Quantifying The Impact Of Non-Stationarity In Reinforcement Learning-Based Traffic Signal Control, Lucas N. Alegre, Ana L.C. Bazzan, Bruno C. Da Silva Jan 2021

Computer Science Department Faculty Publication Series

In reinforcement learning (RL), dealing with non-stationarity is a challenging issue. However, some domains such as traffic optimization are inherently non-stationary. Causes for and effects of this are manifold. In particular, when dealing with traffic signal controls, addressing non-stationarity is key since traffic conditions change over time and as a function of traffic control decisions taken in other parts of a network. In this paper we analyze the effects that different sources of non-stationarity have in a network of traffic signals, in which each signal is modeled as a learning agent. More precisely, we study both the effects of changing …


Deep Unsupervised Anomaly Detection, Tangqing Li, Zheng Wang, Siying Liu, Wen-Yan Lin Jan 2021

Research Collection School Of Computing and Information Systems

This paper proposes a novel method to detect anomalies in large datasets under a fully unsupervised setting. The key idea behind our algorithm is to learn the representation underlying normal data. To this end, we leverage the latest clustering technique suitable for handling high dimensional data. This hypothesis provides a reliable starting point for normal data selection. We train an autoencoder from the normal data subset, and iterate between hypothesizing normal candidate subset based on clustering and representation learning. The reconstruction error from the learned autoencoder serves as a scoring function to assess the normality of the data. Experimental results …


A Near-Optimal Change-Detection Based Algorithm For Piecewise-Stationary Combinatorial Semi-Bandits, Huozhi Zhou, Lingda Wang, Lav N. Varshney, Ee-Peng Lim Dec 2020

Research Collection School Of Computing and Information Systems

We investigate the piecewise-stationary combinatorial semi-bandit problem. Compared to the original combinatorial semi-bandit problem, our setting assumes the reward distributions of base arms may change in a piecewise-stationary manner at unknown time steps. We propose an algorithm, GLR-CUCB, which incorporates an efficient combinatorial semi-bandit algorithm, CUCB, with an almost parameter-free change-point detector, the Generalized Likelihood Ratio Test (GLRT). Our analysis shows that the regret of GLR-CUCB is upper bounded by O(√(NKT log T)), where N is the number of piecewise-stationary segments, K is the number of base arms, and T is the number of time steps. As a complement, we also …


A Survey Of Typical Attributed Graph Queries, Yanhao Wang, Yuchen Li, Ju Fan, Chang Ye, Mingke Chai Nov 2020

Research Collection School Of Computing and Information Systems

Graphs are commonly used for representing complex structures such as social relationships, biological interactions, and knowledge bases. In many scenarios, graphs not only represent topological relationships but also store the attributes that denote the semantics associated with their vertices and edges, known as attributed graphs. Attributed graphs can meet demands for a wide range of applications, and thus a variety of queries on attributed graphs have been proposed. However, these diverse types of attributed graph queries have not been systematically investigated yet. In this paper, we provide an extensive survey of several typical types of attributed graph queries. We propose …