Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

4,509 Full-Text Articles 4,969 Authors 1,167,439 Downloads 162 Institutions

All Articles in Databases and Information Systems

4,509 full-text articles. Page 3 of 161.

Utilizing Machine Learning Classifiers To Identify SSH Brute Force Attacks, Dmytro Shmagin 2019 William & Mary

Undergraduate Honors Theses

SSH brute force attacks are a type of network attack in which an attacker tries to guess the username and password of a user on the Secure Shell protocol. This kind of attack is simple to perform, and the results from a successfully compromised system can lead to a number of destructive outcomes. Because of its simplicity and potential payout, large networks experience many instances of these attacks in their traffic, and current prevention methods rely heavily on per-machine logs that, in aggregate, take up a large amount of space. This paper explores the usage of machine learning algorithms in ...


Applications Of Fog Computing In Video Streaming, Kyle Smith 2019 University of Arkansas, Fayetteville

Computer Science and Computer Engineering Undergraduate Honors Theses

The purpose of this paper is to show the viability of fog computing in the area of video streaming in vehicles. With the rise of autonomous vehicles, there needs to be a viable entertainment option for users. The cloud fails to address these options due to latency problems experienced during high internet traffic. To improve video streaming speeds, fog computing seems to be the best option. Fog computing brings the cloud closer to the user through the use of intermediary devices known as fog nodes. It does not attempt to replace the cloud but improve the cloud by allowing faster ...


Studying And Handling Iterated Algorithmic Biases In Human And Machine Learning Interaction, Wenlong Sun 2019 University of Louisville

Electronic Theses and Dissertations

Algorithmic bias consists of biased predictions born from ingesting unchecked information, such as biased samples and biased labels. Furthermore, the interaction between people and algorithms can exacerbate bias such that neither the human nor the algorithms receive unbiased data. Thus, algorithmic bias can be introduced not only before and after the machine learning process but sometimes also in the middle of the learning process. With a handful of exceptions, only a few categories of bias have been studied in Machine Learning, and there are few, if any, studies of the impact of bias on both human behavior and algorithm performance ...


Querying Over Encrypted Databases In A Cloud Environment, Jake Douglas 2019 Boise State University

Boise State University Theses and Dissertations

The adoption of cloud computing has created a huge shift in where data is processed and stored. Increasingly, organizations opt to store their data outside of their own network to gain the benefits offered by shared cloud resources. With these benefits also come risks; namely, another organization has access to all of the data. A malicious insider at the cloud services provider could steal any personal information contained on the cloud or could use the data for the cloud service provider's business advantage. By encrypting the data, some of these risks can be mitigated. Unfortunately, encrypting the data also ...


Multimodal Review Generation For Recommender Systems, Quoc Tuan TRUONG, Hady Wirawan LAUW 2019 Singapore Management University

Research Collection School Of Information Systems

Key to recommender systems is learning user preferences, which are expressed through various modalities. In online reviews, for instance, this manifests in numerical rating, textual content, as well as visual images. In this work, we hypothesize that modelling these modalities jointly would result in a more holistic representation of a review towards more accurate recommendations. Therefore, we propose Multimodal Review Generation (MRG), a neural approach that simultaneously models a rating prediction component and a review text generation component. We hypothesize that the shared user and item representations would augment the rating prediction with richer information from review text, while sensitizing ...


Robust Factorization Machine: A Doubly Capped Norms Minimization, Chenghao LIU, Teng ZHANG, Jundong LI, Jianwen YIN, Peilin ZHAO, Jianling SUN, Steven C. H. HOI 2019 Singapore Management University

Research Collection School Of Information Systems

Factorization Machine (FM) is a general supervised learning framework for many AI applications due to its powerful capability of feature engineering. Despite being extensively studied, existing FM methods have several limitations in common. First, most existing FM methods adopt the squared loss in the modeling process, which can be very sensitive when the training data contains noise and outliers. Second, some recent FM variants explore the low-rank structure of the feature interaction matrix by relaxing the low-rank minimization problem as a trace norm minimization, which cannot always achieve a tight approximation to the original one ...


Project Sidewalk: A Web-Based Crowdsourcing Tool For Collecting Sidewalk Accessibility Data At Scale, Manaswi SAHA, Michael SAUGSTAD, Hanuma MADDALI, Aileen ZENG, Ryan HOLLAND, Steven BOWER, Aditya DASH, Sage CHEN, Anthony Li, Kotaro HARA, Jon FROEHLICH 2019 Singapore Management University

Research Collection School Of Information Systems

We introduce Project Sidewalk, a new web-based tool that enables online crowdworkers to remotely label pedestrian-related accessibility problems by virtually walking through city streets in Google Street View. To train, engage, and sustain users, we apply basic game design principles such as interactive onboarding, mission-based tasks, and progress dashboards. In an 18-month deployment study, 797 online users contributed 205,385 labels and audited 2,941 miles of Washington DC streets. We compare behavioral and labeling quality differences between paid crowdworkers and volunteers, investigate the effects of label type, label severity, and majority vote on accuracy, and analyze common labeling errors ...


Learning To Detect And Understand Drug Discontinuation Events From Clinical Narratives, Feifan Liu, Richeek Pradhan, Emily Druhl, Elaine Freund, Weisong Liu, Brian C. Sauer, Fran Cunningham, Adam J. Gordon, Celena B. Peters, Hong Yu 2019 University of Massachusetts Medical School

Population and Quantitative Health Sciences Publications

OBJECTIVE: Identifying drug discontinuation (DDC) events and understanding their reasons are important for medication management and drug safety surveillance. Structured data resources are often incomplete and lack reason information. In this article, we assessed the ability of natural language processing (NLP) systems to unlock DDC information from clinical narratives automatically.

MATERIALS AND METHODS: We collected 1867 de-identified providers' notes from the University of Massachusetts Medical School hospital electronic health record system. Then 2 human experts chart reviewed those clinical notes to annotate DDC events and their reasons. Using the annotated data, we developed and evaluated NLP systems to automatically identify ...


Yelp Improved: Aggregating Restaurant Reviews, Kunal Sonar 2019 University of San Francisco

Creative Activity and Research Day - CARD

In the near future, online food delivery service companies are expected to occupy a large share of the food industry market. This project aims to provide factual information from customer reviews as part of the numerous innovations in place to drive business and demand. Natural Language Processing is used to provide a comprehensive view of individual restaurants using technologies like NLTK, SpaCy, Gensim and Sklearn. One million Las Vegas restaurant customer reviews are curated from the Yelp Dataset Challenge. Reviews are pre-processed, split into chunks of phrases and mapped to attributes like food, budget, service etc. These attributes are derived ...


Image Use In Social Network Communication: A Case Study Of Tweets On The Boston Marathon Bombing, JungWon Yoon, EunKyung Chung 2019 University of South Florida

JungWon Yoon

Introduction. This study aimed to understand how images are used in communication practices in the Twitter environment. Method. 1,428 Boston marathon bombing related Twitter messages with embedded images were collected, and content analysis was conducted. Analysis. Characteristics of image use were examined and were analysed by type of Twitter messages. Results. People used diverse types of images in Twitter messages including: direct photos, captured images, computer graphics, and maps. Depending on the content of Twitter messages, uses of images were categorised into four types: 1) to illustrate news, information, and anecdotes, 2) to disseminate visual information that cannot be ...


Knowledge Activation For Patient Centered Care: Bridging The Health Information Technology Divide, Sajda Qureshi, Cherie Notebloom 2019 University of Nebraska at Omaha

Sajda Qureshi

The provision of healthcare is a collaborative process. It follows evidence based treatments which are becoming increasingly data driven and focusing on the best clinical outcomes. Patient centered care requires participation of patients in the decision making of the best treatment options. Healthcare provision requires both evidence based and patient centered care. In practice, these two perspectives conflict with each other due to the use of an information technology designed primarily for billing purposes. Using the knowledge activation framework developed by Qureshi and Keen [25], we analyze data from two hospitals in the Midwest that aim to achieve quality of ...


Understanding The Role Of Information Technology In The Development Of Micro-Enterprises: Concepts To Study In Making A Better World, Sajda Qureshi, Jason Jie Xiong 2019 University of Nebraska at Omaha

Sajda Qureshi

The concept of Development has eluded scholars and practitioners when information technology becomes prevalent. The majority of research in the Information Technology for Development (ICT4D) field is considered to be practice intended to make the world better with Information and Communications technologies (ICTs). In addition, a majority of well-intentioned ICT4D projects tend to fail, often due to unrealistic expectations set by development agencies responding to their political objectives. At the same time, Information Systems (IS) research is rife with well-studied concepts that do little to make a better world. This paper investigates ICT interventions in three case studies of micro-enterprises ...


A Data Citation Roadmap For Scholarly Data Repositories, Martin Fenner, Merce Crosas, Jeffrey S. Grethe, David N. Kennedy, Henning Hermjakob, Phillippe Rocca-Serra, Gustavo Durand, Robin Berjon, Sebastian Karcher, Maryann Martone, Tim Clark 2019 DataCite

Open Access Articles

This article presents a practical roadmap for scholarly data repositories to implement data citation in accordance with the Joint Declaration of Data Citation Principles, a synopsis and harmonization of the recommendations of major science policy bodies. The roadmap was developed by the Repositories Expert Group, as part of the Data Citation Implementation Pilot (DCIP) project, an initiative of FORCE11.org and the NIH-funded BioCADDIE ( https://biocaddie.org ) project. The roadmap makes 11 specific recommendations, grouped into three phases of implementation: a) required steps needed to support the Joint Declaration of Data Citation Principles, b) recommended steps that facilitate article/data ...


Google Trends Data As A Proxy For Interest In Leadership, Finley W. Walker 2019 Southeastern University - Lakeland

Doctor of Education (Ed.D)

The purpose of this quantitative study was to investigate the observable patterns of online search behavior in the topic of leadership using Google Trends data. Institutions have had a historically difficult time predicting good leadership candidates. Better predictions can be made by using the big data offered by groups such as Google to learn who is interested in leadership, and where and when. The study utilized descriptive, comparative, and correlative methodologies to study Google users’ interest in leadership from 2004 to 2017. Society has placed great value on leadership throughout history, and though overall interest remains strong, it appears that ...


Alpha Insurance: A Predictive Analytics Case To Analyze Automobile Insurance Fraud Using SAS Enterprise Miner (TM), Richard McCarthy, Wendy Ceccucci, Mary McCarthy, Leila Halawi 2019 Quinnipiac University

Publications

Automobile insurance fraud costs the insurance industry billions of dollars annually. This case study addresses claim fraud based on data extracted from Alpha Insurance’s automobile claim database. Students are provided the business problem and data sets. Initially, the students are required to develop their hypotheses and analyze the data. This includes identification of any missing or inaccurate data values and outliers as well as evaluation of the 22 variables. Next, students will develop and optimize their predictive models using five techniques: regression, decision tree, neural network, gradient boosting, and ensemble. Then students will determine which model is the best ...


Question Answering With Textual Sequence Matching, Shuohang WANG 2019 Singapore Management University

Dissertations and Theses Collection (Open Access)

Question answering (QA) is one of the most important applications in natural language processing. Given the explosive growth of text data on the Internet, intelligently finding answers to questions will help people collect useful information more efficiently. This thesis focuses on solving the question answering problem with textual sequence matching models, which build vectorized representations for pairs of text sequences to enable better reasoning. The thesis consists of three major parts.

In Part I, we propose two general models for building vectorized representations over a pair of sentences, which can be directly used to solve the ...


Online Collaborative Filtering With Implicit Feedback, Jianwen YIN, Chenghao LIU, Jundong LI, Bing Tian DAI, Yunchen CHEN, Min WU, Jianling SUN 2019 Singapore Management University

Research Collection School Of Information Systems

Studying recommender systems with implicit feedback has become increasingly important. However, most existing works are designed in an offline setting while online recommendation is quite challenging due to the one-class nature of implicit feedback. In this paper, we propose an online collaborative filtering method for implicit feedback. We highlight three critical issues of existing works. First, when positive feedback arrives sequentially, if we treat all the other missing items for this given user as the negative samples, the mis-classified items will incur a large deviation since some items might appear as the positive feedback in the subsequent rounds. Second, the ...


Efficient Algorithms For Solving Aggregate Keyword Routing Problems, Qize JIANG, Weiwei SUN, Baihua ZHENG, Kunjie CHEN 2019 Singapore Management University

Research Collection School Of Information Systems

With the emergence of smartphones and the popularity of GPS, the number of points of interest (POIs) is growing rapidly, and spatial keyword search based on POIs has attracted significant attention. In this paper, we study a more sophisticated type of spatial keyword search that considers multiple query points and multiple query keywords, namely Aggregate Keyword Routing (AKR). AKR looks for an aggregate point m together with routes from each query point to m. The aggregate point has to satisfy the aggregate keywords, the routes from query points to the aggregate point have to pass POIs in order to ...


Imitating Human Responses Via A Dual-Process Model Approach, Matthew A. Grimm 2019 Air Force Institute of Technology

Theses and Dissertations

Human-autonomous system teaming is becoming more prevalent in the Air Force and in society. Often, the concept of a shared mental model is discussed as a means to enhance collaborative work arrangements between a human and an autonomous system. The idea is that when the models are aligned, the team is more productive due to an increase in trust, predictability, and apparent understanding. This research presents the Dual-Process Model using multivariate normal probability density functions (DPM-MN), which is a cognitive architecture algorithm based on the psychological dual-process theory. The dual-process theory proposes a bipartite decision-making process in people. It labels ...


Testing The Fault Tolerance Of A Wide Area Backup Protection System Using Spin, Kenneth James 2019 Air Force Institute of Technology

Theses and Dissertations

Cyber-physical systems are increasingly prevalent in daily life. Smart grids in particular are becoming more interconnected and autonomously operated. Despite the advantages, new challenges arise in the form of defending these assets. Recent studies reveal that small-scale, coordinated cyber-attacks on only a few substations across the U.S. could result in cascading failures affecting the entire nation. In support of defending critical infrastructure, this thesis tests the fault tolerance of a backup protection system. Each transmission line in the system incorporates autonomous agents which monitor the status of the line and make decisions regarding the safety of the grid. Various ...

