Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

3030 Full-Text Articles, 3416 Authors, 574579 Downloads, 123 Institutions

All Articles in Databases and Information Systems

3030 full-text articles. Page 1 of 104.

Estimating Accuracy Of Personal Identifiable Information In Integrated Data Systems, Amani "Mohammad Jum'h" Amin Shatnawi 2017 Utah State University

All Graduate Theses and Dissertations

Without a valid assessment of accuracy, there is a risk of data users coming to incorrect conclusions or making bad decisions based on inaccurate data. This dissertation proposes a theoretical method for developing data-accuracy metrics specific to any given person-centric integrated system and shows how a data analyst can use these metrics to estimate the overall accuracy of person-centric data.

Estimating the accuracy of Personal Identifiable Information (PII) creates a corresponding need to model and formalize PII for both real-world and electronic data, in a way that supports rigorous reasoning relative to real-world facts, expert opinions, and aggregate knowledge. This ...


Question Type Recognition Using Natural Language Input, Aishwarya Soni 2017 San Jose State University

Master's Projects

Recently, numerous specialists have been concentrating on the use of Natural Language Processing (NLP) systems in various domains, such as data extraction and content mining. One of the difficulties with these technologies is building a precise Question Answering (QA) system. Question type recognition is the most significant task in a QA system such as a chatbot. Organizations such as the National Institute of Standards and Technology (NIST) host a conference series called the Text REtrieval Conference (TREC), which holds a competition every year to encourage and improve techniques for information retrieval from a large corpus of text. When a user ...


A Primer To The Structure, Content And Linkage Of The FDA’s Manufacturer And User Facility Device Experience (MAUDE) Files, Lisa Garnsey Ensign, K. Bretonnel Cohen 2017 University of Colorado, Denver, Anschutz Medical Campus

eGEMs (Generating Evidence & Methods to improve patient outcomes)

Introduction and Background: The US Food and Drug Administration (FDA)’s Manufacturer and User Facility Device Experience (MAUDE) database is a publicly available resource providing over 4 million records relating to medical device safety. Using downloadable MAUDE files avoids limitations of the online MAUDE search interface. However, naïve file usage can result in errors, while independent discovery of the nuances required to correctly work with the database can be time-consuming. Practical information is provided to shorten this learning curve and obtain accurate results when using the MAUDE database files.

MAUDE File Descriptions: The MAUDE database consists of 135 fields in ...


The Use And Effectiveness Of Online Social Media In Volunteer Organizations, Amy J. Connolly 2017 University of South Florida

Amy J Connolly

Volunteer organizations face two challenges not found in non-volunteer organizations: recruiting and retaining volunteers. While social media use is increasing amongst individuals, its use and effectiveness for volunteer recruitment and retention by volunteer organizations is unknown. The dissertation reports the results of three studies to investigate this important question. Using a mixed-methods approach, it addressed the dual nature of social media and its effectiveness by including volunteer organizations and social media users. This dissertation found that although volunteer organizations are not using social media effectively, they could virtualize requirements of the recruitment process by focusing on relatable events instead of ...


An Object Oriented Approach To Modeling And Simulation Of Routing In Large Communication Networks, Armin Mikler, Johnny S. Wong, Vasant Honavar 2017 Iowa State University

Johnny Wong

The complexity (number of entities, interactions between entities, and resulting emergent dynamic behavior) of large communication environments that contain hundreds of nodes and links makes simulation an important tool for the study of such systems. Given the difficulties associated with complete analytical treatment of complex dynamical systems, it is often the only practical tool that is available. This paper presents an example of a flexible, modular, object-oriented toolbox designed to support modeling and experimental analysis of a large family of heuristic knowledge representation and decision functions for adaptive self-managing communication networks with particular emphasis on routing strategies. It discusses in ...


Design And Implementation Of A Media Uploading System, Mu Zhang, Johnny S. Wong, Wallapak Tavanapong 2017 Iowa State University

Johnny Wong

This paper presents the design and performance analysis of an uploading system that automatically uploads multimedia files to a centralized server under hard client deadlines. If files are not uploaded by the deadlines, existing files may be lost or new files cannot be recorded. Uploading systems with hard deadlines have several important applications in practice. For instance, such systems can be used in hospitals to gather videos generated by medical devices in various operating rooms for post-procedure analysis and in law enforcement to collect video recordings from police cars during routine patrolling. In this paper, we study the uploading system with ...


EDS Usability Testing Final Report, 2017 Selected Works

Sally Krash

Usability studies in academic libraries are essential tools to assess functionality and accessibility of library services. The University of Massachusetts Amherst Libraries recently conducted usability studies on EBSCO’s Discovery search platform, which is to be the default search platform on the UMass Amherst Libraries’ website beginning in July 2017. During the spring of 2017, Information Resources Management utilized surveys, focus groups, and hands-on testing with students and faculty to assess how library patrons interacted with the new discovery service (EDS) and other related library services. The following report documents this usability study, its findings, and the resulting recommendations. The following ...


A Comparison Of Data Quality Assessment Checks In Six Data Sharing Networks, Tiffany J. Callahan, Alan E. Bauck, David Bertoch, Jeff Brown, Ritu Khare, Patrick B. Ryan, Jenny Staab, Meredith N. Zozus, Michael G. Kahn 2017 Computational Bioscience Program, University of Colorado Denver Anschutz Medical Campus

eGEMs (Generating Evidence & Methods to improve patient outcomes)

Objective: To compare rule-based data quality (DQ) assessment approaches across multiple national clinical data sharing organizations.

Methods: Six organizations with established data quality assessment (DQA) programs provided documentation or source code describing current DQ checks. DQ checks were mapped to the categories within the data verification context of the harmonized DQA terminology. To ensure all DQ checks were consistently mapped, conventions were developed and four iterations of mapping performed. Difficult-to-map DQ checks were discussed with research team members until consensus was achieved.

Results: Participating organizations provided 11,026 DQ checks, of which 99.97% were successfully mapped to a DQA ...


Selecting Link Resolver And Knowledge Base Software: Implications Of Interoperability, Cyndy Chisare, Jody C. Fagan, David J. Gaines, Michael Trocchia 2017 James Madison University

Libraries

Link resolver software and their associated knowledge bases are essential technologies for modern academic libraries. However, because of the increasing number of possible integrations involving link resolver software and knowledge bases, a library’s vendor relationships, product choices, and consortial arrangements may have the most dramatic effects on the user experience and back-end maintenance workloads. A project team at a large comprehensive university recently investigated link resolver products in an attempt to increase efficiency of back-end workflows while maintaining or improving the patron experience. The methodology used for product comparison may be useful for other libraries.


Analyzing The Keystroke Dynamics Of Web Identifiers, Andrew G. West 2017 University of Pennsylvania

Dr. Andrew G. West

Web identifiers such as usernames, hashtags, and domain names serve important roles in online navigation, communication, and community building. Therefore the entities that choose such names must ensure that end-users are able to quickly and accurately enter them in applications. Uniqueness requirements, a desire for short strings, and an absence of delimiters often constrain this name selection process.

To gain perspective on the speed and correctness of name entry, we crowdsource the typing of 51,000+ web identifiers. Surface level analysis reveals, for example, that typing speed is generally a linear function of identifier length. Examining keystroke dynamics at finer ...


An Open Source Discussion Group Recommendation System, Sarika Padmashali 2017 San Jose State University

Master's Projects

A recommendation system analyzes user behavior on a website to make suggestions about what a user should do in the future on the website. It basically tries to predict the “rating” or “preference” a user would have for an action. Yioop is an open source search engine, wiki system, and user discussion group system managed by Dr. Christopher Pollett at SJSU. In this project, we have developed a recommendation system for Yioop where users are given suggestions about the threads and groups they could join based on their user history. We have used collaborative filtering techniques to make recommendations and ...


Adding Differential Privacy In An Open Board Discussion Board System, Pragya Rana 2017 San Jose State University

Master's Projects

This project implements a privacy system for statistics generated by the Yioop search and discussion board system. Statistical data for such a system consists of various counts, sums, and averages that might be displayed for groups, threads, etc. When statistical data is made publicly available, there is no guarantee of preserving the privacy of an individual. Ideally, any data extracted should not reveal any sensitive information about an individual. In order to help achieve this, we implemented a Differential Privacy mechanism for Yioop. Differential privacy preserves privacy up to some controllable parameters of the number of items or individuals being ...
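
As a rough illustration of how a differentially private count can be released, the sketch below applies the Laplace mechanism, a standard textbook technique. The abstract does not say which mechanism the project used, so this choice, the function names `laplace_noise` and `private_count`, and the `epsilon` parameter are all assumptions for illustration and are not part of Yioop.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with epsilon-differential privacy (Laplace mechanism).

    Assumes adding or removing one individual changes the count by at most 1,
    so the sensitivity of the query is 1 (an assumption of this sketch).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    # Hypothetical statistic: number of posts in a discussion thread.
    print(private_count(120, epsilon=0.5, rng=rng))
```

A smaller `epsilon` adds more noise and gives stronger privacy; a larger one gives more accurate statistics.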


Document Classification Using Machine Learning, Ankit Basarkar 2017 San Jose State University

Master's Projects

To perform document classification algorithmically, documents need to be represented in a form that is understandable to the machine learning classifier. The report discusses the different types of feature vectors through which a document can be represented and later classified. The project aims at comparing the Binary, Count and TfIdf feature vectors and their impact on document classification. To test how well each of the three feature vectors performs, we used the 20-newsgroup dataset and converted the documents to all three feature vectors. For each feature vector representation, we trained the Naïve Bayes classifier and then tested the generated ...
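
The three representations named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: the whitespace tokenizer, the function names, and the unsmoothed weighting tf × log(N / df) are assumptions, and the report may use a different TfIdf variant.

```python
import math
from collections import Counter


def build_vocabulary(docs):
    """Sorted list of distinct lowercase tokens across all documents."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})


def count_vector(doc, vocab):
    """Count features: how many times each vocabulary term occurs."""
    counts = Counter(doc.lower().split())
    return [counts[term] for term in vocab]


def binary_vector(doc, vocab):
    """Binary features: 1 if the term occurs at all, else 0."""
    return [1 if c > 0 else 0 for c in count_vector(doc, vocab)]


def tfidf_vectors(docs, vocab):
    """TfIdf features: term count scaled by log(N / document frequency)."""
    n_docs = len(docs)
    count_vecs = [count_vector(doc, vocab) for doc in docs]
    # Every vocabulary term occurs in at least one document, so df >= 1.
    doc_freq = [sum(1 for vec in count_vecs if vec[i] > 0)
                for i in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    return [[c * w for c, w in zip(vec, idf)] for vec in count_vecs]


if __name__ == "__main__":
    # Toy corpus, invented for this example.
    docs = ["the cat sat", "the dog sat down", "the cat chased the dog"]
    vocab = build_vocabulary(docs)
    print(vocab)
    print(binary_vector(docs[2], vocab))
    print(count_vector(docs[2], vocab))
    print(tfidf_vectors(docs, vocab)[2])
```

Under this weighting, a term that appears in every document gets weight zero, which is why TfIdf can downplay common words that Binary and Count features treat as informative.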


Reducing Query Latency For Information Retrieval, Swapnil Satish Kamble 2017 San Jose State University

Master's Projects

As the world moves towards Big Data, NoSQL (Not only SQL) databases are gaining popularity. One of the key advantages of NoSQL databases is that they facilitate faster retrieval of huge volumes of data than traditional relational databases. This project deals with one such popular NoSQL database, Apache HBase. It performs quite efficiently when retrieving information using the rowkey (similar to a primary key in a SQL database). But in cases where one needs to get information based on non-rowkey columns, the response latency is higher than what we ...


A Chatbot Framework For Yioop, Harika Nukala 2017 San Jose State University

Master's Projects

Over the past few years, messaging applications have become more popular than social networking sites. Instead of using a specific application or website to access some service, chatbots are created on messaging platforms to allow users to interact with companies’ products and to give assistance as needed. In this project, we designed and implemented a chatbot framework for Yioop. The goal of the Chatbot Framework for Yioop project is to provide a platform for developers in Yioop to build and deploy chatbot applications. A chatbot is a web service that can converse with users using artificial intelligence in messaging platforms ...


Headline Generation Using Deep Neural Networks, Dhruven Vora 2017 San Jose State University

Master's Projects

News headline generation is one of the important text summarization tasks. Human-generated news headlines are generally intended to catch the eye rather than provide useful information. There have been many approaches to generating meaningful headlines using either neural networks or linguistic features. In this report, we propose a novel approach that integrates Hedge Trimmer, a grammar-based extractive summarization system, with a deep neural network abstractive summarization system to generate meaningful headlines. We analyze the results against a current recurrent-neural-network-based headline generation system.


Recognition, Internalization, Growth: Intuitive Design For Archival Representation, Jaime L. Ganzel 2017 Western Washington University

Graduate Student Conference

Although there is a pressing need for archival description and access systems to be more intuitive and user-friendly, the uniqueness of archival records presents significant barriers to establishing simplistic and standardized conventions for the representation of archival materials. Indecipherable finding aids and access tools prevent new and inexperienced researchers from accessing the unique information and documentation held in archives. This article aims to help open the archival record to new and non-traditional archival users, support individual development of archival literacy skills, and cultivate a greater level of archival awareness in our society by developing a usable model for archivists to ...


Mining Of Primary Healthcare Patient Data With Selective Multimorbid Diseases, Annette Megerdichian Azad 2017 The University of Western Ontario

Electronic Thesis and Dissertation Repository

Despite a large volume of research on the prognosis, diagnosis and overall burden of multimorbidity, very little is known about socio-demographic characteristics of multimorbid patients. This thesis aims to analyze the socio-demographic characteristics of patients with multiple chronic conditions (multimorbidity), focusing on patient groups sharing the same combination of diseases. Several methods were explored to analyze the co-occurrence of multiple chronic diseases as well as the associations between socio-demographics and chronic conditions. These methods include disease pair distributions over gender, age groups and income level quintiles, Multimorbidity Coefficients for measuring the concurrence of disease pairs and triples, and k-modes clustering ...


Exploiting Semantic Distance In Linked Open Data For Recommendation, Sultan Dawood Alfarhood 2017 University of Arkansas, Fayetteville

Theses and Dissertations

The use of Linked Open Data (LOD) has been explored in recommender systems in different ways, primarily through its graphical representation. The graph structure of LOD is utilized to measure inter-resource relatedness via their semantic distance in the graph. The intuition behind this approach is that the more connected resources are to each other, the more related they are. One drawback of this approach is that it treats all inter-resource connections identically rather than prioritizing links that may be more important in semantic relatedness calculations. Another drawback of current approaches is that they only consider resources that are connected directly ...

