Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

2472 Full-Text Articles, 2835 Authors, 432873 Downloads, 110 Institutions

All Articles in Databases and Information Systems

Latent Semantic Indexing In The Discovery Of Cyber-Bullying In Online Text, Jacob L. Bigelow 2016 Ursinus College

Computer Science Summer Fellows

The rise in the use of social media, and particularly in its use by adolescents, has led to a new means of bullying. Cyber-bullying has proven consequential for young internet users, creating a need for a response. To stop this problem effectively, we need a verified method of detecting cyber-bullying in online text; we aim to find that method. For this project, we look at thirteen thousand labeled posts from Formspring and create a bank of words used in the posts. First, the posts are cleaned up by taking out punctuation, normalizing emoticons, and removing high and low ...
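The cleanup step mentioned in this abstract can be sketched roughly as follows. This is an illustration only, not the author's code: the emoticon table, the placeholder tokens, and the function name `clean_post` are assumptions, and the filtering that the truncated abstract goes on to mention is not shown.

```python
import re

# Hypothetical emoticon table; the mapping used in the project is not given.
EMOTICONS = {
    ":)": "emoticon_happy",
    ":-)": "emoticon_happy",
    ":(": "emoticon_sad",
    ":-(": "emoticon_sad",
}

def clean_post(text):
    """Lowercase a post, normalize emoticons, and strip punctuation.

    Emoticons are handled before punctuation is removed, because
    stripping punctuation first would destroy them.
    """
    tokens = []
    for raw in text.lower().split():
        if raw in EMOTICONS:
            tokens.append(EMOTICONS[raw])
            continue
        word = re.sub(r"[^\w\s]", "", raw)  # drop punctuation characters
        if word:
            tokens.append(word)
    return tokens
```

The resulting token lists would be the input to a word bank or term-document matrix, which is the usual starting point for latent semantic indexing.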


Detection Of Cyberbullying In SMS Messaging, Bryan W. Bradley 2016 Ursinus College

Computer Science Summer Fellows

Cyberbullying is a type of bullying that uses technology such as cell phones to harass or malign another person. To detect acts of cyberbullying, we are developing an algorithm that will detect cyberbullying in SMS (text) messages. Over 80,000 text messages have been collected by software installed on cell phones carried by participants in our study. This paper describes the development of the algorithm to detect cyberbullying messages, using the cell phone data collected previously. The algorithm works by first separating the messages into conversations in an automated way. The algorithm then analyzes the conversations and scores the severity ...
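The abstract does not say how messages are separated into conversations. One common heuristic, shown here purely as an assumption, is to group messages by contact and start a new conversation whenever the gap since that contact's previous message exceeds a threshold. The function name, the tuple layout, and the 30-minute default are all hypothetical.

```python
from datetime import datetime, timedelta

def split_conversations(messages, max_gap=timedelta(minutes=30)):
    """Group (timestamp, contact, text) tuples into conversations.

    A message joins the contact's current conversation if it arrives
    within max_gap of that contact's previous message; otherwise it
    starts a new conversation. The time-gap rule is an assumption.
    """
    conversations = []
    last_seen = {}  # contact -> (last timestamp, conversation index)
    for ts, contact, text in sorted(messages, key=lambda m: m[0]):
        prev = last_seen.get(contact)
        if prev is not None and ts - prev[0] <= max_gap:
            idx = prev[1]
            conversations[idx].append((ts, contact, text))
        else:
            conversations.append([(ts, contact, text)])
            idx = len(conversations) - 1
        last_seen[contact] = (ts, idx)
    return conversations
```

Each resulting conversation could then be passed to a separate scoring step; the scoring itself is not sketched because the abstract is truncated before describing it.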


Analyzing Clustered Web Concepts With Homology, Eric Nam 2016 San Jose State University

Master's Projects

As data is being mined more and more from the Internet today, Data Science has become an important field of computing to make that data useful. Data Science allows people to turn all of that data into structured knowledge that is easily utilized, validated, and understandable. There are many known theories to analyze data, but this project will focus on a recently introduced method: analyzing text data with homology from mathematics to understand relationships between keyword-sets.

Using structures of algebraic topology as a starting point, keyword-sets in the text are represented by simplexes based on what they are and what ...


Factors That Affect Information And Communication Technology Adoption By Small Businesses In China, Jie Xiong, Sajda Qureshi, Lotfollah Najjar 2016 University of Nebraska at Omaha

Sajda Qureshi

Emerging economies appear to be powering growth in their regions. While China is seen to lead growth in the emerging markets of Asia, 98% of its manufacturing and production base is powered by small businesses. These businesses represent the majority of all businesses in emerging countries and their growth increases with their successful adoption of Information Technology. As the driving force behind the economic growth of China, Information and Communications Technologies (ICTs) are shaping the ways in which small businesses are able to grow. The majority of current research into the user acceptance and adoption of ICTs focusses on the ...


A Model Of ICTs Adoption For Sustainable Development: An Investigation Of Small Business In The United States And China, Jie Xiong, Sajda Qureshi 2016 University of Nebraska at Omaha

Sajda Qureshi

No abstract provided.


Can Information And Communication Technologies Lead To Community Capital? An Analysis Of Development, Dave Kocsis, Sajda Qureshi, Jie Xiong 2016 University of Nebraska at Omaha

Sajda Qureshi

While it is widely accepted that the increasing interconnectedness of the world economy has been fueled by the innovative uses of Information and Communication Technologies (ICTs), little attention has been paid to the increasing inequalities within developed and developing countries. These inequalities manifest themselves in the form of communities in which incomes are considerably below the rest of the country and there is a rise in poverty. This paper investigates this trend by taking a community capital perspective to investigate how ICTs may or may not enable businesses to grow. As micro-enterprises are seen to contribute to the growth of ...


Factors Affecting Information And Communications Technology Adoption Of Small Businesses: Studies In China And United States, Jie Xiong, Sajda Qureshi 2016 University of Nebraska at Omaha

Sajda Qureshi

Small businesses in China and the United States generate the largest share of economic activity and employment. As the driving force behind the economic growth of both countries, Information and Communications Technologies (ICTs) have fundamentally shaped the two countries. This research-in-progress paper reports the research model we use to analyze the factors that affect ICT adoption by small businesses in both countries. The purpose of the paper is to (1) report proposals of the current status of the research project, (2) build an understanding of ICT adoption in both countries, (3) build the framework to explore the relationship between ICTs ...


RASP-QS: Efficient And Confidential Query Services In The Cloud, Zohreh S. Alavi, Lu Zhou, James L. Powers, Keke Chen 2016 Wright State University - Main Campus

Keke Chen

Hosting data query services in public clouds is an attractive solution because of its great scalability and significant cost savings. However, data owners also have concerns about data privacy due to the loss of control over the infrastructure. This demonstration shows a prototype for efficient and confidential range/kNN query services built on top of the random space perturbation (RASP) method. The RASP approach provides a privacy guarantee practical to the setting of cloud-based computing, while enabling much faster query processing compared to the encryption-based approach. This demonstration will allow users to more intuitively understand the technical merits of the RASP approach ...


Two Roads, One Destination: A Journey Of Discovery, Karen Joc, Peta J. Hopkins, Jessie Donaghey, Wendy Abbott 2016 Bond University

Karen Joc

The adoption of resource discovery platforms has been a growing trend in libraries. However, few libraries have reported on the transition from one discovery layer to another, and only a few institutions have discussed two discovery layers available in the same institution at the same time. Bond University Library recently implemented Alma as its library management system, and with this change a new discovery platform, Primo, was implemented to supersede the existing Summon platform. This paper presents the results of a usability study undertaken at Bond University Library in the move from one discovery layer to another.


Using A Data Quality Framework To Clean Data Extracted From The Electronic Health Record: A Case Study, Oliwier Dziadkowiec, Tiffany Callahan, Mustafa Ozkaynak, Blaine Reeder, John Welton 2016 University of Colorado, College of Nursing, Anschutz Medical Campus

eGEMs (Generating Evidence & Methods to improve patient outcomes)

Objectives: Examine (1) the appropriateness of using a data quality (DQ) framework developed for relational databases as a data-cleaning tool for a dataset extracted from two EPIC databases; and (2) the differences in statistical parameter estimates between a dataset cleaned with the DQ framework and a dataset not cleaned with the DQ framework.

Background: The use of data contained within electronic health records (EHRs) has the potential to open doors for a new wave of innovative research. Without adequate preparation of such large datasets for analysis, the results might be erroneous, which might affect clinical decision making or results of Comparative ...


Exploring The Human Body Space: A Geographical Information System Based Anatomical Atlas, Antonio Barbeito, Marco Painho, Pedro Cabral, João Goyri O'Neill 2016 The University of Maine

Journal of Spatial Information Science

Anatomical atlases allow mapping the anatomical structures of the human body. Early versions of these systems consisted of analogical representations with informative text and labeled images of the human body. With computer systems, digital versions emerged and the third and fourth dimensions were introduced. Consequently, these systems increased their efficiency, allowing more realistic visualizations with improved interactivity and functionality. The 4D atlases allow modeling changes over time on the structures represented. The anatomical atlases based on geographic information system (GIS) environments allow the creation of platforms with a high degree of interactivity and new tools to explore and analyze the ...


A Context-Sensitive Conceptual Framework For Activity Modeling, Rahul Deb Das, Stephan Winter 2016 The University of Maine

Journal of Spatial Information Science

Human motion trajectories, however captured, provide a rich spatiotemporal data source for human activity recognition, and the rich literature in motion trajectory analysis provides the tools to bridge the gap between this data and its semantic interpretation. But activity is an ambiguous term across research communities. For example, in urban transport research, activities are generally characterized around certain locations, assuming the opportunities and resources are present in that location, and traveling happens between these locations for activity participation, i.e., travel is not an activity, but rather a means to overcome spatial constraints. In contrast, in human-computer interaction (HCI) research and ...


µ-Shapes: Delineating Urban Neighborhoods Using Volunteered Geographic Information, Matt Aadland, Christopher Farah, Kevin Magee 2016 The University of Maine

Journal of Spatial Information Science

Urban neighborhoods are a unique form of geography in that their boundaries rely on a social definition rather than a well-defined physical or administrative boundary. Currently, geographic gazetteers capture little more than the centroid of a neighborhood, limiting potential applications of the data. In this paper, we present µ-shapes, an algorithm that employs fuzzy-set theory to model neighborhood boundaries suitable for populating gazetteers using volunteered geographic information (VGI). The algorithm is evaluated using a reference dataset and VGI from the Map Kibera Project. A confusion matrix comparison between the reference dataset and µ-shape's output demonstrated high sensitivity and ...
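For readers unfamiliar with the evaluation measure named here, sensitivity is the share of reference-positive cases that the output also marks positive, TP / (TP + FN). The sketch below computes it from two aligned boolean sequences; the function names and the cell-by-cell framing are assumptions for illustration and say nothing about how the paper built its confusion matrix.

```python
def confusion_counts(reference, predicted):
    """Return (tp, fp, fn, tn) for two aligned boolean sequences."""
    tp = fp = fn = tn = 0
    for ref, pred in zip(reference, predicted):
        if ref and pred:
            tp += 1
        elif pred:
            fp += 1
        elif ref:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("reference contains no positive cases")
    return tp / (tp + fn)
```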


Creating The 2011 Area Classification For Output Areas (2011 OAC), Christopher G. Gale, Alexander D. Singleton, Andrew G. Bates, Paul A. Longley 2016 The University of Maine

Journal of Spatial Information Science

This paper presents the methodology that has been used to create the 2011 Area Classification for Output Areas (2011 OAC). This extends a lineage of widely used public domain census-only geodemographic classifications in the UK. It provides an update to the successful 2001 OAC methodology, and summarizes the social and physical structure of neighborhoods using data from the 2011 UK Census. The results of a user engagement exercise that underpinned the creation of an updated methodology for the 2011 OAC are also presented. The 2011 OAC comprises 8 Supergroups, 26 Groups, and 76 Subgroups. An example of the results of ...


Analyze Large Multidimensional Datasets Using Algebraic Topology, David Le 2016 San Jose State University

Master's Projects

This paper presents an efficient algorithm to extract knowledge from high-dimensionality, high-complexity datasets using algebraic topology, namely simplicial complexes. Based on the concept of isomorphism of relations, our method turns a relational table into a geometric object (a simplicial complex is a polyhedron). Conceptually, association rule searching is thus turned into a geometric traversal problem. By leveraging the core concepts behind simplicial complexes, we use a technique (new in computer science) that improves performance over existing methods and uses far less memory. It was designed and developed with a strong emphasis on scalability, reliability, and extensibility. This paper ...
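One way to picture the table-to-complex idea is to treat each attribute-value pair as a vertex and each row as a simplex spanning its vertices, so that the support of an itemset becomes the number of row simplices containing the corresponding face. The sketch below is a naive enumeration under that reading, not the paper's algorithm; `face_support`, the `col=val` vertex labels, and the `max_dim` cutoff are assumptions.

```python
from collections import Counter
from itertools import combinations

def face_support(rows, max_dim=2):
    """Count how many row simplices contain each face.

    rows: list of dicts mapping column name -> value.
    A face of dimension d has d + 1 vertices, so faces with up to
    max_dim + 1 vertices are enumerated.
    """
    support = Counter()
    for row in rows:
        vertices = sorted(f"{col}={val}" for col, val in row.items())
        for k in range(1, max_dim + 2):
            for face in combinations(vertices, k):
                support[face] += 1
    return support
```

Faces whose count meets a minimum support would correspond to frequent itemsets. This brute-force version is exponential in `max_dim` and is meant only to make the correspondence concrete.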


Musictrakr, Benjamin Lin 2016 California Polytechnic State University, San Luis Obispo

Computer Engineering

MusicTrackr is an IoT device that musicians attach to their instruments. The device has a start and stop button that allows users to record their playing sessions. Each recorded session is sent wirelessly to a cloud database. An accompanying website displays all of the recorded sessions, organized by date. After picking a specific date, the user can view graphs showing total practice time and average session length, as well as play back any recordings from that date. In addition, the user may add comments to any specific date or recording. Lastly, the user may tag a specific date with a color ...


Hybrid Similarity Function For Big Data Entity Matching With R-Swoosh, Vimal Chandra Gorijala 2016 San Jose State University

Master's Projects

Entity Matching (EM) is the problem of determining if two entities in a data set refer to the same real-world object. For example, it decides if two given mentions in the data, such as “Helen Hunt” and “H. M. Hunt”, refer to the same real-world entity by using different similarity functions. This problem plays a key role in information integration, natural language understanding, information processing on the World-Wide Web, and on the emerging Semantic Web. This project deals with the similarity functions and thresholds utilized in them to determine the similarity of the entities. The work contains two major parts ...
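To make the role of similarity functions and thresholds concrete, here is a small sketch that blends a token-overlap score with a character-level score and compares the result to a cutoff. It is not the project's hybrid function, which the truncated abstract does not describe; the weighting, the 0.6 threshold, and all function names are assumptions.

```python
from difflib import SequenceMatcher

def _tokens(name):
    """Lowercased name tokens with surrounding periods removed."""
    return {t.strip(".").lower() for t in name.split() if t.strip(".")}

def jaccard(a, b):
    """Token-set overlap: |A & B| / |A | B|."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def edit_ratio(a, b):
    """Character-level similarity in [0, 1] from difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def hybrid_similarity(a, b, weight=0.5):
    """Weighted blend of the two scores (weights are an assumption)."""
    return weight * jaccard(a, b) + (1 - weight) * edit_ratio(a, b)

def is_match(a, b, threshold=0.6):
    """Declare a match when the blended score reaches the threshold."""
    return hybrid_similarity(a, b) >= threshold
```

For a pair such as "Helen Hunt" and "H. M. Hunt", the token score alone is low because only one token is shared, which is the kind of case where the choice of functions and thresholds decides the outcome.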


Efficient Pair-Wise Similarity Computation Using Apache Spark, Parineetha Gandhi Tirumali 2016 San Jose State University

Master's Projects

Entity matching is the process of identifying different manifestations of the same real-world entity. These entities can be referred to as objects (strings) or data instances. These entities are in turn split over several databases or clusters based on the signatures of the entities. When entity matching algorithms are performed on these databases or clusters, there is a high possibility that a particular entity pair is compared more than once. The number of comparisons for any two entities depends on the number of common signatures or keys they possess. This affects the performance of any entity matching algorithm. This ...
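The redundancy described here is easy to see in a toy setting: if two entities share several signatures, they land in several blocks and a block-by-block comparison visits the pair once per shared signature. The sketch below counts those naive comparisons and then collects each pair once. It is plain single-machine Python rather than Apache Spark, and the function names and the dict-of-lists block layout are assumptions.

```python
from itertools import combinations

def naive_comparison_count(blocks):
    """Comparisons made if every block is processed independently."""
    total = 0
    for members in blocks.values():
        n = len(set(members))
        total += n * (n - 1) // 2
    return total

def candidate_pairs(blocks):
    """Unique entity pairs that share at least one signature.

    blocks: dict mapping signature -> list of entity ids.
    Pairs are stored in sorted order so (a, b) and (b, a) coincide.
    """
    pairs = set()
    for members in blocks.values():
        for a, b in combinations(sorted(set(members)), 2):
            pairs.add((a, b))
    return pairs
```

The gap between `naive_comparison_count(blocks)` and `len(candidate_pairs(blocks))` is the number of repeated comparisons that deduplication avoids.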


Library Writers Reward Project, Saravana Kumar Gajendran 2016 San Jose State University

Master's Projects

Open-source library development exploits the distributed intelligence of participants in Internet communities. Nowadays, contribution to the open-source community is fading [16] (Stackalytics, 2016) as there is not much recognition for library writers. They can start exploring ways to generate revenue as they actively contribute to the open-source community.

This project helps library writers to generate revenue in the form of bitcoins for their contribution. Our solution to generate revenue for library writers is to integrate bitcoin mining with existing JavaScript libraries, such as jQuery. More use of the library leads to more revenue for the library writers. It uses the ...


Visual Knowledge Representation Of Conceptual Semantic Networks, Leyla Zhuhadar, Olfa Nasraoui, Robert Wyatt, Rong Yang 2016 Western Kentucky University

Leyla Zhuhadar

This article presents methods of using visual analysis to represent large amounts of dynamic, ambiguous data held in a repository of learning objects. These methods are based on the semantic representation of these resources. We use a graphical model represented as a semantic graph. The formalization of the semantic graph has been intuitively built to solve a real problem: browsing and searching for lectures in a vast repository of colleges/courses located at Western Kentucky University. This study combines Formal Concept Analysis (FCA) with Semantic Factoring to decompose complex, vast concepts into their primitives in order ...


Digital Commons powered by bepress