Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

2017

Articles 1 - 30 of 385

Full-Text Articles in Physical Sciences and Mathematics

Breadcrumbs: Privacy As A Privilege, Prachi Bhardwaj Dec 2017

Capstones

Breadcrumbs: Privacy as a Privilege Abstract

By: Prachi Bhardwaj

In 2017, the world saw more data breaches than in any prior year. The count exceeded the previous all-time high, set in 2016, which was itself 40 percent higher than the year before.

That’s because consumer data is incredibly valuable today. In the last three decades, data storage has shifted from physical to almost entirely digital, which means consumer data is more accessible and applicable to business strategies. As a result, companies are gathering data in ways previously unknown to the average consumer, and hackers are …


Introduction To The USU Library Of Solutions To The Einstein Field Equations, Ian M. Anderson, Charles G. Torre Dec 2017

Tutorials on... in 1 hour or less

This is a Maple worksheet providing an introduction to the USU Library of Solutions to the Einstein Field Equations. The library is part of the DifferentialGeometry software project and is a collection of symbolic data and metadata describing solutions to the Einstein equations.


Proactive Sequential Resource (Re)Distribution For Improving Efficiency In Urban Environments, Supriyo Ghosh Dec 2017

Dissertations and Theses Collection (Open Access)

Due to the increasing population and lack of coordination, there is a mismatch in supply and demand of common resources (e.g., shared bikes, ambulances, taxis) in urban environments, which has deteriorated a wide variety of quality of life metrics such as success rate in issuing shared bikes, response times for emergency needs, waiting times in queues etc. Thus, in my thesis, I propose efficient algorithms that optimise the quality of life metrics by proactively redistributing the resources using intelligent operational (day-to-day) and strategic (long-term) decisions in the context of urban transportation and health & safety. For urban transportation, Bike Sharing …


Altering The Expression Of Artemisinin Through Osmotic Manipulation, Tyler Friesen Dec 2017

Theses/Capstones/Creative Projects

Artemisinin is an anti-malarial drug used in combination therapy to treat all malarial parasites in the blood stage. The expression of artemisinin within the plant Artemisia annua is only 1% of the dry weight. Methods for increasing the level of artemisinin within the plant were proposed. This paper looks into finding homologous enzymes across multiple species in order to find species where genetic manipulations will be useful. The second part of this paper looks at the use of osmotic stress to increase the reactive oxygen species in order to increase the amount of artemisinin within the plant. The database portion …


Ethics And Bias In Machine Learning: A Technical Study Of What Makes Us “Good”, Ashley Nicole Shadowen Dec 2017

Student Theses

The topic of machine ethics is growing in recognition and energy, but bias in machine learning algorithms outpaces it to date. Bias is a complicated term with good and bad connotations in the field of algorithmic prediction making. Especially in circumstances with legal and ethical consequences, we must study the results of these machines to ensure fairness. This paper attempts to address ethics at the algorithmic level of autonomous machines. There is no one solution to machine bias; it depends on the context of the given system and the most reasonable way to avoid biased decisions while maintaining the …


Online Learning With Nonlinear Models, Doyen Sahoo Dec 2017

Dissertations and Theses Collection (Open Access)

Recent years have witnessed the success of two broad categories of machine learning algorithms: (i) Online Learning; and (ii) Learning with nonlinear models. Typical machine learning algorithms assume that the entire data is available prior to the training task. This is often not the case in the real world, where data often arrives sequentially in a stream, or is too large to be stored in memory. To address these challenges, Online Learning techniques evolved as a promising solution to having highly scalable and efficient learning methodologies which could learn from data arriving sequentially. Next, as the real world data exhibited …


Design And Implementation Of A Stand-Alone Tool For Metabolic Simulations, Milad Ghiasi Rad Dec 2017

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

In this thesis, we present the design and implementation of a stand-alone tool for metabolic simulations. The system integrates custom-built SBML models with external user input and produces estimates of any reactants participating in the chain of reactions in the provided model (e.g., ATP, glucose, insulin) for a given duration using numerical analysis and simulations. The tool accepts food intake arguments in the calculations to account for personalized metabolic characteristics in the simulations. The tool has also been generalized to take temporal genomic information into consideration and to be flexible for simulation …


Secure Server-Aided Top-K Monitoring, Yujue Wang, Hwee Hwa Pang, Yanjiang Yang, Xuhua Ding Dec 2017

Research Collection School Of Computing and Information Systems

In a data streaming model, a data owner releases records or documents to a set of users with matching interests, in such a way that the match in interest can be calculated from the correlation between each pair of document and user query. For scalability and availability reasons, this calculation is delegated to third-party servers, which gives rise to the need to protect the integrity and privacy of the documents and user queries. In this paper, we propose a server-aided data stream monitoring scheme (DSM) to address the aforementioned integrity and privacy challenges, so that the users are able to …


Using Data Analytics For Discovering Library Resource Insights: Case From Singapore Management University, Ning Lu, Rui Song, Dina Li Gwek Heng, Swapna Gottipati, Aaron Tay Dec 2017

Research Collection School Of Computing and Information Systems

Library resources are critical in supporting teaching, research and learning processes. Several universities have employed online platforms and infrastructure to enable online services for students, faculty and staff. To provide efficient services by understanding and predicting user needs, libraries are looking into the area of data analytics. Library analytics at Singapore Management University is a project committed to providing an interface for data-intensive project collaboration, while supporting one of the library’s key pillars on its commitment to collaborate on initiatives with SMU Communities and external groups. In this paper, we study the transaction logs for user behavior analysis that …


Disease Gene Classification With Metagraph Representations, Sezin Kircali Ata, Yuan Fang, Min Wu, Xiao-Li Li, Xiaokui Xiao Dec 2017

Research Collection School Of Computing and Information Systems

Protein-protein interaction (PPI) networks play an important role in studying the functional roles of proteins, including their association with diseases. However, protein interaction networks are not sufficient without the support of additional biological knowledge for proteins such as their molecular functions and biological processes. To complement and enrich PPI networks, we propose to exploit biological properties of individual proteins. More specifically, we integrate keywords describing protein properties into the PPI network, and construct a novel PPI-Keywords (PPIK) network consisting of both proteins and keywords as two different types of nodes. As disease proteins tend to have similar topological characteristics …


Robust Human Activity Recognition Using Lesser Number Of Wearable Sensors, Di Wang, Edwin Candinegara, Junhui Hou, Ah-Hwee Tan, Chunyan Miao Dec 2017

Research Collection School Of Computing and Information Systems

In recent years, research on the recognition of human physical activities solely using wearable sensors has received more and more attention. Compared to other types of sensory devices such as surveillance cameras, wearable sensors are preferred in most activity recognition applications mainly due to their non-intrusiveness and pervasiveness. However, many existing activity recognition applications or experiments using wearable sensors were conducted in confined laboratory settings using specifically developed gadgets. These gadgets may be useful for a small group of people in certain specific scenarios, but will probably not gain popularity because they introduce additional costs and they are …


Using Teaching Cases For Achieving Bloom’s High-Order Cognitive Levels: An Application In Technically-Oriented Information Systems Course, Kar Way Tan Dec 2017

Research Collection School Of Computing and Information Systems

Case-teaching has been an attractive pedagogical method for bringing real-world examples into the classroom. However, it is challenging to introduce cases that address high-order cognitive skills, such as analyzing and creating new IT solutions, in a technically-oriented computing course. In this research, we present our experience in introducing three types of case studies (Story-Telling case, Design-and-Problem-Solving case, and Create-Design-Implement case) to a course in an undergraduate Information Systems programme. For each case study, we plan and map the learning objectives to address various cognitive levels in the revised Bloom’s Taxonomy. Using surveys conducted over two academic years, we show …


Inferring Social Media Users’ Demographics From Profile Pictures: A Face++ Analysis On Twitter Users, Soon-Gyo Jung, Jisun An, Haewoon Kwak, Joni Salminen, Bernard J. Jansen Dec 2017

Research Collection School Of Computing and Information Systems

In this research, we evaluate the applicability of using facial recognition of social media account profile pictures to infer the demographic attributes of gender, race, and age of the account owners leveraging a commercial and well-known image service, specifically Face++. Our goal is to determine the feasibility of this approach for actual system implementation. Using a dataset of approximately 10,000 Twitter profile pictures, we use Face++ to classify this set of images for gender, race, and age. We determine that about 30% of these profile pictures contain identifiable images of people using the current state-of-the-art automated means. We then employ …


Policy Analytics For Environmental Sustainability: Household Hazardous Waste And Water Impacts Of Carbon Pollution Standards, Kustini Dec 2017

Dissertations and Theses Collection (Open Access)

Policy analytics are essential in supporting more informed policy-making in environmental management. This dissertation employs a fusion of machine methods and explanatory empiricism that involves data analytics, math programming, optimization, econometrics, geospatial and spatiotemporal analysis, and other approaches for assessing and evaluating current and future environmental policies.
Essay 1 discusses household informedness and its impact on the collection and recycling of household hazardous waste (HHW). Household informedness is the degree to which households have the necessary information to make utility-maximizing decisions about the handling of their waste. Such informedness seems to be influenced by HHW public education and environmental quality …


On Modeling Sense Relatedness In Multi-Prototype Word Embedding, Yixin Cao, Juanzi Li, Jiaxin Shi, Zhiyuan Liu, Chengjiang Li Dec 2017

Research Collection School Of Computing and Information Systems

To enhance the expressive ability of distributional word representation learning models, many researchers induce word senses through clustering and learn multiple embedding vectors for each word, an approach known as the multi-prototype word embedding model. However, most related work ignores the relatedness among word senses, which actually plays an important role. In this paper, we propose a novel approach to capture word sense relatedness in the multi-prototype word embedding model. In particular, we differentiate the original sense and extended senses of a word by introducing their global occurrence information, and model their relatedness through the local textual context information. Based on the idea of …


Enhanced Version Control For Unconventional Applications, Ahmed Saleh Shatnawi Dec 2017

Theses and Dissertations

The Extensible Markup Language (XML) is widely used to store, retrieve, and share digital documents. Recently, a form of Version Control System has been applied to the language, resulting in Version-Aware XML, which allows for enhanced portability and scalability. While Version Control Systems are able to keep track of changes made to documents, we think that there is untapped potential in the technology. In this dissertation, we present novel ways of using Version Control Systems to enhance the security and performance of existing applications. We present a framework to maintain integrity in offline XML documents and provide non-repudiation security features that …


Utilizing Consumer Health Posts For Pharmacovigilance: Identifying Underlying Factors Associated With Patients’ Attitudes Towards Antidepressants, Maryam Zolnoori Dec 2017

Theses and Dissertations

Non-adherence to antidepressants is a major obstacle to their therapeutic benefits, resulting in increased risk of relapse, emergency visits, and significant burden on individuals and the healthcare system. Several studies showed that non-adherence is weakly associated with personal and clinical variables, but strongly associated with patients’ beliefs and attitudes towards medications. The traditional methods for identifying the key dimensions of patients’ attitudes towards antidepressants have some methodological limitations, such as concern about confidentiality of personal information. In this study, attempts have been made to address these limitations by utilizing patients’ self-reported experiences in online healthcare forums to …


D-Watch: Embracing “Bad” Multipaths For Device-Free Localization With COTS RFID Devices, Ju Wang, Jie Xiong, Hongbo Jiang, Xiaojiang Chen, Dingyi Fang Dec 2017

Research Collection School Of Computing and Information Systems

Device-free localization, which does not require any device attached to the target, plays a critical role in many applications, such as intrusion detection and elderly monitoring. This paper introduces D-Watch, a device-free system built on top of low-cost commodity-off-the-shelf RFID hardware. Unlike previous works, which consider multipaths detrimental, D-Watch leverages the “bad” multipaths to provide decimeter-level localization accuracy without offline training. D-Watch harnesses the angle-of-arrival information from the RFID tags' backscatter signals. The key intuition is that whenever a target blocks a signal's propagation path, the signal power experiences a drop which can be …


Leveraging Auxiliary Tasks For Document-Level Cross-Domain Sentiment Classification, Jianfei Yu, Jing Jiang Dec 2017

Research Collection School Of Computing and Information Systems

In this paper, we study domain adaptation with a state-of-the-art hierarchical neural network for document-level sentiment classification. We first design a new auxiliary task based on sentiment scores of domain-independent words. We then propose two neural network architectures to respectively induce document embeddings and sentence embeddings that work well for different domains. When these document and sentence embeddings are used for sentiment classification, we find that with both pseudo and external sentiment lexicons, our proposed methods can perform similarly to or better than several highly competitive domain adaptation methods on a benchmark dataset of product reviews.


BTCI: A New Framework For Identifying Congestion Cascades Using Bus Trajectory Data, Meng-Fen Chiang, Ee Peng Lim, Wang-Chien Lee, Agus Trisnajaya Kwee Dec 2017

Research Collection School Of Computing and Information Systems

Knowledge of traffic health status is essential to the general public and urban traffic management. To identify congestion cascades, an important phenomenon of traffic health, we propose a Bus Trajectory based Congestion Identification (BTCI) framework that explores the anomalous traffic health status and structure properties of congestion cascades using bus trajectory data. BTCI consists of two main steps: congested segment extraction and congestion cascades identification. The former constructs path speed models from historical vehicle transitions and designs a non-parametric Kernel Density Estimation (KDE) function to derive a measure of congestion score. The latter aggregates congested segments (i.e., those with …


Analyzing The E-Learning Video Environment Requirements Of Generation Z Students Using Echo360 Platform, Swapna Gottipati, Venky Shankararaman Dec 2017

Research Collection School Of Computing and Information Systems

As with any other generational cohort, Generation Z students have their own unique characteristics that influence their approach to the learning process. They are the future workforce, and several efforts are being undertaken by Government and education institutes to consider the characteristics of Gen-Z in developing the curriculum and teaching environment suitable for these students. E-learning plays a key role in students’ learning process and has been widely adopted by many education institutions. In particular, videos play a major role in the learning process of Gen-Z students. The purpose of this paper is to focus on the requirements of Gen-Z students and to provide suggestions for how to create an e-learning video …


Leveraging The Trade-Off Between Accuracy And Interpretability In A Hybrid Intelligent System, Di Wang, Chai Quek, Ah-Hwee Tan, Chunyan Miao, Geok See Ng, You Zhou Dec 2017

Research Collection School Of Computing and Information Systems

The Neural Fuzzy Inference System (NFIS) is a widely adopted paradigm for developing data-driven learning systems. This hybrid system has been widely adopted due to its accurate reasoning procedure and comprehensible inference rules. Although most NFISs primarily focus on accuracy, we have observed an ever-increasing demand for improving the interpretability of NFISs and other types of machine learning systems. In this paper, we illustrate how we leverage the trade-off between accuracy and interpretability in an NFIS called Genetic Algorithm and Rough Set Incorporated Neural Fuzzy Inference System (GARSINFIS). In a nutshell, GARSINFIS self-organizes its network structure with a small …


Who Are Your Users? Comparing Media Professionals' Preconception Of Users To Data-Driven Personas, Lene Nielsen, Soon-Gyu Jung, Jisun An, Joni Salminen, Haewoon Kwak, Bernard J. Jansen Dec 2017

Research Collection School Of Computing and Information Systems

One of the reasons for using personas is to align user understandings across project teams and sites. As part of a larger persona study, at Al Jazeera English (AJE), we conducted 16 qualitative interviews with media producers, the end users of persona descriptions. We asked the participants about their understanding of a typical AJE media consumer, and the variety of answers shows that the understandings are not aligned and are built on a mix of own experiences, own self, assumptions, and data given by the company. The answers are sometimes aligned with the data-driven personas and sometimes not. The end …


A Novel Density Peak Clustering Algorithm Based On Squared Residual Error, Milan Parmar, Di Wang, Ah-Hwee Tan, Chunyan Miao, Jianhua Jiang, You Zhou Dec 2017

Research Collection School Of Computing and Information Systems

The density peak clustering (DPC) algorithm is designed to quickly identify intricate-shaped clusters with high dimensionality by finding high-density peaks in a non-iterative manner and using only one threshold parameter. However, DPC has certain limitations in processing low-density data points because it only takes the global data density distribution into account. As such, DPC may be limited in forming low-density data clusters; in other words, DPC may fail to detect anomalies and borderline points. In this paper, we analyze the limitations of DPC and propose a novel density peak clustering algorithm to better handle low-density clustering tasks. Specifically, our algorithm …


The Graph Database: Jack Of All Trades Or Just Not SQL?, George F. Hurlburt, Maria R. Lee, George K. Thiruvathukal Nov 2017

Computer Science: Faculty Publications and Other Works

This special issue of IT Professional focuses on the graph database. The graph database, a relatively new phenomenon, is well suited to the burgeoning information era in which we are increasingly becoming immersed. Here, the guest editors briefly explain how a graph database works, its relation to the relational database management system (RDBMS), and its quantitative and qualitative pros and cons, including how graph databases can be harnessed in a hybrid environment. They also survey the excellent articles submitted for this special issue.


NBPMF: Novel Network-Based Inference Methods For Peptide Mass Fingerprinting, Zhewei Liang Nov 2017

Electronic Thesis and Dissertation Repository

Proteins are large, complex molecules that perform a vast array of functions in every living cell. A proteome is a set of proteins produced in an organism, and proteomics is the large-scale study of proteomes. Several high-throughput technologies have been developed in proteomics, of which the most commonly applied are mass spectrometry (MS) based approaches. MS is an analytical technique for determining the composition of a sample. Recently it has become a primary tool for protein identification, quantification, and post-translational modification (PTM) characterization in proteomics research. There are usually two different ways to identify proteins: top-down and bottom-up. Top-down approaches …


Multi-Step Tokenization Of Automated Clearing House Payment Transactions, Privin Alexander Nov 2017

USF Tampa Graduate Theses and Dissertations

Since its beginnings in 1974, the Automated Clearing House (ACH) network has grown into one of the largest, safest, and most efficient payment systems in the world. An ACH transaction is an electronic funds transfer between bank accounts using a batch processing system.

Currently, the ACH Network moves almost $43 trillion and 25 billion electronic financial transactions each year. With the increasing movement toward an electronic, interconnected and mobile infrastructure, it is critical that electronic payments work safely and efficiently for all users. ACH transactions carry sensitive data, such as a consumer's name, account number, tax identification number, account holder …


A Study On The Practical Use Of Operations Research And Vessels Big Data In Benefit Of Efficient Ports Utilization In Panama, Gabriel Fuentes Lezcano Nov 2017

World Maritime University Dissertations

No abstract provided.


Constructing A Clinical Research Data Management System, Michael C. Quintero Nov 2017

USF Tampa Graduate Theses and Dissertations

Clinical study data is usually collected without knowing in advance what kind of data will be collected. In addition, the set of all possible data points that can apply to a patient in any given clinical study is almost always a superset of the data points that are actually recorded for a given patient. As a result, clinical data resembles a set of sparse data with an evolving data schema. To help researchers at the Moffitt Cancer Center better manage clinical data, a tool called GURU was developed that uses the Entity Attribute Value model to handle …


Database Usability Enhancement In Data Exploration, Yue Wang Nov 2017

Doctoral Dissertations

Database usability has become an important research topic over the last decade. In the early days, database management systems were maintained by sophisticated users like database administrators. Today, due to the availability of data and computing resources, more non-expert users are involved in database computation. From their point of view, database systems lack ease of use. Researchers therefore believe that usability is as important as the performance and functionality of databases, and have developed many techniques, such as natural language interfaces, to enhance the ease of use of databases. In this thesis, we find some deeper technical issues in database …