Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

6,554 Full-Text Articles 8,879 Authors 3,346,977 Downloads 209 Institutions

All Articles in Databases and Information Systems

Detect Rumor And Stance Jointly By Neural Multi-Task Learning, Jing MA, Wei GAO, Kam-Fai WONG 2018 Singapore Management University

Research Collection School Of Computing and Information Systems

In recent years, an unhealthy phenomenon characterized as the massive spread of fake news or unverified information (i.e., rumors) has become an increasingly daunting issue in human society. Rumors commonly originate from social media outlets, primarily microblogging platforms, and then go viral through wild, willful propagation by a large number of participants. It is observed that rumorous posts often trigger diverse, mostly controversial stances among participating users. Thus, determining the stances on the posts in question can be pertinent to the successful detection of rumors, and vice versa. Existing studies, however, mainly regard rumor detection and stance classification as …


A Novel Representation And Compression For Queries On Trajectories In Road Networks, Xiaochun YANG, Bin WANG, Kai YANG, Chengfei LIU, Baihua ZHENG 2018 Northeastern University

Research Collection School Of Computing and Information Systems

Recording and querying time-stamped trajectories incurs a high cost of data storage and computing. In this paper, we explore several characteristics of the trajectories in road networks, which have motivated the idea of coding trajectories by associating timestamps with relative spatial path and locations. Such a representation contains a large amount of duplicate information and so achieves a lower entropy than the existing representations, thereby drastically cutting the storage cost. We propose several techniques to compress spatial path and locations separately, which can support fast positioning and achieve a better compression ratio. For locations, we propose two novel encoding schemes such that the …


Continuous Top-K Monitoring On Document Streams (Extended Abstract), Leong Hou U, Junjie ZHANG, Kyriakos MOURATIDIS, Ye LI 2018 Singapore Management University

Research Collection School Of Computing and Information Systems

The efficient processing of document streams plays an important role in many information filtering systems. Emerging applications, such as news update filtering and social network notifications, demand presenting end-users with the content most relevant to their preferences. In this work, user preferences are indicated by a set of keywords. A central server monitors the document stream and continuously reports to each user the top-k documents that are most relevant to her keywords. The objective is to support large numbers of users and high stream rates, while refreshing the top-k results almost instantaneously. Our solution abandons the traditional frequency-ordered indexing approach, …


A Data-Driven Analysis Of Workers' Earnings On Amazon Mechanical Turk, Kotaro HARA, Abigail ADAMS, Kristy MILLAND, Saiph SAVAGE, Chris CALLISON-BURCH, Jeffrey P. BIGHAM 2018 Singapore Management University

Research Collection School Of Computing and Information Systems

A growing number of people are working as part of on-line crowd work. Crowd work is often thought to be low wage work. However, we know little about the wage distribution in practice and what causes low/high earnings in this setting. We recorded 2,676 workers performing 3.8 million tasks on Amazon Mechanical Turk. Our task-level analysis revealed that workers earned a median hourly wage of only ~$2/h, and only 4% earned more than $7.25/h. While the average requester pays more than $11/h, lower-paying requesters post much more work. Our wage calculations are influenced by how unpaid work is accounted for, …


The Role Of Urban Mobility In Retail Business Survival, Krittika D'SILVA, Kasthuri JAYARAJAH, Anastasios NOULAS, Cecilia MASCOLO, Archan MISRA 2018 University of Cambridge

Research Collection School Of Computing and Information Systems

Economic and urban planning agencies have a strong interest in tackling the hard problem of predicting the odds of survival of individual retail businesses. In this work, we tap urban mobility data available both from a location-based intelligence platform, Foursquare, and from public transportation agencies, and investigate whether mobility-derived features can help foretell the failure of such retail businesses, over a 6-month horizon, across 10 distinct cities spanning the globe. We hypothesise that the survival of such a retail outlet is correlated with not only venue-specific characteristics but also broader neighbourhood-level effects. Through careful statistical analysis of Foursquare and taxi …


Social Network Monitoring For Bursty Cascade Detection, Wei XIE, Feida ZHU, Jing XIAO, Jianzong WANG 2018 Singapore Management University

Research Collection School Of Computing and Information Systems

Social network services have become important and efficient platforms for users to share all kinds of information. The capability to monitor user-generated information and detect bursts from information diffusions in these social networks brings value to a wide range of real-life applications, such as viral marketing. However, in reality, a third party always incurs a cost for gathering information from each user, or so-called social network sensor. The question then arises of how to select a budgeted set of social network sensors to form the data stream for burst detection without compromising the detection performance. In this article, we …


PCCF: Periodic And Continual Temporal Co-Factorization For Recommender Systems, Guibing GUO, Feida ZHU, Shilin QU, Xingwei WANG 2018 Singapore Management University

Research Collection School Of Computing and Information Systems

Rating-only collaborative filtering has been extensively studied for decades with great improvements achieved in predicting a user’s preference on a target item at a particular time point. Yet, it remains a research challenge to capture users’ rating patterns, which may drift over time. In this article, we propose a time-aware matrix co-factorization model, called PCCF, which considers two types of temporal effects, i.e., periodic and continual. Specifically, periodic effects refer to the impact of discrete periodic time slices with which users’ preferences may be associated, and continual effects refer to the impact of continuous gradual time over which …


Distributed Multi-Task Classification: A Decentralized Online Learning Approach, Chi ZHANG, Peilin ZHAO, Shuji HAO, Yeng Chai SOH, Bu Sung LEE, Chunyan MIAO, Steven C. H. HOI 2018 Nanyang Technological University

Research Collection School Of Computing and Information Systems

Although dispersing a single task to distributed learning nodes has been intensively studied by previous research, multi-task learning on distributed networks is still an area that has not been fully exploited, especially under decentralized settings. The challenge lies in the fact that different tasks may have different optimal learning weights while communication through the distributed network forces all tasks to converge to a unique classifier. In this paper, we present a novel algorithm to overcome this challenge and enable learning multiple tasks simultaneously on a decentralized distributed network. Specifically, the learning framework can be separated into two phases: (i) …


Domain-Specific Cross-Language Relevant Question Retrieval, Bowen XU, Zhenchang XING, Xin XIA, David LO, Shanping LI 2018 Singapore Management University

Research Collection School Of Computing and Information Systems

Chinese developers often cannot effectively search questions in English, because they may have difficulties in translating technical words from Chinese to English and formulating proper English queries. For the purpose of helping Chinese developers take advantage of the rich knowledge base of Stack Overflow and simplify the question retrieval process, we propose an automated cross-language relevant question retrieval (CLRQR) system to retrieve relevant English questions for a given Chinese question. CLRQR first extracts essential information (both Chinese and English) from the title and description of the input Chinese question, then performs domain-specific translation of the essential Chinese information into English, …


Does Journaling Encourage Healthier Choices? Analyzing Healthy Eating Behaviors Of Food Journalers, Palakorn ACHANANUPARP, Ee Peng LIM, Vibhanshu ABHISHEK 2018 Singapore Management University

Research Collection School Of Computing and Information Systems

Past research has shown the benefits of food journaling in promoting mindful eating and healthier food choices. However, the links between journaling and healthy eating have not been thoroughly examined. Beyond caloric restriction, do journalers consistently and sufficiently consume healthful diets? How different are their eating habits compared to those of average consumers who tend to be less conscious about health? In this study, we analyze the healthy eating behaviors of active food journalers using data from MyFitnessPal. Surprisingly, our findings show that food journalers do not eat as healthily as they should despite their proclivity to healthy eating and …


Exploiting User And Venue Characteristics For Fine-Grained Tweet Geolocation, Wen Haw CHONG, Ee Peng LIM 2018 Singapore Management University

Research Collection School Of Computing and Information Systems

Which venue is a tweet posted from? We call this a fine-grained geolocation problem. Given an observed tweet, the task is to infer its discrete posting venue, e.g., a specific restaurant. This recovers the venue context and differs from prior work, which geolocates tweets to location coordinates or cities/neighborhoods. First, we conduct empirical analysis to uncover venue and user characteristics for improving geolocation. For venues, we observe spatial homophily, in which venues near each other have more similar tweet content (i.e., text representations) compared to venues further apart. For users, we observe that they are spatially focused and more likely …


A Feasibility Study On Crowdsourcing To Monitor Municipal Resources In Smart Cities, Thivya KANDAPPU, Archan MISRA, Ming Hui, Desmond (XU Minghui) KOH, Randy Tandriansyah DARATAN, Nikita JAIMAN 2018 Singapore Management University

Research Collection School Of Computing and Information Systems

Active citizenry, whereby citizens actively participate in reporting and addressing challenges in urban service delivery, is a strategic goal of smart cities such as Singapore. In spite of the promise, we believe that the success of such large-scale nation-wide crowdsourcing deployments depends on the real-world user preferences and behavioral characteristics of citizens. In this paper, we first present our findings on behavioral preferences and key concerns of citizens regarding smart-city services via an opinion survey conducted with 1300 participants. We then propose a “citizen-controlled” urban services reporting platform where citizens actively report on the status of various municipal resources. We advocate the importance of matching user mobility patterns against task …


A Sliding-Window Framework For Representative Subset Selection, Yanhao WANG, Yuchen LI, Kian-Lee TAN 2018 Singapore Management University

Research Collection School Of Computing and Information Systems

Representative subset selection (RSS) is an important tool for users to draw insights from massive datasets. A common approach is to model RSS as the submodular maximization problem because the utility of extracted representatives often satisfies the "diminishing returns" property. To capture the data recency issue and support different types of constraints in real-world problems, we formulate RSS as maximizing a submodular function subject to a d-knapsack constraint (SMDK) over sliding windows. Then, we propose a novel KnapWindow framework for SMDK. Theoretically, KnapWindow is (1-ε)/(1+d)-approximate for SMDK and achieves sublinear complexity. Finally, we evaluate the efficiency and effectiveness …


'Is More Better?': Impact Of Multiple Photos On Perception Of Persona Profiles, Joni SALMINEN, Lene NIELSEN, Soon-Gyo JUNG, Jisun AN, Haewoon KWAK, Bernard J. JANSEN 2018 Hamad Bin Khalifa University

Research Collection School Of Computing and Information Systems

In this research, we investigate if and how more photos than a single headshot can heighten the level of information provided by persona profiles. We conduct eye-tracking experiments and qualitative interviews with variations in the photos: a single headshot, a headshot and images of the persona in different contexts, and a headshot with pictures of different people representing key persona attributes. The results show that more contextual photos significantly improve the information end users derive from a persona profile; however, showing images of different people creates confusion and lowers the informativeness. Moreover, we discover that choice of pictures results in …


Supporting Scientific Analytics Under Data Uncertainty And Query Uncertainty, Liping Peng 2018 University of Massachusetts Amherst

Doctoral Dissertations

Data management is becoming increasingly important in many applications, in particular, in large scientific databases where (1) data can be naturally modeled by continuous random variables, and (2) queries can involve complex predicates and/or be difficult for users to express explicitly. My thesis work aims to provide efficient support to both the "data uncertainty" and the "query uncertainty". When data is uncertain, an important class of queries requires query answers to be returned if their existence probabilities pass a threshold. I start with optimizing such threshold query processing for continuous uncertain data in the relational model by (i) expediting selections …


The Application Of Text Mining And Data Visualization Techniques To Textual Corpus Exploration, Jeffrey R. Smith Jr. 2018 Air Force Institute of Technology

Theses and Dissertations

Unstructured data in the digital universe is growing rapidly and shows no evidence of slowing anytime soon. With the acceleration of growth in digital data being generated and stored on the World Wide Web, the prospect of information overload is much more prevalent now than it has been in the past. As a preemptive analytic measure, organizations across many industries have begun implementing text mining techniques to analyze such large sources of unstructured data. Utilizing various text mining techniques such as n-gram analysis, document and term frequency analysis, correlation analysis, and topic modeling methodologies, this research seeks to develop …


Text Mining In Chinese Ancient Attires, Lu Wang 2018 Western University

Western Research Forum

Starting from the Shang Dynasty (1600-1046 BCE), when a writing system appeared in China, clothing was recorded as symbols to denote social statuses. The hierarchical signification of clothing remained in the following dynasties until the end of imperial China in 1911. The imperial period produced twenty-five official dynastic histories with rich corpora on the subject of attire, documenting regulations and prohibitions of detailed dress codes, a subject that is scarcely studied and treated with assumptions today. This research will use text mining tools to identify descriptive words of clothing that reflect Chinese hierarchical ideology from the twenty-five histories. The method is to …


Seed Dormancy-Life Form Profile For 358 Species From The Xishuangbanna Seasonal Tropical Rainforest, Yunnan Province, China Compared To World Database, Qinying Lan, Shouhua Yin, Huiyin He, Yunhong Tan, Qiang Liu, Yongmei Xia, Bin Wen, Carol C. Baskin, Jerry M. Baskin 2018 Chinese Academy of Sciences, China

Biology Faculty Publications

Seed dormancy profiles are available for the major vegetation regions/types on earth. These were constructed using a composite of data from locations within each region. Furthermore, the proportion of species with nondormant (ND) seeds and the five classes of dormancy is available for each life form in each region. Using these data, we asked: will the results be the same if many species from a specific area as opposed to data compiled from many locations are considered? Germination was tested for fresh seeds of 358 species in 95 families from the Xishuangbanna seasonal tropical rainforest (XSTRF): 177 trees, 66 shrubs, …


The Big Revolution: Future Potential Of Blockchain Technology, Sweksha Poudel, Sushant Bhatta, Jeremy Evert 2018 Southwestern Oklahoma State University

Student Research

Blockchain is the continuation of humanity’s connection with technology. If we think back to a more ancient era, trade was done in a very informal manner. Often, one’s desire to get what one wanted resulted in violence. Society as a whole then started becoming more formalized and grew in complexity. Institutions like banks and governments established currency, policy, and regulation. Eventually, we had access to these same institutions on the internet and the list grew exponentially. Marketplaces like Amazon and eBay made trade much easier for the common man and kept lowering uncertainties of …


The Role Of eHealth In Disasters: A Strategy For Education, Training And Integration In Disaster Medicine, Anthony C. Norris, Jose J. Gonzalez, David T. Parry, Richard E. Scott, Julie Dugdale, Deepak Khazanchi 2018 AUT University

Information Systems and Quantitative Analysis Faculty Publications

This paper describes the origins and progress of an international project to advance disaster eHealth (DEH) – the application of eHealth technologies to enhance the delivery of healthcare in disasters. The study to date has focused on two major themes: the role of DEH in facilitating inter-agency communication in disaster situations, and the fundamental need to promote awareness of DEH in the education of disaster managers and health professionals. The paper deals mainly with on-going research on the second of these themes, surveying the current provision of disaster medicine education, the design considerations for a DEH programme for health professionals, …

