Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 21 of 21

Full-Text Articles in Databases and Information Systems

Modeling Social Media Content With Word Vectors For Recommendation, Ying Ding, Jing Jiang Dec 2015

Research Collection School Of Computing and Information Systems

In social media, recommender systems are becoming more and more important. Different techniques have been designed for recommendations under various scenarios, but many of them do not use user-generated content, which potentially reflects users’ opinions and interests. Although a few studies have tried to combine user-generated content with rating or adoption data, they mostly rely on lexical similarity to calculate textual similarity. However, in social media, a diverse range of words is used. This renders the traditional ways of calculating textual similarity ineffective. In this work, we apply vector representation of words to measure the semantic similarity between texts. We …
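
The contrast this abstract draws between lexical and vector-based similarity can be illustrated with a minimal Python sketch. The three-dimensional vectors in `TOY_VECTORS` are invented for illustration, the function names are hypothetical, and averaging word vectors before taking the cosine is only one common way to compare texts; the paper's actual model is not described in this excerpt.

```python
import math

# Hypothetical toy word vectors; a real system would load trained embeddings.
TOY_VECTORS = {
    "movie": [0.9, 0.1, 0.0],
    "film": [0.85, 0.15, 0.0],
    "great": [0.1, 0.9, 0.0],
    "awesome": [0.15, 0.85, 0.0],
}


def lexical_overlap(tokens_a, tokens_b):
    """Jaccard overlap of the two token sets (a purely lexical measure)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def mean_vector(tokens):
    """Average the vectors of all known tokens; None if no token is known."""
    vectors = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def semantic_similarity(tokens_a, tokens_b):
    """Cosine similarity between the averaged word vectors of two texts."""
    vec_a, vec_b = mean_vector(tokens_a), mean_vector(tokens_b)
    if vec_a is None or vec_b is None:
        return 0.0
    return cosine(vec_a, vec_b)
```

With these toy vectors, the texts `["great", "movie"]` and `["awesome", "film"]` share no words, so their lexical overlap is zero, while their averaged vectors point in nearly the same direction.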


Intelligshop: Enabling Intelligent Shopping In Malls Through Location-Based Augmented Reality, Aditi Adhikari, Vincent W. Zheng, Hong Cao, Miao Lin, Yuan Fang, Kevin Chen-Chuan Chang Nov 2015

Research Collection School Of Computing and Information Systems

Shopping experience is important for both citizens and tourists. We present IntelligShop, a novel location-based augmented reality application that supports an intelligent shopping experience in malls. As the key functionality, IntelligShop provides an augmented reality interface: people can simply use ubiquitous smartphones to face mall retailers, and IntelligShop will then automatically recognize the retailers and fetch their online reviews from various sources (including blogs, forums and publicly accessible social media) to display on the phones. Technically, IntelligShop addresses two challenging data mining problems, including robust feature learning to support heterogeneous smartphones in localization and learning to query for automatically gathering the retailer content …


Where Are The Passengers? A Grid-Based Gaussian Mixture Model For Taxi Bookings, Meng-Fen Chiang, Tuan Anh Hoang, Ee-Peng Lim Nov 2015

Research Collection School Of Computing and Information Systems

Taxi bookings are events where requests for taxis are made by passengers either over voice calls or mobile apps. As the demand for taxis changes with space and time, it is important to model both the spatial and temporal dimensions in dynamic booking data. Several applications can benefit from a good taxi booking model. These include the prediction of the number of bookings at a certain location and time of day, and the detection of anomalous booking events. In this paper, we propose a Grid-based Gaussian Mixture Model (GGMM) with spatio-temporal dimensions that groups booking data into a number of spatio-temporal …


Two Formulas For Success In Social Media: Learning And Network Effects, Liangfei Qiu, Qian Tang, Andrew B. Whinston Oct 2015

Research Collection School Of Computing and Information Systems

Recent years have witnessed an unprecedented explosion in information technology that enables dynamic diffusion of user-generated content in social networks. Online videos, in particular, have changed the landscape of marketing and entertainment, competing with premium content and spurring business innovations. In the present study, we examine how learning and network effects drive the diffusion of online videos. While learning happens through informational externalities, network effects are direct payoff externalities. Using a unique data set from YouTube, we empirically identify learning and network effects separately, and find that both mechanisms have statistically and economically significant effects on video views; furthermore, the …


Did You Expect Your Users To Say This?: Distilling Unexpected Micro-Reviews For Venue Owners, Wen-Haw Chong, Bingtian Dai, Ee-Peng Lim Sep 2015

Research Collection School Of Computing and Information Systems

With social media platforms such as Foursquare, users can now generate concise reviews, i.e., micro-reviews, about entities such as venues (or products). From the venue owner's perspective, analysing these micro-reviews will offer interesting insights, useful for event detection and customer relationship management. However, not all micro-reviews are equally important, especially since a venue owner should already be familiar with his venue's primary aspects. Instead, we envisage that a venue owner will be interested in micro-reviews that are unexpected to him. These can arise in many ways, such as users focusing on easily overlooked aspects (by the venue owner), making comparisons …


On Mining Lifestyles From User Trip Data, Meng-Fen Chiang, Ee-Peng Lim Aug 2015

Research Collection School Of Computing and Information Systems

Large cities today are facing major challenges in planning and policy formulation to keep their growth sustainable. In this paper, we aim to gain useful insights about people living in a city by developing novel models to mine user lifestyles represented by the users' activity centers. Two models, namely ACMM and ACHMM, have been developed to learn the activity centers of each user using a large dataset of bus and subway train trips performed by passengers in Singapore. We show that ACHMM and ACMM yield similar accuracies in the location prediction task. We also propose methods to automatically predict "home", "work" …


Structured Learning From Heterogeneous Behavior For Social Identity Linkage, Siyuan Liu, Shuhui Wang, Feida Zhu Jul 2015

Research Collection School Of Computing and Information Systems

Social identity linkage across different social media platforms is of critical importance to business intelligence by gaining from social data a deeper understanding and more accurate profiling of users. In this paper, we propose a solution framework, HYDRA, which consists of three key steps: (I) we model heterogeneous behavior by long-term topical distribution analysis and multi-resolution temporal behavior matching against high noise and missing information, and the behavior similarity is described by a multi-dimensional similarity vector for each user pair; (II) we build structure consistency models to maximize the structure and behavior consistency on users' core social structure across different platforms, …


Fast Optimal Aggregate Point Search For A Merged Set On Road Networks, Weiwei Sun, Chong Chen, Baihua Zheng, Chunan Chen, Liang Zhu, Weimo Liu, Yan Huang Jul 2015

Research Collection School Of Computing and Information Systems

The aggregate nearest neighbor query, which returns an optimal target point that minimizes the aggregate distance for a given query point set, is one of the most important operations in spatial databases and their application domains. This paper addresses the problem of finding the aggregate nearest neighbor for a merged set that consists of the given query point set and multiple points that need to be selected from a candidate set, which we name the merged aggregate nearest neighbor (MANN) query. This paper proposes two algorithms to process MANN queries on road networks when the aggregate function is max. Then, we extend the algorithms …
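
The basic aggregate nearest neighbor query defined at the start of this abstract can be sketched as a brute-force search. This is only an illustration under simplifying assumptions: it uses straight-line (Euclidean) distance rather than road-network distance, the function name `aggregate_nearest_neighbor` is hypothetical, and it does not implement the merged-set (MANN) variant or either of the paper's algorithms.

```python
import math


def aggregate_nearest_neighbor(queries, candidates, aggregate=max):
    """Return the candidate whose aggregate distance to all query points is smallest.

    Assumption: Euclidean distance stands in for road-network distance.
    `aggregate` combines the per-query distances, e.g. max or sum.
    """
    if not queries or not candidates:
        raise ValueError("queries and candidates must be non-empty")

    def aggregate_distance(point):
        return aggregate(math.dist(point, q) for q in queries)

    # Exhaustive scan over all candidates; fine for small inputs only.
    return min(candidates, key=aggregate_distance)
```

For example, with query points `(0, 0)` and `(4, 0)` and the max aggregate, the candidate `(2, 0)` is preferred over `(0, 0)` because its farthest query point is only 2 units away instead of 4.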


Should We Use The Sample? Analyzing Datasets Sampled From Twitter's Stream API, Yazhe Wang, Jamie Callan, Baihua Zheng Jun 2015

Research Collection School Of Computing and Information Systems

Researchers have begun studying content obtained from microblogging services such as Twitter to address a variety of technological, social, and commercial research questions. The large number of Twitter users and even larger volume of tweets often make it impractical to collect and maintain a complete record of activity; therefore, most research and some commercial software applications rely on samples, often relatively small samples, of Twitter data. For the most part, sample sizes have been based on availability and practical considerations. Relatively little attention has been paid to how well these samples represent the underlying stream of Twitter data. To fill …


Author Topic Model-Based Collaborative Filtering For Personalized POI Recommendations, Shuhui Jiang, Xueming Qian, Jialie Shen, Yun Fu, Tao Mei Jun 2015

Research Collection School Of Computing and Information Systems

Social media has given rise to a continuous need for automatic travel recommendations. Collaborative filtering (CF) is the most well-known approach. However, existing approaches generally suffer from various weaknesses. For example, sparsity can significantly degrade the performance of traditional CF. If a user only visits very few locations, accurate similar user identification becomes very challenging due to lack of sufficient information for effective inference. Moreover, existing recommendation approaches often ignore rich user information like textual descriptions of photos which can reflect users' travel preferences. The topic model (TM) method is an effective way to solve the "sparsity problem," but is still far …


Breaking The News: First Impressions Matter On Online News, Julio Reis, Fabrício Benevenuto, Pedro Olmo, Raquel Prates, Haewoon Kwak, Jisun An May 2015

Research Collection School Of Computing and Information Systems

A growing number of people are changing the way they consume news, replacing the traditional physical newspapers and magazines with their virtual online versions and/or weblogs. The interactivity and immediacy present in online news are changing the way news is being produced and exposed by media corporations. News websites have to create effective strategies to catch people’s attention and attract their clicks. In this paper we investigate possible strategies used by online news corporations in the design of their news headlines. We analyze the content of 69,907 headlines produced by four major global media corporations during a minimum of eight …


Characterizing Silent Users In Social Media Communities, Wei Gong, Ee-Peng Lim, Feida Zhu May 2015

Research Collection School Of Computing and Information Systems

Silent users often constitute a significant proportion of an online user-generated content system. In the context of social media such as Twitter, users can opt to be silent all or most of the time. They are often called the invisible participants or lurkers. As lurkers contribute little to the online content, existing analysis often overlooks their presence and voices. However, we argue that understanding lurkers is important in many applications such as recommender systems, targeted advertising, and social sensing. This research therefore seeks to characterize lurkers in social media and propose methods to profile them. We examine 18 weeks of …


Efficient Reverse Top-K Boolean Spatial Keyword Queries On Road Networks, Yunjun Gao, Xu Qin, Baihua Zheng, Gang Chen May 2015

Research Collection School Of Computing and Information Systems

Reverse k nearest neighbor (RkNN) queries have a broad application base such as decision support, profile-based marketing, and resource allocation. Previous work on RkNN search either does not take textual information into consideration or is limited to Euclidean space. In the real world, however, most spatial objects are associated with textual information and lie on road networks. In this paper, we introduce a new type of query, namely, reverse top-k Boolean spatial keyword (RkBSK) retrieval, which assumes objects are on the road network and considers both spatial and textual information. Given a data set P on a road network and a …


Multi-Roles Affiliation Model For General User Profiling, Lizi Liao, Heyan Huang, Yashen Wang Apr 2015

Research Collection School Of Computing and Information Systems

Online social networks release user attributes, which are important for many applications. Due to the sparsity of such user attributes online, many works focus on profiling user attributes automatically. However, in order to profile a specific user attribute, a unique model is built, and such a model usually does not fit other profiling tasks. In our work, we design a novel, flexible general user profiling model which naturally models users’ friendships with user attributes. Experiments show that our method simultaneously profiles multiple attributes with better performance.


Measuring User Influence, Susceptibility And Cynicalness In Sentiment Diffusion, Roy Ka-Wei Lee, Ee Peng Lim Apr 2015

Research Collection School Of Computing and Information Systems

Diffusion in social networks has recently become an important research topic due to the massive amount of information shared on social media and the Web. As information diffuses, users express sentiments which can affect the sentiments of others. In this paper, we analyze how users reinforce or modify sentiment of one another based on a set of inter-dependent latent user factors as they are engaged in diffusion of event information. We introduce these sentiment-based latent user factors, namely influence, susceptibility and cynicalness. We also propose the ISC model to relate the three factors together and develop an iterative computation approach to …


Review Selection Using Micro-Reviews, Thanh-Son Nguyen, Hady W. Lauw, Panayiotis Tsaparas Apr 2015

Research Collection School Of Computing and Information Systems

Given the proliferation of review content, and the fact that reviews are highly diverse and often unnecessarily verbose, users frequently face the problem of selecting the appropriate reviews to consume. Micro-reviews are emerging as a new type of online review content in social media. Micro-reviews are posted by users of check-in services such as Foursquare. They are concise (up to 200 characters long) and highly focused, in contrast to the comprehensive and verbose reviews. In this paper, we propose a novel mining problem, which brings together these two disparate sources of review content. Specifically, we use coverage of micro-reviews …


Nirmal: Automatic Identification Of Software Relevant Tweets Leveraging Language Model, Abishek Sharma, Yuan Tian, David Lo Mar 2015

Research Collection School Of Computing and Information Systems

Twitter is one of the most widely used social media platforms today. It enables users to share and view short 140-character messages called 'tweets'. About 284 million active users generate close to 500 million tweets per day. Such rapid generation of user-generated content in large volumes results in the problem of information overload. Users who are interested in information related to a particular domain have limited means to filter out irrelevant tweets and tend to get lost in the huge amount of data they encounter. A recent study by Singer et al. found that software developers use Twitter to …


Prediction Of Venues In Foursquare Using Flipped Topic Models, Wen Haw Chong, Bing Tian Dai, Ee Peng Lim Mar 2015

Research Collection School Of Computing and Information Systems

Foursquare is a highly popular location-based social platform, where users indicate their presence at venues via check-ins and/or provide venue-related tips. On Foursquare, we explore Latent Dirichlet Allocation (LDA) topic models for venue prediction: predict venues that a user is likely to visit, given his history of other visited venues. However, we depart from prior works which regard the users as documents and their visited venues as terms. Instead, we ‘flip’ LDA models such that we regard venues as documents that attract users, which are now the terms. Flipping is simple and requires no changes to the LDA mechanism. Yet …
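
The 'flip' described in this abstract only changes how the input corpus is assembled, which a short Python sketch can make concrete. The helper name `build_documents` and the `(user, venue)` check-in format are assumptions for illustration; no topic model is fitted here, and the resulting documents would be passed unchanged to any standard LDA implementation.

```python
from collections import defaultdict


def build_documents(checkins, flipped=False):
    """Group (user, venue) check-ins into bag-of-terms documents.

    flipped=False: each user is a document whose terms are visited venues.
    flipped=True:  each venue is a document whose terms are visiting users.
    """
    documents = defaultdict(list)
    for user, venue in checkins:
        if flipped:
            documents[venue].append(user)
        else:
            documents[user].append(venue)
    return dict(documents)
```

Given the check-ins `[("u1", "cafe"), ("u2", "cafe"), ("u1", "gym")]`, the conventional view yields one document per user, while the flipped view yields one document per venue containing the users who checked in there.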


Use Of A High-Value Social Audience Index For Target Audience Identification On Twitter, Siaw Ling Lo, David Cornforth, Raymond Chiong Feb 2015

Research Collection School Of Computing and Information Systems

With the large and growing user base of social media, it is not an easy feat to identify potential customers for business. This is mainly due to the challenge of extracting commercially viable content from the vast amount of free-form conversations. In this paper, we analyse the Twitter content of an account owner and its list of followers through various text mining methods and segment the list of followers via an index. We term this index the High-Value Social Audience (HVSA) index. This HVSA index enables a company or organisation to devise their marketing and engagement plan according …


Review Synthesis For Micro-Review Summarization, Thanh-Son Nguyen, Hady W. Lauw, Panayiotis Tsaparas Feb 2015

Research Collection School Of Computing and Information Systems

Micro-reviews are a new type of user-generated content arising from the prevalence of mobile devices and social media in the past few years. Micro-reviews are bite-size reviews (usually under 200 characters), commonly posted on social media or check-in services, using a mobile device. They capture the immediate reaction of users, and they are rich in information, concise, and to the point. However, the abundance of micro-reviews and their telegraphic nature make it increasingly difficult to go through them and extract the useful information, especially on a mobile device. In this paper, we address the problem of summarizing the micro-reviews of …


Community Discovery From Social Media By Low-Rank Matrix Recovery, Jinfeng Zhuang, Mei Tao, Steven C. H. Hoi, Xian-Sheng Hua, Yongdong Zhang Jan 2015

Research Collection School Of Computing and Information Systems

The pervasive usage and reach of social media have attracted a surge of attention in the multimedia research community. Community discovery from social media has therefore become an important yet challenging issue. However, due to the subjective generating process, the explicitly observed communities (e.g., group-user and user-user relationship) are often noisy and incomplete in nature. This paper presents a novel approach to discovering communities from social media, including the group membership and user friend structure, by exploring a low-rank matrix recovery technique. In particular, we take Flickr as one exemplary social media platform. We first model the observed indicator matrix …