Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 30 of 46

Full-Text Articles in Databases and Information Systems

Topicsketch: Real-Time Bursty Topic Detection From Twitter, Wei Xie, Feida Zhu, Jing Jiang, Ee Peng Lim, Ke Wang Dec 2013

Research Collection School Of Computing and Information Systems

Twitter has become one of the largest platforms for users around the world to share anything happening around them with friends and beyond. A bursty topic in Twitter is one that triggers a surge of relevant tweets within a short time, which often reflects important events of mass interest. How to leverage Twitter for early detection of bursty topics has therefore become an important research problem with immense practical value. Despite the wealth of research work on topic modeling and analysis in Twitter, it remains a huge challenge to detect bursty topics in real time. As existing methods can hardly scale …


Two Formulas For Success In Social Media: Social Learning And Network Effects, Liangfei Qiu, Qian Tang, Andrew B. Whinston Dec 2013

Research Collection School Of Computing and Information Systems

This paper examines social learning and network effects that are particularly important for online videos, considering the limited marketing campaigns of user-generated content. Rather than combining both social learning and network effects under the umbrella of social contagion or peer influence, we develop a theoretical model and empirically identify social learning and network effects separately. Using a unique data set from YouTube, we find that both mechanisms have statistically and economically significant effects on video views, and which mechanism dominates depends on the specific video type.


Predicting Best Answerers For New Questions: An Approach Leveraging Topic Modeling And Collaborative Voting, Yuan Tian, Pavneet Singh Kochhar, Ee Peng Lim, Feida Zhu, David Lo Nov 2013

Research Collection School Of Computing and Information Systems

Community Question Answering (CQA) sites are becoming an increasingly important source of information where users can share knowledge on various topics. Although these platforms bring new opportunities for users to seek help or provide solutions, they also pose many challenges with the ever-growing size of the community. The sheer number of questions posted every day motivates the problem of routing questions to the appropriate users who can answer them. In this paper, we propose an approach to predict the best answerer for a new question on a CQA site. Our approach considers both user interest and user expertise relevant to the topics …


Automatic Domain Identification For Linked Open Data, Sarasi Lalithsena, Pascal Hitzler, Amit P. Sheth, Prateek Jain Nov 2013

Kno.e.sis Publications

Linked Open Data (LOD) has emerged as one of the largest collections of interlinked structured datasets on the Web. Although the adoption of such datasets for applications is increasing, identifying relevant datasets for a specific task or topic is still challenging. As an initial step to make such identification easier, we provide an approach to automatically identify the topic domains of given datasets. Our method utilizes existing knowledge sources, more specifically Freebase, and we present an evaluation which validates the topic domains we can identify with our system. Furthermore, we evaluate the effectiveness of identified topic domains for the purpose …


Semantics-Empowered Big Data Processing With Applications, Krishnaprasad Thirunarayan, Amit P. Sheth Nov 2013

Kno.e.sis Publications

We discuss the nature of Big Data and address the role of semantics in analyzing and processing Big Data that arises in the context of Physical-Cyber-Social Systems. We organize our research around the Five Vs of Big Data, where four of the Vs are harnessed to produce the fifth V - value. To handle the challenge of Volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle the challenge of Variety, we resort to the use of semantic models and annotations of data so that much of the intelligent processing …


Why Do I Retweet It? An Information Propagation Model For Microblogs, Fabio Pezzoni, Jisun An, Andrea Passarella, Jon Crowcroft, Marco Conti Nov 2013

Research Collection School Of Computing and Information Systems

Microblogging platforms are Web 2.0 services that represent a suitable environment for studying how information is propagated in social networks and how users can become influential. In this work we analyse the impact of the network features and of the users' behaviour on the information diffusion. Our analysis highlights a strong relation between the level of visibility of a message in the flow of information seen by a user and the probability that the user further disseminates the message. In addition, we also highlight the existence of other latent factors that impact on the dissemination probability, correlated with the properties …


Predicting User's Political Party Using Ideological Stances, Swapna Gottipati, Minghui Qiu, Liu Yang, Feida Zhu, Jing Jiang Nov 2013

Research Collection School Of Computing and Information Systems

Predicting users’ political party in social media has important impacts on many real-world applications such as targeted advertising, recommendation and personalization. Several political research studies indicate that political parties’ ideological beliefs on sociopolitical issues may influence users’ political leaning. In our work, we exploit users’ ideological stances on controversial issues to predict the political party of online users. We propose a collaborative filtering approach to solve the data sparsity problem of users’ stances on ideological topics and apply a clustering method to group the users with the same party. We evaluated several state-of-the-art methods for party prediction task …


Social Sensing For Urban Crisis Management: The Case Of Singapore Haze, Philips Kokoh Prasetyo, Ming Gao, Ee Peng Lim, Christie N. Scollon Nov 2013

Research Collection School Of Computing and Information Systems

Sensing social media for trends and events has become possible as an increasing number of users rely on social media to share information. In the event of a major disaster or social event, one can therefore study the event quickly by gathering and analyzing social media data. One can also design appropriate responses such as allocating resources to the affected areas, sharing event-related information, and managing public anxiety. Past research on social event studies using social media often focused on one type of data analysis (e.g., hashtag clusters, diffusion of events, or influential users) on a single social media data …


Information Vs Interaction: An Alternative User Ranking Model For Social Networks, Wei Xie, Ai Phuong Hoang, Feida Zhu, Ee Peng Lim Nov 2013

Research Collection School Of Computing and Information Systems

Recent years have seen an unprecedented boom of social network services, such as Twitter, which boasts over 200 million users. In such big social platforms, the influential users are ideal targets for viral marketing to potentially reach an audience of maximal size. Most proposed algorithms rely on the linkage structure of the respective underlying network to determine the information flow and hence indicate a user's influence. From a social interaction perspective, we built a model based on the dynamic user interactions constantly taking place on top of these linkage structures. In particular, in the Twitter setting we supposed a principle …


Social Listening For Customer Acquisition, Juan Du, Biying Tan, Feida Zhu, Ee-Peng Lim Nov 2013

Research Collection School Of Computing and Information Systems

Social network analysis has received much attention from corporations recently. Corporations are trying to utilize social media platforms such as Twitter, Facebook and Sina Weibo to expand their own markets. Our system is an online tool to assist these corporations to 1) find potential customers, and 2) track a list of users by specific events from social networks. We employ both textual and network information, and thus produce a keyword-based relevance score for each user in pre-defined dimensions, which indicates the probability of the adoption of a product. Based on the score and its trend, our tool is able to …


City Notifications As A Data Source For Traffic Management, Pramod Anantharam, Biplav Srivastava Oct 2013

Kno.e.sis Publications

A common problem for cities of developing countries like India in managing traffic is the lack of basic automated instrumentation to track road conditions or vehicle locations. Still, to help their citizens make informed travel decisions based on changing city dynamics, many cities have an authorized, city-initiated notification service in place to alert subscribing commuters about road conditions. Here, alternative means may be used to create informal textual notifications, e.g., inputs from field personnel, citizen updates, and pre-authorized events from the city calendar. In this paper, we show that collections of such notifications, when processed with information extraction techniques, can turn …


Toward A New Understanding Of Virtual Research Collaborations: Complex Adaptive Systems Framework, Arsev U. Aydinoglu Oct 2013

DataONE Sociocultural and Usability & Assessment Working Groups

Virtual research collaborations (VRCs) have become an important method of conducting scientific activity; however, they are often regarded and treated as traditional scientific collaborations. Their success is measured by scholarly productivity and adherence to budget by funding agencies, participating scientists, and scholars. VRCs operate in complex environments interacting with other complex systems. A holistic (or organicist) approach is needed to make sense of this complexity. For that purpose, this study proposes using a new perspective, namely, the complex adaptive systems theory that can provide a better understanding of a VRC’s potential creativity, adaptability, resilience, and probable success. The key concepts …


A Unified Model For Topics, Events And Users On Twitter, Qiming Diao, Jing Jiang Oct 2013

Research Collection School Of Computing and Information Systems

With the rapid growth of social media, Twitter has become one of the most widely adopted platforms for people to post short and instant messages. On the one hand, people tweet about their daily lives, and on the other hand, when major events happen, people also follow and tweet about them. Moreover, people’s posting behaviors on events are often closely tied to their personal interests. In this paper, we try to model topics, events and users on Twitter in a unified way. We propose a model which combines an LDA-like topic model and the Recurrent Chinese Restaurant Process to capture …


Mining Effective Multi-Segment Sliding Window For Pathogen Incidence Rate Prediction, Lei Duan, Changjie Tang, Xiasong Li, Guozhu Dong, Xianming Wang, Jie Zuo, Min Jiang, Zhongqi Li, Yongqing Zhang Sep 2013

Kno.e.sis Publications

Pathogen incidence rate prediction, which can be considered a time series modeling task, is important for infectious disease incidence rate prediction and for public health. This paper investigates the application of a genetic computation technique, namely GEP, for pathogen incidence rate prediction. To overcome the shortcomings of traditional sliding windows in GEP-based time series modeling, the paper introduces the problem of mining effective sliding windows for discovering optimal sliding windows for building accurate prediction models. To utilize the periodical characteristic of pathogen incidence rates, a multi-segment sliding window consisting of several segments from different periodical intervals is proposed and …


A Statistical And Schema Independent Approach To Identify Equivalent Properties On Linked Data, Kalpa Gunaratna, Krishnaprasad Thirunarayan, Prateek Jain, Amit P. Sheth, Sanjaya Wijeratne Sep 2013

Kno.e.sis Publications

The Linked Open Data (LOD) cloud has gained significant attention in the Semantic Web community recently. Currently it consists of approximately 295 interlinked datasets with over 50 billion triples including 500 million links, and continues to expand in size. This vast source of structured information has the potential to have a significant impact on knowledge-based applications. However, a key impediment to the use of the LOD cloud is limited support for data integration tasks over concepts, instances, and properties. Efforts to address this limitation over properties have focused on matching data-type properties across datasets; however, matching of object-type properties has not received …


Types Of Property Pairs And Alignment On Linked Datasets - A Preliminary Analysis, Kalpa Gunaratna, Krishnaprasad Thirunarayan, Amit P. Sheth Sep 2013

Kno.e.sis Publications

Dataset publication on the Web has been greatly influenced by the Linked Open Data (LOD) project. Many interlinked datasets have become freely available on the Web, creating a structured and distributed knowledge representation. Analysis and alignment of concepts and instances in these interconnected datasets have received a lot of attention in the recent past compared to properties. We identify three different categories of property pairs found in the alignment process and study their relative distribution among well-known LOD datasets. We also provide a comparative analysis of state-of-the-art techniques with regard to different categories, highlighting their capabilities. This could lead to …


Generative Models For Item Adoptions Using Social Correlation, Freddy Chong Tat Chua, Hady Wirawan Lauw, Ee Peng Lim Sep 2013

Research Collection School Of Computing and Information Systems

Users face many choices on the Web when it comes to choosing which product to buy, which video to watch, etc. In making adoption decisions, users rely not only on their own preferences, but also on friends. We call the latter social correlation, which may be caused by the homophily and social influence effects. In this paper, we focus on modeling social correlation on users’ item adoptions. Given a user-user social graph and an item-user adoption graph, our research seeks to answer the following questions: whether the items adopted by a user correlate to items adopted by her friends, and …


Has Safeer Improved SACM's Work And Helped Saudi Students In The USA Resolve Their Needs Quickly, Faisal M. Alzomily Aug 2013

Masters Theses & Specialist Projects

This study examined the efficiency of the Safeer program by gathering and analyzing the perceptions of 131 Saudi students from Bowling Green, KY. The purpose of the study was to ensure that the system is able to perform its function as the bridge between different institutions and Saudi students studying in the US who require assistance in processing their academic requirements. A self-administered survey using a five-point scale was employed. Results were summarized using descriptive statistics at a 95% confidence level. The result confirmed the hypothesis that the use of the Safeer program provides quality service delivery within SACM, which in turn benefits …


Politics, Sharing And Emotion In Microblogs, Tuan-Anh Hoang, William Cohen, Ee Peng Lim, Doug Pierce, David Redlawsk Aug 2013

Research Collection School Of Computing and Information Systems

In political contexts, it is known that people act as "motivated reasoners", i.e., information is evaluated first for emotional affect, and this emotional reaction influences later deliberative reasoning steps. As social media becomes a more and more prevalent way of receiving political information, it becomes important to understand more completely the interaction between information, emotion, social community, and information-sharing behavior. In this paper, we describe a high-precision classifier for politically-oriented tweets, and an accurate classifier of a Twitter user's political affiliation. Coupled with existing sentiment-analysis tools for microblogs, these methods enable us to systematically study the interaction of emotion and …


From Questions To Effective Answers: On The Utility Of Knowledge-Driven Querying Systems For Life Sciences Data, Amir H. Asiaee, Prashant Doshi, Todd Minning, Satya S. Sahoo, Priti Parikh, Amit P. Sheth, Rick L. Tarleton Jul 2013

Kno.e.sis Publications

We compare two distinct approaches for querying data in the context of the life sciences. The first approach utilizes conventional databases to store the data and provides intuitive form-based interfaces to facilitate querying of the data, commonly used by the life science researchers that we study. The second approach utilizes a large OWL ontology and the same datasets associated as RDF instances of the ontology. Both approaches are being used in parallel by a team of cell biologists in their daily research activities, with the objective of gradually replacing the conventional approach with the knowledge-driven one. We describe several benefits …


Reviving Dormant Ties In An Online Social Network Experiment, Ee Peng Lim, Denzil Correa, David Lo, Michael Finegold, Feida Zhu Jul 2013

Research Collection School Of Computing and Information Systems

Social network users connect and interact with one another to fulfil different kinds of social and information needs. When interaction ceases between two users, we say that their tie becomes dormant. While there are different underlying reasons for dormant ties, it is important to find means to revive such ties so as to maintain vibrancy in the relationships. In this work, we thus focus on designing an online experiment to evaluate the effectiveness of personalized social messages to revive dormant ties. The experiment carefully selects users with dormant ties so that no user gets mixed treatments and be affected by …


Mining Direct Antagonistic Communities In Signed Social Networks, David Lo, Didi Surian, Philips Kokoh Prasetyo, Zhang Kuan, Ee Peng Lim Jul 2013

Research Collection School Of Computing and Information Systems

Social networks provide a wealth of data to study relationship dynamics among people. Most social networks such as Epinions and Facebook allow users to declare trust or friendship with other users. Some of them also allow users to declare distrust or negative relationships. When both positive and negative links co-exist in a network, some interesting community structures can be studied. In this work, we mine Direct Antagonistic Communities (DACs) within such signed networks. Each DAC consists of two sub-communities with positive relationships among members of each sub-community, and negative relationships with members of the other sub-community. Identifying direct antagonistic communities …


Crisis Response Coordination In Online Communities, Hemant Purohit Jun 2013

Kno.e.sis Publications

During recent crises, citizens (sensors) are increasingly using social media to share a variety of information: the situation on the ground, emerging needs, donation offers, damage, etc. In such an evolving ad-hoc community, how can we extract actionable nuggets from the social media streams to aid relief efforts? This doctoral consortium presentation summarizes a framework to analyze social data and manage information to assist coordination by focusing on three important questions to answer: Whom to coordinate with, Why to coordinate and How to coordinate, with exemplary insights for needs and availability from recent disaster events.


A Latent Variable Model For Viewpoint Discovery From Threaded Forum Posts, Minghui Qiu, Jing Jiang Jun 2013

Research Collection School Of Computing and Information Systems

Threaded discussion forums provide an important social media platform. Their rich user-generated content has served as an important source of public feedback. Automatically discovering the viewpoints or stances on hot issues from forum threads is an important and useful task. In this paper, we propose a novel latent variable model for viewpoint discovery from threaded forum posts. Our model is a principled generative latent variable model which captures three important factors: viewpoint-specific topic preference, user identity and user interactions. Evaluation results show that our model clearly outperforms a number of baseline models in terms of both clustering …


Demo: Approximate Semantic Matching In The Collider Event Processing Engine, Souleiman Hasan, Kalpa Gunaratna, Yongrui Qin, Edward Curry Jun 2013

Kno.e.sis Publications

This demo presents a use case from the energy management domain. It builds upon previous work on approximate semantic matching of heterogeneous events and compares two semantic matching scenarios: exact and approximate. It illustrates how a large number of exact matching event subscriptions are needed to match heterogeneous power consumption events. It then demonstrates how a small number of approximate semantic matching subscriptions are needed but possibly with a lower true positives/negatives performance. The demo is delivered via the COLLIDER approximate event processing engine currently under development in DERI.


Real Time Event Detection In Twitter, Xun Wang, Feida Zhu, Jing Jiang, Sujian Li Jun 2013

Research Collection School Of Computing and Information Systems

Event detection has been an important task for a long time. When it comes to Twitter, new problems are presented. Twitter data is a huge temporal data flow with much noise and various kinds of topics. Traditional sophisticated methods with a high computational complexity aren’t designed to handle such data flow efficiently. In this paper, we propose a mixture Gaussian model for bursty word extraction in Twitter and then employ a novel time-dependent HDP model for new topic detection. Our model can promptly and accurately capture new events, along with the location and the time at which an event becomes bursty. Experiments show the …


Fragmented Social Media: A Look Into Selective Exposure To Political News, Jisun An, Daniele Quercia, Jon Crowcroft May 2013

Research Collection School Of Computing and Information Systems

The hypothesis of selective exposure assumes that people crave like-minded information and eschew information that conflicts with their beliefs, and that this has negative consequences for political life. Yet, despite decades of research, this hypothesis remains theoretically promising but empirically difficult to test. We look into news articles shared on Facebook and examine whether selective exposure exists or not in social media. We find concrete evidence of a tendency for users to predominantly share like-minded news articles and avoid conflicting ones, and partisans are more likely to do that. Building tools to counter partisanship on social media would require the ability …


Unified Entity Search In Social Media Community, Ting Yao, Yuan Liu, Chong-Wah Ngo, Tao Mei May 2013

Research Collection School Of Computing and Information Systems

The search for entities is the most common search behavior on the Web, especially in social media communities where entities (such as images, videos, people, locations, and tags) are highly heterogeneous and correlated. While previous research usually deals with these social media entities separately, we are investigating in this paper a unified, multilevel, and correlative entity graph to represent the unstructured social media data, through which various applications (e.g., friend suggestion, personalized image search, image tagging, etc.) can be realized more effectively in one single framework. We regard the social media objects equally as “entities” and all of these applications …


Your Love Is Public Now: Questioning The Use Of Personal Information In Authentication, Payas Gupta, Swapna Gottipati, Jing Jiang, Debin Gao May 2013

Research Collection School Of Computing and Information Systems

Most social networking platforms protect users' private information by limiting access to it to a small group of members, typically friends of the user, while allowing (virtually) everyone access to the user's public data. In this paper, we exploit public data available on Facebook to infer users' undisclosed interests on their profile pages. In particular, we infer their undisclosed interests from the public data fetched using Graph APIs provided by Facebook. We demonstrate that simply liking a Facebook page does not corroborate that the user is interested in the page. Instead, we perform sentiment-oriented mining on various attributes of a …


Retweeting: An Act Of Viral Users, Susceptible Users, Or Viral Topics?, Tuan-Anh Hoang, Ee Peng Lim May 2013

Research Collection School Of Computing and Information Systems

When a user retweets, there are three behavioral factors that cause the action: topic virality, user virality and user susceptibility. Topic virality captures the degree to which a topic attracts retweets by users. For each topic, user virality and susceptibility refer to the likelihood that a user attracts retweets and performs retweeting respectively. To model a set of observed retweet data as a result of these three topic-specific factors, we first represent the retweets as a three-dimensional tensor of the tweet authors, their followers, and the tweets themselves. We then propose the V2S model, a …