Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 31 - 48 of 48

Full-Text Articles in Computer Sciences

Social Network Monitoring For Bursty Cascade Detection, Wei Xie, Feida Zhu, Jing Xiao, Jianzong Wang Apr 2018

Research Collection School Of Computing and Information Systems

Social network services have become important and efficient platforms for users to share all kinds of information. The capability to monitor user-generated information and detect bursts from information diffusions in these social networks brings value to a wide range of real-life applications, such as viral marketing. However, in reality, as a third party, there is always a cost for gathering information from each user or so-called social network sensor. The question then arises of how to select a budgeted set of social network sensors to form the data stream for burst detection without compromising the detection performance. In this article, we …


Exploiting User And Venue Characteristics For Fine-Grained Tweet Geolocation, Wen Haw Chong, Ee Peng Lim Apr 2018

Research Collection School Of Computing and Information Systems

Which venue is a tweet posted from? We call this a fine-grained geolocation problem. Given an observed tweet, the task is to infer its discrete posting venue, e.g., a specific restaurant. This recovers the venue context and differs from prior work, which geolocates tweets to location coordinates or cities/neighborhoods. First, we conduct empirical analysis to uncover venue and user characteristics for improving geolocation. For venues, we observe spatial homophily, in which venues near each other have more similar tweet content (i.e., text representations) compared to venues further apart. For users, we observe that they are spatially focused and more likely …


A Sliding-Window Framework For Representative Subset Selection, Yanhao Wang, Yuchen Li, Kian-Lee Tan Apr 2018

Research Collection School Of Computing and Information Systems

Representative subset selection (RSS) is an important tool for users to draw insights from massive datasets. A common approach is to model RSS as a submodular maximization problem because the utility of extracted representatives often satisfies the "diminishing returns" property. To capture the data recency issue and support different types of constraints in real-world problems, we formulate RSS as maximizing a submodular function subject to a d-knapsack constraint (SMDK) over sliding windows. Then, we propose a novel KnapWindow framework for SMDK. Theoretically, KnapWindow is (1-ε)/(1+d)-approximate for SMDK and achieves sublinear complexity. Finally, we evaluate the efficiency and effectiveness …
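
As a rough illustration of the "diminishing returns" property and a budget constraint mentioned in this abstract, the sketch below uses a toy coverage utility and a simple cost-benefit greedy heuristic under a single knapsack (d = 1). It is not the KnapWindow framework: it has no sliding window, it carries no approximation guarantee, and the names `coverage`, `greedy_knapsack`, and the example data are all hypothetical.

```python
from typing import Dict, List, Set


def coverage(selected: List[str], items: Dict[str, Set[int]]) -> int:
    """Toy submodular utility: number of distinct elements covered."""
    covered: Set[int] = set()
    for name in selected:
        covered |= items[name]
    return len(covered)


def greedy_knapsack(
    items: Dict[str, Set[int]], costs: Dict[str, float], budget: float
) -> List[str]:
    """Cost-benefit greedy heuristic (an illustrative assumption, not KnapWindow).

    Repeatedly adds the affordable item with the largest marginal gain
    per unit cost, and stops when no affordable item adds any gain.
    """
    chosen: List[str] = []
    spent = 0.0
    remaining = set(items)
    while True:
        base = coverage(chosen, items)
        best, best_ratio = None, 0.0
        for name in sorted(remaining):  # sorted for deterministic tie-breaking
            if spent + costs[name] > budget:
                continue  # item does not fit in the remaining budget
            gain = coverage(chosen + [name], items) - base
            ratio = gain / costs[name]
            if ratio > best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            return chosen
        chosen.append(best)
        spent += costs[best]
        remaining.remove(best)


# Hypothetical example data: each item covers a set of element ids.
example_items = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}
example_costs = {"a": 2.0, "b": 1.0, "c": 3.0, "d": 1.0}

# Diminishing returns: adding "a" helps less once "b" is already selected.
gain_alone = coverage(["a"], example_items) - coverage([], example_items)
gain_after_b = coverage(["b", "a"], example_items) - coverage(["b"], example_items)

selection = greedy_knapsack(example_items, example_costs, budget=4.0)
```

Here the marginal gain of item "a" shrinks once "b" is already selected, because the two overlap on one element; that overlap is what makes the coverage utility submodular.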


What Is Gab: A Bastion Of Free Speech Or An Alt-Right Echo Chamber, Savvas Zannettou, Barry Bradlyn, Emiliano De Cristofaro, Haewoon Kwak, Michael Sirivianos, Gianluca Stringhini, Jeremy Blackburn Apr 2018

Research Collection School Of Computing and Information Systems

Over the past few years, a number of new "fringe" communities, like 4chan or certain subreddits, have gained traction on the Web at a rapid pace. However, more often than not, little is known about how they evolve or what kind of activities they attract, even though recent research has shown that they influence how false information reaches mainstream communities. This motivates the need to monitor these communities and analyze their impact on the Web's information ecosystem. In August 2016, a new social network called Gab was created as an alternative to Twitter. It positions itself as putting "people and free …


A Novel Representation And Compression For Queries On Trajectories In Road Networks, Xiaochun Yang, Bin Wang, Kai Yang, Chengfei Liu, Baihua Zheng Apr 2018

Research Collection School Of Computing and Information Systems

Recording and querying time-stamped trajectories incur a high cost of data storage and computing. In this paper, we explore several characteristics of the trajectories in road networks, which have motivated the idea of coding trajectories by associating timestamps with relative spatial path and locations. Such a representation contains a large amount of duplicate information to achieve a lower entropy compared with the existing representations, thereby drastically cutting the storage cost. We propose several techniques to compress spatial path and locations separately, which can support fast positioning and achieve better compression ratio. For locations, we propose two novel encoding schemes such that the …


Octopus: An Online Topic-Aware Influence Analysis System For Social Networks, Ju Fan, Jiarong Qiu, Yuchen Li, Qingfei Meng, Dongxiang Zhang, Guoliang Li, Kian-Lee Tan, Xiaoyong Du Apr 2018

Research Collection School Of Computing and Information Systems

The wide adoption of social networks has created new demand for influence analysis. This paper presents OCTOPUS, which offers social network users and analysts valuable insights through topic-aware social influence analysis services. OCTOPUS has the following novel features. First, OCTOPUS provides a user-friendly interface that allows users to employ simple and easy-to-use keywords to perform influence analysis. Second, OCTOPUS provides three powerful keyword-based topic-aware influence analysis tools: keyword-based influential user discovery, personalized influential keywords suggestion, and interactive influential paths exploration. These tools can not only discover influential users, but also provide insights on how the users influence the network. …


Does Journaling Encourage Healthier Choices? Analyzing Healthy Eating Behaviors Of Food Journalers, Palakorn Achananuparp, Ee Peng Lim, Vibhanshu Abhishek Apr 2018

Research Collection School Of Computing and Information Systems

Past research has shown the benefits of food journaling in promoting mindful eating and healthier food choices. However, the links between journaling and healthy eating have not been thoroughly examined. Beyond caloric restriction, do journalers consistently and sufficiently consume healthful diets? How different are their eating habits compared to those of average consumers who tend to be less conscious about health? In this study, we analyze the healthy eating behaviors of active food journalers using data from MyFitnessPal. Surprisingly, our findings show that food journalers do not eat as healthily as they should despite their proclivity to healthy eating and …


A Data-Driven Analysis Of Workers' Earnings On Amazon Mechanical Turk, Kotaro Hara, Abigail Adams, Kristy Milland, Saiph Savage, Chris Callison-Burch, Jeffrey P. Bigham Apr 2018

Research Collection School Of Computing and Information Systems

A growing number of people are working as part of on-line crowd work. Crowd work is often thought to be low wage work. However, we know little about the wage distribution in practice and what causes low/high earnings in this setting. We recorded 2,676 workers performing 3.8 million tasks on Amazon Mechanical Turk. Our task-level analysis revealed that workers earned a median hourly wage of only ~$2/h, and only 4% earned more than $7.25/h. While the average requester pays more than $11/h, lower-paying requesters post much more work. Our wage calculations are influenced by how unpaid work is accounted for, …


The Role Of Urban Mobility In Retail Business Survival, Krittika D'Silva, Kasthuri Jayarajah, Anastasios Noulas, Cecilia Mascolo, Archan Misra Apr 2018

Research Collection School Of Computing and Information Systems

Economic and urban planning agencies have a strong interest in tackling the hard problem of predicting the odds of survival of individual retail businesses. In this work, we tap urban mobility data available both from a location-based intelligence platform, Foursquare, and from public transportation agencies, and investigate whether mobility-derived features can help foretell the failure of such retail businesses, over a 6-month horizon, across 10 distinct cities spanning the globe. We hypothesise that the survival of such a retail outlet is correlated with not only venue-specific characteristics but also broader neighbourhood-level effects. Through careful statistical analysis of Foursquare and taxi …


Predicting Episodes Of Non-Conformant Mobility In Indoor Environments, Kasthuri Jayarajah, Archan Misra Apr 2018

Research Collection School Of Computing and Information Systems

Traditional mobility prediction literature focuses primarily on improved methods to extract latent patterns from individual-specific movement data. When such predictions are incorrect, we ascribe it to 'random' or 'unpredictable' changes in a user's movement behavior. Our hypothesis, however, is that such apparently-random deviations from daily movement patterns can, in fact, often be anticipated. In particular, we develop a methodology for predicting Likelihood of Future Non-Conformance (LFNC), based on two central hypotheses: (a) the likelihood of future deviations in movement behavior is positively correlated to the intensity of such trajectory deviations observed in the user's recent past, and (b) the …


Fixation And Confusion: Investigating Eye-Tracking Participants' Exposure To Information In Personas, Joni Salminen, Bernard J. Jansen, Jisun An, Soon-Gyo Jung, Lene Nielsen, Haewoon Kwak Mar 2018

Research Collection School Of Computing and Information Systems

To more effectively convey relevant information to end users of persona profiles, we conducted a user study consisting of 29 participants engaging with three persona layout treatments. We were interested in the confusion that the treatments engendered in the participants, and conducted a within-subjects study in the actual work environment, using eye-tracking and talk-aloud data collection. We coded the verbal data into classes of informativeness and confusion and correlated it with fixations and durations on the Areas of Interest recorded by the eye-tracking device. We used various analysis techniques, including Mann-Whitney, regression, and Levenshtein distance, to investigate how confused users differed …


Mining Sandboxes: Are We There Yet?, Lingfeng Bao, Tien Duy B. Le, David Lo Mar 2018

Research Collection School Of Computing and Information Systems

The popularity of the Android platform on mobile devices has attracted much attention from many developers and researchers, as well as malware writers. Recently, Jamrozik et al. proposed a technique to secure Android applications referred to as mining sandboxes. They used an automated test case generation technique to explore the behavior of the app under test and then extracted a set of sensitive APIs that were called. Based on the extracted sensitive APIs, they built a sandbox that can block access to APIs not used during testing. However, they only evaluated the proposed technique with benign apps and did not investigate whether …


Unified Locally Linear Classifiers With Diversity-Promoting Anchor Points, Chenghao Liu, Teng Zhang, Peilin Zhao, Jianling Sun, Steven C. H. Hoi Feb 2018

Research Collection School Of Computing and Information Systems

Locally Linear Support Vector Machine (LLSVM) has been actively used in classification tasks due to its capability of classifying nonlinear patterns. However, existing LLSVM suffers from two drawbacks: (1) a particular and appropriate regularization for LLSVM has not yet been addressed; (2) it usually adopts a three-stage learning scheme composed of learning anchor points by clustering, learning local coding coordinates by a predefined coding scheme, and finally learning for training classifiers. We argue that this decoupled approach oversimplifies the original optimization problem, resulting in a large deviation due to the disparate purpose of each step. To address the first issue, …


Compressive Representation For Device-Free Activity Recognition With Passive RFID Signal Strength, Lina Yao, Quan Z. Sheng, Xue Li, Tao Gu, Mingkui Tan, Xianzhi Wang, Sen Wang, Wenjie Ruan Feb 2018

Research Collection School Of Computing and Information Systems

Understanding and recognizing human activities is a fundamental research topic for a wide range of important applications such as fall detection and remote health monitoring and intervention. Despite active research in human activity recognition over the past years, existing approaches based on computer vision or wearable sensor technologies present several significant issues such as privacy (e.g., using a video camera to monitor the elderly at home) and practicality (e.g., not possible for an older person with dementia to remember wearing devices). In this paper, we present a low-cost, unobtrusive, and robust system that supports independent living of older people. The system …


Sparse Passive-Aggressive Learning For Bounded Online Kernel Methods, Jing Lu, Doyen Sahoo, Peilin Zhao, Steven C. H. Hoi Feb 2018

Research Collection School Of Computing and Information Systems

One critical deficiency of traditional online kernel learning methods is their unbounded and growing number of support vectors in the online learning process, making them inefficient and non-scalable for large-scale applications. Recent studies on scalable online kernel learning have attempted to overcome this shortcoming, e.g., by imposing a constant budget on the number of support vectors. Although they attempt to bound the number of support vectors at each online learning iteration, most of them fail to bound the number of support vectors for the final output hypothesis, which is often obtained by averaging the series of hypotheses over all the …


Competency Analytics Tool: Analyzing Curriculum Using Course Competencies, Swapna Gottipati, Venky Shankararaman Jan 2018

Research Collection School Of Computing and Information Systems

The applications of learning outcomes and competency frameworks have brought better clarity to engineering programs in many universities. Several frameworks have been proposed to integrate outcomes and competencies into course design, delivery and assessment. However, in many cases, competencies are course-specific and their overall impact on the curriculum design is unknown. Such impact analysis is important for analyzing, discovering gaps and improving the curriculum design. Unfortunately, manual analysis is a painstaking process due to the large number of competencies across the curriculum. In this paper, we propose an automated method to analyze the competencies and discover their impact on the overall …


Collaboration Patterns In Software Developer Network, Didi Surian, Ee-Peng Lim, David Lo Jan 2018

Research Collection School Of Computing and Information Systems

In this entry, we mine collaboration patterns from a large software developer network (Surian et al. 2010). We consider high- and low-level patterns. High-level patterns correspond to various network-level statistics that we observe to hold in this network. Low-level patterns are topological subgraph patterns that are frequently observed among developers collaborating in the network. Mining topological subgraph patterns is difficult as it is an NP-hard problem. To address this issue, we use a combination of frequent subgraph mining and graph matching by leveraging the power law property exhibited by a large collaboration graph. The technique is applicable to any software …


Identifying And Computing The Exact Core-Determining Class, Ye Luo, Hai Wang Jan 2018

Research Collection School Of Computing and Information Systems

The indeterministic relations between unobservable events and observed outcomes in partially identified models can be characterized by a bipartite graph. Given a probability measure on observed outcomes, the set of feasible probability measures on unobservable events can be defined by a set of linear inequality constraints, according to Artstein’s Theorem. This set of inequalities is called the “core-determining class”. However, the number of inequalities defined by Artstein’s Theorem is exponentially increasing with the number of unobservable events, and many inequalities may in fact be redundant. In this paper, we show that the “exact core-determining class”, i.e., the smallest possible core-determining class, can be characterized by a set of …