Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 1 - 30 of 48

Full-Text Articles in Computer Sciences

An Essential Applied Statistical Analysis Course Using RStudio With Project-Based Learning For Data Science, Aldy Gunawan, Michelle L. F. Cheong, Johnson Poh Dec 2018

Research Collection School Of Computing and Information Systems

This paper presents a new postgraduate level course, named Applied Statistical Analysis with R. We present the course structure, teaching methodology including the assessment framework and student feedback. The course covers the basic concepts of statistics, the knowledge of applying statistical theory in analyzing real data and the skill of developing statistical applications with R programming language. The first half of each lesson is dedicated to teaching students the statistical concepts while the second half focuses on the practical aspects of implementing the concepts within the RStudio console. The Project-Based Learning (PBL) approach is adopted to encourage students to apply the knowledge gained to solve real world problems, answer complex …


Deep Air Learning: Interpolation, Prediction, And Feature Analysis Of Fine-Grained Air Quality, Zhongang Qi, Tianchun Wang, Guojie Song, Weisong Hu, Xi Li, Zhongfei Mark Zhang Dec 2018

Research Collection School Of Computing and Information Systems

The interpolation, prediction, and feature analysis of fine-grained air quality are three important topics in the area of urban air computing. The solutions to these topics can provide extremely useful information to support air pollution control, and consequently generate great societal and technical impacts. Most of the existing work solves the three problems separately by different models. In this paper, we propose a general and effective approach to solve the three problems in one model called the Deep Air Learning (DAL). The main idea of DAL lies in embedding feature selection and semi-supervised learning in different layers of the deep …


Using Smart Card Data To Model Commuters’ Responses Upon Unexpected Train Delays, Xiancai Tian, Baihua Zheng Dec 2018

Research Collection School Of Computing and Information Systems

The mass rapid transit (MRT) network is playing an increasingly important role in Singapore's transit network, thanks to its advantages of higher capacity and faster speed. Unfortunately, due to aging infrastructure, increasing demand, and other reasons like adverse weather conditions, commuters in Singapore have recently been facing a growing number of unexpected train delays (UTDs), which have become a source of frustration for both commuters and operators. Most, if not all, existing works on delay management do not consider commuters' behavior. We dedicate this paper to the study of commuters' behavior during UTDs. We adopt a data-driven approach to analyzing the six-month' real …


Mobility-Driven BLE Transmit-Power Adaptation For Participatory Data Muling, Chung-Kyun Han, Archan Misra, Shih-Fen Cheng Dec 2018

Research Collection School Of Computing and Information Systems

This paper analyzes a human-centric framework, called SmartABLE, for easy retrieval of the sensor values from pervasively deployed smart objects in a campus-like environment. In this framework, smartphones carried by campus occupants act as data mules, opportunistically retrieving data from nearby BLE (Bluetooth Low Energy) equipped smart object sensors and relaying them to a backend repository. We focus specifically on dynamically varying the transmission power of the deployed BLE beacons, so as to extend their operational lifetime without sacrificing the frequency of sensor data retrieval. We propose a memetic algorithm-based power adaptation strategy that can handle deployments of thousands of …


A Cloud-Based Data Gathering And Processing System For Intelligent Demand Forecasting, Colin K. L. Tay, Kyong Jin Shim Dec 2018

Research Collection School Of Computing and Information Systems

Demand forecasting has been a challenging problem especially for products with short life cycles such as electronic goods and fashion items. Additionally, in the presence of limited past or historical data as well as the need for fast turnaround for forecast, producing timely and accurate demand forecast can be extremely challenging. In this study, we describe a cloud-based data gathering and processing system for intelligent demand forecasting.


Data Mining Approach To The Identification Of At-Risk Students, Li Chin Ho, Kyong Jin Shim Dec 2018

Research Collection School Of Computing and Information Systems

In recent years, the use of digital tools and technologies in educational institutions has continued to generate large amounts of digital traces of student learning behavior. This study presents a proof-of-concept analytics system that can detect at-risk students along their learning journey. Educators can benefit from the early detection of at-risk students by understanding factors which may lead to failure or drop-out. Further, educators can devise appropriate intervention measures before the students drop out of the course. Our system was built using SAS® Enterprise Miner (EM) and SAS® JMP Pro.


Data Mining Approach To The Detection Of Suicide In Social Media: A Case Study Of Singapore, Jane H. K. Seah, Kyong Jin Shim Dec 2018

Research Collection School Of Computing and Information Systems

In this research, we focus on the social phenomenon of suicide. Specifically, we perform social sensing on digital traces obtained from Reddit. We analyze the posts and comments that are related to depression and suicide. We perform natural language processing to better understand different aspects of human life that relate to suicide.


On Learning Psycholinguistics Tools For English-Based Creole Languages Using Social Media Data, Pei-Chi Lo, Ee-Peng Lim Dec 2018

Research Collection School Of Computing and Information Systems

The Linguistic Inquiry and Word Count (LIWC) tool is a psycholinguistics tool that has been widely used in both psychology and sociology research, and the LIWC scores derived from user-generated content are known to be good features for personality prediction [1], [2]. LIWC, however, is language specific as it relies on counting the percentage of predefined dictionary words occurring in the content. For content written in English Creoles which are languages based on English, the original English LIWC may not perform optimally due to its lack of words which are only used in the English Creoles. In this paper, we …
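
The dictionary-counting idea described in this abstract can be illustrated with a minimal sketch. It is not the LIWC tool or the authors' method: the category names and word lists below are made up for illustration (the real LIWC dictionary is far larger), and the tokenizer is a simplifying assumption. The sketch only shows why a score defined as "percentage of dictionary words in the content" depends on the dictionary covering the vocabulary actually used.

```python
import re
from collections import Counter

# Hypothetical toy dictionary: category -> word list. These entries are
# invented for illustration and are not taken from LIWC.
TOY_DICTIONARY = {
    "posemo": {"happy", "good", "shiok"},
    "negemo": {"sad", "bad", "sian"},
}


def category_percentages(text, dictionary=TOY_DICTIONARY):
    """Return, per category, the percentage of tokens found in its word list."""
    # Simplified tokenizer (assumption): lowercase runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {category: 0.0 for category in dictionary}
    counts = Counter()
    for token in tokens:
        for category, words in dictionary.items():
            if token in words:
                counts[category] += 1
    return {c: 100.0 * counts[c] / len(tokens) for c in dictionary}


scores = category_percentages("The food is good and shiok but I feel sian")
```

If the creole-only words were dropped from the toy word lists, the same sentence would score lower in both categories, which is the coverage gap the abstract points to.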


FogFly: A Traffic Light Optimization Solution Based On Fog Computing, Quang Tran Minh, Chanh Minh Tran, Tuan An Le, Binh Thai Nguyen, Triet Minh Tran, Rajesh Krishna Balan Dec 2018

Research Collection School Of Computing and Information Systems

This paper provides a fog-based approach to solving the traffic light optimization problem which utilizes the Adaptive Traffic Signal Control (ATSC) model. ATSC systems demand the ability to strictly reflect real-time traffic state. The proposed fog computing framework, namely FogFly, meets this requirement through its inherent location-awareness, low latency, and ability to accommodate changes in traffic conditions. As traffic data is updated promptly and processed at fog nodes deployed close to data sources (i.e., vehicles at intersections), traffic light cycles can be optimized efficiently while virtualized resources available at network edges are efficiently utilized. Evaluation results show that …


Towards Mining Comprehensive Android Sandboxes, Tien-Duy B. Le, Lingfeng Bao, David Lo, Debin Gao, Li Li Dec 2018

Research Collection School Of Computing and Information Systems

Android is the most widely used mobile operating system with billions of users and devices. The popularity of Android apps has enticed malware writers to target them. Recently, Jamrozik et al. proposed an approach, named Boxmate, to mine sandboxes to protect Android users from malicious behaviors. In a nutshell, Boxmate analyzes the execution of an app, and collects a list of sensitive APIs that are invoked by that app in a monitoring phase. Then, it constructs a sandbox that can restrict accesses to sensitive APIs not called by the app. In such a way, malicious behaviors that are not observed …
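
The two phases described in this abstract (record the sensitive APIs an app invokes, then block sensitive APIs that were never recorded) can be sketched as set operations. This is a conceptual illustration only, not Boxmate's implementation: the API names and the functions `mine_sandbox` and `is_allowed` are hypothetical.

```python
# Hypothetical list of sensitive API names, used only for illustration.
SENSITIVE_APIS = frozenset(
    {"getDeviceId", "sendTextMessage", "getLastKnownLocation"}
)


def mine_sandbox(observed_calls, sensitive_apis=SENSITIVE_APIS):
    """Monitoring phase: keep the sensitive APIs the app was seen invoking."""
    return frozenset(observed_calls) & sensitive_apis


def is_allowed(api, allowed, sensitive_apis=SENSITIVE_APIS):
    """Enforcement phase: permit non-sensitive APIs and observed sensitive ones."""
    return api not in sensitive_apis or api in allowed


# Calls observed while exercising the app (made-up trace).
allowed = mine_sandbox(["openFile", "getLastKnownLocation", "drawBitmap"])
```

With this trace, a later call to `sendTextMessage` would be rejected because it is sensitive and was never observed, while `openFile` passes because it is not on the sensitive list.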


Centroid-Amenities: An Interactive Visual Analytical Tool For Exploring And Analyzing Amenities In Singapore, Xue Qian Jazreel Siew, Sean Jia Ming Koh Nov 2018

Research Collection School Of Computing and Information Systems

Planning for civic amenities in a fast-changing urban setting such as Singapore is never an easy task. And as urban planners look toward more data-driven approaches toward urban planning, so grows the demand for more flexible geospatial analytics tools to facilitate a more iterative and granular approach toward urban planning. Such specific tools, however, are not always readily available as plugins for traditional desktop GIS software, as numerous customizations must be made to model specific temporal planning scenarios for quick analysis, which could prove both costly and time-consuming. Hence, to address this need, open-source tools such as R Shiny could …


Latent Dirichlet Allocation For Textual Student Feedback Analysis, Swapna Gottipati, Venky Shankararaman, Jeff Lin Nov 2018

Research Collection School Of Computing and Information Systems

Education institutions collect feedback from students upon course completion and analyse it to improve curriculum design, delivery methodology and students' learning experience. A large part of feedback comes in the form of textual comments, which pose a challenge in quantifying and deriving insights. In this paper, we present a novel approach of the Latent Dirichlet Allocation (LDA) model to address this difficulty in handling textual student feedback. The analysis of the quantitative part of student feedback provides general ratings and helps to identify aspects of the teaching that are successful and those that can improve. The reasons for the failure or success, however, …


Sufat: An Analytics Tool For Gaining Insights From Student Feedback Comments, Siddhant Pyasi, Swapna Gottipati, Venky Shankararaman Oct 2018

Research Collection School Of Computing and Information Systems

Teacher evaluation is a vital element in improving student learning outcomes. Course and instructor feedback given by students provides insights that can help improve student learning outcomes and teaching quality. Teaching and course evaluation systems help to collect quantitative and qualitative feedback from students. Since manually analysing the qualitative feedback is a painstaking and tedious process, usually only the quantitative feedback is used for evaluating the course and the instructor. However, useful knowledge is hidden in the qualitative comments, in the form of sentiments and suggestions that can provide valuable insights to help plan improvements in the course content and delivery. In order to efficiently gather, analyse and provide …


Automating Intention Mining, Qiao Huang, Xin Xia, David Lo, Gail C. Murphy Oct 2018

Research Collection School Of Computing and Information Systems

Developers frequently discuss aspects of the systems they are developing online. The comments they post to discussions form a rich information source about the system. Intention mining, a process introduced by Di Sorbo et al., classifies sentences in developer discussions to enable further analysis. As one example of use, intention mining has been used to help build various recommenders for software developers. The technique introduced by Di Sorbo et al. to categorize sentences is based on linguistic patterns derived from two projects. The limited number of data sources used in this earlier work introduces questions about the comprehensiveness of intention …


Probabilistic Collaborative Representation Learning For Personalized Item Recommendation, Aghiles Salah, Hady W. Lauw Aug 2018

Research Collection School Of Computing and Information Systems

We present Probabilistic Collaborative Representation Learning (PCRL), a new generative model of user preferences and item contexts. The latter builds on the assumption that relationships among items within contexts (e.g., browsing session, shopping cart, etc.) may underlie various aspects that guide the choices people make. Intuitively, PCRL seeks representations of items reflecting various regularities between them that might be useful at explaining user preferences. Formally, it relies on Bayesian Poisson Factorization to model user-item interactions, and uses a multilayered latent variable architecture to learn representations of items from their contexts. PCRL seamlessly integrates both tasks within a joint framework. However, …


Offline Versus Online: A Meaningful Categorization Of Ties For Retweets, Felicia Natali, Feida Zhu Aug 2018

Research Collection School Of Computing and Information Systems

With the recent proliferation of news being shared through online social networks, it is crucial to determine how news is spread and what drives people to share certain stories. In this paper, we focus on the social networking site Twitter and analyse users’ retweets. We study retweeting patterns between offline and online friends, particularly, how tweet novelty and tweet topic differ between tweets retweeted by offline friends and those retweeted by online friends.


Deep Learning For Practical Image Recognition: Case Study On Kaggle Competitions, Xulei Yang, Zeng Zeng, Sin G. Teo, Li Wang, Vijay Chandrasekar, Steven C. H. Hoi Aug 2018

Research Collection School Of Computing and Information Systems

In past years, deep convolutional neural networks (DCNN) have achieved big successes in image classification and object detection, as demonstrated on ImageNet in the academic field. However, some unique practical challenges remain for real-world image recognition applications, e.g., small size of the objects, imbalanced data distributions, limited labeled data samples, etc. In this work, we are making efforts to deal with these challenges through a computational framework by incorporating the latest developments in deep learning. By means of a two-stage detection scheme, pseudo labeling, data augmentation, cross-validation and ensemble learning, the proposed framework aims to achieve better performances for practical image …


Online Spatio-Temporal Matching In Stochastic And Dynamic Domains, Meghna Lowalekar, Pradeep Varakantham, Patrick Jaillet Aug 2018

Research Collection School Of Computing and Information Systems

Online spatio-temporal matching of servers/services to customers is a problem that arises at a large scale in many domains associated with shared transportation (e.g., taxis, ride sharing, super shuttles, etc.) and delivery services (e.g., food, equipment, clothing, home fuel, etc.). A key characteristic of these problems is that the matching of servers/services to customers in one stage has a direct impact on the matching in the next stage. For instance, it is efficient for taxis to pick up customers closer to the drop off point of the customer from the first stage of matching. Traditionally, greedy/myopic approaches have been adopted …


Searching For The X-Factor: Exploring Corpus Subjectivity For Word Embeddings, Maksim Tkachenko, Chong Cher Chia, Hady W. Lauw Jul 2018

Research Collection School Of Computing and Information Systems

We explore the notion of subjectivity, and hypothesize that word embeddings learnt from input corpora of varying levels of subjectivity behave differently on natural language processing tasks such as classifying a sentence by sentiment, subjectivity, or topic. Through systematic comparative analyses, we establish this to be the case indeed. Moreover, based on the discovery of the outsized role that sentiment words play on subjectivity-sensitive tasks such as sentiment classification, we develop a novel word embedding SentiVec which is infused with sentiment information from a lexical resource, and is shown to outperform baselines on such tasks.


Online Deep Learning: Learning Deep Neural Networks On The Fly, Doyen Sahoo, Hong Quang Pham, Jing Lu, Steven C. H. Hoi Jul 2018

Research Collection School Of Computing and Information Systems

Deep Neural Networks (DNNs) are typically trained by backpropagation in a batch setting, requiring the entire training data to be made available prior to the learning task. This is not scalable for many real-world scenarios where new data arrives sequentially in a stream. We aim to address an open challenge of “Online Deep Learning” (ODL) for learning DNNs on the fly in an online setting. Unlike traditional online learning that often optimizes some convex objective function with respect to a shallow model (e.g., a linear/kernel-based hypothesis), ODL is more challenging as the optimization objective is non-convex, and regular DNN with …


Detecting Personal Intake Of Medicine From Twitter, Debanjan Mahata, Jasper Friedrichs, Rajiv Ratn Shah, Jing Jiang Jul 2018

Research Collection School Of Computing and Information Systems

Mining social media messages such as tweets, blogs, and Facebook posts for health and drug related information has received significant interest in pharmacovigilance research. Social media sites (e.g., Twitter) have been used for monitoring drug abuse, adverse reactions to drug usage, and analyzing expression of sentiments related to drugs. Most of these studies are based on aggregated results from a large population rather than specific sets of individuals. In order to conduct studies at the level of individuals or specific groups of people, identifying posts mentioning intake of medicine by the user is necessary. Toward this objective we develop a classifier …


DeepTravel: A Neural Network Based Travel Time Estimation Model With Auxiliary Supervision, Hanyuan Zhang, Hao Wu, Weiwei Sun, Baihua Zheng Jul 2018

Research Collection School Of Computing and Information Systems

Estimating the travel time of a path is of great importance to smart urban mobility. Existing approaches are either based on estimating the time cost of each road segment or designed heuristically in a non-learning-based way. The former is not able to capture many cross-segment complex factors while the latter fails to utilize the existing abundant temporal labels of the data, i.e., the time stamp of each trajectory point. In this paper, we leverage new developments in deep neural networks and propose a novel auxiliary supervision model, namely DeepTravel, that can automatically and effectively extract different features, as well …


Online Active Learning With Expert Advice, Shuji Hao, Peiying Hu, Peilin Zhao, Steven C. H. Hoi, Chunyan Miao Jul 2018

Research Collection School Of Computing and Information Systems

In the literature, learning with expert advice methods usually assume that a learner always obtains the true label of every incoming training instance at the end of each trial. However, in many real-world applications, acquiring the true labels of all instances can be both costly and time consuming, especially for large-scale problems. For example, in social media, data streams usually arrive at high speed and volume, and it is nearly impossible and highly costly to label all of the instances. In this article, we address this problem with active learning with expert advice, where the ground truth of an …


Disentangled Person Image Generation, Liqian Ma, Qianru Sun, Stamatios Georgoulis, Luc Van Gool, Bernt Schiele, Mario Fritz Jun 2018

Research Collection School Of Computing and Information Systems

Generating novel, yet realistic, images of persons is a challenging task due to the complex interplay between the different image factors, such as the foreground, background and pose information. In this work, we aim at generating such images based on a novel, two-stage reconstruction pipeline that learns a disentangled representation of the aforementioned image factors and generates novel person images at the same time. First, a multi-branched reconstruction network is proposed to disentangle and encode the three factors into embedding features, which are then combined to re-compose the input image itself. Second, three corresponding mapping functions are learned in an …


Social Stream Classification With Emerging New Labels, Xin Mu, Feida Zhu, Yue Liu, Ee-Peng Lim, Zhi-Hua Zhou Jun 2018

Research Collection School Of Computing and Information Systems

As an important research topic with well-recognized practical value, classification of social streams has gained increasing popularity for social data such as the tweet stream generated by Twitter users in chronological order. A salient, and perhaps also the most interesting, feature of such user-generated content is its never-failing novelty, which, unfortunately, would challenge most traditional pre-trained classification models as they are built on a fixed label set and would therefore fail to identify new labels as they emerge. In this paper, we study the problem of classification of social streams with emerging new labels, and propose a novel …


Natural And Effective Obfuscation By Head Inpainting, Qianru Sun, Liqian Ma, Seong Joon Oh, Luc Van Gool, Bernt Schiele, Mario Fritz Jun 2018

Research Collection School Of Computing and Information Systems

As more and more personal photos are shared online, being able to obfuscate identities in such photos is becoming a necessity for privacy protection. People have largely resorted to blacking out or blurring head regions, but these techniques result in a poor user experience while being surprisingly ineffective against state-of-the-art person recognizers. In this work, we propose a novel head inpainting obfuscation technique. Generating a realistic head inpainting in social media photos is challenging because subjects appear in diverse activities and head orientations. We thus split the task into two sub-tasks: (1) facial landmark generation from image context (e.g. …


Discovering Hidden Topical Hubs And Authorities In Online Social Networks, Roy Ka-Wei Lee, Tuan-Anh Hoang, Ee-Peng Lim May 2018

Research Collection School Of Computing and Information Systems

Finding influential users in online social networks is an important problem with many possible useful applications. HITS and other link analysis methods, in particular, have often been used to identify hub and authority users in web graphs and online social networks. These works, however, have not considered the topical aspect of links in their analysis. A straightforward approach to overcome this limitation is to first apply topic models to learn the user topics before applying the HITS algorithm. In this paper, we instead propose a novel topic model known as Hub and Authority Topic (HAT) model to combine the two process …


Finding All Nearest Neighbors With A Single Graph Traversal, Yixin Xu, Qi Jianzhong, Borovica‐Gajic Renata, Kulik Lars May 2018

Research Collection School Of Computing and Information Systems

Finding the nearest neighbor is a key operation in data analysis and mining. An important variant of nearest neighbor query is the all nearest neighbor (ANN) query, which reports all nearest neighbors for a given set of query objects. Existing studies on ANN queries have focused on Euclidean space. Given the widespread occurrence of spatial networks in urban environments, we study the ANN query in spatial network settings. An example of an ANN query on spatial networks is finding the nearest car parks for all cars currently on the road. We propose VIVET, an index-based algorithm to efficiently process ANN …
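
To make the ANN query on a spatial network concrete, the sketch below uses a textbook multi-source Dijkstra search as a simple baseline. It is not the VIVET algorithm, whose details are not given in this excerpt; the graph, node names, and the function `nearest_facility` are assumptions made for illustration. Seeding the priority queue with every data object (for example, every car park) labels each network node with its nearest data object in one traversal, after which each query object is answered by a lookup.

```python
import heapq


def nearest_facility(graph, facilities):
    """Label every reachable node with (distance, nearest facility).

    graph: {node: [(neighbor, edge_weight), ...]} with non-negative weights.
    facilities: iterable of nodes that hold a data object.
    """
    best = {}
    # Seed the heap with all facilities at distance zero (multi-source search).
    heap = [(0.0, f, f) for f in facilities]
    heapq.heapify(heap)
    while heap:
        dist, node, source = heapq.heappop(heap)
        if node in best:
            continue  # already settled with a distance at least as short
        best[node] = (dist, source)
        for neighbor, weight in graph.get(node, []):
            if neighbor not in best:
                heapq.heappush(heap, (dist + weight, neighbor, source))
    return best


# Made-up road network: a - b - c - d, with car parks at a and d.
roads = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("b", 2.0), ("d", 1.0)],
    "d": [("c", 1.0)],
}
labels = nearest_facility(roads, ["a", "d"])
answers = {car: labels[car] for car in ["b", "c"]}  # cars located at b and c
```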


Domain-Specific Cross-Language Relevant Question Retrieval, Bowen Xu, Zhenchang Xing, Xin Xia, David Lo, Shanping Li Apr 2018

Research Collection School Of Computing and Information Systems

Chinese developers often cannot effectively search questions in English, because they may have difficulties in translating technical words from Chinese to English and formulating proper English queries. For the purpose of helping Chinese developers take advantage of the rich knowledge base of Stack Overflow and simplify the question retrieval process, we propose an automated cross-language relevant question retrieval (CLRQR) system to retrieve relevant English questions for a given Chinese question. CLRQR first extracts essential information (both Chinese and English) from the title and description of the input Chinese question, then performs domain-specific translation of the essential Chinese information into English, …


The Role Of Urban Mobility In Retail Business Survival, Krittika D'Silva, Kasthuri Jayarajah, Anastasios Noulas, Cecilia Mascolo, Archan Misra Apr 2018

Research Collection School Of Computing and Information Systems

Economic and urban planning agencies have strong interest in tackling the hard problem of predicting the odds of survival of individual retail businesses. In this work, we tap urban mobility data available both from a location-based intelligence platform, Foursquare, and from public transportation agencies, and investigate whether mobility-derived features can help foretell the failure of such retail businesses, over a 6 month horizon, across 10 distinct cities spanning the globe. We hypothesise that the survival of such a retail outlet is correlated with not only venue-specific characteristics but also broader neighbourhood-level effects. Through careful statistical analysis of Foursquare and taxi …