Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 30 of 30

Full-Text Articles in Databases and Information Systems

Using Smart Card Data To Model Commuters’ Responses Upon Unexpected Train Delays, Xiancai Tian, Baihua Zheng Dec 2018

Research Collection School Of Computing and Information Systems

The mass rapid transit (MRT) network is playing an increasingly important role in Singapore's transit network, thanks to its advantages of higher capacity and faster speed. Unfortunately, due to aging infrastructure, increasing demand, and other reasons like adverse weather conditions, commuters in Singapore have recently been facing a growing number of unexpected train delays (UTDs), which have become a source of frustration for both commuters and operators. Most, if not all, existing works on delay management do not consider commuters' behavior. We dedicate this paper to the study of commuters' behavior during UTDs. We adopt a data-driven approach to analyzing the six-month' real …


Data Mining Approach To The Detection Of Suicide In Social Media: A Case Study Of Singapore, Jane H. K. Seah, Kyong Jin Shim Dec 2018

Research Collection School Of Computing and Information Systems

In this research, we focus on the social phenomenon of suicide. Specifically, we perform social sensing on digital traces obtained from Reddit. We analyze the posts and comments that are related to depression and suicide. We perform natural language processing to better understand different aspects of human life that relate to suicide.


Imaginary People Representing Real Numbers: Generating Personas From Online Social Media Data, Jisun An, Haewoon Kwak, Soongyo Jung, Joni Salminen, M. Admad, Bernard J. Jansen Nov 2018

Research Collection School Of Computing and Information Systems

We develop a methodology to automate creating imaginary people, referred to as personas, by processing complex behavioral and demographic data of social media audiences. From a popular social media account containing more than 30 million interactions by viewers from 198 countries engaging with more than 4,200 online videos produced by a global media corporation, we demonstrate that our methodology has several novel accomplishments, including: (a) identifying distinct user behavioral segments based on the user content consumption patterns; (b) identifying impactful demographics groupings; and (c) creating rich persona descriptions by automatically adding pertinent attributes, such as names, photos, and personal characteristics. …


Comparing ELM With SVM In The Field Of Sentiment Classification Of Social Media Text Data, Zhihuan Chen, Zhaoxia Wang, Zhiping Lin, Ting Yang Nov 2018

Research Collection School Of Computing and Information Systems

Machine learning has been used in various fields with thousands of applications. Extreme learning machine (ELM), which is the most recently developed machine learning algorithm, has become increasingly popular for its good generalization ability. However, it has been relatively less applied to the domain of social media. Support Vector Machine (SVM), another popular learning-based algorithm, has been applied for sentiment classification of social media text data and has obtained good results. This paper investigates and compares the capabilities of these two learning-based methods in the field of sentiment classification of social media. The results indicate that SVM can obtain good …


Linky: Visualizing User Identity Linkage Results For Multiple Online Social Networks (Demo), Roy Ka-Wei Lee, Ming Shan Hee, Philips Kokoh Prasetyo, Ee-Peng Lim Nov 2018

Research Collection School Of Computing and Information Systems

User identity linkage across online social networks is an emerging research topic that has attracted attention in recent years. Many user identity linkage methods have been proposed so far and most of them utilize user profile, content and network information to determine if two social media accounts belong to the same person. In most cases, user identity linkage methods are evaluated by performing some prediction tasks with the results presented using some overall accuracy measures. However, the methods are rarely compared at the individual user level where a predicted matched (or linked) pair of user identities from different online social …


Improving Multi-Label Emotion Classification Via Sentiment Classification With Dual Attention Transfer Network, Jianfei Yu, Luis Marujo, Jing Jiang, Pradeep Karuturi, William Brendel Nov 2018

Research Collection School Of Computing and Information Systems

In this paper, we aim to improve the performance of multi-label emotion classification with the help of sentiment classification. Specifically, we propose a new transfer learning architecture to divide the sentence representation into two different feature spaces, which are expected to capture, respectively, the general sentiment words and the other important emotion-specific words via a dual attention mechanism. Extensive experimental results demonstrate that our transfer learning approach can outperform several strong baselines and achieve state-of-the-art performance on two benchmark datasets.


Exploiting The Interdependency Of Land Use And Mobility For Urban Planning, Kasthuri Jayarajah, Andrew Tan, Archan Misra Oct 2018

Research Collection School Of Computing and Information Systems

Urban planners and economists alike have a strong interest in understanding the interdependency of land use and people flow. The two-pronged problem entails systematic modeling and understanding of how land use impacts crowd flow to an area and, in turn, how the influx of people to an area (or lack thereof) can influence the viability of business entities in that area. With cities becoming increasingly sensor-rich, for example, digitized payments for public transportation and constant trajectory tracking of buses and taxis, understanding and modeling crowd flows at the city scale, as well as at finer granularity such as at the neighborhood …


Inferring Trip Occupancies In The Rise Of Ride-Hailing Services, Meng-Fen Chiang, Ee-Peng Lim, Wang-Chien Lee, Tuan-Anh Hoang Oct 2018

Research Collection School Of Computing and Information Systems

Knowledge of all occupied and unoccupied trips made by self-employed drivers is essential for optimized vehicle dispatch by ride-hailing services (e.g., Didi Dache, Uber, Lyft, Grab). However, the occupancy status of vehicles is not always known to the service operators due to the adoption of multiple ride-hailing apps. In this paper, we propose a novel framework, Learning to INfer Trips (LINT), to infer the occupancy of car trips by exploring characteristics of observed occupied trips. Two main research steps, stop point classification and structural segmentation, are included in LINT. In the stop point classification step, we represent a vehicle trajectory …


Traffic-Cascade: Mining And Visualizing Lifecycles Of Traffic Congestion Events Using Public Bus Trajectories, Agus Trisnajaya Kwee, Meng-Fen Chiang, Philips Kokoh Prasetyo, Ee-Peng Lim Oct 2018

Research Collection School Of Computing and Information Systems

As road transportation supports both economic and social activities in developed cities, it is important to maintain smooth traffic on all highways and local roads. Whenever possible, traffic congestion should be detected early and resolved quickly. While existing traffic monitoring dashboard systems have been put in place in many cities, these systems require high-cost vehicle speed monitoring instruments and detect traffic congestion as independent events. There is a lack of low-cost dashboards to inspect and analyze the lifecycle of traffic congestion, which is critical in assessing the overall impact of congestion, determining the possible source(s) of congestion and its …


Diversity In Online Advertising: A Case Study Of 69 Brands On Social Media, Jisun An, Ingmar Weber Sep 2018

Research Collection School Of Computing and Information Systems

Lack of diversity in advertising is a long-standing problem. Despite growing cultural awareness and missed business opportunities, many minorities remain under- or inappropriately represented in advertising. Previous research has studied how people react to culturally embedded ads, but such work focused mostly on print media or television using lab experiments. In this work, we look at diversity in content posted by 69 U.S. brands on two social media platforms, Instagram and Facebook. Using face detection technology, we infer the gender, race, and age of both the faces in the ads and of the users engaging with ads. Using this dataset, …


Implicit Linking Of Food Entities In Social Media, Wen Haw Chong, Ee Peng Lim Sep 2018

Research Collection School Of Computing and Information Systems

Dining is an important part of people’s lives, and this explains why food-related microblogs and reviews are popular in social media. Identifying food entities in food-related posts is important to food lover profiling and food (or restaurant) recommendations. In this work, we conduct Implicit Entity Linking (IEL) to link food-related posts to food entities in a knowledge base. In IEL, we link posts even if they do not contain explicit entity mentions. We first show empirically that food venues are entity-focused and associated with a limited number of food entities each. Hence, same-venue posts are likely to share common food …


ESG And Corporate Financial Performance: Empirical Evidence From China's Listed Power Generation Companies, Changhong Zhao, Yu Guo, Jiahai Yuan, Mengya Wu, Daiyu Li, Yiou Zhou, Jiangang Kang Aug 2018

Research Collection School Of Computing and Information Systems

Nowadays, listed companies around the world are shifting from short-term goals of maximizing profits to long-term sustainable environmental, social, and governance (ESG) goals. People have come to realize that ESG has become an important source of corporate risk and may affect a company's financial performance and profitability. Recent research shows that good ESG performance could improve financial performance in some countries. Yet, the question of how ESG affects financial performance has not been thoroughly discussed and studied in China. In this article, we study China's listed power generation groups to explore the relationship between ESG performance and …


Offline Versus Online: A Meaningful Categorization Of Ties For Retweets, Felicia Natali, Feida Zhu Aug 2018

Research Collection School Of Computing and Information Systems

With the recent proliferation of news being shared through online social networks, it is crucial to determine how news is spread and what drives people to share certain stories. In this paper, we focus on the social networking site Twitter and analyse users’ retweets. We study retweeting patterns between offline and online friends, in particular, how tweet novelty and tweet topic differ between tweets retweeted by offline friends and those retweeted by online friends.


Taxis Strike Back: A Field Trial Of The Driver Guidance System, Shih-Fen Cheng, Shashi Shekhar Jha, Rishikeshan Rajendram Jul 2018

Research Collection School Of Computing and Information Systems

Traditional taxi fleet operators the world over have been facing intense competition from various ride-hailing services such as Uber and Grab (specific to the Southeast Asia region). Based on our studies on the taxi industry in Singapore, we see that the emergence of Uber and Grab in the ride-hailing market has greatly impacted the taxi industry: the average daily taxi ridership for the past two years has been falling continuously, by close to 20% in total. In this work, we discuss how efficient real-time data analytics and large-scale multi-agent optimization technology could potentially help taxi drivers compete against more technologically advanced service …


PACELA: A Neural Framework For User Visitation In Location-Based Social Networks, Thanh Nam Doan, Ee-Peng Lim Jul 2018

Research Collection School Of Computing and Information Systems

Check-in prediction using location-based social network data is an important research problem for both academia and industry since an accurate check-in predictive model is useful to many applications, e.g. urban planning, venue recommendation, route suggestion, and context-aware advertising. Intuitively, when considering venues to visit, users may rely on their past observed visit histories as well as some latent attributes associated with the venues. In this paper, we therefore propose a check-in prediction model based on a neural framework called Preference and Context Embeddings with Latent Attributes (PACELA). PACELA learns the embeddings space for the user and venue data as well …


DeepTravel: A Neural Network Based Travel Time Estimation Model With Auxiliary Supervision, Hanyuan Zhang, Hao Wu, Weiwei Sun, Baihua Zheng Jul 2018

Research Collection School Of Computing and Information Systems

Estimating the travel time of a path is of great importance to smart urban mobility. Existing approaches are either based on estimating the time cost of each road segment or designed heuristically in a non-learning-based way. The former is not able to capture many complex cross-segment factors, while the latter fails to utilize the existing abundant temporal labels of the data, i.e., the time stamp of each trajectory point. In this paper, we leverage recent developments in deep neural networks and propose a novel auxiliary supervision model, namely DeepTravel, that can automatically and effectively extract different features, as well …


Detecting Personal Intake Of Medicine From Twitter, Debanjan Mahata, Jasper Friedrichs, Rajiv Ratn Shah, Jing Jiang Jul 2018

Research Collection School Of Computing and Information Systems

Mining social media messages such as tweets, blogs, and Facebook posts for health and drug related information has received significant interest in pharmacovigilance research. Social media sites (e.g., Twitter) have been used for monitoring drug abuse and adverse reactions to drug usage, and for analyzing expressions of sentiment related to drugs. Most of these studies are based on aggregated results from a large population rather than specific sets of individuals. In order to conduct studies at the level of individuals or specific groups of people, identifying posts mentioning intake of medicine by the user is necessary. Toward this objective we develop a classifier …


A Driver Guidance System For Taxis In Singapore, Shashi Shekhar Jha, Shih-Fen Cheng, Meghna Lowalekar, Nicholas Wong, Rishikeshan Rajendram, Pradeep Varakantham, Nghia Troung Troung, Firmansyah Bin Abd Rahman Jul 2018

Research Collection School Of Computing and Information Systems

Traditional taxi fleet operators the world over have been facing intense competition from various ride-hailing services such as Uber and Grab. Based on our studies on the taxi industry in Singapore, we see that the emergence of Uber and Grab in the ride-hailing market has greatly impacted the taxi industry: the average daily taxi ridership for the past two years has been falling continuously, by close to 20% in total. In this work, we discuss how efficient real-time data analytics and large-scale multiagent optimization technology could help taxi drivers compete against more technologically advanced service platforms. Our system has been in field trial …


Analysis Of Public Transportation Patterns In A Densely Populated City With Station-Based Shared Bikes, Di Wang, Evan Wu, Ah-Hwee Tan Jul 2018

Research Collection School Of Computing and Information Systems

Densely populated cities face great challenges of high transportation demand and limited physical space. Thus, in these cities, the public transportation system is heavily relied on. Conventional public transportation modes such as bus, taxi and subway have been globally deployed over the past century. In the last decade, a new type of public transportation mode, shared bike, emerged in many cities. These shared bikes are deployed by either government-regulated or profit-driven companies and are either station-based or station-less. Nonetheless, all of them are designed to better solve the last-mile problem in densely populated cities as complements to the conventional public …


From 2,772 Segments To Five Personas: Summarizing A Diverse Online Audience By Generating Culturally Adapted Personas, Joni Salminen, Sercan Sengun, Haewoon Kwak, Bernard J. Jansen, Jisun An, Soon-Gyu Jung, Sarah Vieweg, D. Fox Harrell Jun 2018

Research Collection School Of Computing and Information Systems

Understanding users in the era of social media is challenging, requiring organizations to adopt novel computation-aided approaches. To exemplify such an approach, we retrieved information on millions of interactions with YouTube video content from a major Middle Eastern media outlet, to automatically generate personas that capture how different audience segments interact with thousands of individual content pieces. Then, we used qualitative data to provide additional insights into the automatically generated persona profiles. Our findings provide insights into social media usage in the Middle East and demonstrate the application of a novel methodology that generates culturally adapted personas of social media …


Automatically Conceptualizing Social Media Analytics Data Via Personas, Jung S.G., Salminen J., An J., Kwak H., Jansen B.J. Jun 2018

Research Collection School Of Computing and Information Systems

Social media analytics is insightful but can also be difficult to use within organizations. To address this, we present Automatic Persona Generation (APG), a system and methodology for quantitatively generating personas using large amounts of online social media data. The APG system is operational, deployed in a pilot version with several organizations in multiple industry verticals. APG uses a robust web and stable back-end database framework to process tens of millions of user interactions with thousands of online digital products on multiple social media platforms, including Facebook and YouTube. APG identifies both distinct and impactful audience segments for an organization …


Column Generation Approach For Feeder Vessel Routing And Synchronization At A Congested Transshipment Port, Jian G. Jin, Qiang Meng, Hai Wang Jun 2018

Research Collection School Of Computing and Information Systems

With increasing container-shipping traffic in major transshipment ports, unsynchronized shipping services at hub ports usually lead to loss of transshipment connections, significant vessel port-stay time, and congestion. This calls for the design of feeder vessel services to pick up from and deliver containers to neighboring local ports, and, at the same time, synchronize them with long-haul services in a manner that enables efficient container transshipment. In this paper, we present a mixed integer linear programming model to optimize the feeder vessel routes and hub port synchronization with an objective to minimize the total operating and connection cost. We exploit the …


Discovering Hidden Topical Hubs And Authorities In Online Social Networks, Roy Ka-Wei Lee, Tuan-Anh Hoang, Ee-Peng Lim May 2018

Research Collection School Of Computing and Information Systems

Finding influential users in online social networks is an important problem with many possible useful applications. HITS and other link analysis methods, in particular, have often been used to identify hub and authority users in web graphs and online social networks. These works, however, have not considered the topical aspect of links in their analysis. A straightforward approach to overcome this limitation is to first apply topic models to learn the user topics before applying the HITS algorithm. In this paper, we instead propose a novel topic model known as the Hub and Authority Topic (HAT) model to combine the two process …


Big Data For Climate Change Actions And The Paradox Of Citizen Informedness, Kustini Lim-Wavde, Robert J. Kauffman May 2018

Research Collection School Of Computing and Information Systems

Advanced sensor technology, social media, and other information technologies have provided us with “big data” on climate change. Due to the World Meteorological Organization’s Global Climate Observing System, climate observations and records, as well as discussions on climate-related concerns such as measurement of air temperature, are widely available now. The United Nations’ Global Pulse visualises public engagement on climate change globally, with data such as the volume of climate-related tweets. Big data, data analytics, and the sharing of scientific results in the popular press have created, as a result, an unprecedented level of citizen informedness—the degree to which citizens have …


Understanding The Effects Of Taxi Ride-Sharing: A Case Study Of Singapore, Yazhe Wang, Baihua Zheng, Ee Peng Lim May 2018

Research Collection School Of Computing and Information Systems

This paper studies the effects of ride-sharing among those calling on taxis in Singapore for similar origin and destination pairs at nearly the same time of day. It proposes a simple yet practical framework for taxi ride-sharing and scheduling, to reduce waiting times and travel times during peak demand periods. The solution method helps taxi users save money while helping taxi drivers serve multiple requests per day, thus increasing their earnings. A comprehensive simulation study is conducted, based on real taxi booking data for the city of Singapore, to evaluate the effect of various factors of the ride-sharing practice, e.g., …


Detect Rumor And Stance Jointly By Neural Multi-Task Learning, Jing Ma, Wei Gao, Kam-Fai Wong Apr 2018

Research Collection School Of Computing and Information Systems

In recent years, an unhealthy phenomenon characterized as the massive spread of fake news or unverified information (i.e., rumors) has become an increasingly daunting issue in human society. Rumors commonly originate from social media outlets, primarily microblogging platforms, and then go viral through wild, willful propagation by a large number of participants. It is observed that rumorous posts often trigger versatile, mostly controversial stances among participating users. Thus, determining the stances on the posts in question can be pertinent to the successful detection of rumors, and vice versa. Existing studies, however, mainly regard rumor detection and stance classification as …


Exploiting User And Venue Characteristics For Fine-Grained Tweet Geolocation, Wen Haw Chong, Ee Peng Lim Apr 2018

Research Collection School Of Computing and Information Systems

Which venue is a tweet posted from? We call this a fine-grained geolocation problem. Given an observed tweet, the task is to infer its discrete posting venue, e.g., a specific restaurant. This recovers the venue context and differs from prior work, which geolocates tweets to location coordinates or cities/neighborhoods. First, we conduct empirical analysis to uncover venue and user characteristics for improving geolocation. For venues, we observe spatial homophily, in which venues near each other have more similar tweet content (i.e., text representations) compared to venues further apart. For users, we observe that they are spatially focused and more likely …


Do Your Friends Make You Buy This Brand?: Modeling Social Recommendation With Topics And Brands, Minh Duc Luu, Ee Peng Lim Mar 2018

Research Collection School Of Computing and Information Systems

Consumer behavior and marketing research have shown that brand has significant influence on product reviews and product purchase decisions. However, there is very little work on incorporating brand related factors into product recommender systems. Meanwhile, the similarity in brand preference between a user and other socially connected users also affects her adoption decisions. To integrate seamlessly the individual and social brand related factors into the recommendation process, we propose a novel model called Social Brand–Item–Topic (SocBIT). As the original SocBIT model does not enforce non-negativity, which poses some difficulty in result interpretation, we also propose a non-negative version, called SocBIT(Formula …


Upping The Game Of Taxi Driving In The Age Of Uber, Shashi Shekhar Jha, Shih-Fen Cheng, Meghna Lowalekar, Wai Hin Wong, Rajendram Rishikeshan Rajendram, Trong Khiem Tran, Pradeep Varakantham, Nghia Truong Trong, Firmansyah Abd Rahman Feb 2018

Research Collection School Of Computing and Information Systems

In most cities, taxis play an important role in providing point-to-point transportation service. If the taxi service is reliable, responsive, and cost-effective, past studies show that taxi-like services can be a viable choice in replacing a significant amount of private cars. However, making taxi services efficient is extremely challenging, mainly due to the fact that taxi drivers are self-interested and they operate with only local information. Although past research has demonstrated how recommendation systems could potentially help taxi drivers in improving their performance, most of these efforts are not feasible in practice. This is mostly due to the lack of …


Anatomy Of Online Hate: Developing A Taxonomy And Machine Learning Models For Identifying And Classifying Hate In Online News Media, Joni Salminen, Hind Almerekhi, Milica Milenkovic, Soon-Gyu Jung, Haewoon Kwak, Bernard J. Jansen Jan 2018

Research Collection School Of Computing and Information Systems

Online social media platforms generally attempt to mitigate hateful expressions, as these comments can be detrimental to the health of the community. However, automatically identifying hateful comments can be challenging. We manually label 5,143 hateful expressions posted to YouTube and Facebook videos among a dataset of 137,098 comments from an online news media. We then create a granular taxonomy of different types and targets of online hate and train machine learning models to automatically detect and classify the hateful comments in the full dataset. Our contribution is twofold: 1) creating a granular taxonomy for hateful online comments that includes both …