Databases and Information Systems

2020

Articles 1 - 30 of 41

Full-Text Articles in Artificial Intelligence and Robotics

Data: The Good, The Bad And The Ethical, John D. Kelleher, Filipe Cabral Pinto, Luis M. Cortesao Dec 2020

Articles

It is often the case with new technologies that it is very hard to predict their long-term impacts and as a result, although new technology may be beneficial in the short term, it can still cause problems in the longer term. This is what happened with oil by-products in different areas: the use of plastic as a disposable material did not take into account the hundreds of years necessary for its decomposition and its related long-term environmental damage. Data is said to be the new oil. The message to be conveyed is associated with its intrinsic value. But as in …


Vision-Based Analytics For Improved Ai-Driven Iot Applications, Amit Sharma Dec 2020

Dissertations and Theses Collection (Open Access)

Proliferation of Internet of Things (IoT) sensor systems, primarily driven by cheaper embedded hardware platforms and wide availability of light-weight software platforms, has opened up doors for large-scale data collection opportunities. The availability of massive amounts of data has in turn given rise to rapidly growing machine learning models, e.g., You Only Look Once (YOLO), Single-Shot-Detectors (SSD) and so on. There has been a growing trend of applying machine learning techniques, e.g., object detection, image classification, face detection etc., on data collected from camera sensors, thereby enabling a plethora of vision-sensing applications, namely self-driving cars, automatic crowd monitoring, traffic-flow analysis, occupancy …


Interventional Few-Shot Learning, Zhongqi Yue, Zhang Hanwang, Qianru Sun, Xian-Sheng Hua Dec 2020

Research Collection School Of Computing and Information Systems

We uncover an ever-overlooked deficiency in the prevailing Few-Shot Learning (FSL) methods: the pre-trained knowledge is indeed a confounder that limits the performance. This finding is rooted in our causal assumption: a Structural Causal Model (SCM) for the causalities among the pre-trained knowledge, sample features, and labels. Thanks to it, we propose a novel FSL paradigm: Interventional Few-Shot Learning (IFSL). Specifically, we develop three effective IFSL algorithmic implementations based on the backdoor adjustment, which is essentially a causal intervention towards the SCM of many-shot learning: the upper-bound of FSL in a causal view. It is worth noting that the contribution …
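
For orientation, the textbook form of the backdoor adjustment mentioned in the abstract can be written as below. The symbols are assumptions made for illustration: X stands for the sample features, Y for the label, and D for the confounder (here, the pre-trained knowledge). The excerpt does not give the paper's exact causal graph or how D is stratified, so this is only the generic formula, not the authors' implementation.

```latex
% Generic backdoor adjustment: intervene on X by averaging over the
% confounder D, weighted by its prior rather than by P(D | X).
P\bigl(Y \mid \mathrm{do}(X)\bigr) \;=\; \sum_{d} P\bigl(Y \mid X,\, D = d\bigr)\, P(D = d)
```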


Nearest Centroid: A Bridge Between Statistics And Machine Learning, Manoj Thulasidas Dec 2020

Research Collection School Of Computing and Information Systems

In order to guide our students of machine learning in their statistical thinking, we need conceptually simple and mathematically defensible algorithms. In this paper, we present the Nearest Centroid (NC) algorithm as a pedagogical tool, combining the key concepts behind two foundational algorithms: K-Means clustering and K Nearest Neighbors (k-NN). In NC, we use the centroid (as defined in the K-Means algorithm) of the observations belonging to each class in our training data set and its distance from a new observation (similar to k-NN) for class prediction. Using this obvious extension, we will illustrate how the concepts of …
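
The prediction rule described in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance and plain Python lists; the function names `fit_centroids` and `predict` and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def fit_centroids(X, y):
    """Compute the per-class mean vector (centroid) of the training rows."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    # zip(*rows) iterates over feature columns, so each centroid is a column-wise mean.
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def predict(x, centroids):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))


# Toy data: two well-separated classes in two dimensions.
X_train = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [5.0, 4.0]]
y_train = [0, 0, 1, 1]
centroids = fit_centroids(X_train, y_train)
predictions = [predict(x, centroids) for x in ([0.5, 0.5], [4.0, 5.0])]
```

Training reduces to one mean per class, and prediction to one distance per class, which is what makes the method convenient for teaching.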


Causal Intervention For Weakly-Supervised Semantic Segmentation, Zhang Dong, Hanwang Zhang, Jinhui Tang, Xian-Sheng Hua, Qianru Sun Dec 2020

Research Collection School Of Computing and Information Systems

We present a causal inference framework to improve Weakly-Supervised Semantic Segmentation (WSSS). Specifically, we aim to generate better pixel-level pseudo-masks by using only image-level labels, the most crucial step in WSSS. We attribute the cause of the ambiguous boundaries of pseudo-masks to the confounding context, e.g., the correct image-level classification of "horse" and "person" may be due not only to the recognition of each instance, but also to their co-occurrence context, making it hard for model inspection (e.g., CAM) to distinguish between the boundaries. Inspired by this, we propose a structural causal model to analyze the causalities among images, contexts, and …


Deep Multi-Task Learning For Depression Detection And Prediction In Longitudinal Data, Guansong Pang, Ngoc Thien Anh Pham, Emma Baker, Rebecca Bentley, Anton Van Den Hengel Dec 2020

Research Collection School Of Computing and Information Systems

Depression is among the most prevalent mental disorders, affecting millions of people of all ages globally. Machine learning techniques have been shown to be effective in enabling automated detection and prediction of depression for early intervention and treatment. However, they are challenged by the relative scarcity of instances of depression in the data. In this work we introduce a novel deep multi-task recurrent neural network to tackle this challenge, in which depression classification is jointly optimized with two auxiliary tasks, namely one-class metric learning and anomaly ranking. The auxiliary tasks introduce an inductive bias that improves the classification model's generalizability on small depression …


Heterogeneous Univariate Outlier Ensembles In Multidimensional Data, Guansong Pang, Longbing Cao Dec 2020

Research Collection School Of Computing and Information Systems

In outlier detection, recent major research has shifted from developing univariate methods to multivariate methods due to the rapid growth of multidimensional data. However, one typical issue of this paradigm shift is that multidimensional data often contains mainly univariate outliers, for which many features are actually irrelevant. In such cases, multivariate methods are ineffective in identifying such outliers due to the potential biases and the curse of dimensionality brought by irrelevant features. Those univariate outliers might be well detected by applying univariate outlier detectors to the individually relevant features. However, it is very challenging to choose the right univariate detector …


Espade: An Efficient And Semantically Secure Shortest Path Discovery For Outsourced Location-Based Services, Bharath K. Samanthula, Divyadharshini Karthikeyan, Boxiang Dong, K. Anitha Kumari Oct 2020

Department of Computer Science Faculty Scholarship and Creative Works

With the rapid growth of smart devices and technological advancements in tracking geospatial data, the demand for Location-Based Services (LBS) is facing a constant rise in several domains, including military, healthcare and transportation. It is a natural step to migrate LBS to a cloud environment to achieve on-demand scalability and increased resiliency. Nonetheless, outsourcing sensitive location data to a third-party cloud provider raises a host of privacy concerns as the data owners have reduced visibility and control over the outsourced data. In this paper, we consider outsourced LBS where users want to retrieve map directions without disclosing their location information. …


The Future Of Work Now: Automl At 84.51° And Kroger, Thomas H. Davenport, Steven M. Miller Oct 2020

Research Collection School Of Computing and Information Systems

One of the most frequently-used phrases at business events these days is “the future of work.” It’s increasingly clear that artificial intelligence and other new technologies will bring substantial changes in work tasks and business processes. But while these changes are predicted for the future, they’re already present in many organizations for many different jobs. The job and incumbents described below are an example of this phenomenon.


Dual-Slam: A Framework For Robust Single Camera Navigation, Huajian Huang, Wen-Yan Lin, Siying Liu, Dong Zhang, Sai-Kit Yeung Oct 2020

Research Collection School Of Computing and Information Systems

SLAM (Simultaneous Localization And Mapping) seeks to provide a moving agent with real-time self-localization. To achieve real-time speed, SLAM incrementally propagates position estimates. This makes SLAM fast but also makes it vulnerable to local pose estimation failures. As local pose estimation is ill-conditioned, local pose estimation failures happen regularly, making the overall SLAM system brittle. This paper attempts to correct this problem. We note that while local pose estimation is ill-conditioned, pose estimation over longer sequences is well-conditioned. Thus, local pose estimation errors eventually manifest themselves as mapping inconsistencies. When this occurs, we save the current map and activate two …


Foodbot: A Goal-Oriented Just-In-Time Healthy Eating Interventions Chatbot, Philips Kokoh Prasetyo, Palakorn Achananuparp, Ee-Peng Lim Oct 2020

Research Collection School Of Computing and Information Systems

Recent research has identified a few design flaws in popular mobile health (mHealth) applications for promoting a healthy eating lifestyle, such as mobile food journals. These include tediousness of manual food logging, inadequate food database coverage, and a lack of healthy dietary goal setting. To address these issues, we present Foodbot, a chatbot-based mHealth application for goal-oriented just-in-time (JIT) healthy eating interventions. Powered by a large-scale food knowledge graph, Foodbot utilizes automatic speech recognition and a mobile messaging interface to record food intake. Moreover, Foodbot allows users to set goals and guides their behavior toward the goals via JIT notification prompts, interactive …


The Future Of Work Now: Ai-Driven Transaction Surveillance At Dbs Bank, Thomas H. Davenport, Steven M. Miller Oct 2020

Research Collection School Of Computing and Information Systems

One of the most frequently-used phrases at business events these days is “the future of work.” It’s increasingly clear that artificial intelligence and other new technologies will bring substantial changes in work tasks and business processes. But while these changes are predicted for the future, they’re already present in many organizations for many different jobs. The job and incumbents described below are an example of this phenomenon. Steve Miller of Singapore Management University and I co-authored the story.


Cover Song Identification - A Novel Stem-Based Approach To Improve Song-To-Song Similarity Measurements, Lavonnia Newman, Dhyan Shah, Chandler Vaughn, Faizan Javed Sep 2020

SMU Data Science Review

Music is incorporated into our daily lives, whether intentionally or unintentionally. It evokes responses and behavior, so much so that there is an entire field of study dedicated to the psychology of music. Music creates the mood for dancing, exercising, creative thought or even relaxation. It is a powerful tool that can be used in various venues and through advertisements to influence and guide human reactions. Music is also often "borrowed" in the industry today. The practices of sampling and remixing music in the digital age have made cover song identification an active area of research. While most of this research is focused …


Machine Learning Applications For Drug Repurposing, Hansaim Lim Sep 2020

Dissertations, Theses, and Capstone Projects

The cost of bringing a drug to market is astounding and the failure rate is intimidating. Drug discovery has been of limited success under the conventional reductionist model of one-drug-one-gene-one-disease paradigm, where a single disease-associated gene is identified and a molecular binder to the specific target is subsequently designed. Under the simplistic paradigm of drug discovery, a drug molecule is assumed to interact only with the intended on-target. However, small molecular drugs often interact with multiple targets, and those off-target interactions are not considered under the conventional paradigm. As a result, drug-induced side effects and adverse reactions are often neglected …


The Future Of Work Now: The Multi-Faceted Mall Security Guard At A Multi-Faceted Jewel, Thomas H. Davenport, Steven M. Miller Sep 2020

Research Collection School Of Computing and Information Systems

One of the most frequently-used phrases at business events these days is “the future of work.” It’s increasingly clear that artificial intelligence and other new technologies will bring substantial changes in work tasks and business processes. But while these changes are predicted for the future, they’re already present in many organizations for many different jobs. The job and incumbents described below are an example of this phenomenon. Steve Miller of Singapore Management University and I co-authored the story.


Feature Pyramid Transformer, Dong Zhang, Hanwang Zhang, Jinhui Tang, Meng Wang, Xian-Sheng Hua, Qianru Sun Aug 2020

Research Collection School Of Computing and Information Systems

Feature interactions across space and scales underpin modern visual recognition systems because they introduce beneficial visual contexts. Conventionally, spatial contexts are passively hidden in the CNN’s increasing receptive fields or actively encoded by non-local convolution. Yet, the non-local spatial interactions are not across scales, and thus they fail to capture the non-local contexts of objects (or parts) residing in different scales. To this end, we propose a fully active feature interaction across both space and scales, called Feature Pyramid Transformer (FPT). It transforms any feature pyramid into another feature pyramid of the same size but with richer contexts, by using …


An Attention-Based Rumor Detection Model With Tree-Structured Recursive Neural Networks, Jing Ma, Wei Gao, Shafiq Joty, Kam-Fai Wong Aug 2020

Research Collection School Of Computing and Information Systems

Rumor spread in social media severely jeopardizes the credibility of online content. Thus, automatic debunking of rumors is of great importance to keep social media a healthy environment. While facing a dubious claim, people often dispute its truthfulness sporadically in their posts containing various cues, which can form useful evidence with long-distance dependencies. In this work, we propose to learn discriminative features from microblog posts by following their non-sequential propagation structure and generate more powerful representations for identifying rumors. For modeling non-sequential structure, we first represent the diffusion of microblog posts with propagation trees, which provide valuable clues on how …


An Ensemble Of Epoch-Wise Empirical Bayes For Few-Shot Learning, Yaoyao Liu, Bernt Schiele, Qianru Sun Aug 2020

Research Collection School Of Computing and Information Systems

Few-shot learning aims to train efficient predictive models with a few examples. The lack of training data leads to poor models that perform high-variance or low-confidence predictions. In this paper, we propose to meta-learn the ensemble of epoch-wise empirical Bayes models (E3BM) to achieve robust predictions. “Epoch-wise” means that each training epoch has a Bayes model whose parameters are specifically learned and deployed. “Empirical” means that the hyperparameters, e.g., used for learning and ensembling the epoch-wise models, are generated by hyperprior learners conditional on task-specific data. We introduce four kinds of hyperprior learners by considering inductive vs. transductive, and epoch-dependent …


Love A Restaurant? Swipe Right On Foodrecce, Hady W. Lauw, Smu Office Of Research Jul 2020

Research@SMU Infographics

A bunch of your friends want to meet for dinner, but nobody can agree on where and what to eat? FoodRecce can help! FoodRecce is an app, developed under the Preferred.AI initiative, that provides recommendations on restaurants based on users' locations and past preferences.


Trajectory Similarity Learning With Auxiliary Supervision And Optimal Matching, Hanyuan Zhang, Xingyu Zhang, Qize Jiang, Baihua Zheng, Zhenbang Sun, Weiwei Sun, Changhu Wang Jul 2020

Research Collection School Of Computing and Information Systems

Trajectory similarity computation is a core problem in the field of trajectory data queries. However, the high time complexity of calculating the trajectory similarity has always been a bottleneck in real-world applications. Learning-based methods can map trajectories into a uniform embedding space to calculate the similarity of two trajectories with embeddings in constant time. In this paper, we propose a novel trajectory representation learning framework Traj2SimVec that performs scalable and robust trajectory similarity computation. We use a simple and fast trajectory simplification and indexing approach to obtain triplet training samples efficiently. We make the framework more robust via taking full …


Translating Counting Problems Into Computable Language Expressions, Zach Prescott Jun 2020

Theses

The realm of automated problem solving is a relatively new field, even in the context of natural language processing. One area where this is often demonstrated is that of creating a program that can solve word problems. The program must understand the problem, perform some processing, and then convey this information to a user in a way that is accessible and understandable. There has been quite a lot of progress in this area with simpler problems. However, when it comes to understanding problems that involve a level of NLP, the results are not conclusive. In this paper, we would like …


Knowledge Enhanced Neural Fashion Trend Forecasting, Yunshan Ma, Yujuan Ding, Xun Yang, Lizi Liao, Wai Keung Wong, Tat-Seng Chua Jun 2020

Research Collection School Of Computing and Information Systems

Fashion trend forecasting is a crucial task for both academia and industry. Although some efforts have been devoted to tackling this challenging task, they only studied limited fashion elements with highly seasonal or simple patterns, which could hardly reveal the real fashion trends. Towards insightful fashion trend forecasting, this work focuses on investigating fine-grained fashion element trends for specific user groups. We first contribute a large-scale fashion trend dataset (FIT) collected from Instagram with extracted time series fashion element records and user information. Furthermore, to effectively model the time series data of fashion elements with rather complex patterns, we propose a Knowledge Enhanced …


Improved Chinese Language Processing For An Open Source Search Engine, Xianghong Sun May 2020

Master's Projects

Natural Language Processing (NLP) is the analysis of human languages by computers. NLP spans many areas, including speech recognition, natural language understanding, and natural language generation.

Information retrieval and natural language processing for Asian languages have their own unique set of challenges not present for Indo-European languages. Some of these are text segmentation, named entity recognition in unsegmented text, and part of speech tagging. In this report, we describe our implementation of and experiments with improving the Chinese language processing sub-component of an open source search engine, Yioop. In particular, we rewrote …


Predictive Modeling Of Asynchronous Event Sequence Data, Jin Shang May 2020

LSU Doctoral Dissertations

Large volumes of temporal event data, such as online check-ins and electronic records of hospital admissions, are becoming increasingly available in a wide variety of applications including healthcare analytics, smart cities, and social network analysis. Those temporal events are often asynchronous and interdependent, and exhibit self-exciting properties. For example, in patient diagnosis events, an elevated risk exists for a patient who has recently been at risk. Machine learning that leverages event sequence data can improve the prediction accuracy of future events and provide valuable services. For example, in e-commerce and network traffic diagnosis, the analysis of user activities can be …


Predicting Disease Progression Using Deep Recurrent Neural Networks And Longitudinal Electronic Health Record Data, Seunghwan Kim May 2020

McKelvey School of Engineering Theses & Dissertations

Electronic Health Records (EHR) are widely adopted and used throughout healthcare systems and are able to collect and store longitudinal information data that can be used to describe patient phenotypes. From the underlying data structures used in the EHR, discrete data can be extracted and analyzed to improve patient care and outcomes via tasks such as risk stratification and prospective disease management. Temporality in EHR is innately present given the nature of these data, however, and traditional classification models are limited in this context by the cross-sectional nature of training and prediction processes. Finding temporal patterns in EHR is especially …


Storage Management Strategy In Mobile Phones For Photo Crowdsensing, En Wang, Zhengdao Qu, Xinyao Liang, Xiangyu Meng, Yongjian Yang, Dawei Li, Weibin Meng Apr 2020

Department of Computer Science Faculty Scholarship and Creative Works

In mobile crowdsensing, some users jointly finish a sensing task through the sensors equipped in their intelligent terminals. In particular, photo crowdsensing based on Mobile Edge Computing (MEC) collects pictures for some specific targets or events and uploads them to nearby edge servers, which leads to richer data content and more efficient data storage compared with common mobile crowdsensing; hence, it has attracted considerable attention recently. However, mobile users prefer uploading the photos through Wi-Fi APs (PoIs) rather than cellular networks. Therefore, photos stored in mobile phones are exchanged among users, in order to …


A Cue Adaptive Decoder For Controllable Neural Response Generation, Weichao Wang, Shi Feng, Wei Gao, Daling Wang, Yifei Zhang Apr 2020

Research Collection School Of Computing and Information Systems

In open-domain dialogue systems, dialogue cues such as emotion, persona, and emoji can be incorporated into conversation models for strengthening the semantic relevance of generated responses. Existing neural response generation models either incorporate the dialogue cue into the decoder’s initial state or embed the cue indiscriminately into the state of every generated word, which may cause the gradients of the embedded cue to vanish or disturb the semantic relevance of generated words during back propagation. In this paper, we propose a Cue Adaptive Decoder (CueAD) that aims to dynamically determine the involvement of a cue at each generation step in the decoding. …


Recipegpt: Generative Pre-Training Based Cooking Recipe Generation And Evaluation System, Helena Huey Chong Lee, Ke Shu, Palakorn Achananuparp, Philips Kokoh Prasetyo, Yue Liu, Ee-Peng Lim, Lav R. Varshney Apr 2020

Research Collection School Of Computing and Information Systems

Interest in the automatic generation of cooking recipes has been growing steadily over the past few years thanks to the large number of online cooking recipes. We present RecipeGPT, a novel online recipe generation and evaluation system. The system provides two modes of text generation: (1) instruction generation from given recipe title and ingredients; and (2) ingredient generation from recipe title and cooking instructions. Its back-end text generation module comprises a generative pre-trained language model GPT-2 fine-tuned on a large cooking recipe dataset. Moreover, the recipe evaluation module allows the users to conveniently inspect the quality of the generated recipe …


Stochastically Robust Personalized Ranking For Lsh Recommendation Retrieval, Dung D. Le, Hady W. Lauw Feb 2020

Research Collection School Of Computing and Information Systems

Locality Sensitive Hashing (LSH) has become one of the most commonly used approximate nearest neighbor search techniques to avoid the prohibitive cost of scanning through all data points. For recommender systems, LSH achieves efficient recommendation retrieval by encoding user and item vectors into binary hash codes, reducing the cost of exhaustively examining all the item vectors to identify the top-k items. However, conventional matrix factorization models may suffer from performance degeneration caused by randomly-drawn LSH hash functions, directly affecting the ultimate quality of the recommendations. In this paper, we propose a framework named SRPR, which factors in the stochasticity of …
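
To make the idea of encoding vectors into binary hash codes concrete, here is a minimal sketch of one common LSH family, sign random projection (random hyperplanes). The choice of this family, the helper names `make_hyperplanes`, `hash_code` and `hamming`, and the toy vectors are assumptions for illustration; the sketch does not implement SRPR or any method from the paper.

```python
import random


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian hyperplane normals (the random hash functions)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of differing bits between two codes."""
    return sum(x != y for x, y in zip(a, b))


planes = make_hyperplanes(n_bits=16, dim=3)
user = [0.2, -1.0, 0.5]
aligned_item = [0.4, -2.0, 1.0]    # same direction as the user vector
opposite_item = [-0.2, 1.0, -0.5]  # opposite direction

user_code = hash_code(user, planes)
aligned_code = hash_code(aligned_item, planes)
opposite_code = hash_code(opposite_item, planes)
```

Vectors pointing in the same direction receive identical codes, while opposite vectors differ in every bit, so Hamming distance between codes serves as a cheap proxy for angular distance. Because the hyperplanes are drawn at random, different seeds give different codes, which is the source of the stochasticity the abstract refers to.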


Multi-Level Head-Wise Match And Aggregation In Transformer For Textual Sequence Matching, Shuohang Wang, Yunshi Lan, Yi Tay, Jing Jiang, Jingjing Liu Feb 2020

Research Collection School Of Computing and Information Systems

Transformer has been successfully applied to many natural language processing tasks. However, for textual sequence matching, simple matching between the representation of a pair of sequences might bring in unnecessary noise. In this paper, we propose a new approach to sequence pair matching with Transformer, by learning head-wise matching representations on multiple levels. Experiments show that our proposed approach can achieve new state-of-the-art performance on multiple tasks that rely only on pre-computed sequence-vector representation, such as SNLI, MNLI-match, MNLI-mismatch, QQP, and SQuAD-binary.