A Unified Dialogue User Simulator For Few-Shot Data Augmentation,
2022
Singapore Management University
A Unified Dialogue User Simulator For Few-Shot Data Augmentation, Dazhen Wan, Zheng Zhang, Qi Zhu, Lizi Liao, Minlie Huang
Research Collection School Of Computing and Information Systems
Pre-trained language models have shown superior performance in task-oriented dialogues. However, existing datasets are of limited scale and cannot support large-scale pre-training. Fortunately, various data augmentation methods have been developed to augment large-scale task-oriented dialogue corpora. However, they rely heavily on annotated data in the target domain, which requires a tremendous amount of data collection and human labeling work. In this paper, we build a unified dialogue user simulation model by pre-training on several publicly available datasets. The model can then be tuned on a target domain with few-shot data. The experiments on a target dataset across multiple domains show …
Rural America Is Still Technologically Behind: Why It Matters Now More Than Ever,
2022
Minnesota State University - Mankato
Rural America Is Still Technologically Behind: Why It Matters Now More Than Ever, Paul Force-Emery Mackie
Social Work Department Publications
No abstract provided.
Redefining Research In Nanotechnology Simulations: A New Approach To Data Caching And Analysis,
2022
Purdue University
Redefining Research In Nanotechnology Simulations: A New Approach To Data Caching And Analysis, Darin Tsai, Alan Zhang, Aloysius Rebeiro
The Journal of Purdue Undergraduate Research
No abstract provided.
Morphologically-Aware Vocabulary Reduction Of Word Embeddings,
2022
Singapore Management University
Morphologically-Aware Vocabulary Reduction Of Word Embeddings, Chong Cher Chia, Maksim Tkachenko, Hady Wirawan Lauw
Research Collection School Of Computing and Information Systems
We propose SubText, a compression mechanism via vocabulary reduction. The crux is to judiciously select a subset of word embeddings which support the reconstruction of the remaining word embeddings based on their form alone. The proposed algorithm considers the preservation of the original embeddings, as well as a word’s relationship to other words that are morphologically or semantically similar. Comprehensive evaluation of the compressed vocabulary reveals SubText’s efficacy on diverse tasks over traditional vocabulary reduction techniques, as validated on English, as well as a collection of inflected languages.
Photovoltaic Cells For Energy Harvesting And Indoor Positioning,
2022
Singapore Management University
Photovoltaic Cells For Energy Harvesting And Indoor Positioning, Hamada Rizk, Dong Ma, Mahbub Hassan, Moustafa Youssef
Research Collection School Of Computing and Information Systems
We propose SoLoc, a lightweight probabilistic fingerprinting-based technique for energy-free device-free indoor localization. The system harnesses photovoltaic currents harvested by the photovoltaic cells in smart environments for simultaneously powering digital devices and user positioning. The basic principle is that the location of the human interferes with the lighting received by the photovoltaic cells, thus producing a location fingerprint on the generated photocurrents. To ensure resilience to noisy measurements, SoLoc constructs probability distributions as a photovoltaic fingerprint at each location. Then, we employ a probabilistic graphical model for estimating the user location in the continuous space. Results show that SoLoc can …
Investigating Bloom's Cognitive Skills In Foundation And Advanced Programming Courses From Students' Discussions,
2022
Singapore Management University
Investigating Bloom's Cognitive Skills In Foundation And Advanced Programming Courses From Students' Discussions, Joel Jer Wei Lim, Gottipati Swapna, Kyong Jin Shim
Research Collection School Of Computing and Information Systems
Programming courses provide students with the skills to develop complex business applications. Teaching and learning programming is challenging, and collaborative learning is proposed to help with this challenge. Online discussion forums promote networking with other learners such that they can build knowledge collaboratively. They help students broaden their horizons of thought to acquire cognitive skills. Cognitive analysis of discussions is critical to understanding students' learning process. In this paper, we propose a Bloom's taxonomy-based cognitive model for programming discussion forums. We present a machine learning (ML) based solution to extract students' cognitive skills. Our evaluations on computing courses show that …
Meta-Complementing The Semantics Of Short Texts In Neural Topic Models,
2022
Singapore Management University
Meta-Complementing The Semantics Of Short Texts In Neural Topic Models, Ce Zhang, Hady Wirawan Lauw
Research Collection School Of Computing and Information Systems
Topic models infer latent topic distributions based on observed word co-occurrences in a text corpus. While typically a corpus contains documents of variable lengths, most previous topic models treat documents of different lengths uniformly, assuming that each document is sufficiently informative. However, shorter documents may have only a few word co-occurrences, resulting in inferior topic quality. Some other previous works assume that all documents are short, and leverage external auxiliary data, e.g., pretrained word embeddings and document connectivity. Orthogonal to existing works, we remedy this problem within the corpus itself by proposing a Meta-Complement Topic Model, which improves topic quality …
VLStereoSet: A Study Of Stereotypical Bias In Pre-Trained Vision-Language Models,
2022
Singapore Management University
VLStereoSet: A Study Of Stereotypical Bias In Pre-Trained Vision-Language Models, Kankan Zhou, Yibin Lai, Jing Jiang
Research Collection School Of Computing and Information Systems
In this paper we study how to measure stereotypical bias in pre-trained vision-language models. We leverage a recently released text-only dataset, StereoSet, which covers a wide range of stereotypical bias, and extend it into a vision-language probing dataset called VLStereoSet to measure stereotypical bias in vision-language models. We analyze the differences between text and image and propose a probing task that detects bias by evaluating a model’s tendency to pick stereotypical statements as captions for anti-stereotypical images. We further define several metrics to measure both a vision-language model’s overall stereotypical bias and its intra-modal and inter-modal bias. Experiments on six …
What Motivates Software Practitioners To Contribute To Inner Source?,
2022
Singapore Management University
What Motivates Software Practitioners To Contribute To Inner Source?, Zhiyuan Wan, Xin Xia, Yun Zhang, David Lo, Daibing Zhou, Qiuyuan Chen, Ahmed E. Hassan
Research Collection School Of Computing and Information Systems
Software development organizations have adopted open source development practices to support or augment their software development processes, a phenomenon referred to as inner source. Given the rapid adoption of inner source, we wonder what motivates software practitioners to contribute to inner source projects. We followed a mixed-methods approach--a qualitative phase of interviews with 20 interviewees, followed by a quantitative phase of an exploratory survey with 124 respondents from 13 countries across four continents. Our study uncovers practitioners' motivation to contribute to inner source projects, as well as how the motivation differs from what motivates practitioners to participate in open source …
Text Mining Policy Documents To Support Transboundary Integrated Ecosystem Assessment: The Case Of The South Mid-Atlantic Ridge,
2022
World Maritime University
Text Mining Policy Documents To Support Transboundary Integrated Ecosystem Assessment: The Case Of The South Mid-Atlantic Ridge, Debora Cristina Ferrari Ramalho
World Maritime University Dissertations
No abstract provided.
The Necessity Of Cloud-Based Simulator For Indonesia's Maritime Education And Training Institutions,
2022
World Maritime University
The Necessity Of Cloud-Based Simulator For Indonesia's Maritime Education And Training Institutions, Stevian Geerbel Adrianes Rakka
World Maritime University Dissertations
No abstract provided.
Data Sharing Through Open Access Data Repositories,
2022
Kennesaw State University
Data Sharing Through Open Access Data Repositories, Karin Bennedsen
All Things Open
The National Institutes of Health has expanded its data sharing requirements for obtaining funding to now include all awards for research producing scientific data to accelerate “biomedical research discovery, in part, by enabling validation of research results, providing accessibility to high-value datasets, and promoting data reuse for future research studies.” The new policy requiring a Data Management & Sharing Plan (DMSP) for all applications goes into effect January 25th, 2023. A DMSP includes where the data will be stored. This lightning talk will review Open Access Data Repositories. Don’t let the task of trying to find data storage hold you …
Locally Varying Distance Transform For Unsupervised Visual Anomaly Detection,
2022
Singapore Management University
Locally Varying Distance Transform For Unsupervised Visual Anomaly Detection, Wen-Yan Lin, Zhonghang Liu, Siying Liu
Research Collection School Of Computing and Information Systems
Unsupervised anomaly detection on image data is notoriously unstable. We believe this is because many classical anomaly detectors implicitly assume data is low dimensional. However, image data is always high dimensional. Images can be projected to a low dimensional embedding but such projections rely on global transformations that truncate minor variations. As anomalies are rare, the final embedding often lacks the key variations needed to distinguish anomalies from normal instances. This paper proposes a new embedding using a set of locally varying data projections, with each projection responsible for preserving the variations that distinguish a local cluster of instances from …
Multi-Functional Job Roles To Support Operations In A Multi-Faceted Jewel Enabled By AI And Digital Transformation,
2022
Singapore Management University
Multi-Functional Job Roles To Support Operations In A Multi-Faceted Jewel Enabled By AI And Digital Transformation, Steven Miller
Research Collection School Of Computing and Information Systems
In this story, we highlight the way in which the use of AI enabled support systems, together with work process digital transformation and innovative approaches to job redesign, have combined to dramatically change the nature of the work of the front-line service staff who protect and support the facility and visitors at the world’s most iconic airport mall and lifestyle destination.
On Mitigating Hard Clusters For Face Clustering,
2022
Singapore Management University
On Mitigating Hard Clusters For Face Clustering, Yingjie Chen, Huasong Zhong, Chong Chen, Chen Shen, Jianqiang Huang, Tao Wang, Yun Liang, Qianru Sun
Research Collection School Of Computing and Information Systems
Face clustering is a promising way to scale up face recognition systems using large-scale unlabeled face images. It remains challenging to identify small or sparse face image clusters that we call hard clusters, which is caused by the heterogeneity, i.e., high variations in size and sparsity, of the clusters. Consequently, the conventional way of using a uniform threshold (to identify clusters) often leads to severe misclassification of the samples that should belong to hard clusters. We tackle this problem by leveraging the neighborhood information of samples and inferring the cluster memberships (of samples) in a probabilistic way. We introduce …
Equivariance And Invariance Inductive Bias For Learning From Insufficient Data,
2022
Singapore Management University
Equivariance And Invariance Inductive Bias For Learning From Insufficient Data, Tan Wang, Qianru Sun, Sugiri Pranata, Karlekar Jayashree, Hanwang Zhang
Research Collection School Of Computing and Information Systems
We are interested in learning robust models from insufficient data, without the need for any externally pre-trained model checkpoints. First, compared to sufficient data, we show why insufficient data renders the model more easily biased to the limited training environments that are usually different from testing. For example, if all the training "swan" samples are "white", the model may wrongly use the "white" environment to represent the intrinsic class "swan". Then, we justify that equivariance inductive bias can retain the class feature while invariance inductive bias can remove the environmental feature, leaving only the class feature that generalizes to any …
Ngram-OAXE: Phrase-Based Order-Agnostic Cross Entropy For Non-Autoregressive Machine Translation,
2022
Singapore Management University
Ngram-OAXE: Phrase-Based Order-Agnostic Cross Entropy For Non-Autoregressive Machine Translation, Cunxiao Du, Zhaopeng Tu, Longyue Wang, Jing Jiang
Research Collection School Of Computing and Information Systems
Recently, a new training oaxe loss has proven effective at ameliorating the effect of multimodality for non-autoregressive translation (NAT), which removes the penalty of word order errors in the standard cross-entropy loss. Starting from the intuition that reordering generally occurs between phrases, we extend oaxe by only allowing reordering between ngram phrases and still requiring a strict match of word order within the phrases. Extensive experiments on NAT benchmarks across language pairs and data scales demonstrate the effectiveness and universality of our approach. Further analyses show that ngram-oaxe indeed improves the translation of ngram phrases, and produces more fluent …
Two Singapore Public Healthcare AI Applications For National Screening Programs And Other Examples,
2022
Integrated Health Information Systems Pte Ltd
Two Singapore Public Healthcare AI Applications For National Screening Programs And Other Examples, Andy Wee An Ta, Han Leong Goh, Christine Ang, Lian Yeow Koh, Ken Poon, Steven M. Miller
Research Collection School Of Computing and Information Systems
This article explains how two AI systems have been incorporated into the everyday operations of two Singapore public healthcare nation-wide screening programs. The first example is embedded within the setting of a national level population health screening program for diabetes related eye diseases, targeting the rapidly increasing number of adults in the country with diabetes. In the second example, the AI assisted screening is done shortly after a person is admitted to one of the public hospitals to identify which inpatients—especially which elderly patients with complex conditions—have a high risk of being readmitted as an inpatient multiple times in the …
Adaptive Structural Similarity Preserving For Unsupervised Cross Modal Hashing,
2022
Singapore Management University
Adaptive Structural Similarity Preserving For Unsupervised Cross Modal Hashing, Liang Li, Baihua Zheng, Weiwei Sun
Research Collection School Of Computing and Information Systems
Relation Extraction (RE) is a vital step to complete Knowledge Graph (KG) by extracting entity relations from texts. However, it usually suffers from the long-tail issue. The training data mainly concentrates on a few types of relations, leading to the lack of sufficient annotations for the remaining types of relations. In this paper, we propose a general approach to learn relation prototypes from unlabeled texts, to facilitate the long-tail relation extraction by transferring knowledge from the relation types with sufficient training data. We learn relation prototypes as an implicit factor between entities, which reflects the meanings of relations as well …
TGDM: Target Guided Dynamic Mixup For Cross-Domain Few-Shot Learning,
2022
Singapore Management University
TGDM: Target Guided Dynamic Mixup For Cross-Domain Few-Shot Learning, Linhai Zhuo, Yuqian Fu, Jingjing Chen, Yixin Cao, Yu-Gang Jiang
Research Collection School Of Computing and Information Systems
Given sufficient training data on the source domain, cross-domain few-shot learning (CD-FSL) aims at recognizing new classes with a small number of labeled examples on the target domain. The key to addressing CD-FSL is to narrow the domain gap and transfer the knowledge of a network trained on the source domain to the target domain. To help knowledge transfer, this paper introduces an intermediate domain generated by mixing images in the source and the target domain. Specifically, to generate the optimal intermediate domain for different target data, we propose a novel target guided dynamic mixup (TGDM) framework that leverages the target …