Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

5,920 Full-Text Articles 7,723 Authors 2,672,546 Downloads 195 Institutions

All Articles in Databases and Information Systems

5,920 full-text articles. Page 3 of 219.

A Unified Dialogue User Simulator For Few-Shot Data Augmentation, Dazhen WAN, Zheng ZHANG, Qi ZHU, Lizi LIAO, Minlie HUANG 2022 Singapore Management University

A Unified Dialogue User Simulator For Few-Shot Data Augmentation, Dazhen Wan, Zheng Zhang, Qi Zhu, Lizi Liao, Minlie Huang

Research Collection School Of Computing and Information Systems

Pre-trained language models have shown superior performance in task-oriented dialogues. However, existing datasets are limited in scale and cannot support large-scale pre-training. Fortunately, various data augmentation methods have been developed to augment large-scale task-oriented dialogue corpora. However, they rely heavily on annotated data in the target domain, which requires a tremendous amount of data collection and human labeling work. In this paper, we build a unified dialogue user simulation model by pre-training on several publicly available datasets. The model can then be tuned on a target domain with few-shot data. The experiments on a target dataset across multiple domains show …


Rural America Is Still Technologically Behind: Why It Matters Now More Than Ever, Paul Force-Emery Mackie 2022 Minnesota State University - Mankato

Rural America Is Still Technologically Behind: Why It Matters Now More Than Ever, Paul Force-Emery Mackie

Social Work Department Publications

No abstract provided.


Redefining Research In Nanotechnology Simulations: A New Approach To Data Caching And Analysis, Darin Tsai, Alan Zhang, Aloysius Rebeiro 2022 Purdue University

Redefining Research In Nanotechnology Simulations: A New Approach To Data Caching And Analysis, Darin Tsai, Alan Zhang, Aloysius Rebeiro

The Journal of Purdue Undergraduate Research

No abstract provided.


Morphologically-Aware Vocabulary Reduction Of Word Embeddings, Chong Cher CHIA, Maksim TKACHENKO, Hady Wirawan LAUW 2022 Singapore Management University

Morphologically-Aware Vocabulary Reduction Of Word Embeddings, Chong Cher Chia, Maksim Tkachenko, Hady Wirawan Lauw

Research Collection School Of Computing and Information Systems

We propose SubText, a compression mechanism via vocabulary reduction. The crux is to judiciously select a subset of word embeddings which support the reconstruction of the remaining word embeddings based on their form alone. The proposed algorithm considers the preservation of the original embeddings, as well as a word’s relationship to other words that are morphologically or semantically similar. Comprehensive evaluation of the compressed vocabulary reveals SubText’s efficacy on diverse tasks over traditional vocabulary reduction techniques, as validated on English, as well as a collection of inflected languages.


Photovoltaic Cells For Energy Harvesting And Indoor Positioning, Hamada RIZK, Dong MA, Mahbub HASSAN, Moustafa YOUSSEF 2022 Singapore Management University

Photovoltaic Cells For Energy Harvesting And Indoor Positioning, Hamada Rizk, Dong Ma, Mahbub Hassan, Moustafa Youssef

Research Collection School Of Computing and Information Systems

We propose SoLoc, a lightweight probabilistic fingerprinting-based technique for energy-free device-free indoor localization. The system harnesses photovoltaic currents harvested by the photovoltaic cells in smart environments for simultaneously powering digital devices and user positioning. The basic principle is that the location of the human interferes with the lighting received by the photovoltaic cells, thus producing a location fingerprint on the generated photocurrents. To ensure resilience to noisy measurements, SoLoc constructs probability distributions as a photovoltaic fingerprint at each location. Then, we employ a probabilistic graphical model for estimating the user location in the continuous space. Results show that SoLoc can …
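As a rough illustration of probabilistic fingerprinting in general, the sketch below fits a Gaussian per photovoltaic cell at each surveyed location and then estimates a continuous position as a likelihood-weighted average of those locations. It is not SoLoc's method: the weighted centroid stands in for the probabilistic graphical model the abstract mentions, and every name here (`build_fingerprints`, `estimate_location`, the sample readings) is a hypothetical chosen for the example.

```python
import math
from statistics import mean, pstdev


def build_fingerprints(samples_by_location):
    """Fit one (mean, std) pair per cell for every surveyed (x, y) location."""
    fingerprints = {}
    for location, samples in samples_by_location.items():
        per_cell = list(zip(*samples))  # transpose: one tuple of readings per cell
        # Floor the std so a constant cell does not cause a division by zero.
        fingerprints[location] = [(mean(c), max(pstdev(c), 1e-3)) for c in per_cell]
    return fingerprints


def log_likelihood(reading, fingerprint):
    """Log-probability of a reading under independent per-cell Gaussians."""
    total = 0.0
    for value, (mu, sigma) in zip(reading, fingerprint):
        total += -0.5 * ((value - mu) / sigma) ** 2
        total -= math.log(sigma * math.sqrt(2.0 * math.pi))
    return total


def estimate_location(reading, fingerprints):
    """Return the likelihood-weighted centroid of the surveyed locations."""
    scores = {loc: log_likelihood(reading, fp) for loc, fp in fingerprints.items()}
    best = max(scores.values())
    # Subtract the best score before exponentiating for numerical stability.
    weights = {loc: math.exp(score - best) for loc, score in scores.items()}
    norm = sum(weights.values())
    x = sum(w * loc[0] for loc, w in weights.items()) / norm
    y = sum(w * loc[1] for loc, w in weights.items()) / norm
    return (x, y)


# Hypothetical survey: two locations, two cells, three photocurrent samples each.
survey = {
    (0.0, 0.0): [[1.0, 5.0], [1.2, 5.1], [0.9, 4.9]],
    (4.0, 0.0): [[5.0, 1.0], [5.1, 1.2], [4.9, 0.8]],
}
fingerprints = build_fingerprints(survey)
estimate = estimate_location([1.0, 5.0], fingerprints)
```

Storing a distribution rather than a single averaged reading per location is what lets noisy measurements be scored by likelihood instead of by raw distance.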


Investigating Bloom's Cognitive Skills In Foundation And Advanced Programming Courses From Students' Discussions, Joel Jer Wei LIM, GOTTIPATI Swapna, Kyong Jin SHIM 2022 Singapore Management University

Investigating Bloom's Cognitive Skills In Foundation And Advanced Programming Courses From Students' Discussions, Joel Jer Wei Lim, Gottipati Swapna, Kyong Jin Shim

Research Collection School Of Computing and Information Systems

Programming courses provide students with the skills to develop complex business applications. Teaching and learning programming is challenging, and collaborative learning is proposed to help with this challenge. Online discussion forums promote networking with other learners such that they can build knowledge collaboratively. This helps students broaden their thought processes and acquire cognitive skills. Cognitive analysis of discussions is critical to understanding students' learning process. In this paper, we propose a Bloom's taxonomy-based cognitive model for programming discussion forums. We present a machine learning (ML) based solution to extract students' cognitive skills. Our evaluations on computing courses show that …


Meta-Complementing The Semantics Of Short Texts In Neural Topic Models, Ce ZHANG, Hady Wirawan LAUW 2022 Singapore Management University

Meta-Complementing The Semantics Of Short Texts In Neural Topic Models, Ce Zhang, Hady Wirawan Lauw

Research Collection School Of Computing and Information Systems

Topic models infer latent topic distributions based on observed word co-occurrences in a text corpus. While typically a corpus contains documents of variable lengths, most previous topic models treat documents of different lengths uniformly, assuming that each document is sufficiently informative. However, shorter documents may have only a few word co-occurrences, resulting in inferior topic quality. Some other previous works assume that all documents are short, and leverage external auxiliary data, e.g., pretrained word embeddings and document connectivity. Orthogonal to existing works, we remedy this problem within the corpus itself by proposing a Meta-Complement Topic Model, which improves topic quality …


VLStereoSet: A Study Of Stereotypical Bias In Pre-Trained Vision-Language Models, Kankan ZHOU, Yibin LAI, Jing JIANG 2022 Singapore Management University

VLStereoSet: A Study Of Stereotypical Bias In Pre-Trained Vision-Language Models, Kankan Zhou, Yibin Lai, Jing Jiang

Research Collection School Of Computing and Information Systems

In this paper we study how to measure stereotypical bias in pre-trained vision-language models. We leverage a recently released text-only dataset, StereoSet, which covers a wide range of stereotypical bias, and extend it into a vision-language probing dataset called VLStereoSet to measure stereotypical bias in vision-language models. We analyze the differences between text and image and propose a probing task that detects bias by evaluating a model’s tendency to pick stereotypical statements as captions for anti-stereotypical images. We further define several metrics to measure both a vision-language model’s overall stereotypical bias and its intra-modal and inter-modal bias. Experiments on six …


What Motivates Software Practitioners To Contribute To Inner Source?, Zhiyuan WAN, Xin XIA, Yun ZHANG, David LO, Daibing ZHOU, Qiuyuan CHEN, Ahmed E. HASSAN 2022 Singapore Management University

What Motivates Software Practitioners To Contribute To Inner Source?, Zhiyuan Wan, Xin Xia, Yun Zhang, David Lo, Daibing Zhou, Qiuyuan Chen, Ahmed E. Hassan

Research Collection School Of Computing and Information Systems

Software development organizations have adopted open source development practices to support or augment their software development processes, a phenomenon referred to as inner source. Given the rapid adoption of inner source, we wonder what motivates software practitioners to contribute to inner source projects. We followed a mixed-methods approach--a qualitative phase of interviews with 20 interviewees, followed by a quantitative phase of an exploratory survey with 124 respondents from 13 countries across four continents. Our study uncovers practitioners' motivation to contribute to inner source projects, as well as how the motivation differs from what motivates practitioners to participate in open source …


Text Mining Policy Documents To Support Transboundary Integrated Ecosystem Assessment: The Case Of The South Mid-Atlantic Ridge, Debora Cristina Ferrari Ramalho 2022 World Maritime University

Text Mining Policy Documents To Support Transboundary Integrated Ecosystem Assessment: The Case Of The South Mid-Atlantic Ridge, Debora Cristina Ferrari Ramalho

World Maritime University Dissertations

No abstract provided.


The Necessity Of Cloud-Based Simulator For Indonesia's Maritime Education And Training Institutions, Stevian Geerbel Adrianes Rakka 2022 World Maritime University

The Necessity Of Cloud-Based Simulator For Indonesia's Maritime Education And Training Institutions, Stevian Geerbel Adrianes Rakka

World Maritime University Dissertations

No abstract provided.


Data Sharing Through Open Access Data Repositories, Karin Bennedsen 2022 Kennesaw State University

Data Sharing Through Open Access Data Repositories, Karin Bennedsen

All Things Open

The National Institutes of Health has expanded their data sharing requirements for obtaining funding to now include all awards for research producing scientific data to accelerate “biomedical research discovery, in part, by enabling validation of research results, providing accessibility to high-value datasets, and promoting data reuse for future research studies.” The new policy requiring a Data Management & Sharing Plan (DMSP) for all applications goes into effect January 25th, 2023. A DMSP includes where the data will be stored. This lightning talk will review Open Access Data Repositories. Don’t let the task of trying to find data storage hold you …


Locally Varying Distance Transform For Unsupervised Visual Anomaly Detection, Wen-yan LIN, Zhonghang LIU, Siying LIU 2022 Singapore Management University

Locally Varying Distance Transform For Unsupervised Visual Anomaly Detection, Wen-Yan Lin, Zhonghang Liu, Siying Liu

Research Collection School Of Computing and Information Systems

Unsupervised anomaly detection on image data is notoriously unstable. We believe this is because many classical anomaly detectors implicitly assume data is low dimensional. However, image data is always high dimensional. Images can be projected to a low dimensional embedding but such projections rely on global transformations that truncate minor variations. As anomalies are rare, the final embedding often lacks the key variations needed to distinguish anomalies from normal instances. This paper proposes a new embedding using a set of locally varying data projections, with each projection responsible for preserving the variations that distinguish a local cluster of instances from …


Multi-Functional Job Roles To Support Operations In A Multi-Faceted Jewel Enabled By AI And Digital Transformation, Steven MILLER 2022 Singapore Management University

Multi-Functional Job Roles To Support Operations In A Multi-Faceted Jewel Enabled By AI And Digital Transformation, Steven Miller

Research Collection School Of Computing and Information Systems

In this story, we highlight the way in which the use of AI enabled support systems, together with work process digital transformation and innovative approaches to job redesign, have combined to dramatically change the nature of the work of the front-line service staff who protect and support the facility and visitors at the world’s most iconic airport mall and lifestyle destination.


On Mitigating Hard Clusters For Face Clustering, Yingjie CHEN, Huasong ZHONG, Chong CHEN, Chen SHEN, Jianqiang HUANG, Tao WANG, Yun LIANG, Qianru SUN 2022 Singapore Management University

On Mitigating Hard Clusters For Face Clustering, Yingjie Chen, Huasong Zhong, Chong Chen, Chen Shen, Jianqiang Huang, Tao Wang, Yun Liang, Qianru Sun

Research Collection School Of Computing and Information Systems

Face clustering is a promising way to scale up face recognition systems using large-scale unlabeled face images. It remains challenging to identify small or sparse face image clusters, which we call hard clusters; the difficulty is caused by the heterogeneity, i.e., high variations in size and sparsity, of the clusters. Consequently, the conventional way of using a uniform threshold (to identify clusters) often leads to severe misclassification of the samples that should belong to hard clusters. We tackle this problem by leveraging the neighborhood information of samples and inferring the cluster memberships (of samples) in a probabilistic way. We introduce …
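To make the uniform-threshold problem concrete, here is a minimal sketch of the conventional baseline only, not of the paper's probabilistic method: samples are linked whenever their pairwise similarity meets one global threshold, and clusters are the resulting connected components. The function name and the toy similarity matrix are hypothetical.

```python
def threshold_clusters(similarity, threshold):
    """Group indices whose pairwise similarity meets a single global threshold."""
    n = len(similarity)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy data: samples 0 and 1 form a dense cluster (similarity 0.9),
# samples 2 and 3 form a sparse "hard" cluster (similarity 0.6).
similarity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
]
strict = threshold_clusters(similarity, 0.7)
loose = threshold_clusters(similarity, 0.5)
```

With the threshold tuned for the dense cluster (0.7), the sparse pair is split into two singletons; lowering it to 0.5 recovers the pair, but on real data a lower global threshold would also risk merging unrelated dense clusters.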


Equivariance And Invariance Inductive Bias For Learning From Insufficient Data, Tan WANG, Qianru SUN, Sugiri PRANATA, Karlekar JAYASHREE, Hanwang ZHANG 2022 Singapore Management University

Equivariance And Invariance Inductive Bias For Learning From Insufficient Data, Tan Wang, Qianru Sun, Sugiri Pranata, Karlekar Jayashree, Hanwang Zhang

Research Collection School Of Computing and Information Systems

We are interested in learning robust models from insufficient data, without the need for any externally pre-trained model checkpoints. First, compared to sufficient data, we show why insufficient data renders the model more easily biased to the limited training environments that are usually different from testing. For example, if all the training "swan" samples are "white", the model may wrongly use the "white" environment to represent the intrinsic class "swan". Then, we justify that equivariance inductive bias can retain the class feature while invariance inductive bias can remove the environmental feature, leaving only the class feature that generalizes to any …


Ngram-Oaxe: Phrase-Based Order-Agnostic Cross Entropy For Non-Autoregressive Machine Translation, Cunxiao DU, Zhaopeng TU, Longyue WANG, Jing JIANG 2022 Singapore Management University

Ngram-Oaxe: Phrase-Based Order-Agnostic Cross Entropy For Non-Autoregressive Machine Translation, Cunxiao Du, Zhaopeng Tu, Longyue Wang, Jing Jiang

Research Collection School Of Computing and Information Systems

Recently, a new training loss, oaxe, has proven effective at ameliorating the effect of multimodality in non-autoregressive translation (NAT) by removing the penalty for word order errors in the standard cross-entropy loss. Starting from the intuition that reordering generally occurs between phrases, we extend oaxe by only allowing reordering between ngram phrases and still requiring a strict match of word order within the phrases. Extensive experiments on NAT benchmarks across language pairs and data scales demonstrate the effectiveness and universality of our approach. Further analyses show that ngram-oaxe indeed improves the translation of ngram phrases, and produces more fluent …


Two Singapore Public Healthcare AI Applications For National Screening Programs And Other Examples, Andy Wee An TA, Han Leong GOH, Christine ANG, Lian Yeow KOH, Ken POON, Steven M. MILLER 2022 Integrated Health Information Systems Pte Ltd

Two Singapore Public Healthcare AI Applications For National Screening Programs And Other Examples, Andy Wee An Ta, Han Leong Goh, Christine Ang, Lian Yeow Koh, Ken Poon, Steven M. Miller

Research Collection School Of Computing and Information Systems

This article explains how two AI systems have been incorporated into the everyday operations of two Singapore public healthcare nation-wide screening programs. The first example is embedded within the setting of a national level population health screening program for diabetes related eye diseases, targeting the rapidly increasing number of adults in the country with diabetes. In the second example, the AI assisted screening is done shortly after a person is admitted to one of the public hospitals to identify which inpatients—especially which elderly patients with complex conditions—have a high risk of being readmitted as an inpatient multiple times in the …


Adaptive Structural Similarity Preserving For Unsupervised Cross Modal Hashing, Liang LI, Baihua ZHENG, Weiwei SUN 2022 Singapore Management University

Adaptive Structural Similarity Preserving For Unsupervised Cross Modal Hashing, Liang Li, Baihua Zheng, Weiwei Sun

Research Collection School Of Computing and Information Systems

Relation Extraction (RE) is a vital step to complete Knowledge Graph (KG) by extracting entity relations from texts. However, it usually suffers from the long-tail issue. The training data mainly concentrates on a few types of relations, leading to the lack of sufficient annotations for the remaining types of relations. In this paper, we propose a general approach to learn relation prototypes from unlabeled texts, to facilitate the long-tail relation extraction by transferring knowledge from the relation types with sufficient training data. We learn relation prototypes as an implicit factor between entities, which reflects the meanings of relations as well …


TGDM: Target Guided Dynamic Mixup For Cross-Domain Few-Shot Learning, Linhai ZHUO, Yuqian FU, Jingjing CHEN, Yixin CAO, Yu-Gang JIANG 2022 Singapore Management University

TGDM: Target Guided Dynamic Mixup For Cross-Domain Few-Shot Learning, Linhai Zhuo, Yuqian Fu, Jingjing Chen, Yixin Cao, Yu-Gang Jiang

Research Collection School Of Computing and Information Systems

Given sufficient training data on the source domain, cross-domain few-shot learning (CD-FSL) aims at recognizing new classes with a small number of labeled examples on the target domain. The key to addressing CD-FSL is to narrow the domain gap and transfer knowledge of a network trained on the source domain to the target domain. To help knowledge transfer, this paper introduces an intermediate domain generated by mixing images in the source and the target domain. Specifically, to generate the optimal intermediate domain for different target data, we propose a novel target guided dynamic mixup (TGDM) framework that leverages the target …
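The basic mixing step can be sketched as the standard mixup operation, a pixel-wise convex combination of a source image and a target image. This shows only that generic operation: in the sketch the mixing ratio `lam` is a fixed argument, whereas the abstract describes choosing it dynamically under target guidance, which is not reproduced here. The function name and toy images are hypothetical.

```python
def mixup(source_image, target_image, lam):
    """Return lam * source + (1 - lam) * target, computed pixel by pixel.

    Images are nested lists of floats with identical shapes.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(source_image) != len(target_image):
        raise ValueError("images must have the same number of rows")
    mixed = []
    for source_row, target_row in zip(source_image, target_image):
        if len(source_row) != len(target_row):
            raise ValueError("rows must have the same length")
        mixed.append([lam * s + (1.0 - lam) * t
                      for s, t in zip(source_row, target_row)])
    return mixed


# Toy 1x2 "images": a quarter of the source blended with three quarters of the target.
intermediate = mixup([[0.0, 1.0]], [[1.0, 0.0]], 0.25)
```

A ratio near 1 keeps the intermediate image close to the source domain and a ratio near 0 keeps it close to the target domain, which is why the choice of ratio determines where the intermediate domain sits in the gap.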

