Open Access. Powered by Scholars. Published by Universities.®

Arts and Humanities Commons

Articles 1 - 17 of 17

Full-Text Articles in Arts and Humanities

FlaCGEC: A Chinese Grammatical Error Correction Dataset With Fine-Grained Linguistic Annotation, Hanyue Du, Yike Zhao, Qingyuan Tian, Jiani Wang, Lei Wang, Yunshi Lan, Xuesong Lu Oct 2023

Research Collection School Of Computing and Information Systems

Chinese Grammatical Error Correction (CGEC) has been attracting growing attention from researchers recently. Although multiple CGEC datasets have been developed to support the research, these datasets lack the ability to provide a deep linguistic topology of grammar errors, which is critical for interpreting and diagnosing CGEC approaches. To address this limitation, we introduce FlaCGEC, a new CGEC dataset featuring fine-grained linguistic annotation. Specifically, we collect a raw corpus from the linguistic schema defined by Chinese language experts, conduct edits on sentences via rules, and refine generated samples manually, which results in 10k sentences …


Singlish Checker: A Tool For Understanding And Analysing An English Creole Language, Lee-Hsun Hsieh, Nam Chew Chua, Agus Trisnajaya Kwee, Pei-Chi Lo, Yang-Yin Lee, Ee-Peng Lim Dec 2022

Research Collection School Of Computing and Information Systems

As English is widely used in many countries with different cultures, variants of English, also known as English creoles, have been created. Singlish is one such English creole, used by people in Singapore. Unlike English, however, Singlish is neither taught in schools nor encouraged in formal communications. Hence, it remains a low-resource language, lacking an up-to-date Singlish word dictionary and computational tools to analyse the language. In this paper, we therefore propose Singlish Checker, a tool that helps detect Singlish text, Singlish words and phrases. To …


Learning For Amalgamation: A Multi-Source Transfer Learning Framework For Sentiment Classification, Cuong V. Nguyen, Khiem H. Le, Hong Quang Pham, Quang H. Pham, Binh T. Nguyen Apr 2022

Research Collection School Of Computing and Information Systems

Transfer learning plays an essential role in deep learning: it can remarkably improve performance on a target domain whose training data is insufficient. Our work explores beyond the common practice of transfer learning with a single pre-trained model. We focus on the task of Vietnamese sentiment classification and propose LIFA, a framework to learn a unified embedding from several pre-trained models. We further propose two more LIFA variants that encourage the pre-trained models to either cooperate or compete with one another. Studying these variants sheds light on the success of LIFA by showing that sharing knowledge among the …


Generating Music With Emotions, Chunhui Bao, Qianru Sun Mar 2022

Research Collection School Of Computing and Information Systems

We focus on music generation conditioned on human emotions, specifically positive and negative emotions. There are no existing large-scale music datasets annotated with human emotion labels. It is thus not obvious how to generate music conditioned on emotion labels. In this paper, we propose an annotation-free method to build a new dataset where each sample is a triplet of lyric, melody and emotion label (without requiring any manual labour). Specifically, we first train the automated emotion recognition model using BERT (pre-trained on the GoEmotions dataset) on the Edmonds Dance dataset. We use it to automatically "label" the music …


Transformer-Based Joint Learning Approach For Text Normalization In Vietnamese Automatic Speech Recognition Systems, The Viet Bui, Tho Chi Luong, Oanh Thi Tran Jan 2022

Research Collection School Of Computing and Information Systems

In this article, we investigate the task of normalizing transcribed texts in Vietnamese Automatic Speech Recognition (ASR) systems in order to improve user readability and the performance of downstream tasks. This task usually consists of two main sub-tasks: predicting and inserting punctuation (e.g., period, comma); and detecting and standardizing named entities (e.g., numbers, person names) from spoken forms to their appropriate written forms. To achieve these goals, we introduce a complete corpus of 87,700 sentences and investigate conditional joint learning approaches which globally optimize the two sub-tasks simultaneously. The experimental results are quite promising. Overall, the proposed architecture outperformed the …


A BERT-Based Two-Stage Model For Chinese Chengyu Recommendation, Minghuan Tan, Jing Jiang, Bingtian Dai Nov 2021

Research Collection School Of Computing and Information Systems

In Chinese, Chengyu are fixed phrases consisting of four characters. As a type of idiom, their meanings usually cannot be derived from their component characters. In this paper, we study the task of recommending a Chengyu given a textual context. Observing some of the limitations of existing work, we propose a two-stage model: during the first stage, we re-train a Chinese BERT model by masking out Chengyu from a large Chinese corpus with a wide coverage of Chengyu. During the second stage, we fine-tune the retrained, Chengyu-oriented BERT on a specific Chengyu recommendation dataset. We evaluate this method on …


An Efficient Transformer-Based Model For Vietnamese Punctuation Prediction, Hieu Tran, Cuong V. Dinh, Hong Quang Pham, Binh T. Nguyen Jul 2021

Research Collection School Of Computing and Information Systems

In both formal and informal texts, missing punctuation marks make the texts confusing and challenging to read. This paper conducts exhaustive experiments to investigate the benefits of pre-trained Transformer-based models on two Vietnamese punctuation datasets. The experimental results show that our models achieve encouraging results, and that adding Bi-LSTM and/or CRF layers on top of the proposed models can further boost performance. Finally, our best model significantly surpasses state-of-the-art approaches on both the novel and news datasets for the Vietnamese language, with gains of up to 21.45% and 18.27%, respectively, in overall F1-score.


Base-Package Recommendation Framework Based On Consumer Behaviours In IPTV Platform, Kuruparan Shanmugalingam, Ruwinda Ranganayanke, Chanka Gunawardhaha, Rajitha Navarathna Nov 2020

Research Collection School Of Computing and Information Systems

Internet Protocol Television (IPTV) provides many services such as live television streaming, time-shifted media, and Video On Demand (VOD). However, many customers do not engage properly with their subscribed packages due to a lack of knowledge and poor guidance. Many customers fail to identify the proper IPTV service package for their needs and to utilise their current package to the maximum. In this paper, we propose a base-package recommendation model with a novel customer scoring-meter based on customers' behaviour. Initially, our paper describes an algorithm to measure customers' engagement score, which illustrates a novel approach to track customer engagement …


Vietnamese Punctuation Prediction Using Deep Neural Networks, Thuy Pham, Nhu Nguyen, Hong Quang Pham, Han Cao, Binh Nguyen Jan 2020

Research Collection School Of Computing and Information Systems

Adding appropriate punctuation marks to text is an essential step in speech-to-text, where such information is usually not available. While this has been extensively studied for English, there is no large-scale dataset or comprehensive study of the punctuation prediction problem for the Vietnamese language. In this paper, we collect two massive datasets and conduct a benchmark with both traditional methods and deep neural networks. We aim to publish both our data and all implementation code to facilitate further research, not only in Vietnamese punctuation prediction but also in other related fields. Our project, including datasets and implementation details, is publicly …


Punctuation Prediction For Vietnamese Texts Using Conditional Random Fields, Hong Quang Pham, Binh T. Nguyen, Nguyen Viet Cuong Dec 2019

Research Collection School Of Computing and Information Systems

We investigate punctuation prediction for the Vietnamese language. This problem is crucial as it can be used to add suitable punctuation marks to machine-transcribed speech, which usually lacks such information. Similar to previous work for the English and Chinese languages, we formulate this task as a sequence labeling problem. We then apply the conditional random field model to solve the problem and propose a set of appropriate features that are useful for prediction. Moreover, we build two corpora from Vietnamese online news and movie subtitles and perform extensive experiments on these data. Finally, we ask four volunteers …
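
The sequence labeling formulation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the label names (`O`, `COMMA`, `PERIOD`, `QUESTION`), the lowercasing step, and the helper `to_sequence_labels` are assumptions made for illustration. Each word is paired with a label naming the punctuation mark that follows it, which is the kind of training data a sequence model such as a CRF would consume.

```python
# Hypothetical label set; the paper's actual tag inventory is not given in the abstract.
PUNCT_TO_LABEL = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}


def to_sequence_labels(tokens):
    """Turn a tokenized, punctuated sentence into (word, label) pairs.

    The label of a word names the punctuation mark that follows it,
    or "O" when no mark follows.
    """
    pairs = []
    for tok in tokens:
        if tok in PUNCT_TO_LABEL:
            if pairs:  # attach the mark to the preceding word
                word, _ = pairs[-1]
                pairs[-1] = (word, PUNCT_TO_LABEL[tok])
        else:
            # Lowercasing (an assumption) mimics case-free ASR output.
            pairs.append((tok.lower(), "O"))
    return pairs


example = "Hôm nay trời đẹp , chúng ta đi chơi .".split()
pairs = to_sequence_labels(example)
for word, label in pairs:
    print(word, label)
```

At prediction time the process runs in reverse: the model receives only the words and predicts one label per word, from which the punctuated text is rebuilt.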


Discursive Power In Contemporary Media Systems: A Comparative Framework, Andreas Jungherr, Oliver Posegga, Jisun An Apr 2019

Research Collection School Of Computing and Information Systems

Contemporary media systems are in transition. The constellation of organizations, groups, and individuals contributing information to national and international news flows has changed as a result of the digital transformation. The 'hybrid media system' has proven to be one of the most instructive concepts addressing this change. Its focus on the mutually dependent interconnections between various types of media organizations, actors, and publics has inspired prolific research. Yet the concept can tempt researchers to sidestep systematic analyses of information flows and actors’ differing degrees of influence by treating media systems as a black box. To enable large-scale, empirical comparative studies …


Understanding Music Track Popularity In A Social Network, Jing Ren, Robert J. Kauffman Jun 2017

Research Collection School Of Computing and Information Systems

Thousands of music tracks are uploaded to the Internet every day through websites and social networks that focus on music. While some content has been popular for decades, some tracks that have just been released have been ignored. What makes a music track popular? Can the duration of a music track’s popularity be explained and predicted? By analysing data on the performance of a music track on the ranking charts, coupled with the creation of machine-generated music semantics constructs and a variety of other track, artist and market descriptors, this research tests a model to assess how track popularity and …


What Makes A Music Track Popular In Online Social Networks?, Jing Ren, Jialie Shen, Robert John Kauffman Apr 2016

Research Collection School Of Computing and Information Systems

Tens of thousands of music tracks are uploaded to the Internet every day through social networks that focus on music and videos, as well as portal websites. While some of the content has been popular for decades, some tracks that have just been released have been completely ignored. So what makes a music track popular? Can we predict the popularity of a music track before it is released? In this research, we focus on an online music social network, Last.fm, and investigate three key factors of a music track that may have an impact on its popularity. They include: the …


Influences Of Influential Users: An Empirical Study Of Music Social Network, Jing Ren, Zhiyong Cheng, Jialie Shen, Feida Zhu Jul 2014

Research Collection School Of Computing and Information Systems

Influential users can play a crucial role in online social networks. This paper documents an empirical study exploring the effects of influential users in the context of a music social network. To achieve this goal, a music diffusion graph is developed to model how information propagates over the network. We also propose a heuristic method to measure users' influence. Using real data from Last.fm, our empirical test demonstrates key effects of influential users and reveals limitations of existing influence identification/characterization schemes.


Query-Document-Dependent Fusion: A Case Study Of Multimodal Music Retrieval, Zhonghua Li, Bingjun Zhang, Yi Yu, Jialie Shen, Ye Wang Dec 2013

Research Collection School Of Computing and Information Systems

In recent years, multimodal fusion has emerged as a promising technology for effective multimedia retrieval. Developing the optimal fusion strategy for different modalities (e.g. content, metadata) has been the subject of intensive research. Given a query, existing methods derive a unified fusion strategy for all documents, with the underlying assumption that the relative significance of a modality remains the same across all documents. However, this assumption is often invalid. We thus propose a general multimodal fusion framework, query-document-dependent fusion (QDDF), which derives the optimal fusion strategy for each query-document pair via intelligent content analysis of both queries and documents. By …
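
The contrast this abstract draws, between one fusion strategy shared by all documents and a strategy chosen per query-document pair, can be sketched as a weighted late fusion. This is an illustrative assumption, not the QDDF method itself: the abstract does not say how the per-pair weights are derived, so the weights below are supplied by hand, and the function `fuse` and all scores are hypothetical.

```python
def fuse(scores, weights):
    """Weighted average of per-modality relevance scores for one document."""
    total = sum(weights.values())
    return sum(weights[m] * scores[m] for m in scores) / total


# Hypothetical per-modality scores of two documents for the same query.
doc_scores = {
    "doc_a": {"content": 0.9, "metadata": 0.2},
    "doc_b": {"content": 0.4, "metadata": 0.8},
}

# Query-dependent fusion: one weight vector shared by every document.
shared = {"content": 1.0, "metadata": 1.0}
shared_ranking = sorted(
    doc_scores, key=lambda d: fuse(doc_scores[d], shared), reverse=True
)

# Query-document-dependent fusion: a weight vector per (query, document) pair.
# These weights are made up; deriving them is the part the paper addresses.
per_pair = {
    "doc_a": {"content": 0.8, "metadata": 0.2},
    "doc_b": {"content": 0.5, "metadata": 0.5},
}
pair_ranking = sorted(
    doc_scores, key=lambda d: fuse(doc_scores[d], per_pair[d]), reverse=True
)

print(shared_ranking)
print(pair_ranking)
```

With these made-up numbers the two rankings differ, which is the situation the abstract describes: the relative significance of a modality need not be the same for every document.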


K-Pop Live: Social Networking & Language Learning Platform, Thomas Chua, Chin Leng Ong, Kian Ming Png, Aloysius Lau, Houston Toh, Feida Zhu, Kyong Jin Shim, Ee-Peng Lim Feb 2013

Research Collection School Of Computing and Information Systems

K-Pop live is a social networking and language learning platform developed by an undergraduate student team from Singapore Management University. K-Pop live aims to combine social media with gamification to promote Korean culture. It consolidates all relevant Tweets from Twitter as well as videos from YouTube. The platform allows users to connect with friends who share similar interests in terms of K-pop artists and music.


An Artificial Immune System Based Approach For English Grammar Correction, Akshat Kumar, Shivashankar B. Nair Aug 2007

Research Collection School Of Computing and Information Systems

Grammar checking and correction are among the primary problems in the area of Natural Language Processing (NLP). Traditional approaches fall into two major categories: rule-based and corpus-based. While the former relies heavily on grammar rules, the latter is statistical in nature. We provide a novel corpus-based approach for grammar checking that uses the principles of an Artificial Immune System (AIS). We treat grammatical errors as pathogens (in immunological terms) and build antibody detectors capable of detecting grammatical errors while allowing correct constructs to filter through. Our results show that it is possible to detect a range of …