Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 7 of 7

Full-Text Articles in Entire DC Network

Unveiling The Dynamics Of Crisis Events: Sentiment And Emotion Analysis Via Multi-Task Learning With Attention Mechanism And Subject-Based Intent Prediction, Phyo Yi Win Myint, Siaw Ling Lo, Yuhao Zhang Jul 2024

Research Collection School Of Computing and Information Systems

In the age of rapid internet expansion, social media platforms like Twitter have become crucial for sharing information, expressing emotions, and revealing intentions during crisis situations. They offer crisis responders a means to assess public sentiment, attitudes, intentions, and emotional shifts by monitoring crisis-related tweets. To enhance sentiment and emotion classification, we adopt a transformer-based multi-task learning (MTL) approach with an attention mechanism, enabling simultaneous handling of both tasks and capitalizing on task interdependencies. Incorporating an attention mechanism allows the model to concentrate on important words that strongly convey sentiment and emotion. We compare three baseline models, and our findings show that …


Text-To-SQL: A Methodical Review Of Challenges And Models, Ali Buğra Kanburoğlu, Faik Boray Tek May 2024

Turkish Journal of Electrical Engineering and Computer Sciences

This survey focuses on Text-to-SQL, the automated translation of natural language queries into SQL queries. Initially, we describe the problem and its main challenges. Then, by following the PRISMA systematic review methodology, we survey the existing Text-to-SQL review papers in the literature. We apply the same method to extract proposed Text-to-SQL models and classify them with respect to the evaluation metrics and benchmarks used. We highlight the accuracies achieved by various models on Text-to-SQL datasets and discuss execution-guided evaluation strategies. We present insights into model training times and implementations of different models. We also explore the availability of Text-to-SQL datasets in non-English …


Using ChatGPT To Generate Gendered Language, Shweta Soundararajan, Manuela Nayantara Jeyaraj, Sarah Jane Delany Mar 2024

Conference papers

Gendered language is the use of words that denote an individual's gender. This can be explicit, where the gender is evident in the actual word used (e.g., mother, she, man), but it can also be implicit, where social roles or behaviours signal an individual's gender; for example, expectations that women display communal traits (e.g., affectionate, caring, gentle) and men display agentic traits (e.g., assertive, competitive, decisive). The use of gendered language in NLP systems can perpetuate gender stereotypes and bias. This paper proposes an approach to generating gendered language datasets using ChatGPT which will provide data for data-driven …


A Survey On Few-Shot Class-Incremental Learning, Songsong Tian, Lusi Li, Weijun Li, Hang Ran, Xin Ning, Prayag Tiwari Jan 2024

Computer Science Faculty Publications

Large deep learning models are impressive, but they struggle when real-time data is not available. Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks to learn new tasks from just a few labeled samples without forgetting the previously learned ones. This setup can easily lead to catastrophic forgetting and overfitting problems, severely affecting model performance. Studying FSCIL helps overcome deep learning model limitations on data volume and acquisition time, while improving practicality and adaptability of machine learning models. This paper provides a comprehensive survey on FSCIL. Unlike previous surveys, we aim to synthesize few-shot learning and incremental …


A Chinese Power Text Classification Algorithm Based On Deep Active Learning, Song Deng, Qianliang Li, Renjie Dai, Siming Wei, Di Wu, Yi He, Xindong Wu Jan 2024

Computer Science Faculty Publications

The construction of a knowledge graph is beneficial for grid production, electrical safety protection, fault diagnosis and traceability in an observable and controllable way. A high-precision text classification algorithm is crucial to building a professional knowledge graph in the power system. Unfortunately, there are a large number of poorly described and specialized texts in the power business system, and the amount of data containing valid labels in these texts is low. This poses great challenges for improving the precision of text classification models. To bridge this gap, we propose a classification algorithm for Chinese text in the power system based on deep …


A Comparison Of Lexical Tokenization Methods, Nathan Culmer Jan 2024

Williams Honors College, Honors Research Projects

The purpose of this project was to compare tokenization methods, or methods of breaking up a text into meaningful parts for use in natural language processing. The effectiveness of several commonly used tokenization methods was investigated, including morpheme tokenization, which takes into account the linguistic features of the language. In addition, I proposed and implemented a new technique to consider the capitalization pattern of a word in the tokenization process, in order to allow this process to include more natural language features. The effectiveness of these methods was compared by using them in a sentiment analysis model for various datasets, …


Unifying Context With Labeled Property Graph: A Pipeline-Based System For Comprehensive Text Representation In NLP, Ali Hur, Naeem Janjua, Mohiuddin Ahmed Jan 2024

Research outputs 2022 to 2026

Extracting valuable insights from vast amounts of unstructured digital text presents significant challenges across diverse domains. This research addresses this challenge by proposing a novel pipeline-based system that generates domain-agnostic and task-agnostic text representations. The proposed approach leverages labeled property graphs (LPG) to encode contextual information, facilitating the integration of diverse linguistic elements into a unified representation. The proposed system enables efficient graph-based querying and manipulation by addressing the crucial aspect of comprehensive context modeling and fine-grained semantics. The effectiveness of the proposed system is demonstrated through the implementation of NLP components that operate on LPG-based representations. Additionally, the proposed …