Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Technological University Dublin

Natural language processing

Articles 1 - 6 of 6

Full-Text Articles in Computer Engineering

Know An Emotion By The Company It Keeps: Word Embeddings From Reddit/Coronavirus, Alejandro García-Rudolph, David Sanchez-Pinsach, Dietmar Frey, Eloy Opisso, Katryna Cisek, John Kelleher Jan 2023

Articles

Social media is a crucial communication tool (e.g., with 430 million monthly active users in online forums such as Reddit) and a frequent target of Natural Language Processing (NLP) techniques. One such technique, word embeddings, is based on the quotation, “You shall know a word by the company it keeps,” highlighting the importance of context in NLP. Meanwhile, “Context is everything in Emotion Research.” Therefore, we aimed to train a model (W2V) for generating word associations (also known as embeddings) using a popular Coronavirus Reddit forum, validate them using public evidence, and apply them to the discovery of context for specific …


Exploring Gender Bias In Semantic Representations For Occupational Classification In NLP: Techniques And Mitigation Strategies, Joseph Michael O'Carroll Jan 2023

Dissertations

Gender bias in Natural Language Processing (NLP) models is a non-trivial problem that can perpetuate and amplify existing societal biases. This thesis investigates gender bias in occupation classification and explores the effectiveness of different debiasing methods for language models to reduce the impact of bias in the model’s representations. The study employs a data-driven empirical methodology focusing heavily on experimentation and result investigation. The study uses five distinct semantic representations and models with varying levels of complexity to classify the occupation of individuals based on their biographies.


Exploiting BERT And RoBERTa To Improve Performance For Aspect Based Sentiment Analysis, Gagan Reddy Narayanaswamy Jan 2021

Dissertations

Sentiment analysis, also known as opinion mining, is a type of text research that analyses people’s opinions expressed in written language. It brings together various research areas such as Natural Language Processing (NLP), Data Mining, and Text Mining, and is fast becoming of major importance to companies and organizations as it has started to incorporate online commerce data for analysis. Often the data on which sentiment analysis is performed will be reviews. The data can range from reviews of a small product to reviews of a big multinational corporation. The goal of performing sentiment analysis is to extract information from those …


Evaluating The Performance Of Transformer Architecture Over Attention Architecture On Image Captioning, Deepti Balasubramaniam Jan 2021

Dissertations

Over the last few decades, computer vision and Natural Language Processing have shown tremendous improvement in different tasks such as image captioning, video captioning, and machine translation using deep learning models. However, there has not been much research on transformer-based image captioning and how it outperforms other models that were implemented for image captioning. In this study, we will design a simple encoder-decoder model, an attention model, and a transformer model for image captioning using the Flickr8K dataset, and we will discuss the hyperparameters of the model, the type of pre-trained model used, and how long the model has been trained. …


Languages For Different Health Information Readers: Multitrait-Multimethod Content Analysis Of Cochrane Systematic Reviews Textual Summary Formats, Jasna Karačić, Pierpaolo Dondio, Ivan Buljan, Darko Hren, Ana Marušić Jan 2019

Articles

Background: Although subjective expressions and linguistic fluency have been shown to be important factors in processing and interpreting textual facts, analyses of these traits in textual health information for different audiences are lacking. We analyzed the readability and linguistic psychological and emotional characteristics of different textual summary formats of Cochrane systematic reviews. Methods: We performed a multitrait-multimethod cross-sectional study of press releases available on the Cochrane website (n = 162) and the corresponding scientific abstracts (n = 158), Cochrane Clinical Answers (n = 35), and plain language summaries in English (n = 156), French (n = 101), German (n = 41), and Croatian (n = 156). We used SMOG index …


Investigation Into The Application Of Personality Insights And Language Tone Analysis In Spam Classification, Colm Mcgetrick May 2017

Dissertations

Due to its persistence, spam remains one of the biggest problems facing users and suppliers of email communication services. Machine learning techniques have been very successful at preventing many spam mails from arriving in user mailboxes; however, spam still accounts for over 50% of all emails sent. Despite this relative success, the economic cost of spam has been estimated as high as $50 billion in 2005 and more recently at $20 billion, so spam remains a considerable problem. In essence, a spam email is a commercial communication trying to entice the receiver to take some positive …