Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 2 of 2

Full-Text Articles in Physical Sciences and Mathematics

Dataset For Gendered Language, Shweta Soundararajan Jan 2023

Datasets

Gendered language is the use of words that denote an individual’s gender. This can be explicit, where the gender is evident in the actual word used (e.g., mother, she, man), but it can also be implicit, where social roles or behaviours signal an individual’s gender: for example, expectations that women display communal traits (e.g., affectionate, caring, gentle) and men display agentic traits (e.g., assertive, competitive, decisive). The use of gendered language in NLP systems can perpetuate gender stereotypes and bias. This paper proposes an approach to generating gendered language datasets using ChatGPT which will provide data for data-driven …


Determining Child Sexual Abuse Posts Based On Artificial Intelligence, Susan Mckeever, Christina Thorpe, Vuong Ngo Jan 2023

Conference papers

The volume of child sexual abuse materials (CSAM) created and shared daily on both surface web platforms such as Twitter and dark web forums is very high. Given this volume, it is not viable for human experts to intercept or identify CSAM manually. However, automatically detecting and analysing child sexual abuse language in online text is challenging and time-intensive, mostly due to the variety of data formats and privacy constraints of hosting platforms. We propose a CSAM detection intelligence algorithm based on natural language processing and machine learning techniques. Our CSAM detection model is not only used to remove CSAM on …