Full-Text Articles in Engineering
Language Models For Rare Disease Information Extraction: Empirical Insights And Model Comparisons, Shashank Gupta
Theses and Dissertations--Computer Science
End-to-end relation extraction (E2ERE) is a crucial task in natural language processing (NLP) that involves identifying and classifying semantic relationships between entities in text. This thesis compares three E2ERE paradigms in biomedicine, focusing on rare diseases with discontinuous and nested entities. We evaluate named entity recognition (NER) to relation extraction (RE) pipelines, sequence-to-sequence models, and generative pre-trained transformer (GPT) models on the RareDis information extraction dataset. Our findings indicate that pipeline models are the most effective, followed closely by sequence-to-sequence models. GPT models, despite having eight times as many parameters, perform worse than sequence-to-sequence models and …
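The pipeline paradigm the abstract describes runs NER first and then RE over the recognized entities. A minimal sketch of that two-stage shape, using a toy dictionary lookup for NER and a trigger-word rule for RE (real systems, and the models compared in the thesis, use trained neural components; the lexicon, labels, and relation name below are purely illustrative):

```python
# Toy sketch of a pipeline-style E2ERE system: a dictionary-based NER step
# followed by a pattern-based RE step over the recognized entity pairs.
# All entity labels and the "produces" relation are hypothetical examples.

ENTITY_LEXICON = {
    "cystic fibrosis": "RAREDISEASE",
    "chronic cough": "SIGN",
}

def ner(text):
    """Stage 1: return (surface_text, label, start_offset) per lexicon match."""
    found = []
    lower = text.lower()
    for surface, label in ENTITY_LEXICON.items():
        idx = lower.find(surface)
        if idx != -1:
            found.append((text[idx:idx + len(surface)], label, idx))
    return sorted(found, key=lambda e: e[2])

def relation_extraction(text, entities):
    """Stage 2: link a disease to a sign when a trigger word sits between them."""
    relations = []
    for i, (e1, l1, s1) in enumerate(entities):
        for e2, l2, s2 in entities[i + 1:]:
            between = text[s1 + len(e1):s2].lower()
            if l1 == "RAREDISEASE" and l2 == "SIGN" and "causes" in between:
                relations.append((e1, "produces", e2))
    return relations

sentence = "Cystic fibrosis often causes chronic cough."
entities = ner(sentence)
relations = relation_extraction(sentence, entities)
print(relations)  # [('Cystic fibrosis', 'produces', 'chronic cough')]
```

The design point the pipeline paradigm illustrates: errors in stage 1 propagate to stage 2, which is one reason the thesis contrasts it with sequence-to-sequence and GPT approaches that produce entities and relations jointly.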
Quantification Of Various Types Of Biases In Large Language Models, Sudhashree Sayenju
Doctor of Data Science and Analytics Dissertations
Natural Language Processing (NLP) systems are found everywhere on the internet, from search engines and language translation to more advanced systems such as voice assistants and customer service. Since humans are always on the receiving end of NLP technologies, it is important to analyze whether the Large Language Models (LLMs) in use are biased and therefore unfair. The majority of research on NLP bias has focused on societal stereotype biases embedded in LLMs. Our research, however, addresses all types of biases present in LLMs, namely model class level bias, stereotype bias, and domain bias. Model class …
An Empirical Study Of Semantic Similarity In Wordnet And Word2vec, Abram Handler
University of New Orleans Theses and Dissertations
This thesis performs an empirical analysis of Word2Vec by comparing its output to WordNet, a well-known, human-curated lexical database. It finds that Word2Vec tends to uncover more of certain types of semantic relations than others, returning more hypernyms, synonyms, and hyponyms than holonyms. It also shows the probability that neighbors separated by a given cosine distance in Word2Vec are semantically related in WordNet. This result both adds to our understanding of the still poorly understood Word2Vec and helps to benchmark new semantic tools built from word vectors.
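The cosine distance the abstract relates to WordNet membership is computed from the angle between two word vectors. A minimal sketch of the measure, using toy 3-dimensional vectors (real Word2Vec embeddings are typically 100-300 dimensions, and the vector values here are made up for illustration):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v (1 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical low-dimensional "word vectors" for illustration only.
vec_dog = [0.9, 0.1, 0.3]
vec_puppy = [0.8, 0.2, 0.4]
vec_car = [0.1, 0.9, 0.2]

# Related word pairs should score higher than unrelated ones; the thesis
# measures how often such high-scoring neighbors are linked in WordNet.
print(cosine_similarity(vec_dog, vec_puppy))
print(cosine_similarity(vec_dog, vec_car))
```

Cosine distance is then `1 - cosine_similarity`, so small distances correspond to near-parallel vectors, the neighborhoods whose WordNet relatedness the thesis quantifies.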