Open Access. Powered by Scholars. Published by Universities.®

Computational Linguistics Commons

Open Access. Powered by Scholars. Published by Universities.®

Dissertations, Theses, and Capstone Projects

Deep learning

Publication Year

Articles 1 - 2 of 2

Full-Text Articles in Computational Linguistics

From Sesame Street To Beyond: Multi-Domain Discourse Relation Classification With Pretrained Bert, Isaac R. Raff Sep 2022

From Sesame Street To Beyond: Multi-Domain Discourse Relation Classification With Pretrained Bert, Isaac R. Raff

Dissertations, Theses, and Capstone Projects

Research efforts in transfer learning have gained massive popularity in recent years. Pretrained language models have demonstrated the most successful results in producing high quality neural networks capable of quality inference after training across domains via transfer learning. This study expands on the domain transfer introduced in \cite{ferracane-etal-2019-news} exploring neural methods for transfer learning of discourse parsing between a news source domain and a medical target domain. \cite{ferracane-etal-2019-news} specifically discuss transfer learning from news articles to PubMed medical journal articles. Experiments in transfer learning in the current work expand to include three domains: Wall Street Journal articles previously annotated with …


Label Imputation For Homograph Disambiguation: Theoretical And Practical Approaches, Jennifer M. Seale Sep 2021

Label Imputation For Homograph Disambiguation: Theoretical And Practical Approaches, Jennifer M. Seale

Dissertations, Theses, and Capstone Projects

This dissertation presents the first implementation of label imputation for the task of homograph disambiguation using 1) transcribed audio, and 2) parallel, or translated, corpora. For label imputation from parallel corpora, a hypothesis of interlingual alignment between homograph pronunciations and text word forms is developed and formalized. Both audio and parallel corpora label imputation techniques are tested empirically in experiments that compare homograph disambiguation model performance using: 1) hand-labeled training data, and 2) hand-labeled training data augmented with label-imputed data. Regularized, multinomial logistic regression and pre-trained ALBERT, BERT, and XLNet language models fine-tuned as token classifiers are developed for homograph …