Full-Text Articles in Computational Linguistics
From Sesame Street To Beyond: Multi-Domain Discourse Relation Classification With Pretrained Bert, Isaac R. Raff
Dissertations, Theses, and Capstone Projects
Transfer learning has attracted massive research interest in recent years, with pretrained language models producing the strongest results: networks that remain capable of high-quality inference after cross-domain training. This study expands on the domain transfer introduced in Ferracane et al. (2019), exploring neural methods for transferring discourse parsing from a news source domain to a medical target domain. Ferracane et al. (2019) specifically discuss transfer learning from news articles to PubMed medical journal articles. The transfer-learning experiments in the current work expand to three domains: Wall Street Journal articles previously annotated with …
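The domain-transfer recipe the abstract describes can be sketched in miniature: train a classifier on plentiful source-domain data, then continue training (warm-start) on a small target-domain set. This is a toy sketch only; the feature vectors below are hypothetical stand-ins for the pretrained BERT representations the thesis actually uses, and the binary label is a placeholder for discourse-relation classes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 8-dim features standing in for BERT sentence-pair encodings.
def make_domain(n, shift):
    X = rng.normal(size=(n, 8)) + shift               # domain-specific feature shift
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)   # toy discourse-relation label
    return X, y

def train_logreg(X, y, w=None, lr=0.1, epochs=200):
    """Logistic regression by gradient descent; passing `w` warm-starts from
    source-domain weights, mimicking transfer via continued fine-tuning."""
    Xb = np.hstack([X, np.ones((len(X), 1))])         # append bias column
    if w is None:
        w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return float(np.mean(((Xb @ w) > 0) == y))

# Source domain (e.g. news) is plentiful; the target (e.g. medical) is small.
Xs, ys = make_domain(2000, 0.0)
Xt, yt = make_domain(40, 0.5)

w_src = train_logreg(Xs, ys)                    # pretrain on the source domain
w_xfer = train_logreg(Xt, yt, w=w_src.copy())   # fine-tune on the target domain
```

The warm-start on `w_src` is the essential move: target-domain training begins from source-domain knowledge instead of from scratch, which is what makes the tiny target set usable at all.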
Label Imputation For Homograph Disambiguation: Theoretical And Practical Approaches, Jennifer M. Seale
Dissertations, Theses, and Capstone Projects
This dissertation presents the first implementation of label imputation for the task of homograph disambiguation using 1) transcribed audio and 2) parallel (translated) corpora. For label imputation from parallel corpora, a hypothesis of interlingual alignment between homograph pronunciations and text word forms is developed and formalized. Both the audio and the parallel-corpora label imputation techniques are tested empirically in experiments comparing homograph disambiguation model performance using: 1) hand-labeled training data, and 2) hand-labeled training data augmented with label-imputed data. Regularized multinomial logistic regression and pretrained ALBERT, BERT, and XLNet language models fine-tuned as token classifiers are developed for homograph …
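The core augmentation loop described above can be sketched as: train on the hand-labeled set, impute labels for an unlabeled pool, then retrain on the union. The sketch below uses a nearest-centroid classifier over hypothetical 2-dim context features purely for illustration; the thesis uses richer text features with regularized logistic regression and fine-tuned transformer token classifiers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy homograph: one written form, two pronunciations (labels 0 and 1),
# each associated with a cluster of hypothetical context features.
def sample(n, label):
    center = np.array([0.0, 0.0]) if label == 0 else np.array([3.0, 3.0])
    return rng.normal(size=(n, 2)) + center

# Small hand-labeled set plus a larger unlabeled pool (e.g. drawn from
# transcribed audio or aligned parallel text).
X_hand = np.vstack([sample(10, 0), sample(10, 1)])
y_hand = np.array([0] * 10 + [1] * 10)
X_pool = np.vstack([sample(200, 0), sample(200, 1)])

def fit_centroids(X, y):
    """One centroid per pronunciation class."""
    return np.stack([X[y == k].mean(axis=0) for k in (0, 1)])

def predict(centroids, X):
    """Assign each example to its nearest class centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

# 1) Train on hand labels, 2) impute labels for the unlabeled pool,
# 3) retrain on the augmented (hand-labeled + label-imputed) data.
centroids = fit_centroids(X_hand, y_hand)
y_imputed = predict(centroids, X_pool)
X_aug = np.vstack([X_hand, X_pool])
y_aug = np.concatenate([y_hand, y_imputed])
centroids_aug = fit_centroids(X_aug, y_aug)
```

The experiments the abstract describes then compare a model trained only on `X_hand` against one trained on the augmented `X_aug`, isolating the contribution of the imputed labels.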