Open Access. Powered by Scholars. Published by Universities.®


Dartmouth College


Full-Text Articles in Physical Sciences and Mathematics

Interpreting Attention-Based Models For Natural Language Processing, Steven J. Signorelli Jr Jun 2021

Dartmouth College Undergraduate Theses

Large pre-trained language models (PLMs) such as BERT and XLNet have revolutionized the field of natural language processing (NLP). Notably, they are pre-trained through unsupervised tasks, so there is a natural curiosity as to what linguistic knowledge these models have learned from only unlabeled data. Fortunately, these models' architectures are based on self-attention mechanisms, which are naturally interpretable. As such, there is a growing body of work that uses attention to gain insight into what linguistic knowledge is possessed by these models. Most attention-focused studies use BERT as their subject, and consequently the field …
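To illustrate why self-attention lends itself to interpretation, here is a minimal sketch of scaled dot-product self-attention (the core operation in BERT-style models, simplified to a single head with toy random weights). The point is that the mechanism produces an explicit attention-weight matrix: each row is a probability distribution over input tokens, which is exactly the quantity these interpretability studies examine.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    Returns both the outputs and the attention-weight matrix that
    attention-based interpretability work inspects.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over keys: row i is a distribution saying how much
    # token i attends to each token j.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                      # 4 toy "tokens", dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
# attn is 4x4 and each row sums to 1, so it can be read directly as
# "how strongly does token i attend to token j" for this layer.
```

This is only a sketch under simplified assumptions (one head, random weights, no masking or multi-layer stacking); in a real PLM such matrices exist per head and per layer, which is what makes probing them for linguistic structure tractable.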