Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Open Access. Powered by Scholars. Published by Universities.®

Computational Engineering

Articles

2019

Taxonomic enrichment

Articles 1 - 1 of 1

Full-Text Articles in Engineering

Size Matters: The Impact Of Training Size In Taxonomically-Enriched Word Embeddings, Alfredo Maldonado, Filip Klubicka, John D. Kelleher Oct 2019

Size Matters: The Impact Of Training Size In Taxonomically-Enriched Word Embeddings, Alfredo Maldonado, Filip Klubicka, John D. Kelleher

Articles

Word embeddings trained on natural corpora (e.g., newspaper collections, Wikipedia or the Web) excel in capturing thematic similarity (“topical relatedness”) on word pairs such as ‘coffee’ and ‘cup’ or ’bus’ and ‘road’. However, they are less successful on pairs showing taxonomic similarity, like ‘cup’ and ‘mug’ (near synonyms) or ‘bus’ and ‘train’ (types of public transport). Moreover, purely taxonomy-based embeddings (e.g. those trained on a random-walk of WordNet’s structure) outperform natural-corpus embeddings in taxonomic similarity but underperform them in thematic similarity. Previous work suggests that performance gains in both types of similarity can be achieved by enriching natural-corpus embeddings with …