Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2019

Other

Technological University Dublin

Articles 1 - 3 of 3

Full-Text Articles in Physical Sciences and Mathematics

Contextual Word Embeddings - Trained On English Wikipedia Corpora, Filip Klubicka, Alfredo Maldonado, Abhijit Mahalunkar, John D. Kelleher Jan 2019

Datasets

This archive contains a collection of computational models called word embeddings: vectors that represent words numerically. They were trained on natural-language sentences collected from the English Wikipedia and therefore capture contextual (thematic) knowledge about words rather than taxonomic knowledge.
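As a rough illustration of how embeddings of this kind are typically compared, the sketch below computes the cosine similarity between word vectors. The three-dimensional vectors and the `toy_embeddings` name are invented for illustration only; they are not taken from the archive, whose file format and vector dimensionality are not stated here.

```python
import math

def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors (assumption): real embeddings have far more dimensions.
toy_embeddings = {
    "coffee": [0.9, 0.1, 0.3],
    "cup": [0.8, 0.2, 0.4],
    "planet": [0.1, 0.9, 0.2],
}

# Thematically related words are expected to score higher than unrelated ones.
related = cosine_similarity(toy_embeddings["coffee"], toy_embeddings["cup"])
unrelated = cosine_similarity(toy_embeddings["coffee"], toy_embeddings["planet"])
print(related > unrelated)
```

In contextual embeddings, words that tend to appear in the same sentences would be expected to end up close together under a measure like this one.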


Taxonomic Word Embeddings - Trained On English Wordnet Random Walk Pseudo-Corpora, Filip Klubicka, Alfredo Maldonado, Abhijit Mahalunkar, John D. Kelleher Jan 2019

Datasets

This archive contains a collection of computational models called word embeddings: vectors that represent words numerically. They were trained on pseudo-sentences generated artificially by a random walk over the English WordNet taxonomy and therefore reflect taxonomic knowledge about words rather than contextual knowledge.
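The idea of generating pseudo-sentences from a random walk can be sketched as follows. This is a minimal illustration only: the `toy_taxonomy` graph, the uniform choice of neighbour, the fixed sentence length, and the function name are all assumptions, not the procedure actually used to build the archive.

```python
import random

# Hypothetical miniature taxonomy (assumption): each word maps to its
# direct neighbours (parent and children). WordNet itself is far larger.
toy_taxonomy = {
    "entity": ["animal", "artifact"],
    "animal": ["entity", "dog", "cat"],
    "artifact": ["entity", "cup"],
    "dog": ["animal"],
    "cat": ["animal"],
    "cup": ["artifact"],
}

def random_walk_sentence(graph, start, length, rng):
    """Walk the graph from `start`, choosing a neighbour uniformly at random
    at each step, and return the visited words as a pseudo-sentence."""
    walk = [start]
    while len(walk) < length:
        neighbours = graph[walk[-1]]
        if not neighbours:
            break  # dead end: stop the walk early
        walk.append(rng.choice(neighbours))
    return walk

rng = random.Random(0)  # seeded so the sketch is reproducible
pseudo_corpus = [random_walk_sentence(toy_taxonomy, "dog", 5, rng) for _ in range(3)]
for sentence in pseudo_corpus:
    print(" ".join(sentence))
```

Because consecutive words in each pseudo-sentence are always taxonomic neighbours, a model trained on such text sees co-occurrence patterns that mirror the structure of the taxonomy rather than thematic usage.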


English Wikipedia Corpus Chunks, Filip Klubicka, Alfredo Maldonado, Abhijit Mahalunkar, John D. Kelleher Jan 2019

Datasets

This archive contains a collection of language corpora. These are text files that contain samples of text collected from English Wikipedia.