Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Dataset 1: Mobile Text Dataset, Keith Vertanen, Per Ola Kristensson Jan 2019

Mobile Text Dataset and Language Models

This zip file contains the sentences mined from public web forums and blogs. Additional details about the dataset:

  • The data is split into training, development, and test sets based on the original domain name the text was mined from.
  • The sent_*.txt files are tab-delimited, with one sentence per line, each parsed from a particular post. Each line also contains the device name, forum software, device form factor (tablet or phone), and device input (touch or touch+key) associated with the post the sentence was obtained from.
  • The set's subdirectory contains the groupings used in Section 2.
  • 64K word list (used in the paper), 5K and …
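
As an illustration of the tab-delimited layout described above, here is a minimal Python sketch for reading a sent_*.txt file. The description lists the fields but not their order, so the column order in ASSUMED_COLUMNS is an assumption, and the field names and the helpers read_sentences and count_by are hypothetical names chosen for this example. Check a few lines of the real files and adjust the tuple before relying on it.

```python
import csv
from collections import Counter
from typing import Dict, Iterable, Iterator, Sequence

# ASSUMPTION: the dataset description names these fields but does not give
# their order. Inspect the actual files and reorder this tuple as needed.
ASSUMED_COLUMNS = (
    "sentence",
    "device_name",
    "forum_software",
    "form_factor",   # "tablet" or "phone"
    "input_type",    # "touch" or "touch+key"
)


def read_sentences(
    lines: Iterable[str],
    columns: Sequence[str] = ASSUMED_COLUMNS,
) -> Iterator[Dict[str, str]]:
    """Yield one dict per well-formed tab-delimited line."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(columns):
            continue  # skip blank lines or lines that do not match the assumed layout
        yield dict(zip(columns, row))


def count_by(records: Iterable[Dict[str, str]], field: str) -> Counter:
    """Count how many records carry each value of the given field."""
    return Counter(record[field] for record in records)


# Invented sample lines for demonstration only; not taken from the dataset.
sample = [
    "hello there\tDeviceA\tForumX\tphone\ttouch\n",
    "on my way\tDeviceB\tForumX\ttablet\ttouch+key\n",
]
records = list(read_sentences(sample))
print(count_by(records, "form_factor"))
```

To read from disk, pass an open file object instead of a list, for example `open(path, encoding="utf-8", newline="")`, where `path` points to one of the sent_*.txt files; the UTF-8 encoding is also an assumption.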