Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Physical Sciences and Mathematics

University of South Carolina

Faculty Publications

2018

Information gain

Full-Text Articles in Engineering

Patent Keyword Extraction Algorithm Based On Distributed Representation For Patent Classification, Jie Hu, Shaobo Li, Yong Yao, Liya Yu, Guanci Yang, Jianjun Hu, Feb 2018

Many text mining tasks, such as text retrieval, text summarization, and text comparison, depend on extracting representative keywords from the main text. Most existing keyword extraction algorithms rely on a discrete, bag-of-words representation of the text. In this paper, we propose a patent keyword extraction algorithm (PKEA) based on the distributed Skip-gram model for patent classification. We also develop a set of quantitative performance measures for evaluating keyword extraction, based on information gain and on cross-validated Support Vector Machine (SVM) classification, which are valuable when human-annotated keywords are not available. We used a standard …
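To make the information-gain measure mentioned in the abstract concrete, here is a minimal sketch of how the information gain of a single candidate keyword with respect to class labels is commonly computed. This is the textbook definition, not necessarily the exact formulation used in the paper; the function names, the set-of-words document representation, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, keyword):
    """Entropy of the classes minus their entropy once we know whether
    each document contains the keyword (hypothetical helper)."""
    with_kw = [y for d, y in zip(docs, labels) if keyword in d]
    without_kw = [y for d, y in zip(docs, labels) if keyword not in d]
    n = len(labels)
    conditional = (len(with_kw) / n) * entropy(with_kw) \
        + (len(without_kw) / n) * entropy(without_kw)
    return entropy(labels) - conditional


# Toy data (invented): each document is a set of words, with one class label.
docs = [
    {"engine", "piston"},
    {"engine", "valve"},
    {"antenna", "signal"},
    {"signal", "valve"},
]
labels = ["mechanical", "mechanical", "electrical", "electrical"]

ig_engine = information_gain(docs, labels, "engine")  # separates the classes
ig_valve = information_gain(docs, labels, "valve")    # appears in both classes
```

In this toy set, "engine" appears only in one class, so it scores higher than "valve", which is split evenly across both. A keyword set could be scored this way without human-annotated keywords, which is the situation the abstract describes.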