Open Access. Powered by Scholars. Published by Universities.®

Social and Behavioral Sciences Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 2 of 2

Full-Text Articles in Social and Behavioral Sciences

D-Lib Magazine Pioneered Web-Based Scholarly Communication, Michael L. Nelson, Herbert Van De Sompel Jan 2022

D-Lib Magazine Pioneered Web-Based Scholarly Communication, Michael L. Nelson, Herbert Van De Sompel

Computer Science Faculty Publications

The web began with a vision of, as stated by Tim Berners-Lee in 1991, “that much academic information should be freely available to anyone”. For many years, the development of the web and the development of digital libraries and other scholarly communications infrastructure proceeded in tandem. A milestone occurred in July, 1995, when the first issue of D-Lib Magazine was published as an online, HTML-only, open access magazine, serving as the focal point for the then emerging digital library research community. In 2017 it ceased publication, in part due to the maturity of the community it served as well as …


Scholarly Big Data Quality Assessment: A Case Study Of Document Linking And Conflation With S2orc, Jian Wu, Ryan Hiltabrand, Dominik Soós, C. Lee Giles Jan 2022

Scholarly Big Data Quality Assessment: A Case Study Of Document Linking And Conflation With S2orc, Jian Wu, Ryan Hiltabrand, Dominik Soós, C. Lee Giles

Computer Science Faculty Publications

Recently, the Allen Institute for Artificial Intelligence released the Semantic Scholar Open Research Corpus (S2ORC), one of the largest open-access scholarly big datasets with more than 130 million scholarly paper records. S2ORC contains a significant portion of automatically generated metadata. The metadata quality could impact downstream tasks such as citation analysis, citation prediction, and link analysis. In this project, we assess the document linking quality and estimate the document conflation rate for the S2ORC dataset. Using semi-automatically curated ground truth corpora, we estimated that the overall document linking quality is high, with 92.6% of documents correctly linking to six major …