Open Access. Powered by Scholars. Published by Universities.®

Library and Information Science Commons™
Articles 1 - 30 of 592
Full-Text Articles in Library and Information Science
The Library & Generative AI, Nat Gustafson-Sundell, Mark Mccullough
Library Services Publications
A demonstration of several AI tools, including ChatGPT, ChatPDF, Consensus, and more. The focus of the session is on potential student uses of the tools and related library initiatives, so we address the limits of ChatGPT as an information source. Librarians can help students learn how to use these tools responsibly and provide leadership on campus as AI is integrated into assignments.
Aisha: A Custom AI Library Chatbot Using The ChatGPT API, Yrjo Lappalainen, Nikesh Narayanan
All Works
This article focuses on the development of a custom chatbot for Zayed University Library (United Arab Emirates) using Python and the ChatGPT API. The chatbot, named Aisha, was designed to provide quick and efficient reference and support services to students and faculty outside the library's regular operating hours. The article also discusses the benefits of chatbots in academic libraries, and reviews the early literature on ChatGPT's applicability in this field. The article describes the development process, perceived capabilities and limitations of the bot, and plans for further development. This project represents the first fully reported attempt to explore the potential …
FAIR Signposting Profile, Herbert Van De Sompel, Martin Klein, Shawn Jones, Michael L. Nelson, Simeon Warner, Anusuriya Devaraju, Robert Huber, Wilko Steinhoff, Vyacheslav Tykhonov, Luc Boruta, Enno Meijers, Stian Soiland-Reyes, Mark Wilkonson
Computer Science Faculty Publications
[First paragraph] This page details concrete recipes that platforms that host research outputs (e.g. data repositories, institutional repositories, publisher platforms, etc.) can follow to implement Signposting, a lightweight yet powerful approach to increase the FAIRness of scholarly objects.
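As a rough illustration of what such a recipe can look like in practice, the sketch below formats typed links as an HTTP `Link` header value. It assumes the commonly used Signposting relation types `cite-as`, `describedby`, and `item`; the helper `build_link_header` and every URL are hypothetical and are not taken from the profile itself.

```python
def build_link_header(links):
    """Format (target, rel, attrs) triples as one HTTP Link header value."""
    parts = []
    for target, rel, attrs in links:
        # Each link is "<target>" followed by semicolon-separated parameters.
        params = [f'rel="{rel}"'] + [f'{key}="{value}"' for key, value in attrs.items()]
        parts.append(f"<{target}>; " + "; ".join(params))
    return ", ".join(parts)


# Hypothetical landing page of a dataset (all URLs are placeholders).
links = [
    ("https://doi.org/10.1234/example", "cite-as", {}),
    ("https://example.org/meta/1.json", "describedby", {"type": "application/ld+json"}),
    ("https://example.org/files/1.csv", "item", {"type": "text/csv"}),
]
header = build_link_header(links)
print("Link: " + header)
```

A machine client could then read the persistent identifier, the metadata record, and the content file from the response headers without parsing the HTML landing page.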
Predicting The PEBCAK: A Quantitative Analysis Of How Cybersecurity Education, Literacy, And Awareness Affect Individual Preparedness, Annie Goodman
Theses/Capstones/Creative Projects
This essay explores the relationship between individuals' cybersecurity education, literacy, awareness, and preparedness. While cybersecurity is often associated with complex hacking scenarios, the majority of data breaches and cyber-attacks result from individuals inadvertently falling prey to phishing emails and malware. The lack of standardized education and training in cybersecurity, coupled with the rapid expansion of technology diversity, raises concerns about individuals' cybersecurity preparedness. As individuals are the first line of defense and the weakest link in cybersecurity, understanding the influence of education, literacy, and awareness on their adherence to best practices is crucial. This work aims to survey a diverse …
Supporting Account-Based Queries For Archived Instagram Posts, Himarsha R. Jayanetti
Computer Science Theses & Dissertations
Social media has become one of the primary modes of communication in recent times, with popular platforms such as Facebook, Twitter, and Instagram leading the way. Despite its popularity, Instagram has not received as much attention in academic research compared to Facebook and Twitter, and its significant role in contemporary society is often overlooked. Web archives are making efforts to preserve social media content despite the challenges posed by the dynamic nature of these sites. The goal of our research is to facilitate the easy discovery of archived copies, or mementos, of all posts belonging to a specific Instagram account …
ChatGPT As Metamorphosis Designer For The Future Of Artificial Intelligence (AI): A Conceptual Investigation, Amarjit Kumar Singh (Library Assistant), Dr. Pankaj Mathur (Deputy Librarian)
Library Philosophy and Practice (e-journal)
Abstract
Purpose: This research paper explores ChatGPT’s potential as an innovative design tool for the future development of artificial intelligence. Specifically, this conceptual investigation analyzes ChatGPT’s capabilities as a tool for designing and developing near-human intelligent systems for future use in the field of Artificial Intelligence (AI). The paper also analyzes the strengths and weaknesses of ChatGPT as a tool and identifies possible areas for improvement in its development and implementation. This investigation focused on the various features and functions of ChatGPT that …
Understanding U.S. Customers' Intention To Adopt Robo-Advisor Technology, Deborah Wall
Walden Dissertations and Doctoral Studies
Finance and information technology scholars wrote that there is a literature gap on what factors drive investors in Western financial markets to use a Robo-advisor to manage their investments. The purpose of this qualitative, single case study with embedded units is to understand the adoption intentions of retail investors in U.S. markets to use a Robo-advisor instead of a human advisor. A single case study design addressed the literature gap, and qualitative data from seven semi-structured interviews, reflective field notes, and archival data were triangulated to answer the research question. This study was grounded in a theoretical framework that includes …
Analyzing Small Business Strategies To Prevent External Cybersecurity Threats, Dr. Kevin E. Moore
Walden Dissertations and Doctoral Studies
Some small businesses’ cybersecurity analysts lack strategies to prevent their organizations from compromising personally identifiable information (PII) via external cybersecurity threats. Small business leaders are concerned, as they are the most targeted critical infrastructures in the United States and are a vital part of the economic system as data breaches threaten the viability of these organizations. Grounded in routine activity theory, the purpose of this pragmatic qualitative inquiry was to explore strategies small business organizations utilize to prevent external cybersecurity threats. The participants were nine cybersecurity analysts who utilized strategies to defend small businesses from external threats. Data were collected …
Strategies Information Technology Managers Use To Retain Qualified Information Technology Employees, Wayne Arnold Reu
Walden Dissertations and Doctoral Studies
Retaining qualified information technology (IT) personnel can take time and effort, given the high demand for skilled positions. Business leaders are concerned with the high turnover of IT employees because of the cost of recruiting and training personnel and the disruption to organizational processes and performance. Grounded in job characteristics theory, the purpose of this qualitative pragmatic inquiry was to explore IT managers' strategies to retain qualified IT employees in organizations across the southwestern United States. Eight IT leaders participated because of their years of experience implementing strategies to retain qualified IT professionals. Data were collected using semistructured interviews and …
Hashes Are Not Suitable To Verify Fixity Of The Public Archived Web, Mohamed Aturban, Martin Klein, Herbert Van De Sompel, Sawood Alam, Michael L. Nelson, Michele C. Weigle
Computer Science Faculty Publications
Web archives, such as the Internet Archive, preserve the web and allow access to prior states of web pages. We implicitly trust their versions of archived pages, but as their role moves from preserving curios of the past to facilitating present day adjudication, we are concerned with verifying the fixity of archived web pages, or mementos, to ensure they have always remained unaltered. A widely used technique in digital preservation to verify the fixity of an archived resource is to periodically compute a cryptographic hash value on a resource and then compare it with a previous hash value. If the …
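The hash-and-compare technique described in this abstract can be sketched in a few lines. This is a minimal illustration only, assuming SHA-256 as the hash function; the function names and the sample bytes are hypothetical and do not come from the paper.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def fixity_unchanged(current: bytes, recorded_hash: str) -> bool:
    """Compare a freshly computed hash against a previously recorded one."""
    return sha256_hex(current) == recorded_hash


# Hypothetical archived resource, hashed when it was first stored.
original = b"<html><body>archived page</body></html>"
recorded = sha256_hex(original)

# A later check on identical bytes passes; any byte-level change fails it.
replayed_same = original
replayed_changed = b"<html><body>archived page</body><!-- banner --></html>"
print(fixity_unchanged(replayed_same, recorded))
print(fixity_unchanged(replayed_changed, recorded))
```

The comparison is exact at the byte level: any difference at all in the bytes being hashed yields a different digest, whatever the cause of the difference.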
A Study Of The Effect Of Types Of Organizational Culture On Information Security Procedural Countermeasures, Sheri James
CCE Theses and Dissertations
This study examined the impact of specific organizational cultures on information security procedural countermeasures (ISPC). With increasing security incidents and data breaches, organizations acknowledge that people are their greatest asset as well as a vulnerability. Previous research into information security procedural controls has centered on behavioral, cognitive, and social theories; some literature incorporates general notions of organization culture yet there is still an absence in socio-organizational studies dedicated to elucidating how information security policy (ISP) compliance can be augmented by implementing comprehensive security education, training, and awareness (SETA) programs focusing on education, training, and awareness initiatives.
A theoretical model was …
Creating Data From Unstructured Text With Context Rule Assisted Machine Learning (CRAML), Stephen Meisenbacher, Peter Norlander
School of Business: Faculty Publications and Other Works
Popular approaches to building data from unstructured text come with limitations, such as scalability, interpretability, replicability, and real-world applicability. These can be overcome with Context Rule Assisted Machine Learning (CRAML), a method and no-code suite of software tools that builds structured, labeled datasets which are accurate and reproducible. CRAML enables domain experts to access uncommon constructs within a document corpus in a low-resource, transparent, and flexible manner. CRAML produces document-level datasets for quantitative research and makes qualitative classification schemes scalable over large volumes of text. We demonstrate that the method is useful for bibliographic analysis, transparent analysis of proprietary data, …
Data Sharing Through Open Access Data Repositories, Karin Bennedsen
All Things Open
The National Institutes of Health has expanded their data sharing requirements for obtaining funding to now include all awards for research producing scientific data to accelerate “biomedical research discovery, in part, by enabling validation of research results, providing accessibility to high-value datasets, and promoting data reuse for future research studies.” The new policy requiring a Data Management & Sharing Plan (DMSP) for all applications goes into effect January 25th, 2023. A DMSP includes where the data will be stored. This lightning talk will review Open Access Data Repositories. Don’t let the task of trying to find data storage hold you …
Bounded Confidence: How AI Could Exacerbate Social Media’s Homophily Problem, Dylan Weber, Scott Atran, Rich Davis
New England Journal of Public Policy
The advent of the Internet was heralded as a revolutionary development in the democratization of information. It has emerged, however, that online discourse on social media tends to narrow the information landscape of its users. This dynamic is driven by the propensity of the network structure of social media to tend toward homophily; users strongly prefer to interact with content and other users that are similar to them. We review the considerable evidence for the ubiquity of homophily in social media, discuss some possible mechanisms for this phenomenon, and present some observed and hypothesized effects. We also discuss how the …
Why We Should Remember The Soviet Information Age?, Ksenia Tatarchenko
Research Collection College of Integrative Studies
How to navigate the rapidly changing digital geopolitics of the world today? How do we make sense of digital transformation and its many social, political, cultural, and environmental implications at different locations around the world?
A Qualitative Look Into Repair Practices, Jumana Labib
Undergraduate Student Research Internships Conference
This research poster is based on a working research paper which moves beyond the traditional scope of repair and examines the Right to Repair movement through a smaller, more personal lens by detailing how the six categorical impediments named by Dr. Alissa Centivany (design, law, economic/business strategy, material asymmetry, informational asymmetry, and social impediments) have continuously inhibited repair and affected repair practices, with larger implications (environmental, economic, social, etc.) for ourselves, our objects, and our world. The poster builds upon my research from last year (see "The Right to Repair: (Re)building a better future"), this time pulling …
Convivial Making: Power In Public Library Creative Places, Shannon Crawford Barniskis
Theses and Dissertations
In 2011, public libraries began to provide access to collaborative creative places, frequently called “makerspaces.” The professional literature portrays these as beneficial for communities and individuals through their support of creativity, innovation, learning, and access to high-tech tools such as 3D printers. As in longstanding “library faith” narratives, which pin the library’s existence to widely held values, makerspace rhetoric describes access to tools and skills as instrumental for a stronger economy or democracy, social justice, and/or individual happiness. The rhetoric generally frames these places as empowering. Yet the concept of power has been neither well-theorized within the library makerspace literature …
Building Capacity For Data-Driven Scholarship, Jamie Rogers
Works of the FIU Libraries
This talk provides an overview of "dLOC as Data: A Thematic Approach to Caribbean Newspapers," an initiative developed to increase access to digitized Caribbean newspaper text for bulk download, facilitating computational analysis. Capacity building for future research in Caribbean Studies being a crucial aspect of this initiative, a thematic toolkit was developed to facilitate use of the project data as well as provide replicable processes. The toolkit includes sample text analysis projects, as well as tutorials and detailed project documentation. While the toolkit focuses on the history of hurricanes and tropical cyclones of the region, the methodologies and tools used …
The Effect Of Internet On Students Studies: A Review, Mark Quaye Affum
Library Philosophy and Practice (e-journal)
This paper is a literature review on the effects of internet use on students’ academic performance. The main objective of this research is to assess the factors that affect students’ use of the internet. The paper additionally aims to find out the various activities for which students use the internet and to assess the various technologies students use to access it. Several articles were reviewed by the researcher, all on factors influencing students’ use of the internet. Out of the twelve articles, nine of them looked at the functions of the internet in students’ activities while fourteen articles …
Modernization Of Legacy Information Technology Systems, Rabie Khabouze
Walden Dissertations and Doctoral Studies
Large enterprises spend a large portion of their Information Technology (IT) budget on maintaining their legacy systems. Legacy systems modernization projects are a catalyst for IT architects to save cost, provide new and efficient systems that increase profitability, and create value for their organization. Grounded in sociotechnical systems theory, the purpose of this qualitative multiple case study was to explore strategies IT architects use to modernize their legacy systems. The population included IT architects in large enterprises involved in legacy systems modernization projects, one in healthcare, and one in the financial services industry in the San Antonio-New Braunfels, Texas metropolitan …
Guide To The Dr. L.S. Dederick Papers, 1908-1956, Undated, Orson Kingsley, Patrick Koetsch
Archives & Special Collections Finding Aids
Louis Serle (L.S.) Dederick was born in Chicago in 1883. He received his Ph.D. in Mathematics from Harvard University in 1909. From 1909 – 1917 he was a professor at Princeton University. From 1917 – 1924 he was professor at the U.S. Naval Academy in Annapolis, Maryland. In 1926 Dederick began working for the U.S. Army, Ordnance. During his time there he was the Associate Director of the Ballistic Research Laboratory at the Aberdeen Proving Grounds in Aberdeen, Maryland where he focused on ballistics research.
While Dederick worked as a mathematician at the Aberdeen Proving Grounds, he was involved with …
Law Library Blog (January 2022): Legal Beagle's Blog Archive, Roger Williams University School Of Law
Law Library Newsletters/Blog
No abstract provided.
Digital Searching: A Grounded Theory Study On The Modern Search Experience, Nicolas Armando Parés
Electronic Theses and Dissertations
This grounded theory study explores US adults' modern information search process as they pursue information through digital search user interfaces and tools. To study the current search process, a systematic grounded theory methodology and two data collection methods, a think-aloud protocol and semi-structured interviews, are used to develop the theory. The emerging theory addresses two tightly connected research questions: “What is the process by which humans search and discover information?” and “What is the process by which search and discovery interfaces and tools support the modern search process?”
The study collects participant data from US adults who have …
Examination Of Strategies To Implementing Chip-And-Personal Identification Number Credit Card Authentication Infrastructures, Neville Arthur Gallimore
Walden Dissertations and Doctoral Studies
Chip-and-Personal Identification Number (PIN) technology is seen as a game changer in many e-commerce industries and a transformational technology in the 21st century. However, security concerns have made chip-and-PIN adoption relatively slow. Massive unauthorized card payment transactions in the United States (U.S.) cost victims an estimate totaling billions of dollars. Information Technology (IT) managers are concerned with credit card fraud's financial loss and liability cost. Grounded in Rogers’s diffusion of innovation theory, the purpose of this qualitative pragmatic study was to explore strategies used by IT managers to transition their e-commerce organizations to chip-and-PIN credit card authentication infrastructures. The participants …
Exploring Algorithmic Literacy For College Students: An Educator’s Roadmap, Susan Gardner Archambault
LMU/LLS Theses and Dissertations
Research shows that college students are largely unaware of the impact of algorithms on their everyday lives. Also, most university students are not being taught about algorithms as part of the regular curriculum. This exploratory, qualitative study aimed to explore subject-matter experts’ insights and perceptions of the knowledge components, coping behaviors, and pedagogical considerations to aid faculty in teaching algorithmic literacy to college students. Eleven individual, semi-structured interviews and one focus group were conducted with scholars and teachers of critical algorithm studies and related fields. Findings suggested three sets of knowledge components that would contribute to students’ algorithmic literacy: general …
StreamingHub: Interactive Stream Analysis Workflows, Yasith Jayawardana, Vikas G. Ashok, Sampath Jayarathna
Computer Science Faculty Publications
Reusable data/code and reproducible analyses are foundational to quality research. This aspect, however, is often overlooked when designing interactive stream analysis workflows for time-series data (e.g., eye-tracking data). A mechanism to transmit informative metadata alongside data may allow such workflows to intelligently consume data, propagate metadata to downstream tasks, and thereby auto-generate reusable, reproducible analytic outputs with zero supervision. Moreover, a visual programming interface to design, develop, and execute such workflows may allow rapid prototyping for interdisciplinary research. Capitalizing on these ideas, we propose StreamingHub, a framework to build metadata propagating, interactive stream analysis workflows using visual programming. We conduct …
D-Lib Magazine Pioneered Web-Based Scholarly Communication, Michael L. Nelson, Herbert Van De Sompel
Computer Science Faculty Publications
The web began with a vision of, as stated by Tim Berners-Lee in 1991, “that much academic information should be freely available to anyone”. For many years, the development of the web and the development of digital libraries and other scholarly communications infrastructure proceeded in tandem. A milestone occurred in July, 1995, when the first issue of D-Lib Magazine was published as an online, HTML-only, open access magazine, serving as the focal point for the then emerging digital library research community. In 2017 it ceased publication, in part due to the maturity of the community it served as well as …
Scholarly Big Data Quality Assessment: A Case Study Of Document Linking And Conflation With S2ORC, Jian Wu, Ryan Hiltabrand, Dominik Soós, C. Lee Giles
Computer Science Faculty Publications
Recently, the Allen Institute for Artificial Intelligence released the Semantic Scholar Open Research Corpus (S2ORC), one of the largest open-access scholarly big datasets with more than 130 million scholarly paper records. S2ORC contains a significant portion of automatically generated metadata. The metadata quality could impact downstream tasks such as citation analysis, citation prediction, and link analysis. In this project, we assess the document linking quality and estimate the document conflation rate for the S2ORC dataset. Using semi-automatically curated ground truth corpora, we estimated that the overall document linking quality is high, with 92.6% of documents correctly linking to six major …