Full-Text Articles in Cataloging and Metadata

ChatGPT As Metamorphosis Designer For The Future Of Artificial Intelligence (AI): A Conceptual Investigation, Amarjit Kumar Singh (Library Assistant), Dr. Pankaj Mathur (Deputy Librarian) Mar 2023

Library Philosophy and Practice (e-journal)

Abstract

Purpose: The purpose of this research paper is to explore ChatGPT’s potential as an innovative designer tool for the future development of artificial intelligence. Specifically, this conceptual investigation aims to analyze ChatGPT’s capabilities as a tool for designing and developing near-human intelligent systems for future use in the field of Artificial Intelligence (AI). In addition, the researchers analyze the strengths and weaknesses of ChatGPT as a tool and identify possible areas for improvement in its development and implementation. This investigation focused on the various features and functions of ChatGPT that …


Hashes Are Not Suitable To Verify Fixity Of The Public Archived Web, Mohamed Aturban, Martin Klein, Herbert Van De Sompel, Sawood Alam, Michael L. Nelson, Michele C. Weigle Jan 2023

Computer Science Faculty Publications

Web archives, such as the Internet Archive, preserve the web and allow access to prior states of web pages. We implicitly trust their versions of archived pages, but as their role moves from preserving curios of the past to facilitating present day adjudication, we are concerned with verifying the fixity of archived web pages, or mementos, to ensure they have always remained unaltered. A widely used technique in digital preservation to verify the fixity of an archived resource is to periodically compute a cryptographic hash value on a resource and then compare it with a previous hash value. If the …
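The periodic hash comparison described in this abstract can be sketched roughly as follows. This is only an illustration of the generic technique, not the authors' experimental setup: the function names and sample byte strings are hypothetical, SHA-256 is an assumed choice of hash, and the paper's title argues that this approach is not suitable for the public archived web.

```python
import hashlib


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of a byte string as a hex string."""
    return hashlib.sha256(payload).hexdigest()


def fixity_unchanged(payload: bytes, recorded_digest: str) -> bool:
    """Compare a freshly computed digest with a previously recorded one."""
    return sha256_hex(payload) == recorded_digest


# Hypothetical archived content and a digest recorded at capture time.
original = b"<html><body>Archived page</body></html>"
recorded = sha256_hex(original)

# A later replay that returns different bytes fails the comparison,
# even if the difference is a single character.
replayed_altered = b"<html><body>Archived page!</body></html>"

print(fixity_unchanged(original, recorded))
print(fixity_unchanged(replayed_altered, recorded))
```

Because the comparison is byte-exact, any change in the replayed bytes produces a mismatch, whether or not the change is meaningful to a reader.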


Creating Data From Unstructured Text With Context Rule Assisted Machine Learning (CRAML), Stephen Meisenbacher, Peter Norlander Dec 2022

School of Business: Faculty Publications and Other Works

Popular approaches to building data from unstructured text come with limitations, such as scalability, interpretability, replicability, and real-world applicability. These can be overcome with Context Rule Assisted Machine Learning (CRAML), a method and no-code suite of software tools that builds structured, labeled datasets which are accurate and reproducible. CRAML enables domain experts to access uncommon constructs within a document corpus in a low-resource, transparent, and flexible manner. CRAML produces document-level datasets for quantitative research and makes qualitative classification schemes scalable over large volumes of text. We demonstrate that the method is useful for bibliographic analysis, transparent analysis of proprietary data, …


Theory Entity Extraction For Social And Behavioral Sciences Papers Using Distant Supervision, Xin Wei, Lamia Salsabil, Jian Wu Jan 2022

Computer Science Faculty Publications

Theories and models, which are common in scientific papers in almost all domains, usually provide the foundations of theoretical analysis and experiments. Understanding the use of theories and models can shed light on the credibility and reproducibility of research works. Compared with metadata, such as title, author, keywords, etc., theory extraction in scientific literature is rarely explored, especially for social and behavioral science (SBS) domains. One challenge of applying supervised learning methods is the lack of a large number of labeled samples for training. In this paper, we propose an automated framework based on distant supervision that leverages entity mentions …


StreamingHub: Interactive Stream Analysis Workflows, Yasith Jayawardana, Vikas G. Ashok, Sampath Jayarathna Jan 2022

Computer Science Faculty Publications

Reusable data/code and reproducible analyses are foundational to quality research. This aspect, however, is often overlooked when designing interactive stream analysis workflows for time-series data (e.g., eye-tracking data). A mechanism to transmit informative metadata alongside data may allow such workflows to intelligently consume data, propagate metadata to downstream tasks, and thereby auto-generate reusable, reproducible analytic outputs with zero supervision. Moreover, a visual programming interface to design, develop, and execute such workflows may allow rapid prototyping for interdisciplinary research. Capitalizing on these ideas, we propose StreamingHub, a framework to build metadata propagating, interactive stream analysis workflows using visual programming. We conduct …


Automatic Metadata Extraction Incorporating Visual Features From Scanned Electronic Theses And Dissertations, Muntabir Hasan Choudhury, Himarsha R. Jayanetti, Jian Wu, William A. Ingram, Edward A. Fox Jan 2021

Computer Science Faculty Publications

Electronic Theses and Dissertations (ETDs) contain domain knowledge that can be used for many digital library tasks, such as analyzing citation networks and predicting research trends. Automatic metadata extraction is important to build scalable digital library search engines. Most existing methods are designed for born-digital documents, so they often fail to extract metadata from scanned documents such as ETDs. Traditional sequence tagging methods mainly rely on text-based features. In this paper, we propose a conditional random field (CRF) model that combines text-based and visual features. To verify the robustness of our model, we extended an existing corpus and created a …


Large Scale Subject Category Classification Of Scholarly Papers With Deep Attentive Neural Networks, Bharath Kandimalla, Shaurya Rohatgi, Jian Wu, C. Lee Giles Jan 2021

Computer Science Faculty Publications

Subject categories of scholarly papers generally refer to the knowledge domain(s) to which the papers belong, examples being computer science or physics. Subject category classification is a prerequisite for bibliometric studies, organizing scientific publications for domain knowledge extraction, and facilitating faceted searches for digital library search engines. Unfortunately, many academic papers do not have such information as part of their metadata. Most existing methods for solving this task focus on unsupervised learning that often relies on citation networks. However, a complete list of papers citing the current paper may not be readily available. In particular, new papers that have few …


Data, Stats, Go: Navigating The Intersections Of Cataloging, E-Resource, And Web Analytics Reporting, Rachel S. Evans, Wendy Moore, Jessica Pasquale, Andre Davison Jul 2020

Presentations

Do you trudge through gathering statistics at fiscal or calendar year-end? Do you wonder why you track certain things, thinking many seem outdated or irrelevant? Many places seem to keep counting certain statistics because "that's what they've always done." For e-resources, how do you integrate those with physical counts and reconcile the variations (updated e-resources versus re-cataloged physical items)? What about repository downloads and other web traffic? The quantity of stats that libraries track is staggering and keeps growing. This program will encourage attendees to stop and evaluate what and why they're gathering data and help identify possible alternatives to …


SmartCiteCon: Implicit Citation Context Extraction From Academic Literature Using Unsupervised Learning, Chenrui Gao, Haoran Cui, Li Zhang, Jiamin Wang, Wei Lu, Jian Wu Jan 2020

Computer Science Faculty Publications

We introduce SmartCiteCon (SCC), a Java API for extracting both explicit and implicit citation context from academic literature in English. The tool is built on a Support Vector Machine (SVM) model trained on a set of 7,058 manually annotated citation context sentences, curated from 34,000 papers in the ACL Anthology. The model with 19 features achieves F1=85.6%. SCC supports PDF, XML, and JSON files out of the box, provided that they conform to certain schemas. The API supports single document processing and batch processing in parallel. It takes about 12–45 seconds on average depending on the format to process a …


Opening Books And The National Corpus Of Graduate Research, William A. Ingram, Edward A. Fox, Jian Wu Jan 2020

Computer Science Faculty Publications

Virginia Tech University Libraries, in collaboration with Virginia Tech Department of Computer Science and Old Dominion University Department of Computer Science, request $505,214 in grant funding for a 3-year project, the goal of which is to bring computational access to book-length documents, demonstrating this with Electronic Theses and Dissertations (ETDs). The project is motivated by the following library and community needs. (1) Despite huge volumes of book-length documents in digital libraries, there is a lack of models offering effective and efficient computational access to these long documents. (2) Nationwide open access services for ETDs generally function at the metadata level. …


Acknowledgement Entity Recognition In CORD-19 Papers, Jian Wu, Pei Wang, Xin Wei, Sarah Rajtmajer, C. Lee Giles, Christopher Griffin Jan 2020

Computer Science Faculty Publications

Acknowledgements are ubiquitous in scholarly papers. Existing acknowledgement entity recognition methods assume all named entities are acknowledged. Here, we examine the nuances between acknowledged and named entities by analyzing sentence structure. We develop an acknowledgement extraction system, AckExtract, based on open-source text mining software, and evaluate our method using manually labeled data. AckExtract uses the PDF of a scholarly paper as input and outputs acknowledgement entities. Results show an overall performance of F1=0.92. We built a supplementary database by linking CORD-19 papers with acknowledgement entities extracted by AckExtract, including persons and organizations, and find that only up to …


A Heuristic Baseline Method For Metadata Extraction From Scanned Electronic Theses And Dissertations, Muntabir H. Choudhury, Jian Wu, William A. Ingram, Edward A. Fox Jan 2020

Computer Science Faculty Publications

Extracting metadata from scholarly papers is an important text mining problem. Widely used open-source tools such as GROBID are designed for born-digital scholarly papers but often fail for scanned documents, such as Electronic Theses and Dissertations (ETDs). Here we present a preliminary baseline work with a heuristic model to extract metadata from the cover pages of scanned ETDs. The process started with converting scanned pages into images and then text files by applying OCR tools. Then a series of carefully designed regular expressions for each field is applied, capturing patterns for seven metadata fields: titles, authors, years, degrees, academic programs, …
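The regular-expression step described in this abstract might look something like the sketch below. It is a minimal illustration under stated assumptions, not the authors' model: the two patterns, the function name `extract_fields`, and the sample cover-page text are all hypothetical, and only two of the listed fields (year and degree) are attempted.

```python
import re

# Hypothetical patterns; the authors' actual expressions are not given here.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")
DEGREE_PATTERN = re.compile(
    r"\b(Doctor of Philosophy|Master of Science|Master of Arts)\b",
    re.IGNORECASE,
)


def extract_fields(cover_text: str) -> dict:
    """Apply one pattern per field to OCR text from a cover page."""
    year = YEAR_PATTERN.search(cover_text)
    degree = DEGREE_PATTERN.search(cover_text)
    return {
        "year": year.group(0) if year else None,
        "degree": degree.group(0) if degree else None,
    }


# Invented OCR output from a scanned cover page.
sample = """A STUDY OF EXAMPLE SYSTEMS
by
Jane Doe
A dissertation submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy
May 1998"""

fields = extract_fields(sample)
print(fields)
```

Each field gets its own pattern, so a field that fails to match returns `None` without affecting the others. Real OCR output is noisier than this sample, which is one reason such patterns need careful design.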


Collecting Virtual And Augmented Reality In The Twenty-First Century Library, Matthew Hannah, Sarah Huber, Sorin Adam Matei Mar 2019

Matei Interdisciplinary Research Collaboratory

In this paper, we discuss possible pedagogical applications for virtual and augmented reality (VR and AR) within a humanities/social sciences curriculum, articulating a critical need for academic libraries to collect and curate 3D objects. We contend that building infrastructure is critical to keep pace with innovative pedagogies and scholarship. We offer theoretical avenues for libraries to build a repository of 3D object files to be used in VR and AR tools and sketch some anticipated challenges. To build an infrastructure to support VR/AR collections, we have collaborated with the College of Liberal Arts to pilot a program in which Libraries and CLA …


A Survey Of Archival Replay Banners, Sawood Alam, Mat Kelly, Michele C. Weigle, Michael L. Nelson Jan 2018

Computer Science Faculty Publications

We surveyed various archival systems to compare and contrast different techniques used to implement an archival replay banner. We found that inline plain HTML injection is the most common approach, but prone to style conflicts. Iframe-based banners are also very common and while they do not have style conflicts, they suffer from screen real estate wastage and limited design choices. Custom Elements-based banners are promising, but due to being a new web standard, these are not yet widely deployed.


Infographics: A Practical Guide For Librarians, Darren Sweeper Feb 2017

Sprague Library Scholarship and Creative Works

No abstract provided.


Databrarianship: The Academic Data Librarian In Theory And Practice, Darren Sweeper Dec 2016

Sprague Library Scholarship and Creative Works

No abstract provided.


Data Visualizations And Infographics, Darren Sweeper Sep 2016

Sprague Library Scholarship and Creative Works

No abstract provided.


Comparing Institutional Repository Software: Pampering Metadata Uploaders, Craighton Hippenhammer Apr 2016

Faculty Scholarship – Library Science

This article highlights the key concepts of institutional repositories and identifies the strengths of Digital Commons and Wesleyan Holiness Digital Library products. Special attention is given to software structures and features, support systems, and factors that impact quality. Parts of this article were given as a workshop presentation at the Association of Christian Librarians annual national conference at Carson-Newman University, Jefferson City, Tennessee, June 11, 2015.


How Information Science Professionals Add Value In A Scientific Research Center, Chris Eaker, Andrea Thomer, Erica Johns, Kayla Siddell Jan 2013

DataONE Sociocultural and Usability & Assessment Working Groups

In response to the increasing need for a data curation workforce, the Data Curation Education in Research Centers program is educating library and information science students in scientific data curation. During the summer of 2012, the authors worked alongside scientists and data managers at the National Center for Atmospheric Research in Boulder, Colorado, to learn data curation within the context of a research center. Each student was matched with a “Science Mentor” and a “Data Mentor” based on prior work experience and the results of a placement questionnaire completed before the internship. Though NCAR has robust data services, we found …


Linked Data Demystified: Practical Efforts To Transform CONTENTdm Metadata For The Linked Data Cloud, Silvia B. Southwick, Cory K. Lampert Nov 2012

Library Faculty Presentations

The library literature and events like the ALA Annual Conference have been inundated with presentations and articles on linked data. At UNLV Libraries, we understand the importance of linked data in helping to better serve our users. We have designed and initiated a pilot project to apply linked data concepts to the practical task of transforming a sample set of our CONTENTdm digital collections data into future-oriented linked data. This presentation will outline the rationale for beginning work in linked data and detail the phases we will undertake in the proof of concept project. We hope through this research experiment to …


Evaluating And Implementing Web Scale Discovery Services: Part Two, Jason Vaughan, Tamera Hanken Jul 2011

Library Faculty Presentations

Part Four: Quick Tour of the Current Marketplace:

  • "The Big 5"
  • Similarities and differences

Part Five: It's Not All Sliced Bread:

  • Shortcomings of web scale discovery

Part Six: Implementation (pre launch steps):

  • Selecting and preparing implementation staff
  • Preparing and communicating process/decisions with all staff
  • Working with the vendor (roles, expectations, timeline)
  • Workflow changes and implications (technical services)

Part Seven: Specific implementation tasks, issues, and considerations:

  • Record loading and mapping (catalog content)
  • Harvesting and mapping digital/local content
  • Working with central index data (internal & external content)
  • Web integration and customization
  • Assessment and continuous improvement


Evaluating And Implementing Web Scale Discovery Services: Part One, Jason Vaughan, Tamera Hanken Jul 2011

Library Faculty Presentations

Preface: Before Web Scale Discovery

  • A very brief overview

Part 1: What is Web Scale Discovery

  • Content
  • Technology

Part 2: Why is Web Scale Discovery important?

  • What’s the need?
  • How is it different from earlier attempts at broad discovery?

Part 3: A Framework for Evaluating Web Scale Discovery Services

  • What we did at UNLV
  • Other options




SKOS And The Semantic Web: Knowledge Organization, Metadata, And Interoperability, Eric A. Robinson Jan 2010

Other Topics

The Simplified Knowledge Organization System (SKOS) is a Semantic Web framework, based on the Resource Description Framework (RDF) for thesauri, classification schemes and simple ontologies. It allows for machine-actionable description of the structure of these knowledge organization systems (KOS) and provides an excellent tool for addressing interoperability and vocabulary control problems inherent to the rapidly expanding information environment of the Web. This paper discusses the foundations of the SKOS framework and reviews the literature on a variety of SKOS implementations. The limitations of SKOS that have been revealed through its broad application are addressed with brief attention to the proposed …


Task-Based Mood Induction Procedures For The Elicitation Of Natural Emotional Responses., Brian Vaughan, Charlie Cullen, Spyros Kousidis, Yi Wang Jul 2007

Other resources

This paper details experimental procedures designed to elicit real emotional responses from participants within a controlled acoustic environment. The experiments use Mood Induction Procedures (MIPs), specifically MIP 4, to implement a cooperative task using two participants. These cooperative tasks are designed to engender emotional responses of activation and evaluation from the participants, who are situated in separate isolation booths, thus reducing unwanted noise in the signal, preventing the participants from being distracted, and ensuring a cleanly recorded audio signal. The audio is recorded at a professional level of quality (24-bit/192 kHz). The emotional dimensions of each audio recording will be evaluated …


Cataloging Expert Systems: Optimism And Frustrated Reality, William Olmstadt Feb 2000

E-JASL 1999-2009 (Volumes 1-10)

There is little question that computers have profoundly changed how information professionals work. The process of cataloging and classifying library materials was one of the first activities transformed by information technology. The introduction of the MARC format in the 1960s and the creation of national bibliographic utilities in the 1970s had a lasting impact on cataloging. In the 1980s, the affordability of microcomputers made the computer accessible for cataloging, even to small libraries. This trend toward automating library processes with computers parallels a broader societal interest in the use of computers to organize and store information. Following World War II, …