Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 15 of 15

Full-Text Articles in Physical Sciences and Mathematics

Migrating 120,000 Legacy Publications From Several Systems Into A Current Research Information System Using Advanced Data Wrangling Techniques, Yrjö Lappalainen, Matti Lassila, Tanja Heikkilä, Jani Nieminen, Tapani Lehtilä Nov 2023

All Works

This article describes a complex CRIS (current research information system) implementation project involving the migration of around 120,000 legacy publication records from three different systems. The project, undertaken by Tampere University, encountered several challenges in data diversity, data quality, and resource allocation. To handle the extensive and heterogeneous dataset, innovative approaches such as machine learning techniques and various data wrangling tools were used to process data, correct errors, and merge information from different sources. Despite significant delays and unforeseen obstacles, the project was ultimately successful in achieving its goals. The project served as a valuable learning experience, highlighting the importance …


Creating Data From Unstructured Text With Context Rule Assisted Machine Learning (CRAML), Stephen Meisenbacher, Peter Norlander Dec 2022

School of Business: Faculty Publications and Other Works

Popular approaches to building data from unstructured text come with limitations, such as scalability, interpretability, replicability, and real-world applicability. These can be overcome with Context Rule Assisted Machine Learning (CRAML), a method and no-code suite of software tools that builds structured, labeled datasets which are accurate and reproducible. CRAML enables domain experts to access uncommon constructs within a document corpus in a low-resource, transparent, and flexible manner. CRAML produces document-level datasets for quantitative research and makes qualitative classification schemes scalable over large volumes of text. We demonstrate that the method is useful for bibliographic analysis, transparent analysis of proprietary data, …


Application Of Artificial Intelligence And Machine Learning In Libraries: A Systematic Review, Rajesh Kumar Das, Mohammad Sharif Ul Islam Aug 2021

Library Philosophy and Practice (e-journal)

As the concept and implementation of cutting-edge technologies like artificial intelligence and machine learning have become relevant, academics, researchers, and information professionals have become involved in research in this area. The objective of this systematic literature review is to provide a synthesis of empirical studies exploring the application of artificial intelligence and machine learning in libraries. To achieve the objectives of the study, a systematic literature review was conducted based on the original guidelines proposed by Kitchenham et al. (2009). Data was collected from the Web of Science, Scopus, LISA, and LISTA databases. Following the rigorous, established selection process, a total of thirty-two articles were …


Literature Review: How U.S. Government Documents Are Addressing The Increasing National Security Implications Of Artificial Intelligence, Bert Chapman Jun 2020

Libraries Faculty and Staff Scholarship and Research

This article emphasizes the increasing importance of artificial intelligence (AI) in military and national security policy making. It seeks to inform interested individuals about the proliferation of publicly accessible U.S. government and military literature on this multifaceted topic. An additional objective of this endeavor is encouraging greater public awareness of and participation in the emerging public policy debate on AI's moral and national security implications.


Harnessing Artificial Intelligence Capabilities To Improve Cybersecurity, Sherali Zeadally, Erwin Adi, Zubair Baig, Imran A. Khan Jan 2020

Information Science Faculty Publications

Cybersecurity is a fast-evolving discipline that has been in the news throughout the last decade, as the number of threats rises and cybercriminals constantly endeavor to stay a step ahead of law enforcement. Over the years, although the original motives for carrying out cyberattacks largely remain unchanged, cybercriminals have become increasingly sophisticated with their techniques. Traditional cybersecurity solutions are becoming inadequate at detecting and mitigating emerging cyberattacks. Advances in cryptographic and Artificial Intelligence (AI) techniques (in particular, machine learning and deep learning) show promise in enabling cybersecurity experts to counter the ever-evolving threat posed by adversaries. Here, we explore AI's …


Digital Libraries, Intelligent Data Analytics, And Augmented Description: A Demonstration Project, Elizabeth Lorang, Leen-Kiat Soh, Yi Liu, Chulwoo Pack Jan 2020

UNL Libraries: Faculty Publications

From July 16 to November 8, 2019, the Aida digital libraries research team at the University of Nebraska-Lincoln collaborated with the Library of Congress on “Digital Libraries, Intelligent Data Analytics, and Augmented Description: A Demonstration Project.” This demonstration project sought to (1) develop and investigate the viability and feasibility of textual and image-based data analytics approaches to support and facilitate discovery; (2) understand technical tools and requirements for the Library of Congress to improve access and discovery of its digital collections; and (3) enable the Library of Congress to plan for future possibilities. In pursuit of these goals, we focused our …


Final Presentation To The Library Of Congress On Digital Libraries, Intelligent Data Analytics, And Augmented Description, Elizabeth Lorang, Leen-Kiat Soh, Yi Liu, Chulwoo Pack Jan 2020

University of Nebraska-Lincoln Libraries: Conference Presentations and Speeches

This presentation to Library of Congress staff, delivered onsite on January 10, 2020, presents a tour through the demonstration project pursued by the Aida digital libraries research team with the Library of Congress in 2019-2020. In addition to providing an overview and analysis of the specific machine learning projects scoped and explored, this presentation includes a number of high-level take-aways and recommendations designed to influence and inform the Library of Congress's machine learning efforts going forward.


Virtual Wrap-Up Presentation: Digital Libraries, Intelligent Data Analytics, And Augmented Description, Elizabeth Lorang, Leen-Kiat Soh, Yi Liu, Chulwoo Pack Nov 2019

CSE Conference and Workshop Papers

Includes framing, overview, and discussion of the explorations pursued as part of the Digital Libraries, Intelligent Data Analytics, and Augmented Description demonstration project, pursued by members of the Aida digital libraries research team at the University of Nebraska-Lincoln through a research services contract with the Library of Congress. This presentation covered: Aida research team and background for the demonstration project; broad outlines of “Digital Libraries, Intelligent Data Analytics, and Augmented Description”; what changed for us as a research team over the collaboration and why; deliverables of our work; thoughts toward “What next”; and deep-dives into the explorations. The machine learning …


What Do You Mean? Research In The Age Of Machines, Arthur J. Boston Nov 2019

Faculty & Staff Research and Creative Activity

“What Do You Mean?” was an undeniable bop of its era in which Justin Bieber explores the ambiguities of romantic communication. (I pinky promise this will soon make sense for scholarly communication librarians interested in artificial intelligence [AI].) When the single hit airwaves in 2015, there was a meta-debate over what Bieber meant to add to public discourse with lyrics like “What do you mean? Oh, oh, when you nod your head yes, but you wanna say no.” It is unlikely Bieber had consent culture in mind, but the failure of his songwriting team to take into account that some …


Document Images And Machine Learning: A Collaboratory Between The Library Of Congress And The Image Analysis For Archival Discovery (Aida) Lab At The University Of Nebraska, Lincoln, NE, Yi Liu, Chulwoo Pack, Leen-Kiat Soh, Elizabeth Lorang Aug 2019

CSE Conference and Workshop Papers

This presentation summarized and presented preliminary results from the first weeks of work conducted by the Aida research team in response to Library of Congress funding notice ID 030ADV19Q0274, “The Library of Congress – Pre-processing Pilot.” It includes overviews of projects on historic document segmentation, document classification, document quality assessment, figure and graph extraction from historic documents, text-line extraction from figures, subject and objective quality assessments, and digitization type differentiation.


Improved Evolutionary Support Vector Machine Classifier For Coronary Artery Heart Disease Prediction Among Diabetic Patients, Narasimhan B, Malathi A Dr Apr 2019

Library Philosophy and Practice (e-journal)

Soft computing paves the way for many applications, including medical informatics. Decision support systems that aid medical practitioners in diagnosing diseases have gained major attention. Diabetes mellitus is a hereditary disease that might result in major heart disease. This research work aims to propose a soft computing mechanism named the Improved Evolutionary Support Vector Machine (IESVM) classifier for coronary artery heart disease (CAHD) risk prediction among diabetes patients. An attribute selection mechanism is built into the classifier in order to reduce the misclassification error rate of the conventional support vector machine classifier. A radial basis kernel function is employed in IESVM. The IESVM classifier is evaluated through …


The New Legal Landscape For Text Mining And Machine Learning, Matthew Sag Jan 2019

Faculty Articles

Now that the dust has settled on the Authors Guild cases, this Article takes stock of the legal context for TDM research in the United States. This reappraisal begins in Part I with an assessment of exactly what the Authors Guild cases did and did not establish with respect to the fair use status of text mining. Those cases held unambiguously that reproducing copyrighted works as one step in the process of knowledge discovery through text data mining was transformative, and thus ultimately a fair use of those works. Part I explains why those rulings followed inexorably from copyright's most …


Work-In-Progress Reports Submitted To The Library Of Congress As Part Of Digital Libraries, Intelligent Data Analytics, And Augmented Description, Chulwoo Pack, Yi Liu, Leen-Kiat Soh, Elizabeth Lorang Jan 2019

CSE Technical Reports

This document includes work-in-progress reports submitted to the Library of Congress as part of the Aida digital libraries research team's work on Digital Libraries, Intelligent Data Analytics, and Augmented Description: A Demonstration Project. These work-in-progress reports provide a snapshot glimpse, as well as underlying rationale and decision-making, at various points in the development of the project and its machine learning explorations. Reports cover explorations on historic newspapers, minimally-processed manuscript collections, materials digitized from physical originals and those digitized from microform surrogates, and investigate challenges related to image segmentation and document zoning, classification, document image quality analysis, metadata generation, and more.


Using Chronicling America’s Images To Explore Digitized Historic Newspapers & Imagine Alternative Futures, Elizabeth Lorang, Leen-Kiat Soh Sep 2018

University of Nebraska-Lincoln Libraries: Conference Presentations and Speeches

This presentation situates the work of the Aida team broadly and also hinges this work on some very specific challenges for digital libraries. In doing so, it demonstrates the many types of questions and domains to be explored in digitized newspapers.


Increasing Our Vision For 21st-Century Digital Libraries, Elizabeth M. Lorang, Leen-Kiat Soh Jan 2018

University of Nebraska-Lincoln Libraries: Conference Presentations and Speeches

This presentation

  1. Reads digital library interfaces—or their "main door" interfaces—as glimpses into what we have thus far valued in the development of digital libraries
  2. Frames a visual way of thinking about textual materials
  3. Introduces the work of our research team—where we are now, and where we're headed
  4. Draws some connections between the parts

This presentation is very much a look into thinking in process and work in progress and proposes the following ideas:

  1. As a community, we can do much more with the digital images we're creating of textual materials than we've heretofore done.
  2. We aspire to have additional layers …