Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Articles 1 - 16 of 16

Full-Text Articles in Computer Engineering

Integrating Preservation Functions Into The Web Server, Joan A. Smith Jul 2008

Computer Science Theses & Dissertations

Digital preservation of the World Wide Web poses unique challenges, different from the preservation issues facing professional Digital Libraries. The complete list of a website’s resources cannot be cited with confidence, and the descriptive metadata available for the resources is so minimal that it is sometimes insufficient for a browser to recognize. In short, the Web suffers from a counting problem and a representation problem. Refreshing the bits, migrating from an obsolete file format to a newer format, and other classic digital preservation problems also affect the Web. As digital collections devise solutions to these problems, the Web will also benefit. But the ...


Creating Preservation-Ready Web Resources, Joan A. Smith, Michael L. Nelson Jan 2008

Computer Science Faculty Publications

There are innumerable departmental, community, and personal web sites worthy of long-term preservation but proportionally fewer archivists available to properly prepare and process such sites. We propose a simple model for such everyday web sites which takes advantage of the web server itself to help prepare the site's resources for preservation. This is accomplished by having metadata utilities analyze the resource at the time of dissemination. The web server responds to the archiving repository crawler by sending both the resource and the just-in-time generated metadata as a straightforward XML-formatted response. We call this complex object (resource + metadata) a CRATE ...


FeDCOR: An Institutional CORDRA Registry, Giridhar Manepalli, Henry Jerez, Michael L. Nelson Jan 2006

Computer Science Faculty Publications

FeDCOR (Federation of DSpace using CORDRA) is a registry-based federation system for DSpace instances. It is based on the CORDRA model. The first article in this issue of D-Lib Magazine describes the Advanced Distributed Learning-Registry (ADL-R) [1], which is the first operational CORDRA registry, and also includes an introduction to CORDRA. That introduction, or other prior knowledge of the CORDRA effort, is recommended for the best understanding of this article, which builds on that base to describe in detail the FeDCOR approach.


Lightweight Federation Of Non-Cooperating Digital Libraries, Rong Shi Apr 2005

Computer Science Theses & Dissertations

This dissertation studies the challenges and issues faced in federating heterogeneous digital libraries (DLs). The objective of this research is to demonstrate the feasibility of interoperability among non-cooperating DLs by presenting a lightweight, data driven approach, or Data Centered Interoperability (DCI). We build a Lightweight Federated Digital Library (LFDL) system to provide federated search service for existing digital libraries with no prior coordination.

We describe the motivation, architecture, design and implementation of the LFDL. We develop, deploy, and evaluate key services of the federation. The major difference from existing DL interoperability approaches is that we do not insist on ...


Final Report For The Development Of The NASA Technical Report Server (NTRS), Michael L. Nelson Jan 2005

Computer Science Faculty Publications

The author performed a variety of research, development and consulting tasks for NASA Langley Research Center in the area of digital libraries (DLs) and supporting technologies, such as the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). In particular, the development focused on the NASA Technical Report Server (NTRS) and its transition from a distributed searching model to one that uses the OAI-PMH. The Open Archives Initiative (OAI) is an international consortium focused on furthering the interoperability of DLs through the use of "metadata harvesting". The OAI-PMH version of NTRS went into public production on April 28, 2003. Since that ...


Lessons Learned With Arc, An OAI-PMH Service Provider, Xiaoming Liu, Kurt Maly, Michael L. Nelson Jan 2005

Computer Science Faculty Publications

Web-based digital libraries have historically been built in isolation utilizing different technologies, protocols, and metadata. These differences hindered the development of digital library services that enable users to discover information from multiple libraries through a single unified interface. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a major, international effort to address technical interoperability among distributed repositories. Arc debuted in 2000 as the first end-user OAI-PMH service provider. Since that time, Arc has grown to include nearly 7,000,000 metadata records. Arc has been deployed in a number of environments and has served as the basis for ...
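As a rough illustration of what a metadata-harvesting request looks like, the sketch below builds an OAI-PMH `ListRecords` URL. The verb and argument names (`metadataPrefix`, `from`, `set`, `resumptionToken`) come from the OAI-PMH specification; the base URL and the helper name `build_list_records_url` are hypothetical, and nothing here is taken from Arc's own code.

```python
from urllib.parse import urlencode

def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           from_date=None, set_spec=None,
                           resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords"}
    if resumption_token is not None:
        # A resumptionToken is an exclusive argument: it replaces all
        # other arguments when continuing a partial list.
        params["resumptionToken"] = resumption_token
    else:
        params["metadataPrefix"] = metadata_prefix
        if from_date is not None:
            params["from"] = from_date   # e.g. "2005-01-01"
        if set_spec is not None:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)

# The base URL below is a placeholder, not a real repository.
first_url = build_list_records_url("https://repository.example.org/oai")
next_url = build_list_records_url("https://repository.example.org/oai",
                                  resumption_token="abc")
print(first_url)
print(next_url)
```

A service provider would issue the first request, then repeat with each returned resumption token until the repository stops supplying one.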


Metadata And Buckets In The Smart Object, Dumb Archive (SODA) Model, Michael L. Nelson, Kurt Maly, Delwin R. Croom Jr., Steven W. Robbins Jan 2004

Computer Science Faculty Publications

We present the Smart Object, Dumb Archive (SODA) model for digital libraries (DLs), and discuss the role of metadata in SODA. The premise of the SODA model is to "push down" many of the functionalities generally associated with archives into the data objects themselves. Thus the data objects become "smarter", and the archives "dumber". In the SODA model, archives become primarily set managers, and the objects themselves negotiate and handle presentation, enforce terms and conditions, and perform data content management. Buckets are our implementation of smart objects, and da is our reference implementation for dumb archives. We also present our ...


Report On The Third ACM/IEEE Joint Conference On Digital Libraries (JCDL), Michael L. Nelson Jan 2003

Computer Science Faculty Publications

The Third ACM/IEEE Joint Conference on Digital Libraries (JCDL 2003) was held on the campus of Rice University in Houston, Texas, May 27 - 31. Regarding the merging of the ACM and IEEE conference series, in the JCDL 2002 conference report published last year in D-Lib Magazine Edie Rasmussen noted, "Perhaps by next year...no one will remember that it wasn't always so" [1]. Judging by the number of participants I met who did not know that the ACM and IEEE used to hold separate digital library conferences, Rasmussen's prediction has come to pass.


A Scalable Architecture For Harvest-Based Digital Libraries, Xiaoming Liu, Tim Brody, Stevan Harnad, Les Carr, Kurt Maly, Mohammad Zubair, Michael L. Nelson Jan 2002

Computer Science Faculty Publications

This article discusses the requirements of current and emerging applications based on the Open Archives Initiative (OAI) and emphasizes the need for a common infrastructure to support them. Inspired by HTTP proxy, cache, gateway and web service concepts, a design for a scalable and reliable infrastructure that aims at satisfying these requirements is presented. Moreover, it is shown how various applications can exploit the services included in the proposed infrastructure. The article concludes by discussing the current status of several prototype implementations.


Object Persistence And Availability In Digital Libraries, Michael L. Nelson, B. Danette Allen Jan 2002

Computer Science Faculty Publications

We have studied object persistence and availability of 1,000 digital library (DL) objects. Twenty World Wide Web accessible DLs were chosen and from each DL, 50 objects were chosen at random. A script checked the availability of each object three times a week for just over 1 year for a total of 161 data samples. During this time span, we found 31 objects (3% of the total) that appear to no longer be available: 24 from PubMed Central, 5 from IDEAS, 1 from CogPrints, and 1 from ETD.
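The figures in this abstract can be cross-checked with a few lines of arithmetic. The counts below are taken from the abstract; the function name `summarize` and the dictionary layout are assumptions made for illustration, not the authors' script.

```python
# Hypothetical sketch: reproduce the counts reported in the abstract.
NUM_LIBRARIES = 20        # digital libraries sampled (from the abstract)
OBJECTS_PER_LIBRARY = 50  # objects chosen at random per library

# Objects reported as apparently no longer available, by library.
missing_by_library = {
    "PubMed Central": 24,
    "IDEAS": 5,
    "CogPrints": 1,
    "ETD": 1,
}

def summarize(num_libraries, objects_per_library, missing):
    """Return (total objects, objects lost, percentage lost)."""
    total = num_libraries * objects_per_library
    lost = sum(missing.values())
    return total, lost, 100.0 * lost / total

total_objects, lost_objects, lost_percent = summarize(
    NUM_LIBRARIES, OBJECTS_PER_LIBRARY, missing_by_library
)
print(total_objects, lost_objects, round(lost_percent))
```

The per-library counts sum to the 31 objects reported, which is about 3% of the 1,000 objects sampled.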


Smart Objects And Open Archives, Michael L. Nelson, Kurt Maly Jan 2001

Computer Science Faculty Publications

Within the context of digital libraries (DLs), we are making information objects "first-class citizens". We decouple information objects from the systems used for their storage and retrieval, allowing the technology for both DLs and information content to progress independently. We believe dismantling the stovepipe of "DL-archive-content" is the first step in building richer DL experiences for users and ensuring the long-term survivability of digital information. To demonstrate this partitioning between DLs, archives and information content, we introduce "buckets": aggregative, intelligent, object-oriented constructs for publishing in digital libraries. Buckets exist within the "Smart Object, Dumb Archive" (SODA) DL model, which promotes ...


Buckets: Smart Objects For Digital Libraries, Michael L. Nelson Jan 2001

Computer Science Faculty Publications

Current discussion of digital libraries (DLs) is often dominated by the merits of the respective storage, search and retrieval functionality of archives, repositories, search engines, search interfaces and database systems. While these technologies are necessary for information management, the information content is more important than the systems used for its storage and retrieval. Digital information should have the same long-term survivability prospects as traditional hardcopy information and should be protected to the extent possible from evolving search engine technologies and vendor vagaries in database management systems. Information content and information retrieval systems should progress on independent paths and make limited ...


Arc - An OAI Service Provider For Digital Library Federation, Xiaoming Liu, Kurt Maly, Mohammad Zubair, Michael L. Nelson Jan 2001

Computer Science Faculty Publications

The usefulness of the many on-line journals and scientific digital libraries that exist today is limited by the inability to federate these resources through a unified interface. The Open Archives Initiative (OAI) is one major effort to address technical interoperability among distributed archives. The objective of OAI is to develop a framework to facilitate the discovery of content in distributed archives. In this paper, we describe our experience and lessons learned in building Arc, the first federated searching service based on the OAI protocol. Arc harvests metadata from several OAI-compliant archives, normalizes them, and stores them in a search ...


The UPS Prototype: An Experimental End-User Service Across E-Print Archives, Herbert Van De Sompel, Thomas Krichel, Michael L. Nelson, Patrick Hochstenbach, Victor Lyapunov, Kurt Maly, Mohammad Zubair, Mohamed Kholief, Xiaoming Liu, Heath O'Connell Jan 2000

Computer Science Faculty Publications

A meeting was held in Santa Fe, New Mexico, October 21-22, 1999, to generate discussion and consensus about interoperability of publicly available scholarly information archives. The invitees represented several well known e-print and report archive initiatives, as well as organizations with interests in digital libraries and the transformation of scholarly communication. The central goal of the meeting was to agree on recommendations that would make the creation of end-user services -- such as scientific search engines and linking systems -- for data originating from distributed and dissimilar archives easier. The Universal Preprint Service (UPS) Prototype was developed in preparation for this meeting ...


A Digital Library For The National Advisory Committee For Aeronautics, Michael L. Nelson Jan 1999

Computer Science Faculty Publications

We describe the digital library (DL) for the National Advisory Committee for Aeronautics (NACA), the NACA Technical Report Server (NACATRS). The predecessor organization for the National Aeronautics and Space Administration (NASA), NACA existed from 1915 until 1958. The primary manifestation of NACA's research was the NACA report series. We describe the process of converting this collection of reports to digital format and making it available on the World Wide Web (WWW) as a node in the NASA Technical Report Server (NTRS). We describe the current state of the project, the resulting DL technology developed from the project, and ...


Buckets: Aggregative, Intelligent Agents For Publishing, Michael L. Nelson, Kurt Maly, Stewart N. T. Shen, Mohammad Zubair Jan 1998

Computer Science Faculty Publications

Buckets are an aggregative, intelligent construct for publishing in digital libraries. The goal of research projects is to produce information. This information is often instantiated in several forms, differentiated by semantic types (report, software, video, datasets, etc.). A given semantic type can be further differentiated by syntactic representations as well (PostScript version, PDF version, Word version, etc.). Although the information was created together and subtle relationships can exist between them, different semantic instantiations are generally segregated along currently obsolete media boundaries. Reports are placed in report archives, software might go into a software archive, but most of the data and ...