Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Full-Text Articles in Physical Sciences and Mathematics
Template-Based Metadata Extraction For Heterogeneous Collection, Jianfeng Tang
Computer Science Theses & Dissertations
With the growth of the Internet and related tools, online resources have multiplied rapidly. In particular, high-quality OCR (Optical Character Recognition) tools have made it easy to convert an existing corpus into digital form and make it available online. However, a number of organizations have legacy collections that lack metadata. The lack of metadata hampers not only the discovery and dissemination of these collections over the Web, but also their interoperability with other collections. Unfortunately, manual metadata creation is expensive and time-consuming for a large collection, and most existing automated metadata extraction approaches have …