Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Computer Sciences

Brigham Young University

2017

Data extraction

A Green Form-Based Information Extraction System For Historical Documents, Tae Woo Kim May 2017

Theses and Dissertations

Many historical documents are rich in genealogical facts. Extracting these facts by hand is tedious and almost impossible given the hundreds of thousands of genealogically rich family-history books already scanned and online. As one approach to making the extraction feasible, we propose GreenFIE, a "Green" Form-based Information-Extraction tool that is "green" in the sense that it improves with use, with the goal of minimizing the cost of human labor while maintaining high extraction accuracy. Given a page in a historical document, the user's task is to fill out given forms with all facts on a page in a document …