| Title: | The Challenge of Virginia Banks: An Evaluation of Named Entity Analysis in a 19th Century Newspaper Collection |
| Citable URL: | http://hdl.handle.net/10427/42691 |
| Author: | Crane, Gregory; Jones, Alison |
| Date: | 2006 |
| Citation: | Crane, Gregory, and Alison Jones. "The Challenge of Virginia Banks: An Evaluation of Named Entity Analysis in a 19th Century Newspaper Collection." In Proceedings of the Sixth ACM/IEEE-CS 2006 Joint Conference on Digital Libraries, Chapel Hill, NC, June 11-15, 2006, preprint. New York: Association for Computing Machinery, 2006. http://doi.acm.org/10.1145/1141753.1141759. Available from Tufts Digital Library, Digital Collections and Archives, Medford, MA. http://hdl.handle.net/10427/42691 |
| Rights: | http://www.acm.org/publications/policies/copyright_policy |
Abstract: This paper evaluates automatic extraction of ten named entity classes from a 19th-century newspaper, the Civil War years of the Richmond Times Dispatch, digitized with IMLS support by the University of Richmond. It analyzes success with these ten categories of entities, all prominent in these newspapers, and the particular problems that these classes of named entities raise. Personal and place names are familiar, but some more important categories (such as ship names and military units) illustrate some of the challenges that named entity identification confronts as it evolves into a fundamental tool not only for automatic metadata generation but also for searching and browsing. We conclude by suggesting the kinds of knowledge sources that digital libraries need to assemble as part of their machine-readable reference collections to support named entity identification as a core service.