A New Generation of Textual Corpora: Mining Corpora from Very Large Collections

Crane, Gregory

Stewart, Gordon

Babeu, Alison

2007

Description
  • While digital libraries based on page images and automatically generated text have made possible massive projects such as the Million Book Library, Open Content Alliance, Google, and others, humanists still depend upon textual corpora expensively produced with labor-intensive methods such as double-keyboarding and manual correction. This paper reports the results from an analysis of OCR-generated ...
ID: gh93h937z
Component ID: tufts:PB.001.001.00006