Developing automated text-image alignment to enhance access to heritage manuscript images

    Project Director(s):
    Peter M. Scharf
    Author(s):
    Peter M. Scharf
    Date:
    2016
    Group(s):
    Data Rescue
    Subject(s):
    Asian languages
    Item Type:
    White paper
    Institution:
    Sanskrit Library
    Tag(s):
    NEH White papers, Research and Development, NEH Preservation and Access
    Permanent URL:
    http://dx.doi.org/10.17613/M60T0D
    Abstract:
    The proposed project aims to enhance access to primary cultural heritage materials of India by developing human-validated automated text-image alignment techniques that provide access to digital images via related machine-readable texts, lexical resources, linguistic software, and a sophisticated search interface. Digital images of manuscripts written in Sanskrit, one of the world's richest culture-bearing languages, will be integrated into a digital library of Sanskrit. This integration will allow generalized information extraction and search techniques to reach enormous reservoirs of Sanskrit manuscripts. Integrating primary cultural materials with the Sanskrit Library will thus enable broad use of Indic collections in research and education, where Indic materials are grossly underrepresented. The results will be extensible to the collections of Sanskrit manuscripts housed in American libraries and throughout the world, and to archives of scanned Sanskrit books.
    Notes:
    Development of software to produce partial transcriptions of Sanskrit manuscripts for human validation. The project would also integrate the manuscripts into a digital library to extend the use of lexical resources and linguistic tools for full-text searching and analysis.
    Metadata:
    Status:
    Published
    Last Updated:
    3 years ago
    License:
    Attribution-NonCommercial

    Downloads

    Item Name: pr-50178-13.pdf