Enhancing Archival Access: USF Libraries Implements Ethical AI for Handwritten Document Transcription


Post by Marlena Carrillo, Digital Initiatives Coordinator in the Digital Initiatives unit of the USF Libraries

 

clipping from a diary showing cursive writing
Hughes, Ellis, “Ellis Hughes Diary: Book 1” (1838). Select Florida Studies Manuscripts. 1.

The University of South Florida Libraries is proud to announce a major advancement in its digital initiatives: the adoption of QDox, an AI transcription solution developed by Quantiphi and powered by Amazon Textract (AWS). The Libraries are using this technology to transcribe handwritten archival documents more efficiently and with ethical oversight.

The Challenge of Handwritten Archives

Digitizing archival materials has long been a cornerstone of expanding access to historical resources. However, handwritten documents present a unique challenge. Traditional Optical Character Recognition (OCR) tools work well on printed text but fall short when applied to handwriting, often producing inaccurate or incomplete transcriptions. The problem is compounded by hand-drawn images, charts, and tables, whose reading order and direction are even harder for automated tools to determine.

A Human-in-the-Loop AI Solution

To overcome this barrier, USF Libraries has implemented a human-in-the-loop AI model using QDox. This hybrid system combines machine learning with human review to ensure high transcription accuracy:

  • AI-generated transcriptions are assigned confidence scores.
  • Low-confidence segments are flagged for human review and correction.
  • Human feedback is used to continuously improve the system’s performance.

This approach not only accelerates the transcription process but also ensures that the final output maintains the integrity and context of the original documents.
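For readers curious how confidence-based routing can work in practice, the following is a minimal Python sketch of the review step described above. It is not QDox's actual implementation or API: the `Segment` type, the 0-to-1 confidence scale, the 0.85 threshold, and the sample text are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span of text with a model confidence score."""
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


# Hypothetical cutoff; a real project would tune this per collection.
REVIEW_THRESHOLD = 0.85


def route_segments(segments, threshold=REVIEW_THRESHOLD):
    """Split segments into auto-accepted ones and ones needing human review."""
    accepted, needs_review = [], []
    for seg in segments:
        if seg.confidence >= threshold:
            accepted.append(seg)
        else:
            needs_review.append(seg)
    return accepted, needs_review


def apply_correction(segment, corrected_text):
    """Return a new segment holding the reviewer's text.

    The score is set to 1.0 to mark the text as human-verified (an assumption
    of this sketch, not a documented behavior of any tool).
    """
    return Segment(text=corrected_text, confidence=1.0)


if __name__ == "__main__":
    # Made-up sample text, not taken from any archival document.
    sample = [
        Segment("Left home at dawn", 0.97),
        Segment("arivd at the rivr", 0.42),
    ]
    accepted, needs_review = route_segments(sample)
    for seg in needs_review:
        print(f"Flagged for review: {seg.text!r} ({seg.confidence:.2f})")
```

In a workflow like this, a reviewer's corrections (here, `apply_correction`) would be saved alongside the original machine output so they can later serve as feedback for improving the system.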

Ethical and Scholarly Impact

Beyond improving access, this initiative reflects USF Libraries’ commitment to ethical AI use and inclusive scholarship. The human-in-the-loop model ensures transparency, accountability, and respect for the historical and cultural significance of archival materials.

Researchers, students, and the public will benefit from improved access to previously hard-to-read documents, enabling deeper engagement with primary sources and fostering new opportunities for digital humanities research.

For more information or to explore collaborative opportunities, please contact the Digital Initiatives unit at USF Libraries.

 


References

Amazon Web Services. (n.d.). Amazon Textract. https://aws.amazon.com/textract/

Blanke, T., Hedges, M., & Bryant, M. (2011). Open source optical character recognition for historical research. Academia.edu. https://www.academia.edu/5291478/Open_source_optical_character_recognition_for_historical_research

Quantiphi. (2023). QDox: GenAI-powered document processing built on AWS. https://quantiphi.com/partners/amazon-web-services/qdox/
