An archive of more than 400,000 medieval manuscripts that has long baffled scholars – written in Hebrew, Aramaic, Arabic and Yiddish – is being brought into the digital age by a project that uses artificial intelligence to decipher and transcribe text that was once inaccessible. The collection, part of the Cairo Geniza, spans roughly a thousand years and has remained largely untapped because its documents are fragmentary, handwritten in diverse scripts and mostly uncatalogued.
The initiative, known as the MiDRASH transcription project, applies machine-learning models to convert images of manuscripts into searchable text. Only about 10% of the Geniza’s contents had previously been transcribed; the AI dramatically accelerates that work and lets researchers cross-reference names, terms and events across the archive in ways that were previously impossible.
Already, the project has surfaced documents that offer vivid glimpses into daily life and social history – including a 16th-century Yiddish letter from a widow in Jerusalem to her son in plague-ridden Cairo. Such personal records, along with religious, civic and commercial documents, are helping to reconstruct centuries-old networks of trade, scholarship and community life.
For historians, linguists and scholars of religion, the impact could be transformative. The newly accessible records may reshape understanding of medieval Jewish, Middle Eastern and Mediterranean history – offering unprecedented detail about migration, commerce, correspondence, religious practice and cultural exchange. Yet the challenge remains to ensure accurate transcription across languages and scripts, and to make the transcribed records broadly available for verification and study.
The deeper question is whether this marriage of old manuscripts and new AI will fundamentally change how societies preserve and interpret cultural heritage – turning what was once a locked archive into a source of insight for future generations.

