
AI helps historians restore, date ancient texts
The artificial intelligence (AI) application, called Ithaca, was created to help historians decipher and date inscriptions on ancient stone, pottery, and metal artifacts that have been damaged to the point of illegibility and possibly transported far from their original location. Ithaca, say the researchers, can restore the missing text of damaged inscriptions, identify their original location, and help establish the date they were created.
“Our evaluations show that Ithaca achieves 62% accuracy in restoring damaged texts, 71% accuracy in identifying their original location, and can date texts to within 30 years of their ground-truth date ranges,” say the researchers. “Historians have already used the tool to reevaluate significant periods in Greek history.”
Ithaca, named after the Greek island in Homer’s Odyssey, is trained on the largest digital dataset of Greek inscriptions, which comes from the Packard Humanities Institute. Since many of the inscriptions historians want to analyze with Ithaca are damaged and often missing chunks of text, the researchers trained Ithaca using both words and individual characters as inputs to ensure their model still works when presented with incomplete text.
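As a rough illustration of what feeding a model both words and characters can look like, the Python sketch below turns a damaged line into two aligned sequences. The placeholder symbols, the function name, and the sample text are assumptions made for this example and are not taken from Ithaca’s code.

```python
MISSING = "-"       # assumed placeholder for one illegible character
UNK_WORD = "[unk]"  # assumed token for a word that cannot be read in full


def char_and_word_inputs(text):
    """Return two aligned sequences: one character and one word token per position."""
    chars = list(text)
    words = []
    for word in text.split(" "):
        # A word with any missing character is replaced by the unknown token,
        # so only the character sequence carries its surviving letters.
        token = UNK_WORD if MISSING in word else word
        words.extend([token] * len(word))
        words.append(" ")  # keep the separator position aligned with chars
    words.pop()  # drop the separator added after the last word
    return chars, words


# Hypothetical damaged text: the first letter of the second word is lost.
chars, words = char_and_word_inputs("δημος -θηναιων")
for c, w in zip(chars, words):
    print(repr(c), w)
```

In this sketch a damaged word contributes nothing at the word level, but its legible letters still reach the model through the character sequence, which is the kind of fallback the dual input is meant to provide.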
The sparse self-attention mechanism at the model’s core evaluates these two inputs in parallel, say the researchers, allowing Ithaca to evaluate inscriptions as needed. To maximize Ithaca’s value as a research tool, the researchers also created a number of visual aids to ensure Ithaca’s results are easily interpretable by historians:
- Restoration hypotheses: Ithaca generates several prediction hypotheses for the text restoration task for historians to choose from using their expertise.
- Geographical attribution: Ithaca shows its uncertainty by giving historians a probability distribution over all possible predictions – instead of just a single output. As a result, it returns probabilities for 84 different ancient regions representing its level of certainty. It visualizes these results on a map to shed light on possible underlying geographical connections across the ancient world.
- Chronological attribution: When dating a text, Ithaca produces a distribution of predicted dates across all decades from 800 BCE to 800 CE. This can enable historians to visualize the model’s confidence for specific date ranges, which may offer valuable historical insights.
- Saliency maps: To show historians what drove a result, Ithaca uses a technique common in computer vision that identifies which parts of the input contribute most to a prediction. The output highlights, in varying color intensities, the words that led to Ithaca’s predictions for missing text, location, and dates.
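The chronological attribution output described above can be pictured as a probability for each decade rather than a single year. The Python sketch below is a minimal illustration under stated assumptions: the scores are invented, and the function names and the choice to summarize with a most likely decade and an expected year are not taken from Ithaca’s code.

```python
import math

START_YEAR, END_YEAR, BIN = -800, 800, 10  # 800 BCE to 800 CE, in decades
DECADES = list(range(START_YEAR, END_YEAR, BIN))  # start year of each decade


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def summarize_dates(scores):
    """Return the distribution, the most likely decade, and the expected year."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    # The expected year uses the midpoint of each decade.
    expected = sum(p * (d + BIN / 2) for p, d in zip(probs, DECADES))
    return probs, DECADES[best], expected


# Hypothetical model scores peaked around 420 BCE (negative years are BCE).
scores = [-((d + 420) / 50) ** 2 for d in DECADES]
probs, top_decade, expected_year = summarize_dates(scores)
print(top_decade, round(expected_year))
```

Keeping the whole distribution, instead of only the top decade, is what lets a historian see whether the model is confident in one narrow range or split between several periods. The same idea applies to the probabilities over the 84 regions.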
Experimental evaluation shows how Ithaca’s design decisions and visualization aids make it easier for researchers to interpret results, say the researchers.
“The expert historians we worked with achieved 25% accuracy when working alone to restore ancient texts. But, when using Ithaca, their performance increases to 72%, surpassing the model’s individual performance and showing the potential for human-machine cooperation to advance historical interpretation, establish relative datings for historical events, and even contribute to current methodological debates.”
The researchers say they are currently working on versions of Ithaca trained on ancient languages other than Greek, and historians can already use their own datasets in the current architecture to study other ancient writing systems, from Akkadian to Demotic and Hebrew to Mayan. To aid further research, say the researchers, they have also open-sourced their code, the pretrained model, and an interactive Colaboratory notebook at https://github.com/deepmind/ithaca.
For more, see “Restoring and attributing ancient texts using deep neural networks.”