Automated segmentation of historical archaeological photographs: A CNN-based approach applied to the Tachara palace of Persepolis
Domenico Andreucci
2025-01-01
Abstract
This article presents a methodological study of convolutional neural networks (CNNs) applied to the automatic segmentation of historical archaeological photographs, exploring ways to mitigate long-standing interpretative challenges posed by ambiguous or poorly preserved visual elements. The research centres on the Tachara palace of Persepolis (Iran), chosen as a case study for its rich architecture and iconographic programme, which are documented in an extensive corpus of historical and modern photographs from various archival collections, including those of the IsMEO Italian mission (1964-1979). A custom dataset, PeRSeg14 (Persepolis Restoration activities Segmentation), was developed by manually annotating 14 visual classes representing significant architectural, decorative, and contextual features. The classes were identified through direct visual analysis of the photographic corpus and draw on established terminology from the Getty Art & Architecture Thesaurus and authoritative Persepolis scholarship. The YOLOv8n-seg model was trained and evaluated following a reproducible pipeline built on open-source tools. The quantitative evaluation showed moderate but promising results (Precision=0.649, Recall=0.425, mAP@0.5=0.445, mAP@0.5:0.95=0.264): precision and mAP were higher for recurrent, clearly defined architectural elements, while visually ambiguous or underrepresented categories, such as Scaffolding Poles or Human figures, proved significantly harder to segment. The qualitative analysis confirmed the model’s capacity to produce semantically coherent segmentation masks for both validation and test images, demonstrating some generalisation to external photographic archives, though with notable limitations due to visual variability and preservation states.
The study critically addresses methodological issues related to dataset imbalance, photographic degradation, and iconographic ambiguity, underscoring the essential role of human interpretation and post-processing validation. Despite the intrinsic limitations of the lightweight YOLOv8n-seg architecture and the complexity inherent in archaeological data, this research offers a replicable analytical framework and a structured, richly annotated dataset designed to facilitate future deep learning applications in archaeology. The PeRSeg project thus shows the potential of CNNs as analytical tools for archaeological documentation, proposing both a controlled visual vocabulary and a defined annotation protocol to address the enduring challenges posed by fragmentation and limited accessibility within historical photographic archives.

| File | Access | Description | File type | Size | Format |
|---|---|---|---|---|---|
| richiesta_deroga_IRIS_AperTO_Andreucci_Parthica_2025_solo_deroga.pdf | Restricted access | Waiver request due to the presence of images and materials subject to third-party rights and the impossibility of publishing the complete publisher's PDF. | Waiver (a file stating the reasons must be attached) | 3 kB | Adobe PDF |
| Andreucci_Parthica_2025_first-page.pdf | Open access | Official first page of the article, with abstract, keywords, and DOI, freely shareable in open access as indicated by the publisher. | Non-bibliographic material | 129.94 kB | Adobe PDF |