Lecture
DAG Lecture: A Semantic ETL Pipeline for Large-Scale Provenance Research
- Date: Thursday 13 November 2025
- Time
- Address: Van Steenis Building, Einsteinweg 2
- Room: E0.02a
 
Pre-Colonial and Indigenous Latin American objects have drawn European interest since the earliest encounters between Indigenous peoples and Europeans in the Americas. The ERC-funded project Between Canon and Coincidence (BECACO) adopts an innovative, multidisciplinary framework to investigate the provenance of ethnographic and archaeological collections from Latin America held in European museums, focusing on the period 1850–2000.
The project combines data-driven methodologies to identify collecting patterns and applies quantitative analysis to the collections of eleven European museums. By integrating these museums’ collection data, we seek to achieve semantic interoperability through the CIDOC-CRM standard, enabling large-scale provenance research and advancing data-driven approaches to studying the formation of collections over time.
To support this effort, a semantic ETL pipeline has been developed to manage large volumes of cultural data efficiently and to generate a searchable knowledge graph. This talk will outline the data integration process and address the challenges of standardizing complex cultural information.
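
For readers unfamiliar with the idea, the following is a minimal sketch of what one extract-transform-load step of such a pipeline might look like. It is not the BECACO implementation. The CSV columns, the `https://example.org/becaco/` base URI, and the function names are hypothetical, and the modelling choice (an object linked to an acquisition event, with the collector as the party title was transferred from) is an assumption made only for illustration. The class and property names are taken from CIDOC-CRM.

```python
import csv
import io

CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "https://example.org/becaco/"  # hypothetical base URI, not the project's

# Hypothetical museum export; real collection data is far messier.
SAMPLE_CSV = """museum,object_id,collector
Museum A,  1234 ,J. Doe
Museum B,X-77,
"""


def slug(text):
    """Turn a label into a simple URI-safe token (illustrative only)."""
    return text.strip().lower().replace(".", "").replace(" ", "-")


def extract(text):
    """Extract: read CSV rows into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(row):
    """Transform: map one row to (subject, predicate, object) URI triples."""
    obj = f"{BASE}object/{slug(row['museum'])}/{row['object_id'].strip()}"
    acq = f"{obj}/acquisition"
    triples = [
        (obj, RDF_TYPE, CRM + "E22_Human-Made_Object"),
        (acq, RDF_TYPE, CRM + "E8_Acquisition"),
        (acq, CRM + "P24_transferred_title_of", obj),
    ]
    collector = row["collector"].strip()
    if collector:  # skip the actor when the source field is empty
        actor = f"{BASE}actor/{slug(collector)}"
        triples.append((actor, RDF_TYPE, CRM + "E39_Actor"))
        triples.append((acq, CRM + "P23_transferred_title_from", actor))
    return triples


def load(triples):
    """Load: serialize triples as N-Triples text, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


rows = extract(SAMPLE_CSV)
graph = [t for row in rows for t in transform(row)]
ntriples = load(graph)
```

The resulting `ntriples` string could then be loaded into a triple store and queried, which is where the searchable knowledge graph would come from. Most of the standardization work falls in the transform step, where each museum's fields are mapped to shared CIDOC-CRM terms.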
