Flexible & Transparent Data Reuse

Time: 14:00

Venue: LT 1.4, Kilburn Building, The University of Manchester


Internal UoM seminar: "Flexible & Transparent Data Reuse" by Prof Paul Groth, University of Amsterdam


A central challenge in our modern information environment is how to use, integrate and repurpose data that stem from a multitude of diverse sources. Within data science, roughly 60-70% of the time is spent gathering, preparing, integrating, and munging data. In science, there is, for instance, the need to know which of the thousands of prior experimental records are reliable, applicable and reusable for an experiment. In this talk, I discuss the goal of developing intelligent systems that work with people to combine and reuse data flexibly, reproducibly and transparently. I give examples from my work on flexible knowledge graph construction and taxonomy creation. I then discuss interoperable data provenance tracking to provide transparency for these sorts of complex data workflows. I outline a future for using transparency to create more flexible, intelligently supported data integration and reuse environments.


Paul Groth is Professor of Algorithmic Data Science at the University of Amsterdam, where he leads the Intelligent Data Engineering Lab (INDElab). His research focuses on intelligent systems for dealing with large amounts of diverse contextualized knowledge, with a particular focus on web and science applications. Previously, Paul led the design of a number of large-scale data integration and knowledge graph construction efforts in the biomedical domain. Paul was co-chair of the W3C Provenance Working Group that created a standard for provenance interchange. He has also contributed to the emergence of community initiatives to build a better scholarly ecosystem, including altmetrics and the FAIR data principles.