MidPrivacy: Data Provenance Prototype Abstract

Last modified 12 Mar 2021 10:22 +01:00

Data provenance is one of the fundamental problems of data protection. Data protection regulations and practices call for transparency and accountability. However, current systems are seldom capable of tracing the origin (provenance) of the data they process. This is hardly surprising given the complexity that data provenance brings, especially for data modelling and maintenance.

The Data Provenance Prototype project aimed at the design and implementation of provenance metadata in identity management. The features were implemented in midPoint, a comprehensive open source identity management and governance platform. The project involved the design and implementation of Axiom, a novel data modeling language with a unique capability to define metadata schemas. Axiom was used to model data provenance schemas, which were integrated into midPoint core functionality. The result was a prototype of an end-to-end provenance-enabled system. Provenance metadata are generated at data entry points. Provenance information is maintained during data processing, transformation and storage. The user interface was modified to display provenance metadata to system administrators.
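The flow described above (metadata generated at entry points, then carried through transformation) can be illustrated with a minimal Java sketch. This is only an illustration under stated assumptions: the types `Provenance` and `Tracked`, the methods `ingest`, `transform` and `concat`, and the source names are all hypothetical and are not part of the midPoint or Axiom APIs.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; none of these types are midPoint or Axiom APIs.
final class ProvenanceSketch {

    /** Where and when a value entered the system. */
    static final class Provenance {
        final String source;       // made-up source name, e.g. "hr-feed"
        final Instant acquiredAt;  // time the value was acquired

        Provenance(String source, Instant acquiredAt) {
            this.source = source;
            this.acquiredAt = acquiredAt;
        }
    }

    /** A data value together with its provenance metadata. */
    static final class Tracked<T> {
        final T value;
        final List<Provenance> provenance;

        Tracked(T value, List<Provenance> provenance) {
            this.value = value;
            // Defensive copy so later changes to the caller's list do not leak in.
            this.provenance = Collections.unmodifiableList(new ArrayList<>(provenance));
        }
    }

    /** Entry point: metadata is generated when the value is first acquired. */
    static <T> Tracked<T> ingest(T value, String source) {
        Provenance p = new Provenance(source, Instant.now());
        return new Tracked<>(value, Collections.singletonList(p));
    }

    /** Transformation: the value changes, its provenance is carried over. */
    static <T, R> Tracked<R> transform(Tracked<T> input, Function<T, R> mapping) {
        return new Tracked<>(mapping.apply(input.value), input.provenance);
    }

    /** Combination: the result lists the provenance of both inputs. */
    static Tracked<String> concat(Tracked<String> a, Tracked<String> b) {
        List<Provenance> merged = new ArrayList<>(a.provenance);
        merged.addAll(b.provenance);
        return new Tracked<>(a.value + " " + b.value, merged);
    }

    public static void main(String[] args) {
        Tracked<String> given = ingest("jack", "hr-feed");
        Tracked<String> family = ingest("Sparrow", "self-service-form");
        Tracked<String> capitalized =
                transform(given, s -> s.substring(0, 1).toUpperCase() + s.substring(1));
        Tracked<String> fullName = concat(capitalized, family);

        System.out.println(fullName.value);
        for (Provenance p : fullName.provenance) {
            System.out.println("  from " + p.source);
        }
    }
}
```

The design choice worth noting is that metadata travels with each value rather than with the whole record, so a computed value such as a full name can report every source that contributed to it. Whether the prototype structures its metadata this way is not stated here; the sketch only shows the general idea.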

As this was a prototyping project, there were naturally challenges and even unexpected discoveries. We have gained insights into the character of provenance metadata and its overlap with personal data protection that are likely to be of interest to a broad community. The prototype was successful and is likely to lead to full productization of the functionality. Production-quality support for provenance metadata is likely to establish midPoint and its maintainer Evolveum as one of the leaders in the personal data protection field. This may be further amplified by the open source character of the solution.

This project is part of midPrivacy, which is a long-term initiative to implement privacy and data protection features in midPoint.