Notes from Internet2 ACAMP (un)conference session

Last modified 12 Mar 2021 10:22 +01:00

We have proposed a session about metadata management in IdM during Internet2 ACAMP (un)conference. We had 19 attendees and a rich discussion.

Discussion notes

Original discusion notes a resources can be found at https://docs.google.com/document/d/1PHXqTQXEzUDsNPojrglvrwENd-ixxE9EDdsZirwKZtQ

Notes from discussion

  • storing metadata about attributes such as where the attributes originate etc.

  • Trace source of a given attribute value; identity verification, might have family name from multiple sources; which value from where

  • BB: Attribute freshness, additional info that would be good to expose; SP should be able to get info about what assurance they can put in attributes provided by Systems of Record; If Loader brings in Basis groups, can we add attributes to those groups as we load them? Freshness,…​; Inventory of services, How soon can new hire get access; We have a mission to inventory all this. It’s in someone’s head—​they know the bits we want to capture. Then I can get out of the loop. Mary McKee, Shilen Patel had a session related to this in New Orleans last year: Curated Group management. Drive ITSM out of Grouper!

  • Slavek: 1) Metadata management: Freshness, LOA, Info about source operations; 2) Identity Mgmt: Change in system X, when will it get to System Y and enable Service Z; Privacy protection motivated midPoint work on metadata and provenance

  • Possible categories of Metadata;

    • security aspects

    • System of record? source/provenance

    • Last updated date for this attribute value?

    • Historical values?

  • ChrisB: Blue sky infinite data storage: per value metadata; point in time stuff what was the value yesterday? Interest is in the operational perspective; person x doesn’t have a u-card, why? Needs 4 system flow, problem was between systems 2 and 3….

  • Keith: Explore this more as a logging capability. It really isn’t but treat the information that way. Then you don’t have to cram it all into “The Registry”; Stream processing: ChrisB: Kafka, Splunk,…​.

  • GDPR requirements demand that this info is available and it MUST disappear at point X

  • LauraP: Metadata2020, a more general effort to evangelize need for metadata and collection of it

  • ChrisB: Static vs Dynamic provenance: Static: datum x comes from here every ; Per n minutes; Per value, more dynamic: change-event metadata of operational utility (Log-ish)

  • Is an attribute still in use? By whom?…​The answers evolve over time. Sys X still has access to LDAP; Deprovisioning system-level access management;

  • BB: Spinning down our old LDAP: Who was using ePAffiliation? Default deny, then the exceptions can be associated with a specific policy

  • SP onboarding on campus: tech, sec, op contacts;

  • Data Provenance can help with derived values: email is created from official family name appended with “@foo.edu”

  • Trust in data that has been pulled into ORCID; https://www.ga4gh.org/ Genomics validating that data came from such and such a source; Assesses in a distributed way; Slavek: store signature with the data;

  • ChrisB: Cable provider wanted to provide Xfinity for dorm residents: We found a reliable source that Housing Dept was coding Hall/Room/Occupant information in PeopleSoft; Metadata from other sources that can be added to existing entries

  • Slavek: Evolveum metadata prototype: https://docs.evolveum.com/midpoint/midprivacy/phases/01-data-provenance-prototype/ Welcome experimenters!

  • Origin: categories, nomenclature; Could BB adopt this taxonomy in his Grouper metadata; LastLoaded, LastChanged on memberships;

  • Mahbub: Central Identity Registry, HR & SIS; matching, survivorship (winning value); Is there a way to leverage metadata/provenance features of midPoint.

  • DEMO of Experimental Metadata/Provenance support in midPoint;

  • Metadata can be used in provisioning/deprovisioning logic and can be provisioned out

Resources

  • 2019 TechEx Presentation: Paranoid IAM

  • Metadata Advocacy: Metadata 2020

  • Different lenses for thinking about the Metadata question (from Laura)

    • Metadata Principles - high-level goals for great metadata

    • Personas - those roles that interact with metadata - thinking in terms of how one/a system works with the metadata rather than thinking of it in terms of a job description:

    • Metadata Practices - these are really general practices around metadata, though I think an IAM-specific set of tasks that are round these practices could be a useful resource. - if we create something like that, I can also promote it among the Metadata 2020 audiences which includes many of the same organizations, but different parts of those organizations

  • Data provenance

    • ORCID - uses a chain of provenance to provide insight about the authoritative source of information and what systems have touched the data in between.

    • http://www.GA4GH.org - GA4GH is the Global Alliance for Genomic Health. It provides technical and policy standards for the sharing of genomics data