Notes from Internet2 ACAMP (un)conference session
We proposed a session on metadata management in IdM at the Internet2 ACAMP (un)conference. It drew 19 attendees and a rich discussion.
Discussion notes
Original discussion notes and resources can be found at https://docs.google.com/document/d/1PHXqTQXEzUDsNPojrglvrwENd-ixxE9EDdsZirwKZtQ
Notes from discussion
-
storing metadata about attributes such as where the attributes originate etc.
-
Trace source of a given attribute value; identity verification, might have family name from multiple sources; which value from where
-
BB: Attribute freshness and other additional info would be good to expose; an SP should be able to learn what assurance it can place in attributes provided by Systems of Record; if Loader brings in Basis groups, can we add attributes (freshness, …) to those groups as we load them? Inventory of services: how soon can a new hire get access? We have a mission to inventory all this. It’s in someone’s head; they know the bits we want to capture. Then I can get out of the loop. Mary McKee and Shilen Patel had a related session in New Orleans last year: Curated Group management. Drive ITSM out of Grouper!
-
Slavek: 1) Metadata management: Freshness, LOA, Info about source operations; 2) Identity Mgmt: Change in system X, when will it get to System Y and enable Service Z; Privacy protection motivated midPoint work on metadata and provenance
-
Possible categories of metadata:
-
security aspects
-
System of record? source/provenance
-
Last updated date for this attribute value?
-
Historical values?
-
ChrisB: Blue sky, with infinite data storage: per-value metadata; point-in-time questions such as what was the value yesterday? Interest is in the operational perspective: person x doesn’t have a u-card, why? That needs a four-system flow, and the problem was between systems 2 and 3….
-
Keith: Explore this more as a logging capability. It really isn’t one, but treat the information that way. Then you don’t have to cram it all into “The Registry”; Stream processing: ChrisB: Kafka, Splunk,….
-
GDPR requirements demand that this info is available and it MUST disappear at point X
-
LauraP: Metadata 2020, a more general effort to evangelize the need for metadata and its collection
-
ChrisB: Static vs Dynamic provenance: Static: datum x comes from here, every n minutes; Dynamic: per value, change-event metadata of operational utility (log-ish)
-
Is an attribute still in use? By whom?…The answers evolve over time. Sys X still has access to LDAP; Deprovisioning system-level access management;
-
BB: Spinning down our old LDAP: Who was using ePAffiliation? Default deny, then the exceptions can be associated with a specific policy
-
SP onboarding on campus: tech, sec, op contacts;
-
Data Provenance can help with derived values: email is created from official family name appended with “@foo.edu”
-
Trust in data that has been pulled into ORCID; https://www.ga4gh.org/ Genomics validating that data came from such and such a source; Assesses in a distributed way; Slavek: store signature with the data;
-
ChrisB: Cable provider wanted to provide Xfinity for dorm residents: We found a reliable source that Housing Dept was coding Hall/Room/Occupant information in PeopleSoft; Metadata from other sources that can be added to existing entries
-
Slavek: Evolveum metadata prototype: https://docs.evolveum.com/midpoint/midprivacy/phases/01-data-provenance-prototype/ Welcome experimenters!
-
Origin: categories, nomenclature; Could BB adopt this taxonomy in his Grouper metadata; LastLoaded, LastChanged on memberships;
-
Mahbub: Central Identity Registry, HR & SIS; matching, survivorship (winning value); Is there a way to leverage the metadata/provenance features of midPoint?
-
DEMO of Experimental Metadata/Provenance support in midPoint;
-
Metadata can be used in provisioning/deprovisioning logic and can be provisioned out
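Several of the points above (source/provenance, freshness, derived values, survivorship) can be illustrated with a small sketch of per-value metadata. This is not the midPoint data model or any Grouper feature; every name here (`ValueMetadata`, `AttributeValue`, `derive_email`, `pick_survivor`, the "HR"/"SIS" labels, the sample date) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ValueMetadata:
    """Hypothetical per-value metadata: where a value came from and when."""
    source: str                  # system of record, e.g. "HR" or "SIS"
    last_updated: datetime       # when the source last asserted the value
    derived_from: tuple = ()     # names of attributes the value was derived from


@dataclass(frozen=True)
class AttributeValue:
    name: str
    value: str
    meta: ValueMetadata

    def is_fresh(self, now: datetime, max_age: timedelta) -> bool:
        """Freshness check: the value was asserted no longer than max_age ago."""
        return now - self.meta.last_updated <= max_age


def derive_email(family_name: AttributeValue, suffix: str = "@foo.edu") -> AttributeValue:
    """Derive an email from the family name and carry its provenance forward."""
    return AttributeValue(
        name="email",
        value=family_name.value + suffix,
        meta=ValueMetadata(
            source=family_name.meta.source,
            last_updated=family_name.meta.last_updated,
            derived_from=(family_name.name,),
        ),
    )


def pick_survivor(candidates, source_priority):
    """Survivorship: prefer the highest-priority source, then the newest value.

    Assumes every candidate's source appears in source_priority.
    """
    return min(
        candidates,
        key=lambda v: (source_priority.index(v.meta.source),
                       -v.meta.last_updated.timestamp()),
    )


now = datetime(2020, 1, 15, tzinfo=timezone.utc)
hr_name = AttributeValue("familyName", "Smith", ValueMetadata("HR", now - timedelta(days=2)))
sis_name = AttributeValue("familyName", "Smyth", ValueMetadata("SIS", now - timedelta(hours=1)))

winner = pick_survivor([hr_name, sis_name], source_priority=["HR", "SIS"])
email = derive_email(winner)
```

Because the metadata travels with each value, the derived email can still answer "which source did this come from, and how old is it?", which is the kind of question raised in the discussion.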
Resources
-
Metadata Advocacy: Metadata 2020
-
Different lenses for thinking about the Metadata question (from Laura)
-
Metadata Principles - high-level goals for great metadata
-
Personas - those roles that interact with metadata - thinking in terms of how one/a system works with the metadata rather than thinking of it in terms of a job description:
-
Metadata Practices - these are really general practices around metadata, though I think an IAM-specific set of tasks built around these practices could be a useful resource. If we create something like that, I can also promote it among the Metadata 2020 audiences, which include many of the same organizations, but different parts of those organizations
-
Data provenance
-
ORCID - uses a chain of provenance to provide insight about the authoritative source of information and what systems have touched the data in between.
-
http://www.GA4GH.org - GA4GH is the Global Alliance for Genomics and Health. It provides technical and policy standards for the sharing of genomics data
-
In the Data Use and Researcher Identity (DURI) working group, they are including a model for sharing signature information about the original source of an assertion (I’ll put in the link to the specific specification… one sec. Looking for it.)
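The idea of storing a signature with the data, so a consumer can check that a value really came from the stated source, might look roughly like the sketch below. It is not the GA4GH or ORCID mechanism; it assumes a shared secret and uses an HMAC from the Python standard library, whereas a distributed setup would more likely use asymmetric signatures. The function names, record layout, and demo key are all hypothetical.

```python
import hashlib
import hmac
import json


def _payload(attribute: str, value: str, source: str) -> bytes:
    """Canonical byte form of the fields covered by the signature."""
    return json.dumps(
        {"attribute": attribute, "value": value, "source": source},
        sort_keys=True,
    ).encode("utf-8")


def sign_value(attribute: str, value: str, source: str, key: bytes) -> dict:
    """Return the value together with a signature made by its source."""
    signature = hmac.new(key, _payload(attribute, value, source), hashlib.sha256).hexdigest()
    return {"attribute": attribute, "value": value, "source": source, "signature": signature}


def verify_value(record: dict, key: bytes) -> bool:
    """Check that the record still matches the signature stored with it."""
    expected = hmac.new(
        key,
        _payload(record["attribute"], record["value"], record["source"]),
        hashlib.sha256,
    ).hexdigest()
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, record["signature"])


demo_key = b"demo-key-not-for-production"   # hypothetical shared secret
record = sign_value("familyName", "Smith", "HR", demo_key)
tampered = dict(record, value="Smyth")      # value changed after signing
```

Here `verify_value(record, demo_key)` succeeds, while the tampered copy fails verification because its value no longer matches the stored signature.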