Design and Implementation of the Native Repository

Last modified 25 Jun 2022 01:05 +02:00
Since 4.4
This functionality is available since version 4.4.
You can learn a lot from the Native Repository webinar video (from the series prepared after LTS 4.4 release, see Talks for more) or go through the slides. The webinar goes quite in-depth about the role of the repository in midPoint, its evolution, motivation for the rework, the changes in the new repository, database table structure and more. Second half focuses on the usage - how to configure midPoint for the repository, tuning it, etc. Finally, the webinar closes with section dedicated to the Native SQL audit and its partitioning.

Repository responsibilities

These are based on RepositoryService interface and there should be no difference compared to the legacy repository. Each point contains design/implementation notes.

  • Repository stores midPoint objects (subtypes of ObjectType), assigning their OID if missing. For this m_object table with oid (UUID) and fullObject (BYTEA) columns would be enough. However, this is not acceptable for searchObjects performance see note there.

  • Repository assigns any missing container IDs (CID) for object’s containers. These IDs are unique inside a single object and assigned loosely sequentially. For this m_object.cid_seq is stored in the database and it contains the next available CID. IDs are assigned only to multivalued containers, e.g. object’s metadata does not need any CID.

  • Repository allows searching for objects and selected containers with reasonable performance. This is why objects stored as documents (fullObject) is not enough. Various types are stored in specific sub-tables instead with values extracted to separate columns. There are other related tables, typically for containers and other multi-value properties. Not all values are by default searchable/indexable in this way and it must be possible to specify what other properties are indexed. These can be properties from the default schema but also from schema extensions.

  • Each object is a separate "aggregate" and any inter-object references are soft, not constraints on the database level. For example, multi-value container roleMembershipRef is stored in m_ref_role_membership table which has a foreign key to the common OID pool m_object_oid (this is necessary for any FK pointing to object OID if multiple types are possible, FK to m_object does not work this way). But the targetOid is not a FK as it points to another object.

Design notes

Modify Efficiency concerns

Modify efficiency for distributed full-object (AccessCertificationCampaign vs its case):

  • ADD modification for Case container of AccCertCampaign should not fetch other cases. Update of full object (but without cases), version, etc. works as usual.

  • DELETE modification for Case container of AccCertCampaign should not fetch any cases, only delete the case row (and update main full object).

  • REPLACE modification for case does not need to read any cases, just delete all the rows.

  • Modification going inside any Case should fetch only the needed case before applying the modification.

This mechanism can be reused if full-object for assignments is split from object’s full-object in the future (not sure about the correctness of the column name fullObject).

Index-only attributes/extensions modify efficiency:

  • REPLACE modification does not require any fetch, it simply obliterates the previous value in the DB.

  • ADD/DELETE modification currently requires the fetch of the index-only property, which can be achieved by selective fetch of ext→'id-of-index-only-property'. Modification is then applied to prism object as usual. Value in the DB is overwritten by the new complete value for the property (like for REPLACE).

    • In the future, ideally, this would not need any fetch access (except for reindex) and issue fine-tuned SQL operations. This may also require more flexible storage options for extensions/attributes, e.g. promoting the index-only attribute to a separate array column or separate table even.

Iterative search is a solution for long-running operations modifying many objects. Doing all the work in a single transaction is not a good design for various reasons:

  • It does not perform well, even if done on objects read iteratively (to scale without running out of memory). Long transaction my lock objects and slow down other transactions, and/or cause inconsistencies and retries. Finally, if something goes wrong on 99% the whole work still needs to be done again.

  • It causes inconsistencies as reported in MID-4414. Long database reads (recompute, recon) can return data that are outdated. It seems that the databases using READ_COMMITTED isolation will return the data as they were at the beginning of the transaction, which is the beginning of the search.

Possible solutions:

  • Search returning only OIDs - but this does not scale well. When the code gets to the last OID, the object can be changed and not matching the original search.

  • Splitting the single search to multiple searches in separate transactions. This makes the consistency problems less serious and also assures that the work progresses.

  • Due to complexity of possible work on the found objects any pure database-based solution is out of question. Sometimes we use the results for reporting, but sometimes we call modify object on each of them.

  • Optimistic locking (version check) - but if the object is modified much later this more and more likely requires operation restart.

Relativity and deltas
In theory, relativity of midPoint means that modifications do not rely on locking that much, but the single modify operation still needs to be atomic. When multiple deltas are applied to a single object, even if we don’t know or care what order they are applied, they must all be applied. It is not acceptable when only the last delta is applied to the original object. This does not mean that order of deltas is not relevant, only that all deltas must be respected. If the order is critical, then something above the repository must assure that deltas are executed in specified order.

In the end, the iterative search was introduced which loads objects using a filter, ideally in smaller batches, and uses provided handler to process each object. The handler may add it to report data or call possibly heavy-weighted modify operation on it. There are still many variables in this solution - how the transactions are organized, how the candidates (possibly millions) are iterated over (paginated), how the convergence is assured, etc.

The following solution is used for the Native repository:

  • Search uses paging based on strict OID ordering (on top of optionally provided business ordering).

  • Page is reasonably small (by default 100, can be configured) which makes the data resonably fresh.

  • Handling is called out of the reading transaction. This is good also for edge situation when the search uses last possible connection and then needs another one inside the handler. Of course this does not prevent handler from "needing" two connections and blocking because of it.

  • Even with handling running inside the search, even if cursor (live result set) was used, the data is still current only at the time of the execution, not during the cursor reading. There is no benefit from doing the work inside the transaction.

  • The problem with using some internal search for reading the whole page is that it requires more memory than executing the handler while reading the cursor. But with each page limited to small maximum size (100 by default) this is not an issue.

  • Another issue is that for complex selects the query execution can take prohibitively long to read just 100 objects. This can be prevented by avoiding complicated conditions and ordering and leaving just ordering by OID which is actually pretty fast, even if the table is scanned. Full table scan in the order of OID index can be perfectly fine if a lot of data matches the condition, in the opposite case usage of indexes is very likely.

  • Technically, cursor like reading is possible in new repository too, SQLQuery#getResults() returns normal JDBC ResultSet. To avoid blocking the connection and having long-running read-only transaction (should not be a big deal for read-only, but still) we still prefer the solution with handler running outside the search for each page when the connection is freed.

Possible inconsistencies

  • When object is handled it may have been possibly changed by some other process since read by the search iteration. While problem in theory, in practice the object is mostly fresh "enough" to compute the deltas and apply them. Applying new delta using modifyObject does not destroy any non-conflicting deltas applied in the meantime. This repository call corresponds with ModelService.executeChanges() on the model level or with functions midpoint.executeChanges() or midpoint.modifyObject() for script expressions.

  • If the fresh object is absolutely needed, one may use modifyObjectDynamically. This is new as of 4.4 and intended mostly for internal use - there is no direct alternative on the model level or a function available for script expressions.

  • There are inconsistencies between the iterations (pages) of the search. This is hardly a problem for iterative change executions, as we are concerned only with the consistency of each object. It can be problem for reports, however, but this must simply be accepted. If totally transactional reports are necessary, they must be preformed on SQL level and only the externalized data are available - this is beyond the topic of iterative search.

Developer notes

How to add a new field/column?

Example - let’s add administrativeOperationalStateAdministrativeAvailabilityStatus column for m_resource table.

This is an extremely long column name, because it is inside a single value container, its path is actually administrativeOperationalState/administrativeAvailabilityStatus. But because this is single value container, the property is mapped directly in the object table (from the repository perspective, we call this an "embedded" container), unlike, for instance, assignments that have a dedicated m_assignment table.

To follow this example, check this commit. I’m sorry about the superlong name, but this is the best example in a single commit.

  1. Let’s start in the SQL schema, locate the table m_resource and add administrativeOperationalStateAdministrativeAvailabilityStatus column to it. Consider logical order of the columns - even though it’s not relevant for upgraded DB, it is relevant for future readability. Notice, how columns for items from subcontainers are grouped in existing tables. If the column requires a comment - like this one to clarify the actual path - add it before the column line.

    Also, we name the columns "Java style", although Postgres doesn’t care aboute casing. The same name will be used in so called "M-class" (MResource in our example). Sometimes the name contains container name prefix, this is not used for metadata, activation and similar common containers - even these are still pure camel-case names.

  2. What type is it? Simple things are easy (TEXT, INTEGER), some items require multiple columns (references, poly-strings), follow the patterns used previously for such cases. Our administrativeOperationalStateAdministrativeAvailabilityStatus is enum type - great! Read the next bullet how to do that. Some multi-value types can be represented by arrays like TEXT[] or JSONB columns. If we’re talking about whole new multi-value container, whoa-whoa…​ that’s out of scope of this humble instruction sheet!

  3. If the type is previously unused enum, we’ve got just a bit more work to do. First we need to add CREATE TYPE for it to the schema. Find the section with custom types, read its intro comment and add the enum specification. Very likely it will be "schema enum", not specialized repository internal enum - that’s typical. Use the class name for the custom type to make things as obvious as possible, so it’s AdministrativeAvailabilityStatusType for our case. Copy/paste the values to avoid any mistake…​ other than copy/paste error, that is.

    Don’t forget to mention this enum in the SqaleRepoContext constructor. Otherwise, you will later get an error saying Can’t infer the SQL type to use for an instance of …​AdministrativeAvailabilityStatusType. Keep the types in alphabetic order, please, thank you.

  4. Let’s change the "M-class", MResource in our example. Simply add public field for the column, like public AdministrativeAvailabilityStatusType administrativeOperationalStateAdministrativeAvailabilityStatus. Keep the order consistent with the table. BTW, "M" does not stand for "mapping", we will see mapping class later.

  5. Now it’s great time to update SqaleRepoAddDeleteObjectTest! Find the method testing mapping for this class and add value and assert for the new item. Feel free to add the method if this entity/object type is not yet tested - just like I did in another example. Run the test, it should fail for the new attribute, which is a good sign.

  6. We need to declare the new column in a "Q-class" which extends from Querydsl type hierarchy. Technically it’s mapping for Querydsl, but it’s still not "our" mapping for midPoint (soon, I promise!). For our example, it’s QResource - and there are two sections:

    1. First, static column metadata, find good example from other class if necessary. In our case, I’ll add:

      public static final ColumnMetadata ADMINISTRATIVE_OPERATIONAL_STATE_ADMINISTRATIVE_AVAILABILITY_STATUS =
        ColumnMetadata.named("administrativeOperationalStateAdministrativeAvailabilityStatus")
          .ofType(Types.OTHER);

      For enum types we use Types.OTHER, again, see existing examples from other classes for your type.

    2. Next, we add non-static column (or attribute) path:

      public final EnumPath<AdministrativeAvailabilityStatusType> administrativeOperationalStateAdministrativeAvailabilityStatus =
        createEnum("administrativeOperationalStateAdministrativeAvailabilityStatus",
          AdministrativeAvailabilityStatusType.class,
          ADMINISTRATIVE_OPERATIONAL_STATE_ADMINISTRATIVE_AVAILABILITY_STATUS);

      The name of the column (the same as the name of the field in M-class) appears twice here, because we want the same name again for Q-class paths. Previously specified column metadata are used. As before, see examples from other Q-classes to use the right create* method and path type.

    Keep the order consistent with SQL and M-class in both sections. Good, now Querydsl knows what to do with our field in the M-class.

  7. Now it’s time to add the insert code. Finally, we’re getting to the "mapping class" - QResourceMapping in our case. Locate toRowObjectWithoutFullObject and add something like this there:

    row.xxx = node.getXxx();

    Now a bit of explanation is needed, because other actual example is a bit more complicated:

    var administrativeOperationalState = schemaObject.getAdministrativeOperationalState();
    if (administrativeOperationalState != null) {
      row.administrativeOperationalStateAdministrativeAvailabilityStatus =
        administrativeOperationalState.getAdministrativeAvailabilityStatus();
    }

    This is caused by the nesting of the property inside the container, as we meantioned at the beginning. Properties directly on the object are a single liner as indicated above.

    As always, follow the order from SQL and M-class. The code for enum and many other types is as trivial as shown above, but there is great support for refs, poly-strings and many more too - just find the examples in other Q-Mapping classes.

  8. Nearly there! We still need one more thing to support searching and modifications too. Go to the constructor of the mapping class (QResourceMapping for us) and add (respecting the right order again, of course!) something like this:

    addItemMapping(F_XXX, enumMapper(q -> q.xxx));

    Now, again, because our propety is nested inside another container, our example gets a bit more complicated, but we have plenty of support for this as well:

    addNestedMapping(F_ADMINISTRATIVE_OPERATIONAL_STATE, AdministrativeOperationalStateType.class)
      .addItemMapping(AdministrativeOperationalStateType.F_ADMINISTRATIVE_AVAILABILITY_STATUS,
        enumMapper(q -> q.administrativeOperationalStateAdministrativeAvailabilityStatus));

    I mean, seriously, can it be more auto-magical than this?

    Just be sure to use:

    1. the right item names: F_ADMINISTRATIVE_OPERATIONAL_STATE imported statically from ResourceType, but also for the item from the nested container;

    2. proper mapper method depnding on the type: enumMapper in this case

    3. and proper path q.administrativeOperationalStateAdministrativeAvailabilityStatus, which is that final non-static field we added on the Q-class.

  9. Write the search test.

    Often, when the mapping is for a well-working type, we don’t bother adding a test for it - although in that case it would be nice to try it at least once from Query playground.

    But this time we did - look at test990…​ in SqaleRepoSearchTest for our example. Even the search that returns nothing is better than none, because it executes the query and checks the mapping. But it’s easy to prepare new data at the start of the class and actually return some data as well.

  10. And SQL alter script, of course! Finally, we need to prepare SQL command for upgrade script postgres-new-upgrade.sql. Prepare the right ALTER command and test it on an existing database without the change. Wrap the change inside apply_change call - use the examples already available in the file. Don’t forget to bump the change identifier for each chagne. Revert the change and test it again by calling the apply_change procedure from the upgrade script. Check that the change was applied and also that subsequent call skips the change.

    Each change block is commited automatically - do not use explicit commit there. If you need to create a new type (enumeration, just like in this example) do this in a separate change, because this change must be committed before it can be used. See the existing examples in the upgrade script - there definitely are some.
  11. Update the last indicated change in the main SQL scipt! This is the last line in postgres-new.sql saying call apply_change…​. The number here should match the last used change identifier in the upgrade script. This is important to avoid applying any change more than once.

To see the whole success story, check this commit.

How to add a new persisted type?

Example - let’s add persistence for MessageTemplateType. This is a simple example, only minimal m_message_template table will be added, without special persistence for its containers, because these are not searchable (full text can be used for that, if required).

  1. New persisted object type must be declared. Add the new value MESSAGE_TEMPLATE to the ObjectType enum in both postgres-new.sql and postgres-new-upgrade-audit.sql. This value will be later added in the MObjectType Java enum, but that requires some other classes we don’t have yet.

  2. Also, prepare and test upgrade command to the same effect just above the bottom comment in both postgres-new-upgrade.sql and postgres-new-upgrade-audit.sql. Adding the type value in both files is critical for cases when audit is used in a separate schema/database. We will wrap this inside apply_change (or apply_audit_change for audit SQL) so it is executed only if not applied yet. We will write it in an idempotent fashion (IF NOT EXISTS), not only because we can and it is more flexible, but because, again, it is critical, this time for cases when audit and main repo share the same database and both upgrade scripts are run for the same schema:

    -- MID-7484, always add some relevant comment or a related Jira issue
    -- We add the new enum value in separate change, because it must be committed before it is used.
    call apply_change(2, $aa$
    ALTER TYPE ObjectType ADD VALUE IF NOT EXISTS 'MESSAGE_TEMPLATE' AFTER 'LOOKUP_TABLE';
    $aa$);

    As described in the comment, we make this a separate change, because the apply_change procedure manages the commits and we can’t use COMMIT inside the change block.

  3. Continue with the table definition in postgres-new.sql. Again, this one is very simple, we can copy/paste and carefully modify existing simple tables like m_dashboard for instance. Use search to be sure all m_dashboard strings are replaced with m_message_template as appropriate, especially for the trigger and index names. Replace the object type values in the objectType column definition.

    Place the tables in the proper section (region in IDEA). This can be "OTHER object tables", but we will create new region in this case, as we plan more Notification related tables in the future. Example for the complete table with triggers and indexes can be seen here.

  4. Add the same creation statements in the upgrade script postgres-new-upgrade.sql just after the ALTER from the previous step. You can use a single apply_change block for it, unless commit is required after any of the statements. As always, use a new change number (first parameter) for each change.

  5. After all changes in SQL upgrade files, update the last lines in the main and audit SQL files containing the call apply_[audit_]change calls - just repeat the last change number there. Be careful not to mix the numbers between the main and audit SQL scripts.

  6. Now it’s time to add the mapping Java classes to the native repository module repo-sqale. For a minimal mapping example see the classes in the package named com.evolveum.midpoint.repo.sqale.qmodel.node. Typically, three classes are needed:

    1. Plain bean representing the row in the database - so called "M-class", e.g. MNode.

    2. Querydsl mapping for the table, which is traditionally prefixed with Q, e.g. QNode.

    3. Finally, midPoint specific mapping which is named like the "Q-class", but with Mapping suffix, e.g. QNodeMapping.

    In our case, the situation is even simpler, because there is no additional column mapping for our new table. We can use MObject as our M-bean, this mapping is similar to the mapping for m_dashboard, so we will copy and paste these classes. After replacing all the "dashboards" with "messageTemplates" (including the proper table names) we are done with the mapping classes.

    Obviously, mapping for more complicated objects with sub-containers stored in dedicated tables is more complicated. Existing classes should be used as examples; also see the section about adding the new column for tips about column vs Java types.

  7. With the Java mapping classes ready, we can add the value inside MObjectType. Respectfully, we will conform to the alphabetical order and add the following code after the lookup table line:

    MESSAGE_TEMPLATE(QMessageTemplate.class, MessageTemplateType.class),
  8. New mapping for object must be also registered in SqaleRepositoryBeanConfig.sqlRepoContext() method. Find the right place (again, typically alphabetically) and add the following line:

    .register(MessageTemplateType.COMPLEX_TYPE, QMessageTemplateMapping.init(repositoryContext))
  9. Finally, we will add a simple add (insert) test for the class. This proves that the midPoint can add objects of this type. Because there are no columns specific to this type, we simply test that the row is added, as we presume we can trust the existing code handling the super-type columns (already tested). In more interesting cases you may also add a search test into SqaleRepoSearchTest, etc.

The whole example can be found in this commit. Forgotten object type update for audit is in this commit.

Possible future improvements

Retry mechanism (if/where necessary) may not be necessary if tests don’t show that. SELECT …​ FOR UPDATE seems to do its work just fine.

Minor, tentative, questinable, don’t implement while it’s in this list:

  • Dereferencing (@) for extension refs stored in JSONB.

  • Grouping is not supported and the semantics for SQL without ORM is unclear. This is not used by midPoint itself and there are no tests for it.

  • Arbitrary ordering for iterative search - currently only one path is supported. Fixes for NULL values are needed - TODO in the code. Multi-path ordering throws - this can probably wait a bit.

This is a mix of potentially handy queries, experiments and some PG SQL showing-off. It is in this "devel" section as it is the least bad place for this content.

Playing with JSON inside fullobject

This is possible, but not efficient on large datasets and not recommended in production. This also assumes JSON is used as a serialization format (by default it is).

-- Full object is often readable in some clients, but not in psql (it's just byte array).
select fullobject from m_user
where oid = '00000000-0000-0000-0000-000000000002';

-- Conversion to text helps, now it works in psql too.
select convert_from(fullobject, 'UTF8') from m_user
where oid = '00000000-0000-0000-0000-000000000002';

-- Playing with JSON inside (assuming it's JSON and not XML).
-- BAD: Converts JSONB fullobject into record, but there is only the top-level one (one row).
-- Also, using jsonb_each in select, we loose (key, value) structure of the record.
select jsonb_each(convert_from(fullobject, 'UTF8')::jsonb) from m_user
where oid = '00000000-0000-0000-0000-000000000002';

-- GOOD: jsonb_each is in from clause, it's structured now, but we still have just one row.
select jrec, pg_typeof(jrec), pg_typeof(jrec.value), jrec.*
from m_object, jsonb_each(convert_from(fullobject, 'UTF8')::jsonb) jrec
where oid = '00000000-0000-0000-0000-000000000002';

-- 2nd level jsonb_each must again be in FROM to preserve (key, value) structure.
-- Now we see the top level structure of the actual object, whatever it's top level key was.
select jrec.*
from (select value jval
        from m_object, jsonb_each(convert_from(fullobject, 'UTF8')::jsonb)
        where oid = '00000000-0000-0000-0000-000000000002'
    ) a,
    jsonb_each(jval) jrec;

-- It's easy to extract exact attribute from known object type (here user).
-- But this is not optimal, also the top level key can be anything (type is
select oid, convert_from(fullobject, 'UTF8')::jsonb->'user'->'indestructible' indestructible
from m_user;

-- To do it from any type (skipping the first level), try this:
select oid,
    (select value jval
        from jsonb_each(convert_from(fullobject, 'UTF8')::jsonb))
            ->'indestructible' indestructible
from m_object;

-- Wrap that with an outer select to extract multiple items from that JSON:
select oid, jval->'indestructible' indestructible, jval->'iteration' iteration
from (
    select oid, (select value jval from jsonb_each(convert_from(fullobject, 'UTF8')::jsonb))
    from m_object
) a;
-- show OID and full object preview as string (works in plain psql too)
select oid, objecttype, substring(convert_from(fullobject, 'UTF8'), 1, 100), pg_column_size(fullobject), length(fullobject)
from m_object
-- possible conditions here, e.g. oid = '...'
limit 10
;

-- showing all extension values exploded to rows (including multi-val)
select oid, key, coalesce(aval, sval) val from (
    select oid, key,
        case when jsonb_typeof(value) = 'array' then value end avals,
        case when jsonb_typeof(value) <> 'array' then value end sval
    from m_user, jsonb_each(ext) fields
) x left join jsonb_array_elements(avals) aval on true
where oid = '0cbe39c7-c7af-4cf3-a334-098400284a0a'
-- other conditions possible, but let's not run it on the whole table or order by ext values
;

-- finding all objects with multi-value extensions
with mvkeys as (select id::TEXT mvids from m_ext_item where cardinality = 'ARRAY')
select ARRAY(SELECT jsonb_object_keys(ext) INTERSECT select mvids from mvkeys) mvkeys,
       ext, * from m_object
where array(select jsonb_object_keys(ext)) && (select array_agg(mvids) from mvkeys);

Handy audit queries

-- Finding audit events with any of the specified changed items
-- (provided array can contain also non-existent values):
select * from ma_audit_event
  -- && returns true if the intersection of the two arrays is not empty
  where changeditempaths && array['\${common}3#familyName', '\${common}3#fullName'];

-- Finding audit events with all of the specified changed items
-- (event can have additional items):
select * from ma_audit_event
  -- @> returns true if the array on the right is superset of the array on the left
  where changeditempaths @> array['\${common}3#familyName', '\${common}3#fullName'];

-- Unwrapping changeditempaths to separate rows and removing the common prefix:
select replace(changedItem, '\${common}3#', ''), *
from ma_audit_event, unnest(changeditempaths) as changedItem
-- this allows for simpler conditions on a each value:
where changedItem in ('\${common}3#familyName', '\${common}3#fullName')
-- and also allows ordering by the changedItem
order by id, changedItem