public static final ColumnMetadata OPERATIONAL_STATE = ColumnMetadata.named("operationalState").ofType(Types.OTHER);
These are based on RepositoryService interface and there should be no difference compared to the legacy repository. Each point contains design/implementation notes.
Repository stores midPoint objects (subtypes of
ObjectType), assigning their OID if missing. For this
fullObject(BYTEA) columns would be enough. However, this is not acceptable for
searchObjectsperformance see note there.
Repository assigns any missing container IDs (CID) for object’s containers. These IDs are unique inside a single object and assigned loosely sequentially. For this
m_object.cid_seqis stored in the database and it contains the next available CID. IDs are assigned only to multi-valued containers, e.g. object’s metadata does not need any CID.
Repository allows searching for objects and selected containers with reasonable performance. This is why objects stored as documents (
fullObject) is not enough. Various types are stored in specific sub-tables instead with values extracted to separate columns. There are other related tables, typically for containers and other multi-value properties. Not all values are by default searchable/indexable in this way and it must be possible to specify what other properties are indexed. These can be properties from the default schema but also from schema extensions.
Each object is a separate "aggregate" and any inter-object references are soft, not constraints on the database level. For example, multi-value container
roleMembershipRefis stored in
m_ref_role_membershiptable which has a foreign key to the common OID pool
m_object_oid(this is necessary for any FK pointing to object OID if multiple types are possible, FK to
m_objectdoes not work this way). But the
targetOidis not a FK as it points to another object.
Retry mechanism (if/where necessary) may not be necessary if tests don’t show that.
SELECT … FOR UPDATE seems to do its work just fine.
Minor, tentative, questinable, don’t implement while it’s in this list:
@) for extension refs stored in JSONB.
Grouping is not supported and the semantics for SQL without ORM is unclear. This is not used by midPoint itself and there are no tests for it.
Arbitrary ordering for iterative search - currently only one path is supported. Fixes for
NULLvalues are needed - TODO in the code. Multi-path ordering throws - this can probably wait a bit.
FullText filter (textInfo), including storage of the info
Modify object with reindex; implemented as full delete/add cycle, very reliable and potentially more efficient too.
Right-hand path support, e.g.
column = otherColumn, including other comparison operations. Minimal needed support implemented,
..not supported, but polystrings are.
Support for various get/search/modify options
useNoFetchExtensionValuesDeletionmay be obsolete and ignored by the new repository.
GetOperationOptions.iterationMethodis ignored by the new repository, don’t implement.
JPEG out of full-object (kinda "index-only") - Done
Index-only extension properties (
indexOnly=true) - these are probably stored correctly and can be filtered on, but they are not added to the obtained object (
They should be included only when asked for specifically, e.g. for shadow attribute
INCLUDE. - Done
Support for various get/search/modify options
modifyObjectDynamically()implementation - Transactional support introduced
Audit service (original functionality without new iterative search)
Audit service - iterative search, single order supported, just like in main repo.
Multi-node safety for
Modify Efficiency issues
Modify efficiency for distributed full-object (
AccessCertificationCampaign vs its case):
ADD modification for Case container of AccCertCampaign should not fetch other cases. Update of full object (but without cases), version, etc. works as usual.
DELETE modification for Case container of AccCertCampaign should not fetch any cases, only delete the case row (and update main full object).
REPLACE modification for
casedoes not need to read any cases, just delete all the rows.
Modification going inside any Case should fetch only the needed case before applying the modification.
This mechanism can be reused if full-object for assignments is split from object’s full-object
in the future (not sure about the correctness of the column name
Index-only attributes/extensions modify efficiency:
REPLACE modification does not require any fetch, it simply obliterates the previous value in the DB.
ADD/DELETE modification currently requires the fetch of the index-only property, which can be achieved by selective fetch of
ext→'id-of-index-only-property'. Modification is then applied to prism object as usual. Value in the DB is overwritten by the new complete value for the property (like for REPLACE).
In the future, ideally, this would not need any fetch access (except for
reindex) and issue fine-tuned SQL operations. This may also require more flexible storage options for extensions/attributes, e.g. promoting the index-only attribute to a separate array column or separate table even.
How to add a new field/column?
Example - let’s add
operationalState column for
Let’s start in the SQL schema, locate the table
operationalStatecolumn to it. Consider logical order of the columns - even though it’s not relevant for upgraded DB, it is relevant for future readability. Notice, how columns for items from subcontainers are grouped in existing tables. Also, we name the columns "Java style", although Postgres doesn’t care aboute casing. The same name will be used in so called "M-class" (
MNodein our example). Sometimes the name contains container name prefix, this is not used for metadata, activation and similar common containers - even these are still pure camel-case names.
What type is it? Simple things are easy (
INTEGER), some items require multiple columns (references, poly-strings), follow the patterns used previously for such cases. Our
operationalStateis enum type - great! Read the next bullet how to do that. Some multi-value types can be represented by arrays like
JSONBcolumns. If we’re talking about whole new multi-value container, whoa-whoa… that’s out of scope of this humble instruction sheet!
If the type is previously unused enum, we’ve got just a bit more work to do. First we need to add
CREATE TYPEfor it to the schema. Find the section with custom types, read its intro comment and add the enum specification. Very likely it will be "schema enum", not specialized repository internal enum - that’s typical. Use the class name for the custom type to make things as obvious as possible, so it’s
NodeOperationalStateTypefor our case. Copy/paste the values to avoid any mistake… other than copy/paste error, that is. Don’t forget to mention this enum in the
SqaleRepoContextconstructor. Alphabetic order, please!
Let’s change the "M-class",
MNodein our example. Simply add public field for the column, like
public NodeOperationalStateType operationalState. Keep the order consistent with the table. BTW, "M" does not stand for "mapping", we will see mapping class later.
Now it’s great time to update
SqaleRepoAddDeleteObjectTest! Find the method testing mapping for this class and add value and assert for the new item. Feel free to add the method if this entity/object type is not yet tested - just like I did. Run the test, it should fail for the new attribute, which is a good sign.
We need to declare the new column in a "Q-class" which extends from Querydsl type hierarchy. Technically it’s mapping for Querydsl, but it’s still not "our" mapping for midPoint (soon, I promise!). For our example, it’s
QNode- and there are two sections:
First, static column metadata, find good example from other class if necessary. In our case, I’ll add:
For enum types we use
Types.OTHER, again, see existing examples from other classes for your type.
Next, we add non-static column (or attribute) path:
public final EnumPath<NodeOperationalStateType> operationalState = createEnum("operationalState", NodeOperationalStateType.class, OPERATIONAL_STATE);
The name of the column (the same like the name of the field in M-class) appears twice here, because we want the same name again for Q-class paths. Previously specified column metadata are used. As before, see examples from other Q-classes to use the right
create*method and path type.
Keep the order consistent with SQL and M-class in both sections. Good, now Querydsl knows what to do with our field in the M-class.
Now it’s time to add the insert code. Finally, we’re getting to the "mapping class" -
QNodeMappingin our case. Locate
toRowObjectWithoutFullObjectand add the following line there:
row.operationalState = node.getOperationalState();
As always, follow the order from SQL and M-class. The code for enum and many other types is as trivial as shown above, but there is great support for refs, poly-strings and many more too - just find the examples in other Q-Mapping classes.
Nearly there! We still need one more thing to support searching and modifications too. Go to the constructor of the mapping class (
QNodeMappingfor us) and add (respecting the right order again, of course!):
addItemMapping(F_OPERATIONAL_STATE, enumMapper(q -> q.operationalState));
I mean, seriously, can it be any more auto-magical than this? It is possible to write test to this as well, but honestly, we don’t bother when adding a new mapping for well-working type. Just be sure to use the right item name (
F_OPERATIONAL_STATEimported statically from
NodeType), proper mapper method (
enumMapper) and proper path (
q.operationalState, which is that final non-static field we added on the Q-class).
And SQL alter script, of course! OK, the alter script is not yet in place (this is written before 4.4 is out), but find it and add the modification there using the newest fashionable method available. Hopefully, it’s something utilizing
apply_changeand - for your sake - same examples are already available.
To see the whole success story, check this commit (add-test was committed separately and linked previously).
Depending on the chosen collation ordering of text fields can be case-insensitive or not. To see it quickly, one can use select like this:
select * from (values ('a'), ('B'), ('Z'), ('x'), ('Č'), ('@'), ('_'), ('%'), ('^'), ('5'), ('47'), ('持')) as t(text) order by text -- collation "C"
collation "x" one can see various results for other collations.
en_US.UTF-8 is used during database creation (recommended by midPoint docs) the ordering
of the select above should be case-insensitive.
Collation used by default for each database can be determined like so:
select datname, datcollate from pg_database; -- returns among others: midpoint,en_US.UTF-8
Funny enough, this default collation name may not be a valid name for
clause but for
en_US.UTF-8 it seems to act the same as
collate "en_US.utf8" (or
The list of valid collation names can be obtained from
select * from pg_collation;
As explained in the collation documentation,
there is a default collation (
collprovider = 'd'), libc based collations (
d) and ICU collations
i) if compiled with it (PG 13 on Ubuntu contains these too).
|ICU collations can’t be used as database defaults.|
It is possible to create tables or single columns with non-default collation.
To list columns with different collation one can use this select (with or without
filter, select based on this answer):
select table_schema, table_name, column_name, collation_name from information_schema.columns where collation_name is not null and table_schema = 'public' order by table_schema, table_name, ordinal_position;
If query is used with custom
If case-sensitive locale is used when case-insensitive behavior is expected it may result in
The old repository uses