<objectTemplate xmlns="http://midpoint.evolveum.com/xml/ns/public/common/common-3"
                oid="74a2112a-0ecc-4c09-818a-1d9e234e8e6f">
    <name>person</name>
    <item>
        <ref>givenName</ref>
        <indexing>
            <normalization>
                <default>true</default>
                <steps>
                    <polyString/> <!-- (1) -->
                </steps>
            </normalization>
            <normalization>
                <steps>
                    <polyString> <!-- (2) -->
                        <order>1</order>
                    </polyString>
                    <prefix>
                        <order>2</order>
                        <length>3</length>
                    </prefix>
                </steps>
            </normalization>
        </indexing>
    </item>
    <item>
        <ref>familyName</ref>
        <indexing/> <!-- (3) -->
    </item>
    <item>
        <ref>costCenter</ref>
        <indexing>
            <normalization>
                <steps>
                    <none/> <!-- (4) -->
                </steps>
            </normalization>
        </indexing>
    </item>
</objectTemplate>
Custom Indexing
EXPERIMENTAL
This feature is experimental and is not intended for production use.
It is neither finished nor stable.
The implementation may contain bugs, the configuration may change at any moment without warning, and the feature may not work at all.
Use it at your own risk.
This feature is not covered by midPoint support.
If you are interested in supporting the development of this feature, please consider purchasing a midPoint Platform subscription.
Since 4.6
This functionality is available since version 4.6.
Sometimes a search needs to be based on specially indexed data. For example, we may need to match only the first five normalized characters of the surname, or to take only the digits into account when searching by national ID. MidPoint supports these requirements using custom indexing.
This feature is available only when using the native repository implementation.
Overview
For each focus object (for example, a user), we have a special searchable container for all data that are indexed in this way. Each time the original data are modified, the content of this container is updated.
This feature can be used to search for:
- data normalized in a custom way, e.g. "take the first five characters of the surname",
- data that are not indexed by default, e.g. the description property,
- multi-source data.
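The two normalizations mentioned in the introduction (first five normalized characters of the surname, digits only for a national ID) can be illustrated in plain Java. This is only a sketch: the class and method names are hypothetical, and `normalize` is a rough approximation of a PolyString-style normalization (lowercasing, removing diacritics, collapsing whitespace), not midPoint's actual implementation.

```java
import java.text.Normalizer;
import java.util.Locale;

/** Illustrative only: plain-Java stand-ins for custom normalizations. */
class CustomNormalizationSketch {

    /** Rough approximation of a PolyString-style normalization (an assumption, not midPoint code). */
    static String normalize(String value) {
        String decomposed = Normalizer.normalize(value, Normalizer.Form.NFKD);
        return decomposed
                .replaceAll("\\p{M}", "")   // drop combining marks (diacritics)
                .toLowerCase(Locale.ROOT)
                .trim()
                .replaceAll("\\s+", " ");   // collapse runs of whitespace
    }

    /** Keeps the first five characters of the normalized surname (or fewer if it is shorter). */
    static String surnamePrefix(String surname) {
        String normalized = normalize(surname);
        return normalized.substring(0, Math.min(5, normalized.length()));
    }

    /** Keeps only the ASCII digits of a national ID. */
    static String digitsOnly(String nationalId) {
        return nationalId.replaceAll("[^0-9]", "");
    }

    public static void main(String[] args) {
        System.out.println(surnamePrefix("Nov\u00e1kov\u00e1"));
        System.out.println(digitsOnly("850101/1234"));
    }
}
```

Two values that differ only in case, diacritics, or separators then produce the same indexed value, which is what makes them searchable as equal.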
Implementation
The container that stores the indexed data is identities/normalizedData. For each indexing (normalization) defined on a given item, it contains the value or values of that item (or items, in the multi-identity case) after the normalization has been applied.
An Example
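# | Item | Name | Description |
---|---|---|---|
1 | givenName | polyStringNorm | Default system PolyString normalization |
2 | givenName | polyStringNorm.prefix3 | First three characters of the default system PolyString normalization |
3 | familyName | polyStringNorm | Default system PolyString normalization |
4 | costCenter | original | Original value (no normalization). |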
1 | PolyString normalization is the default one, and can be omitted. Here it is shown just for completeness. |
2 | However, at this place it must be present. Otherwise, we would take the first three characters of the original form. |
3 | This tells midPoint to index the familyName in the default way (PolyString normalization). |
4 | If one wants to preserve the original form, it must be explicitly specified like this. |
The original and normalized values on a real user object can then look like this:
<user>
    ...
    <givenName>Alice</givenName>
    <familyName>Black</familyName>
    <costCenter>CCx-1/100</costCenter>
    ...
    <identities>
        <normalizedData xmlns:gen370="http://midpoint.evolveum.com/xml/ns/public/common/normalized-data-3">
            <gen370:familyName.polyStringNorm xsi:type="xsd:string">black</gen370:familyName.polyStringNorm>
            <gen370:givenName.polyStringNorm xsi:type="xsd:string">alice</gen370:givenName.polyStringNorm>
            <gen370:givenName.polyStringNorm.prefix3 xsi:type="xsd:string">ali</gen370:givenName.polyStringNorm.prefix3>
            <gen370:costCenter.original xsi:type="xsd:string">CCx-1/100</gen370:costCenter.original>
        </normalizedData>
    </identities>
</user>
In the database, the normalized values are stored in a separate JSONB column: m_focus.normalizedData.
They are not part of m_object.fullObject.
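The naming pattern visible in the example (the item name followed by one suffix per normalization step) and the resulting values can be mimicked in plain Java. This is a sketch under assumptions: the `Step` class, the `index` helper, and the simplified lowercase-and-trim stand-in for the PolyString normalization are all hypothetical and are not midPoint code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

class NormalizedDataSketch {

    /** Hypothetical step: the suffix it contributes to the item name, and its transformation. */
    static final class Step {
        final String suffix;
        final UnaryOperator<String> function;

        Step(String suffix, UnaryOperator<String> function) {
            this.suffix = suffix;
            this.function = function;
        }
    }

    static final Step NONE = new Step("original", v -> v);
    // Simplified stand-in for the system PolyString normalization (assumption).
    static final Step POLY_STRING =
            new Step("polyStringNorm", v -> v.trim().toLowerCase(Locale.ROOT));

    static Step prefix(int length) {
        return new Step("prefix" + length, v -> v.substring(0, Math.min(length, v.length())));
    }

    /** Stores "item.suffix1.suffix2..." mapped to the value after all steps were applied in sequence. */
    static void index(Map<String, String> target, String item, String value, List<Step> steps) {
        StringBuilder name = new StringBuilder(item);
        String current = value;
        for (Step step : steps) {
            name.append('.').append(step.suffix);
            current = step.function.apply(current);
        }
        target.put(name.toString(), current);
    }

    /** Reproduces the four normalized values of the example user. */
    static Map<String, String> exampleUser() {
        Map<String, String> normalizedData = new LinkedHashMap<>();
        index(normalizedData, "givenName", "Alice", List.of(POLY_STRING));
        index(normalizedData, "givenName", "Alice", List.of(POLY_STRING, prefix(3)));
        index(normalizedData, "familyName", "Black", List.of(POLY_STRING));
        index(normalizedData, "costCenter", "CCx-1/100", List.of(NONE));
        return normalizedData;
    }

    public static void main(String[] args) {
        exampleUser().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

The sketch also shows why the second normalization of givenName needs the PolyString step before the prefix step: without it, the prefix would be taken from the original form of the value.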
Configuration Options
Custom indexing is configured in the object template by attaching indexing information to the item element.
(It is also turned on by default when the multi-source feature is enabled for the item.)
The following configuration options are available for each item:
Option | Description | Example |
---|---|---|
|
Local item name in the |
|
|
Set of normalizations that are applied to the given item. |
Default |
Each normalization is configured using these options:
Option | Description | Example |
---|---|---|
|
Name of the index (normalization). It is appended to the item name. Usually it can be left unspecified, because it is derived from the normalization step(s). |
|
|
Is this the default index (normalization) for the given item? It is necessary to specify it only if there is more than one normalization defined. |
|
|
Overrides the generated name for the indexed item (original item name + normalization name). Should not be normally needed. |
|
|
How is the indexed value computed?
The default is to use system-defined |
Use |
There are the following types of normalization steps:
Type | Description | Default normalized item name suffix |
---|---|---|
|
Does no normalization, i.e., keeps the original value intact. |
|
|
Applies system-defined or custom |
|
|
Takes first |
|
|
Applies a custom normalization expression (e.g., a Groovy script) to the value. |
|
Each normalization step has the following options:
Option | Applies to | Description |
---|---|---|
|
all steps |
Order in which the step is to be applied. It should be specified if there is more than one step, because current prism structures (containers) are not guaranteed to preserve the order of their values. Steps without an order value go last. |
|
all steps |
Technical documentation for the step. |
|
|
Configuration of |
|
|
How many characters to keep. |
|
|
Expression that transforms the value to its normalized form.
Expects |
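The ordering rule for steps (explicit order first, steps without an order value last) can be illustrated with a small Java sketch. The `ConfiguredStep` class and the step kind `customExample` are hypothetical names used only for this illustration; how midPoint sorts the steps internally is not shown here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class StepOrderSketch {

    /** Hypothetical holder for a configured step: its kind and its optional order. */
    static final class ConfiguredStep {
        final String kind;
        final Integer order; // null when no order is configured

        ConfiguredStep(String kind, Integer order) {
            this.kind = kind;
            this.order = order;
        }
    }

    /** Returns the step kinds sorted by explicit order; steps without an order go last. */
    static List<String> applicationOrder(List<ConfiguredStep> steps) {
        List<ConfiguredStep> sorted = new ArrayList<>(steps);
        // List.sort is stable, so steps without an order keep their relative position.
        sorted.sort(Comparator.comparing((ConfiguredStep s) -> s.order,
                Comparator.nullsLast(Comparator.naturalOrder())));
        List<String> kinds = new ArrayList<>();
        for (ConfiguredStep step : sorted) {
            kinds.add(step.kind);
        }
        return kinds;
    }

    public static void main(String[] args) {
        List<ConfiguredStep> configured = List.of(
                new ConfiguredStep("customExample", null),
                new ConfiguredStep("prefix", 2),
                new ConfiguredStep("polyString", 1));
        System.out.println(applicationOrder(configured));
    }
}
```

Because the stored order of container values is not guaranteed, relying on the position of the steps in the XML instead of an explicit order could silently change the result, for example by taking the prefix before the PolyString normalization.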
Querying
The values are queried just like any others. The only difference is that their definition is dynamic, hence e.g. in Java it must be constructed manually.
// The definition of a normalized item is dynamic, so it is built by hand here.
ItemName itemName = new ItemName(SchemaConstants.NS_NORMALIZED_DATA, "familyName.polyStringNorm");
var def = PrismContext.get().definitionFactory()
        .createPropertyDefinition(itemName, DOMUtil.XSD_STRING, null, null);
// Query users by identities/normalizedData/familyName.polyStringNorm.
ObjectQuery query = PrismContext.get().queryFor(UserType.class)
        .itemWithDef(def,
                UserType.F_IDENTITIES,
                FocusIdentitiesType.F_NORMALIZED_DATA,
                itemName)
        .eq("green")
        .build();
In the future, it should be possible to specify the queries also in Axiom query language or XML/JSON/YAML. However, there are some issues to be resolved.
- The definitions of normalized data are dynamic. Hence, such a query is not interpretable without knowing the archetype/object template of the objects in question. (It is very similar to searching by shadow attribute values; their definition is specified by resource object type.) Therefore, such a query should always be interpreted within the scope of an archetype.
- In 4.6, Axiom has issues with dots in names. These are used for normalized item names.
identities/normalizedData/familyName.polyStringNorm = "green"
Maintenance
The normalized data are maintained automatically by midPoint.
In the current implementation, it is the model subsystem that takes care of it.
This means that a careless "raw" update may break the consistency of the indexed data.
If this happens, or if the definition of the indexing changes, the administrator should execute any regular operation to bring the data back into sync. An example of such an operation is focus object recomputation.
We should consider finding (or creating) a special partial processing option that would do just this update without the overhead of the full recomputation.
Limitations
- This feature is available on the native repository only.
- Only string and PolyString values are currently indexable.
- One must be careful when editing the data in "raw" mode and when changing the indexing definition, see Maintenance section.
- The object template must be declared in the "new style" using an archetype (i.e., not in the "legacy way" in the system configuration).
Future Work
In 4.6, this feature is used only in the context of correlation. However, in theory, nothing precludes its use in more general scenarios. One of them could be, for example, searching for users directly in the user list in the GUI.