Shadow Caching

Last modified 14 Mar 2024 12:49 +01:00
Attribute caching feature
This page is an introduction to the Attribute caching feature of midPoint. Please see the feature page for more details.
Since 3.5
This functionality is available since version 3.5.

MidPoint usually works with fresh data. When midPoint needs data about an account, it retrieves the data on demand from the resource. This is usually the best method, but there are cases when it is problematic: some resources are often down, some are very slow, and others are costly or limited to access, typically when they run in the cloud. Therefore, midPoint can cache the values of resource objects and use them instead of retrieving them from the resource.

Moreover, there is another benefit of having the data cached: instant reporting. With the relevant parts of data being cached right in the repository, one can write reports against them, usually allowing for more complex queries and providing faster execution times.

Enabling Caching

Caching is turned off by default (TODO reconsider). It can be turned on in Resource Configuration:

Listing 1: Enabling the caching with the default configuration
<resource>
    ...
    <caching>
        <cachingStrategy>passive</cachingStrategy>
    </caching>
    ...
</resource>

The only supported caching method is passive caching: caches are maintained with minimal impact on normal operations. Generally, data are cached only when they are retrieved for other reasons. There is no read-ahead. Writes always go to the resource synchronously (read-through, write-through). There is no cache eviction, but old information is overwritten when newer information is available.

If caching is turned on, midPoint starts caching the data in Shadow Objects. The caches are built gradually as midPoint reads the objects. If you need to populate the caches, you have to run an operation that searches all the objects, e.g., reconciliation or a similar operation. A special cache-population task is being considered as well.
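As an illustration of populating the caches, a reconciliation task along the following lines could be imported. This is only a sketch under stated assumptions: it assumes the activity-based task syntax of midPoint 4.4 and later, the task name is made up, the owner reference is assumed to point to the default administrator user, and the resource OID is a hypothetical placeholder that must be replaced with the OID of your own resource.

```xml
<task xmlns="http://midpoint.evolveum.com/xml/ns/public/common/common-3">
    <!-- Hypothetical task name -->
    <name>Reconcile accounts to populate the shadow cache</name>
    <!-- Assumption: the default administrator user owns the task -->
    <ownerRef oid="00000000-0000-0000-0000-000000000002" type="UserType"/>
    <executionState>runnable</executionState>
    <activity>
        <work>
            <reconciliation>
                <resourceObjects>
                    <!-- Placeholder OID: replace with the OID of your resource -->
                    <resourceRef oid="11111111-2222-3333-4444-555555555555" type="ResourceType"/>
                    <kind>account</kind>
                    <intent>default</intent>
                </resourceObjects>
            </reconciliation>
        </work>
    </activity>
</task>
```

Because reconciliation reads every account of the given kind and intent, each shadow it touches should end up with cached data, provided that caching is enabled for that object type.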

Configuring Caching

The caching can be enabled or disabled at the level of a resource, object class, object type, or even an individual attribute or association.

An Example

For a quick overview, let us consider the following example. (The complete description is below.)

Listing 2: An example of custom caching configuration
<resource>
    ...
    <schemaHandling>
        <objectType>
            <kind>account</kind>
            <intent>default</intent>
            ...
            <attribute>
                <ref>ri:jpegPhoto</ref>
                <cached>false</cached> <!-- (4) -->
            </attribute>
            ...
            <caching>
                <cachingStrategy>passive</cachingStrategy> <!-- (1) -->
                <scope>
                    <attributes>all</attributes> <!-- (2) -->
                    <associations>none</associations> <!-- (3) -->
                </scope>
            </caching>
        </objectType>
    </schemaHandling>
    ...
    <caching>
        <cachingStrategy>none</cachingStrategy>
    </caching>
    ...
</resource>
(1) Enables caching for the account/default object type.
(2) Enables caching of all attributes.
(3) Disables caching of associations.
(4) Overrides the caching setting for a specific attribute.

In short, caching is disabled for the resource as a whole, except for the account/default object type, for which it is enabled: all attributes except jpegPhoto are cached, but no associations are.

Configuration Details

The caching configuration item can be provided at the level of a resource, object class, or object type definition. It has the following sub-items:

Table 1. The caching configuration item content

| Item | Description | Default |
|---|---|---|
| cachingStrategy | The overall switch that turns the caching on and off. It can have a value of none (no caching) and passive (passive caching as described above). | none (at least for now) |
| scope | Scope of the caching (see below). | see below |

Table 2. The scope configuration item content

| Item | Description | Default |
|---|---|---|
| attributes | Scope of the caching for attributes. | mapped |
| associations | Scope of the caching for associations. | all |
| activation | Scope of the caching for the activation information. The value of mapped is currently not supported here. (It is interpreted as all.) | all |

Table 3. The values of the scope configuration items

| Value | Description |
|---|---|
| none | No items of given kind will be cached. |
| mapped | Only mapped items of given kind, i.e., those that have any mapping defined right in the object type, either inbound or outbound, will be cached. |
| all | All items of given kind will be cached. |

Exceptions to the scope, both positive and negative, can be defined using the cached boolean property, which is available for individual attributes and associations.
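To make the interplay between the scope and the cached property more concrete, here is a minimal sketch of an object type that caches mapped attributes only, with one positive and one negative exception. The attribute names (ri:mail, ri:description, ri:jpegPhoto) and the inbound mapping targets are assumptions chosen for illustration only; they are not taken from any particular resource.

```xml
<objectType>
    <kind>account</kind>
    <intent>default</intent>
    <attribute>
        <!-- Mapped attribute: cached because the attribute scope is "mapped" -->
        <ref>ri:mail</ref>
        <inbound>
            <target>
                <path>emailAddress</path>
            </target>
        </inbound>
    </attribute>
    <attribute>
        <!-- Positive exception: no mapping here, but cached explicitly -->
        <ref>ri:description</ref>
        <cached>true</cached>
    </attribute>
    <attribute>
        <!-- Negative exception: mapped, but explicitly excluded from the cache -->
        <ref>ri:jpegPhoto</ref>
        <cached>false</cached>
        <inbound>
            <target>
                <path>jpegPhoto</path>
            </target>
        </inbound>
    </attribute>
    <caching>
        <cachingStrategy>passive</cachingStrategy>
        <scope>
            <attributes>mapped</attributes>
            <associations>none</associations>
            <activation>all</activation>
        </scope>
    </caching>
</objectType>
```

With this configuration, ri:mail and ri:description would be cached together with the activation information, while ri:jpegPhoto and all associations would not.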

Impact on Operations

The cached data are accessible through the usual IDM Model Interface. Two operation options provide access to the cached data:

  • noFetch option: This option returns the data from the midPoint repository. Therefore, if there are data cached in the repository, the noFetch option returns them.

  • staleness option: Specifies how stale or fresh the retrieved data must be, expressed as the maximum age of the value in milliseconds. The default value is zero, which means that a fresh value must always be returned, so caches that do not guarantee a fresh value cannot be used. If a non-zero value is specified, such caches may be used. If Long.MAX_VALUE is specified, the caches are always used and a fresh value is never retrieved.

Both options can be used to get cached data. The primary difference is that the noFetch option never goes to the resource and returns whatever data the repository has. The staleness option is smarter: it determines whether it has to go to the resource or not. If the "maximum" staleness option is used, the operation results in an error when cached data are not available.

These options can be used with both getObject and search operations. For getObject, the staleness option works as expected. There is one special consideration for search operations, though: they cannot easily determine how fresh the data in the repository are. For example, there may be new objects on the resource that are not yet in the repository. Therefore, to be on the safe side, search operations always search on the resource, even if the staleness option is specified. There is just one exception: the maximum staleness option forces a repository search. However, if the search discovers any object that does not have cached data, it results in an error (specified in the fetchResult property of the object).

Caching Metadata in Shadows

Shadow Objects contain the cachingMetadata property. This property can be used to determine whether the returned shadow represents fresh or cached data:

  • If no cachingMetadata property is present in the shadow, the data are fresh. They have just been retrieved from the resource.

  • If the cachingMetadata property is present, the data are taken from the cache. The cachingMetadata property specifies how fresh the data are (when they were originally retrieved).
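For illustration, a shadow returned from the cache might contain a fragment like the one below. This is a sketch only: the retrievalTimestamp sub-item name is an assumption about the content of cachingMetadata, the timestamp value is made up, and the rest of the shadow is elided.

```xml
<shadow>
    ...
    <cachingMetadata>
        <!-- Assumed sub-item: when the cached data were retrieved from the resource -->
        <retrievalTimestamp>2024-03-14T10:15:00.000+01:00</retrievalTimestamp>
    </cachingMetadata>
    ...
</shadow>
```

A client that receives such a shadow can compare the timestamp with the current time to decide whether the cached data are fresh enough for its purpose.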

Relation to the "Caching-Only" Read Capability

When the "caching only" read capability is present (e.g., for manual resources), full shadow caching is enabled by default. It can be turned off by setting cachingStrategy to none. The scope currently has no effect there, though. TODO reconsider

Limitations

TODO describe these

Migration Note

Before 4.9, this feature was experimental. The default setting was that all attributes and no associations were cached. Since 4.9, the default is to cache all mapped attributes and all defined associations.
