Dead Shadows

Last modified 23 Mar 2022 13:54 +01:00

Introduction

Ordinary shadow objects in midPoint represent existing resource-side objects, such as accounts and groups. However, sometimes there is a need to represent non-existing objects, such as accounts that were recently deleted. We refer to such objects as dead shadows.

Dead shadows are very similar to ordinary "live" shadows. They have identifiers, metadata, resource reference, kind, intent and all the other details. The data are mostly "frozen" in time at the moment that the corresponding resource object (e.g. account) was deleted. Dead shadows have the boolean flag dead set to true, which clearly distinguishes them from live shadows.

Dead shadows are usually kept in midPoint for some time. There are several reasons for this behavior:

  • Diagnostics. Dead shadows contain information about the operation that deleted the resource object. If we simply deleted the shadow, there would be no place to store that information. Dead shadows give the administrator a chance to diagnose problems with delete operations.

  • Pending operations: manual and semi-manual resources. Manual resources are very slow. It may take days for an operation to complete. The state of the operation is usually kept in the shadow. However, this is tricky to do for delete operations. Once the delete operation is completed, the shadow should be gone. However, for semi-manual resources, even if the delete operation is completed, the deleted account will still be present in the CSV export for hours or even days. We need to remember that the account was deleted recently. Otherwise midPoint could conclude that the account was re-created in the meantime and re-issue the command to delete it. Dead shadows are kept to make sure that this does not happen.

  • Consistency mechanism. MidPoint may discover that a resource object has disappeared. For example, midPoint tries to read an account linked to a user and discovers that the account was deleted in the meantime. MidPoint will usually try to re-create the account. However, a new account means a new shadow, as internal identifiers (e.g. entryUUID or GUID) may be different for the re-created account. The new account does not contain any information about the old account. However, the old information may be useful, e.g. to investigate why the original account disappeared. Hence the old information is kept in the dead shadow.

  • Handling weird delete-and-recreate situations. The midPoint provisioning subsystem can be quite fast, perhaps a bit too fast in some situations. An incorrectly configured midPoint could try to delete an account and immediately re-create it. This kind of problem is usually very difficult to diagnose, as the resulting situation looks quite normal. However, dead shadows are evidence that something strange happened.

  • Future potential. MidPoint has to be prepared to co-exist with many kinds of resources, from well-behaved centralized databases to all kinds of weird distributed monstrosities that have never heard about the CAP theorem. There may be temporary fluctuations in data. For example, in the case of a distributed database, an account may be successfully deleted on one node. However, the following read operation may be directed to a different node, to which the previous delete operation has not been propagated yet. In that case midPoint might naively think that the account was re-created, while the truth is that it has received outdated data. Dead shadows may be helpful in detecting such situations. MidPoint might be able to realize that this information is outdated and resolve the situation. Such functionality is not present in midPoint yet, but it may come in the future.

Managing Dead Shadows

Dead shadows are automatically cleaned up (deleted) after a few days.

Dead shadows are quite useful, and they usually do not pose any performance problem. Therefore, in the common case, it is best to keep the dead shadow retention period at its default setting. However, if there is a need to completely disable the dead shadow functionality, setting the deadShadowRetentionPeriod resource setting to zero will do the trick:

<resource>
    ...
    <consistency>
        <deadShadowRetentionPeriod>PT0S</deadShadowRetentionPeriod>
    </consistency>
    ...
</resource>
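
The effect of the retention period can be illustrated with a minimal sketch. It is not midPoint's cleanup code: the function name, the idea of comparing against the time the shadow died, and the seven-day value are assumptions made only for illustration. It shows why a zero period (PT0S) amounts to not keeping dead shadows at all.

```python
from datetime import datetime, timedelta, timezone


def is_eligible_for_cleanup(died_at: datetime, retention: timedelta, now: datetime) -> bool:
    """Return True once a dead shadow has outlived its retention period.

    Hypothetical helper: midPoint's real cleanup logic is not described here.
    """
    # With a zero retention period the shadow is eligible immediately,
    # so in effect no dead shadow is kept.
    return now - died_at >= retention


died_at = datetime(2022, 3, 23, 12, 0, tzinfo=timezone.utc)
now = died_at + timedelta(hours=1)

# PT0S: dead shadow functionality effectively disabled.
print(is_eligible_for_cleanup(died_at, timedelta(0), now))
# Hypothetical non-zero retention of seven days: the shadow is still kept.
print(is_eligible_for_cleanup(died_at, timedelta(days=7), now))
```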

Implementation Notes

  • There must be at most one "live" shadow for each resource object (account). There may be any number of dead shadows.

  • Shadows are never reused.

    • Once a shadow is dead, it stays dead. There is no way to resurrect it.

    • There can be only one ADD operation for a shadow and one DELETE operation.

    • If a deleted account gets re-created, a new shadow will be created. There is a reason for that: the new account really is a new account. It may have a different (primary) identifier, e.g. an entryUUID generated by the resource. Therefore creating a new shadow is more than justified.

  • TODO: refresh operation, get operation, forceRefresh option, refreshOnRead
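
The rules above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: Shadow, ShadowRegistry, and the use of an account name to tie shadows to one resource object are hypothetical and do not correspond to midPoint classes.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Optional


@dataclass
class Shadow:
    oid: int          # repository identifier of the shadow (hypothetical)
    name: str         # stable name of the resource object, e.g. account login
    primary_id: str   # identifier generated by the resource, e.g. entryUUID
    dead: bool = False


class ShadowRegistry:
    """Toy registry: at most one live shadow per name, any number of dead ones."""

    def __init__(self) -> None:
        self._shadows: List[Shadow] = []
        self._oids = count(1)

    def live_shadow(self, name: str) -> Optional[Shadow]:
        for shadow in self._shadows:
            if shadow.name == name and not shadow.dead:
                return shadow
        return None

    def dead_shadows(self, name: str) -> List[Shadow]:
        return [s for s in self._shadows if s.name == name and s.dead]

    def add(self, name: str, primary_id: str) -> Shadow:
        # Enforce the "at most one live shadow" rule.
        if self.live_shadow(name) is not None:
            raise ValueError(f"live shadow already exists for {name!r}")
        # Shadows are never reused: every ADD produces a brand new shadow.
        shadow = Shadow(next(self._oids), name, primary_id)
        self._shadows.append(shadow)
        return shadow

    def mark_dead(self, shadow: Shadow) -> None:
        # One-way transition; there is deliberately no method to revive a shadow.
        shadow.dead = True


registry = ShadowRegistry()
first = registry.add("jack", "uuid-1")
registry.mark_dead(first)
second = registry.add("jack", "uuid-2")  # re-created account gets a new shadow
print(first.oid != second.oid, len(registry.dead_shadows("jack")))
```

The re-created account carries a different primary identifier, so keeping the old shadow dead and creating a new one avoids mixing the two histories.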

Shadow Lifecycle and Flags

There are two flags that define the shadow lifecycle: dead and exists. They specify midPoint’s best knowledge about the state of the resource. In this case it is the state of the resource itself, not of its "snapshot" (used in semi-manual resources).

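The following states are distinguished. For each state the list gives the shadow properties (shadow/dead, shadow/exists, shadow/lifecycleState, shadow/pendingOperation), the resource/connector operation (e.g. ticket, async operation), the presence of the object in the resource database and in the resource snapshot (CSV export), a description and any notable transitions.

  • Proposed

    • shadow/dead: false

    • shadow/exists: false

    • shadow/lifecycleState: proposed

    • shadow/pendingOperation: requested operation (ADD), status REQUESTED

    • Resource/connector operation: no operation

    • Resource database: not present

    • Resource snapshot (CSV export): not present

    • Description: The operation is requested, but it has not started yet. We are processing the request. This state is used mostly to detect uniqueness conflicts (to "reserve" identifiers).

  • Conception

    • shadow/dead: false

    • shadow/exists: false

    • shadow/lifecycleState: active

    • shadow/pendingOperation: requested operation (ADD), status EXECUTION_PENDING or EXECUTING

    • Resource/connector operation: ADD operation "open" (executing)

    • Resource database: not present (but being created)

    • Resource snapshot (CSV export): not present

    • Description: The signal to create the account was sent. It is being executed.

  • Gestation

    • shadow/dead: false

    • shadow/exists: true

    • shadow/lifecycleState: active

    • shadow/pendingOperation: requested operation (ADD), status COMPLETED/SUCCESS, operation in its grace period

    • Resource/connector operation: ADD operation "closed" (completed)

    • Resource database: present (most likely)

    • Resource snapshot (CSV export): not present

    • Description: This is a "quantum" state: the shadow is alive, but not yet alive at the same time. It probably already exists in the resource (hence exists=true), but it does not exist in the snapshot yet. Gestating shadows will not appear in resource searches. This should not be a problem for reconciliation, because they should be linked and they will be processed by reconciliation anyway.

    • Notable transitions: If the ADD operation failed, the shadow should instantly become a tombstone.

  • Life

    • shadow/dead: false

    • shadow/exists: true

    • shadow/lifecycleState: active

    • shadow/pendingOperation: no ADD operation (or the operation is over its grace period). There may be MODIFY operations.

    • Resource/connector operation: no operation, or modify operations only

    • Resource database: present

    • Resource snapshot (CSV export): present

    • Description: Normal state. The shadow exists. Everything works as expected. No quantum effects. No controversies.

    • Notable transitions: If the object is not present in the snapshot, the shadow becomes a tombstone.

  • Reaping

    • shadow/dead: false

    • shadow/exists: true

    • shadow/lifecycleState: active

    • shadow/pendingOperation: requested operation (DELETE), status EXECUTION_PENDING or EXECUTING

    • Resource/connector operation: DELETE operation "open" (executing)

    • Resource database: present (but being deleted)

    • Resource snapshot (CSV export): present

    • Description: The signal to delete the account was sent. It is being executed.

  • Corpse

    • shadow/dead: true

    • shadow/exists: false

    • shadow/lifecycleState: active

    • shadow/pendingOperation: requested operation (DELETE), status COMPLETED/SUCCESS, operation in its grace period

    • Resource/connector operation: DELETE operation "closed" (completed)

    • Resource database: not present (most likely)

    • Resource snapshot (CSV export): present

    • Description: A.k.a. Schroedinger’s shadow. This is a "quantum" state: the shadow is dead, but it is also alive at the same time. It is probably already deleted in the resource (hence exists=false), but it still exists in the snapshot. Corpse shadows will appear in resource searches, even though they are marked as dead=true.

    • Notable transitions: TODO: what to do if the DELETE operation was a failure? Return to life? Or do we need a "zombie" state?

  • Tombstone

    • shadow/dead: true

    • shadow/exists: false

    • shadow/lifecycleState: active

    • shadow/pendingOperation: no operations, or only operations over their grace period

    • Resource/connector operation: no operation

    • Resource database: not present

    • Resource snapshot (CSV export): not present

    • Description: The shadow is dead. Nothing remains. No resource object, no object in the snapshot. Just this stone on a grave remains, and it will also expire eventually. Tombstone shadows will not appear in resource searches, because they do not exist on the resource. But they will work with get operations, and they can be searched with noFetch.

    • Notable transitions: This is the terminal state. The shadow stays dead. It cannot be resurrected.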

TODO: later (4.0?) we should get rid of those flags and replace them with a shadow lifecycle status … also combine in proposed shadow
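
The mapping from flags and operations to the state names listed above can be sketched as a single function. This is an illustrative reading of the state list, not midPoint code. The parameter names are assumptions: operation stands for the ADD or DELETE operation that is still executing or within its grace period (None otherwise), and completed says whether that operation has finished.

```python
from typing import Optional


def shadow_state(dead: bool, exists: bool, lifecycle_state: str = "active",
                 operation: Optional[str] = None, completed: bool = False) -> str:
    """Classify a shadow into one of the seven states from the list above.

    Hypothetical helper; `operation` is "ADD", "DELETE" or None, where None
    also covers operations that are already over their grace period.
    """
    if dead:
        if exists:
            # This combination does not occur in the state list.
            raise ValueError("dead=true with exists=true is not a described state")
        # Completed DELETE still in its grace period: the object may linger
        # in the snapshot, so the shadow is a corpse rather than a tombstone.
        if operation == "DELETE" and completed:
            return "corpse"
        return "tombstone"
    if not exists:
        # Not created on the resource yet.
        return "proposed" if lifecycle_state == "proposed" else "conception"
    if operation == "ADD" and completed:
        return "gestation"
    if operation == "DELETE" and not completed:
        return "reaping"
    return "life"


print(shadow_state(dead=False, exists=True, operation="ADD", completed=True))
print(shadow_state(dead=True, exists=False, operation="DELETE", completed=True))
print(shadow_state(dead=True, exists=False))
```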

Shadow Graveyard

Getting an object will always return a shadow if there is one, even if it is a tombstone. The ObjectNotFound exception is thrown only if there is nothing to return: no resource object and no shadow. Therefore clients cannot assume that the resource object exists just because the getObject() operation returns something. Clients should always check the shadow lifecycle flags (dead, exists).
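
A client-side check along these lines might look like the following sketch. It does not use the midPoint API: the dictionary-based repository, the get_object function and the ObjectNotFound class are stand-ins introduced only to show the order of checks.

```python
class ObjectNotFound(Exception):
    """Stand-in for the exception raised when there is neither object nor shadow."""


def get_object(repository: dict, oid: str) -> dict:
    # Returns whatever shadow is stored, dead or alive (hypothetical stand-in).
    try:
        return repository[oid]
    except KeyError:
        raise ObjectNotFound(oid) from None


def resource_object_exists(shadow: dict) -> bool:
    # A returned shadow alone proves nothing; both lifecycle flags must agree.
    return (not shadow["dead"]) and shadow["exists"]


repository = {
    "oid-live": {"name": "jack", "dead": False, "exists": True},
    "oid-tombstone": {"name": "will", "dead": True, "exists": False},
}

for oid in ("oid-live", "oid-tombstone", "oid-missing"):
    try:
        shadow = get_object(repository, oid)
    except ObjectNotFound:
        print(oid, "-> no shadow and no resource object")
    else:
        print(oid, "-> resource object exists:", resource_object_exists(shadow))
```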

TODO: cleanup of dead shadows. grace period, operation retention period, dead shadow retention period

Semi-Manual "Quantum" Cases (Schroedinger’s Shadow)

Somewhat special cases for semi-manual connectors:

  • Created account: the ticket is closed and the account has been created by an administrator in the target system. But the account is not yet in the exported snapshot (CSV), because the scheduled export has not refreshed the file yet. The create operation was successful, therefore the shadow should be alive. But the account is not yet in the snapshot, so reading from the "resource" will end up with an error, therefore the shadow should not be alive.

  • Deleted account: the ticket is closed and the account has been deleted by an administrator in the target system. But the account is still in the exported snapshot (CSV), because the scheduled export has not refreshed the file yet. The delete operation was successful, therefore we have a dead shadow for it. On the other hand, the account still exists in the snapshot, and a search over the snapshot will return it, therefore the shadow should not be dead. We have Schroedinger’s shadow here. A get operation will in fact fetch the data from the resource as long as we are in the grace period (normal dead shadows are not fetched when searching the resource). After the grace period the shadow becomes completely dead.

When we are searching through the resource, we are in fact searching through the CSV, and the account-that-should-be-dead-but-it-is-not-dead-yet will be part of the search results. In that case:

  • If there is a pending delete operation in the dead shadow, then we return the dead shadow, even if the account is still "alive" in the snapshot (CSV).

  • If there is no pending operation (or the operation is over its grace period), provisioning will stop playing Schroedinger here. The dead shadow will remain dead, and provisioning will create a new live shadow for the account. Discovery will run and all that usual stuff.
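
The two rules above can be sketched as a small decision function. It is only an illustration: the DeadShadow class, its delete_in_grace_period field and the returned action labels are hypothetical names, not part of midPoint.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DeadShadow:
    name: str
    # True while the completed DELETE operation is still within its grace period.
    delete_in_grace_period: bool


def resolve_snapshot_hit(dead_shadow: Optional[DeadShadow]) -> Tuple[str, Optional[DeadShadow]]:
    """Decide what a search returns for an account found in the CSV snapshot."""
    if dead_shadow is not None and dead_shadow.delete_in_grace_period:
        # The snapshot is probably just stale: report the dead shadow.
        return ("return-dead-shadow", dead_shadow)
    # No pending delete (or grace period over): the account is treated as new.
    # Any old dead shadow stays dead; a fresh live shadow is created.
    return ("create-live-shadow", None)


corpse = DeadShadow("jack", delete_in_grace_period=True)
tombstone = DeadShadow("jack", delete_in_grace_period=False)

print(resolve_snapshot_hit(corpse)[0])
print(resolve_snapshot_hit(tombstone)[0])
print(resolve_snapshot_hit(None)[0])
```

Keeping the decision tied to the grace period is what prevents a stale snapshot from triggering a repeated delete, while still letting a genuinely re-created account be discovered later.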