# Managing cluster nodes

This functionality is available since versions 4.0.2 and 4.1.

MidPoint can run in clustered mode with two or more nodes. Here we describe the most important parameters influencing how nodes are named and managed.

## Node identification (naming)

Each node in a cluster must have a unique node identifier (name).

### nodeId vs. nodeIdSource

There are two properties that can be used to set the node identifier:

| Property | Meaning | Placement | Alternative specification | Examples |
|---|---|---|---|---|
| `nodeId` | A constant value or an expression that yields the node name | `<midpoint>` section in `config.xml` | `-Dmidpoint.nodeId` command line parameter | `NodeA`, `${env:NodeID}` |
| `nodeIdSource` | Mechanism that is used to derive the node ID (obsolete) | `<midpoint>` section in `config.xml` | `-Dmidpoint.nodeIdSource` command line parameter | `hostname`, `random`, `sequence` |

The `nodeIdSource` was originally meant as a way to assign node identifiers without the need to specify them as constants. However, after `nodeId` started supporting expressions, `nodeIdSource` is now simply translated into `nodeId`.

The translation looks like this:

- If the `nodeIdSource` value contains ':' (e.g. it is `random:number:0:9999`), it is copied into `nodeId` by wrapping it in `${…}`. For example, `random:number:0:9999` becomes `${random:number:0:9999}`.
- If the `nodeIdSource` value does not contain ':' (e.g. it is `hostname`), it is copied into `nodeId` by wrapping it in `${…}` with a colon appended at the end. For example, `hostname` becomes `${hostname:}`.

So, let's deal with the syntax of `nodeId` only in the following discussion.

### Using nodeId property

Since 4.0.2/4.1, midPoint configuration properties support expressions in the form of `${variable}` or `${prefix:variable}`. The first form evaluates using a configuration option specified by `variable`. The second one is more general and supports the following prefixes:

| Prefix | Meaning | Example |
|---|---|---|
| `sys` | References given Java system properties. | `${sys:user.home}` |
| `env` | References given operating system environment variables. | `${env:ENVIRONMENT}` |
| `hostname` | References local host name as determined by midPoint. Note that the colon after hostname is obligatory. | `${hostname:}` |
| `random` | Generates random node ID. Full format is `${random:number:lower-limit:upper-limit}` but accepts also forms of `${random:}`, `${random:number}`, and `${random:number:upper-limit}`. Default values are lower limit = 0, upper limit = 999999999. Lower and upper limits are inclusive. | `${random:}` |
| `sequence` | Uses first available node ID in a given sequence. Full format is `${sequence:start:end:format}` but accepted forms are also `${sequence:}`, `${sequence:start}`, and `${sequence:start:end}`. Default values are: start = 0, end = 100, format = %d. | `${sequence:0:99:%02d}` |
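The `nodeIdSource` to `nodeId` translation rule described earlier in this section can be illustrated with a minimal Python sketch. The function name `node_id_from_source` is hypothetical and is not part of midPoint; the sketch only mirrors the two wrapping cases stated in the text.

```python
def node_id_from_source(node_id_source: str) -> str:
    """Translate a legacy nodeIdSource value into an equivalent nodeId expression.

    Illustrative only: mirrors the documented rule, not midPoint's actual code.
    """
    if ":" in node_id_source:
        # Value already carries a prefix and arguments: wrap it as-is.
        return "${" + node_id_source + "}"
    # Bare prefix such as "hostname": append the obligatory colon, then wrap.
    return "${" + node_id_source + ":}"


print(node_id_from_source("random:number:0:9999"))  # ${random:number:0:9999}
print(node_id_from_source("hostname"))              # ${hostname:}
```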

The `sequence` expression works like this:

1. A counter starts at the `start` value, incrementing by 1 up to (and including) the `end` value.

2. At each step, the node name is determined using the formatting string and other parts of the expression, and is checked for availability.

3. If such a node does not exist in the repository, the name is used. Technically speaking, the node name is allocated by creating the node in the repository. If the operation succeeds, the node is acquired. This is to avoid race conditions: only the first midPoint instance that successfully creates a node object can use this name.

4. If a node with the given name exists but is permanently down (this is determined by the `running` property being set to `false`), the name is used. This is implemented by removing the node object and then retrying the allocation attempt.

5. Names of nodes that are not marked as down but are not alive are not used here. This is to avoid using names of nodes that are e.g. currently booting, or temporarily unavailable. Please see the Node state management section below.

Note that the `sequence` expression can be combined with other expressions. For example, you can specify `nodeId` as `${env:ENVIRONMENT}-${sequence:0:99:%02d}`, yielding names like `Test-01`, `Test-02`, …, `QA-01`, `QA-02`, …, `Prod-01`, `Prod-02`, …
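The allocation steps of the `sequence` expression can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the repository is modeled as a plain dictionary mapping node names to their `running` flag, and the function `allocate_sequence_name` is hypothetical. In a real deployment the "create node" step is a repository operation that only one instance can win; a dictionary does not model that concurrency.

```python
from typing import Dict, Optional


def allocate_sequence_name(repo: Dict[str, bool], prefix: str = "",
                           start: int = 0, end: int = 100,
                           fmt: str = "%d") -> Optional[str]:
    """Return the first available node name, or None if the sequence is exhausted.

    `repo` is a hypothetical stand-in for the repository: name -> running flag.
    """
    for counter in range(start, end + 1):  # `end` is inclusive
        name = prefix + (fmt % counter)
        if name in repo and repo[name] is False:
            # Node is marked as permanently down: remove it, then retry.
            del repo[name]
        if name not in repo:
            # Creating the node object is what allocates the name.
            repo[name] = True
            return name
        # Otherwise the name belongs to a node not marked as down
        # (alive, booting, or temporarily unavailable), so it is skipped.
    return None


repo = {"Test-00": True, "Test-01": False}
print(allocate_sequence_name(repo, "Test-", 0, 99, "%02d"))  # Test-01
print(allocate_sequence_name(repo, "Test-", 0, 99, "%02d"))  # Test-02
```

Here the `prefix` argument plays the role of the already resolved `${env:ENVIRONMENT}-` part of a combined expression.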

## Node state management

A midPoint node is typically in one of the following states:

| State | Characterization |
|---|---|
| up and alive | The node regularly checks into the repository. Its `operationalState` property is `UP` and its `lastCheckInTime` is regularly updated (less than `nodeTimeout` ago). |
| up, but not checking in | There's an issue with this node. Its `operationalState` property is `UP` but its `lastCheckInTime` is older than `nodeTimeout` seconds. Nodes in this state are excluded from some operations, e.g. status querying or cache invalidation calls. |
| down | The node's `operationalState` property is `DOWN`. This typically occurs when the node is going down cleanly: it marks itself as down. If a node goes down abruptly (without having a chance to make this modification), other nodes watch its `lastCheckInTime`, and once it is older than `nodeAlivenessTimeout`, they mark the respective node as down by setting its `operationalState` property to `DOWN`. This check occurs every `nodeAlivenessCheckInterval` seconds. Nodes in this state are excluded from almost all operations. |
| starting | The node's `operationalState` property is `STARTING`. Nodes in this state are excluded from some operations, e.g. status querying or cache invalidation calls. |
| deleted | The node object no longer exists in the repository. The deletion can occur either manually or by the Cleanup task. The task deletes nodes whose `lastCheckInTime` is older than `deadNodes/maxAge`. |
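The state characterization above can be summarized in a small Python sketch. The function `classify` and its arguments are hypothetical names used only for illustration; passing `None` as the operational state stands for a node object that is missing from the repository. The treatment of a check-in that is exactly `nodeTimeout` old is an assumption, because the text does not specify the boundary.

```python
from datetime import datetime, timedelta
from typing import Optional


def classify(operational_state: Optional[str],
             last_check_in: Optional[datetime],
             now: datetime,
             node_timeout: timedelta = timedelta(seconds=30)) -> str:
    """Map a node's stored properties to one of the states described above."""
    if operational_state is None:
        return "deleted"  # no node object in the repository
    if operational_state == "DOWN":
        return "down"
    if operational_state == "STARTING":
        return "starting"
    # Remaining case is assumed to be operationalState == "UP".
    if last_check_in is not None and now - last_check_in < node_timeout:
        return "up and alive"
    return "up, but not checking in"


now = datetime(2024, 1, 1, 12, 0, 0)
print(classify("UP", now - timedelta(seconds=5), now))   # up and alive
print(classify("UP", now - timedelta(seconds=60), now))  # up, but not checking in
```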

Default parameter values:

| Parameter | Where it is | Description | Default value |
|---|---|---|---|
| `nodeTimeout` | `taskManager` section of `config.xml` | When to start considering node as not checked in. | 30 seconds |
| `nodeAlivenessTimeout` | `taskManager` section of `config.xml` | When to start considering node as being down. | 900 seconds |
| `nodeAlivenessCheckInterval` | `taskManager` section of `config.xml` | How often is the node aliveness check carried out. | 120 seconds |
| `nodeStartupTimeout` | `taskManager` section of `config.xml` | When to start reporting node as starting too long. | 900 seconds |
| `deadNodes/maxAge` | cleanup policy e.g. in the system configuration object | After what not-checked-in time should the node be deleted. | none |

The nodes are not cleaned up by default. If you'd like to enable this feature, you can set `deadNodes/maxAge` to e.g. 1 day. Note that the cleanup task runs once per day by default. However, you can change this interval, or you can schedule another cleanup task devoted specifically to cleaning up dead nodes.
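To make the interplay of `nodeAlivenessTimeout` and `deadNodes/maxAge` concrete, here is a minimal Python sketch of the aliveness check and the dead-node cleanup described in this section. The `Node` class and both function names are hypothetical, and the repository is again modeled as a dictionary. Restricting the aliveness check to nodes in the `UP` state is an assumption; the text does not say how other states are handled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass
class Node:
    operational_state: str
    last_check_in: datetime


def aliveness_check(nodes: Dict[str, Node], now: datetime,
                    aliveness_timeout: timedelta = timedelta(seconds=900)) -> None:
    """Mark nodes as DOWN when their last check-in is older than the timeout."""
    for node in nodes.values():
        # Assumption: only nodes currently marked UP are examined.
        if node.operational_state == "UP" and now - node.last_check_in > aliveness_timeout:
            node.operational_state = "DOWN"


def cleanup_dead_nodes(nodes: Dict[str, Node], now: datetime,
                       max_age: Optional[timedelta] = None) -> None:
    """Delete nodes whose last check-in is older than max_age (None = no cleanup)."""
    if max_age is None:
        return  # default behavior: nodes are not cleaned up
    expired = [name for name, node in nodes.items()
               if now - node.last_check_in > max_age]
    for name in expired:
        del nodes[name]


now = datetime(2024, 1, 1, 12, 0, 0)
nodes = {
    "A": Node("UP", now - timedelta(seconds=10)),
    "B": Node("UP", now - timedelta(seconds=1000)),
    "C": Node("DOWN", now - timedelta(days=2)),
}
aliveness_check(nodes, now)
cleanup_dead_nodes(nodes, now, max_age=timedelta(days=1))
print({name: node.operational_state for name, node in nodes.items()})
# {'A': 'UP', 'B': 'DOWN'}
```

In this sketch, `aliveness_check` would be invoked every `nodeAlivenessCheckInterval` seconds, and `cleanup_dead_nodes` on the cleanup task's schedule.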