Task Manager

Last modified 04 Aug 2025 15:41 +02:00
Task management feature
This page is an introduction to the Task management feature of midPoint. See the feature page for more details.

The task manager controls tasks. It is a cluster-aware scheduler combined with a thread pool. It scans for tasks to execute, coordinates with other nodes in the cluster, allocates a thread for execution, executes tasks, and monitors them. It also ensures that task entries in the repository are updated accordingly.

Currently, the task manager is based on the Quartz scheduler, which provides most of this functionality.

What Is a Task in MidPoint

A task is a logical unit of work or a thread of execution. A typical example is a modification of a user object with subsequent provisioning, optionally including approvals.

Tasks are data structures that can be stored in the runtime memory or in the repository. In-memory tasks are typically used for relatively simple and short-lived synchronous operations. For such tasks, storing the task state in the repository would present an unacceptable overhead. Asynchronous, background, and scheduled tasks, on the other hand, are stored in the repository, because their state must be preserved.

Difference Between Synchronous and Asynchronous Tasks

Tasks in midPoint are of two kinds:

  • Synchronously executing tasks complete quickly and are stored only in the runtime memory. They cannot be inspected retrospectively, for instance via logs.

  • Asynchronously executing tasks generally take longer to execute, their execution status and result are stored in the repository, and the system administrator can see and manage them through the administrative user interface (see Persistence status below).

An example of an asynchronous task is an approval that waits for interaction of (possibly several) users and therefore cannot be executed synchronously. Another example is a data import which usually runs for some time. Other examples are synchronization and reconciliation tasks that need to be scheduled and run in the background.

The state of asynchronous tasks must be completely serializable[1], i.e., it must be possible to store it in the repository. Tasks move between midPoint nodes, may be passed outside midPoint, may be suspended, resumed, and must survive even a sudden restart of all nodes. Therefore, a task must be able to store its complete state to the repository, as well as to recover its state back to the runtime memory. Tasks should also survive system upgrades. For instance, an approval process that started on version X must be able to finish after an upgrade to version X+1.
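The store-and-recover round trip described above can be illustrated with a minimal sketch. It is not midPoint code: the `TaskState` class, its fields, and the use of JSON are assumptions made for illustration only, and midPoint stores tasks in its own repository format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TaskState:
    """Hypothetical task state; field names are illustrative, not midPoint's schema."""
    name: str
    execution_status: str
    progress: int = 0
    extension: dict = field(default_factory=dict)  # task-specific data, e.g. a sync token


def store(state: TaskState) -> str:
    """Serialize the complete state so it can outlive the node that created it."""
    return json.dumps(asdict(state), sort_keys=True)


def recover(serialized: str) -> TaskState:
    """Rebuild the in-memory state from its stored form, e.g. on another node."""
    return TaskState(**json.loads(serialized))


before = TaskState("Import from HR", "runnable", progress=120, extension={"token": "42"})
after = recover(store(before))
```

In this sketch, `after == before` holds only because every field is plain data. Anything that cannot be written out (an open connection, a thread handle) would break the round trip, which is the constraint the paragraph above describes.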

These requirements place significant constraints on the design and flexibility of the task mechanism. They impact performance as well. This means that asynchronous tasks are not suitable for every kind of operation, for example for trivial read operations. Generally, only the operations that require asynchronous execution are implemented as asynchronous tasks.

Understand Task Status and What It Means

The status of tasks in midPoint has two parts: the persistence status, i.e., where the task runtime information is stored, and the execution status, i.e., whether the task is currently running, paused, or finished. On top of that, some tasks have extra details specific to their job, like a live sync token used to track progress during synchronization.

Persistence Status

The persistence status signifies whether the task is in the runtime memory or stored in the repository.

  • Transient: The task exists in memory only and is not stored in the repository. Only synchronous foreground tasks may use this approach. As the task data exist only while the task runs, the user or the client application needs to wait (synchronously) for the task to complete.

  • Persistent: The task is stored in the repository. Both synchronous (foreground) and asynchronous (background, scheduled, etc.) tasks may use this approach. However, it is used almost exclusively for asynchronous tasks.[2]

Execution Status

The execution status indicates whether the task is running, waiting for something, or done.

  • Running: The task is executing on a midPoint node, i.e., a thread on one of the nodes is executing the task.

  • Runnable: The task is ready to be executed. This state implies that the task is prepared to be started as scheduled.

  • Waiting: MidPoint is waiting while the task is being executed on an external node (e.g., an external workflow engine) or is waiting for an external signal (e.g., an approval in an internal workflow). In other words, the task may be running on an external node or be blocked on a midPoint node. Either way, there is no point in allocating a thread to run this task. Other task properties provide more information about the actual business state of the task.

  • Suspended: The task is suspended, e.g., by the system administrator. The task will not be executed until it is explicitly resumed, usually by the system administrator.

  • Closed: The task is done. No other changes or progress will happen. A task in this state is considered immutable and the only thing that can happen to it is a removal by a cleanup task.

The distinction between Running and Runnable states is currently not done on the level of task attributes; the executionState attribute for both cases is runnable. These states can be distinguished by calling the TaskManager.isTaskThreadActiveClusterwide method.
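The way the two states are told apart can be sketched as follows. This is an illustrative model, not midPoint code: the `effective_status` function is hypothetical, and the `thread_active` flag merely stands for whatever `TaskManager.isTaskThreadActiveClusterwide` reports (its exact signature is not shown on this page).

```python
from enum import Enum


class ExecutionStatus(Enum):
    RUNNING = "running"
    RUNNABLE = "runnable"
    WAITING = "waiting"
    SUSPENDED = "suspended"
    CLOSED = "closed"


def effective_status(stored_state: str, thread_active: bool) -> ExecutionStatus:
    """Combine the stored attribute with thread activity.

    `stored_state` stands for the executionState attribute; `thread_active`
    stands for the answer of TaskManager.isTaskThreadActiveClusterwide.
    """
    if stored_state == "runnable":
        # The attribute alone cannot tell Running from Runnable.
        return ExecutionStatus.RUNNING if thread_active else ExecutionStatus.RUNNABLE
    return ExecutionStatus(stored_state)
```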

Business Status

Some tasks need extra details to do their job. For example, a task that syncs data between systems might use a "live sync token" to remember where it left off, so it doesn’t repeat work or miss updates.

Business states are to be covered in the Documentation in more detail.

Distinguish Single-Run and Recurring Tasks

  • Some tasks in midPoint are single-run.

    • A typical example is the initial import of accounts from a resource. This import is started by the system administrator, executes in the background, and finishes after its work is done (or an irrecoverable error occurs). Another example is an operation that requires approval, such as assigning a role to a user. After the approval is obtained and the operation is executed, the task finishes and is closed.

    • Single-run tasks may start either immediately after you create them or according to a defined schedule.

  • Other tasks have to be repeated. Such tasks are recurring.

    • Examples of recurring tasks include live synchronization (scheduled to run, e.g., every 5 seconds) or reconciliation (scheduled to run, e.g., every day at 3:00).

    • Tasks scheduled using an interval start immediately after creation and then run repeatedly once the set interval elapses.

    • Tasks scheduled to run, e.g., every Sunday at 16:00, do not run immediately after creation. They remain in the Runnable state until the designated run time arrives.
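The start times of an interval-based task can be sketched as follows. The function name is hypothetical, and the sketch assumes that the interval is counted from one planned start to the next; the text above does not say whether midPoint counts from the start or the end of the previous run.

```python
from datetime import datetime, timedelta


def interval_start_times(created: datetime, interval_seconds: int, count: int) -> list[datetime]:
    """First `count` planned starts: one at creation, then one per elapsed interval."""
    step = timedelta(seconds=interval_seconds)
    return [created + i * step for i in range(count)]


created = datetime(2025, 8, 4, 15, 41, 0)
starts = interval_start_times(created, interval_seconds=5, count=3)
```

Here the first element of `starts` equals the creation time, which reflects the immediate first run of interval-based tasks.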

Schedule Tasks

Task scheduling is governed by the schedule attribute, which has the following parts:

  1. interval: Denotes the interval in seconds between task runs. Used only for recurring tasks.

  2. cronLikePattern: Cron-like pattern specifying time(s) when the task is to be run. Currently only loosely bound recurring tasks can use this feature.

  3. earliestStartTime: Earliest time when the task is allowed to start. Usable for any kind of task. This parameter is particularly useful to postpone the start of a single-run task.

  4. latestStartTime: Latest time when the task is allowed to start. Usable for any kind of task.

  5. latestFinishTime: Latest time when the task is allowed to run. A reason to specify this time may be because another task conflicting with this task is scheduled to start at this time, so the task for which you specify latestFinishTime must NOT run after that moment. It is a responsibility of the task handler to finish working when this time comes. It is not enforced by the task manager.
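The parts of the schedule listed above can be modeled roughly as follows. This is a sketch in Python naming, not the midPoint schema: the `Schedule` class and the `may_start` helper are hypothetical, and treating the window boundaries as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Schedule:
    """Mirror of the schedule parts listed above, using Python naming."""
    interval: Optional[int] = None                  # seconds between runs (recurring tasks)
    cron_like_pattern: Optional[str] = None         # cron-like start times
    earliest_start_time: Optional[datetime] = None
    latest_start_time: Optional[datetime] = None
    latest_finish_time: Optional[datetime] = None   # honored by the handler, not enforced


def may_start(schedule: Schedule, now: datetime) -> bool:
    """True if `now` lies within the allowed start window (boundaries inclusive)."""
    if schedule.earliest_start_time and now < schedule.earliest_start_time:
        return False
    if schedule.latest_start_time and now > schedule.latest_start_time:
        return False
    return True


window = Schedule(
    earliest_start_time=datetime(2025, 8, 5, 1, 0),
    latest_start_time=datetime(2025, 8, 5, 3, 0),
)
```

The helper deliberately ignores `latest_finish_time`, because the text above makes finishing on time the responsibility of the task handler.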

In the following sections, we examine scheduling for both recurring and single-run tasks in more detail.

Schedule Recurring Tasks

MidPoint currently supports two styles of schedule definition for recurring tasks:

  • Interval-based scheduling repeats a task every N seconds.

  • Cron-like scheduling provides the ability to specify starting times using a cron expression.

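    • For example, to schedule a task to run every Sunday at 03:45, you would use the following pattern: 0 45 3 ? * SUN (second: 0, minute: 45, hour: 3, day of month: not specified, month: any, day of week: Sunday). Unlike the classic Unix cron format, Quartz expressions start with a seconds field and number the days of the week from 1 (Sunday) to 7 (Saturday).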

    • For more information on these expressions, see the Quartz documentation.
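What "every Sunday at 03:45" means for the next start time can be sketched independently of the pattern syntax. The function below is a hypothetical illustration using only the Python standard library; it does not parse cron expressions and says nothing about how Quartz computes fire times internally.

```python
from datetime import datetime, timedelta


def next_weekly_start(now: datetime, weekday: int, hour: int, minute: int) -> datetime:
    """Next start strictly after `now`; weekday uses Python's convention (Monday=0, Sunday=6)."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        candidate += timedelta(days=7)  # this week's slot has already passed
    return candidate


# Monday, 4 August 2025, 15:41 -> the following Sunday at 03:45
print(next_weekly_start(datetime(2025, 8, 4, 15, 41), weekday=6, hour=3, minute=45))
# prints "2025-08-10 03:45:00"
```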

Recurring Tasks Can Be Tightly or Loosely Bound

A recurring task can be bound to a midPoint node either tightly or loosely.

A tightly bound task is given a thread in which it executes. Even between executions, the thread is allocated to the task. (Technically, the thread just sleeps between the runs using the Thread.sleep method.) A direct consequence is that each execution of this task occurs on the same node. This has some pros and cons:

  • The main positive aspects are that the execution is a bit more efficient (scheduling via Quartz is avoided) and that troubleshooting is simpler, as all the executions are recorded in a log file on the same node.

  • A negative aspect is that such a task permanently consumes one thread.

As a general rule, a task should be tightly bound only when its scheduling interval is quite short, e.g., under 30 seconds. (In the current Quartz-based implementation of the task manager, it is not possible to use a cron expression for a tightly bound task.)

On the other hand, a loosely bound task has no thread permanently allocated to it. It waits in the repository until its start time comes. At that time, it is started on any available midPoint node. When its execution finishes, the thread is released and the task waits for the next start time. A loosely bound task may execute repeatedly on the same node or on different nodes, as determined by the Quartz scheduler algorithm (hence the name 'loosely bound'). The Quartz documentation states that "The load balancing mechanism is near-random for busy schedulers (lots of triggers) but favors the same node for non-busy schedulers (few triggers)."
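The difference between the two binding styles can be sketched with plain Python threads. This is a simplified model under stated assumptions: the function names are hypothetical, a thread pool stands in for the threads of the cluster nodes, and no Quartz scheduling is involved.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_tightly_bound(cycles: int, interval: float) -> list[str]:
    """One dedicated thread performs every run and sleeps between runs."""
    seen: list[str] = []

    def worker() -> None:
        for i in range(cycles):
            seen.append(threading.current_thread().name)  # one task run
            if i < cycles - 1:
                time.sleep(interval)  # the thread stays allocated while idle

    thread = threading.Thread(target=worker, name="tight-task")
    thread.start()
    thread.join()
    return seen


def run_loosely_bound(cycles: int, interval: float, pool: ThreadPoolExecutor) -> list[str]:
    """Each run borrows whichever pool thread is free and releases it afterwards."""
    seen: list[str] = []
    for i in range(cycles):
        seen.append(pool.submit(lambda: threading.current_thread().name).result())
        if i < cycles - 1:
            time.sleep(interval)  # no thread is held while waiting
    return seen


with ThreadPoolExecutor(max_workers=3, thread_name_prefix="node") as pool:
    tight = run_tightly_bound(cycles=3, interval=0.01)
    loose = run_loosely_bound(cycles=3, interval=0.01, pool=pool)
```

In this model, `tight` always contains the same thread name three times, whereas the names in `loose` may or may not repeat, depending on which pool thread picks up each run.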

Schedule Single-Run Tasks

To postpone the start of a single-run task, such as an import task, use the Earliest start time attribute and set the Recurrence to Single in the Scheduling section of the task definition. Then, save the task using Save & Run. The task will be in the Runnable state until its scheduled time comes. After it finishes, its status will change to Closed.

Mind the Time Zones

Before scheduling tasks, verify the time zone your midPoint instance uses. By default, midPoint uses the system time of the server on which it runs. On Linux servers, this is often UTC, regardless of the time zone a user may have set in the operating system user interface.

When a Task Fails to Start as Scheduled

The misfireAction attribute controls what happens when the task fails to start at its specified start time (e.g., because no node or thread is available to execute the task at that time). The following values are possible:

  1. executeImmediately: The task is to be executed immediately when possible.

  2. reschedule: The task is rescheduled according to its schedule. This can be used only for loosely bound recurring tasks.

  3. forget: The task is not executed at all. This would be used only for scheduled single-run tasks. Not yet implemented.
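The effect of the three values can be sketched as a small decision function. The function name and its parameters are hypothetical; only the three value names come from the list above, and the sketch does not reflect how the task manager implements misfire handling internally.

```python
from datetime import datetime
from typing import Optional


def start_after_misfire(action: str, now: datetime, next_scheduled: datetime) -> Optional[datetime]:
    """Return when a misfired task should start, or None if it should not run."""
    if action == "executeImmediately":
        return now                # run as soon as possible
    if action == "reschedule":
        return next_scheduled     # skip the missed run, keep the regular schedule
    if action == "forget":
        return None               # described above as not yet implemented
    raise ValueError(f"unknown misfireAction: {action}")
```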

Task Execution Terminology Basics

Task run (or sometimes "task cycle run") denotes one execution of the task logic provided by the task handler or handlers (see below). Task thread run denotes one execution of a task thread.

For single-run tasks, a task run is the same as a task thread run: there is only one such run (or thread run) during the task lifetime.

For loosely bound recurring tasks, a task run is the same as a task thread run as well. However, in this case, there are potentially many runs (or thread runs) during the task lifetime.

For tightly bound recurring tasks, there is only one task thread run, because the task thread is allocated to the task permanently. Within this task thread run, there are many task runs occurring at defined points in time.
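The relation between the two terms comes down to simple counting, sketched below. The function and the three kind labels are hypothetical names chosen for this illustration.

```python
def thread_runs(kind: str, task_runs: int) -> int:
    """Number of task thread runs for a given number of task runs (no failovers or restarts)."""
    if kind == "single-run":
        if task_runs != 1:
            raise ValueError("a single-run task has exactly one task run")
        return 1
    if kind == "loosely-bound":
        return task_runs  # each task run gets its own thread run
    if kind == "tightly-bound":
        return 1          # one long-lived thread run hosts all task runs
    raise ValueError(f"unknown task kind: {kind}")
```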

For this discussion, we do not consider task failovers and node restarts.

  • Starts and ends of a task thread run are logged to the console (standard output) as debug messages.

  • Starts and ends of a task run are logged as lastRunStartTimestamp and lastRunFinishTimestamp attributes.

These terms are open to discussion and possibly subject to change; they are not set in stone.

Task Resilience: What Happens to Interrupted Tasks

This section covers two task types and their behavior when the node on which they run shuts down before they finish, as well as your options to control the action they take.

By default, all persistent tasks are resilient. It means that after a node is stopped (either regularly, e.g., by shutting down the application server, or irregularly, e.g., by a hardware malfunction), persistent tasks continue to execute on another node in the cluster. If no suitable node is available at the time, they resume after an available node appears.

However, there are situations when such resilience is not desirable. For such cases, you can declare a task as non-resilient. Non-resilient tasks do not resume on another node after their node goes down. They are simply suspended or closed. A typical use case for non-resilient tasks is a manual synchronization of resources, i.e., something that the system administrator starts with the expectation that it executes only until the node goes down.

Available Actions After Halt

The task behavior after node shutdown is controlled by the threadStopAction attribute which determines whether a task is resilient or non-resilient.

The threadStopAction attribute can have the following values:

  1. restart: The task will restart on the first node available (i.e., either immediately if there is a suitable node in the cluster, or later when a suitable node appears).

  2. reschedule: The task will be rescheduled according to its schedule (for single-run and tightly bound recurring tasks, this is the same as restart).

  3. suspend: The task will be suspended.

  4. close: The task will be closed.

The restart and reschedule options make the task resilient, the suspend and close options achieve non-resilient behavior.

For tasks with no threads allocated when their node goes down (loosely bound recurring tasks and scheduled single-run tasks), the threadStopAction attribute has no effect. These tasks simply wait until their next start time comes. See also misfire action.
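The rules above can be summarized in a short sketch. The helper names and the outcome strings are hypothetical; only the four attribute values and their grouping into resilient and non-resilient come from the text.

```python
RESILIENT_ACTIONS = {"restart", "reschedule"}
NON_RESILIENT_ACTIONS = {"suspend", "close"}


def is_resilient(thread_stop_action: str) -> bool:
    """True if the given threadStopAction value makes the task resilient."""
    if thread_stop_action in RESILIENT_ACTIONS:
        return True
    if thread_stop_action in NON_RESILIENT_ACTIONS:
        return False
    raise ValueError(f"unknown threadStopAction: {thread_stop_action}")


def outcome_after_node_down(thread_stop_action: str, had_thread: bool) -> str:
    """Describe what happens to a task when its node goes down."""
    if not had_thread:
        # Loosely bound recurring and scheduled single-run tasks: the attribute has no effect.
        return "waits for its next start time"
    outcomes = {
        "restart": "restarts on the first available node",
        "reschedule": "is rescheduled according to its schedule",
        "suspend": "is suspended",
        "close": "is closed",
    }
    return outcomes[thread_stop_action]
```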

Is It OK to Make Tasks Non-resilient?

If you set a task as non-resilient using threadStopAction (options suspend and close), it will suspend or close when its node shuts down. Persistent tasks are designed to survive node failures by default, meaning they restart or reschedule on an available node. Making tasks non-resilient overrides this behavior and leads to task termination or suspension instead of automatic recovery. This is undesirable in most clustered environments where high availability is expected. Although there are specific scenarios where halting the task on failure is intentional, you should avoid this setting unless you have a strong reason for it (e.g., a manual synchronization task you want to inspect after an interruption).

Configure Task Using Activities

An activity describes the real work that a task is to carry out. Refer to Activities for an introduction to the concept of activities as well as details on how to configure them. See Migration of Tasks from 4.0/4.3 for guidance on dealing with legacy task configuration that uses handler URIs.

Objects Associated with Tasks

Tasks (or rather their activities) may be associated with particular objects. For example, an "import from resource" task is associated with the resource definition object from which it imports. Synchronization and reconciliation tasks may have similar resource object associations. This is an optional property.

The associated object could also be specified using the extension mechanism. That would not be optimal, though, because it would be difficult to search for all the tasks that work on a particular object, be it a resource or anything else.

Task Owner

The task owner is (usually) the midPoint user who created the task. This attribute is used, for instance, for auditing purposes.

Clustering and High Availability

There can be multiple midPoint nodes working in a cluster. These nodes share the workload: when a task becomes ready to be executed, one of the nodes takes the task and executes it. This process is governed by the Quartz job scheduler.

When a node becomes unavailable (either because of a shutdown, or due to a sudden crash), the task manager performs the following:

  1. It takes the tasks running on that node and restarts them on other available nodes. This is subject to the threadStopAction settings described above.

  2. It executes other (scheduled) tasks on remaining available nodes.

This way, the high availability of the task execution is ensured.

Refer to Achieve High Availability of MidPoint with Clustering for more information on deploying a high availability setup.

Task State in MidPoint Repository and Quartz JDBC Job Store

The midPoint repository contains general task information, such as execution and business states, while the Quartz JDBC job store is responsible for maintaining information necessary for task scheduling (e.g., the next planned start time).

The information in Quartz job store can be erased at any time and recreated from the midPoint repository on node startup with only minor consequences. The only damage that can occur is that some tasks may be executed one more or one less time.

Because of this, the simplest installations, such as those serving a showcase purpose, can be run with an in-memory Quartz job store: a store that is re-created on node startup. This approach has the following limitations:

  1. The clustering (failover) feature is not available.

  2. Tasks do not know their last run time. The consequences of this are, for example:

    • Interval-based loosely bound tasks will start immediately, even if their expected start time has not come yet.

    • Misfired cron-scheduled tasks will not start, even if configured to do so, because the information on the misfire event was lost.

    • Reconciliation tasks, for instance, may start immediately after midPoint start.
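Why a lost last run time leads to an immediate start can be sketched as follows. The function is a hypothetical simplification and does not describe the actual Quartz trigger logic; it only shows the effect of missing information on the computed start time.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_interval_start(last_run: Optional[datetime], interval_seconds: int, now: datetime) -> datetime:
    """Next start of an interval-based task, given what the job store remembers."""
    if last_run is None:
        return now  # store was re-created: nothing is known, so start right away
    return max(now, last_run + timedelta(seconds=interval_seconds))


now = datetime(2025, 8, 4, 15, 41)
last_run = datetime(2025, 8, 4, 15, 0)

with_memory = next_interval_start(last_run, interval_seconds=3600, now=now)
without_memory = next_interval_start(None, interval_seconds=3600, now=now)
```

With the remembered last run, the hourly task in this sketch waits until 16:00. Without it, the task starts at 15:41, i.e., immediately.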

More advanced installations could use the JDBC-based Quartz job store, which remembers task scheduling information.

Task Manager Configuration and Administration

Authorize Specific Operations

This section details the specific action URIs used to control different aspects of task execution, scheduling, and system-level operations.

In order to authorize task-related operations, the following action URIs are defined. These are evaluated with respect to task objects, i.e., you define a filter that selects tasks to act upon.

Operation | Action URI
Suspend a task | http://midpoint.evolveum.com/xml/ns/public/security/authorization-model-3#suspendTask
Suspend and delete a task | http://midpoint.evolveum.com/xml/ns/public/security/authorization-model-3#delete
Resume a task | http://midpoint.evolveum.com/xml/ns/public/security/authorization-model-3#resumeTask
Schedule a task to run instantly | http://midpoint.evolveum.com/xml/ns/public/security/authorization-model-3#runTaskImmediately

Note that the "suspend and delete a task" operation uses the delete action URI. This means that the same authorization covers both deleting a task and deleting a task after suspending it.

For node-related operations, the following action URIs are defined. These are evaluated with respect to node objects, i.e., you define a filter that selects nodes to act upon (although we do not expect such a selection would be used in practice frequently).

Operation | Action URI
Start the task scheduler | http://midpoint.evolveum.com/xml/ns/public/security/authorization-model-3#startTaskScheduler
Stop the task scheduler (optionally with stopping tasks that are executing on it) | http://midpoint.evolveum.com/xml/ns/public/security/authorization-model-3#stopTaskScheduler

Other Operations

Finally, the following action URIs are defined for operations that are not bound to a specific task or node:

Operation | Action URI
Stop all service threads | http://midpoint.evolveum.com/xml/ns/public/security/authorization-model-3#stopServiceThreads
Start all service threads | http://midpoint.evolveum.com/xml/ns/public/security/authorization-model-3#startServiceThreads
Synchronize tasks between the midPoint repository and the Quartz scheduler | http://midpoint.evolveum.com/xml/ns/public/security/authorization-model-3#synchronizeTasks
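All the action URIs listed in this section share one namespace and differ only in the fragment. A small helper such as the following may be handy when generating authorizations in scripts. The helper and the constant names are hypothetical; the namespace and the action names are taken from the tables in this section.

```python
AUTHORIZATION_MODEL_NS = "http://midpoint.evolveum.com/xml/ns/public/security/authorization-model-3"

TASK_ACTIONS = ("suspendTask", "delete", "resumeTask", "runTaskImmediately")
NODE_ACTIONS = ("startTaskScheduler", "stopTaskScheduler")
OTHER_ACTIONS = ("stopServiceThreads", "startServiceThreads", "synchronizeTasks")


def action_uri(local_name: str) -> str:
    """Build the full action URI for one of the action names listed in this section."""
    if local_name not in TASK_ACTIONS + NODE_ACTIONS + OTHER_ACTIONS:
        raise ValueError(f"action not listed in this section: {local_name}")
    return f"{AUTHORIZATION_MODEL_NS}#{local_name}"
```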


1. Serialization is the conversion of an object to a series of bytes so that the object can be easily saved to persistent storage or streamed across a communication link. The byte stream can then be deserialized—converted into a replica of the original object. Source: TarkaDaal on SO
2. It is rare, but certain specific configuration or edge cases require persistent status even for short-lived simple tasks. For example, a task with the execution mode set to dry run uses the persistent status. Even though it may perform a one-time short-lived operation, it requires persistence to track the progress and outcome or for audit purposes.