Role Mining

Last modified 12 Oct 2023 08:37 +02:00
This functionality is available since version 4.8.

Role mining is the process of "mining" business roles from application roles. The mining process analyzes application roles and their assignments to users, tries to detect data clusters and patterns, and suggests candidates for business roles.


The primary motivation behind role mining is the simplification of role-based access control (RBAC). Users in an organization are involved in many roles, organizational units and projects, requiring numerous permissions to carry out their activities. In an ideal RBAC system, each work responsibility, organizational unit assignment or project engagement would have its own role, a business role. A business role would group all the permissions necessary to carry out a specific business activity.

However, modeling business roles is no easy task. Business processes are often under-specified or not specified at all. It is very difficult to find anyone in an organization who could tell which permissions are required for a specific activity. As business roles are very difficult to define from the bottom up, the common practice is to use application roles instead. Theoretically speaking, application roles represent user permissions in a specific application. Application roles are often the smallest pieces of permission data that an identity management system works with. They are not supposed to be assigned to users directly, yet this is a very common practice, and it leads to numerous problems. The number of assignments keeps increasing, and the access control model quickly turns into a huge ball of mud. This makes the roles difficult to review and unassign, which leads to an accumulation of security risk.

However, the processes are there and business activities are carried out every day. This means that the business-related permission information is there. It is just hidden, buried in the heap of access control data, waiting to be discovered. Role mining is a method to uncover such information. It uses algorithms and Artificial Intelligence (AI) techniques to sift through this maze of permissions, grouping them into cohesive business roles based on patterns of access.

Role mining detects candidates for new business roles based on data clustering, pattern recognition and similar methods. It suggests business roles that allow assigning similar permissions to groups of users. By conducting role mining, organizations can move from a chaotic model of individually assigned role permissions to a more organized model based on business roles. This transition not only enhances security by reducing the risk of access escalation but also makes the entire IAM process more understandable and manageable.


  1. Simplified Access Management: Managing user permissions becomes easier when they are grouped into clear, well-defined business roles. Assigning or revoking access for new or departing employees becomes a matter of allocating or removing a single business role, rather than multiple individual roles (permissions).

  2. Reduced Administrative Overhead: Fewer individual permissions to manage means less time spent on administration and fewer errors. Automation and structured roles reduce the need for manual interventions in access management. Instead of managing individual roles (permissions) for each user, administrators can manage business roles, considerably reducing the number of entities they need to handle. With well-defined business roles in place, when a user requests access related to a specific job function or project, administrators can simply assign a role rather than piecing together multiple role permissions. This speeds up the provisioning process.

  3. Transparent Access: Clearly defined business roles make it easier to understand who has access to what, simplifying compliance reporting. In environments where users can request access, business roles simplify the user experience. Users can request access based on business roles they understand, rather than having to navigate an intricate network of individual roles.

  4. Efficient Audit Trails: With business roles, it becomes simpler to track changes in role permissions, ensuring an effective and quick audit process. In traditional systems without role mining, audit logs might be flooded with granular access changes. With business roles as aggregate units of permissions, logs can capture more significant, meaningful changes, reducing the noise and making anomalies more evident.

  5. Enhanced Business Agility: As businesses evolve, role mining can help restructure access rights to match changing business needs (adaptable role structures), ensuring that employees always have the tools and information they need. New projects or business units can be quickly equipped with the necessary access by leveraging existing business roles.

Process Overview

Role mining in midPoint is a flexible process that usually includes several steps:

  1. Create a role mining session. The session specifies role mining settings. It is an envelope object that contains the results of the analysis as well as overall statistics.

  2. Start the role mining process to find clusters. Role mining algorithms are applied to the data selected for the session to detect clusters. Each cluster contains similar user-permission combinations. The clusters are likely to contain business role suggestions.

  3. Analyze promising clusters, looking for business role suggestions (pattern detection). Each cluster is likely to have one or more suggestions for business roles. Have a closer look at the data in the cluster and select a business role candidate.

  4. Refine the business role suggestion. Specify a name for the business role, adjust permissions (induced application roles) and role members, and fill in other details as necessary.

  5. Apply the business role suggestion. A new business role is created, and a specialized task replaces application role membership with business role membership for all affected users.

  6. Analyze other promising clusters in the session and create other business roles. Once all promising clusters are analyzed, the session is done and can be retired. It is no longer needed.

  7. Repeat the process as necessary. Start new sessions to find ever finer business roles by adjusting session parameters. Repeat the process as long as it gives practical results.


Role mining is carried out in sessions. Each role analysis session is a data structure that contains the settings for a particular case of role mining. When a role mining session is set up and launched, the role mining process starts. Role mining detects clusters, that is, groups of similar permissions (application roles) and users. Each cluster is likely to contain a business role suggestion. Both sessions and clusters are represented by midPoint objects:

  • Session (Role Analysis Session) represents organized settings for a particular role mining case, such as the role mining mode, the scope of objects to include in the analysis, the similarity setting and so on. A session is created by a midPoint administrator. Once started, the role mining process creates clusters within the session. When the process is complete, the session contains statistical information about the role mining results.

  • Cluster (Role Analysis Cluster) represents a detected group of objects with similar characteristics. Clusters are created automatically by the role mining process. Each cluster is likely to contain data used for business role suggestions. Clusters also include statistical data about cluster members.

Starting a Session

Before starting role mining, it is important to collect all relevant data. This basic step sets the stage for the entire process. Extract the current access rights, detailing which users hold which permissions. Collect information that can provide an overview of access needs and patterns.

  1. Selection of Session process mode

    • ROLE_MODE (Role Based Analysis) - the role (RoleType) becomes the main session object. Roles are analyzed according to the similarity of their user (UserType) members. User objects are treated as properties of the given role.

    • USER_MODE (User Based Analysis) - the user (UserType) becomes the main session object. Users are analyzed according to the similarity of their role (RoleType) assignments. Role objects are treated as properties of the given user.

  2. Session clustering procedure options

    • Query filter - defines the condition that objects must meet to be included in data collection. It is designed to filter users according to established rules. The role mining administrator can, for example, restrict the analysis to users related to a certain organizational unit, which increases the security and accuracy of the analysis.

    • Properties range - specifies the allowed range for the number of properties of the main session object, depending on the selected analysis type. For example, in Role Based Analysis, only roles whose number of user members falls within this range are collected. This makes role mining more consistent and convenient to manage.
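The effect of these two options on data selection can be pictured with a small sketch for Role Based Analysis. This is plain Python for illustration, not midPoint configuration; the names `select_roles`, `in_scope`, `min_members` and `max_members` are hypothetical.

```python
from typing import Callable


def select_roles(
    role_members: dict[str, set[str]],
    in_scope: Callable[[str], bool],
    min_members: int,
    max_members: int,
) -> dict[str, set[str]]:
    """Pick the roles that enter a Role Based Analysis (illustrative only).

    in_scope stands in for the query filter: it decides which users are
    considered at all. min_members and max_members stand in for the
    properties range: roles with too few or too many in-scope members
    are left out.
    """
    selected = {}
    for role, members in role_members.items():
        scoped = {user for user in members if in_scope(user)}
        if min_members <= len(scoped) <= max_members:
            selected[role] = scoped
    return selected


role_members = {
    "crm-read": {"alice", "bob", "carol"},
    "crm-admin": {"alice"},
    "erp-read": {"bob", "carol", "dave"},
}
sales_unit = {"alice", "bob", "carol"}  # users of one organizational unit
selected = select_roles(role_members, lambda user: user in sales_unit, 2, 10)
```

In this example `crm-admin` is dropped because it has only one in-scope member, and `dave` is ignored because he is outside the filtered organizational unit.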


Once data is collected, it needs to be analyzed to identify patterns and inform the creation of potential business roles. Identify recurring access patterns across different users. Use algorithms to group users with similar access patterns. These clusters hint at potential roles. Detect any unusual or outlier access patterns that might signify misconfigurations or excessive permissions.

We use a density-based clustering approach that groups users or roles that behave similarly. Density-based clustering is a category of clustering algorithms that partition data points into clusters based on the density of data points in the feature space. The core idea behind these algorithms is that a cluster is a dense region of points that is separated from other dense regions by areas with a low density of points.
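As a rough illustration of the density idea, the following sketch implements a small DBSCAN-style procedure over sets. It is an assumption made for illustration: this page does not state which density-based algorithm midPoint uses. The function names, the `eps` distance threshold and `min_points` are hypothetical, and the distance is Jaccard distance (one minus Jaccard similarity).

```python
def jaccard_distance(a: set, b: set) -> float:
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def density_cluster(items: dict[str, set[str]], eps: float, min_points: int):
    """DBSCAN-style clustering of named sets. Returns (clusters, outliers)."""
    names = list(items)

    def neighbours(name: str) -> list[str]:
        # A point counts as its own neighbour (distance 0).
        return [o for o in names if jaccard_distance(items[name], items[o]) <= eps]

    label: dict[str, int | None] = {}  # None marks (provisional) noise
    clusters: list[list[str]] = []
    for name in names:
        if name in label:
            continue
        seeds = neighbours(name)
        if len(seeds) < min_points:
            label[name] = None  # not dense enough to start a cluster
            continue
        cluster_id = len(clusters)
        clusters.append([name])
        label[name] = cluster_id
        queue = [s for s in seeds if s != name]
        while queue:
            other = queue.pop()
            if label.get(other) is not None:
                continue  # already assigned to a cluster
            was_noise = other in label
            label[other] = cluster_id
            clusters[cluster_id].append(other)
            if was_noise:
                continue  # border point: joins the cluster, does not expand it
            more = neighbours(other)
            if len(more) >= min_points:
                queue.extend(more)  # dense point: keep growing the cluster
    outliers = [n for n in names if label[n] is None]
    return clusters, outliers


user_roles = {
    "a": {"r1", "r2", "r3"},
    "b": {"r1", "r2", "r3", "r4"},
    "c": {"r1", "r2", "r3"},
    "d": {"r7", "r8"},
    "e": {"r7", "r8", "r9"},
    "f": {"r5"},
}
clusters, outliers = density_cluster(user_roles, eps=0.4, min_points=2)
```

Here users a, b and c form one dense region, d and e form another, and f shares nothing with anyone and ends up as an outlier.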

In our case, we cluster the data with respect to the chosen mode of analysis:

  • Role Based Analysis - cluster roles according to the similarity of their user members.

  • User Based Analysis - cluster users according to the similarity of their role assignments.

To start the clustering process, we need to define additional rules. These rules are part of Session clustering procedure options:

  • Similarity (Jaccard Similarity) - the minimum value of similarity between two objects that must be met for inclusion in the cluster group.

  • Required overlap

    • Required Roles overlap (User Based Analysis) - minimum number of role assignments that two users must have in common.

    • Required Users overlap (Role Based Analysis) - minimum number of user members that two roles must have in common.

  • Required object count

    • Roles count (Role Based Analysis) - the minimum necessary number of roles to create a cluster that meets the condition of overlap and similarity.

    • Users count (User Based Analysis) - the minimum necessary number of users to create a cluster that meets the condition of overlap and similarity.
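The Jaccard similarity of two sets A and B is |A ∩ B| / |A ∪ B|, the size of their intersection divided by the size of their union. The sketch below shows how a similarity threshold and a required overlap could be combined into a single pairwise test in User Based Analysis, where users are compared by their role assignments. The function and parameter names are hypothetical, and how midPoint combines the rules internally is an assumption.

```python
def jaccard_similarity(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|, defined here as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_enough(a: set, b: set, min_similarity: float, min_overlap: int) -> bool:
    """True when two objects satisfy both the overlap and the similarity rule."""
    return len(a & b) >= min_overlap and jaccard_similarity(a, b) >= min_similarity


alice = {"crm-read", "crm-write", "erp-read", "mail"}
bob = {"crm-read", "crm-write", "erp-read", "wiki"}

similarity = jaccard_similarity(alice, bob)           # 3 shared of 5 distinct roles
accepted = similar_enough(alice, bob, 0.5, 3)         # passes both rules
rejected = similar_enough(alice, bob, 0.5, 4)         # only 3 roles in common
```

The required object count then acts on the group as a whole: a set of mutually similar objects is reported as a cluster only if it has at least that many members.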

Pattern Detection

After successful clustering, we have groups of similar objects, or clusters. These clusters are ready for further analysis to find patterns (groups of objects with exact overlap). To maintain the hierarchy and security of access rights, we focus on exact overlap matches. With this approach, we avoid noise and similar inconsistencies when designing a pattern, and we ensure that no new unwanted rights are added.

Clustering data are further analyzed using a pattern detection algorithm. The results of pattern detection can be used as business role candidates.

Pattern detection options:

  • Min Roles Occupancy - defines the minimum number of roles that a detected pattern must contain.

  • Min Users Occupancy - defines the minimum number of users that a detected pattern must contain.

  • Frequency Range - minimum and maximum required frequency, expressed as a percentage of occupancy. It refers to the number of occurrences of properties across cluster members. Objects that do not meet the condition are not included in the detection.

    • Role Based Analysis - frequency of occurrence of the users (properties) in the given cluster.

    • User Based Analysis - frequency of occurrence of the roles (properties) in the given cluster.

  • Process Mode (experimental) - defines the detection trigger rule.

    • FULL (default) - detection will take place over all clusters (except outliers).

    • PARTIAL - detection will not run on clusters that consist of too much data.

    • SKIP - detection is skipped.

Pattern detection runs automatically after the clustering procedure, unless the experimental Process Mode option is set to SKIP.
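To make the options concrete, here is a minimal sketch of exact-overlap pattern detection for one cluster in User Based Analysis. It is a simple illustration, not midPoint's detection algorithm: candidate patterns are taken from each member's role set and from pairwise intersections only, and all names (`detect_patterns`, `min_roles`, `min_users`, `min_freq`, `max_freq`) are hypothetical.

```python
from itertools import combinations


def detect_patterns(user_roles, min_roles, min_users, min_freq=0.0, max_freq=1.0):
    """Return (roles, users) pairs where every listed user holds every listed role."""
    total = len(user_roles)

    # Frequency range: keep only roles whose share of cluster members is in range.
    counts = {}
    for roles in user_roles.values():
        for role in roles:
            counts[role] = counts.get(role, 0) + 1
    allowed = {r for r, c in counts.items() if min_freq <= c / total <= max_freq}
    filtered = {u: frozenset(roles & allowed) for u, roles in user_roles.items()}

    # Candidate role sets: each member's own set plus all pairwise intersections.
    candidates = set(filtered.values())
    for first, second in combinations(filtered.values(), 2):
        candidates.add(first & second)

    patterns = []
    for roles in candidates:
        if len(roles) < min_roles:
            continue  # Min Roles Occupancy not met
        users = {u for u, owned in filtered.items() if roles <= owned}
        if len(users) >= min_users:  # Min Users Occupancy
            patterns.append((roles, users))

    # Largest patterns (roles x users) first.
    patterns.sort(key=lambda p: len(p[0]) * len(p[1]), reverse=True)
    return patterns


cluster = {
    "alice": {"crm-read", "crm-write", "erp-read", "mail"},
    "bob": {"crm-read", "crm-write", "erp-read", "wiki"},
    "carol": {"crm-read", "crm-write", "erp-read"},
    "dave": {"crm-read", "crm-write"},
}
patterns = detect_patterns(cluster, min_roles=2, min_users=3, min_freq=0.5)
```

With these settings `mail` and `wiki` are dropped by the frequency range (each occurs in 25 % of members), and two patterns remain: three roles shared by alice, bob and carol, and two roles shared by all four users. Because the match is exact, turning a pattern into a business role gives none of its users a role they did not already have.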

Statistical Data

After the data clustering and pattern detection steps, almost everything has been generated except for the statistical data. Statistics make the role mining process easier to manage, and they play an important role in selecting suitable clusters in which to search for or build a business role. We process the following statistical data:

  1. Session (Role Analysis Session) Statistics

    • Processed objects count - number of objects that have been processed, depending on the selected mode of analysis.

    • Clusters count - number of clusters in the session.

    • Mean density - the average membership density of the clusters.

  2. Cluster (Role Analysis Cluster) Statistics

    • User objects count - number of users of which the cluster consists.

    • Roles objects count - number of roles of which the cluster consists.

    • Membership density - represents the properties overlap percentage value between cluster objects.

    • Membership mean - average number:

      • Role Based Analysis - of user members

      • User Based Analysis - of role assignments

    • Membership range - minimum and maximum number of:

      • Role Based Analysis - user members located in cluster objects

      • User Based Analysis - role assignments located in cluster objects

    • Reduction metric - the reduction in the number of relationships achieved by applying the largest pattern found.
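The cluster statistics can be illustrated with a short sketch. The exact formulas midPoint uses are not given on this page, so everything below is an assumption: density is taken as the filled share of the member-by-property matrix, and the reduction is taken as the number of direct assignments a pattern covers minus the assignments and inducements needed once a business role replaces them. All names are hypothetical.

```python
def cluster_statistics(members: dict[str, set[str]]) -> dict:
    """Statistics for one cluster; keys are members, values are their properties."""
    if not members:
        raise ValueError("a cluster needs at least one member")
    sizes = [len(props) for props in members.values()]
    all_props = set().union(*members.values())
    cells = len(members) * len(all_props)
    return {
        "members": len(members),
        "properties": len(all_props),
        "membership_mean": sum(sizes) / len(sizes),
        "membership_range": (min(sizes), max(sizes)),
        # Assumed definition: percentage of member/property pairs that exist.
        "membership_density": 100.0 * sum(sizes) / cells if cells else 0.0,
    }


def reduction(pattern_users: int, pattern_roles: int) -> int:
    """Assumed definition: relationships saved by one business role.

    Before: every user holds every role directly (users * roles).
    After: one assignment per user plus one inducement per role.
    """
    return pattern_users * pattern_roles - (pattern_users + pattern_roles)


cluster = {
    "alice": {"crm-read", "crm-write", "erp-read", "mail"},
    "bob": {"crm-read", "crm-write", "erp-read", "wiki"},
    "carol": {"crm-read", "crm-write", "erp-read"},
    "dave": {"crm-read", "crm-write"},
}
stats = cluster_statistics(cluster)
saved = reduction(pattern_users=3, pattern_roles=3)
```

For this cluster of four users and five distinct roles, 13 of the 20 possible user-role pairs exist, and a pattern of three users sharing three roles replaces nine direct assignments with six relationships.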

Business Role Candidates

After these steps, we have a complete session. Based on the analysis, we begin to define potential business roles that represent common access patterns. We simply go through the list of detected patterns and select the most suitable one. If necessary, the structure of the proposed pattern can be fine-tuned by adding further rights and users that belong to the specific cluster.

If no patterns are found, or the patterns found are not appropriate, detection can be started again directly on the cluster with new, customized detection options.

After roles are defined and refined, it’s time to integrate them into the business role access management structure. The last step of role mining is migration to a business role. In this step, we can set additional parameters of the business role, such as adding access rights and candidates who should have the newly created business role, and run a task that applies (migrates) the new business role to the selected candidates.
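The effect of the migration can be sketched as a simple transformation of assignment data. This is an illustration in plain Python, not the midPoint task itself; `apply_business_role`, `effective_access` and the role names are hypothetical.

```python
def apply_business_role(assignments, business_role, induced_roles, candidates):
    """Replace the induced application roles with one business role for candidates."""
    migrated = {}
    for user, roles in assignments.items():
        if user in candidates:
            migrated[user] = (roles - induced_roles) | {business_role}
        else:
            migrated[user] = set(roles)  # users outside the pattern are untouched
    return migrated


def effective_access(roles, business_role, induced_roles):
    """Expand the business role back into the application roles it induces."""
    if business_role in roles:
        return (roles - {business_role}) | induced_roles
    return set(roles)


assignments = {
    "alice": {"crm-read", "crm-write", "erp-read", "mail"},
    "bob": {"crm-read", "crm-write", "erp-read", "wiki"},
    "dave": {"crm-read", "crm-write"},
}
induced = {"crm-read", "crm-write", "erp-read"}
migrated = apply_business_role(assignments, "Sales Agent", induced, {"alice", "bob"})
```

After the migration alice and bob each hold two assignments instead of four. Because the candidates already held every induced role, their effective access is unchanged.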
