PolyString

Last modified 23 Jan 2026 12:02 +01:00

PolyString feature

This page is an introduction to the PolyString feature of midPoint. Please see the feature page for more details.

Table of Contents

Introduction
Normalization
PolyString Localization
- Implementation
Future
See Also

Introduction

PolyString is an unusual kind of animal. It is a built-in data type for polymorphic strings. Such a string maintains extra values in addition to its original value. The extra values are derived from the original value automatically by normalization code.

<user>
  <fullName>
    <orig>Count Felix Téléké from Tölökö</orig>
    <norm>count felix teleke from toloko</norm>
  </fullName>
</user>
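As a rough illustration of how a norm value can be derived from orig, here is a minimal Python sketch. It only approximates the idea (strip diacritics, lowercase, tidy whitespace) and is not midPoint's normalization code. The function name `normalize` is a hypothetical helper, and the built-in midPoint normalizers may apply different or additional rules.

```python
import unicodedata


def normalize(orig: str) -> str:
    """Approximate a PolyString norm value (assumption: not midPoint's exact rules)."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", orig)
    # Drop the combining marks and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase and collapse runs of whitespace into single spaces.
    return " ".join(stripped.lower().split())


full_name = {"orig": "Count Felix Téléké from Tölökö"}
full_name["norm"] = normalize(full_name["orig"])
print(full_name["norm"])  # count felix teleke from toloko
```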

PolyString is currently used to support national characters in strings. A PolyString contains both the original value (with national characters) and the normalized value (without national characters). This can be used in expressions, e.g. to generate a username that does not contain national characters or that is a transliteration of them. It removes the need for custom conversion routines in each expression and therefore brings some consistency to the integration code.

The normalized value can be used for uniqueness checking. This avoids object names that can be confused with one another, e.g. names that differ only in lowercase/uppercase, or names that look almost the same (characters that look the same but have different Unicode code points).

But the most important reason is data storage. All the values are stored in the repository, so they can be used to look for the object. A search that ignores differences in diacritics, or a search by transliterated value, can be used even if the repository itself does not support that feature explicitly.
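The uniqueness check and the diacritics-insensitive search described above can be sketched with a toy in-memory index keyed by the normalized value. This is only an illustration of the principle. `NameIndex` and `normalize` are hypothetical names, the normalization is an approximation, and the real repository stores and queries the values differently.

```python
import unicodedata


def normalize(text: str) -> str:
    """Approximate normalization: strip diacritics, lowercase, tidy whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


class NameIndex:
    """Toy stand-in for a repository that stores orig and norm side by side."""

    def __init__(self):
        self._by_norm = {}  # norm value -> orig value

    def add(self, orig: str) -> None:
        norm = normalize(orig)
        if norm in self._by_norm:
            # Names that differ only in case or diacritics are rejected.
            raise ValueError(f"{orig!r} collides with {self._by_norm[norm]!r}")
        self._by_norm[norm] = orig

    def find(self, query: str):
        # The query is normalized the same way, so "TELEKE" matches "Téléké".
        return self._by_norm.get(normalize(query))


index = NameIndex()
index.add("Téléké")
print(index.find("TELEKE"))  # Téléké
```

Because both the stored name and the query go through the same normalization, the lookup itself is a plain equality match, which is why the underlying store needs no special support for diacritics.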

The midPoint data processing layer (Prism) is designed with syntactic shortcuts in mind, and there is a nice syntactic shortcut for PolyString as well. In all supported data formats, simple PolyStrings can be specified as plain strings:

PolyString syntactic shortcut

<user>
  <fullName>Count Felix Téléké from Tölökö</fullName>
</user>

The system will make sure that such PolyString is properly processed and normalized.
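To make the shortcut concrete, the following Python sketch reads both the full form and the shortcut form and ends up with the same pair of values. It is not how Prism parses data. `read_polystring` and `normalize` are hypothetical helpers, and the normalization is only an approximation.

```python
import unicodedata
import xml.etree.ElementTree as ET


def normalize(text: str) -> str:
    """Approximate normalization: strip diacritics, lowercase, tidy whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


def read_polystring(element):
    """Return (orig, norm) for either the full or the shortcut form."""
    orig_el = element.find("orig")
    if orig_el is None:
        # Shortcut form: the element text itself is the original value.
        orig = (element.text or "").strip()
        return orig, normalize(orig)
    orig = orig_el.text or ""
    norm_el = element.find("norm")
    # Use the stored norm if present, otherwise derive it from orig.
    norm = norm_el.text if norm_el is not None and norm_el.text else normalize(orig)
    return orig, norm


short = ET.fromstring(
    "<user><fullName>Count Felix Téléké from Tölökö</fullName></user>"
)
full = ET.fromstring(
    "<user><fullName>"
    "<orig>Count Felix Téléké from Tölökö</orig>"
    "<norm>count felix teleke from toloko</norm>"
    "</fullName></user>"
)
print(read_polystring(short.find("fullName")))
print(read_polystring(full.find("fullName")))
```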

Normalization

There are several pre-built normalizers for PolyString that can be used in a midPoint deployment. There is also an (experimental) option to create a completely custom normalizer.

See PolyString Normalization Configuration page for more details.

PolyString Localization

MidPoint has a fully localized user interface. This means that all the strings used in the user interface are localized. The user interface uses keys instead of final strings, and the keys are translated to a specific language by using translation catalogs. Such localization works well for all the strings that are present at compile time. But there are many things in midPoint that are configured during deployment. For example, it is often desirable to name objects in multiple languages: names of roles, or names of organizational units. Those objects are not present at compile time. Therefore ordinary localization techniques cannot be used on their own.

We considered this issue very early in midPoint design, and it was one of the motivations for introducing PolyString. The idea is that a PolyString can be used to store language variants. The simplest form is to specify strings for all supported languages:

<role>
  <name>
    <orig>System administrator</orig> <!-- This will be used for languages that do not have translations. -->
    <lang>
      <en>
        <orig>System administrator</orig>
      </en>
      <sk>
        <orig>Systémový správca</orig>
      </sk>
      <cz>
        <orig>Správce systému</orig>
      </cz>
    </lang>
  </name>
  ...
</role>

There may be cases when the object name is in fact already part of the localization catalogs, e.g. if the object refers to some well-known object in midPoint. In that case the PolyString can simply refer to a localization key:

<role>
  <name>
    <translation>
      <key>RelationTypes.owner</key>
    </translation>
  </name>
  ...
</role>

Or there may be a more complex case that includes translation parameters:

<case>
  <name>
    <translation>
      <key>ObjectSpecification</key>
      <argument>
        <translation>
          <key>ObjectTypeLowercase.RoleType</key>
          <fallback>RoleType</fallback>
        </translation>
      </argument>
      <argument>
        <value>Role1</value>
      </argument>
    </translation>
  </name>
  ...
</case>

Implementation

The algorithm is supposed to work like this:

1. Values from lang are processed. If there is a suitable value for the current language, that value is displayed. The values of lang are supposed to always override anything else in the PolyString.
2. Translations are processed. MidPoint looks up the translation keys in the catalog files. If a particular translation key is not found, the fallback is used. If no fallback is specified for an inner (nested) translation, the translation key itself is used. A top-level translation behaves differently: if the key cannot be found and there is no fallback, the orig value is used.
3. The orig value of the PolyString is used as the default value. It is used when there is no suitable lang value for the particular language, when there is no translation, or when the translation does not provide a meaningful result (e.g. the translation key cannot be found).
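The resolution order described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not midPoint's implementation. A PolyString is a plain dict, `lang` maps a language code directly to a string, catalogs are dicts of templates with `{0}`-style placeholders, and a fallback is treated as a literal string. The functions `resolve` and `translate`, the `arguments` key and the catalog entries are all hypothetical.

```python
def translate(translation, catalog, top_level=False):
    """Resolve one translation node; None means 'let the caller use orig'."""
    template = catalog.get(translation["key"])
    if template is None:
        if "fallback" in translation:
            return translation["fallback"]
        # Nested translations fall back to the key; top-level ones defer to orig.
        return None if top_level else translation["key"]
    args = []
    for argument in translation.get("arguments", []):
        if "translation" in argument:
            args.append(translate(argument["translation"], catalog))
        else:
            args.append(argument["value"])
    return template.format(*args)


def resolve(poly, locale, catalogs):
    """Pick the display string: lang first, then translation, then orig."""
    lang = poly.get("lang", {})
    if locale in lang:
        return lang[locale]
    if "translation" in poly:
        text = translate(poly["translation"], catalogs.get(locale, {}), top_level=True)
        if text is not None:
            return text
    return poly["orig"]


# Hypothetical catalogs; real catalog entries will differ.
catalogs = {
    "en": {"ObjectSpecification": "{0} {1}", "ObjectTypeLowercase.RoleType": "role"},
    "sk": {"ObjectSpecification": "{0} {1}"},
}
name = {
    "orig": "Role1",  # assumed default, added for illustration
    "translation": {
        "key": "ObjectSpecification",
        "arguments": [
            {"translation": {"key": "ObjectTypeLowercase.RoleType", "fallback": "RoleType"}},
            {"value": "Role1"},
        ],
    },
}
role = {"orig": "System administrator", "lang": {"sk": "Systémový správca"}}

print(resolve(name, "en", catalogs))  # role Role1
print(resolve(name, "sk", catalogs))  # RoleType Role1 (nested key missing, fallback used)
print(resolve(name, "de", catalogs))  # Role1 (top-level key missing, orig used)
print(resolve(role, "sk", catalogs))  # Systémový správca (lang overrides everything)
```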

Future

See PolyString Improvements page to learn about planned improvements to polystrings.