all definitions |
discriminator |
Item whose values will be used to segment objects into buckets (if applicable).
Usually required. |
matchingRule |
Matching rule to be applied when creating filters (if applicable).
Optional. |
numberOfBuckets |
Number of buckets to be created (if applicable).
Optional. |
numericSegmentation |
from |
Start of the processing space (inclusive).
If omitted, 0 is assumed. |
to |
End of the processing space (exclusive).
If not present, both bucketSize and numberOfBuckets must be defined, and the end of the processing space is determined as their product.
In the future, this value might be determined dynamically, e.g. by counting the objects to be processed. |
bucketSize |
Size of one bucket.
If not present, it is computed as the total processing space divided by the number of buckets (i.e. to and numberOfBuckets must be present). |
stringSegmentation |
boundaryCharacters |
Characters that make up the prefix or interval.
Currently, string segmentation is done by creating all possible boundaries (by combining boundaryCharacters) and then using these boundaries either as interval boundaries (if comparisonMethod is interval) or as prefixes (if comparisonMethod is prefix).
This is a multivalued property:
- The first value contains the characters that occupy the first place in the boundary.
- The second value contains the characters destined for the second place, etc.
- If boundaryCharacters = ("qx", "0123456789", "0123456789", "0123456789"), then the following boundaries are generated: q000, q001, q002, …, q999, x000, x001, …, x999.
- This might be suitable e.g. for accounts that start either with "q" or with "x" and then continue with numbers, like q732812.
- If boundaryCharacters = ("abcdefghijklmnopqrstuvwxyz", "0123456789abcdefghijklmnopqrstuvwxyz"), then the following boundaries are generated: a0, a1, a2, …, a9, aa, ab, …, az, b0, b1, …, b9, ba, …, bz, …, z0, z1, …, z9, za, …, zz.
- This might be suitable e.g. for alphanumeric account names that always start with an alphabetic character.
Beware: the current implementation requires that the characters are specified in an order that complies with the matching rule used.
Otherwise, empty intervals might be generated. For example, when using "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ", there will be an interval such as "values greater than z but lower than A" (an empty one) or "values greater than Z" (which covers items already covered by the earlier intervals a-b, b-c, …).
|
depth |
If a value N greater than 1 is specified here, boundaryCharacters values are repeated N times.
This means that if the values V1, V2, …, Vk are specified, the resulting sequence is V1, V2, …, Vk, V1, V2, …, Vk, etc., with N repetitions, so N × k values in total. |
comparisonMethod |
Either interval (the default), resulting in interval queries like item >= 'a' and item < 'b', or prefix, resulting in prefix queries like item starts with 'a'. Beware: when using the prefix method, make sure that all discriminator values are covered by the boundaryCharacters you specify.
Otherwise, some items will not be processed at all. |
oidSegmentation |
The same as stringSegmentation but providing defaults of discriminator = # and boundaryCharacters = 0-9a-f (repeated depth times, if needed). |
explicitSegmentation |
content |
Explicit content of work buckets to be used.
This is useful e.g. when dealing with filter-based buckets.
But any other bucket content (e.g. numeric intervals, string intervals, string prefixes) might be used here as well. |
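
The rules for numericSegmentation and stringSegmentation described above can be illustrated with a small Python sketch. This is not the actual implementation, only a model of the behavior as the table describes it. The function names numeric_buckets and string_boundaries are hypothetical. The sketch also makes two assumptions that the table does not state: "total processing space" is taken to mean to minus from, and a computed bucket size is rounded up so that the whole space is covered.

```python
from itertools import product


def numeric_buckets(from_=0, to=None, bucket_size=None, number_of_buckets=None):
    """Return (start, end) pairs; start is inclusive, end is exclusive."""
    if to is None:
        if bucket_size is None or number_of_buckets is None:
            raise ValueError("without 'to', bucketSize and numberOfBuckets are both required")
        # Per the description of 'to': the end is the product of the two values.
        to = bucket_size * number_of_buckets
    if bucket_size is None:
        if number_of_buckets is None:
            raise ValueError("without bucketSize, 'to' and numberOfBuckets are both required")
        # Assumption: the processing space is to - from, and the size is rounded up.
        bucket_size = -(-(to - from_) // number_of_buckets)
    if bucket_size <= 0:
        raise ValueError("bucket size must be positive")
    return [(start, min(start + bucket_size, to))
            for start in range(from_, to, bucket_size)]


def string_boundaries(boundary_characters, depth=1):
    """Combine one character per position into every possible boundary string."""
    # 'depth' repeats the whole list of values: V1..Vk becomes V1..Vk, V1..Vk, ...
    positions = list(boundary_characters) * depth
    return ["".join(combo) for combo in product(*positions)]


if __name__ == "__main__":
    print(numeric_buckets(to=10, number_of_buckets=4))
    accounts = string_boundaries(["qx", "0123456789", "0123456789", "0123456789"])
    print(len(accounts), accounts[0], accounts[-1])
```

Because the boundaries are produced in the order in which the characters are listed, listing the characters in an order that does not match the matching rule yields boundaries that are out of order for the comparison, which is how the empty intervals mentioned in the boundaryCharacters description arise.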