Completeness

This is a snapshot of the specification created during the midPrivacy: provenance prototype project. For the latest development version of the document, please look in the Axiom workspace.

When we deal with data, we usually have complete and reliable information about it. However, that is not always so. The cases to consider include:

  • We know that the item has values X, Y and Z. It has no other values. This is the common case.

  • We know that the item has no value. E.g. we are sure that there are no criminal records for this person.

  • We know nothing about the item.

  • We know that the item recently had values X, Y, Z. E.g. values that were removed by a particular mapping.

  • We know that the item has some value, but we do not know the value (the value is "unknowable"). E.g. there is a hashed password, we do not even want to disclose the hash, but we want to indicate that the password is already set.

  • We know that the item has some value. We do not know the value now, but we can easily find out when needed. E.g. values that are not returned by a query because they are big or expensive, although we can easily construct a query that requests them explicitly. Or values for which we have an expression that can determine them, where we do not want to execute the expression unless the value is needed.

  • We know that the item has value X, but the item may also have other values that we do not know.

Axiom has concepts of item completeness and value significance to denote such cases:

  • Item completeness. Item can be:

    • Complete (default): we have all the values of the item. We are sure there are no more values than those that we have.

    • Incomplete: we do not have all the values of the item. There may be more values of the item, and we do not know anything about such values.

  • Value significance. Value can be:

    • Positive (default): We know the value, and it is the normal, usual value of the item.

    • Negative: We know that the item used to have the value, but it may no longer have that value. E.g. the value was removed by a mapping.

    • Potential: Item does not have this value, but it might have it. For example, the value was generated by the mapping, but the mapping condition (or "strength") prohibited setting the value. This can be a significant benefit for diagnostics and troubleshooting. It may also be useful for system administration, e.g. when a "value override" is in place, this may show the value that would be present if the override was not active.

    • Default: Item does not have this value, but it will have this value (or this value will be assumed) if no other value is explicitly specified. This kind of value can be used by the user interface to inform the user about the default setting, e.g. by pre-filling the field with gray text or by similar means. It is very likely to improve the overall user experience. It differs from the potential significance above, as the default significance clearly defines the condition under which this value becomes a positive value. Therefore the user interface can precisely simulate the behavior.

    • Unknown: We know that the item has a value, but we do not know what the value is. E.g. the value may be hashed or encrypted by an unknown key, the value may be determined by an expression that was not evaluated yet and so on.
      Question: do we need shades of meaning here? E.g. unknowable, expensive, dynamic
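
The two concepts above can be sketched as a small data model. This is only an illustration under stated assumptions, not part of Axiom: the class names (Completeness, Significance, Value, Item) and the helper positive_values are hypothetical. Only the enumeration literals and the defaults (complete, positive) come from the text above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, List, Optional


class Completeness(Enum):
    COMPLETE = "complete"      # default: there are no values beyond those listed
    INCOMPLETE = "incomplete"  # there may be more values that we know nothing about


class Significance(Enum):
    POSITIVE = "positive"    # default: known, normal value of the item
    NEGATIVE = "negative"    # the item used to have this value
    POTENTIAL = "potential"  # the item does not have this value, but it might
    DEFAULT = "default"      # assumed if no other value is explicitly specified
    UNKNOWN = "unknown"      # the item has a value, but we do not know it


@dataclass
class Value:
    # Hypothetical container: content stays None for "unknown" values.
    content: Optional[Any] = None
    significance: Significance = Significance.POSITIVE


@dataclass
class Item:
    # Hypothetical container: completeness belongs to the item, not to a value.
    name: str
    values: List[Value] = field(default_factory=list)
    completeness: Completeness = Completeness.COMPLETE

    def positive_values(self) -> List[Any]:
        """Return the contents of the values the item actually has and we know."""
        return [v.content for v in self.values
                if v.significance is Significance.POSITIVE]


item = Item("description", [Value("current text"),
                            Value("old text", Significance.NEGATIVE)])
print(item.positive_values())
```

Keeping completeness on the item and significance on each value mirrors the split described above: one flag says whether the list of values is exhaustive, the other says how each listed value should be read.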

Completeness and significance can be combined to describe what we know (and what we do not know) about the item:

| | Item complete | Item incomplete |
|---|---|---|
| Value positive | Normal data | We know that item has some values, but it may also have other values that we do not know. |
| Value negative | Value was removed by the delta. We know all the remaining values (if any). | Value was removed, but we have no information about other values. |
| Value unknown | We know that the item has a value, but we do not know the value (the value is "unknowable"). E.g. hashed password ("unknowable" value), value that is not returned (expensive value), expression (dynamic value) | We know that the item has (unknown) value, but it may also have other values. |
| No value or null value | We are sure that item has no value. E.g. "no criminal records" | We do not know anything about the item. |
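
The combinations can also be read as a simple lookup from (completeness, significance) to a statement about what we know. The sketch below is illustrative only: the names KNOWLEDGE and describe are hypothetical, and using None to stand for "no value or null value" is an assumption of this example.

```python
# (completeness, significance) -> what we know about the item.
# A significance of None stands for "no value or null value" (assumption).
KNOWLEDGE = {
    ("complete", "positive"): "normal data",
    ("incomplete", "positive"): "some values known, other values may exist",
    ("complete", "negative"): "value removed, all remaining values known",
    ("incomplete", "negative"): "value removed, nothing known about other values",
    ("complete", "unknown"): "item has a value that we do not know",
    ("incomplete", "unknown"): "item has an unknown value, other values may exist",
    ("complete", None): "item has no value",
    ("incomplete", None): "nothing is known about the item",
}


def describe(completeness, significance):
    """Return the statement for one combination of completeness and significance."""
    return KNOWLEDGE[(completeness, significance)]


print(describe("complete", None))
print(describe("incomplete", None))
```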

Examples

Use case: jpegPhoto was not fetched from repository and we do not know whether it has a value or not.

XML, full namespaces
    ...
    <jpegPhoto xsi:nil="true">
        <axiom:completeness>incomplete</axiom:completeness>
    </jpegPhoto>
    ...
XML, minimal namespaces
    ...
    <jpegPhoto nil="true">
        <_completeness>incomplete</_completeness>
    </jpegPhoto>
    ...

The completeness is a property of the jpegPhoto item, not of a particular value. However, we cannot express data about an item if there is no value present, hence the nil. The nil indicates that this value is not really a value.

JSON
  ...
  "jpegPhoto" : {
    "@value" : null,
    "@completeness" : "incomplete"
  }

The "@value" : null entry may be optional: because this is a hash and no value is specified in it, the absence of a value should be obvious.
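
A consumer of the JSON form might read such an item as sketched below. This is a minimal sketch under assumptions, not an Axiom API: the function read_item is hypothetical, the "name" property is invented sample data, and treating a plain JSON value as shorthand for a complete item is an assumption based on "complete" being the default.

```python
import json


def read_item(raw):
    """Return (value, completeness) for one serialized item.

    A hash with "@"-prefixed keys carries the value and its completeness.
    A missing "@value" is read as null and a missing "@completeness"
    as the default "complete". Anything else is taken as a plain value.
    """
    if isinstance(raw, dict) and any(key.startswith("@") for key in raw):
        return raw.get("@value"), raw.get("@completeness", "complete")
    return raw, "complete"


doc = json.loads('{"name": "jack", "jpegPhoto": {"@completeness": "incomplete"}}')
print(read_item(doc["name"]))       # ('jack', 'complete')
print(read_item(doc["jpegPhoto"]))  # (None, 'incomplete')
```

Here the "@value" key is omitted from the jpegPhoto hash, and the reader still reports that no value is available.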

Use case: password is present, but we cannot or do not want to disclose the value. However, we want to indicate that there is a password.

XML, full namespaces
    ...
    <password xsi:nil="true">
        <axiom:significance>unknown</axiom:significance>
    </password>
    ...
XML, minimal namespaces
    ...
    <password nil="true">
        <_significance>unknown</_significance>
    </password>
    ...
JSON
  ...
  "password" : {
    "@significance" : "unknown"
  }
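
A user interface that only needs to show whether a password is set could test the serialized item as follows. The helper known_to_have_value is hypothetical, and the rule it applies (only positive and unknown values count as values the item has) is an interpretation of the significance definitions in this document, not a confirmed Axiom behavior.

```python
def known_to_have_value(raw):
    """Return True if the serialized item is known to hold a value.

    An "unknown" significance means a value exists even though it is
    not disclosed. Negative, potential and default values are not
    values that the item currently has.
    """
    if not isinstance(raw, dict):
        return raw is not None
    significance = raw.get("@significance", "positive")  # positive is the default
    if significance == "unknown":
        return True
    return significance == "positive" and raw.get("@value") is not None


print(known_to_have_value({"@significance": "unknown"}))                      # True
print(known_to_have_value({"@value": None, "@completeness": "incomplete"}))  # False
```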

Metadata Of Incomplete And Negative Values

Value significance is used to denote a negative value. Metadata are attached as usual.

Metadata serialized with data:

XML, full namespace
    ...
    <description>
        <axiom:value>This was all wrong, it is gone now</axiom:value>
        <axiom:significance>negative</axiom:significance>
        <axiom:metadata>
            <midpoint:transformation>
                <midpoint:mapping>...</midpoint:mapping>
            </midpoint:transformation>
        </axiom:metadata>
    </description>
    ...
XML, minimal namespace
    ...
    <description>
        <_value>This was all wrong, it is gone now</_value>
        <_significance>negative</_significance>
        <_metadata>
            <midpoint:transformation>
                <mapping>...</mapping>
            </midpoint:transformation>
        </_metadata>
    </description>
    ...
JSON
  ...
  "description" : {
    "@value" : "This was all wrong, it is gone now",
    "@significance" : "negative",
    "@metadata" : {
      "http://.../midpoint#transformation" : {
        "mapping" : ...
      }
    }
  }