Constraints

Explanation of the diagram

The Constraint is used to specify the content of a data source (the term data source refers to both data and metadata sources), but also to specify a subset of a Codelist that may be used as a partial code list relevant in the context of the artefact to which the constraint is attached, e.g. Data Structure Definition, Dataflow, Provision Agreement.

A data source may contain data for many data or metadataflows (called DataflowDefinition and MetadataflowDefinition in the model), and the mechanisms described in this section allow an organisation to specify precisely the scope of the content of the data source where this data source is registered (SimpleDataSource, QueryDataSource).

The DataflowDefinition and MetadataflowDefinition themselves may be specified as containing only a subset of all the possible keys that could be derived from a DataStructureDefinition or MetadataStructureDefinition.

These specifications are called Constraint in this model. Any artefact that is derived from ConstrainableArtefact can have constraints defined. The artefacts that can have constraint metadata attached are:

  • DataflowDefinition
  • ProvisionAgreement
  • DataProvider – this is restricted to release calendar
  • MetadataflowDefinition
  • DataStructureDefinition
  • MetadataStructureDefinition
  • DataSet
  • SimpleDataSource – a registered data source where the registration references the actual DataSet or MetadataSet
  • QueryDataSource

Note that, because the Constraint can specify a subset of the component values implied by a specific Structure (such as a specific DataStructureDefinition or a specific MetadataStructureDefinition), the ConstrainableArtefacts must be associated with a specific Structure. Therefore, whilst the Constraint itself may not be linked directly to a DataStructureDefinition or MetadataStructureDefinition, the artefact that it is constraining will be linked to a DataStructureDefinition or MetadataStructureDefinition. As a DataProvider does not link to any one specific DSD or MSD, the type of information that can be contained in a Constraint linked to a DataProvider is restricted to the ReleaseCalendar.

The constraint mechanism allows specific constraints to be attached to a ConstrainableArtefact. With the exception of ReferencePeriod and ReleaseCalendar, these constraints specify a subset of the total set of values or keys that may be present in any of the ConstrainableArtefacts. For instance, a DataStructureDefinition specifies, for each Dimension, the list of allowable code values. However, a specific DataflowDefinition that uses the DataStructureDefinition may contain only a subset of the range of keys that is theoretically possible from the DataStructureDefinition (the total range of possibilities is sometimes called the Cartesian product of the dimension values). In addition to this, a DataProvider that is capable of supplying data according to the DataflowDefinition has a ProvisionAgreement, and the DataProvider may also wish to supply constraint information which further constrains the range of possibilities in order to describe the data that the provider can supply. It may also be useful to describe the content of a data source in terms of the KeySets or CubeRegions contained within it.
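The successive narrowing described above can be illustrated with a small sketch. It is not part of the model: the dimension identifiers, the codes, and the helper functions `all_keys` and `constrain` are hypothetical names chosen only to show how the Cartesian product of dimension values shrinks as constraints are applied at the DataflowDefinition and ProvisionAgreement levels.

```python
from itertools import product

# Hypothetical DataStructureDefinition: each dimension with its allowable codes.
dsd_dimensions = {
    "FREQ": ["A", "M"],
    "REF_AREA": ["DE", "FR", "IT"],
    "INDICATOR": ["GDP", "CPI"],
}


def all_keys(dimensions):
    """Return every key in the Cartesian product of the dimension values."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


def constrain(dimensions, allowed):
    """Narrow each dimension to the allowed values.

    Dimensions that are not mentioned in `allowed` stay unconstrained.
    """
    return {
        name: [code for code in codes if code in allowed.get(name, codes)]
        for name, codes in dimensions.items()
    }


# The full theoretical key space of the structure.
full = all_keys(dsd_dimensions)

# A dataflow that only carries annual data (assumed constraint).
dataflow_dims = constrain(dsd_dimensions, {"FREQ": ["A"]})
dataflow_keys = all_keys(dataflow_dims)

# A provision agreement that narrows the dataflow further (assumed constraint).
provision_dims = constrain(dataflow_dims, {"REF_AREA": ["DE", "FR"]})
provision_keys = all_keys(provision_dims)

print(len(full), len(dataflow_keys), len(provision_keys))
```

Each level can only remove possibilities from the level above it, which is why the provision-level constraint is applied to the already constrained dataflow rather than to the original structure.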

A ConstrainableArtefact can have two types of Constraint:

  • ContentConstraint – is used solely as a mechanism to specify either the available set of keys (DataKeySet, MetadataKeySet) or set of component values (CubeRegion, MetadataTargetRegion) in a DataSource such as a DataSet or a database (QueryDataSource), or the allowable keys that can be constructed from a DataStructureDefinition. Multiple such constraints may be present for a ConstrainableArtefact. For instance, there may be a ContentConstraint that specifies the values allowed for the ConstrainableArtefact (role is allowableContent), which can be used for validation or for constructing a partial code list, whilst another constraint can specify the actual content of a data or metadata source (role is actualContent).
  • AttachmentConstraint – is used as a mechanism to define slices of the full set of data to which metadata can be attached in a DataSet or MetadataSet. These slices can be defined either as a set of keys (KeySet) or a set of component values (CubeRegion). There can be many AttachmentConstraints specified for a specific AttachableArtefact.

In addition to DataKeySet, MetadataKeySet, CubeRegion and MetadataTargetRegion, a Constraint can have a ReferencePeriod defining one or more date ranges (ValidityPeriod) specifying the time period for which data or metadata are available in the ConstrainableArtefact, and a ReleaseCalendar specifying when data are released for publication or reporting.

A Constraint is a MaintainableArtefact. A Constraint has a choice of two ways of specifying value subsets:

  1. As a set of keys that can be present in the DataSet (DataKeySet) or MetadataSet (MetadataKeySet). Each DataKey or MetadataKey specifies a number of ComponentValues, each of which references a Component (e.g. Dimension, TargetObject). Each ComponentValue is a value that may be present for a Component of a structure when contained in a DataSet or MetadataSet. The MetadataKeySet must also identify the MetadataTarget, as there can be many of these in a MetadataStructureDefinition. For the DataKeySet the equivalent identification is not necessary, as there is only one DimensionDescriptor and one AttributeDescriptor.
  2. As a set of CubeRegions or MetadataTargetRegions, each of which defines a “slice” of the total structure (MemberSelection) in terms of one or more MemberValues that may be present for a Component of a structure when contained in a DataSet or MetadataSet.
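The two options can be contrasted with a minimal sketch. The dictionaries and the functions `in_key_set` and `in_cube_region` are illustrative assumptions, not constructs of the model: a key set enumerates complete keys, while a cube region lists permitted values per component and so implies every combination of those values.

```python
from itertools import product

# Option 1 (assumed data): a key set lists complete keys explicitly.
data_key_set = [
    {"FREQ": "A", "REF_AREA": "DE", "INDICATOR": "GDP"},
    {"FREQ": "M", "REF_AREA": "FR", "INDICATOR": "CPI"},
]

# Option 2 (assumed data): a cube region lists permitted values per component.
cube_region = {
    "FREQ": {"A", "M"},
    "REF_AREA": {"DE", "FR"},
    "INDICATOR": {"GDP", "CPI"},
}


def in_key_set(key, key_set):
    """A key matches only if it is one of the enumerated complete keys."""
    return key in key_set


def in_cube_region(key, region):
    """A key matches if each constrained component holds a permitted value."""
    return all(key[component] in values for component, values in region.items())


# Every combination implied by the cube region.
region_keys = [
    dict(zip(cube_region, combo))
    for combo in product(*(sorted(v) for v in cube_region.values()))
]

probe = {"FREQ": "A", "REF_AREA": "FR", "INDICATOR": "CPI"}
print(in_key_set(probe, data_key_set))    # not one of the listed keys
print(in_cube_region(probe, cube_region))  # every value is permitted
print(len(data_key_set), len(region_keys))
```

The probe key uses only values that occur in the key set, yet it is not one of the listed combinations, so it satisfies the cube region but not the key set.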

The difference between (1) and (2) above is that in (1) a complete key is defined whereas in (2) above the “slice” defines a list of possible values for each of the Components but does not specify specific key combinations. In addition, in (1) the association between Component and DataKeyValue or MetadataKeyValue is constrained to the components that comprise the key or identifier, whereas in (2) it can contain other component types (such as attributes). The value in ComponentValue.value and MemberValue.value must be consistent with the Representation declared for the Component in the DataStructureDefinition or MetadataStructureDefinition. Note that in all cases the “operator” on the value is deemed to be “equals”. Furthermore, it is possible in a MemberValue to specify that child values (e.g. child codes) are included in the constraint by means of the cascadeValues attribute.
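The effect of cascadeValues can be sketched as follows. The hierarchy, the codes, and the function `expand` are hypothetical, and the sketch assumes that cascading selects the stated value together with all of its descendants; the text above only says that child values are included.

```python
# Hypothetical hierarchical code list: parent code -> direct child codes.
children = {
    "EU": ["DE", "FR"],
    "DE": ["DE1", "DE2"],
}


def expand(value, cascade, hierarchy):
    """Return the set of codes selected by a single member value.

    Without cascading only the value itself is selected. With cascading
    (assumed semantics) the value and all of its descendants are selected.
    """
    selected = {value}
    if cascade:
        stack = [value]
        while stack:
            for child in hierarchy.get(stack.pop(), []):
                if child not in selected:
                    selected.add(child)
                    stack.append(child)
    return selected


print(sorted(expand("EU", cascade=False, hierarchy=children)))
print(sorted(expand("EU", cascade=True, hierarchy=children)))
```

In both cases the comparison applied to each selected code remains equality, in line with the "equals" operator noted above.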

It is possible to define for the DataKeySet, DataKey, MetadataKeySet, MetadataKey, CubeRegion, MetadataTargetRegion, and MemberSelection whether the set is included (isIncluded = “true”) or excluded (isIncluded = “false”) from the constraint definition. This attribute is useful if, for example, only a small subset of the possible values is not included in the set: this smaller subset can then be defined and excluded from the constraint. Note that if the child construct is “included” and the parent construct is “excluded”, then the child construct is included in the list of constructs that are “excluded”.
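A minimal sketch of the exclusion case, under stated assumptions: the region is represented as a plain dictionary with a hypothetical `is_included` flag standing in for isIncluded, and the helpers `keys_of`, `matches` and `apply_region` are illustrative names. The sketch covers a single region only and does not model the nesting of included and excluded parent and child constructs.

```python
from itertools import product

# Hypothetical structure with two dimensions.
dimensions = {"FREQ": ["A", "M"], "REF_AREA": ["DE", "FR", "IT"]}


def keys_of(dims):
    """Return every key in the Cartesian product of the dimension values."""
    names = list(dims)
    return [dict(zip(names, combo)) for combo in product(*dims.values())]


def matches(key, members):
    """True if the key holds a listed value for every constrained component."""
    return all(key[component] in values for component, values in members.items())


def apply_region(keys, region):
    """Keep the matching keys if the region is included, else drop them."""
    if region["is_included"]:
        return [k for k in keys if matches(k, region["members"])]
    return [k for k in keys if not matches(k, region["members"])]


# Only one value is unavailable, so it is shorter to state the exclusion.
excluded_region = {"is_included": False, "members": {"REF_AREA": {"IT"}}}

remaining = apply_region(keys_of(dimensions), excluded_region)
print(len(remaining))
```

Stating the one excluded value is shorter here than listing the two included values, and the saving grows with the size of the code list.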
