81. The definition of Information Architecture [1] used by CSPA is given below.
Information Architecture (IA) classifies the information and knowledge assets gathered, produced and used within the Business Architecture. It also describes the information standards and frameworks that underpin the statistical information. IA facilitates discoverability and accessibility, leading to greater reuse and sharing.
82. In other words, Information Architecture connects information assets to the business processes that need them and the IT systems that use and manage them. It includes relating the coherent and consistent definition of information assets at an enterprise level to the information needs of specific business processes and IT systems in practice.
83. As an industry architecture, the Information Architecture set out by CSPA must provide an agreed and actionable (rather than purely conceptual) connection between:

  • The common information frameworks and implementation standards agreed within the industry (e.g. GSIM, SDMX, DDI) [2], and
  • The practical business goals and needs to be supported under CSPA, such as the ability to share and reuse CSPA Services

84. It must support the needs of:

  • Business leaders, planners and process designers who are seeking to apply the Business Architecture from CSPA and who need to understand the connection between processes and information at a business level
  • Application architects and developers who are seeking to apply the Application Architecture from CSPA and who need to understand how CSPA Services interact with information

 

Reference frameworks and their use

85. The Information Architecture will identify common reference frameworks to be used for aligning communication and high level (conceptual) designs.

  • GSBPM will be used as a common reference when recording information in regard to business processes.
  • GSIM will be used as a common reference when defining the information input to, and output from, business processes.
  • A common reference framework for recording information with regard to the definition of CSPA Services is being developed as part of CSPA (this is the Logical Information Model).
  • There is not yet a common reference framework for describing statistical methods; this remains a gap at this stage [3].


86. The completed Information Architecture will not only identify the reference frameworks which apply but also provide guidance on how they are applied, in combination, within CSPA.

CSPA implementation specifications


Conceptual Specifications

87. A major barrier to effective collaboration within and between statistical organizations has been the lack of common terminology. Using GSIM as a common language will increase the ability to compare information within and between statistical organizations. GSIM describes, at the conceptual level, the information that the statistical production process consumes and produces.
88. Although GSIM can be used independently, it has been designed to work in conjunction with the GSBPM. It supports GSBPM and covers the whole statistical process. It is assumed in this document that an organization either uses the GSBPM or uses another business process model that can be mapped to the GSBPM.
89. In order for interoperability and reuse to be supported in practice when applying CSPA, the industry needs to do more than align conceptual designs using common frameworks. While GSIM describes the information objects relevant to statistical production, it does not provide enough detail for implementation. To describe information objects in the real world, we need standards that represent the precise logical relationships between them in a manner that is consistent with GSIM.

Logical Specifications


90. Standards such as SDMX and DDI offer examples of such detailed logical models. They identify and use many more attributes than are defined within the GSIM model.
91. As these standards and practices have evolved independently, the objects and attributes they use are similar but not consistent, and can be implemented in many different ways. There is some overlap between the standards, where a GSIM information object could be described using more than one standard, and in some cases there are information objects for which neither DDI nor SDMX is appropriate.
92. Establishing a consistent set of objects and attributes independent of the terminology used in existing standards such as SDMX and DDI requires the development of a CSPA Logical Information Model (LIM). Logical information models describe the information objects of a business without referring to specific industry standards. The logical model is not expected to be an exhaustive representation of all information objects a statistical organisation uses, but rather to focus on the information objects that have the most use cases. The LIM will:

  • Complement, and be founded upon, the existing GSIM conceptual model, and use the relevant parts of SDMX, DDI and other relevant information models;
  • Describe the relationships, attributes, data types and cardinality of GSIM information objects;
  • Support consistent use of SDMX, DDI and other implementation standards in reusable CSPA Services;
  • Make it easier for organisations that do not yet use SDMX or DDI to implement reusable CSPA Services; and
  • Be a useful way for users to precisely document the inputs and outputs of their business processes.
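As an illustration only, the kind of detail a logical model adds to a conceptual object (attributes, data types and cardinality) could be sketched as follows. This is not the CSPA LIM itself: the classes `Cardinality`, `AttributeSpec` and `LogicalObject`, and the attributes given to the "Variable" object, are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Cardinality:
    """Minimum and maximum occurrences; max_occurs=None means unbounded."""
    min_occurs: int = 1
    max_occurs: Optional[int] = 1

    def allows(self, count: int) -> bool:
        if count < self.min_occurs:
            return False
        return self.max_occurs is None or count <= self.max_occurs


@dataclass(frozen=True)
class AttributeSpec:
    """One attribute of a logical object: its name, data type and cardinality."""
    name: str
    data_type: type
    cardinality: Cardinality = Cardinality()


@dataclass
class LogicalObject:
    """A logical-level description of an information object (hypothetical)."""
    name: str
    attributes: List[AttributeSpec] = field(default_factory=list)

    def validate(self, instance: dict) -> List[str]:
        """Return the problems found in `instance` (an empty list if none)."""
        problems = []
        for spec in self.attributes:
            raw = instance.get(spec.name, [])
            values = raw if isinstance(raw, list) else [raw]
            if not spec.cardinality.allows(len(values)):
                problems.append(
                    f"{spec.name}: {len(values)} value(s) violates cardinality"
                )
            for value in values:
                if not isinstance(value, spec.data_type):
                    problems.append(
                        f"{spec.name}: expected {spec.data_type.__name__}"
                    )
        return problems


# Hypothetical attributes for illustration; not taken from GSIM or the LIM.
variable = LogicalObject(
    name="Variable",
    attributes=[
        AttributeSpec("name", str),                            # exactly one
        AttributeSpec("description", str, Cardinality(0, 1)),  # optional
        AttributeSpec("keywords", str, Cardinality(0, None)),  # zero or more
    ],
)

print(variable.validate({"name": "Age", "keywords": ["demography", "person"]}))
print(variable.validate({"description": 42}))
```

The first call reports no problems. The second reports a missing mandatory `name` and a wrongly typed `description`, which is the kind of precise documentation of inputs and outputs that the last bullet describes.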

93. While the CSPA LIM will act as a bridge at the logical level between different standards, physical representations of information objects in CSPA will rely on representations provided by the existing implementation standards.

[Figure: three layers. Conceptual: GSIM. Logical: CSPA Logical Information Model (LIM), together with relevant modelling from SDMX, relevant modelling from DDI and other relevant models. Physical: SDMX instances (SDMX-ML, SDMX-JSON, etc.), DDI-XML, DDI-RDF, XKOS-RDF and other standard instances.]

Figure 5: CSPA Logical Information Model

Physical Specifications

94. Depending on what information is being represented in practice, DDI and SDMX are expected to provide the primary basis for the physical representation of statistical information (e.g. data and metadata) in CSPA.
95. No strong recommendation has yet been made about specific standards for the logical and physical representation of business process information [4]. However, CSPA recognises that there are use cases in which process metadata would have value to the business, specifically the use of orchestration and/or workflow to execute statistical function services and the tracking of paradata for performance management. The business process information objects currently in GSIM will be described in more detail in the LIM.
96. For implementation, data and metadata may be communicated together or separately (the recommended option), provided that they can be reunited when required. If the two are separated, the data must carry a reconciling or correlating identifier that enables the corresponding metadata to be retrieved.
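A minimal sketch of the separated case is shown below, assuming a simple in-memory lookup. The names `Metadata`, `DataMessage`, `MetadataRegistry` and `reunite`, and the identifier `md-001`, are hypothetical; in practice the metadata would be held in whatever store and physical standard (for example SDMX or DDI) the organisation uses.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Metadata:
    metadata_id: str
    description: str
    unit: str


@dataclass
class DataMessage:
    metadata_id: str      # correlating identifier carried with the data
    values: List[float]


class MetadataRegistry:
    """Hypothetical store from which metadata can be retrieved by identifier."""

    def __init__(self) -> None:
        self._store: Dict[str, Metadata] = {}

    def register(self, metadata: Metadata) -> None:
        self._store[metadata.metadata_id] = metadata

    def resolve(self, message: DataMessage) -> Metadata:
        # Raises KeyError if the identifier does not match any stored metadata.
        return self._store[message.metadata_id]


def reunite(message: DataMessage, registry: MetadataRegistry) -> dict:
    """Combine a data message with the metadata its identifier points to."""
    metadata = registry.resolve(message)
    return {
        "description": metadata.description,
        "unit": metadata.unit,
        "values": message.values,
    }


registry = MetadataRegistry()
registry.register(Metadata("md-001", "Resident population", "persons"))

message = DataMessage(metadata_id="md-001", values=[1200.0, 1350.0])
combined = reunite(message, registry)
print(combined)
```

A data message whose identifier cannot be resolved fails loudly rather than being processed without its metadata, which reflects the requirement that the two can always be reunited.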

Future work and implications for statistical organisations

97. Over the long term, CSPA will provide implementation specifications on:

  • Whether SDMX, DDI or a custom schema should be used for representing a particular object of the LIM
  • Exactly how the chosen schema will be applied for the particular purpose. In many instances there are multiple technically compliant means of achieving the same business purpose; the implementation specification will specify which should be used.

98. Implementation specifications mean CSPA is prescriptive in regard to some practical details. While it would be simpler to align with CSPA if it were less prescriptive, the practical value from alignment would be much less. It is often the case that two developments which have a "common conceptual basis", but were implemented using completely unrelated approaches, are difficult and expensive to make interoperable and/or sharable (if it is possible at all).
99. In addition, an organization which has already implemented a different standard, or a local specification, can "map" its existing approach to the relevant implementation specification. It is not required to "rebuild" from first principles.
100. CSPA implementation specifications specify approaches which will support maximum interoperability/sharability on a cost-effective basis. In particular cases it may be difficult for an organization to fully comply with a CSPA implementation specification (due to operational constraints). In these cases, compliance to the extent practical will still realize benefits. In other words, while CSPA implementation specifications provide a set of expectations, it is recognized that not all implementations may be able to achieve them fully in practice.
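The idea of mapping rather than rebuilding can be sketched as a simple field-level translation. All names here are assumptions made for the example: the local field names, the target names in `LOCAL_TO_SPEC` and the function `to_spec` are hypothetical and are not taken from any CSPA implementation specification.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from a local specification's field names to the
# names used by an (equally hypothetical) implementation specification.
LOCAL_TO_SPEC: Dict[str, str] = {
    "var_nm": "name",
    "var_desc": "description",
    "uom": "unitOfMeasure",
}


def to_spec(record: dict, mapping: Dict[str, str]) -> Tuple[dict, List[str]]:
    """Translate a local record; also return the fields that have no mapping."""
    mapped = {mapping[key]: value for key, value in record.items() if key in mapping}
    unmapped = sorted(key for key in record if key not in mapping)
    return mapped, unmapped


local_record = {"var_nm": "Age", "uom": "years", "legacy_flag": "Y"}
mapped, unmapped = to_spec(local_record, LOCAL_TO_SPEC)
print(mapped)    # fields expressed in the target specification's terms
print(unmapped)  # fields the mapping does not yet cover
```

Reporting the unmapped fields makes the extent of alignment visible, so an organization can see how far its existing approach already conforms and where gaps remain.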

Information Architecture Principles


101. A number of principles which are common to most organizations' information architecture (whether formally defined or not) have been agreed. These are outlined below.
102. Principle: Manage information as an asset
Statement: Information is an asset that has value to the organization and must be managed accordingly.
103. Principle: Manage the information lifecycle
Statement: All information has a lifecycle and should be managed to provide reliable identification and versioning. All information should be managed independently of, and beyond the scope of, any single service.
104. Principle: Protect information appropriately
Statement: All personal, confidential and classified data should be protected and treated accordingly.
105. Principle: Use agreed models and standards
Statement: All information used as inputs and outputs to Statistical Services should be described using a common, business-oriented, reference model.
106. Principle: Capture information as early as possible
Statement: Information should be captured in a standard and structured manner at the earliest possible point in the statistical business process to ensure it can be used by all subsequent services.
107. Principle: Describe to ensure reuse
Statement: All information should be described in a manner that ensures information is reusable between services. Reuse is intended to reduce duplication, additional human intervention and errors.
108. Principle: Ensure there is an authoritative source
Statement: Information consumed and produced by services should be sourced and updated from a single authoritative source. Information should be consistent across all relevant services. 
109. Principle: Preserve information input into Statistical Services
Statement: Information that is input into services must be preserved to ensure no information loss. 
110. Principle: Describe information by metadata
Statement: All information consumed and produced by services must be described by sufficient metadata. 


  1. While the standards body responsible for TOGAF recognises the term "Information Architecture", the formal model underlying TOGAF refers to "Data Architecture"     
  2. The High Level Group Modernisation Committee on Standards validates the rationale for and approves GSIM implementation standards.  The current agreed implementation standards (as at December 2014) are SDMX and DDI.     
  3. The value agencies achieve through applying CSPA would be greater if such a framework existed. If a framework is agreed in future then it will be referenced in the Information Architecture.     
  4. The probable relevance of existing standards such as BPMN and BPEL is expected to be considered at a later stage of development of CSPA. However it is noted that while BPMN can provide a logical model of business processes it may not be directly implementable.     