1. Statistical organisations have to deal with many different external data sources, from (traditionally) primary data collection, via secondary data collection, to (more recently) Big Data. Each of these data sources has its own set of characteristics in terms of relationship, technical details and semantic content. At the same time, demand is changing: besides creating output as "end products", statistical organisations also create output together with other institutes.

2. Statistical organisations need to find, acquire and integrate data from both traditional and new types of data sources at an ever-increasing pace and under ever stricter budget constraints, while taking care of security and data ownership. They would all benefit from a reference architecture and guidance for the modernisation of their processes and systems.

3. A Data Architecture is defined as:

  • "A data architecture is [an architecture that is] composed of models, policies, rules or standards that govern which data is collected, and how it is stored, arranged, integrated, and put to use in data systems and in organizations." (Wikipedia)

  • "A description of the structure and interaction of the enterprise's major types and sources of data, logical data assets, physical data assets, and data management resources." (TOGAF 9, Part I)

4. Although CSDA is (loosely) based on TOGAF, it should be stressed that "data" means something different to statistical organisations from what most industries understand by it. "Data", to statistical organisations, is the raw material, the parts and components, and the finished products, rather than the information needed to support and execute the organisation's primary processes (although statistical organisations, of course, also have data that plays that role). Although the definition still applies, "data architecture" as meant in this document also has a (slightly) different scope.

