In GSIM 1.5, two new subtypes of Exchange Channel were added – Statistical Register and Data Harvest – and the Web Scraper Channel was removed. It was thought that this change reflects both current and new ways of exchanging information with a statistical organisation (for further information, see Issue #46). Do you have any comments on the new objects?
Feedback from countries
Country | Response in short | Feedback from country |
---|---|---|
Croatia | OK with the change | The Croatian Bureau of Statistics finds the change acceptable. |
Lithuania | OK with the change | Elimination of Web Scraping in favor of Data Harvest seems reasonable, as different possibilities become available and there are different ways to acquire the same data. The term is broad enough to encompass more options. |
Finland | OK with the change + proposal for further changes | In our opinion, the changes made in this part of GSIM are in the right direction. However, we would suggest some further changes. Firstly, we would like to combine the Administrative Register and Statistical Register into one object (perhaps a super type) called Register. Administrative Register and Statistical Register could then either be mentioned in the description of the Register –object or included as sub types. Secondly, we prefer Data Harvesting over Web Scraping. However, we are not quite sure, whether Data harvesting as a concept is still too web-focused? We suggest that in the descriptions of the exchange channels the most important examples would be mentioned. Especially new data sources, like the following: -scanner data -sensor data -other types of business data (e.g. bonus data from chain stores) |
Mexico | OK with the change + | “Data Harvest” is a more general concept than “Web Scraper Channel” to describe a way to obtain data from the Internet, data banks/databases or from other instruments, but it relates to an action, whereas the other types of information channels are related to things (“Statistical Register”, “Administrative Register”, “Questionnaire”). We think that a term like “Harvested Data” would be more compatible with this line of thought. There is a difference between the concept of “data” and the concept of “information”. A process transforms “data” into “information”. We do not agree with the change made to the “Exchange channel” definition. In our view, both concepts - with independent objects - must be considered to state a clear difference between when the “Exchange channel” is used to gather data and when it is used to deliver information. |
Australia | OK with the change + | See ABS comments on issue #46. Generalizing the former “Web Scraper” channel to Data Harvesting (including from 3rd party APIs) is strongly supported. As per comments in the wiki, ABS typically harnesses Administrative Register sources outside the ABS to maintain Statistical Registers within the ABS, rather than having exchanges with Statistical Registers beyond the ABS. Nevertheless, if other agencies (e.g. in the European context) interact with external Statistical Registers, the addition makes sense. As per the wiki post on 7 June 2018, caution is urged in regard to the suggestion that a Statistical Register within an agency might be seen as having “exchange channels” with SPs that use that Register. Routing every flow of data from an internal Statistical Support Program or from one SP to another SP via a GSIM Exchange Channel appears to risk unnecessary complexity. A focus on additional formalization where information flows in and out across the boundaries of the agency as a whole, however, appears to add value. |
Canada | OK with the change | No comments! It does reflect the new types of Exchange Channels. |
16 Comments
InKyung Choi
01 Oct, 2018 (Feedback from Norway)
We cannot see that Data Harvest is an improvement on Web Scraper Channel. If we do keep it, then could we at least change it to Data Harvester or Data Harvest Channel? Harvest sounds like the result of harvesting with a harvester. Nor do I like the Definition “A concrete and usable tool to pass information between two sources, usually by a machine to machine mechanism.” Hopefully, not much information goes from us, although it may be in the Provision Agreement. Suggest “A concrete and usable tool to pass information from one source to another, usually by a machine to machine mechanism.” I note that API is mentioned as a type of Data Harvest(er). Good to see that API is mentioned. We are increasingly required to collect information from Administrative Registers via the owners' APIs. A good example of an Environmental Change that is costing us a lot of time and resources. Google gives 12 million hits for ‘data harvesting’ and 2 million for ‘web scraping’. Maybe we should have web harvester!
InKyung Choi
23 Oct, 2018 (Feedback from Norway)
In our office we make Statistical Registers from Administrative Registers so for us it is more internal than an ‘Exchange channel used for incoming information’. It is more like a subtype of Information/Data Resource internally. Looking at your definition and explanatory text it is clear that you also regard this as internal to a statistical organisation. We strongly suspect this information object is not and should not be a subtype of Exchange Channel. We recommend that this is removed as a subtype of Exchange Channel. It is very important, but it should be in the Structures
InKyung Choi
26 Oct, 2018 (GSIM Revision Meeting 24th October, 2018)
On Data Harvest
To change definition from “A concrete and usable tool to pass information between two sources, usually by a machine to machine mechanism” to “… pass information from one source to another, usually by a machine to machine mechanism”
To add new data source example in explanatory text
On Statistical Register
Some organisations use Statistical Register for internal purpose, but Exchange Channel is not necessarily defined for "external" user, so still can be subtype of Exchange Channel
It can be both input and output - can there be more explanation about product to include this?
On Exchange Channel: to add mention of API
Mikko Saloila
06 Nov, 2018
Definition:
A Register is a regularly updated list of Units and their properties which is obtained from an external organisation (or sometimes from another department of the same organisation)
Explanatory Text:
All the Units in a Register typically have an identifier that makes it possible to update the Register with new information on the Units. Examples of Register are Administrative Register and Statistical Register.
An Administrative Register is a source of administrative information. This administrative information is usually collected for an organisation's operational purposes, rather than for statistical purposes.
A Statistical Register provides an (ideally) complete inventory of the statistical Units within a specific Population, and describes these Units using different characteristics. One example is a business register held within a statistical organization.
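The proposed supertype can be pictured with a minimal sketch. This is only an illustration of the proposal above, not part of GSIM: the class names mirror the object names, while `Unit`, `update`, the field names and the sample identifier are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """A Unit held in a Register (field names are illustrative assumptions)."""
    identifier: str
    properties: dict = field(default_factory=dict)


@dataclass
class Register:
    """Proposed supertype: a regularly updated list of Units and their properties."""
    units: dict = field(default_factory=dict)  # maps identifier -> Unit

    def update(self, identifier: str, **new_properties) -> Unit:
        # The identifier is what makes it possible to update the Register
        # with new information on a Unit; unknown identifiers add a new Unit.
        unit = self.units.setdefault(identifier, Unit(identifier))
        unit.properties.update(new_properties)
        return unit


class AdministrativeRegister(Register):
    """Information collected for an organisation's operational purposes."""


class StatisticalRegister(Register):
    """An (ideally) complete inventory of the Units within a Population."""


# Hypothetical usage: a business register receiving two updates for one Unit.
business_register = StatisticalRegister()
business_register.update("FI-0001", name="Example Oy", employees=12)
business_register.update("FI-0001", employees=15)
```

In this sketch the two subtypes differ only in purpose, which is one reading of the suggestion that they could be either subtypes or simply mentioned in the description of Register.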
Mikko Saloila
06 Nov, 2018
Above is the suggestion from me and Essi for discussion or next meeting.
BR,
Mikko and Essi
InKyung Choi
20 Nov, 2018 (GSIM Revision Meeting 14th November, 2018)
Agreed to
Mikko Saloila
26 Nov, 2018
Register Definition:
A Register is a regularly updated list of Units and their properties which is received from an external organisation (or sometimes from another department of the same organisation)
Explanatory text:
Same as before (included in a message above)
Product Definition:
A package of content that can be disseminated as a whole.
Product Explanatory text:
Product is a type of Exchange Channel for outgoing information. A Product packages Presentations of Information Sets for an Information Consumer. The Product and its Presentations are generated according to Output Specifications, which define how the information from the Information Sets it consumes are presented to the Information Consumer. (The rest of the explanatory text is as it is)
Remarks:
We noticed that (at least) the definition of Exchange Channel requires some changes in order for this to work in the model.
Mikko Saloila
26 Nov, 2018
The Exchange Channel and all the related objects need to be checked if they still match this new conception of outgoing and ingoing information.
This is quite a big change that we now say that basically all the types of Exchange Channel can go in and out.
BR,
Mikko and Essi
InKyung Choi
10 Jan, 2019
What should we do with the attribute Information Provide Identifier in Administrative Register, which does not exist in Statistical Register?
Jenny proposed removing this attribute in Issue #2-24, as it can be handled by relationships through other information objects. If that is the case, we might not need to worry about it anyway.
Jenny Linnerud
17 Jan, 2019
I still struggle with this. For me a Statistical Register does not go in or out of a statistical organisation. It is fed on a regular basis by information from Administrative Register(s) and supports statistical production. It would be a type of Information Resource or Data Resource, but not an Exchange Channel. Let's discuss this more.
Maybe we need to focus on the primary purpose/intention of the Exchange Channel. Our data collection people were confused that a Questionnaire was an ingoing channel when they knew they preprinted the questionnaire with data from inside the statistical organisation and pushed this out to reduce the response burden, but also to enable the information provider to update outdated data. Similarly, publishing a Product can also result in questions coming in to the statistical organisation for clarification, but that is not the primary purpose of the Product. They should be as self-explanatory as possible. Is the Exchange Channel primarily used to bring information in to the statistical organisation or send it out?
InKyung Choi
24 Jan, 2019
GSIM Virtual Sprint (23 Jan.)
Discussion points
Decision (compared to this version of Exchange Channel)
Object | Group | Definition | Explanatory Text |
---|---|---|---|
Product | Exchange | A package of content that can be disseminated as a whole. | A Product is a [deleted: the only defined] type of Exchange Channel for outgoing information. A Product packages Presentations of Information Sets for an Information Consumer. The Product and its Presentations are generated according to Output Specifications, which define how the information from the Information Sets it consumes are presented to the Information Consumer. The Protocol for a Product determines the mechanism by which the Product is disseminated (e.g. website, SDMX web service, paper publication). A Provision Agreement between the statistics organization and the Information Consumer governs the use of a Product by the Information Consumer. The Provision Agreement, which may be explicitly or implicitly agreed, provides the legal or other basis by which the two parties agree to exchange data. In many cases, dissemination Provision Agreements are implicit in the terms of use published by the statistical organization. For static Products (e.g. paper publications), specifications are predetermined. For dynamic products, aspects of specification could be determined by the Information Consumer at run time. Both cases result in Output Specifications specifying Information Set data or referential metadata that will be included in each Presentation within the Product. |
Object | Group | Definition | Explanatory Text | Synonyms |
---|---|---|---|---|
Questionnaire | Exchange | A concrete and usable tool to elicit information from observation Units. | This is an example of a way statistical organizations collect information (an Exchange Channel). Each collection mode (e.g. in-person, CAPI, online questionnaire) should be interpreted as a new Questionnaire derived from the Questionnaire Specification. The Questionnaire is a tool in which data is obtained. [deleted: The Questionnaire is a subtype of Exchange Channel, as it is a way in which data is obtained.] | |
Data Harvesting | Exchange | A concrete and usable tool to pass information from one source to another, usually by a machine to machine mechanism. | Examples of Data Harvesting channels include web scraping and API. | |
Object | Group | Definition | Explanatory Text |
---|---|---|---|
Statistical Register | Exchange | A Statistical Register is a register that is a regularly updated list of Units and their properties that is designed for statistical purposes. | A Statistical Register provides an (ideally) complete inventory of the [deleted: statistical] Units within a specific Population, and describes these Units using different characteristics. One example is a (statistical) business register held within a statistical organization. All the statistical Units in a Statistical Register have an identifier that makes it possible to update the Statistical Register with new information on the statistical Units. |

Object | Group | Definition | Explanatory Text | Synonyms |
---|---|---|---|---|
Administrative Register | Exchange | A source of administrative information which is obtained from an external organisation [deleted: (or sometimes from another department of the same organisation)] | The Administrative Register is a source of administrative information obtained from external organisations. The Administrative Register would be provided under a Provision Agreement with the Information Provider [deleted: supplying organisation]. This administrative information is usually collected for an organisation's operational purposes, rather than for statistical purposes. | |

Object | Group | Definition | Explanatory Text |
---|---|---|---|
Exchange Channel | Exchange | A means of exchanging information. | An abstract object that describes the means to receive (data collection) or send (dissemination) information. Different Exchange Channels are used for collection and dissemination. Examples of collection Exchange Channel include Questionnaire, Web Scraper Channel and Administrative Register. The only example of a dissemination Exchange Channel currently contained in GSIM is Product. Additional Exchange Channels can be added to the model as needed by individual organizations. |
InKyung Choi
24 Jan, 2019
GSIM Virtual Sprint (24 Jan.)
Object | Group | Definition | Explanatory Text |
---|---|---|---|
Data Harvest | Exchange | A concrete and usable tool to pass information from one source to another, usually by a machine to machine mechanism. | Examples of Data Harvest channels [deleted: are Webscraping or an API] include web scraper, API, scanner, sensor, satellite, etc. |

Data Harvesting vs. Data Harvest or Data Harvester: we should have noun-form information objects consistently throughout the GSIM model; Data Harvester has become a modern term compared to web scraping (more frequently used; source: Google).
For plenary: Data Harvest - okay? → okay
Object | Group | Definition | Explanatory Text |
---|---|---|---|
Statistical Register | Exchange | A Statistical Register is a register that is a regularly updated list of Units and their properties that is designed for statistical purposes. | A Statistical Register provides an (ideally) complete inventory of the [deleted: statistical] Units within a specific Population, and describes these Units using different characteristics. One example is [deleted: a] the statistical business register held within a statistical organization. All the [deleted: statistical] Units in a Statistical Register have an identifier that makes it possible to update the Statistical Register with new information on the [deleted: statistical] Units. |

New proposal: All the [deleted: statistical] Units in a Statistical Register have an identifier that makes it possible to update the Statistical Register with new information coming from administrative units and/or for Units. Essi Kaukonen Mikko Saloila Guillaume Duffes Eva Holm Marina Signore is this okay?
For plenary: do we want to keep the last sentence in the explanatory text of Statistical Register (without "statistical")? → see new proposal above
Object | Group | Definition | Explanatory Text |
---|---|---|---|
Exchange Channel | Exchange | A means of exchanging information. | An abstract object that describes the means to receive [deleted: (data collection)] or send [deleted: (dissemination)] information. The Exchange Channel is used for external and internal purposes. Different Exchange Channels are used for collection and dissemination. Examples of [deleted: collection] Exchange Channel for receiving information include Questionnaire, *Web Scraper Channel and Administrative Register. [deleted: The only] An example of [deleted: a dissemination] Exchange Channel for sending information [deleted: currently contained in GSIM] is Product. Additional Exchange Channels can be added to the model as needed by individual organizations. |

Object | Group | Definition | Explanatory Text |
---|---|---|---|
Administrative Register | Exchange | A source of administrative information which is obtained usually from an external organisation [deleted: (or sometimes from another department of the same organisation)] | The Administrative Register is a source of administrative information obtained usually from external organisations. The Administrative Register would be provided under a Provision Agreement with the Information Provider [deleted: supplying organisation]. This administrative information is usually collected for an organisation's operational purposes, rather than for statistical purposes. |

"usually" has been added as some statistical organisations do have administrative registers (e.g. France)
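As a rough illustration of the structure under discussion, the subtypes of Exchange Channel can be sketched as a small class hierarchy. This is a sketch under stated assumptions, not the GSIM specification: the `Direction` enum, the `directions` attribute and the `channels_for` helper are hypothetical, and the directions assigned to each subtype (in particular Statistical Register being both received and sent) are one reading of the meeting notes on this page.

```python
from abc import ABC
from enum import Enum


class Direction(Enum):
    RECEIVE = "receive"  # information coming in
    SEND = "send"        # information going out


class ExchangeChannel(ABC):
    """Abstract object: a means of exchanging information."""
    # Directions a subtype is used for; an assumption made for this sketch.
    directions: frozenset = frozenset()


class Questionnaire(ExchangeChannel):
    directions = frozenset({Direction.RECEIVE})


class DataHarvest(ExchangeChannel):
    directions = frozenset({Direction.RECEIVE})


class AdministrativeRegister(ExchangeChannel):
    directions = frozenset({Direction.RECEIVE})


class StatisticalRegister(ExchangeChannel):
    # The 24 October meeting noted it "can be both input and output".
    directions = frozenset({Direction.RECEIVE, Direction.SEND})


class Product(ExchangeChannel):
    directions = frozenset({Direction.SEND})


def channels_for(direction: Direction) -> list:
    """Names of the direct ExchangeChannel subtypes usable in a direction."""
    return sorted(
        cls.__name__
        for cls in ExchangeChannel.__subclasses__()
        if direction in cls.directions
    )


print(channels_for(Direction.RECEIVE))
print(channels_for(Direction.SEND))
```

Modelling direction as a set rather than a single value keeps the question of a channel's primary purpose, raised in the 17 Jan comment, open rather than settling it in code.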
Mikko Saloila
25 Jan, 2019
The explanatory text of Statistical Register is ok for us.
InKyung Choi
21 Feb, 2019
Reading again, the last sentence of the explanatory text for Statistical Register sounds a bit odd:
All the [deleted: statistical] Units in a Statistical Register have an identifier that makes it possible to update the Statistical Register with new information coming from administrative units and/or for Units.
Shouldn't it be:
All the [deleted: statistical] Units in a Statistical Register have an identifier that makes it possible to update the Statistical Register with new information coming from administrative units on the Units?
Jenny Linnerud
21 Feb, 2019
I suggest we change 'obtained usually' to 'usually obtained'
Jenny Linnerud
21 Feb, 2019
I preferred 'All the [deleted: statistical] Units in a Statistical Register have an identifier that makes it possible to update the Statistical Register with new information on the [deleted: statistical] Units.' Introducing a new term 'administrative units' is not making any of this clearer to me.
What I think we still lack is the definition of a r(R)egister that both Administrative Register and the Statistical Register are subtypes of.
Work on this was commenced in the Glossary work that Dan has referred to, but we do need to get GSIM v1.2 out before the entire glossary is completed. The statistical ontology work may also contribute positively, but again we need to get GSIM v1.2 out before that work is completed.