Issue #3-2 from Metadata Glossary team - Business Group

The Metadata Glossary team reviewed the definitions and explanatory text of GSIM (v1.2) information objects and made the following comments for the next GSIM revision team to consider.

The writing conventions applied to definitions include:

Spelling : UK English will be used (example: organisation instead of organization)
No leading articles in definitions : Any leading "A", "An", "The", etc. will be removed. A definition should be able to substitute grammatically for its term; starting with an article or writing a full sentence prevents this
Definitions will not start with the concept to be defined (e.g. "A Classification Family is...")
Definitions will start with lowercase and have no ending dot
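
The conventions above are mechanical enough to check automatically. The following is a minimal sketch, not a tool used by the team: the function name `check_definition`, the `known_objects` allow-list (for definitions that legitimately open with a capitalised GSIM object name) and the small US/UK spelling sample are all assumptions made for illustration.

```python
import re

# Hypothetical helper: checks a draft definition against the conventions listed above.
LEADING_ARTICLES = ("a", "an", "the")
# Tiny illustrative sample of US -> UK spellings; not an exhaustive list.
US_TO_UK = {"organization": "organisation", "parameterized": "parameterised"}


def check_definition(term, definition, known_objects=()):
    """Return a list of convention violations found in a draft definition."""
    text = definition.strip()
    words = text.split()
    if not words:
        return ["definition is empty"]
    problems = []
    body = text
    if words[0].lower() in LEADING_ARTICLES:
        problems.append("starts with a leading article")
        # Drop the article before testing whether the term opens the definition.
        body = text[len(words[0]):].lstrip()
    if body.lower().startswith(term.lower()):
        problems.append("starts with the concept being defined")
    # Assumption: a capitalised GSIM object name at the start is acceptable.
    if text[0].isupper() and not text.startswith(tuple(known_objects)):
        problems.append("starts with an uppercase letter")
    if text.endswith("."):
        problems.append("ends with a dot")
    for us, uk in US_TO_UK.items():
        if re.search(rf"\b{us}\b", text, flags=re.IGNORECASE):
            problems.append(f"uses US spelling '{us}' (UK: '{uk}')")
    return problems


print(check_definition("Classification Family", "A Classification Family is a group."))
print(check_definition(
    "Business Function",
    "activities undertaken by a statistical organisation to achieve its objectives",
))
```

The first call reports four violations and the second reports none. The allow-list matters because several proposed definitions (e.g. for Information Request) start with the name of another GSIM object.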

Business Group

Business Case: (Meeting 24 September 2019) we propose to move the second sentence to the explanatory text
Business Function: (Meeting 24 September 2019) we propose to change definition to "activities undertaken by a statistical organisation to achieve its one or more objectives"
Information Request: (Meeting 24 September 2019) we propose to change definition to "Statistical Need that is a request for new information for a particular purpose"
Process Design: (Meeting 3 September 2019) the current definition does not describe the essence of process design. It should be reformulated. Closer approximation: "description of the arrangement of steps needed to perform a business function"
Process Execution Log: (Meeting 3 September 2019) in definition 1) second sentence should be in explanatory text; 2) proposed to change to "recording of a multi-set of process traces, i.e. a list of events generated by process instances"
Process Input: (Meeting 3 September 2019) we propose to change definition to "any instance of an information object supplied to a Process Step Instance at the time its execution is initiated" (removing "which is")
Process Pattern: (Meeting 24 September 2019) we propose to change definition to "named set of Process Designs that is highlighted for possible reuse"
Process Step Instance: (Meeting 3 September 2019) we propose following 1) definition: "executed step in a Business Process specifying the actual inputs to and outputs from an occurrence of a Process Step"; 2) explanatory text: change "For this reason, each Process Step Instance details the inputs and outputs for that instance of the implementation of the Process Step." (remove "of" and replace "implementing")
Statistical Need: (Meeting 24 September 2019) in definition, we propose 1) to move the second and third sentences to explanatory text (also fix error "Environmental Change", it should be "Environment Change"); 2) to change first sentence to "requirement, request, or other notification that will be considered by a statistical organisation"
Statistical Program: (Meeting 24 September 2019) we propose to change definition to "set of activities, which may be repeated, that describe the purpose and context of a set of Business Processes within the relevant Statistical Programme Cycles"
Statistical Support Program: (Meeting 24 September 2019) we are concerned about how the definition is written (e.g. definition written in negative and doesn’t really make sense – i.e., “not related to post-design production” but “necessary to support production”? if it “supports”, how can it not then be “related”? ) Can you reformulate it?
Transformable Input: (Meeting 3 September 2019) we propose to move the second sentence with change (".. content may be represented..") to explanatory text

23 Comments

InKyung Choi
25 Nov, 2020
(feedback from Linking GSBPM-GSIM task team, meeting 29 Sept. 2020)
Regarding Change Definition - for those who are not familiar with GSIM vocabulary, the name of the object is quite confusing. We should ask the GSIM group what they think about changing the name
InKyung Choi
04 May, 2021
Business Function
(GSIM task team meeting, 25 November, 2020)
- Metadata Glossary team proposed to change the definition from “Something an enterprise does, or needs to do, in order to achieve “ to “activities undertaken by a statistical organisation to achieve its one or more objectives” => agree, but remove "one or more"
(GSIM task team meeting, 16 December, 2020)
- Isn't Business Function about purpose or objective (e.g. imputation can be Business Process and its purpose (e.g. correction) can be Business Function)? If it is about activities, how is it differentiated from Business Process (which is also about activities)?
- Business Function is aligned with purpose, but it is not about purpose; rather, it is a statement or description of what statistical organisations do (activities). Business Process shows how the function can be done and involves flow, sequencing or steps. Business Function can be either high level or low level, but it is just a set of connected statements describing what we do without the "how" part. Business Function is about "what" while Business Process is about "how"
- It would be helpful to add text clarifying this point in the explanatory text (e.g. "Business Functions answer in a generic sense “what the statistical organization does”. And Business Process answers the question of “how”")
- GSBPM sub-processes can be used as a catalogue of Business Functions which are arranged in a certain order to create a sequence (Business Process), and when we describe this Business Process, the Business Functions used in the actual process become Process Steps
- There is an error in explanatory text: sub-process 5.3 name is from old version of GSBPM
InKyung Choi
14 Dec, 2021
Business Process
(GSIM task team meeting, 10 Feb. 2021)
Relationship between Process Design, Business Process and Business Function
- Business Function is a high-level description of what business is doing and Business Process is how it is implemented. This is in line with what Metadata Glossary team defined (Business Function can be considered as activity; Business Process can be considered as process). Explanatory text should be checked to make sure they are in line with this notion
  Activity: what we do
  Process: how we do an activity
  Capability: what enables us to do an activity
- The name “Process Design” can be confusing, perhaps use algorithm? => if we replace Process Design with algorithm, we need to do a whole re-design of this part of the model, discuss how algorithm should be defined, check how it can be linked to other objects - which is out of scope for this team. We can have algorithm as part of our design, in the same way we have Rule (algorithm is indeed an attribute of Rule because Rule can be expressed as an algorithm) and Process Method. DDI introduced algorithm, during next GSIM revision, we can take into account how it works in DDI and discuss whether to introduce in GSIM
- It makes sense to bring Business Process near to Process Design, Business Function is abstract level of what is being implemented, if we want to go to Process Steps, we need to go through Business Process
- Decision: given that we agree on the new definition of Process Design ("specification of each Process Step and description of their arrangement in a Business Process needed to perform a Business Function"), keep the association between Business Function and Process Design; add an association between Business Process and Process Design; update cardinality of Process Step linked to Process Design (Process Step should be required, not optional)
(GSIM task team meeting, 16 Dec. 2020)
Business Process and Process Step
- Business Process is defined as “set of Process Steps” and Process Step is defined as “work package that performs a Business Process”. This is circular, needs to be reviewed later.
Business Process attribute “Date Initiated” and “Date Ended”
- The name of attributes and their description (“First date of validity” and “Last date of validity”) do not seem to match. Also, if they are really about “validity”, these can be covered by attributes “Valid From” and “Valid To” of Administrative Details.
- First, remove descriptions (as “validity” should be covered by Administrative Details), check if there was any action done during last GSIM revision about these attributes, add description if we really need to keep these attributes
- There are several objects with missing attribute descriptions, we can take this work off-line.
(GSIM task team meeting, 20 October 2021)
- Current definition says "set of Process Steps to perform one of more Business Functions to deliver a Statistical Program Cycle or Statistical Support Program", it does not make sense that Business Process is "to deliver Statistical Program Cycle".
- ISO definition of business process is: "series of process steps that are related to each other that use inputs in order to achieve planned output"
- It is not necessary to link Business Process with Statistical Program Cycle. Business Function covers the notion of objectives and purpose.
- Decision: to delete "to deliver a Statistical Program Cycle or Statistical Support Program", but make Business Function required (i.e., change cardinality from 0..* to 1..*)
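
The cardinality decision above can be illustrated with a small sketch. The class and attribute names are assumptions chosen to mirror the GSIM object names, not part of any GSIM implementation; the imputation/correction example is taken from the Business Function discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessFunction:
    name: str


@dataclass(frozen=True)
class ProcessStep:
    name: str


@dataclass
class BusinessProcess:
    """Set of Process Steps to perform one or more Business Functions."""
    name: str
    process_steps: list
    business_functions: list

    def __post_init__(self):
        # Cardinality 1..*: at least one Business Function is required.
        if not self.business_functions:
            raise ValueError(
                "a Business Process needs at least one Business Function (1..*)"
            )


imputation = BusinessProcess(
    name="imputation",
    process_steps=[ProcessStep("impute missing values")],
    business_functions=[BusinessFunction("correction")],
)
```

Under the previous 0..* cardinality the check in `__post_init__` would simply be absent, and a Business Process with an empty list of Business Functions would be valid.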
InKyung Choi
04 May, 2021
Change Definition
(GSIM task team meeting 25 Nov. 2020)
- Linking GSBPM and GSIM task team asked to review the name of the object as it is hard to guess what it really means by its name
- Change Proposal => at the stage where we use this object, we pass the point of proposal, at this stage, we need to write something out precisely
- Change Specification => this is more precise, but there are two issues: 1) there is "specification" in the definition, we need to rephrase this; 2) if we use "Change Specification", it can be associated with other GSIM objects such as "Process Input Specification", "Process Output Specification" and "Output Specification" and create confusion => To come back to this issue later.
(GSIM task team meeting 16 Dec. 2020)
- Agreed not to change the name of the object, the name conveys what its definition is describing. None of potential names work for all people. Need to direct people to read definition and explanatory text.
InKyung Choi
04 May, 2021
Environment Change
(GSIM task team meeting 14 April 2020)
- Attributes of Environment Change (“Legal Changes”, “Method Changes”, “Software Changes” and “Other Changes”): to remove and create a single attribute for the type and text describing the change
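
One possible reading of this proposal, sketched below, is a type attribute paired with free text. The names `ChangeType`, `change_type` and `description` are hypothetical; the team did not specify attribute names.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeType(Enum):
    # Values mirror the four attributes that are proposed for removal.
    LEGAL = "legal"
    METHOD = "method"
    SOFTWARE = "software"
    OTHER = "other"


@dataclass(frozen=True)
class EnvironmentChange:
    change_type: ChangeType  # replaces the four separate "... Changes" attributes
    description: str         # free text describing the change


change = EnvironmentChange(ChangeType.LEGAL, "new regulation on data retention")
```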
InKyung Choi
04 May, 2021
Information Request
(GSIM task team meeting 16 Dec. 2020)
- The definition has changed to "Statistical Need that is a request for new information for a particular purpose" following comment of Metadata Glossary task team
InKyung Choi
04 May, 2021
Parameter Input
(GSIM task team meeting 7 Oct 2020)
- Original definition: Inputs used to specify which configuration should be used for a specific Process Step which has been designed to be configurable.
- Proposed definition: auxiliary Process Input that specifies the run-time configuration used in a parameterized Process Step Instance
(GSIM task team meeting 16 Dec 2020)
- The definition has changed to “auxiliary Process Input that specifies the run-time configuration used in a parametrised Process Step Instance” following discussion with Linking GSBPM/GSIM task team
- There is a disconnection between definition and explanatory text because definition talks about Process Step Instance (run-time) while explanatory talks about Process Step (design-time). Process Step can be designed to be configurable based on its specification and, at the execution level, the actual instance of the parameter comes in and creates Process Step Instance. What is configurable is Process Step, not Process Step Instance. Explanatory text “inputs passed into the Process Step” is confusing because at this stage, when we pass input value, it means we are already at executing level. The explanatory text needs to be updated later.
- Linking GSBPM and GSIM task team had difficulties in using this object for some GSBPM phases because there are not many things that are configurable outside Phase 5. Process (e.g. Phase 1. Specify Needs, Phase 2. Design) => Parameter Input is for machine actionable part of the process, more for formula, algorithms, etc. which enables metadata-driven methodology / process / system
(GSIM task team meeting 24 March 2021)
- remove "secondary" and "auxiliary" (we already have "Core Input", other inputs are secondary automatically; also check other objects with similar adjective)
InKyung Choi
04 May, 2021
Process Control
(GSIM task team meeting 20 Jan 2020)
- Process Control can be made to regulate what happens inside a Process Step if the Process Step is made up with multiple Process Steps in hierarchy.
Process Control Design
(GSIM task team meeting 20 Jan 2020)
- There is a confusing aspect of how Process Control Design is modelled. Process Control Design specifies a set of Process Controls but it doesn't specify how multiple Process Controls are interrelated within it. This might be because GSIM is a conceptual model and some specifics are not provided.
InKyung Choi
04 May, 2021
Process Design
(GSIM task team meeting 25 Nov 2020)
- Process Design having multiple Process Steps is still a problem as we never know which inputs and outputs are linked to which steps. If we have one design associated with one step, this could solve the issue (note that Process Control Design has one-to-one relation with Process Control). => To come back to this issue later.
(GSIM task team meeting 20 Jan. 2021)
- There is mismatch between what definition says (about arrangement of Process Steps) and what explanatory text says (about description of a Process Step). Definition should include both specification of Process Step and arrangement of Process Steps
- Currently, Process Design depends on Business Function ("specifies delivery of one to many Business Function"), but we could have Business Process without specifying Business Function. In Statistics Norway, there is maturity in terms of Business Process but not for Business Function, so it happens in practice that we have Business Process without associated Business Function yet. Business Process is for performing Business Function and consists of multiple Process Steps. Would it be possible to link Process Design with Business Process instead of Business Function. How should this reflect in the model? => Action: Jenny to have a look at this issue
- Change definition to "specification of each Process Step and description of their arrangement in a Business Process needed to perform a Business Function"
(GSIM task team meeting, 10 Feb. 2021)
Relationship between Process Design, Business Process and Business Function
- Business Function is a high-level description of what business is doing and Business Process is how it is implemented. This is in line with what Metadata Glossary team defined (Business Function can be considered as activity; Business Process can be considered as process). Explanatory text should be checked to make sure they are in line with this notion
  Activity: what we do
  Process: how we do an activity
  Capability: what enables us to do an activity
- The name “Process Design” can be confusing, perhaps use algorithm? => if we replace Process Design with algorithm, we need to do a whole re-design of this part of the model, discuss how algorithm should be defined, check how it can be linked to other objects - which is out of scope for this team. We can have algorithm as part of our design, in the same way we have Rule (algorithm is indeed an attribute of Rule because Rule can be expressed as an algorithm) and Process Method. DDI introduced algorithm, during next GSIM revision, we can take into account how it works in DDI and discuss whether to introduce in GSIM
- It makes sense to bring Business Process near to Process Design, Business Function is abstract level of what is being implemented, if we want to go to Process Steps, we need to go through Business Process
- Decision: given that we agree on the new definition of Process Design ("specification of each Process Step and description of their arrangement in a Business Process needed to perform a Business Function"), keep the association between Business Function and Process Design; add an association between Business Process and Process Design; update cardinality of Process Step linked to Process Design (Process Step should be required, not optional)
InKyung Choi
04 May, 2021
Process Execution Log
(GSIM task team meeting, 7 Oct. 2020)
- Original definition: The Process Execution Log captures the output of a Process Step which is not directly related to the Transformed Output it produced. It may include data that was recorded during the real time execution of the Process Step.
- Proposed definition: secondary Process Output describing timestamped events that happen during the execution
(GSIM task team meeting, 20 Jan. 2021)
- Linking GSBPM-GSIM task team proposed a new definition: "secondary Process Output describing time stamped events that happen during the execution"
- We don't need "time stamped" part, we might have a sequence of events but not necessarily stamped with time
- Also, use "listing" instead of "describing", Process Execution Log does not necessarily produce "description"
- Change definition to "secondary Process Output listing events generated by a Process Step Instance"
- Description of attribute "End Time" should be changed to "The time the Process Step Instance ended." from "The time the Process Step ended"
- Regarding attribute "Log Code", it is not clear what this "code" means. If it is programming code that generates the log, we should not associate it with process execution (rather, it should be Process Step). It seems this code is an ID for an entry in the event list. Change the definition to "The identifier for the event that occurred during the process execution" from "The code for the event that occurred during the process execution"
(GSIM task team meeting 24 March 2021)
- remove "secondary" and "auxiliary" (we already have "Core Input", other inputs are secondary automatically; also check other objects with similar adjective)
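
The decisions above (a list of events generated by a Process Step Instance, "Log Code" as an event identifier, time stamps not required) can be sketched as a data structure. All class, field and method names here are illustrative assumptions, not GSIM attribute names.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class LogEvent:
    log_code: str                         # identifier for the event, per the revised description
    message: str
    timestamp: Optional[datetime] = None  # optional: events are ordered, not necessarily time-stamped


@dataclass
class ProcessExecutionLog:
    process_step_instance: str            # hypothetical reference to the generating instance
    events: list = field(default_factory=list)

    def record(self, log_code, message, timestamp=None):
        # Events are appended in the order they occur, so sequence is kept
        # even when no timestamp is supplied.
        self.events.append(LogEvent(log_code, message, timestamp))


log = ProcessExecutionLog("edit-run-2021-01")
log.record("E001", "input loaded")
log.record("E002", "validation finished")
```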
InKyung Choi
04 May, 2021
Process Input
(GSIM task team meeting, 7 Oct. 2020)
- Original definition: Any instance of an information object which is supplied to a Process Step Instance at the time its execution is initiated.
- Proposed definition: instance of an information object supplied to a Process Step during its execution
(GSIM task team meeting, 25 Nov. 2020)
- Replace "Process Step execution" with "Process Step Instance" and make sure this is applied everywhere consistently
- Core Input has constraint that it is mandatory, this should be contained in definition. In Clickable, we could add a comment box describing this
- Process Design having multiple Process Steps is still a problem as we never know which inputs and outputs are linked to which steps. If we have one design associated with one step, this could solve the issue (note that Process Control Design has one-to-one relation with Process Control). => To come back to this issue later.
(GSIM task team meeting, 3 March 2020)
- GSIM has an issue how it uses term "information object" (this issue is being discussed in Core Ontology team too), what GSIM calls "object" is in fact "class". Definition of Process Input says "instance of an information object", but generally, object is instance of class, then what does it mean "instance of object"?
- Is it okay to use “to produce outputs” in the explanatory text? => Some Process Steps may not produce outputs as concrete as a Data Set, but they still produce logs and (process) metrics. Also, explanatory text says "it might be", so it does not exclusively say Process Input has to be used as output
- Also, example "Rule" in explanatory text might not be good one
General (big) issue about run-time objects in GSIM
- It is quite difficult to manage difference between the specification level and implementation level (e.g. Process Step - Process Step Instance), in practice only one level is used.
- There is difference between process as it is designed vs. how it is run. We could design a process, and re-use many times (e.g. monthly survey), then design-time process remains the same, but run-time always changes depending on, e.g. input data
- But do we need to have these run-time objects explicitly in a conceptual model like GSIM? All GSIM objects are something that are instantiated during run-time anyway, why do we need to have separate run-time objects and design-time objects only in certain parts of GSIM?
- We could greatly simplify the model by removing these run-time objects. But objects such as sub-types of Process Input are worth keeping; these are currently run-time (as Process Input is run-time), but we can make them design-time and keep them in the model
- This is a very big decision and a major change to the model; there might be countries that are not in this team and are using these design-time objects.
- Action: first try to remove run-time objects, check cardinality and associations, so that team can have a concrete proposal to submit. Discussion on these run-time objects can be put on hold.
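
The design-time versus run-time distinction discussed above can be sketched as follows, using the monthly survey example. The class names follow the GSIM object names; the fields, the `run` helper and the sample values are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessStep:
    """Design-time object: specified once and reused."""
    name: str


@dataclass(frozen=True)
class ProcessStepInstance:
    """Run-time object: one execution of a Process Step with its actual inputs."""
    process_step: ProcessStep
    period: str
    process_inputs: tuple


def run(step, period, inputs):
    # Each execution creates a new instance; the design itself is unchanged.
    return ProcessStepInstance(step, period, tuple(inputs))


monthly_step = ProcessStep("edit and impute monthly survey data")
january = run(monthly_step, "2021-01", ["january_responses"])
february = run(monthly_step, "2021-02", ["february_responses"])
```

Both instances point to the same `ProcessStep`, while their inputs differ. Removing run-time objects from the model, as proposed in the action, would correspond to dropping `ProcessStepInstance` and leaving such records to implementations.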
InKyung Choi
04 May, 2021
Process Input Specification
InKyung Choi
04 May, 2021
Process Method
(GSIM task team meeting, 3 March 2020)
- Words like "technique" and "technology" in definition/explanatory text give impression that it is more for something implemented by software. For Process Step with manual process (e.g. consult with stakeholder), there is no such technology.
- Action: change to "methodology"
InKyung Choi
04 May, 2021
Process Metric
(GSIM task team meeting, 7 Oct. 2020)
- Original definition: A Process Output whose purpose is to measure and report some aspect of how the Process Step performed during execution.
- Proposed definition: secondary Process Output summarizing some aspect or property of the execution at a point in time
(GSIM task team meeting 24 March 2021)
- Action: remove "secondary" (we already have "Core Input", other inputs are secondary automatically; also check other objects such as Process Execution Log, Parameter Input, Process Support Input with similar adjective) and remove "at a point in time" (this is meant for time-stamped information which is more for Process Execution Log) in the definition
- Quality measures are metrics hence can be captured as Process Metric, they can also be part of quality report and Core Output along with, e.g. Data Set
InKyung Choi
04 May, 2021
Process Output
(GSIM task team meeting, 7 Oct. 2020)
- Original definition: Any instance of an information object which is produced by a Process Step as a result of its execution.
- Proposed definition: instance of an information object produced by a Process Step as a result of its execution
Comments about Process Input (see above) are applicable for Process Output
InKyung Choi
04 May, 2021
Process Output Specification
InKyung Choi
04 May, 2021
Process Pattern
(GSIM task team meeting, 24 March 2021)
- Metadata Glossary team proposed to change "nominated" to "named" as the former indicates a kind of authority
- Why do we need "named"? Most GSIM objects are "named" anyway (as they are Identifiable Artefacts, which have name as an attribute) => it is an important part of the definition. It is for something recognisable in the organisation so that we can re-use it and refer to it by name; it should be more than simply "set of Process Designs"
- Action: change "nominated" to "recommended" in the definition
InKyung Choi
14 Dec, 2021
Process Step
(GSIM task team meeting, 18 August 2021)
Circular definitions of Process Step and Business Process
- Process Step is defined as “work package that performs a Business Process” and Business Process is defined as “set of Process Steps to perform one of more Business Functions to deliver a Statistical Program Cycle or Statistical Support Program”. This is circular
- Two new definitions are proposed for Process Step: “unit of work” and “work package that performs a specific statistical task”
(GSIM task team meeting, 20 October 2021)
- There is circular definition between Process Step and Business Process
- Decision: to change the definition to "unit of work" (the term "task" is reserved to refer to a specific level of activity, finer than a GSBPM sub-process)
Process Step Instance
InKyung Choi
04 May, 2021
Process Support Input
(GSIM task team meeting, 7 Oct. 2020)
- Original definition: A form of Process Input that influences the work performed by the Process Step, and therefore influences its outcome, but is not in itself changed by the Process Step.
- Proposed definition: auxiliary Process Input that influences the work performed by the Process Step Instance without its content being changed by the execution
(GSIM task team meeting, 24 March 2021)
- Action: remove "auxiliary" (see comment above) in the definition
- Do we still need “without its content being changed by the execution”? It was needed to differentiate from Transformable Input, but we don't use it anymore => it is still important part of the definition, it highlights the role of "supporting"
- Examples provided in the explanatory text should be reviewed and updated according to the change from Transformable Input to Core Input. E.g. the example of Assessment is not clear (how it is playing the supporting role). Methodological handbook can be a clear example of Process Support Input
- Action: InKyung to review the explanatory text, remove Assessment and add handbook
=> action taken
- Current explanatory text: Process Support Input is a sub-type of Process Input. Typical Process Support Inputs include metadata resources such as Statistical Classifications or structural information used in the processing of data. Examples of Process Support Inputs could include:
  - A Code List which will be used to check whether the Codes recorded in one dimension of a dataset are valid
  - An auxiliary Data Set which will influence imputation for, or editing of, a primary Data Set which has been submitted to the Process Step as the Transformable Input
  - A Provision Agreement which can be used as a supporting document
  - An Assessment from a previous Statistical Program Cycle which can be used as an input for the current Statistical Program Cycle
- Proposed explanatory text: Process Support Input is a sub-type of Process Input. Examples of Process Support Inputs could include:
  - A technical or methodological handbook which can be used as a reference to assist the work performed (e.g. data editing, coding and classification)
  - An auxiliary Data Set which will influence imputation for, or editing of, a primary Data Set which has been submitted to the Process Step as the Core Input
  - A Provision Agreement which can be used as a supporting document
  - A repository or inventory of Process Methods or software system / architecture that are approved in the Organisation that could be used as reference
(GSIM task team meeting, 14 April 2021)
- Action: to move “without its content being changed by the execution” from definition to explanatory text. Depending on the use, people may change or not change content of Core Input. If we use this as a part of definition of Process Support Input (PSI), this could cause a problem to how people interpret Core Input. We better clarify this point in the explanatory text.
- How to differentiate PSI from other inputs? Currently, definition says “Process Input that influences the work…”, but all inputs influence the work anyway.
- Core input provides the essential information and PSI provides additional information that support the work and affect the way Core Input is used. Definition should be updated reflecting this
InKyung Choi
04 May, 2021
Rule
(GSIM task team meeting, 24 March 2021)
- Action: remove first "specific" in the definition (there are two uses of "specific", first is less important)
- "They may be used as the input parameters of processes" in the explanatory text is not in line with what we have discussed regarding Parameter Input. Parameter Inputs can be provided for Rules and Process Methods but not the other way around (Rule is not a parameter). The definition of Parameter Input still seems to support the possibility of using Rule as a parameter, although discussion is moving in the opposite direction.
- Action: remove "they may be used as the input parameters of processes" in the explanatory text
InKyung Choi
04 May, 2021
Statistical Need
(GSIM task team meeting, 14 April 2021)
- Action: to accept the proposal about definition from Metadata Glossary team
- Action: to update the explanatory text without "raw". "Raw" is supposed to mean “as-is” status and “initial” information as received from other organisations, without having gone through any evaluation or assessment from the statistical organisation. However, this word can be confusing and interpreted as "basic" (which is not always the case, e.g. European regulation has very detailed need)
- Action: to remove attribute "Type" as there are already sub-classes. Information about external vs. internal of Environment Change could be useful for provenance, but this can be captured by "Change Origin" attribute of Environment Change
InKyung Choi
14 Dec, 2021
Statistical Program / Statistical Program Cycle / Statistical Program Design
(GSIM task team meeting 14 April 2021)
- These three objects are tightly linked, definition of them should be carefully examined and updated together to be coherent.
(GSIM task team meeting 17 June 2021)
- In the proposed definition of Statistical Program Design ("specification of the set of activities undertaken to investigate characteristics of a given Population"), is it enough to have "activities" only? The original definition had "methodologies, resources, requirements". If we only have "activities", doesn't it sound as if the design only includes the specification of business processes but is missing other important elements such as methodology? → We can consider these "activities" as activities of specifying methodologies, resources, etc. If we start to list what we need, there is always a risk of missing something. For example, the original definition also does not mention metadata and quality, which are both very important things to be designed. It is better to keep the definition simple and have more details in the explanatory text
- How to differentiate "statistical" programs from non-statistical programs? There are scenarios that are at the boundary. For example, when we take administrative data, at some point it can diverge: one path into a statistical process where you would be required to provide quality explainability (methodologies applied to data to ensure data quality) and the other into a standard application that has nothing to do with statistics. From what point do we call it "statistical"? → Administrative data used for purposes other than producing statistics are not covered by Statistical Program. If it is done for the purpose of producing statistics by a statistical agency, that is where statistical activity starts. The question is whether it is Statistical Program or Statistical Support Program. Maintenance of administrative register data is statistical but should be a support program
- What is the difference between Statistical Program Design and Statistical Support Program? Statistical Program Design is the core one that creates new methodologies and updates and modifies them as needed. Creation of methodology does not take place on its own; it should be triggered by some needs. It is not only Statistical Support Program that "impacts" Statistical Program Design; it should also be vice versa
- It is important to have a clear alignment with GSBPM, having different ways of classifying activities in GSIM and GSBPM will be very confusing to people. Tentatively, assume that Statistical Program Design covers GSBPM phase 1-3, Statistical Program covers GSBPM phase 4-8 and Statistical Support Program covers GSBPM OP (possibly GAMSO corporate support activities)
- What GAMSO corporate support activities are "statistical"? The distinction is not clear, even HR and building maintenance are not independent from statistical activities (e.g. when we need to provide certain security clearance for staff we are hiring for handling sensitive data, when we need to build secure room) → All activities within statistical organisation have some links to statistical production. However, not all are related to the production with the same degree, some are more direct and the others are less
Statistical Program / Statistical Program Design / Statistical Support Program
(GSIM task team meeting 7 July 2021)
Mapping GSBPM phases to Statistical Program, Statistical Program Design and Statistical Support Program
- Can we map Statistical Program to GSBPM Phase 4-7, Statistical Program Design to GSBPM Phase 1-3 and Statistical Support Program to GSBPM Phase 8?
- Only after Business Case is approved, we know whether there is going to be a Statistical Program to be designed or not, so GSBPM Phase 1 (Specify Needs) is not a design activity
- Also, definition of Statistical Program Design is "specification", so it is an output of the design activity (GSBPM Phase 2) rather than the design activity itself. So perhaps Statistical Support Program should cover everything that is not Statistical Program
Exploratory / development activity - where does this belong?
- There is an "alpha-stage", pre-Business Case stage where we don't have a very firm design, but just high-level design and discussion (similar to "Proof-of-Concept"). Design at this stage can be scrapped depending on the feasibility or merged with other alpha-stage designs. It is important that we document all of these, whether they fail or not, so that information is not lost (e.g. when developing new machine learning methodology).
- We have "manage methodologies" in GAMSO Corporate Support, but can it include this kind of methodology development process, development life cycle?
- This exploratory phase may have focus on a specific Statistical Program, but in many cases, it doesn't (data discovery, profiling)
- GSBPM is more about production rather than development or exploration, but there are many exploratory processes in the statistical organisation. How to model the flow of moving things from this experimental R&D into production?
- For example, development of a machine learning solution follows its own process (e.g. identifying needs for ML, consulting with stakeholder, development of PoC, getting approval) which ends with the deployment of the solution into the production. There are links between the development process and the production process, but they run in parallel.
- For a specific process or method, we can use, for example, Process Method with Administrative Details (attribute "Life Cycle Status") to represent a method that is in the exploratory phase, but we don't have a "container" GSIM object to capture these exploration-phase related activities and information.
- Can we use Statistical Support Program for the development activity? Statistical Support Program has Business Process, which can be used to model the exploratory and development process.
GSBPM Overarching Process and GAMSO - where does this belong?
- What is the difference between GSBPM Overarching Process (OP) and GAMSO Corporate Support activities? There seems a lot of overlap → OP activities are overarching to "production phases" while GAMSO is for the activities with corporate-wide implication. For example, setting up and updating corporate-level quality framework is GAMSO "manage quality" activity while applying this corporate quality framework during the production (e.g. checking failure rate of coding) is OP activity.
- Cross-border processes such as management of methodology, quality and data should be separate from Statistical Program and in Statistical Support Program (but there is a connection, e.g. a requirement from Statistical Program can trigger a new Statistical Support Program; also, one Statistical Support Program can affect another Statistical Support Program)
- We need common language to describe information not just for GSBPM (production) but also for GAMSO (activities in statistical organisation), GSIM should be beyond statistical production, it should include statistical entities and activities whether it is production or research, or something else.
- Statistical Support Program should cover GAMSO Corporate Support activity that has statistical component in it
- Don't GSBPM OP activities have to be included and integrated in the production process, so should be a part of the Statistical Program, not Statistical Support Program?
- Think about two options: 1) mapping Statistical Support Program to GSBPM OP and create a new object to cover the other activities; 2) mapping all of these to Statistical Support Program
(GSIM task team meeting 18 August 2021)
Scope and definitions of Statistical Support Program and Statistical Program (also see email from InKyung below)
- Centralized and corporate-level activities (e.g., data ingestion, linkage, matching, master data management) that are not for specific Statistical Program → These are Statistical Support Program. Such program is independent from other Statistical Programs, it takes all common components across all Statistical Programs, activities and its products can be used for many programs in the organization (e.g., dissemination platform)
- Note that innovative activities, development of new methods are in GAMSO Capability Development, not GAMSO Corporate Support. Once new methods are tested and proven to work, then they can be transferred to Corporate Support to be supported in the production.
- Changing, improving and amending of existing Statistical Program → These are Statistical Support Program.
- Specifying needs, designing new Statistical Program and building new components → These are Statistical Support Program. But note that there are some missing associations and objects. For example, if we say GSBPM Phase 1 (Specify needs) is Statistical Support Program, Change Definition should be not just an output but also an input of the Statistical Support Program. Also, currently Statistical Support Program does not produce anything (cf. Statistical Program produces Products); if GSBPM Phase 3 (Build) is Statistical Support Program, it should produce something. The relationship between Statistical Support Program and Statistical Program should be back and forth.
- Statistical Program is execution side of GSBPM (e.g., Phase 4. Collect, Phase 5. Process)
- Current definition of Statistical Program says it is a "set of activities to produce statistics...". Although the GSBPM diagram gives the wrong impression that GSBPM overarching processes (OP) take place outside GSBPM phases, they actually happen along with GSBPM phase activities; hence GSBPM OP should be part of Statistical Program when it happens along with Phase 4-7.
- Perhaps definitions of Statistical Program is not clear, the fact that the activity is for production of statistics is not enough to differentiate it from supporting activity, should we add a qualifier (e.g., "necessary", "core")? There are many activities covered by Statistical Support Program, how can we represent them in a short definition? → Definition should be simple and brief, we should use explanatory text to give more details and context around.
- In sum,
  Statistical Support Program: centralized, corporate-level activities, GSBPM Phase 1-3, GSBPM OP
  Statistical Program: GSBPM Phase 4-7
  GSBPM OP can be either, depending on whether it is specific to a certain Subject Field and Universe of a Statistical Program
(GSIM task team meeting 8 September 2021)
Scope and nature of Statistical Program and Statistical Support Program
- GSBPM Phase 1-3 should not be in Statistical Support Program, they are not of corporate-support nature, we can always update our design based on new data source, methods and assessment coming continuously from and during execution of Statistical Program, we update the design as a part of Statistical Program
- But Statistical Support Program is not only for corporate-support and centralized activities. For example, metadata management (which is in Statistical Support Program) is also not centralized, some programs might run their own metadata management that is different from other programs, there might be no broad "corporate-level" support at all depending on organizational set up.
- Statistical Program vs Statistical Support Program is not just about whether it is centralized or not, it is more about whether it is for pure production or for supporting the production. Also, Statistical Program has notion of cycle, that is repeated for different periods and Phase 1-3 are not repeated for every cycle
- There are more and more centralized services in the statistical organization. The services may serve mostly statistical processes, but they can also serve non-statistical works (e.g., HR, program management). They can provide service to many programs and processes (1-to-many), provide support for various points of production pipeline, but they themselves don't run the pipeline.
- We should not get distracted with "corporate vs. program-based" distinction. In most situation, there is no such clear separation. Many organizations are moving toward centralized system, but there are many in-betweens. For example, certain domain might have platform that serve most of survey programs within it, but not completely corporate-level. Everything can be centralized to some extent, many Statistical Program use centralized collection and dissemination platform, but this should not make GSBPM phase 4 (collect) and 7 (disseminate) as a part of Statistical Support Program.
- We are putting a lot under the umbrella of Statistical Support Program. Perhaps the word "program" is not needed in Statistical Support Program; it often involves the notion of centralized, independent, corporate-level activities with their own governance, but many things under Statistical Support Program (e.g., metadata management) are not a "program" per se.
(GSIM task team meeting 20 October 2021)
- The notion that design activity is conducted as a part of Statistical Program is not supported by the way GSIM is modelled right now. Objects related to phase 1-2 (e.g., Business Case, Statistical Needs) are connected to Statistical Support, not Statistical Program
- Decision: to map phase 1-3 to Statistical Support and phase 4-7 to Statistical Program; there might be people associating survey to Statistical Program, we need to make sure that Statistical Program covers the cyclic parts, not the whole survey
- If we cover phase 4-7 with Statistical Program and the rest with Statistical Support, what do we use to cover the entire production process? It seems we have a missing object for this → Perhaps Business Process can be used for this?
- Need to be careful with usage of words ("process" and "activity"), need to institute disciplines around use of these terms.
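The decision recorded at this meeting can be written down as a small lookup. The following Python sketch is illustrative only: the function name and return values are hypothetical and not part of GSIM or GSBPM, and Phase 8 (Evaluate) is deliberately left unmapped because the decision above names only phases 1-3 and 4-7.

```python
from typing import Optional

# GSBPM phase numbers and names.
GSBPM_PHASES = {
    1: "Specify Needs",
    2: "Design",
    3: "Build",
    4: "Collect",
    5: "Process",
    6: "Analyse",
    7: "Disseminate",
    8: "Evaluate",
}


def gsim_object_for_phase(phase: int) -> Optional[str]:
    """Return the GSIM object a GSBPM phase maps to under the 20 October 2021 decision.

    Hypothetical helper: returns None where the decision does not state a mapping.
    """
    if phase not in GSBPM_PHASES:
        raise ValueError(f"unknown GSBPM phase: {phase}")
    if 1 <= phase <= 3:
        return "Statistical Support"
    if 4 <= phase <= 7:
        return "Statistical Program"
    return None  # Phase 8 is not covered explicitly by the decision


for number, name in GSBPM_PHASES.items():
    print(number, name, "->", gsim_object_for_phase(number))
```

Keeping the unmapped case explicit makes the open question from the notes (what covers the entire production process) visible rather than hiding it behind a default.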
Statistical Support Program
(GSIM task team meeting 29 September 2021)
- To change definition from "A program which is not related to the post-design production of statistical products, but is necessary to support production" to "activity that supports statistical production". Also, to replace occurrences of "program" with "activity".
- How to map with GAMSO activities? Shouldn't we include only "statistical activity" (not just activity?), everything in statistical organization is, in one way or another, to support the statistical production, so Statistical Support will include leadership, capability development which are different from corporate support activities
- How to map with initial GSBPM phases?
  If initial phases are to be mapped to Statistical Program, this requires extensive re-modeling because information related to these phases (e.g., Statistical Need, Business Case, Change Definition) are linked to Statistical Support, not Statistical Program nor Statistical Program Design. Also if we assume that Statistical Program contains activities that are repeatable (thus in Statistical Program Cycle), it is natural to assume only phase 4-7 are in Statistical Program
  If initial phases are to be mapped to Statistical Support, this does not cover cases where some of activities in initial phases are conducted as a part of Statistical Program Cycle which often happens in many organizations
- How about removing Statistical Support? Support activities are covered by GAMSO, do we need it? → Statistical Support is in fact a good place to cover many activities that can happen under the umbrella of GAMSO
- Statistical Support is a grey area, the model should be flexible, we need to leave it to users to use this object in a way they see fitting, but make sure that Statistical Support supports both ways. Need to expand the explanatory text to include different use cases and examples.
Statistical Program
(GSIM task team meeting 14 April 2021)
- Action: to remove “which may be repeated” in definition of Statistical Program, it is not essential part of the definition. Re-use is fundamental principle of metadata model, saying something can be repeated is redundant.
- Definition of Statistical Program should be also updated to include corresponding concept of Universe. Statistical Program Cycle can be modelled as attribute of Statistical Program (instead of having a separate object), but this could cause issue regarding provenance and lineage as it would be hard to characterise the execution (i.e. cycle) if it is just an attribute.
(GSIM task team meeting 5 May 2021)
- (from the last meeting) to remove “which may be repeated” in definition of Statistical Program
- (from the last meeting) to include corresponding concept of Universe
- Regarding "within the context of the relevant Statistical Program Cycles" → Statistical Program Cycle (SPC) should depend on the existence of Statistical Program (SP), not the other way around. SP cannot exist in the context of SPC. It should be reversed.
- New proposed definition “set of activities to collect information on the characteristics about a given Universe” → collection (or acquiring data) is not really the purpose of SP, it is a part of the work, but the ultimate purpose of SP is to produce some statistics or estimates of characteristics of Universe.
- New proposed definition "set of activities to produce [ descriptive statistics, statistical product or statistical information ] within the context of Universe" → according to GSIM, SP has context of Subject Field, not Universe.
- Is it mandatory for SP to produce an output? What about SPs that fail to produce any output? There are programs such as ML that are run to figure out patterns, and only one winner gets to produce an output, so the rest does not produce any output in a traditional sense → the definition does not necessarily mean that SP will produce an output, it is the intention why it was designed in the first place. The example given can be considered as a support program
- Is it a set of activities or a set of processes? → Metadata Glossary used the distinction of what we do (activity) vs. how it is done (process). At the cycle level, we have processes that are in place to run and at the program level we have activities.
- We need to be clear about the scope we are using for "activities". What activities will fall under SP? Does it include HR, IT activities, does it also include cross-sectional activities like metadata management? We need to be clear if we are talking about just activities, statistical activities or statistical production activities.
- Proposed definition: "set of activities to produce [ descriptive statistics, statistical product or statistical information ? ] on a given Universe within the context of a Subject Field"
(GSIM task team meeting 26 May 2021)
- The definition of Statistical Program is a “set of activities…” and explanatory text says it “describes the purpose and objectives of a set of activities”. This seems to indicate that a set of activities describes the purpose of a set of activities
- Also we don't use phrases like "family of objects" for other GSIM objects
- → To remove "set of activities" part and "family of objects" in the explanatory text
- In the definition, does it have to be "a Subject Field"? We can start without any Subject Field (e.g. when we are in the process of figuring out data from web-scraping) and can have multiple (e.g. when integrating multiple sources) → But when we start "Statistical Program", should it be already associated with at least one Subject Field? We should think in terms of our business, which is the production of statistics, and the product is always in context of one or more Subject Fields
- We don't need to be specific about cardinalities, we can simply say "within the context of Subject Fields"
- → To change as "within the context of Subject Fields" and update cardinality with Subject Field from "0..*" to "1..*"
- Should there be a link between Statistical Program and Universe if it is important enough to be a part of definition?
- Then what cardinality to use? → Universe concept can exist without Statistical Program, multiple Statistical Programs can use one Universe
- → To add relationship between Statistical Program and Universe (also Statistical Program Cycle and Population) with cardinality "0..*" and "1" (as Universe itself can be a set of Universes)
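The cardinalities agreed in this meeting can be illustrated with a minimal Python sketch. It is one possible reading of the notes, not the GSIM model itself: the class and function names are hypothetical, "1..*" is read as "a Statistical Program needs at least one Subject Field", and "0..*" / "1" is read as "a Universe may be used by zero or more Statistical Programs, while each Statistical Program refers to exactly one Universe".

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubjectField:
    name: str


@dataclass(frozen=True)
class Universe:
    name: str


@dataclass
class StatisticalProgram:
    name: str
    universe: Universe                  # cardinality "1" on the Universe side
    subject_fields: List[SubjectField]  # cardinality "1..*"

    def __post_init__(self) -> None:
        # Enforce the proposed change from 0..* to 1..*.
        if not self.subject_fields:
            raise ValueError("a Statistical Program needs at least one Subject Field")


def programs_using(universe: Universe, programs: List[StatisticalProgram]) -> List[StatisticalProgram]:
    """Cardinality "0..*": a Universe can be used by no program or by many."""
    return [p for p in programs if p.universe == universe]


# Hypothetical example data.
households = Universe("Households")
enterprises = Universe("Enterprises")
labour = SubjectField("Labour")
income = SubjectField("Income")

programs = [
    StatisticalProgram("Labour Force Survey", households, [labour]),
    StatisticalProgram("Household Income Survey", households, [income, labour]),
]

print([p.name for p in programs_using(households, programs)])
print([p.name for p in programs_using(enterprises, programs)])
```

In this sketch the Universe exists independently of any program, which matches the remark that the Universe concept can exist without a Statistical Program.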
(GSIM task team meeting 29 September 2021)
- How about changing name to "Statistical Production"? → Then the concept becomes too generic, it will need to cover (1) generic notion of statistical production as well as (2) statistical production that is conducted for specific target Universe and Product.
- In the explanatory text, to add the text "A statistical program could take as inputs other statistical programs' outputs, e.g. national accounts. These activities are all carried out to generate products."
Statistical Program Cycle
(GSIM task team meeting 14 April 2021)
- Action: to remove "for a particular reference period" in the definition of Statistical Program Cycle ("set of activities to investigate characteristics of a given Population for a particular reference period"). It is redundant as Population is already time-dependent Universe.
(GSIM task team meeting 5 May 2021)
- (from the last meeting) to remove "for a particular reference period" as it is covered by Population
- New proposed definition "execution of an iteration of a SP for a certain reference period" (using the first sentence of the explanatory text) → does it have to be "execution"? we could just have "iteration"
- Can we add "geographic area" in the definition? If we conduct a survey program twice, one at the continental level and the other for the overseas territories, this would be two cycles.
- Proposed definition: "iteration of a SP for a given Population" and add in the explanatory text that Population can be bounded by geographic area or a reference period.
Statistical Program Design
(GSIM task team meeting 26 May 2021)
- There is a growing emphasis on ingraining policy compliance, privacy assurance, etc. "by design" in the statistical program → To add mention about policy and compliance as requirement in the explanatory text
- Regarding "specification of processes used" in the explanatory text, there is ambiguity, we can create a new process and just use pre-built existing processes → To include both cases in the explanatory text
- According to the model, Statistical Program has at most one Statistical Program Design, then Statistical Program Cycle performs the design. If we think in connection with GSBPM, in GSBPM Design phase, we already talk about sampling frame for specific Population, Represented Variables, etc. What is being designed in the Statistical Program Design?
InKyung Choi
04 May, 2021
Transformable Input and Transformed Output
(GSIM task team meeting 20 August 2020)
- Linking GSBPM and GSIM task team identified issue with Transformable Input. Part of definition “... is changed in some way” caused confusion because there are many cases where inputs are not changed such as for statistical register. Three proposals were made (see document prepared by Flavio for full description):
  Option 1: Assume that all inputs are static, i.e. a copy is always made by the Process Step and therefore they are never changed
  Option 2: Add an attribute to indicate "static" or "dynamic"
  Option 3: Adjust the definition slightly as “A type of Process Input whose content goes into a Process Step and may be changed in some way by the execution of that Process Step.”
- Team agreed on option 3 but with modification to the definition
  The new definition can cause confusion with Process Support Input because what used to differentiate this from Transformable Input was “is not in itself changed” part. If we use “may be changed” as part of definition of Transformable Input, it could create confusion.
  Possibility of change is not the most important characteristic of Transformable Input, important part is that it is major and core input that contributes to the output.
  We should convey that transformation does not necessarily mean that the actual input itself is changed, it can be a copy of the input and we change it
  Definition should conform to the writing convention (see discussion item #3)
  We can give an example in explanatory text how Transformable Input is different from Process Support Input
- => Action: Flavio to update the proposal
(GSIM task team meeting 7 Oct 2020)
- Proposed definition is "main Process Input used to produce a Transformed Output where some or all its content may be changed by the execution"
- There are cases where Transformable Input is not really changed (e.g. business registers). Whether making an input as "transformable" or not is an implementation decision. Some people will not want to make anything “transformable” for traceability, reproducibility or auditing reasons.
- What should be emphasized is that this input is "main", not "transformable". We need to be able to handle cases where there is nothing transformable or high-level/process-agnostic cases where we are not even interested in whether something is transformable or not
- Perhaps we should change the name from Transformable Input to Main Input with definition "Process Input used to produce a Main Output where some or all its content may be changed by the execution". This can make it clearer and avoid confusion. Similarly, Main Output can be defined as "Process Output produced from Process inputs"
- "Main Input" is more generic than "Transformable Input"
(GSIM task team meeting 4 Nov 2020)
- change Main Output to Core Output with definition: key output of the Process Step execution
- change Main Input to Core Input with definition: essential input for the Process Step execution
- change the cardinality of Core (Transformable) Input from 0..* to 1..*
- include note in the explanatory text saying that there could be multiple "essential" inputs and "key" outputs
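As a rough illustration of the outcome of this meeting, the Python sketch below models a Process Step execution that requires at least one Core Input (cardinality 1..*) and allows several "essential" inputs and "key" outputs. The class name, field names and example values are hypothetical, and inputs and outputs are represented as plain strings for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProcessStepExecution:
    """Hypothetical container for one execution of a Process Step."""

    name: str
    core_inputs: List[str]  # essential inputs, cardinality 1..*
    core_outputs: List[str] = field(default_factory=list)  # key outputs
    support_inputs: List[str] = field(default_factory=list)  # Process Support Inputs

    def __post_init__(self) -> None:
        # Cardinality changed from 0..* to 1..*: at least one Core Input is required.
        if len(self.core_inputs) < 1:
            raise ValueError("a Process Step execution needs at least one Core Input")


# A step may have several essential inputs and several key outputs.
editing = ProcessStepExecution(
    name="Edit and impute",
    core_inputs=["raw survey data", "business register extract"],
    core_outputs=["edited data", "imputation flags"],
    support_inputs=["edit rules"],
)
print(len(editing.core_inputs), len(editing.core_outputs))
```

Nothing in the sketch records whether a Core Input is actually changed by the execution, in line with the view that being the main input, not being transformable, is the defining characteristic.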
