My solution in point 4 on error detection and error correction is somewhat simplistic in formulation. Imputation methods can also be used for missing values and, by extension, as a modelling technique. The distinction between 5.3 (review, validate & edit) and 6.2 (validate outputs) is unclear and perhaps not relevant.
Istat suggests merging the former sub-processes 5.3 and 5.4 into the following sub-process 5.3.
5.3. Data validation - This sub-process applies to collected micro-data, and looks at each record to try to identify (and where necessary correct) potential problems, errors and discrepancies such as outliers, item non-response and miscoding. It can also be referred to as input data validation. It may be run iteratively, validating data against predefined edit rules, usually in a set order. It may apply automatic edits, or raise alerts for manual inspection and correction of the data. Reviewing, validating and editing can apply to unit records both from surveys and administrative sources, before and after integration. In certain cases, imputation may be used as a form of editing.
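As an illustration of validating records against predefined edit rules in a set order, a minimal Python sketch follows. The rule names, the field names (`age`, `sex`) and the automatic correction are hypothetical and chosen only for the example; they are not part of the proposed sub-process text.

```python
# Hypothetical edit rules, applied in this order.
# Each rule is (name, check, automatic_fix); automatic_fix may be None.
RULES = [
    ("age_present", lambda r: r.get("age") is not None, None),
    ("age_in_range", lambda r: r.get("age") is None or 0 <= r["age"] <= 120, None),
    (
        "sex_code_valid",
        lambda r: r.get("sex") in {"M", "F"},
        # Automatic edit: normalise case and whitespace of the code.
        lambda r: {**r, "sex": str(r.get("sex")).strip().upper()},
    ),
]


def validate(record, rules=RULES):
    """Return (possibly edited record, names of rules needing manual review)."""
    current = dict(record)  # never modify the collected record in place
    alerts = []
    for name, check, fix in rules:
        if check(current):
            continue
        if fix is not None:
            candidate = fix(current)
            if check(candidate):
                current = candidate  # the automatic edit resolved the failure
                continue
        alerts.append(name)  # raise an alert for manual inspection
    return current, alerts


edited, alerts = validate({"id": 1, "age": 150, "sex": " f"})
print(edited, alerts)
```

In this sketch a failed rule is first passed to its automatic edit, if it has one, and only becomes an alert when the edit is absent or does not resolve the failure.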
Where data are missing or unreliable, estimates may be imputed. Specific steps typically include:
- the identification of potential errors and gaps;
- the selection of data to include or exclude from imputation routines;
- imputation using one or more predefined methods, e.g. “hot-deck” or “cold-deck”;
- writing the imputed data back to the data set, and flagging them as imputed;
- the production of metadata on the imputation process.
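The steps listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses a sequential hot-deck (the most recent valid value in the same imputation class acts as donor), and the function name `hot_deck_impute`, the field names and the metadata keys are hypothetical.

```python
def hot_deck_impute(records, variable, class_key, exclude=lambda r: False):
    """Sequential hot-deck imputation within classes (illustrative sketch).

    Returns (new records with an imputation flag, metadata on the run).
    """
    flag = variable + "_imputed"
    last_donor = {}  # imputation class -> most recent valid value
    n_imputed = 0
    n_unresolved = 0
    result = []
    for rec in records:
        rec = dict(rec)  # work on a copy of the record
        rec[flag] = False
        cls = rec[class_key]
        if rec.get(variable) is None:
            # Step 1: a gap has been identified.
            if cls in last_donor:
                # Steps 3 and 4: impute, write back and flag as imputed.
                rec[variable] = last_donor[cls]
                rec[flag] = True
                n_imputed += 1
            else:
                n_unresolved += 1  # no donor seen yet in this class
        elif not exclude(rec):
            # Step 2: only records that are not excluded may act as donors.
            last_donor[cls] = rec[variable]
        result.append(rec)
    # Step 5: metadata on the imputation process.
    metadata = {
        "variable": variable,
        "method": "sequential hot-deck",
        "class_key": class_key,
        "n_records": len(result),
        "n_imputed": n_imputed,
        "n_unresolved": n_unresolved,
    }
    return result, metadata


data = [
    {"id": 1, "region": "N", "income": 100},
    {"id": 2, "region": "N", "income": None},
    {"id": 3, "region": "S", "income": None},
    {"id": 4, "region": "S", "income": 250},
    {"id": 5, "region": "S", "income": None},
]
imputed, meta = hot_deck_impute(data, "income", "region")
print(meta)
```

With this example data, record 3 stays missing because no donor in its class precedes it; such cases would need another method or manual treatment.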
If Istat's suggestion is accepted, all subsequent sub-processes will need to be renumbered.