GSIM 1.5 makes an effort to standardize the value types of attributes. One area for further standardization is the use of the “String” and “Text” value types. The assumption is that “String” suits IT people better and “Text” suits business people better. Should the model use only “String” or only “Text”? Which one is preferred?

Feedback from countries

Country | Response in short | Feedback from country
Croatia | Text | -
Lithuania | Text | In our experience we prefer to use “Text” instead of “String” as a value type. Our IT colleagues agree; in their view, the naming of value types is also inconsistent across programming languages. Sometimes it is “String”, sometimes “Varchar”, so “Text” is preferable as the most common term.
Finland | No strong preference | We do not have a strong opinion on this topic; either option is fine for us. However, in the GSIM 1.5 descriptions some of the value types are given as "Multilingual Text". If multilingualism is the issue to be emphasized here, "Text" would be the obvious option. For example, the value type of the Description attribute for Universe is "Multilingual Text": https://statswiki.unece.org/display/newclick/Universe

Mexico | Text | We prefer the term “Text”; it is better than “String” since it is more meaningful for business people. IT staff can deal with it in computer languages and databases. Besides, using “Text” to identify value types is consistent with the aim of changing the attributes to natural language.

Australia | No strong preference | No strong preference. “String” is a high-level base type defined by W3C. If GSIM literally intends a string of characters then “String” may be better, but it could be explained as synonymous with plain text. If the term “Text” is used, there might be some explanation of whether it is meant in a “conceptual” sense (e.g. as opposed to “numeric”), leaving open a possible implementation as formatted text (e.g. HTML, RTF) instead of plain text strings.

Canada | String | I worked with Jenny on reducing the variety of datatypes that were used. We had agreed that String (Simple Class) will be used everywhere except where multilingualText (a Composite Class that is derived from Text, which is a String) is used. My other point is that, since String is a UML primitive datatype, should we not also assume that business people using UML would at least know about those types?
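
The relationship described in the feedback from Canada (Text is a String, and multilingualText is a composite built on Text) could be sketched roughly as follows. This is only an illustration under assumptions: the class layout, the use of language codes as keys, the method names and the sample descriptions are all hypothetical and are not taken from GSIM. The sketch also uses composition (one Text per language) rather than inheritance for the composite.

```python
from dataclasses import dataclass, field
from typing import Dict


class Text(str):
    """Text, described in the feedback as being a String."""


@dataclass
class MultilingualText:
    """Composite holding one Text value per language.

    Keying by language code is an assumption made for this sketch.
    """
    values: Dict[str, Text] = field(default_factory=dict)

    def set(self, language: str, value: str) -> None:
        # Store the value as Text so every entry is still a plain string.
        self.values[language] = Text(value)

    def get(self, language: str) -> Text:
        return self.values[language]


# Hypothetical sample content, for illustration only.
description = MultilingualText()
description.set("en", "All households in the country")
description.set("fr", "Tous les ménages du pays")

# Each language entry remains usable wherever a String is expected.
print(isinstance(description.get("fr"), str))
```

Under this reading, an attribute typed as String and a single-language entry of a multilingual attribute are interchangeable at the implementation level, which may be why the choice of name matters more for readers than for systems.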



4 Comments

  1. InKyung Choi

    Also on attribute type

    (Feedback from Norway)

    Many Text fields could/should be Multilingual Text e.g. Legal Framework for Statistical Program (Business Group) is Text, but Legal Base for Statistical Classification (Concepts Group) is Multilingual Text. This should be consistently checked across all groups whenever the Value Type is Text and changed to Multilingual Text.

  2. Danny DELCAMBRE

    Feedback from David BARRACLOUGH (SDMX Statistical Working Group): Preference would be to go with the term that is most linked to standards (W3C, ISO, etc.) used for reference frameworks related to official statistics metadata. I also think that flipping a coin (either option is acceptable) is a problem in itself and the group should look at establishing something for future reference. Based on feedback, it seems that “string” is the one most rooted in standards so preference would be that, as long as further standards analysis backs it up.

  3. InKyung Choi

    (GSIM Revision Meeting 24th October, 2018)

    • Regarding "Text" vs. "String": agreed on "String" 
    • Regarding "Multilingual Text": participants are to inform UNECE if they encounter any attribute that has to be changed to "Multilingual Text". However, as GSIM is a conceptual model, this should not create a big problem. 
  4. InKyung Choi

    GSIM v1.2 has 7 value types:

    1. Boolean
    2. ControlledVocabulary
    3. Date
    4. DateTime
    5. MultilingualText
    6. Number
    7. String
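
    As a rough illustration of how the consistency check suggested in the comments above might be automated, the sketch below lists the seven value types and flags any attribute whose value type is not among them (for example a leftover "Text"). The tuple layout, the function names and the normalization rule are assumptions made for this example; the three sample rows are the attributes mentioned in the feedback on this page.

```python
from enum import Enum


class ValueType(Enum):
    """The seven value types listed above."""
    BOOLEAN = "Boolean"
    CONTROLLED_VOCABULARY = "ControlledVocabulary"
    DATE = "Date"
    DATE_TIME = "DateTime"
    MULTILINGUAL_TEXT = "MultilingualText"
    NUMBER = "Number"
    STRING = "String"


def normalize(name: str) -> str:
    """Drop whitespace and case so 'Multilingual Text' matches 'MultilingualText'."""
    return "".join(name.split()).lower()


_KNOWN = {normalize(v.value) for v in ValueType}


def find_nonstandard(attributes):
    """Return the (object, attribute, value type) rows whose value type is not listed."""
    return [row for row in attributes if normalize(row[2]) not in _KNOWN]


# Rows taken from the feedback on this page; the tuple layout is hypothetical.
attributes = [
    ("Universe", "Description", "Multilingual Text"),
    ("Statistical Program", "Legal Framework", "Text"),
    ("Statistical Classification", "Legal Base", "Multilingual Text"),
]

for obj, attr, vtype in find_nonstandard(attributes):
    print(f"{obj}.{attr}: '{vtype}' is not one of the seven value types")
```

    Such a check only finds value types outside the list; deciding whether a flagged attribute should become String or MultilingualText remains a modelling decision.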