Genericode code lists
A coded field in an invoice — cbc:DocumentCurrencyCode, cbc:InvoiceTypeCode,
the VAT category code — is only valid if its value appears in the right published
list. Those lists are not hard-coded into the rules; they ship as data, in the
OASIS Genericode format (files with a .gc extension). This page shows what a
.gc file looks like and how to look codes up in it efficiently — the real-world
home of the xsl:key and map techniques
from the XSLT section.
What a .gc file is
Genericode (OASIS Code List Representation) expresses one code list as a typed, self-describing table: a column set that declares the columns and names the key column, then a row per code.
- Identification: the list's name, version and a canonical URI. A Schematron binding refers to the list by this identity.
- ColumnSet: declares the columns up front, with a datatype each. A minimal list has a code and a name; real lists often add Status, Description, dates, etc.
- Key: which column is the lookup key. Every row's code is unique.
- SimpleCodeList: the rows. Each Row carries one Value per column, referencing the column by ColumnRef.
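The four parts listed above can be sketched as a trimmed Genericode document, wrapped in Python here so the structure can be read back programmatically. The element names follow the Genericode 1.0 schema; the identification values, the two rows and the variable names are made up for illustration, and the fragment omits metadata a published list would carry, so treat it as a sketch rather than a schema-valid file.

```python
import xml.etree.ElementTree as ET

# Illustrative, trimmed Genericode document (not a published list).
GC_SAMPLE = """\
<gc:CodeList xmlns:gc="http://docs.oasis-open.org/codelist/ns/genericode/1.0/">
  <Identification>
    <ShortName>ExampleCurrency</ShortName>
    <Version>0.1</Version>
    <CanonicalUri>urn:example:codelist:currency</CanonicalUri>
  </Identification>
  <ColumnSet>
    <Column Id="code" Use="required">
      <ShortName>Code</ShortName>
      <Data Type="normalizedString"/>
    </Column>
    <Column Id="name" Use="optional">
      <ShortName>Name</ShortName>
      <Data Type="string"/>
    </Column>
    <Key Id="codeKey">
      <ShortName>CodeKey</ShortName>
      <ColumnRef Ref="code"/>
    </Key>
  </ColumnSet>
  <SimpleCodeList>
    <Row>
      <Value ColumnRef="code"><SimpleValue>EUR</SimpleValue></Value>
      <Value ColumnRef="name"><SimpleValue>Euro</SimpleValue></Value>
    </Row>
    <Row>
      <Value ColumnRef="code"><SimpleValue>SEK</SimpleValue></Value>
      <Value ColumnRef="name"><SimpleValue>Swedish krona</SimpleValue></Value>
    </Row>
  </SimpleCodeList>
</gc:CodeList>
"""

root = ET.fromstring(GC_SAMPLE)

# Only the root element is namespaced; its children are unqualified.
identity = root.findtext("Identification/CanonicalUri")
columns = [col.get("Id") for col in root.findall("ColumnSet/Column")]
key_column = root.find("ColumnSet/Key/ColumnRef").get("Ref")

# Turn each Row into {column id: value}, following each Value's ColumnRef.
rows = [
    {v.get("ColumnRef"): v.findtext("SimpleValue") for v in row.findall("Value")}
    for row in root.findall("SimpleCodeList/Row")
]
```

Reading the key column from the file instead of hard-coding it is what lets one loader serve every list in the folder.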
The lists you actually meet
EN16931 validation ships a folder of these. The common ones:
| List | Field it controls | Source |
|---|---|---|
| Currency codes | cbc:DocumentCurrencyCode (BT-5) | ISO 4217 |
| Country codes | cac:Country/cbc:IdentificationCode | ISO 3166-1 |
| Invoice type codes | cbc:InvoiceTypeCode (BT-3) | UNCL1001 |
| VAT category codes | tax category cbc:ID | UNCL5305 |
| Payment means | cbc:PaymentMeansCode | UNCL4461 |
| Unit of measure | @unitCode on quantities | UN/ECE Rec 20 |
Are code lists shared across formats?
Yes — and at several levels, because a code list attaches to the business term, not to the element that carries it. It lives above the syntax layer.
Across the two EN16931 syntaxes. BT-5 (currency) must be an ISO 4217 code
whether the document writes it as UBL's cbc:DocumentCurrencyCode or CII's
ram:InvoiceCurrencyCode. The allowed-value set is identical; only the element
differs. So the same .gc files back both bindings — there are two separate
mappings, and only the first is syntax-specific:
| Mapping | Goes from | To | Per-syntax? |
|---|---|---|---|
| Syntax binding | business term | an element | yes (cbc:… vs ram:…) |
| Code-list binding | business term | a list | no — shared |
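One way to picture the two mappings is as two separate lookup tables, sketched below in Python. The dictionary and function names are hypothetical; only the BT-5 element names and the ISO 4217 source come from the text above.

```python
# Syntax binding: business term -> element, one entry per syntax.
SYNTAX_BINDING = {
    "BT-5": {
        "UBL": "cbc:DocumentCurrencyCode",
        "CII": "ram:InvoiceCurrencyCode",
    },
}

# Code-list binding: business term -> list, shared by every syntax.
CODE_LIST_BINDING = {
    "BT-5": "ISO 4217",
}


def resolve(term, syntax):
    """Return (element, list) for a business term in a given syntax."""
    return SYNTAX_BINDING[term][syntax], CODE_LIST_BINDING[term]
```

Here `resolve("BT-5", "UBL")` and `resolve("BT-5", "CII")` differ in the element they return but agree on the list, which is the point of keeping the second table free of any syntax key.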
Across document types. Within UBL, an Invoice, a CreditNote and an Order all draw currency from the same ISO 4217 list. The list does not know which document references it.
Across unrelated standards. The base lists are external global standards — ISO 4217, ISO 3166, the UNCL lists, UN/ECE Rec 20 — reused far beyond invoicing. EN16931 does not own them; it references them. Genericode is a format-neutral table keyed by code, so any consumer can load the same file, and the lookup below is identical regardless of the document being validated.
A profile narrows a shared list; it does not redefine it
A CIUS such as Peppol may restrict a shared list to a subset (consistent with "narrow, never loosen") and add its own scheme lists (EAS, ICD) on top. The base list stays shared; the profile just applies a tighter allowed set. In short, the value sets are shared and format-neutral; only the element-level statement that "this field uses that list" is restated per syntax.
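The narrowing rule can be expressed as a subset check. The sketch below assumes a hypothetical profile and uses a handful of UNCL1001 invoice type codes as the base; the names and the profile's allowed set are illustrative, not taken from any published CIUS.

```python
# A few UNCL1001 codes standing in for the full shared list.
BASE_LIST = {"380", "381", "384", "389", "751"}

# Hypothetical profile restriction: a subset of the base list.
PROFILE_ALLOWED = {"380", "381"}


def narrow(base, allowed):
    """Return the profile's effective list, refusing any code the base lacks."""
    extra = allowed - base
    if extra:
        # A profile may narrow the list but never add to it.
        raise ValueError(f"profile adds codes not in the base list: {sorted(extra)}")
    return base & allowed


effective = narrow(BASE_LIST, PROFILE_ALLOWED)
```

The base set is never modified; the profile only produces a smaller view of it.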
Looking a code up
The job is the codelist join from the XSLT
section, on real data: the invoice says EUR, and validation must confirm EUR
is a row in Currency-2.1.gc. Because a .gc file declares its key column, it
maps directly onto an xsl:key:
- Index every Row by its code cell, the same column the .gc <Key> names.
- Switch the context into the loaded .gc document so key() searches its index (the gotcha explained in Keys and indexed lookup).
- key() returns the matching row, or an empty result if the code is unknown. A predicate scan would re-read every currency row on every lookup; the key reads an index.
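The same three steps can be sketched outside XSLT. In the Python analogue below a dictionary plays the role of the xsl:key index; the embedded list, `build_index` and `lookup` are hypothetical names for illustration, and the sample is trimmed to the parts the lookup needs.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative code list: just the Key declaration and two rows.
GC_DOC = """\
<gc:CodeList xmlns:gc="http://docs.oasis-open.org/codelist/ns/genericode/1.0/">
  <ColumnSet>
    <Key Id="codeKey"><ColumnRef Ref="code"/></Key>
  </ColumnSet>
  <SimpleCodeList>
    <Row>
      <Value ColumnRef="code"><SimpleValue>EUR</SimpleValue></Value>
      <Value ColumnRef="name"><SimpleValue>Euro</SimpleValue></Value>
    </Row>
    <Row>
      <Value ColumnRef="code"><SimpleValue>USD</SimpleValue></Value>
      <Value ColumnRef="name"><SimpleValue>US Dollar</SimpleValue></Value>
    </Row>
  </SimpleCodeList>
</gc:CodeList>
"""


def build_index(gc_root):
    """Index every Row by the column that the Key element names."""
    key_column = gc_root.find("ColumnSet/Key/ColumnRef").get("Ref")
    index = {}
    for row in gc_root.findall("SimpleCodeList/Row"):
        for value in row.findall("Value"):
            if value.get("ColumnRef") == key_column:
                index[value.findtext("SimpleValue")] = row
    return index


# Built once per loaded list, then reused for every lookup.
index = build_index(ET.fromstring(GC_DOC))


def lookup(code):
    """Return the matching Row element, or None if the code is unknown."""
    return index.get(code)
```

A membership test is then `code in index`, and `lookup("EUR")` hands back the whole row when the name or another column is needed.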
In XSLT 3.0 the same list reads naturally as a map: load it once, index it by code, then run an effectively constant-time membership test for each value.
Why the index matters here specifically
UN/ECE Rec 20 (units) has thousands of rows; a multi-line invoice checks a
@unitCode on every line. A predicate scan is rows × lines work per
document; an xsl:key or map makes it rows + lines. This is the textbook
case from Keys and indexed lookup, with real numbers behind
it.
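To make the rows × lines versus rows + lines difference concrete, the sketch below counts steps for both strategies. The inputs are synthetic (2,000 made-up unit codes and 50 invoice lines that all use the last code, which is the worst case for a scan), and the function names are hypothetical.

```python
def scan_steps(list_codes, line_codes):
    """Count comparisons when every lookup walks the list from the top."""
    steps = 0
    for wanted in line_codes:
        for code in list_codes:
            steps += 1
            if code == wanted:
                break
    return steps


def indexed_steps(list_codes, line_codes):
    """Count one step per row to build the index, then one per lookup."""
    index = set()
    steps = 0
    for code in list_codes:
        index.add(code)
        steps += 1
    for wanted in line_codes:
        _found = wanted in index
        steps += 1
    return steps


# Synthetic data: 2,000 codes, 50 lines all referencing the last code.
list_codes = [f"U{i:04d}" for i in range(2000)]
line_codes = ["U1999"] * 50

scan_total = scan_steps(list_codes, line_codes)
indexed_total = indexed_steps(list_codes, line_codes)
```

With these inputs the scan performs 100,000 comparisons, while building and using the index takes 2,050 steps.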
How validation actually uses them
In the EN16931 artefacts you rarely write the lookup yourself: the code-list
checks are compiled into the Schematron. A binding file associates each coded
field with a list, and the toolchain generates assertions equivalent to "the value
of this field exists in list X." Logically each assertion is the same membership
test as the keyed lookup above; the standard just authors it declaratively so the
rule, the field, and the list stay in separate, maintainable files, mirroring the
include split of the rule model itself.
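The separation of rule, field and list can be sketched as data plus one generic check. This is not the EN16931 toolchain: the binding format, the rule identifiers and the tiny code lists below are all hypothetical, and only the UBL element names and namespaces are real.

```python
import xml.etree.ElementTree as ET

NS = {"cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"}

# Hypothetical binding data: which field is checked against which list.
BINDINGS = [
    {"id": "EX-CL-1", "path": "cbc:DocumentCurrencyCode", "list": "currency"},
    {"id": "EX-CL-2", "path": "cbc:InvoiceTypeCode", "list": "invoice-type"},
]

# Code lists kept apart from the bindings (tiny illustrative subsets).
CODE_LISTS = {
    "currency": {"EUR", "USD"},
    "invoice-type": {"380", "381"},
}


def check(invoice_root):
    """Return the ids of bindings whose field value is not in its list."""
    failures = []
    for binding in BINDINGS:
        allowed = CODE_LISTS[binding["list"]]
        for node in invoice_root.findall(binding["path"], NS):
            if (node.text or "").strip() not in allowed:
                failures.append(binding["id"])
    return failures


# Minimal invoice fragment with one valid and one unknown code.
INVOICE = """\
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
         xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
  <cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode>
  <cbc:DocumentCurrencyCode>XXQ</cbc:DocumentCurrencyCode>
</Invoice>
"""

failures = check(ET.fromstring(INVOICE))
```

Adding a new coded field means adding one entry to the bindings; neither the check function nor the lists change.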
Next
The last piece is how EN16931 gets specialised for a real network — added fields, narrowed lists, and the one thing a profile is forbidden to do: Peppol and CIUS profiles.