GPX and KML: two geo vocabularies, two extension styles
Geospatial data is one of XML's quiet success stories: your watch records a run as GPX, Google Earth saves places as KML. Both are small enough to read whole, both are namespaced, and — the reason they share a page — they solve "how do I add data the base spec didn't define?" in two different ways. Put next to each other, they make the extension question concrete.
GPX: a clean schema with an <extensions> hatch
GPX (the GPS Exchange Format) is defined by a real, compact XSD. A track is a sequence of points, each carrying its coordinates as attributes and its optional data as child elements.
- Position is in attributes (lat, lon); elevation and time are in child elements. This is a deliberate split between identity and payload.
- The <extensions> element is GPX's named escape hatch. The base schema does not define heart rate, so Garmin put hr in its own namespace (gpxtpx:) and dropped it inside <extensions>. A reader that only knows base GPX skips the whole subtree.
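The two points above can be sketched with Python's standard xml.etree.ElementTree. This is a minimal illustration, not a full GPX reader: the sample document and its values are made up, read_points is a hypothetical helper name, and the Garmin namespace URI is assumed to be the TrackPointExtension v1 one.

```python
import xml.etree.ElementTree as ET

# Prefixes used in our queries, bound to the namespace URIs.
NS = {
    "g": "http://www.topografix.com/GPX/1/1",
    # Assumption: Garmin TrackPointExtension v1 namespace.
    "gpxtpx": "http://www.garmin.com/xmlschemas/TrackPointExtension/v1",
}

# Hypothetical sample; the values are invented for illustration.
SAMPLE_GPX = """\
<gpx xmlns="http://www.topografix.com/GPX/1/1"
     xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
     version="1.1" creator="example">
  <trk>
    <trkseg>
      <trkpt lat="47.6445" lon="-122.3268">
        <ele>4.5</ele>
        <time>2024-05-01T07:30:00Z</time>
        <extensions>
          <gpxtpx:TrackPointExtension>
            <gpxtpx:hr>142</gpxtpx:hr>
          </gpxtpx:TrackPointExtension>
        </extensions>
      </trkpt>
    </trkseg>
  </trk>
</gpx>
"""

def read_points(gpx_text, with_heart_rate=False):
    """Return track points as dicts; optionally read the Garmin hr extension."""
    root = ET.fromstring(gpx_text)
    points = []
    for pt in root.iterfind(".//g:trkpt", NS):
        point = {
            "lat": float(pt.get("lat")),   # identity: attributes
            "lon": float(pt.get("lon")),
            "ele": pt.findtext("g:ele", namespaces=NS),    # payload: children
            "time": pt.findtext("g:time", namespaces=NS),
        }
        if with_heart_rate:
            # Only an extension-aware reader looks inside <extensions>.
            hr = pt.findtext(
                "g:extensions/gpxtpx:TrackPointExtension/gpxtpx:hr",
                namespaces=NS,
            )
            point["hr"] = int(hr) if hr is not None else None
        points.append(point)
    return points

base_view = read_points(SAMPLE_GPX)
extended_view = read_points(SAMPLE_GPX, with_heart_rate=True)
```

The base reader never touches the <extensions> subtree, so it works unchanged whether or not vendor data is present.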
The GPX schema, browseably
GPX is a good schema to read because it is small and uses the constructs from
the XSD chapter without ceremony. unxml --xsd renders it as a data model:
schema http://www.topografix.com/GPX/1/1 (elementFormDefault=qualified)
  xmlns = http://www.topografix.com/GPX/1/1

element gpx : gpxType

type gpxType
  metadata   : metadataType ?      # (1)
  wpt        : wptType *
  trk        : trkType *
  extensions : extensionsType ?    # (2)
  @version   : xsd:string (required)
  @creator   : xsd:string (required)

type wptType
  ele  : xsd:decimal ?
  time : xsd:dateTime ?
  @lat : latitudeType (required)
  @lon : longitudeType (required)

type latitudeType : xsd:decimal [-90.0..90.0]    # (3)
- (1) The occurrence suffixes are unxml's shorthand: ? optional, * zero-or-more, + one-or-more. They are the same minOccurs/maxOccurs you met in the XSD chapter, compressed to one character.
- (2) extensions : extensionsType ? shows that the hatch is a typed element in the schema, and extensionsType itself is an any wildcard accepting foreign-namespace content. Extension is a first-class, named part of the model, not an afterthought.
- (3) latitudeType : xsd:decimal [-90.0..90.0] is a range-restricted simple type. The schema enforces that latitude is a real coordinate, so lat="999" fails validation. The bracket range is unxml's rendering of minInclusive/maxInclusive.
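To make the suffixes and the range restriction concrete, here is a small hand-rolled sketch in Python. It is not XSD validation and does not call any schema library: OCCURS and is_valid_latitude are hypothetical names, and the check is looser than xsd:decimal in one respect (Python's Decimal also accepts exponent notation such as 1e1).

```python
from decimal import Decimal, InvalidOperation

# The three one-character suffixes, spelled out as (minOccurs, maxOccurs).
OCCURS = {
    "?": (0, 1),            # optional
    "*": (0, "unbounded"),  # zero-or-more
    "+": (1, "unbounded"),  # one-or-more
}

# Bounds mirroring minInclusive / maxInclusive on latitudeType.
LAT_MIN = Decimal("-90.0")
LAT_MAX = Decimal("90.0")

def is_valid_latitude(text):
    """Approximate latitudeType: a finite decimal within [-90.0, 90.0]."""
    try:
        value = Decimal(text.strip())
    except InvalidOperation:
        return False          # not a number at all
    if not value.is_finite():
        return False          # rejects NaN and Infinity
    return LAT_MIN <= value <= LAT_MAX

checks = {raw: is_valid_latitude(raw) for raw in ["47.6445", "-90.0", "999", "north"]}
```

A real validator would apply the same bounds from the schema itself; the point here is only that lat="999" is rejected before it reaches application code.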
KML: extension by reserved prefix
KML (Keyhole Markup Language, now an OGC standard) describes places, styles, and
overlays. Its extension style is different: instead of a generic <extensions>
bag, Google reserved a prefix, gx:, for vendor extensions that live inline
wherever they apply.
- A Style is defined once with an id, then referenced. This is the same define-once/reference-by-id idea as OOXML relationships and SVG's <defs>, recurring across vocabularies.
- <styleUrl>#pin</styleUrl> is the reference. KML reuses the URL-fragment convention (#id) for intra-document links.
- coordinates is lon,lat,alt: longitude first, the opposite of GPX's lat/lon attributes. A perennial bug source when converting between the two.
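The three points above can be sketched in Python with the standard library. The sample KML is invented for illustration, and parse_kml_coordinate and placemarks_with_styles are hypothetical helper names; the sketch only handles a single Point per Placemark and only same-document #id references.

```python
import xml.etree.ElementTree as ET

KML_NS = {"k": "http://www.opengis.net/kml/2.2"}

# Hypothetical sample; the values are invented for illustration.
SAMPLE_KML = """\
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Style id="pin">
      <IconStyle><scale>1.2</scale></IconStyle>
    </Style>
    <Placemark>
      <name>Trailhead</name>
      <styleUrl>#pin</styleUrl>
      <Point>
        <coordinates>-122.3268,47.6445,4.5</coordinates>
      </Point>
    </Placemark>
  </Document>
</kml>
"""

def parse_kml_coordinate(text):
    """Split one KML 'lon,lat[,alt]' tuple; return it in GPX order (lat, lon, alt)."""
    parts = [float(p) for p in text.strip().split(",")]
    lon, lat = parts[0], parts[1]          # KML puts longitude first
    alt = parts[2] if len(parts) > 2 else None
    return lat, lon, alt

def placemarks_with_styles(kml_text):
    root = ET.fromstring(kml_text)
    # Index every Style by its id so '#id' references can be resolved.
    styles = {s.get("id"): s for s in root.iterfind(".//k:Style", KML_NS)}
    result = []
    for pm in root.iterfind(".//k:Placemark", KML_NS):
        style_url = pm.findtext("k:styleUrl", default="", namespaces=KML_NS)
        style = styles.get(style_url[1:]) if style_url.startswith("#") else None
        coords = pm.findtext("k:Point/k:coordinates", namespaces=KML_NS)
        result.append({
            "name": pm.findtext("k:name", namespaces=KML_NS),
            "style_found": style is not None,
            "lat_lon_alt": parse_kml_coordinate(coords) if coords else None,
        })
    return result

placemarks = placemarks_with_styles(SAMPLE_KML)
```

Doing the lon/lat swap in one named function, rather than at each call site, keeps the conversion bug in a single place where it can be tested.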
Two extension philosophies, side by side
- GPX: a single, schema-defined <extensions> container. Extensions are quarantined in one place, easy for a base reader to skip wholesale.
- KML: a reserved gx: namespace used inline. Extensions appear right where they belong (e.g. gx:Track beside a Point), at the cost of being sprinkled throughout the document.
Both are valid; the choice is between containment and locality. When you design your own extensible format, this is the fork in the road.
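One way to see containment versus locality is to ask, for each document, which base-namespace elements directly contain foreign-namespace children. The sketch below does that in Python; extension_parents is a hypothetical helper, both samples are invented and only meant to be well-formed (they were not validated against the GPX or KML schemas), and the Garmin namespace URI is assumed to be the TrackPointExtension v1 one.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"
KML_NS = "http://www.opengis.net/kml/2.2"

# Hypothetical samples, invented for illustration.
GPX_DOC = """\
<gpx xmlns="http://www.topografix.com/GPX/1/1"
     xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
     version="1.1" creator="example">
  <trk><trkseg>
    <trkpt lat="47.6445" lon="-122.3268">
      <extensions>
        <gpxtpx:TrackPointExtension><gpxtpx:hr>142</gpxtpx:hr></gpxtpx:TrackPointExtension>
      </extensions>
    </trkpt>
  </trkseg></trk>
</gpx>
"""

KML_DOC = """\
<kml xmlns="http://www.opengis.net/kml/2.2"
     xmlns:gx="http://www.google.com/kml/ext/2.2">
  <Document>
    <Placemark>
      <Point>
        <gx:altitudeMode>relativeToSeaFloor</gx:altitudeMode>
        <coordinates>-122.3268,47.6445,-10</coordinates>
      </Point>
    </Placemark>
    <Placemark>
      <gx:Track>
        <when>2024-05-01T07:30:00Z</when>
        <gx:coord>-122.3268 47.6445 4.5</gx:coord>
      </gx:Track>
    </Placemark>
  </Document>
</kml>
"""

def split_tag(tag):
    """Split ElementTree's '{uri}local' tag into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag

def extension_parents(xml_text, base_ns):
    """Local names of base-namespace elements that directly hold foreign children."""
    root = ET.fromstring(xml_text)
    parents = set()
    for parent in root.iter():
        parent_ns, parent_name = split_tag(parent.tag)
        if parent_ns != base_ns:
            continue  # only inspect the boundary between base and foreign content
        for child in parent:
            if split_tag(child.tag)[0] != base_ns:
                parents.add(parent_name)
    return parents

gpx_parents = extension_parents(GPX_DOC, GPX_NS)
kml_parents = extension_parents(KML_DOC, KML_NS)
```

For these samples the GPX result is a single container name, while the KML result names every element type that happens to carry a gx: child, which is the trade-off in miniature.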
Querying coordinates
Both default-namespace their core, so the
SVG namespace caveat applies again:
//trkpt or //Placemark match nothing until you bind the GPX / KML namespace
URI to a prefix and query //g:trkpt. The vendor extensions (gpxtpx:, gx:)
are already prefixed, but you still must bind their URIs in your query host.
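Here is how that looks with Python's xml.etree.ElementTree as the query host, which supports a limited XPath subset. The sample document is invented, the Garmin namespace URI is assumed to be the TrackPointExtension v1 one, and the prefixes g and tpx are arbitrary names chosen in the query, not taken from the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the values are invented for illustration.
SAMPLE_GPX = """\
<gpx xmlns="http://www.topografix.com/GPX/1/1"
     xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
     version="1.1" creator="example">
  <trk><trkseg>
    <trkpt lat="47.6445" lon="-122.3268">
      <extensions>
        <gpxtpx:TrackPointExtension><gpxtpx:hr>142</gpxtpx:hr></gpxtpx:TrackPointExtension>
      </extensions>
    </trkpt>
  </trkseg></trk>
</gpx>
"""

root = ET.fromstring(SAMPLE_GPX)

# Unprefixed name: looks for trkpt in no namespace, so nothing matches.
unprefixed = root.findall(".//trkpt")

# Bind prefixes to URIs on the query side. The prefix spelling is ours to
# choose; only the URI has to match the document.
ns = {
    "g": "http://www.topografix.com/GPX/1/1",
    "tpx": "http://www.garmin.com/xmlschemas/TrackPointExtension/v1",
}
bound = root.findall(".//g:trkpt", ns)
coords = [(pt.get("lat"), pt.get("lon")) for pt in bound]

# The document writes gpxtpx:hr; the query uses tpx:hr and still matches.
heart_rates = [el.text for el in root.findall(".//tpx:hr", ns)]
```

Attributes such as lat and lon are not in any namespace here, so they are read with plain names even though their parent element needs a prefix.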
Things to note
- Two real, readable geo vocabularies that are small enough to learn whole, and whose schemas read cleanly under unxml --xsd.
- Extension by container (<extensions>) vs extension by reserved prefix (gx:): the two dominant strategies, with opposite trade-offs.
- Simple-type restrictions earn their keep: a latitude range catches bad data at validation time.
- Coordinate order (lat,lon vs lon,lat) is a reminder that a schema constrains structure, not meaning. That part is still on you.
Next: XBRL — namespacing pushed to its limit, where you define a whole namespace of business concepts.