unxml — reading XML compactly
Throughout this site, schemas, stylesheets, and the odd large document are shown
rendered with unxml rather than as raw markup. This page explains the tool
those renders come from.
Full disclosure
unxml is written by the author of
this site, so putting it in an appendix is shameless self-promotion. It earns
the page anyway: the site uses it on nearly every real-world page as a
reading aid, and the renders you see are meaningless unless the notation is
documented somewhere. This is that somewhere. It is an appendix, not a
chapter, precisely because the tool is a convenience, not part of the XML
standards the rest of the site teaches.
The problem it solves
XML is a fine format for machines and a tiring one for eyes. The information
density of a document is low: a large fraction of the bytes are closing tags,
xs:/xsl: ceremony, and namespace declarations. When you are trying to grasp
the shape of a schema or the logic of a stylesheet, that ceremony is
noise.
unxml flattens an XML file into an indented, bracket-free outline that keeps the
structure and drops the syntax. It auto-detects the document kind, and has
dedicated rendering modes for the XML dialects this site covers.
Plain mode — instance documents
With no flags, unxml flattens an ordinary document. Elements become indented
names, attributes fold into ( … ), and leaf text becomes name = value:
<invoice xmlns="urn:example:invoice" id="INV-42" currency="EUR">
  <supplier>
    <name>Acme Records</name>
    <country>FI</country>
  </supplier>
  <line sku="LP-001" qty="3">
    <desc>Vinyl LP</desc>
    <price>19.90</price>
  </line>
  <total>59.70</total>
</invoice>
invoice(currency="EUR", id="INV-42", xmlns="urn:example:invoice")
  supplier
    name = Acme Records
    country = FI
  line(qty="3", sku="LP-001")
    desc = Vinyl LP
    price = 19.90
  total = 59.70
The tree is the same; the angle brackets are gone. (On this site, instance documents are usually shown as raw XML anyway — for a short specimen the markup is already clear. The compact form earns its keep on the schemas and stylesheets below, which are far denser.)
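The three rules above (indented names, attributes folded into parentheses, leaf text as name = value) can be sketched in a few lines of Python. This is an illustration of the idea, not unxml's actual implementation, which is written in Rust. The function name `flatten`, the two-space indent, and the alphabetical attribute order (inferred from the sample render) are assumptions, and the sample drops the xmlns declaration because ElementTree does not expose it as an attribute.

```python
import xml.etree.ElementTree as ET


def flatten(elem, depth=0, indent="  "):
    """Return outline lines for elem and its subtree (hypothetical helper)."""
    pad = indent * depth
    head = elem.tag
    if elem.attrib:
        # Assumption: attributes are listed alphabetically, as in the sample render.
        attrs = ", ".join(f'{k}="{v}"' for k, v in sorted(elem.attrib.items()))
        head += f"({attrs})"
    children = list(elem)
    text = (elem.text or "").strip()
    if not children and text:
        # A leaf with text collapses to a single "name = value" line.
        return [f"{pad}{head} = {text}"]
    lines = [pad + head]
    for child in children:
        lines.extend(flatten(child, depth + 1, indent))
    return lines


SAMPLE = """<invoice id="INV-42" currency="EUR">
  <supplier><name>Acme Records</name><country>FI</country></supplier>
  <line sku="LP-001" qty="3"><desc>Vinyl LP</desc><price>19.90</price></line>
  <total>59.70</total>
</invoice>"""

outline = flatten(ET.fromstring(SAMPLE))
print("\n".join(outline))
```

The sketch ignores mixed content, comments, and namespaces entirely; it only shows that the compact form is a straightforward walk over the same tree.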
Prose with inline elements
Document-style XML interleaves text with small inline elements — a <para> with
a <command> inside it. Flattening every run onto its own line would shred the
sentence, so unxml keeps such mixed content on one line, as verbatim XML:
para = The <command>widget</command> daemon keeps its <filename>state.db</filename> in one place.
The rule is structural, not a list of known tags: an element flows inline when
its whole subtree is inline-safe — text interleaved with elements that are
themselves inline-safe. A leaf with significant multi-line text (a
<programlisting> or <screen>) is not inline-safe, so its parent stays in
the flattened block form and the listing keeps its line breaks. The principle:
flatten the scaffolding, quote the prose — angle brackets vanish from the
document skeleton but remain on the short inline spans, where the markup is the
content. This is what makes the DocBook skeleton read
the way it does.
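As a rough illustration of that structural rule, the check can be written as a recursive predicate. This is a sketch under stated assumptions, not unxml's code: the helper names are hypothetical, and "significant multi-line text" is approximated here as leaf text that still contains a line break after trimming.

```python
import xml.etree.ElementTree as ET


def is_inline_safe(elem):
    """A leaf is safe unless its text spans several lines; a parent is safe if all children are."""
    children = list(elem)
    if not children:
        # Assumption: any line break left after trimming counts as "significant".
        return "\n" not in (elem.text or "").strip()
    return all(is_inline_safe(child) for child in children)


def has_mixed_content(elem):
    """True when an element interleaves its own text with child elements."""
    if len(elem) == 0:
        return False
    own_text = (elem.text or "").strip()
    tail_text = any((child.tail or "").strip() for child in elem)
    return bool(own_text) or tail_text


def flows_inline(elem):
    """Decide whether an element would be kept on one line as verbatim XML."""
    return has_mixed_content(elem) and is_inline_safe(elem)


para = ET.fromstring(
    "<para>The <command>widget</command> daemon keeps its "
    "<filename>state.db</filename> in one place.</para>"
)
listing = ET.fromstring("<para>Run <screen>ls\ncd /tmp</screen> first.</para>")
supplier = ET.fromstring("<supplier><name>Acme Records</name></supplier>")

print(flows_inline(para), flows_inline(listing), flows_inline(supplier))
```

Note that the predicate never consults a list of tag names: the `<screen>` case is rejected because of its content, and the `<supplier>` case because it has no text of its own.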
--xsd — XML Schema
A schema's job is to declare a data model, but xs:complexType / xs:sequence /
xs:element scaffolding buries it. --xsd rewrites the schema vocabulary into a
type-declaration syntax that reads like the model it describes:
schema
  element invoice
    line +
      type
        desc : xs:string
        price : xs:decimal
        @sku : xs:string (required)
        @qty : xs:integer
    @currency : xs:string
: is "typed as", @ an attribute, + one-or-more (? optional, *
zero-or-more), (required) a use constraint. The whole XSD
vocabulary compresses to a handful of symbols — see the
XSD section for where this is used in anger.
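The symbols line up with standard XSD attributes: the occurrence markers with minOccurs and maxOccurs, and (required) with use="required". A minimal Python sketch of that correspondence follows. The function names are hypothetical, and how unxml treats other bounded ranges (say, maxOccurs="5") is not covered here, so the sketch simply leaves them unmarked.

```python
import xml.etree.ElementTree as ET


def occurrence_marker(min_occurs="1", max_occurs="1"):
    """Map XSD minOccurs/maxOccurs (as attribute strings) to ?, * or +."""
    if max_occurs == "unbounded":
        return "*" if min_occurs == "0" else "+"
    if min_occurs == "0" and max_occurs == "1":
        return "?"
    # Exactly-once carries no marker; other bounded ranges are out of scope
    # for this sketch and are also left unmarked.
    return ""


def attribute_line(node):
    """Render an xs:attribute declaration as '@name : type', plus '(required)' if set."""
    line = f"@{node.get('name')} : {node.get('type')}"
    if node.get("use") == "required":
        line += " (required)"
    return line


sku = ET.fromstring(
    '<xs:attribute xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'name="sku" type="xs:string" use="required"/>'
)
qty = ET.fromstring(
    '<xs:attribute xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'name="qty" type="xs:integer"/>'
)

print(attribute_line(sku))
print(attribute_line(qty))
print("line", occurrence_marker("1", "unbounded"))
```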
--xslt — stylesheets
--xslt is what the XSLT at scale page uses to show wide
spans of the DocBook stylesheets. It turns templates, xsl:choose, functions,
and output ceremony into pseudocode:
xsl:stylesheet(version="3.0", xmlns:xsl="http://www.w3.org/1999/XSL/Transform")
  match /invoice:
    html
      body
        apply line
  match line:
    p
      <- desc
      ":"
      <- price
The notation:
| unxml --xslt | XSLT it stands for |
|---|---|
| match X: | xsl:template match="X" |
| apply / apply X | xsl:apply-templates (of children / of a select) |
| <- expr | xsl:value-of select="expr" (a string) |
| <-- expr | xsl:sequence select="expr" (a sequence — the doubled arrow) |
| <- expr ?? "text" | a value-of with literal fallback text |
| name as T := … | a typed xsl:variable / xsl:param |
| function f:n(args) -> T: | xsl:function name="f:n" as="T" |
| choose: / when X: / else: | xsl:choose / xsl:when / xsl:otherwise |
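Several rows of this table amount to a one-to-one rewrite of an XSLT instruction, which a short Python sketch can make concrete. It covers only the template, apply, value-of, sequence, and choose rows, and it emits the notation without indentation. The function name `notation_for` is hypothetical and nothing here reflects how unxml is actually implemented.

```python
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"


def notation_for(node):
    """Return the compact notation for one XSLT instruction, or None if not covered."""
    tag = node.tag
    if tag == XSL + "template" and node.get("match"):
        return f"match {node.get('match')}:"
    if tag == XSL + "apply-templates":
        select = node.get("select")
        return f"apply {select}" if select else "apply"
    if tag == XSL + "value-of":
        return f"<- {node.get('select')}"
    if tag == XSL + "sequence":
        return f"<-- {node.get('select')}"
    if tag == XSL + "choose":
        return "choose:"
    if tag == XSL + "when":
        return f"when {node.get('test')}:"
    if tag == XSL + "otherwise":
        return "else:"
    return None  # literal result elements and everything else are skipped here


STYLESHEET = """<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/invoice">
    <html><body><xsl:apply-templates select="line"/></body></html>
  </xsl:template>
  <xsl:template match="line">
    <p><xsl:value-of select="desc"/>:<xsl:value-of select="price"/></p>
  </xsl:template>
</xsl:stylesheet>"""

root = ET.fromstring(STYLESHEET)
# iter() walks the tree in document order, so the lines come out top to bottom.
lines = [text for text in map(notation_for, root.iter()) if text is not None]
print("\n".join(lines))
```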
--schematron and --wsdl
The same idea covers the other validation and service dialects. --schematron
reduces a rule set to context-and-assertion lines.
--wsdl renders a WSDL 1.1 / SOAP service description, including its embedded
<types> schema via the --xsd transformation — it is what the
SOAP and WSDL page uses.
Handy flags
A few options come up often:
| Flag | What it does |
|---|---|
| --auto | Pick the mode from each file's extension (.xsl/.xslt, .sch, .xsd). Lets you point it at a mixed set of files. |
| --select NAME | Render only the subtrees whose tag matches NAME, as top-level fragments — how this site quotes one template out of a big module. |
| --hide-ns p,q | Drop namespace prefixes (and their xmlns: declarations) to cut noise — e.g. --hide-ns cbc,cac on UBL. |
| --expand | Inline xsl:apply-templates by pulling in matching templates from imports — trace a transform without chasing files. |
| --bat | Pipe through bat for syntax-highlighted, paged output (implies --auto). |
| --stdin | Read XML from standard input instead of a file. |
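The extension-based dispatch behind --auto is simple enough to sketch as a lookup table. The extensions are the ones listed in the table above; the mode labels, the case-insensitive match, and the fallback to plain mode for anything else are assumptions made for illustration, not documented behaviour.

```python
from pathlib import Path

# Extensions from the flag table; the mode labels are illustrative only.
MODE_BY_EXTENSION = {
    ".xsl": "xslt",
    ".xslt": "xslt",
    ".sch": "schematron",
    ".xsd": "xsd",
}


def mode_for(path):
    """Pick a rendering mode from a file's extension (hypothetical helper)."""
    # Assumption: unknown extensions fall back to plain mode.
    return MODE_BY_EXTENSION.get(Path(path).suffix.lower(), "plain")


for name in ["docbook.xsl", "invoice.xsd", "rules.sch", "order.xml"]:
    print(name, "->", mode_for(name))
```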
Installing it
uv tool install unxml-rs # from PyPI
# or grab a binary from the GitHub releases page
# or, if you have a Rust toolchain:
cargo install --git https://github.com/vivainio/unxml-rs
There is a gallery of real documents rendered side-by-side at
vivainio.github.io/unxml-demos, and
the full set of schema transformations is documented in the
unxml XSD docs.
It is itself an XML-processing program
unxml is written in Rust on top of the quick-xml streaming parser — so it
doubles as a worked example for the Rust XML APIs
page. Reading XML compactly is, after all, just one more thing programs do
with XML.