Whitespace and xsl:text
Sooner or later your output grows a stray space, a run of blank lines, or
indentation you never asked for. Almost always the cause is whitespace — in the
stylesheet, in the source document, or added by the serialiser. This page
shows where each kind of whitespace comes from and the small tools that let you
control it: xsl:text, xsl:strip-space / xsl:preserve-space,
normalize-space(), and xsl:output indent.
Whitespace in the stylesheet
A whitespace-only text node sitting between XSLT elements is stripped from the stylesheet automatically. The indentation that makes your stylesheet readable therefore does not reach the output:
<xsl:template match="cd">
  <xsl:value-of select="title"/>
  <xsl:value-of select="artist"/>
</xsl:template>
The newlines and spaces around the two xsl:value-of elements are
whitespace-only nodes between instructions, so they are discarded. The title and
artist come out glued together:
Empire BurlesqueBob Dylan
The main exception is text inside xsl:text, which is never stripped (see
below); whitespace under an element that carries xml:space="preserve" is also
kept. Everything else that is whitespace-only between elements goes away.
Indentation next to literal text is not whitespace-only
The rule only removes text nodes that are entirely whitespace. The moment a text node also contains real characters, all of its whitespace is kept. So indentation that shares a text node with literal text inside a literal result element surfaces in the output:
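A minimal sketch of the effect, assuming a hypothetical `<line>` literal result element with a `Title:` label and a `<lines>` wrapper (all three are illustrative choices, not taken from a real stylesheet):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="catalog">
    <lines><xsl:apply-templates select="cd"/></lines>
  </xsl:template>

  <xsl:template match="cd">
    <!-- The text node before xsl:value-of is newline + indentation + "Title: ".
         It contains real characters, so every character of it is kept. -->
    <line>
      Title: <xsl:value-of select="title"/>
    </line>
  </xsl:template>
</xsl:stylesheet>
```

Each `<line>` in the result should begin with a newline and the indentation, followed by `Title: ` and the title. The text between `xsl:value-of` and `</line>` is whitespace-only, so it is still stripped and nothing follows the title.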
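Suppose a <line> literal result element holds a label such as Title: on its
own indented line, followed by xsl:value-of. The newline and indentation
before the label are kept, because they share a text node with the label. The
result is <line> with a leading newline, indentation, the label, then the
title, which is rarely what you want. Whitespace-only text, such as the newline
between xsl:value-of and </line>, is still stripped. Start the text on the same
line as its parent's start tag, or wrap it in xsl:text, to avoid the stray
whitespace.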
xsl:text: emit text exactly
xsl:text writes its content to the output verbatim, and its content is
exempt from the whitespace-stripping rule. That makes it the precise tool for
two jobs: forcing a space (or newline) to appear, and stopping unwanted
whitespace from appearing.
Force exactly one space
To put a single space between two values, do not rely on the layout of your
stylesheet — wrap the separator in xsl:text:
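A minimal sketch, assuming a source with `cd` elements that each hold a `title` and an `artist`; the closing `&#10;` is added here only so that successive records do not run together:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="cd">
    <xsl:value-of select="title"/>
    <!-- One space, protected from stripping because it sits inside xsl:text. -->
    <xsl:text> </xsl:text>
    <xsl:value-of select="artist"/>
    <!-- A line feed, written as a character reference, ends the record. -->
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```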
The space inside <xsl:text> </xsl:text> is real content, so it is emitted
exactly once:
Empire Burlesque Bob Dylan
A newline works the same way: <xsl:text>&#10;</xsl:text> emits one line break
(&#10; is the character reference for a line feed), which is the usual way to
end a record in method="text" output.
Suppress unwanted whitespace
An empty <xsl:text/> outputs nothing, but it is an element. Placing it
between two values lets you spread instructions across several indented lines
while the whitespace between them stays whitespace-only (and so is stripped):
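A minimal sketch of that layout, under the same assumed `cd` / `title` / `artist` source; the trailing line feed is an addition so that records stay on separate lines:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="cd">
    <xsl:value-of select="title"/>
    <!-- Empty element: it emits nothing, and the newlines around it are
         whitespace-only text nodes, so they are stripped. -->
    <xsl:text/>
    <xsl:value-of select="artist"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The empty element earns its keep mostly next to literal text: writing `Title: <xsl:text/>` before a line break keeps that break out of the label's text node, so the break is stripped instead of emitted.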
- The newlines either side of <xsl:text/> are whitespace-only text between elements, so they are removed: the two values join with nothing between them, even though the source is nicely indented.
Empire BurlesqueBob Dylan
Make separators explicit
When output spacing matters, stop depending on stylesheet layout and state
every separator with xsl:text. It reads more verbosely but it is
unambiguous, and it survives reformatting your stylesheet.
Whitespace in the source document
Whitespace in the input is a separate question. By default the processor treats text nodes in the source as significant — even ones that are only indentation. A pretty-printed source therefore carries stray newlines and spaces in its text nodes.
Two top-level elements control this:
- xsl:strip-space elements="...": remove whitespace-only text nodes from the listed source elements.
- xsl:preserve-space elements="...": keep them (the default for everything).
Both take a space-separated list of element names, or * for all elements:
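A minimal sketch of the two declarations working together, assuming a `<catalog>` root whose children are `<cd>` elements; the `count(node())` template is only there to make the effect visible:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Drop whitespace-only text nodes from every source element... -->
  <xsl:strip-space elements="*"/>
  <!-- ...except inside title, where they are kept. The named element wins
       over the * wildcard. -->
  <xsl:preserve-space elements="title"/>

  <xsl:template match="catalog">
    <!-- With stripping on, only the cd elements remain as child nodes. -->
    <xsl:value-of select="count(node())"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For a catalog with two `<cd>` children this prints 2. Without the xsl:strip-space line, a pretty-printed source would also count the whitespace text nodes between them.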
- Drop whitespace-only text nodes from every source element: the indentation between <catalog>, <cd> and their children disappears, so things like count(node()) no longer trip over phantom text nodes.
- Override the blanket strip for <title>, where you want any whitespace-only text kept.
It only removes whitespace-only nodes
xsl:strip-space never touches text that contains real characters. A
<title> of Empire Burlesque keeps its text untouched; only nodes that
are entirely whitespace are candidates for removal. To clean leading,
trailing or internal whitespace from a real value, use normalize-space().
normalize-space(): per-value cleanup
Stripping handles whole nodes; normalize-space() cleans a single string. It
trims leading and trailing whitespace and collapses every internal run of
spaces, tabs and newlines to one space. It is the standard defence against messy
source text:
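A minimal sketch, assuming each `cd` has an `artist` child whose text may carry stray leading, trailing or repeated whitespace:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="cd">
    <!-- Trim both ends and collapse internal runs of whitespace to one space. -->
    <xsl:value-of select="normalize-space(artist)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```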
If the source held <artist> Bob Dylan </artist>, the output is the tidy value:
Bob Dylan
See String functions for the rest of the XPath 1.0 string toolbox.
Output indentation
xsl:output indent="yes" asks the serialiser to pretty-print the result tree by
adding its own newlines and indentation:
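A minimal sketch, assuming the same `catalog` / `cd` / `title` source; the `<titles>` wrapper is a hypothetical result element chosen for illustration:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Ask the serialiser to add line breaks and indentation of its own. -->
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="catalog">
    <titles>
      <xsl:for-each select="cd">
        <title><xsl:value-of select="title"/></title>
      </xsl:for-each>
    </titles>
  </xsl:template>
</xsl:stylesheet>
```

The exact line breaks and indent width in the result are up to the processor, so no expected output is given.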
This is purely for human readability of the result.
indent is a hint, not a contract
indent="yes" is advisory — different processors indent differently, and it
may add or alter whitespace inside your elements. Never turn it on for output
whose whitespace must be preserved byte-for-byte (signed documents,
whitespace-significant formats). Use indent="no" when exactness matters.
See Producing XML output for the full xsl:output reference.
disable-output-escaping
Normally the serialiser escapes special characters: a < in your text becomes
&lt;, an & becomes &amp;. disable-output-escaping="yes" on
xsl:value-of or xsl:text turns that off, so the characters are written
raw:
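A minimal sketch, assuming HTML output; the surrounding `<div>` is a hypothetical wrapper added for illustration:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <div>
      <!-- The content is the four characters < h r / plus the closing >.
           With escaping disabled they are written to the output as-is. -->
      <xsl:text disable-output-escaping="yes">&lt;hr/&gt;</xsl:text>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

A processor that honours the attribute writes the markup `<hr/>` inside the div. One that ignores it, which it is allowed to do, writes the escaped form `&lt;hr/&gt;` instead.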
Instead of the escaped &lt;hr/&gt;, which a browser would show as the visible
text <hr/>, this writes the raw markup <hr/> into the output stream.
Last resort — and not portable
disable-output-escaping is a serialisation hack: the processor is allowed
to ignore it entirely, and it does nothing when the output is consumed as a
node tree rather than serialised bytes. It is easy to produce ill-formed
output with it. Prefer building real result nodes (xsl:element,
xsl:copy-of, literal result elements) and reach for
disable-output-escaping only when nothing else can inject the markup you
need.
Worked example: title and artist, controlled separator
The shared catalog.xml:
<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
</catalog>
We want one line per CD, title — artist, with exactly one space either side of
the dash. The xsl:text separators make the spacing explicit, and a final
<xsl:text>&#10;</xsl:text> ends each line:
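A minimal sketch of such a stylesheet, run against the catalog.xml above. The dash is written as the character reference `&#8212;` so the example does not depend on the stylesheet's file encoding; that is a choice made here, and a literal dash character would do the same job:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>
  <!-- Remove whitespace-only text nodes from the source. -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="cd">
    <xsl:value-of select="title"/>
    <!-- Separator: space, dash (U+2014), space. -->
    <xsl:text> &#8212; </xsl:text>
    <xsl:value-of select="artist"/>
    <!-- Line feed ends the record. -->
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The built-in template rules walk from the root down to each `cd`, so no explicit template for `catalog` is needed. The price is not output because the `cd` template never selects it.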
- Drop indentation from the source so no phantom text nodes leak through.
- The separator — space, em dash, space — emitted exactly as written.
- A literal newline ends the record.
Empire Burlesque — Bob Dylan
Hide your heart — Bonnie Tyler
Now compare the same template without xsl:text, relying on stylesheet
layout for the spacing:
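A sketch of what such a layout-dependent template might look like, assuming the dash is placed on its own indented line between the two instructions (the exact layout is a guess for illustration):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="cd">
    <xsl:value-of select="title"/>
    &#8212;
    <xsl:value-of select="artist"/>
  </xsl:template>
</xsl:stylesheet>
```

The text node between the two `xsl:value-of` elements holds a newline, the indentation, the dash, another newline and more indentation, and all of it is copied to the output. The spacing you get therefore changes whenever the stylesheet is re-indented.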
The dash here sits in a text node that also holds the surrounding spaces and
newlines. Because that node is not whitespace-only (it contains the dash),
all of its whitespace is kept, exactly as the stylesheet happens to be laid
out. Nothing explicit ends a record either, so the records run together with
ragged spacing:
Empire Burlesque — Bob Dylan Hide your heart — Bonnie Tyler
The lesson: when spacing matters, say it with xsl:text rather than letting the
shape of the stylesheet decide.
Next
Template modes — once whitespace is under control, modes let you process the same nodes more than once, each pass producing a different kind of output.