Node tests and predicates¶
A location step has three parts: axis::nodetest[predicate]. The axis chooses a
direction through the tree; the node test then decides which nodes on
that axis the step keeps; and any predicates filter that result further.
This page covers the last two — what a node test can match, and how predicates
narrow a node-set.
For the examples the catalog gains a genre attribute, an optional <year> on
the first CD, and a comment plus the usual whitespace between elements — so we
have something for the kind tests to bite on:
Node tests¶
A node test is written after the axis (or after the @, //, / shorthands,
which imply an axis). It comes in two flavours: name tests, which match by
name, and kind tests, which match by node type.
Name tests¶
A bare name matches the nodes on the axis that have that name. On the default child axis it means elements; on the attribute axis it means attributes.
- Element children named
cd. - The
genreattribute (the@selects the attribute axis). - All element children, whatever their name.
- All attributes, whatever their name.
So * is the wildcard name test. Its meaning follows the axis: on the child
axis * is "any element node", but on the attribute axis @* is "any attribute
node".
* is element-only on the child axis
* never matches text, comment, or processing-instruction nodes — it is a
name test, and only elements and attributes have names. To reach the other
node types you need a kind test (below).
Namespaced name tests¶
When the document uses namespace prefixes, a name test can be qualified. The prefix is matched by the namespace it is bound to, not by the literal prefix string — the expression's prefix and the document's prefix only have to resolve to the same namespace URI, they need not be spelled the same.
- Elements named
pricein the namespace bound tocbc. - Any element in the
cbcnamespace, whatever its local name. - Any element with local name
price, in any namespace (XPath 2.0+).
*:local needs 2.0+
prefix:* is available in XPath 1.0, but the *:local form (wildcard
namespace, fixed local name) is an XPath 2.0 addition. In 1.0 you match
across namespaces with local-name() instead, e.g. *[local-name()='price'].
Kind tests¶
A kind test selects by node type and is written with trailing parentheses, so it never collides with a name test. There are four:
- Any node on the axis — elements, text, comments, and PIs alike.
- Text nodes only (the character content of an element).
- Comment nodes (
<!-- ... -->). - Processing-instruction nodes (
<?target data?>); pass a name to pin the target, e.g.processing-instruction('xml-stylesheet').
text() versus selecting the element¶
Selecting the element and selecting its text are different nodes. title
returns the <title> element; title/text() returns the text node inside
it:
- The
<title>element node. - The text node it contains — the characters
Empire Burlesque.
title → the <title> element (string value: Empire Burlesque)
title/text() → the text node Empire Burlesque
In practice you often don't write text(): taking the string value of the
element (as string() or a host's value-of does) gathers all descendant text
for you, whereas text() returns the immediate text-node children only — which
matters when an element has mixed content split by child elements.
Why node() is broader than *¶
* is an element name test; node() is the all-types kind test. Asked for the
children of a cd, they return different sets:
- The element children:
title,artist,price,year. - Every child node — those same elements plus any text nodes (and comments) between and around them.
Asked for the children of catalog, catalog/node() also picks up the
<!-- imported 2024 --> comment and the whitespace text nodes, none of which
catalog/* would return. Neither test ever reaches @genre, though: attributes
are not children (see the data-model note on the section index) and
live only on the attribute axis.
Node tests at a glance¶
| Node test | Matches |
|---|---|
cd |
elements named cd on the axis |
@genre |
the genre attribute |
* |
all element nodes (all attributes on the attribute axis) |
cbc:* |
all elements in the namespace bound to cbc |
*:price |
elements with local name price in any namespace (2.0+) |
node() |
any node — element, text, comment, or PI |
text() |
text nodes |
comment() |
comment nodes |
processing-instruction() |
processing-instruction nodes |
Predicates¶
A predicate is a [ ... ] filter on a step. The step first produces a
node-set; the predicate is then tested against each node in turn and keeps only
those for which it is true. The kinds you will use — comparison, existence,
attribute, and positional — are worked through with XSLT-context examples on the
XPath predicates page; this page looks at the same
mechanism from the XPath-language side.
Each candidate is the context node¶
A predicate is evaluated once per candidate node, with that node as the context
node. The expression inside the brackets is therefore relative to the node
being tested. Within the step, the context position and size are set
from the step's node-set, so position() and last() count over the nodes the
step selected — not over the whole document.
- For each
cd, evaluateprice > 10with thatcdas context — keep the ones whose<price>child exceeds 10. - The last
cdin the set, sinceposition()andlast()range over the threecdnodes.
A number is shorthand for position()¶
A predicate that is just a number is shorthand for "the node at that position":
a numeric predicate [n] means [position() = n]. Positions are 1-based.
- The second
cd. - Exactly the same thing —
[2]expands to this.
cd[2] → the Hide your heart CD
Truthiness inside a predicate¶
The bracketed expression is coerced to a boolean, and the rule depends on its type:
- A node-set is true when it is non-empty — so a bare name is an existence test.
- A number is true when it is neither
0norNaN. - A string is true when it is non-empty.
- CDs that have a
<year>child — true because theyearnode-set is non-empty. Only the first CD qualifies. - CDs that carry a
genreattribute — here, all three. - The inverse: CDs with no
<year>.
cd[year] → Empire Burlesque
cd[not(year)] → Hide your heart, Greatest Hits
A bare number is not an existence test
cd[year] tests existence (a node-set), but cd[1] tests position (a
number expands to position() = 1). The two look similar but mean
different things — a name in brackets asks "does it exist?", a literal
number asks "is it in this position?".
Multiple predicates filter left to right¶
Predicates chain. Each one is applied to the node-set the previous one produced, and re-numbers it before the next runs — so order changes the result.
- Keep the rock CDs, then take the first of those.
- Take the first CD, then keep it only if it is rock.
Here both yield the first CD (it is the rock one), but they diverge as soon as the first CD is not rock: case 1 still returns the first rock CD, while case 2 returns nothing.
Position and //¶
//cd[1] is not the first cd in the document
The predicate binds to its step, not to the whole path. In //cd[1] the
[1] applies to the cd step, selecting every cd that is the first
cd among its own siblings — possibly several nodes. To take the single
global first node, parenthesise the path so the predicate applies to the
combined result:
See XPath predicates for a worked walk-through of the distinction.
Next¶
Functions and data types — once you can select and filter nodes, the core function library lets you compute over them and convert between XPath's four 1.0 types.