XPath and XPointer/XPointer Syntax

Current revision (09:53, 7 March 2008)

XPath and XPointer

Like XPath, XPointer is not in itself an XML vocabulary. Rather, it's meant to be used within the markup in XML documents — most often in XLink or XLink-like situations requiring a URI. This chapter covers the details of coding the various XPointer forms. There are two approaches to defining XPointers as described in the XPointer Framework. Shorthand pointers use a very brief syntax, while scheme-based XPointers use a more complex syntax composed of pointer parts.


Shorthand Pointers

In XHTML hyperlinking, as you know, you can locate a subresource using a combination of a named anchor (the <a name="mybookmark"/> sort of tag) and a normal anchor (<a href="#mybookmark">...). Notwithstanding the limitations of XHTML subresource hyperlinking, the XPointer spec's authors recognized its principal value: simplicity. Thus, they carried it forward into XPointer, enhanced slightly for the new standard's use with XML documents of any vocabulary. This form of an XPointer is called a shorthand pointer; it includes neither scheme nor XPath expression, just the "name" of the target resource:

#name

In an XPointer, as in an XHTML fragment identifier, the pound sign/hash mark, #, is not itself part of the XPointer or other fragment identifier. It merely serves to delimit the fragment from the full URI preceding it. Section 8.3 at the end of this chapter addresses this issue more fully.
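
To see the delimiter role of the hash mark in practice, here is a minimal Python sketch that separates a URI reference into the resource and the pointer that follows it. The URIs and the function name split_pointer are made up for illustration; urldefrag is the standard library's own fragment splitter.

```python
from urllib.parse import urldefrag

def split_pointer(uri_reference):
    """Split a URI reference into (resource, pointer).

    The "#" only delimits the two parts; it belongs to neither.
    """
    resource, pointer = urldefrag(uri_reference)
    return resource, pointer

# Hypothetical URI references, not taken from the chapter.
remote = split_pointer("http://example.com/games.xml#P")
local = split_pointer("#P")
```

In both cases the pointer that reaches an XPointer processor is just P, with no hash mark attached.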

The value of name is the value of an ID-type attribute assigned to some element in the target resource. Thus, the shorthand form is in essence a shortcut for the longer XPointer form:

xpointer(id("name"))

Consider the following simple XML document:

<gaming_platforms currency="sadly-outdated">
   <gaming_platform id="A">Atari</gaming_platform>
   <gaming_platform id="S">Sega</gaming_platform>
   <gaming_platform id="SN">Super Nintendo</gaming_platform>
   <gaming_platform id="P">Pong</gaming_platform>
</gaming_platforms>

Assuming the id attributes are in fact ID-type attributes, you could locate the Pong gaming_platform element with this simple XPointer:

#P

Chapter 4 described how the XPath id() function works, how it depends on ID attributes having been declared in DTDs, and how it depends on those DTDs having been processed. XPointer's shorthand pointers have the same set of issues, but the XPointer Framework specification adds one more: in addition to IDs defined in XML 1.0 DTDs, it recognizes IDs defined in the W3C's XML Schema vocabulary.
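
The dependence on ID declarations shows up as soon as you try this in code. The following Python sketch resolves a shorthand pointer against the gaming-platform document. Python's ElementTree does not process DTDs or schemas, so the sketch simply assumes that the attribute named id is ID-type; the function name resolve_shorthand and the id_attr parameter are inventions of this example.

```python
import xml.etree.ElementTree as ET

DOC = """<gaming_platforms currency="sadly-outdated">
   <gaming_platform id="A">Atari</gaming_platform>
   <gaming_platform id="S">Sega</gaming_platform>
   <gaming_platform id="SN">Super Nintendo</gaming_platform>
   <gaming_platform id="P">Pong</gaming_platform>
</gaming_platforms>"""

def resolve_shorthand(root, name, id_attr="id"):
    """Return the first element whose (assumed) ID attribute equals name.

    Assumption: id_attr really is ID-type. Nothing here checks a DTD
    or schema, which a conforming processor would have to do.
    """
    for elem in root.iter():
        if elem.get(id_attr) == name:
            return elem
    return None

root = ET.fromstring(DOC)
pong = resolve_shorthand(root, "P")
missing = resolve_shorthand(root, "X")
```

A real processor would return the Pong element only if the ID-ness of the attribute had actually been established.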

In DTDs, IDs are pretty simple. An ID is plainly identified as an attribute of type ID. The only real problem with IDs is the requirement that a DTD be provided and processed. XML Schema offers a number of different options, including IDs provided by child elements rather than attributes. This means that, if XML Schema processing took place and a Post-Schema Validation Infoset (PSVI) is available, shorthand pointers must look for IDs in that PSVI.


For more on how XML Schema defines and uses IDs, see XML Schema, by Eric van der Vlist (O'Reilly).

Schema-aware ID processing is also specified for the element( ) scheme, but is not required for the xpointer( ) scheme, most likely because it builds on XPath 1.0, which is not XML Schema-aware.

Scheme-Based XPointer Syntax

Scheme-based XPointers follow this general form:

scheme(schemedata) scheme(schemedata) ...

The ellipsis (...) indicates that XPointers can be chained together in sequence. Each scheme/schemedata item in the chain is referred to as a pointer part; thus, some XPointers consist of just a single pointer part and some consist of multiple pointer parts. When multiple pointer parts are used, they may be delimited from one another with optional whitespace. You'll see more information about these chains of pointers in Section 8.2.8.

The Scheme

The scheme of a pointer part functions something like the protocol of a URI (such as http:, ftp:, gopher:, and so on). Its purpose, said the previous draft of the spec, is to "[identify] the particular notation" used by the XPointer; you'll probably agree this isn't an especially descriptive definition. From the examples provided in the spec, though, we can come up with a simple definition like: the scheme tells us what kind of pointer part we're dealing with.

A pointer part is typically one of two predefined kinds, denoted by three predefined schemes:

  • A scheme of xpointer — easily the most common scheme — says that this pointer part is to be used in XPointer's typical manner: to identify some portion of an XML document of interest.
  • A scheme of element indicates that this pointer part will identify a portion of an XML document using a "child sequence" notation for walking the document tree.
  • A scheme of xmlns marks this pointer part as a prelude to the pointer parts that follow. By itself, it doesn't locate any resource at all; it simply declares a namespace context in which succeeding pointer parts (within the same scheme-based XPointer) are to be evaluated. More information on xmlns-type schemes appears later in this chapter.

You may also use custom schemes instead of these three predefined kinds. More information on this option is found in Section 8.2.7 later in this chapter.

The schemedata

The schemedata contents of pointer parts vary with their schemes, and the XPointer Framework itself does very little to constrain them. Each scheme specification provides its own set of rules describing how its schemedata is to be interpreted.

Contents of the xmlns() Scheme

When the scheme of a pointer part is xmlns, the schemedata declares the namespace associated with a particular namespace prefix used in subsequent pointer parts. This namespace declaration takes the form:

xmlns(prefix=namespace-URI)

For instance:

xmlns(xsl=http://www.w3.org/1999/XSL/Transform) [subsequent pointer parts]

asserts that the namespace prefix xsl: appearing in the rest of the multipart XPointer is to be associated with the indicated namespace URI (that is, in this case, the namespace for XSLT elements and attributes).
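
As a rough illustration of how little there is to an xmlns pointer part, this Python sketch splits its schemedata into a prefix and a namespace URI. The function name parse_xmlns is hypothetical, and the sketch ignores circumflex escaping inside the URI.

```python
def parse_xmlns(schemedata):
    """Split the schemedata of an xmlns() pointer part into (prefix, uri).

    Whitespace around the equals sign is tolerated. Escaped characters
    in the URI are not handled in this sketch.
    """
    prefix, sep, uri = schemedata.partition("=")
    prefix, uri = prefix.strip(), uri.strip()
    if not sep or not prefix or not uri:
        raise ValueError("expected prefix=namespace-URI, got %r" % schemedata)
    return prefix, uri

binding = parse_xmlns("xsl=http://www.w3.org/1999/XSL/Transform")
```

The resulting pair is what a processor would add to the namespace context before evaluating the pointer parts to the right.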

Contents of the element( ) Scheme

You can locate content without knowing anything at all about the specific named nodes of a target resource. This XPointer form, which uses the element( ) scheme and schemedata known as child sequences, uses a conventional tree-navigation syntax to locate the nth child of each succeeding level in the document.

Consider the gaming-platform document again:

<gaming_platforms currency="sadly-outdated">
   <gaming_platform id="A">Atari</gaming_platform>
   <gaming_platform id="S">Sega</gaming_platform>
   <gaming_platform id="SN">Super Nintendo</gaming_platform>
   <gaming_platform id="P">Pong</gaming_platform>
</gaming_platforms>

To locate the Sega gaming_platform element, aside from any other options you can use the element( ) scheme:

element(/1/2)

This simply directs the processor to walk the tree, getting the first child (that is, the root gaming_platforms element) of the root node, and then selecting that child's second child (the Sega gaming_platform element).
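
The tree walk is simple enough to sketch in a few lines of Python. The function name resolve_child_sequence is made up for this example. ElementTree's default parser discards comments and processing instructions, so iterating over an element yields element children only, which happens to match what a child sequence counts.

```python
import xml.etree.ElementTree as ET

DOC = """<gaming_platforms currency="sadly-outdated">
   <gaming_platform id="A">Atari</gaming_platform>
   <gaming_platform id="S">Sega</gaming_platform>
   <gaming_platform id="SN">Super Nintendo</gaming_platform>
   <gaming_platform id="P">Pong</gaming_platform>
</gaming_platforms>"""

def resolve_child_sequence(root, sequence):
    """Resolve a child sequence such as "/1/2"; return None on failure.

    root is the document element of a well-formed document.
    """
    steps = [int(step) for step in sequence.strip("/").split("/")]
    # The first step counts children of the root node; a well-formed
    # document has exactly one, the document element.
    if steps[0] != 1:
        return None
    current = root
    for n in steps[1:]:
        children = list(current)          # element children, in document order
        if n < 1 or n > len(children):
            return None
        current = children[n - 1]         # child sequences count from 1
    return current

root = ET.fromstring(DOC)
sega = resolve_child_sequence(root, "/1/2")
```

Asking for a child that doesn't exist, such as /1/5 here, yields nothing rather than some neighboring element.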

Note a few things about XPointers built using the element( ) scheme. First, they can locate elements only; all other "children" (such as PIs contained within the element's start and end tags) are effectively invisible. Second — barring some way of resetting the context in which the child sequence is to be evaluated — the very first integer in a child sequence will nearly always be 1; this follows from XML's well-formedness requirement that a document have no more than one root element.


As the XPointer spec mentions, while a well-formed XML document must have only one root element, XPointer can also be used for locating content in external parsed entities, which need not be well-formed documents. Such entities may have multiple "root" elements, leading to the possibility of a child sequence whose first integer is greater than 1.


Third, although it may not be as obvious as with shorthand pointers, child sequences are also shortcuts for scheme-based XPointers. To locate the Sega gaming_platform element as described above, using element(/1/2) is effectively an abbreviated form of the scheme-based XPointer:

xpointer(/*[position()=1]/*[position()=2])

or, more simply:

xpointer(/*[1]/*[2])

Finally, child sequences are both robust (the simplest ones won't break at all) and fragile (when they break, they're liable to break in more or less subtle and difficult-to-diagnose ways).

To understand this last point, consider an XML document such as the following:

<books>
   <book>
      <title>XML in a Nutshell</title>
      <author>Harold &amp; Means</author>
   </book>
   <book>
      <title>DocBook: The Definitive Guide</title>
      <author>Walsh &amp; Muellner</author>
   </book>
   <book>
      <title>Learning XML</title>
      <author>Ray</author>
   </book>
   <book>
      <title>HTML &amp; XHTML: The Definitive Guide</title>
      <author>Musciano &amp; Kennedy</author>
   </book>
   <book>
      <title>Building Oracle XML Applications</title>
      <author>Muench</author>
   </book>
</books>

Using a child sequence, we could construct an XPointer to the author of the last book, which would look as follows:

element(/1/5/2)

This locates the second child of the fifth child of the first child of the root node. Note the right-to-left reading of the child sequence. This is often the simplest way to express in everyday language what a child sequence points to. Thus, this child sequence is functionally equivalent to an XPointer using the more robust xpointer( ) scheme, such as:

xpointer(//author[../title = "Building Oracle XML Applications"])

If, however, the document changes — particularly with the addition or removal of book elements — the child sequence will now point to a different author element or, worse, return an empty location-set altogether. The xpointer( ) approach, on the other hand, continues to point to the author of that book as long as a book with that title exists in the document, regardless of where in the document it is.
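
A small Python experiment makes the difference tangible. The document below is a stand-in for the book list: the root element name books and the two authors not visible in the listing are assumptions of this sketch, as are the function names. One function follows the positions in element(/1/5/2); the other mimics the title-based xpointer( ) expression.

```python
import xml.etree.ElementTree as ET

# Illustrative document; element names other than book, title, and author
# are assumed.
BOOKS = """<books>
   <book><title>XML in a Nutshell</title><author>Harold &amp; Means</author></book>
   <book><title>DocBook: The Definitive Guide</title><author>Walsh &amp; Muellner</author></book>
   <book><title>Learning XML</title><author>Ray</author></book>
   <book><title>HTML &amp; XHTML: The Definitive Guide</title><author>Musciano &amp; Kennedy</author></book>
   <book><title>Building Oracle XML Applications</title><author>Muench</author></book>
</books>"""

TITLE = "Building Oracle XML Applications"

def by_position(root):
    """Positional lookup: second child of the fifth child of the document element."""
    try:
        return list(list(root)[4])[1]
    except IndexError:
        return None

def by_title(root, title):
    """Content-based lookup: the author of the book with the given title."""
    for book in root.iter("book"):
        if book.findtext("title") == title:
            return book.find("author")
    return None

root = ET.fromstring(BOOKS)
before = (by_position(root).text, by_title(root, TITLE).text)

root.remove(root[0])                 # the document changes: one book is dropped
after_position = by_position(root)   # there is no fifth book any more
after_title = by_title(root, TITLE)  # still finds the same author
```

Before the change both lookups agree; afterward only the title-based one still lands on the intended author.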

(Whether this is desirable, of course, depends on your application's specific needs. Personally, I'm much more comfortable knowing what I'm pointing to than I am knowing where it's supposed to be.)

Potential fragility aside, child sequences feature what can be a killer advantage: a processor can simply read only as much of a document as it needs to locate the desired node. Relying on loading the entire document — as other kinds of XPointers must — can make processing very large documents practically infeasible.
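
Here is a sketch of that idea in Python, using the standard library's incremental parser. It is only an illustration under my own naming (stream_child_sequence is hypothetical): the function stops handling parse events as soon as the target element has been read completely, and reports how many start tags it had to look at.

```python
import io
import xml.etree.ElementTree as ET

DOC = b"""<gaming_platforms currency="sadly-outdated">
   <gaming_platform id="A">Atari</gaming_platform>
   <gaming_platform id="S">Sega</gaming_platform>
   <gaming_platform id="SN">Super Nintendo</gaming_platform>
   <gaming_platform id="P">Pong</gaming_platform>
</gaming_platforms>"""

def stream_child_sequence(source, steps):
    """Return (element, started) for a child sequence given as [1, 2] etc.

    started is the number of start tags seen before the search ended.
    """
    counters = [0]   # counters[-1]: children seen so far under the open element
    path = []        # 1-based positions of the currently open elements
    target = None
    started = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            started += 1
            counters[-1] += 1
            path.append(counters[-1])
            counters.append(0)
            if path == steps:
                target = elem
        else:
            if elem is target:
                return elem, started   # target fully read; skip the rest
            counters.pop()
            path.pop()
    return None, started

sega, started = stream_child_sequence(io.BytesIO(DOC), [1, 2])
```

For element(/1/2) only three of the five start tags are ever examined. On a document this small the saving is meaningless, but the same loop applied to a very large file never builds the parts of the tree that follow the target.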

Combining Names and Child Sequences

Because shorthand pointers — at least, assuming liberal use of ID-type attributes — are so convenient and simple, XPointer provides an option that combines them with child sequences. These open using the same rules for connecting names to ID values as shorthand pointers, followed by a child sequence starting at the element so identified.

Assume the following XML fragment is coded in a vocabulary in which each attribute named id has been declared as an ID-type attribute:

<brewery id="petes">
      <name>Wicked Ale</name>
      <name>Strawberry Blonde</name>
      <name>Helles Lager</name>

Using the element( ) scheme, you could locate the carbs element corresponding to Helles Lager this way:


Note that this combines the content awareness of a shorthand pointer with the structure awareness of a child sequence and thus avoids some of the problems associated with each.
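
The following Python sketch handles both flavors of element( ) schemedata: a bare child sequence, and a name followed by a child sequence. Everything about the sample document beyond the brewery and name elements is invented for the example (the beer wrapper elements and the carbs values are placeholders), as is the function name resolve_element_scheme. As before, the id attribute is assumed to be ID-type.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the beer wrappers and carbs values are placeholders.
BREWERY = """<brewery id="petes">
   <beer><name>Wicked Ale</name><carbs>1</carbs></beer>
   <beer><name>Strawberry Blonde</name><carbs>2</carbs></beer>
   <beer><name>Helles Lager</name><carbs>3</carbs></beer>
</brewery>"""

def resolve_element_scheme(root, schemedata, id_attr="id"):
    """Resolve element() schemedata such as "petes/3/2" or "/1/3/2"."""
    name, _, rest = schemedata.partition("/")
    steps = [int(step) for step in rest.split("/")] if rest else []
    if name:
        # Start at the element whose (assumed) ID equals name.
        current = next((e for e in root.iter() if e.get(id_attr) == name), None)
    else:
        # Bare child sequence: the first step must select the document element.
        if not steps or steps[0] != 1:
            return None
        current, steps = root, steps[1:]
    for n in steps:
        if current is None:
            return None
        children = list(current)
        current = children[n - 1] if 1 <= n <= len(children) else None
    return current

root = ET.fromstring(BREWERY)
carbs = resolve_element_scheme(root, "petes/3/2")
```

With this made-up structure, petes/3/2 reads as: find the element whose ID is petes, take its third child, then that child's second child.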

Contents of the xpointer( ) Scheme

When the scheme is xpointer, what appears within the required parentheses of a scheme-based XPointer is based on an XPath expression, locating some subresource within a target resource.

The XPath expression in an xpointer-type pointer part is not set off from what surrounds it with quotation marks. This makes XPointer syntax notably different from that of XSLT, XPath's other big "client." XPath expressions in XSLT stylesheets always appear as attribute values and therefore must be enclosed in quotation marks. (On the other hand, remember that XPointer will almost never be used by itself; rather, it will be used to locate a subresource of a resource located by XLink or a similar standard. Just as in XHTML, these resources as a whole — URIs — will almost always appear within quotation marks, as attribute values.)

For example, consider the following simple XML document:

<gaming_platforms currency="sadly-outdated">
   <gaming_platform id="A">Atari</gaming_platform>
   <gaming_platform id="S">Sega</gaming_platform>
   <gaming_platform id="SN">Super Nintendo</gaming_platform>
   <gaming_platform id="P">Pong</gaming_platform>
</gaming_platforms>

You could locate all gaming_platform elements whose names begin with S using a scheme-based XPointer such as this:

xpointer(//gaming_platform[starts-with(., "S")])

Or you could locate any given gaming_platform simply by referring to its id attribute (assuming, of course, that the attribute by that name is explicitly declared as an ID-type attribute):

xpointer(id("P"))

This latter approach is very similar to the shorthand pointers described earlier. More detailed coverage and examples of the xpointer( ) scheme appear in Chapter 9.
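
If you want to experiment with these two lookups without an XPointer processor, the Python standard library gets you part of the way. ElementTree implements only a subset of XPath and has no starts-with( ) function, so the sketch below emulates that predicate in plain Python; the attribute test stands in for id( ) on the assumption that id is ID-type.

```python
import xml.etree.ElementTree as ET

DOC = """<gaming_platforms currency="sadly-outdated">
   <gaming_platform id="A">Atari</gaming_platform>
   <gaming_platform id="S">Sega</gaming_platform>
   <gaming_platform id="SN">Super Nintendo</gaming_platform>
   <gaming_platform id="P">Pong</gaming_platform>
</gaming_platforms>"""

root = ET.fromstring(DOC)

# Emulates //gaming_platform[starts-with(., "S")]
starts_with_s = [
    elem.text
    for elem in root.iter("gaming_platform")
    if (elem.text or "").startswith("S")
]

# Approximates id("P") with an attribute predicate, which ElementTree does support.
pong = root.find(".//gaming_platform[@id='P']")
```

The first lookup yields a set of two elements, the second a single one, which mirrors the location-sets the XPointers themselves would return.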

Custom Schemes

The XPointer Framework's mechanisms are generic enough that developers can extend XPointer by devising custom schemes beyond the predefined element, xpointer and xmlns. These schemes would be used in locating subresources in documents of a specific XML vocabulary.

For instance, assume a street-mapping vocabulary in which you might code a document like the following:

   <street name="Main" segment="1001_3498" 
      xstart="34.3" ystart="679.2" 
      xend="145.7" yend="1003.0"/>
   <street name="Main" segment="1001_3499" 
      xstart="145.7" ystart="1003.0" 
      xend="145.7" yend="1372.2"/>

The developers of this vocabulary could adapt the XPointer syntax to their own purposes, enabling an application to locate a particular street (consisting of all segments sharing the same name) with a scheme-based XPointer such as:


where streetseg is the custom scheme.

Note that what appears within the parentheses following such a custom scheme may or may not be an XPath expression or a namespace URI. The Framework doesn't constrain schemes or schemedata very much, leaving the meaning and significance of the expression up to the conventions of the application in question.
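
One plausible way for an application to support such a scheme is a table mapping scheme names to handler functions. The Python sketch below is purely hypothetical: the streets root element, the decision that the schemedata is a bare street name, and the names streetseg, SCHEMES, and evaluate_part are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# The root element name is assumed; the street elements follow the text.
STREETS = """<streets>
   <street name="Main" segment="1001_3498"
      xstart="34.3" ystart="679.2"
      xend="145.7" yend="1003.0"/>
   <street name="Main" segment="1001_3499"
      xstart="145.7" ystart="1003.0"
      xend="145.7" yend="1372.2"/>
</streets>"""

def streetseg(root, schemedata):
    """Hypothetical handler: schemedata is taken to be a street name."""
    return [s for s in root.iter("street") if s.get("name") == schemedata]

# Application-defined registry of custom schemes.
SCHEMES = {"streetseg": streetseg}

def evaluate_part(root, scheme, schemedata):
    """Dispatch one pointer part; None means the scheme is not recognized."""
    handler = SCHEMES.get(scheme)
    if handler is None:
        return None
    return handler(root, schemedata)

root = ET.fromstring(STREETS)
main_street = evaluate_part(root, "streetseg", "Main")
```

The handler alone decides what the schemedata means, which is exactly the freedom the Framework leaves to custom schemes.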

Multiple Pointer Parts

When an XPointer consists of more than one pointer part, the XPointer-aware processor evaluates the XPointer from left to right. This enables the XPointer to serve either or both of two purposes: failure-proofing the XPointer and/or using namespace contexts in the XPointer.

"Failure-proofing" XPointers

If the first pointer part has an unrecognized scheme, or results in a resource or subresource error, the processor can fall back on the second; if the second fails, it can fall back on the third, and so on.

This makes XPointer much more robust than its simple XHTML counterpart. Assume the following XHTML hyperlink:

<a href="#speech-para2">

If the current document contains a named anchor whose value is speech-para2, all is well; the browser scrolls the document to place that named anchor at the top of the window. But if there is no named anchor, the only fallback possible for the browser is a rather crude one: to align the top of the document at the top of the window.

An XLink/XPointer solution to this problem might look like the following:

<anchor xlink:href="#xpointer(id('speech-para2')) xpointer(id('speech-para3'))">

Thus, the processor would first try to locate an element whose ID-type attribute has a value of speech-para2; if no such element is located, the processor attempts to locate an element with an ID-type attribute of speech-para3; and if that attempt fails, the processor reports a subresource error.
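
The fallback logic can be sketched in Python as follows. This is a toy, not a conforming processor: it understands only xpointer( ) parts of the form id('...'), treats any attribute named id as ID-type, and returns None where a real processor would report a subresource error. The sample document and the function names are invented.

```python
import re
import xml.etree.ElementTree as ET

def split_pointer_parts(pointer):
    """Split a scheme-based XPointer into (scheme, schemedata) pairs."""
    parts, i, n = [], 0, len(pointer)
    while i < n:
        if pointer[i].isspace():          # optional whitespace between parts
            i += 1
            continue
        open_paren = pointer.index("(", i)
        scheme = pointer[i:open_paren]
        depth, j = 1, open_paren + 1
        while j < n and depth:
            if pointer[j] == "^":         # circumflex escapes the next character
                j += 2
                continue
            depth += {"(": 1, ")": -1}.get(pointer[j], 0)
            j += 1
        if depth:
            raise ValueError("unbalanced parentheses")
        parts.append((scheme, pointer[open_paren + 1:j - 1]))
        i = j
    return parts

ID_CALL = re.compile(r"""id\((['"])(.*?)\1\)""")

def first_match(root, pointer, id_attr="id"):
    """Try each pointer part from left to right; return the first hit."""
    for scheme, data in split_pointer_parts(pointer):
        call = ID_CALL.fullmatch(data)
        if scheme != "xpointer" or call is None:
            continue                      # not understood here: fall through
        for elem in root.iter():
            if elem.get(id_attr) == call.group(2):
                return elem
    return None                           # stands in for a subresource error

# Hypothetical target: it has a speech-para3 but no speech-para2.
root = ET.fromstring(
    '<speech><para id="speech-para1"/><para id="speech-para3"/></speech>')
pointer = "xpointer(id('speech-para2')) xpointer(id('speech-para3'))"
found = first_match(root, pointer)
```

Because the target has no speech-para2, the first part comes up empty and the second part supplies the result.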

Declaring and using namespaces

The other principal reason for using a multipart XPointer is to establish namespace contexts for evaluating XPath expressions in other pointer parts. When an xmlns-schemed pointer part is encountered, any pointer parts to its right may freely use elements and attributes with the associated namespace prefix. Note that to declare multiple namespaces, you must use multiple xmlns pointer parts; you can't declare more than one namespace in a given pointer part.

Consider this example (taken directly from the XPointer xmlns( ) Scheme spec):

<doc>
   <x:a xmlns:x="http://example.com/foo">
      <x:a xmlns:x="http://example.org/bar">This element and its parent are in
      different namespaces.</x:a>
   </x:a>
</doc>

The following XPointer will fail, not because it fails to locate a (sub-)resource but because the reference to the x:a element can't be unambiguously evaluated by the processor:

xpointer(//x:a)

To get around this problem, you'd use a multipart scheme-based XPointer, such as:

xmlns(x=http://example.com/foo) xpointer(//x:a)

or:

xmlns(x=http://example.org/bar) xpointer(//x:a)

Note that you need to use an xmlns pointer part every time you need to use a namespace-qualified element or attribute name in a subsequent XPointer expression. Otherwise, the XPointer processor is unable to resolve namespace prefixes used in XPath expressions in the XPointer; the processor has no way, for example, to peek inside the target document to retrieve the namespace declarations that the latter makes.

One final note here: the spec explicitly says that the prefix used in your pointer parts needn't match the prefixes used in the resource. In effect, each occurrence of a namespace prefix — both in your XPointers and in a target resource as located by them — behaves as though it were physically replaced by the namespace URI prior to the act of locating the (sub-)resource. Thus, the preceding two examples might just as well be coded:

xmlns(abc=http://example.com/foo) xpointer(//abc:a)


xmlns(fershlugginer=http://example.org/bar) xpointer(//fershlugginer:a)

For clarity of intent, though, it never hurts to use exactly the same prefixes in an XPointer as appear in the target.
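
Python's ElementTree resolves prefixes the same way, through a mapping supplied by the caller rather than through the target document's own declarations, so it makes a handy demonstration. In this sketch the doc wrapper element and the shortened text content are assumptions; the two lookups correspond to the two XPointers above.

```python
import xml.etree.ElementTree as ET

# Both elements use the prefix x, bound to two different namespaces.
DOC = """<doc>
   <x:a xmlns:x="http://example.com/foo">
      <x:a xmlns:x="http://example.org/bar">inner</x:a>
   </x:a>
</doc>"""

root = ET.fromstring(DOC)

# The prefixes here are the caller's own; only the URIs have to match.
outer = root.findall(".//abc:a", {"abc": "http://example.com/foo"})
inner = root.findall(".//fershlugginer:a",
                     {"fershlugginer": "http://example.org/bar"})
```

Each lookup finds exactly one element even though neither prefix appears anywhere in the document.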

Mixing it up

When using multipart XPointers that declare namespaces, although it may seem natural to always begin with the xmlns pointer part, it's not a requirement. In fact, not starting off with the xmlns might be less confusing or otherwise desirable in certain circumstances. For instance:

xpointer(id("JSimpson")) xmlns(mydoc=http://mydoc.com) xpointer(/mydoc:root)

Here, the "fallback" convention for multipart XPointers says to attempt to locate the element whose ID-type attribute has a value of JSimpson; if that attempt fails, fall back to the alternative: locate the root mydoc:root element of the target resource. The only requirement is that a corresponding xmlns pointer part must appear to the left of any pointer part that uses a namespace prefix; the xmlns pointer parts need not, however, precede all other pointer parts.

Also note that succeeding xmlns parts for the same prefix override one another. Thus (this is a single complete XPointer broken over two lines for clarity):

xmlns(w=http://wexample1.com) xpointer(//w:bush) 
   xmlns(w=http://wexample2.com) xpointer(//w:bush)

This attempts to return a location set consisting of all bush elements in the http://wexample1.com namespace; failing that, the XPointer falls back and attempts to return a location-set consisting of all bush elements in the http://wexample2.com namespace. (Remember not to be confused by the w: prefix, which may or may not actually be used in the target document. What counts is the namespace URI, regardless of the prefix associated with it.)
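
The left-to-right bookkeeping might be sketched like this in Python. It is a simplification under stated assumptions: the pointer parts arrive already split into (scheme, schemedata) pairs, only paths that ElementTree's XPath subset understands are supported, and the target document and the function name evaluate are invented.

```python
import xml.etree.ElementTree as ET

def evaluate(root, parts):
    """Evaluate (scheme, schemedata) pairs left to right.

    xmlns parts update the namespace context for the parts to their right;
    the first xpointer part that finds anything wins.
    """
    namespaces = {}
    for scheme, data in parts:
        if scheme == "xmlns":
            prefix, _, uri = data.partition("=")
            namespaces[prefix.strip()] = uri.strip()   # later bindings override
        elif scheme == "xpointer":
            try:
                found = root.findall("." + data, namespaces)
            except SyntaxError:        # e.g. a prefix with no binding so far
                continue
            if found:
                return found
    return []

# Hypothetical target containing bush elements in the second namespace only.
root = ET.fromstring(
    '<garden><bush xmlns="http://wexample2.com"/><bush/></garden>')

parts = [
    ("xmlns", "w=http://wexample1.com"), ("xpointer", "//w:bush"),
    ("xmlns", "w=http://wexample2.com"), ("xpointer", "//w:bush"),
]
result = evaluate(root, parts)
```

Note that the target never uses the w prefix at all; it declares its namespace as a default, and the lookup still succeeds because only the URI is compared.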

Using XPointers in a URI

You may already have concluded how to do this, based on a handful of examples in this chapter. Syntactically, including an XPointer fragment identifier in a URI is the same as doing so in XHTML: separate the XPointer from what precedes it using a hash/pound character, #, as in these examples (using scheme-based XPointer, shorthand pointer, and two flavors of the element( ) scheme, respectively):


If the XPointer is locating content in the same document in which the XPointer itself appears, simply prefix the XPointer with a hash, as in:


As a final note, remember a couple of additional considerations when using XPointer in URIs, which I've pointed out in this and the previous chapter:

  • Escape special characters as needed, both to comply with XPointer's own constraints and those of the standards with which XPointer must interoperate. These special characters include the circumflex (^), which XPointer itself uses to escape unbalanced parentheses and literal circumflexes; the percent sign (%); markup-significant characters such as the less-than sign (left angle bracket, <); spaces; and non-ASCII characters.
  • While XPointer itself does not require the use of quotation marks, XPath expressions used in scheme-based XPointers frequently do. Furthermore, because XPointers in XLink and other hyperlinking contexts are used in attribute values, you need to remain aware of nested-quotation-mark issues in the event that your scheme-based XPointers do use quotation marks of their own (such as in embedded XPath expressions).
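
The two layers of escaping can be sketched in Python as follows. Treat this as an approximation rather than a complete implementation of the rules: the function names are mine, the set of characters left unencoded in the URI step is a choice made for readability, and escaping for the surrounding XML attribute value is not covered.

```python
from urllib.parse import quote

def escape_schemedata(data):
    """Circumflex-escape literal circumflexes and unbalanced parentheses."""
    unmatched, open_positions = set(), []
    for i, ch in enumerate(data):
        if ch == "(":
            open_positions.append(i)
        elif ch == ")":
            if open_positions:
                open_positions.pop()      # closes the most recent "("
            else:
                unmatched.add(i)          # ")" with nothing to close
    unmatched.update(open_positions)      # "(" that was never closed
    out = []
    for i, ch in enumerate(data):
        if ch == "^" or i in unmatched:
            out.append("^")
        out.append(ch)
    return "".join(out)

def as_uri_fragment(pointer):
    """Percent-encode a pointer for use after the "#" in a URI."""
    return quote(pointer, safe="/()'=:")

escaped = escape_schemedata("a)b^c")
fragment = as_uri_fragment('xpointer(id("P"))')
```

XPointer-level escaping comes first and URI-level escaping second, so a circumflex introduced in the first step is itself percent-encoded in the second.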