Using XPath in ECMAScript

XPath is a wonderful little tool for traversing the DOM tree, and it is available in just about every new browser, save for Internet Explorer. The syntax can seem a tad daunting at first, but the basic parts, which are also the ones you'll use the most, are quite easy to pick up.

Syntax

There is a pretty good rundown of how document.evaluate works at developer.mozilla.org, but for your convenience, I've included the most relevant part:
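As a quick reference, the call takes five arguments. The sketch below lists them in comments and wraps the call in a small helper. The helper name evaluateXPath and its doc parameter are my own assumptions for illustration, not part of any browser API; only document.evaluate itself is the real thing.

```javascript
// document.evaluate(expression, contextNode, namespaceResolver, resultType, result)
//   expression        - the XPath expression, as a string
//   contextNode       - the node the expression is evaluated against
//   namespaceResolver - function mapping a prefix to a namespace URI, or null
//   resultType        - an XPathResult type constant (0 = any type, 1 = number)
//   result            - an existing XPathResult to reuse, or null for a new one

// Hypothetical helper (an assumption, not from the article): evaluate an
// expression against a whole document and always ask for a fresh result.
function evaluateXPath(doc, expression, resolver, resultType) {
    return doc.evaluate(expression, doc, resolver || null, resultType || 0, null);
}

// In a browser: evaluateXPath(document, 'count(//p)', null, 1).numberValue
```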

First try

Once we have that tidbit of information, we can try counting all paragraphs in this document:
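A first attempt might look roughly like the sketch below. The function name countPlainParagraphs and its doc parameter are assumptions made for illustration; the expression count(//p) and result type 1 (a number result, read through numberValue) match what is used elsewhere in this article.

```javascript
// Count every p element, without any namespace handling.
// Hypothetical helper name; pass the global document when calling it in a browser.
function countPlainParagraphs(doc) {
    // null = no namespace resolver, 1 = XPathResult.NUMBER_TYPE
    var result = doc.evaluate('count(//p)', doc, null, 1, null);
    return result.numberValue;
}

// In a browser: countPlainParagraphs(document)
```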

Clicking the button will count the paragraphs only if the page is served as regular old HTML:

If it says there are 0 paragraphs, your browser interpreted this page as XHTML, not HTML.

A solution to the problem

The problem is that as soon as one enters the domain of XHTML, namespaces become very important, and XPath 1.0 never uses the default namespace: an unprefixed name such as p only matches elements that are in no namespace at all. This is probably a source of frustration for many developers, but there is a rather simple solution for this case.

Note the first parameter of the evaluate call: instead of //p, I now use //h:p, and I have defined another function, nsr. If you look back at the syntax, the third parameter is the namespaceResolver function. It receives the namespace prefix as a string in its first parameter, and it should return the corresponding namespace URI. Since the default namespace is never used, I simply declared a prefix, 'h', bound to the same namespace URI as the default. This works just fine, because XPath matches on the URI, not the prefix.
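The resolver and the prefixed query described above can be sketched as follows. The name nsr, the 'h' prefix, and the //h:p expression come from the text; the wrapper countXhtmlParagraphs and its doc parameter are hypothetical names added so the example stands on its own.

```javascript
// Namespace resolver: receives a prefix, returns the namespace URI bound to it.
function nsr(prefix) {
    // Only the 'h' prefix is known; anything else is left unresolved.
    return prefix === 'h' ? 'http://www.w3.org/1999/xhtml' : null;
}

// Hypothetical wrapper: count p elements in the XHTML namespace.
function countXhtmlParagraphs(doc) {
    // 1 = XPathResult.NUMBER_TYPE
    return doc.evaluate('count(//h:p)', doc, nsr, 1, null).numberValue;
}

// In a browser: countXhtmlParagraphs(document)
```

The prefix is purely local to the expression, so any other letter would do as long as nsr maps it to the same URI.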

It is also worth noting that there is no need to make any alterations to the document itself. All the elements are already defined as being in the http://www.w3.org/1999/xhtml namespace.

Try it for yourself:

If it says there are 0 paragraphs, your browser interpreted this page as HTML, not XHTML.

So what to do with the differing results?

Of course, using browser sniffing is out of the question. If you even had the thought, you need to punish yourself by smacking yourself over the head with a large herring. For all you others that do not currently have a headache, here is the reward in the form of a solution.

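function countParagraphs() {
    // The opening of this listing was lost. The function name and the probe
    // expression on the next line are assumptions; nsr is the resolver described earlier.
    var ns = document.evaluate('count(//h:p)',document,nsr,1,null).numberValue > 0 ? 'h:' : '';
    // ns now contains the string 'h:' if namespaces are supported.
    var result = document.evaluate('count(//'+ns+'p)',document,nsr,1,null); // Count all paragraphs
    document.getElementById('cc-count-result').firstChild.data =
        'I counted '+result.numberValue+' paragraphs in this article. ';
}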

Test it:

If the result was 0, contact me at mail at robbiegee dot com. Remember to mention which browser you are using. PS: Internet Explorer does not support evaluate, so don't bother pointing that out.

Getting serious

Of course, counting paragraphs is not all that interesting. Let's do something useful, like creating an automatic index of all sections within this page!

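function buildIndex() {
    // The opening of this listing was lost. The function name and the probe
    // expression on the next line are assumptions; nsr is the resolver described earlier.
    var ns = document.evaluate('count(//h:h2)',document,nsr,1,null).numberValue > 0 ? 'h:' : '';
    var iterator = document.evaluate('//'+ns+'h2',document,nsr,0,null);

    // Note that modifying any of the nodes will invalidate the iterator,
    // so the nodes are copied into an array before anything is changed.
    var header, headers = [];
    while((header = iterator.iterateNext()))
        headers.push(header);

    var index = document.createElement('ul');
    index.setAttribute('id','automatic-index');
    index.style.border = '1px solid black';
    index.style.background = '#eeeeee';
    index.style.position = 'fixed';
    index.style.top = '0px';
    index.style.right = '0px';

    // The loop body was lost as well. This reconstruction (an assumption) gives
    // each header an id if it lacks one and adds a list item linking to it.
    for(var i=0,end=headers.length;i<end;i++) {
        if(!headers[i].id) headers[i].id = 'section-'+i;
        var link = document.createElement('a');
        link.setAttribute('href','#'+headers[i].id);
        link.appendChild(document.createTextNode(headers[i].textContent));
        var item = document.createElement('li');
        item.appendChild(link);
        index.appendChild(item);
    }
    document.body.appendChild(index);
}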

Try it: A box should appear in the top right corner, containing links to all the headers.
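The note in the listing about invalidated iterators is the reason the headers are copied into an array first. A possible alternative, not used in the code above, is to ask for a snapshot result (result type 7, ORDERED_NODE_SNAPSHOT_TYPE), which is a static list that stays usable while the document is being modified. A minimal sketch, assuming a hypothetical helper named collectHeaders that takes the document, the prefix string, and the resolver as arguments:

```javascript
// Hypothetical helper (an assumption, not from the article): collect all h2
// elements through a snapshot result instead of an iterator.
function collectHeaders(doc, ns, resolver) {
    // 7 = XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
    var snapshot = doc.evaluate('//' + ns + 'h2', doc, resolver, 7, null);
    var headers = [];
    for (var i = 0; i < snapshot.snapshotLength; i++) {
        headers.push(snapshot.snapshotItem(i));
    }
    return headers;
}

// In a browser, with ns and nsr as in the article: collectHeaders(document, ns, nsr)
```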

Contact me