Perl XML::LibXML: findnodes vs. findvalue vs. find - what's the difference?

CraigP · Nov 6, 2013

I am using XML::LibXML, and I simply need to get a count of the nodes specified by an XPath expression.

Using either of the first two code lines below yields what I'm looking for. I can use the count XPath function with either findvalue or find but not findnodes (yes, I know, because it returns a list).

my $node_cnt = $dom->findvalue("count($xpath_str)");  # WORKS!
my $node_cnt = $dom->find("count($xpath_str)");       # WORKS!
my @node_cnt = $dom->findnodes("count($xpath_str)");  # count doesn't work!

Which leads me to a general nagging question: What's the difference between the three find types? In the documentation, it says:

$string = $node->findvalue($xpath)
$result = $node->find($xpath)
@nodes  = $node->findnodes($xpath_expression)

  1. Is there really a difference between the argument $xpath_expression vs. just $xpath in the documentation?

  2. For the two returning a scalar, what's the difference?

I'm trying to understand the significance of using one find type over the other - thank you!

Answer

Borodin · Nov 6, 2013

The difference is the type of value that the methods return.

  • findnodes is used to return a list of nodes. If the method is called in list context then it returns a list of objects of the appropriate type, such as XML::LibXML::Element, XML::LibXML::Text etc. If it is called in scalar context then it returns a single XML::LibXML::NodeList object that contains the same information.

    It cannot return the value of an arbitrary expression; for instance, $dom->findnodes('42') returns nothing. You can only ever get a list of document nodes from this method.

  • findvalue is used to return a single text or numeric value, i.e. not an XML node. If you pass an XPath expression that evaluates to a node list then it converts that list to text by concatenating all the text nodes within any of the nodes in the list.

  • find can return anything. It will return a node list as an XML::LibXML::NodeList object, a numeric value as an XML::LibXML::Number object, a string literal as an XML::LibXML::Literal object, etc. Unlike findnodes, it always returns a single scalar value, even if called in list context.

    I have never chosen to use find. It looks like it is intended as a catch all when you don't know what sort of result to expect.
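A minimal sketch of the return types described in the list above. The sample document and the variable names are made up for illustration; only load_xml, findnodes and find come from XML::LibXML itself.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document, used only for illustration.
my $dom = XML::LibXML->load_xml(string => <<'XML');
<root><record>a</record><record>b</record></root>
XML

# findnodes: list context gives the nodes themselves,
# scalar context gives a single NodeList object.
my @records  = $dom->findnodes('/root/record');
my $nodelist = $dom->findnodes('/root/record');

# find: always one object, whose class depends on what the
# XPath expression evaluates to.
my %find_class = map { ($_ => ref($dom->find($_))) } (
    '/root/record',             # node set
    'count(/root/record)',      # number
    'string(/root/record)',     # string
    'count(/root/record) > 1',  # boolean
);

print scalar(@records), " nodes, first is a ", ref($records[0]), "\n";
print "scalar context gives a ", ref($nodelist), "\n";
print "$_ => $find_class{$_}\n" for sort keys %find_class;
```

Running this should show that find picks a different wrapper class for each kind of result, which is why it is the only one of the three that needs the caller to check what came back.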

For instance, you would probably want to write my $nrecs = $dom->findvalue('count(/root/record)') to get the number of records within the root element. $nrecs would then be a simple Perl numeric value, not an object.

On the other hand, to get a list of those records, you would use my @records = $dom->findnodes('/root/record'). Now @records contains a number of XML::LibXML::Element objects.
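The two calls side by side might look like this. It is only a sketch: the three-record document and the id attributes are invented for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical document with three records.
my $dom = XML::LibXML->load_xml(
    string => '<root><record id="1"/><record id="2"/><record id="3"/></root>'
);

# A count: findvalue hands back a plain scalar.
my $nrecs = $dom->findvalue('count(/root/record)');

# The records themselves: findnodes hands back Element objects.
my @records = $dom->findnodes('/root/record');

print "findvalue reports $nrecs records\n";
print "record id: ", $_->getAttribute('id'), "\n" for @records;
```

If you already need the node list, scalar(@records) gives the same count without a second XPath query.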

In your examples,

my $node_cnt = $dom->findvalue("count($xpath_str)");  # WORKS!

this sets $node_cnt to a simple Perl number, while this

my $node_cnt = $dom->find("count($xpath_str)");  # WORKS!

sets $node_cnt to an XML::LibXML::Number object that happens to stringify (when you print it) to the same text as the previous statement. Prove this for yourself by printing ref $node_cnt in both cases.
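Something along these lines should make the difference visible. The document and the value of $xpath_str are assumptions chosen for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical document and XPath string, for illustration only.
my $dom       = XML::LibXML->load_xml(string => '<root><record/><record/></root>');
my $xpath_str = '/root/record';

my $via_findvalue = $dom->findvalue("count($xpath_str)");
my $via_find      = $dom->find("count($xpath_str)");

# ref returns an empty string for a plain scalar and the class name for an object.
printf "findvalue: ref = '%s', prints as %s\n", ref($via_findvalue), $via_findvalue;
printf "find:      ref = '%s', prints as %s\n", ref($via_find),      $via_find;

# The wrapped number is also available through the object's value method.
my $unwrapped = $via_find->value;
```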

Then

my @node_cnt = $dom->findnodes("count($xpath_str)");  # count doesn't work!

fails because the XPath count function evaluates to a number, not an XML node (it isn't part of the XML tree). There is no way of converting a number to a node, so the result is an empty list. If it were the other way round, and we had called findvalue on an expression that evaluated to a node list, there is a vaguely sensible way of converting the list to a text value: findvalue does its best and returns the concatenation of all the text nodes contained by the nodes in the list.
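Both directions can be checked with a short sketch like the one below; the document is made up for the purpose. It also contrasts findvalue on a node list with the XPath string() function, which by the XPath 1.0 rules uses only the first node of a node set.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical document: the first record has nested text, the second plain text.
my $dom = XML::LibXML->load_xml(
    string => '<root><record>a<b>b</b></record><record>c</record></root>'
);

# A number cannot be turned into nodes, so the list comes back empty.
my @nodes = $dom->findnodes('count(/root/record)');

# A node list can be turned into text: the text of every node is concatenated.
my $all_text = $dom->findvalue('/root/record');

# XPath's own string() conversion looks at the first node only.
my $first_text = $dom->findvalue('string(/root/record)');

print "findnodes on count(): ", scalar(@nodes), " nodes\n";
print "findvalue on node list: $all_text\n";
print "findvalue on string(): $first_text\n";
```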