Parsing simple XML with Nokogiri

Vincent · Oct 15, 2010 · Viewed 13k times

I have the following XML:

<links>

  <item>
    <title>Title 1</title>
    <url>http://www.example.com/url-1</url>
  </item>

  <item>
   <title>Title 2</title>
   <url>http://www.example.com/url-2</url>
  </item>

  <item>
    <title>Title 3</title>
    <url>http://www.example.com/url-3</url>
  </item>

</links>

I would like to convert it to an HTML list:

<ul>
  <li><a href="http://www.example.com/url-1">Title 1</a></li>
  <li><a href="http://www.example.com/url-2">Title 2</a></li>
  <li><a href="http://www.example.com/url-3">Title 3</a></li>
</ul>

Currently I have this:

Controller:

require 'nokogiri'
doc = Nokogiri::XML(...)

@links = doc.xpath('//links/item').map do |i|
  {'title' => i.xpath('//title'), 'url' => i.xpath('//url')}
end

Template:

<ul>
  <% @links.each do |l| %>
    <li><a href="<%= l['url'] %>"><%= l['title'] %></a></li>
  <% end %>
</ul> 

Resulting HTML:

<ul>
  <li><a href="http://www.example.com/url-1http://www.example.com/url-2http://www.example.com/url-3">Title 1Title 2Title 3</a></li>
  <li><a href="http://www.example.com/url-1http://www.example.com/url-2http://www.example.com/url-3">Title 1Title 2Title 3</a></li>
  <li><a href="http://www.example.com/url-1http://www.example.com/url-2http://www.example.com/url-3">Title 1Title 2Title 3</a></li>
</ul>

What am I doing wrong? Is there a better way of doing this?

Answer

Dimitre Novatchev · Oct 15, 2010

Replace this:

@links = doc.xpath('//links/item').map do |i|
  {'title' => i.xpath('//title'), 'url' => i.xpath('//url')}
end

with:

@links = doc.xpath('//links/item').map do |i|
  {'title' => i.xpath('title'), 'url' => i.xpath('url')}
end

Explanation:

//title 

and

//url

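are absolute XPath expressions. Because they start with //, they are evaluated from the document root regardless of the node they are called on, so they select all title elements and all url elements in the XML document, respectively.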

Contrast this with:

title

and

url

These are relative XPath expressions. They are evaluated against the current node (here, each item) and select only the title and url children of that node, respectively.