Getting an attribute's value in Nokogiri to extract link URLs

Kreeki · Aug 18, 2011

I have a document which looks like this:

<div id="block">
    <a href="http://google.com">link</a>
</div>

I can't get Nokogiri to give me the value of the href attribute. I'd like to store the address in a Ruby variable as a string.

Answer

Michael Kohl · Aug 18, 2011
require 'nokogiri'

html = <<HTML
  <div id="block">
    <a href="http://google.com">link</a>
  </div>
HTML
doc = Nokogiri::HTML(html)
# XPath returns a NodeSet of Attr nodes, not plain strings
doc.xpath('//div/a/@href')
#=> [#<Nokogiri::XML::Attr:0x80887798 name="href" value="http://google.com">]

Or, if you want to be more specific about the div:

>> doc.xpath('//div[@id="block"]/a/@href')
=> [#<Nokogiri::XML::Attr:0x80887798 name="href" value="http://google.com">]
>> doc.xpath('//div[@id="block"]/a/@href').first.value
=> "http://google.com"