Get the value of an attribute with HtmlAgilityPack

denied · Jan 20, 2014

I want to get the value of an attribute with HtmlAgilityPack. HTML code:

<link href="style.css">
<link href="anotherstyle.css">
<link href="anotherstyle2.css">
<link itemprop="thumbnailUrl" href="http://image.jpg">
<link href="anotherstyle5.css">
<link href="anotherstyle7.css">

I want to get the last href attribute.

My C# code:

HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument htmldoc = web.Load(Url);
htmldoc.OptionFixNestedTags = true;
var navigator = (HtmlNodeNavigator)htmldoc.CreateNavigator();
string xpath = "//link/@href";
string val = navigator.SelectSingleNode(xpath).Value;

But that code returns the first href value.

Answer

Sergey Berezovskiy · Jan 20, 2014

The following XPath selects the link elements that have an href attribute defined. From those links you then take the last one:

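// Needs "using System.Linq;". SelectNodes returns null when nothing matches, hence "?.".
var link = doc.DocumentNode.SelectNodes("//link[@href]")?.LastOrDefault();
// link is null if the page has no <link href="...">, so check before dereferencing
var href = link?.Attributes["href"].Value; // "anotherstyle7.css"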
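
For reference, here is a minimal self-contained sketch of this first approach, run against the markup from the question. It assumes the HtmlAgilityPack NuGet package is referenced; the class name LastHrefDemo and the helper LastHref are made up for illustration and are not part of the library.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class LastHrefDemo
{
    // Hypothetical helper: returns the href of the last <link href="..."> in the
    // markup, or null when there is no such element.
    public static string LastHref(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var links = doc.DocumentNode.SelectNodes("//link[@href]");
        var last = links?.LastOrDefault();

        // The XPath already guarantees that href exists on every selected node.
        return last?.Attributes["href"].Value;
    }

    public static void Main()
    {
        // The markup from the question.
        const string html =
            "<link href=\"style.css\">" +
            "<link href=\"anotherstyle.css\">" +
            "<link href=\"anotherstyle2.css\">" +
            "<link itemprop=\"thumbnailUrl\" href=\"http://image.jpg\">" +
            "<link href=\"anotherstyle5.css\">" +
            "<link href=\"anotherstyle7.css\">";

        Console.WriteLine(LastHref(html) ?? "(no link found)");
    }
}
```

Returning null instead of throwing leaves it to the caller to decide what a page without any link elements should mean.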

You can also use the XPath last() function:

// "//" searches the whole document; the parentheses make last() apply to all matches
var link = doc.DocumentNode.SelectSingleNode("(//link[@href])[last()]");
var href = link.Attributes["href"].Value;

UPDATE: If you want the last element that has both itemprop and href attributes, use the XPath (//link[@href and @itemprop])[last()], or //link[@href and @itemprop] if you go with the first approach.
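
One subtlety worth knowing about last(): in XPath 1.0 a predicate written directly after a step, as in //link[@href][last()], is evaluated separately for each parent element, so it yields the last matching link under every parent. Wrapping the path in parentheses, (//link[@href])[last()], collects all matches first and then takes the final one. The sketch below shows the difference. To stay runnable without extra packages it uses the standard System.Xml.Linq XPath extensions on a made-up, well-formed document rather than HtmlAgilityPack; the names LastPredicateDemo, FirstHref and Sample are hypothetical.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class LastPredicateDemo
{
    // Hypothetical helper: evaluates an XPath and returns the href of the first
    // element it yields, or null if nothing matches.
    public static string FirstHref(XDocument doc, string xpath)
    {
        return doc.XPathSelectElement(xpath)?.Attribute("href")?.Value;
    }

    // Made-up sample with <link> elements under two different parents.
    public static XDocument Sample()
    {
        return XDocument.Parse(
            "<html><head>" +
            "<link href='a.css'/>" +
            "<link itemprop='thumbnailUrl' href='http://image.jpg'/>" +
            "<link href='b.css'/>" +
            "</head><body>" +
            "<link href='c.css'/>" +
            "</body></html>");
    }

    public static void Main()
    {
        var doc = Sample();

        // last() per parent: matches b.css (in head) and c.css (in body);
        // the first of those in document order is b.css.
        Console.WriteLine(FirstHref(doc, "//link[@href][last()]"));

        // Parenthesized: last of all matches in the document, i.e. c.css.
        Console.WriteLine(FirstHref(doc, "(//link[@href])[last()]"));

        // Last element carrying both attributes, i.e. http://image.jpg.
        Console.WriteLine(FirstHref(doc, "(//link[@href and @itemprop])[last()]"));
    }
}
```

When all the link elements share one parent, as in the question's markup, both forms give the same result; the parenthesized form only matters once matches are spread over several parents.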