I'm writing a small download robot that follows links into deeper levels of a site on its own.
I need to find all links in an HTML page (links to .jpg files as well as links to .pgn, .pdf, .html, ... files).
I'm using the Html Agility Pack to find all a-href links.
Sample code:
foreach (HtmlNode link in htmlDocument.DocumentNode.SelectNodes("//a[@href]"))
{
    HtmlAttribute attribute = link.Attributes["href"];
    links.Add(attribute.Value);
}
But I want to find the data-url attributes as well.
What XPath syntax do I have to use to find them? An example data-url in the HTML code:
<div class="cbreplay" data-url="2012\edmonton\partien.pgn"></div>
I need the "2012\edmonton\partien.pgn" from this example. How can I do this with XPath?
Best regards. If I made any serious mistakes, please tell me. This is my first question ever.
The following should do what you want:
foreach (HtmlNode divNode in htmlDocument.DocumentNode.SelectNodes("//div[@data-url]"))
{
    HtmlAttribute attribute = divNode.Attributes["data-url"];
    links.Add(attribute.Value);
}
Effectively, the expression //div[@data-url] selects all div elements that have a data-url attribute. We then pull out the value of that attribute.
If elements other than divs can carry this attribute, then //*[@data-url] should do the trick, since * matches an element of any name.
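One thing to watch for: in Html Agility Pack, SelectNodes returns null rather than an empty collection when nothing matches, so the foreach loops above throw on a page without any matching elements. Below is a minimal sketch of how both queries could be combined with a null check. The names LinkCollector, CollectLinks and AddAttributeValues are made up for this illustration and are not part of the library.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper class for illustration; not part of Html Agility Pack.
public static class LinkCollector
{
    // Returns all href values of <a> elements, followed by all data-url
    // values of any element, found in the given HTML string.
    public static List<string> CollectLinks(string html)
    {
        var htmlDocument = new HtmlDocument();
        htmlDocument.LoadHtml(html);

        var links = new List<string>();
        AddAttributeValues(htmlDocument, "//a[@href]", "href", links);
        AddAttributeValues(htmlDocument, "//*[@data-url]", "data-url", links);
        return links;
    }

    // Runs one XPath query and appends the named attribute of each match.
    private static void AddAttributeValues(
        HtmlDocument document, string xpath, string attributeName, List<string> links)
    {
        HtmlNodeCollection nodes = document.DocumentNode.SelectNodes(xpath);

        // SelectNodes yields null when the query matches nothing.
        if (nodes == null)
        {
            return;
        }

        foreach (HtmlNode node in nodes)
        {
            // The XPath predicate guarantees the attribute exists;
            // the empty-string default is only a safety net.
            links.Add(node.GetAttributeValue(attributeName, string.Empty));
        }
    }
}
```

Calling LinkCollector.CollectLinks(pageSource) on the downloaded page text then gives one list containing both kinds of links. The data-url values are returned exactly as written in the page, so a relative path with backslashes like the one in the question still has to be resolved against the page address before downloading.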