jsoup: How to select parent nodes that have children satisfying a condition

asliwinski · Jun 8, 2013

Here's part of the HTML (simplified for the question):

<a href="/auctions?id=4672" class="auction sec"> 
 <div class="progress"> 
  <div class="guarantee"> 
   <img src="/img/ico/2.png" /> 
  </div> 
 </div> </a>
<a href="/auctions?id=4670" class="auction">  
 <div class="progress"> 
  <div class="guarantee"> 
   <img src="/img/ico/1.png" /> 
  </div> 
 </div> </a>

I want a vector containing the ids of the auctions for which the 2.png image is displayed (id=4672 in this case). How do I construct a Selector query that returns this?

In the Selector documentation (http://jsoup.org/apidocs/org/jsoup/select/Selector.html) I can only find how to select children, not parents.

Any help appreciated, including the usage of other libraries. I've tried Jsoup because it seemed to be the most popular.

Answer

ollo · Jun 10, 2013

You can use the parent() method to walk up from each img element to the elements that enclose it:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

final String html = "<a href=\"/auctions?id=4672\" class=\"auction sec\"> \n"
        + " <div class=\"progress\"> \n"
        + "  <div class=\"guarantee\"> \n"
        + "   <img src=\"/img/ico/2.png\" /> \n"
        + "  </div> \n"
        + " </div> </a>\n"
        + "<a href=\"/auctions?id=4670\" class=\"auction\">  \n"
        + " <div class=\"progress\"> \n"
        + "  <div class=\"guarantee\"> \n"
        + "   <img src=\"/img/ico/1.png\" /> \n"
        + "  </div> \n"
        + " </div> </a>";

Document doc = Jsoup.parse(html);

for( Element element : doc.select("img") ) // Select all 'img' tags
{
    Element divGuarantee = element.parent(); // Get parent element of 'img'
    Element divProgress = divGuarantee.parent(); // Get parent of parent etc.

    // ...
}
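
As an alternative to walking up with parent(), the parent links can also be selected directly: jsoup's selector syntax has a :has(selector) pseudo-selector that matches elements containing a descendant that matches the inner selector, and [attr$=value] matches attribute values by suffix. The following is only a sketch under a few assumptions: the class name AuctionIds, the method idsWithIcon and its iconFile parameter are made up for illustration, the id is assumed to be a numeric "id" query parameter in the href, and iconFile is assumed to be a plain file name with no selector metacharacters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class AuctionIds {
    // Matches a numeric "id" query parameter, e.g. in "/auctions?id=4672" (assumed href format).
    private static final Pattern ID_PARAM = Pattern.compile("[?&]id=(\\d+)");

    // Hypothetical helper: ids of all 'a.auction' links containing an img whose src ends with "/" + iconFile.
    public static List<String> idsWithIcon(String html, String iconFile) {
        Document doc = Jsoup.parse(html);
        List<String> ids = new ArrayList<>();

        // :has(...) keeps only the 'a' elements that have a matching descendant 'img'.
        for (Element link : doc.select("a.auction:has(img[src$=/" + iconFile + "])")) {
            Matcher m = ID_PARAM.matcher(link.attr("href"));
            if (m.find()) {
                ids.add(m.group(1)); // the digits after "id="
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/auctions?id=4672\" class=\"auction sec\">"
                + "<div class=\"progress\"><div class=\"guarantee\">"
                + "<img src=\"/img/ico/2.png\" /></div></div></a>"
                + "<a href=\"/auctions?id=4670\" class=\"auction\">"
                + "<div class=\"progress\"><div class=\"guarantee\">"
                + "<img src=\"/img/ico/1.png\" /></div></div></a>";

        System.out.println(idsWithIcon(html, "2.png"));
    }
}
```

With the HTML from the question, only the first link contains an img ending in /2.png, so the list should hold just that auction's id. Compared with chaining parent() calls, this does not depend on how many div levels sit between the a and the img.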