I want to select an element with specific text from some HTML using jsoup. The HTML is:
<td style="vertical-align:bottom;text-align:center;width:15%">
<div style="background-color:#FFDD93;font-size:10px;margin:5px auto 0px auto;text-align:left;" class="genbg"><span class="corners-top-subtab"><span></span></span>
<div><b>Pantry/Catering</b>
<div>
<div style="color:#00700B;">✓ Pantry Car Avbl
<br />✓ Catering Avbl</div>
</div>
<div>
<div><span>Dinner is served after departure from NZM on 1st day.;</span>...
<br /><a style="font-size:10px;color:Red;" onClick="expandPost($(this).parent());" href="javascript:void(0);">Read more...</a>
</div>
<div style="display:none;">Dinner :2 chapati, rice, dal and chicken curry (NV) and paneer curry in veg &Ice cream.; Breakfast:2 bread slices with jam and butter. ; Omlet of 2 eggs (Non veg),vada and sambar(veg)..; coffee & lime juice</div>
</div>
</div><span class="corners-bottom-subtab"><span></span></span>
</div>
I want to find the div element containing the text "Pantry/Catering". I tried
doc.select("div:contains(Pantry/Catering)").first();
But this doesn't seem to work. How can I get this element using jsoup?
When I run your code it selects the outer div, while I'm presuming what you're looking for is the inner div. The documentation says that :contains selects the "elements that contains the specified text". In this simple HTML:
<div><div><b>Pantry/Catering</b></div></div>
The selector div:contains(Pantry/Catering) matches twice because both divs contain the text 'Pantry/Catering':
<!-- First Match -->
<div><div><b>Pantry/Catering</b></div></div>
<!-- Second Match -->
<div><b>Pantry/Catering</b></div>
The matches are always in that order because jsoup returns matched elements in document order, so an outer element comes before any element nested inside it. Therefore .first() always returns the outer div. To extract the inner div you could use .get(1).
Extracting the inner div in full:
doc.select("div:contains(Pantry/Catering)").get(1)
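If you would rather not hard-code the index, here is a minimal sketch of two alternatives. It assumes the matching divs are nested inside one another, as in the HTML above, so the last match is the innermost one. The class name InnerDivExample, the helper names innermostDiv and parentOfBold, and the shortened HTML string are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class InnerDivExample {

    // Returns the last div whose text (including descendants) contains the
    // given string, or null if nothing matches. When the matches are nested
    // inside each other, the last one in document order is the innermost.
    static Element innermostDiv(Document doc, String text) {
        Elements matches = doc.select("div:contains(" + text + ")");
        return matches.last(); // Elements.last() returns null on an empty list
    }

    // Alternative: match the <b> that directly owns the text, then step up
    // to its parent element. Returns null if no such <b> exists.
    static Element parentOfBold(Document doc, String text) {
        Element bold = doc.select("b:containsOwn(" + text + ")").first();
        return bold == null ? null : bold.parent();
    }

    public static void main(String[] args) {
        // Shortened, hypothetical version of the HTML from the question.
        String html = "<div class=\"genbg\"><div><b>Pantry/Catering</b>"
                + "<div>Pantry Car Avbl</div></div></div>";
        Document doc = Jsoup.parse(html);

        System.out.println(innermostDiv(doc, "Pantry/Catering").outerHtml());
        System.out.println(parentOfBold(doc, "Pantry/Catering").outerHtml());
    }
}
```

The second variant does not depend on how many ancestor divs wrap the text, but it does rely on the label sitting directly inside a b element.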