Given this XML:
<DocText>
<WithQuads>
<Page pageNumber="3">
<Word>
July
<Quad>
<P1 X="84" Y="711.25" />
<P2 X="102.062" Y="711.25" />
<P3 X="102.062" Y="723.658" />
<P4 X="84.0" Y="723.658" />
</Quad>
</Word>
<Word>
</Word>
<Word>
30,
<Quad>
<P1 X="104.812" Y="711.25" />
<P2 X="118.562" Y="711.25" />
<P3 X="118.562" Y="723.658" />
<P4 X="104.812" Y="723.658" />
</Quad>
</Word>
</Page>
</WithQuads>
</DocText>
I'd like to find the Word nodes whose text is 'July' and whose Quad/P1/@X attribute is greater than 90. In this case, that should not return any matches. However, whether I use greater-than (>) or less-than (<), I get a match on the first Word element. If I use equals (=), I get no match.
So:
//Word[text()='July' and //P1[@X < 90]]
will return true, as will
//Word[text()='July' and //P1[@X > 90]]
How do I constrain this properly on the P1/@X attribute?
In addition, imagine I have multiple Page elements with different page numbers. How would I further constrain the search above to find nodes with text()='July', P1/@X < 90, and Page/@pageNumber=3?
Generally I would consider the use of an unprefixed // as a bad smell in an XPath.
Try this:
/DocText/WithQuads/Page/Word[text()='July' and Quad/P1/@X > 90]
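Your problem is that //P1[@X < 90] starts back at the root of the document and hunts for any P1 anywhere, not just the one inside the current Word. It is therefore true for every Word as long as at least one P1 in the document has X below 90. Similarly, //P1[@X > 90] is true as long as any P1 in the document has X above 90, which is the case here (104.812).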
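For the second part of the question, the same idea should extend to the page number: put a predicate on the Page step and keep everything inside the Word predicate relative to that Word. A sketch, assuming the structure shown in the question and that the Word text has no surrounding whitespace:

```
/DocText/WithQuads/Page[@pageNumber = 3]/Word[text() = 'July' and Quad/P1/@X < 90]
```

If you do want a descendant search inside a predicate, start it with a dot so that it is evaluated from the current Word rather than from the document root:

```
//Word[text() = 'July' and .//P1[@X < 90]]
```

One caveat: if the real file has whitespace or line breaks around the word, text() = 'July' will not match. Comparing normalize-space(text()[1]) = 'July' instead is one way around that.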