I am trying to get all the text inside a node in the following markup and return it as one value (not as multiple nodes).
<p>
"I love eating out."
<br>
<br>
"This is my favorite restaurant."
<br>
"I will definitely be back"
</p>
Using '/p' returns all of the text, but with the line breaks included. Using '/p/text()' returns each piece of text between the tags as a separate value. The ideal result would be:
"I love eating out. This is my favorite restaurant. I will definitely be back"
I've searched other questions but couldn't find anything close enough. Please note that in the current environment I am restricted to a single XPath query and cannot post-process the result or set up any HTML pre-parsing. Specifically, I'm using the importXML function inside of Google Docs.
Use:
normalize-space(/)
When this XPath expression is evaluated, the string value of the document node (/) is first produced, and this is passed as the argument to the standard XPath function normalize-space().
By definition, normalize-space() returns its argument with leading and trailing whitespace removed and with each interior run of adjacent whitespace characters replaced by a single space character.
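For readers who want to see that definition in action outside an XPath engine, here is a minimal Python sketch that emulates it. The helper name `normalize_space` is hypothetical, and the sketch assumes XPath 1.0 whitespace (space, tab, carriage return, line feed); it is an illustration of the rule, not how any XPath processor is implemented.

```python
import re

def normalize_space(s: str) -> str:
    # XPath 1.0 whitespace is limited to space, tab, carriage return and line feed.
    # Collapse every run of those characters into a single space ...
    collapsed = re.sub(r"[ \t\r\n]+", " ", s)
    # ... then drop the single space that may remain at either end.
    return collapsed.strip(" ")

sample = '\n"I love eating out."\n\n\n"This is my favorite restaurant."\n'
result = normalize_space(sample)
print(result)
```

The line breaks left behind by the br elements are ordinary whitespace in the string value, which is why they end up as single spaces.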
The evaluation of the above XPath expression results in:
"I love eating out." "This is my favorite restaurant." "I will definitely be back"
To eliminate the quotes, we additionally use the translate() function:
normalize-space(translate(/,'"', ''))
The result of evaluating this expression is:
I love eating out. This is my favorite restaurant. I will definitely be back
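As a rough cross-check, the same two steps can be emulated in Python with only the standard library. This is a sketch under stated assumptions: the function name `evaluate` is hypothetical, the input is the well-formed version of the markup (self-closed br elements), and `itertext()` stands in for the XPath string value of the document.

```python
import re
import xml.etree.ElementTree as ET

XML = """<p>
"I love eating out."
<br />
<br />
"This is my favorite restaurant."
<br />
"I will definitely be back"
</p>"""

def evaluate(xml_text: str) -> str:
    root = ET.fromstring(xml_text)
    # String value of the document: all descendant text nodes, in document order.
    text = "".join(root.itertext())
    # translate(/, '"', ''): delete every double-quote character.
    text = text.translate(str.maketrans("", "", '"'))
    # normalize-space(): collapse whitespace runs to one space and trim the ends.
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")

result = evaluate(XML)
print(result)
```

The order of the two steps does not matter here, because removing the quotes never introduces or removes whitespace.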
Finally, to have this result wrapped in quotes itself, we use the concat() function:
concat('"',
normalize-space(translate(/,'"', '')),
'"'
)
The evaluation of this XPath expression produces exactly the wanted result:
"I love eating out. This is my favorite restaurant. I will definitely be back"
XSLT-based verification:
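<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <!-- The select attribute is delimited by double quotes, so the quote
      character inside the XPath expression is written as &quot; -->
 <xsl:template match="/">
  <xsl:value-of select=
   "concat('&quot;',
           normalize-space(translate(/,'&quot;', '')),
           '&quot;'
          )"/>
 </xsl:template>
</xsl:stylesheet>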
When this transformation is applied to the provided XML document (corrected to be well-formed):
<p>
"I love eating out."
<br />
<br />
"This is my favorite restaurant."
<br />
"I will definitely be back"
</p>
the XPath expression is evaluated and the result of this evaluation is copied to the output:
"I love eating out. This is my favorite restaurant. I will definitely be back"