Incredibly basic lxml questions: getting HTML/string content of lxml.etree._Element?

AP257 · Mar 22, 2011

This is such a basic question that I actually can't find it in the docs :-/

In the following:

img = house_tree.xpath('//img[@id="mainphoto"]')[0]

How do I get the HTML of the <img/> tag?

I've tried adding html_content() but get AttributeError: 'lxml.etree._Element' object has no attribute 'html_content'.

Also, if it was a tag with some content inside (e.g. <p>text</p>), how would I get the content (e.g. text)?

Many thanks!

Answer

vonPetrushev · Mar 22, 2011

I suppose it will be as simple as:

from lxml.etree import tostring
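img_html = tostring(img, method="html", encoding="unicode")

Note that this serializes the whole element, tag included (plus any text that trails it in the document), not just what is inside it. Without encoding="unicode" the result is bytes rather than str.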
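For anyone who wants to try this end to end, here is a minimal sketch. The page markup below is a made-up stand-in for whatever house_tree was parsed from in the question, and passing with_tail=False is my own choice so that text following the tag is not included.

```python
from lxml import etree

# Hypothetical stand-in for the page behind house_tree in the question.
page = '<html><body><img id="mainphoto" src="house.jpg">caption</body></html>'
house_tree = etree.fromstring(page, etree.HTMLParser())

img = house_tree.xpath('//img[@id="mainphoto"]')[0]

# method="html" serializes with HTML rules rather than XML rules,
# encoding="unicode" returns a str instead of bytes, and
# with_tail=False leaves out the text that follows the element.
img_html = etree.tostring(img, method="html", encoding="unicode", with_tail=False)
print(img_html)
```

If the tail text is wanted, drop with_tail=False, since including it is the default.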

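As for getting the content of some selected element el, say a <p>: el.text gives the text directly inside the tag, and itertext() also picks up the text of any nested tags:

content = el.text
all_text = ''.join(el.itertext())

If the document was parsed with lxml.html rather than lxml.etree, the elements also have text_content(). A plain lxml.etree._Element does not, so there this call raises AttributeError:

content = el.text_content()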
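To see how the ways of reading text differ, here is a small sketch. The <p> snippet is invented for illustration, and it is parsed twice, once with lxml.etree and once with lxml.html, only to show which calls each kind of element supports.

```python
from lxml import etree
import lxml.html

# Hypothetical snippet with a nested tag, to show the difference.
snippet = '<p>some <b>bold</b> text</p>'

# Parsed with lxml.etree: a plain _Element.
el = etree.fromstring(snippet)
direct_text = el.text               # only the text before the first child tag
all_text = ''.join(el.itertext())   # text of the element and all descendants

# Parsed with lxml.html: an HtmlElement, which adds text_content().
html_el = lxml.html.fromstring(snippet)
via_text_content = html_el.text_content()

print(repr(direct_text))
print(repr(all_text))
print(repr(via_text_content))
```

On an element that came from lxml.etree, itertext() is the closest equivalent to text_content().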