soup.find("tagName", { "id" : "articlebody" })
Why does this NOT return the <div id="articlebody"> ... </div>
tags and stuff in between? It returns nothing. And I know for a fact it exists because I'm staring right at it from
soup.prettify()
soup.find("div", { "id" : "articlebody" })
also does not work.
(EDIT: I found that BeautifulSoup wasn't correctly parsing my page, which probably meant the page I was trying to parse isn't properly formatted in SGML or whatever)
You should post your example document, because the code works fine:
>>> import BeautifulSoup
>>> soup = BeautifulSoup.BeautifulSoup('<html><body><div id="articlebody"> ... </div></body></html')
>>> soup.find("div", {"id": "articlebody"})
<div id="articlebody"> ... </div>
Finding <div>
s inside <div>
s works as well:
>>> soup = BeautifulSoup.BeautifulSoup('<html><body><div><div id="articlebody"> ... </div></div></body></html')
>>> soup.find("div", {"id": "articlebody"})
<div id="articlebody"> ... </div>