How would I, using BeautifulSoup, search for tags containing ONLY the attributes I search for?
For example, I want to find all <td valign="top">
tags.
The following code:
raw_card_data = soup.fetch('td', {'valign':re.compile('top')})
gets all of the data I want, but also grabs any <td>
tag that has the attribute valign:top
I also tried:
raw_card_data = soup.findAll(re.compile('<td valign="top">'))
and this returns nothing (probably because of bad regex)
I was wondering if there was a way in BeautifulSoup to say "Find <td>
tags whose only attribute is valign:top
"
UPDATE
FOr example, if an HTML document contained the following <td>
tags:
<td valign="top">.....</td><br />
<td width="580" valign="top">.......</td><br />
<td>.....</td><br />
I would want only the first <td>
tag (<td width="580" valign="top">
) to return
As explained on the BeautifulSoup documentation
You may use this :
soup = BeautifulSoup(html)
results = soup.findAll("td", {"valign" : "top"})
EDIT :
To return tags that have only the valign="top" attribute, you can check for the length of the tag attrs
property :
from BeautifulSoup import BeautifulSoup
html = '<td valign="top">.....</td>\
<td width="580" valign="top">.......</td>\
<td>.....</td>'
soup = BeautifulSoup(html)
results = soup.findAll("td", {"valign" : "top"})
for result in results :
if len(result.attrs) == 1 :
print result
That returns :
<td valign="top">.....</td>