I'd like to extract the content "Hello world". Please note that there are multiple <table> elements and similar <td colspan="2"> cells on the page as well:
<table border="0" cellspacing="2" width="800">
<tr>
<td colspan="2"><b>Name: </b>Hello world</td>
</tr>
<tr>
...
I tried the following:
hello = soup.find(text='Name: ')
hello.findPreviousSiblings
But it returned nothing.
In addition, I'm also having a problem extracting "My home address" from the following:
<td><b>Address:</b></td>
<td>My home address</td>
I'm also using the same method to search for text="Address: ", but how do I navigate down to the next line and extract the content of the <td>?
The contents attribute works well for extracting text from <tag>text</tag>. It returns a list of the tag's child nodes.
Example for <td>My home address</td>:
from bs4 import BeautifulSoup

s = '<td>My home address</td>'
soup = BeautifulSoup(s, 'html.parser')
td = soup.find('td')  # <td>My home address</td>
td.contents  # ['My home address'] (a list of child nodes)
Example for <td><b>Address:</b></td>:
s = '<td><b>Address:</b></td>'
soup = BeautifulSoup(s, 'html.parser')
b = soup.find('td').find('b')  # <b>Address:</b>
b.contents  # ['Address:']
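
To tie this back to the two cases in the question, here is a minimal sketch. It assumes BeautifulSoup 4 (the bs4 package) with the built-in html.parser. The helper name value_after_label is hypothetical, and the sample HTML is assembled from the two fragments in the question on the assumption that the address cells sit in a row of the same table. The text node 'Name: ' lives inside the <b> tag, so it has no siblings of its own; the value "Hello world" is the next sibling of the <b> tag itself. For the address, the value is in the <td> that follows the label's <td>.

```python
from bs4 import BeautifulSoup, NavigableString

# Sample markup built from the question's fragments (the row layout is an assumption).
html = """
<table border="0" cellspacing="2" width="800">
<tr>
<td colspan="2"><b>Name: </b>Hello world</td>
</tr>
<tr>
<td><b>Address:</b></td>
<td>My home address</td>
</tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")


def value_after_label(soup, label):
    """Hypothetical helper: return the text that belongs to a bold label."""
    # Find the <b> tag whose text, ignoring surrounding whitespace, equals the label.
    b = soup.find(lambda tag: tag.name == "b" and tag.get_text(strip=True) == label)
    if b is None:
        return None

    # Case 1: the value is plain text right after the <b> in the same cell.
    sibling = b.next_sibling
    if isinstance(sibling, NavigableString) and sibling.strip():
        return sibling.strip()

    # Case 2: the value sits in the next <td> of the same row.
    cell = b.find_parent("td")
    next_cell = cell.find_next_sibling("td") if cell else None
    return next_cell.get_text(strip=True) if next_cell else None


print(value_after_label(soup, "Name:"))
print(value_after_label(soup, "Address:"))
```

Matching on the stripped label text avoids depending on whether the page writes 'Name: ' with a trailing space or 'Address:' without one. Note also that a method such as findPreviousSiblings has to be called with parentheses; referring to it without them only returns the method object.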