If a page has <div class="class1"> and <p class="class1">, then soup.findAll(True, 'class1') will find them both.
If it has <p class="class1 class2">, though, it will not be found. How do I find all objects with a certain class, regardless of whether they have other classes, too?
Unfortunately, BeautifulSoup treats this as a single class name with a space in it, 'class1 class2', rather than as two classes, ['class1', 'class2']. A workaround is to match the class attribute with a regular expression instead of a plain string.
This works:
import re
soup.findAll(True, {'class': re.compile(r'\bclass1\b')})
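To see what the pattern accepts and rejects, it can be tried on raw class attribute strings using only the re module. This is a rough sketch: the sample values below are made up for illustration, and the stricter whitespace-delimited pattern is my own assumption, not part of the workaround above. It is worth knowing that \b also counts a hyphen as a boundary, so a class like class1-wide matches too.

```python
import re

# Hypothetical class attribute values, as they might appear in the HTML.
class_values = ["class1", "class1 class2", "class2 class1", "class11", "class1-wide"]

# The pattern from the workaround: word boundaries around the class name.
word_boundary = re.compile(r"\bclass1\b")

# A stricter variant (an assumption, not from the original answer):
# the name must be delimited by whitespace or by the ends of the string.
whitespace_delimited = re.compile(r"(?<!\S)class1(?!\S)")

# Keep the values each pattern matches anywhere in the string.
loose = [v for v in class_values if word_boundary.search(v)]
strict = [v for v in class_values if whitespace_delimited.search(v)]

print(loose)   # includes "class1-wide", because "-" is a word boundary
print(strict)  # excludes "class1-wide" and "class11"
```

If hyphenated class names are a concern, the stricter pattern can be passed in the same way: soup.findAll(True, {'class': whitespace_delimited}).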