Top "Beautifulsoup" questions

Beautiful Soup is a Python package for parsing HTML/XML.

Convert io.BytesIO to io.StringIO to parse HTML page

I'm trying to parse an HTML page I retrieved through pyCurl, but the pyCurl WRITEFUNCTION is returning the page as …

html beautifulsoup pycurl stringio type-conversion
How to re-install lxml?

Python version and device used: Python 2.7.5, Mac 10.7.5, BeautifulSoup 4.2.1. I'm following the BeautifulSoup tutorial, but when I try to parse a …

python web-scraping beautifulsoup lxml easy-install
How can I insert a new tag into a BeautifulSoup object?

Trying to get my head around HTML construction with BS. I'm trying to insert a new tag: self.new_soup.…

python beautifulsoup
Selenium: Iterating through groups of elements

I've done this with BeautifulSoup but it's a bit cumbersome, and I'm trying to figure out if I can do …

python html selenium beautifulsoup html-parsing
beautifulsoup, html5lib: module object has no attribute _base

After updating my packages, I get this new error: class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder): AttributeError: 'module' …

beautifulsoup html5lib
Wait for page to load before getting data with requests.get in Python 3

I have a page whose source I need to get to use with BS4, but the middle of the …

python-3.x web-scraping beautifulsoup python-requests
Parsing HTML in Python - lxml or BeautifulSoup? Which of these is better for what kinds of purposes?

From what I can make out, the two main HTML parsing libraries in Python are lxml and BeautifulSoup. I've chosen …

python beautifulsoup html-parsing lxml
BeautifulSoup: get tag name of element itself, not its children

I have the below (simplified) code, which uses the following source: <html> <p>line 1</p>…

tags beautifulsoup
BeautifulSoup - TypeError: 'NoneType' object is not callable

I need to make my code backwards compatible with Python 2.6 and BeautifulSoup 3. My code was written using Python 2.7 and at …

python beautifulsoup backwards-compatibility
Using beautifulsoup to extract text between line breaks (e.g. <br /> tags)

I have the following HTML that is within a larger document <br /> Important Text 1 <br /> <…

python html html-parsing beautifulsoup