Top "Beautifulsoup" questions

Beautiful Soup is a Python package for parsing HTML/XML.
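
As a rough illustration of what that parsing looks like in practice, here is a minimal sketch. It assumes the `bs4` package is installed and uses the standard library's `html.parser` backend; the sample markup and the variable names are made up for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = "<html><body><p class='intro'>Hello, <a href='/docs'>docs</a></p></body></html>"

# "html.parser" ships with Python; lxml or html5lib could be passed instead if installed.
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag (or None if nothing matches).
link = soup.find("a")
link_text = link.get_text()   # text inside the <a> tag
link_href = link["href"]      # value of the href attribute

# class_ (with a trailing underscore) filters on the CSS class attribute.
paragraph_text = soup.find("p", class_="intro").get_text()
```

Naming the parser explicitly keeps the result consistent across machines, since Beautiful Soup otherwise picks whichever parser happens to be installed.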

How to extract a JSON object that was defined in an HTML page's JavaScript block using Python?

I am downloading HTML pages that have data defined in them in the following way: ... <script type= "text/javascript"&…

python html-parsing beautifulsoup headless-browser
How to scrape links with PhantomJS

Can PhantomJS be used as an alternative to BeautifulSoup? I am trying to search on Etsy and visit all the …

javascript beautifulsoup phantomjs casperjs
Set lxml as default BeautifulSoup parser

I'm working on a web scraping project and have run into problems with speed. To try to fix it, I …

python html beautifulsoup html-parsing lxml
BeautifulSoup - how should I obtain the body contents

I'm parsing HTML with BeautifulSoup. At the end, I would like to obtain the body contents, but without the body …

python django beautifulsoup html5lib
Best way for a beginner to learn screen scraping with Python

This might be one of those questions that are difficult to answer, but here goes: I don't consider myself …

python screen-scraping beautifulsoup lxml scrapy
BeautifulSoup HTML table parsing

I am trying to parse information (HTML tables) from this site: http://www.511virginia.org/RoadConditions.aspx?j=All&…

python beautifulsoup html-table html-parsing mechanize
Extract element with no class attribute

I need to navigate to an HTML element of a particular type. However, there are many such elements of that …

python beautifulsoup
Remove <br> tags from a parsed Beautiful Soup list?

I'm currently getting into a for loop with all the rows I want: page = urllib2.urlopen(pageurl) soup = BeautifulSoup(page) …

python beautifulsoup html-parsing
Beautiful Soup if Class "Contains" or Regex?

If my class names are constantly different, say for example: listing-col-line-3-11 dpt 41 listing-col-block-1-22 dpt 41 listing-col-line-4-13 CWK 12 Normally …

python regex web-scraping beautifulsoup
Get immediate parent tag with BeautifulSoup in Python

I've researched this question but haven't seen an actual solution to it. I'm using BeautifulSoup with Python and what …

python html beautifulsoup html-parsing