Top "Beautifulsoup" questions

Beautiful Soup is a Python package for parsing HTML/XML.

BeautifulSoup4 get_text still has javascript

I'm trying to remove all the HTML/JavaScript using bs4; however, it doesn't get rid of the JavaScript. I still see …

python beautifulsoup nltk
Web scraping - how to access content rendered in JavaScript via Angular.js?

I'm trying to scrape data from the public site asx.com.au. The page http://www.asx.com.au/asx/…

python angularjs web-scraping beautifulsoup urllib2
Decoding HTML entities with Python

I'm trying to decode HTML entities from NYTimes.com and I cannot figure out what I am doing wrong. …

python unicode character-encoding content-type beautifulsoup
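For the entity-decoding question above, a minimal sketch of one common approach, assuming Python 3 and the standard library's html.unescape. The sample string is hypothetical and not taken from the question; this may not be the fix the asker needed, since the excerpt is truncated.

```python
import html

# Hypothetical sample text containing named and numeric HTML entities.
raw = "Caf&eacute; &amp; bar &#169; open &quot;late&quot;"

# html.unescape converts named and numeric character references to Unicode.
decoded = html.unescape(raw)
print(decoded)
```

If the page was already parsed with Beautiful Soup, the text it returns is normally decoded already, so a second unescape pass is usually only needed on raw strings.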
Python correct encoding of Website (Beautiful Soup)

I am trying to load an HTML page and output the text. Even though I am getting the webpage correctly, BeautifulSoup …

python encoding utf-8 beautifulsoup mojibake
Difference between .string and .text BeautifulSoup

I noticed something odd when working with BeautifulSoup and couldn't find any documentation to support this so I wanted …

python beautifulsoup
Python BeautifulSoup: wildcard attribute/id search

I have this: dates = soup.findAll("div", {"id" : "date"}) However, I need id to be a wildcard search since the …

python beautifulsoup
Beautifulsoup multiple class selector

I want to select all the divs which have BOTH A and B as class attributes. The following selection soup.…

python beautifulsoup
Find next siblings until a certain one using beautifulsoup

The webpage is something like this: <h2>section1</h2> <p>article</p> &…

python find beautifulsoup scrape siblings
Matching partial ids in BeautifulSoup

I'm using BeautifulSoup. I have to find any reference to the <div> tags with id like: post-#. For …

python beautifulsoup
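For the partial-id question above, a minimal sketch of one way to match ids such as post-1 or post-22, assuming bs4 is installed and that the id is "post-" followed by digits. The markup and the exact pattern are hypothetical, since the excerpt is truncated.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup; the real page from the question is not shown.
html_doc = """
<div id="post-1">first</div>
<div id="post-22">second</div>
<div id="sidebar">other</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# A compiled regular expression can be passed as an attribute filter;
# only divs whose id matches the pattern are returned.
posts = soup.find_all("div", id=re.compile(r"^post-\d+$"))
ids = [div["id"] for div in posts]
print(ids)
```

A CSS attribute selector such as soup.select('div[id^="post-"]') is a similar option when a prefix match is enough.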
How to save back changes made to a HTML file using BeautifulSoup in Python?

I have the script below, which modifies href attributes in an HTML file (in the future, it will be a …

python html-parsing beautifulsoup