Top "html-parsing" questions

HTML parsing is the process of consuming a serialization of an HTML document and producing a representation that you can work with programmatically, for example to extract data from it.
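
As a rough illustration of that definition, here is a minimal sketch using only Python's standard-library `html.parser` module. The class `SpanTextCollector` and the function `extract_span_texts` are hypothetical names chosen for this example; they are not taken from any of the questions listed below. The sketch consumes an HTML string and returns the text found inside `<span>` elements.

```python
from html.parser import HTMLParser


class SpanTextCollector(HTMLParser):
    """Collect text that appears inside <span> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._depth = 0   # number of <span> elements currently open
        self.texts = []   # text chunks seen while inside a <span>

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lowercase
        if tag == "span":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        # Keep text only while at least one <span> is open
        if self._depth > 0:
            self.texts.append(data)


def extract_span_texts(html):
    """Parse an HTML string and return the text chunks found inside <span> tags."""
    parser = SpanTextCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


if __name__ == "__main__":
    print(extract_span_texts('<p>a <span class="it">CONTENT</span> b</p>'))
```

For real-world pages, a dedicated library such as BeautifulSoup, lxml, or jsoup (all of which appear in the questions below) is usually the more robust choice. The event-based parser here is only meant to show the idea.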

What is the preg_replace regex to replace this HTML tag?

How would I convert strings like this: <span class="it">CONTENT</span> into this: {it}CONTENT{/…

php regex html-parsing preg-replace
Set lxml as default BeautifulSoup parser

I'm working on a web scraping project and have run into problems with speed. To try to fix it, I …

python html beautifulsoup html-parsing lxml
What is the best practice for parsing remote content with jQuery?

Following a jQuery ajax call to retrieve an entire XHTML document, what is the best way to select specific elements …

jquery html-parsing
Jsoup Java HTML parser: executing JavaScript events

Can I fill out forms, execute events and JavaScript functions in Jsoup? If yes, how can I? Or should I …

java jsoup html-parsing dom-events
Using XPath Contains against HTML in Java

I'm scraping values from HTML pages using XPath inside a Java program to get to a specific tag and …

java xpath html-parsing
How do I convert a document made in Jsoup (the Java HTML parser) into a string?

I have a document that was made in jsoup that looks like this: Document doc = Jsoup.connect("http://en.wikipedia.…

java html-parsing jsoup html-parser
Selenium - Get an element's HTML rather than its text value

With this code I have extracted all the desired text from an HTML document: private void RunThroughSearch(string url) { private …

c# html-parsing selenium-webdriver
BeautifulSoup HTML table parsing

I am trying to parse information (html tables) from this site: http://www.511virginia.org/RoadConditions.aspx?j=All&…

python beautifulsoup html-table html-parsing mechanize
Parse HTML in Objective-C

I have to parse an HTML document for my iOS app. I read on the web that I should use the XPath …

ios objective-c xpath html-parsing tfhpple
Remove <br> tags from a parsed Beautiful Soup list?

I'm currently getting into a for loop with all the rows I want: page = urllib2.urlopen(pageurl) soup = BeautifulSoup(page) …

python beautifulsoup html-parsing