Can you provide examples of parsing HTML?

Chas. Owens picture Chas. Owens · Apr 21, 2009 · Viewed 29.6k times · Source

How do you parse HTML with a variety of languages and parsing libraries?


When answering:

Individual comments will be linked to in answers to questions about how to parse HTML with regexes as a way of showing the right way to do things.

For the sake of consistency, I ask that the example be parsing an HTML file for the href in anchor tags. To make it easy to search this question, I ask that you follow this format

Language: [language name]

Library: [library name]

[example code]

Please make the library a link to the documentation for the library. If you want to provide an example other than extracting links, please also include:

Purpose: [what the parse does]

Answer

Ward Werbrouck picture Ward Werbrouck · Apr 21, 2009

Language: JavaScript
Library: jQuery

$.each($('a[href]'), function(){
    console.debug(this.href);
});

(using firebug console.debug for output...)

And loading any html page:

$.get('http://stackoverflow.com/', function(page){
     $(page).find('a[href]').each(function(){
        console.debug(this.href);
    });
});

Used another each function for this one, I think it's cleaner when chaining methods.