What is the best practice for parsing remote content with jQuery?

slypete · Jun 23, 2009

Following a jQuery ajax call to retrieve an entire XHTML document, what is the best way to select specific elements from the resulting string? Perhaps there is a library or plugin that solves this issue?

jQuery can only select XHTML elements that exist in a string if they're normally allowed in a div in the W3C specification; therefore, I'm curious about selecting things like <title>, <script>, and <style>.

According to the jQuery documentation:

http://docs.jquery.com/Core/jQuery#htmlownerDocument

The HTML string cannot contain elements that are invalid within a div, such as html, head, body, or title elements.

Therefore, since we have established that jQuery does not provide a way to do this, how would I select these elements? As an example, if you can show me how to select the remote page's title, that would be perfect!

Thanks, Pete

Answer

David Burrows · Jul 1, 2009

Instead of hacking jQuery to do this, I'd suggest you drop out of jQuery for a minute and use raw XML DOM methods. Using XML DOM methods, you can do this:

  window.onload = function () {
    $.ajax({
      type: 'GET',
      url: 'text.html',
      dataType: 'html', // the response reaches success() as a string
      success: function (data) {
        var xmlDoc;

        // cross platform xml object creation from w3schools
        try { // Internet Explorer
          xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
          xmlDoc.async = false; // boolean false; the string "false" is truthy
          xmlDoc.loadXML(data);
        } catch (e) {
          try { // Firefox, Mozilla, Opera, etc.
            var parser = new DOMParser();
            xmlDoc = parser.parseFromString(data, "text/xml");
          } catch (e2) {
            alert(e2.message);
            return;
          }
        }

        // The XML parser only accepts well-formed XHTML
        alert(xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue);
      }
    });
  };

No messing about with iframes etc.
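
One caveat with the DOMParser branch: in current browsers, parseFromString generally does not throw on malformed XML. It returns a document containing a parsererror element instead, so the catch block above may never fire. Here is a minimal sketch of a helper that checks for that case, assuming a browser that provides DOMParser and textContent. The name getTitleFromXhtml is hypothetical, not part of jQuery or any library.

```javascript
// Hypothetical helper (not a jQuery API): returns the text of the first
// <title> element in an XHTML string, or null if the string fails to parse
// as XML or contains no <title>.
// Assumes a browser environment where DOMParser is available.
function getTitleFromXhtml(xhtml) {
  var doc = new DOMParser().parseFromString(xhtml, 'text/xml');

  // Parse failures are usually reported as a <parsererror> element in the
  // returned document rather than as a thrown exception.
  if (doc.getElementsByTagName('parsererror').length > 0) {
    return null;
  }

  var titles = doc.getElementsByTagName('title');
  return titles.length > 0 ? titles[0].textContent : null;
}
```

Inside the success callback you could then write something like alert(getTitleFromXhtml(data)) and treat a null result as "not well-formed, or no title".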