Using SimpleXML to read RSS feed

geoffs3310 · Feb 3, 2011 · Viewed 59.7k times

I am using PHP and SimpleXML to read the following RSS feed:

http://feeds.bbci.co.uk/news/england/rss.xml

I can get most of the information I want like so:

$rss = simplexml_load_file('http://feeds.bbci.co.uk/news/england/rss.xml');

echo '<h1>'. $rss->channel->title . '</h1>';

foreach ($rss->channel->item as $item) {
   echo '<h2><a href="'. $item->link .'">' . $item->title . "</a></h2>";
   echo "<p>" . $item->pubDate . "</p>";
   echo "<p>" . $item->description . "</p>";
} 

But how would I output the thumbnail image that is in the following tag:

<media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/51078000/jpg/_51078953_226alanpotbury.jpg"/>  

Answer

Josh Davis · Feb 4, 2011

As you already know, SimpleXML lets you select a node's child using the object property operator -> or a node's attribute using the array access ['name']. It's great, but it only works if what you select belongs to the same namespace as the node you start from.
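To make that limitation concrete, here is a minimal sketch. The inline feed, the id attribute and the example.com URL are made up for illustration; only the namespace name is taken from the BBC feed.

```php
<?php
// Hypothetical one-item feed: <title> and the id attribute are unprefixed,
// while <media:thumbnail> lives in the Media RSS namespace.
$xml = '<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel>'
     . '<item id="1"><title>Hello</title>'
     . '<media:thumbnail url="http://example.com/a.jpg"/></item>'
     . '</channel></rss>';

$item = simplexml_load_string($xml)->channel->item;

echo $item->title, "\n"; // child in the same namespace as <item>: reachable with ->
echo $item['id'], "\n";  // unprefixed attribute: reachable with ['name']

// The namespaced child is not visible through plain property access,
// so this check is expected to come out false.
var_dump(isset($item->thumbnail));
```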

If you want to "hop" from one namespace to another, you can use the children() or attributes() methods. Your case is a bit trickier because <item/> is in the global namespace, the node you're looking for is in the "media" namespace*, and its attributes are in the global namespace again (they are not prefixed). So with the normal object/array notation you'll have to "hop" twice:

foreach ($rss->channel->item as $item)
{
    // we load the attributes into $thumbAttr
    // you can either use the namespace prefix
    $thumbAttr = $item->children('media', true)->thumbnail->attributes();

    // or preferably the namespace name, read note below for an explanation
    $thumbAttr = $item->children('http://search.yahoo.com/mrss/')->thumbnail->attributes();

    echo $thumbAttr['url'];
}
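If some items in a feed carry no thumbnail, it may be worth guarding the lookup. The following is only a sketch under that assumption: the sample feed and the thumbnailUrl() helper are hypothetical, not part of SimpleXML or of the BBC feed.

```php
<?php
// Hypothetical sample feed, trimmed to the parts that matter here.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>With thumbnail</title>
      <media:thumbnail width="66" height="49" url="http://example.com/a.jpg"/>
    </item>
    <item>
      <title>Without thumbnail</title>
    </item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($xml);

// Hypothetical helper: returns the thumbnail URL of an item, or null if it has none.
function thumbnailUrl(SimpleXMLElement $item): ?string
{
    // First hop: into the Media RSS namespace, selected by namespace name.
    $media = $item->children('http://search.yahoo.com/mrss/');
    if (!isset($media->thumbnail)) {
        return null;
    }

    // Second hop: back to the unprefixed attributes of <media:thumbnail>.
    $attrs = $media->thumbnail->attributes();

    return isset($attrs['url']) ? (string) $attrs['url'] : null;
}

foreach ($rss->channel->item as $item) {
    $url = thumbnailUrl($item);
    if ($url !== null) {
        // Escape the URL before putting it into HTML output.
        echo '<img src="' . htmlspecialchars($url) . '" alt="">' . "\n";
    }
}
```

Casting to string matters here: without it the helper would hand back a SimpleXMLElement rather than plain text.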

*Note

I refer to the namespace as the "media" namespace, but that's not really correct. The namespace name is http://search.yahoo.com/mrss/, and "media" is just a prefix, some sort of alias if you will. What's important to keep in mind is that http://search.yahoo.com/mrss/ is the real name of the namespace. At some point, your RSS provider might decide to change the prefix to, say, "yahoo", and your script will stop working if it refers to the "media" prefix. If you use the namespace name instead, it will keep working no matter which prefix the feed uses.
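A small sketch of that difference, assuming two made-up feeds that bind the same namespace name to different prefixes. sampleFeed(), hasThumbnail() and the MRSS_NS constant are hypothetical names used only for this illustration.

```php
<?php
const MRSS_NS = 'http://search.yahoo.com/mrss/';

// Hypothetical helper: builds a one-item feed that binds the namespace to $prefix.
function sampleFeed(string $prefix): string
{
    return '<rss version="2.0" xmlns:' . $prefix . '="' . MRSS_NS . '">'
         . '<channel><item>'
         . '<' . $prefix . ':thumbnail url="http://example.com/a.jpg"/>'
         . '</item></channel></rss>';
}

// Hypothetical helper: checks for a <thumbnail> child, looked up either by
// prefix ($isPrefix = true) or by namespace name ($isPrefix = false).
function hasThumbnail(SimpleXMLElement $item, string $ns, bool $isPrefix): bool
{
    return isset($item->children($ns, $isPrefix)->thumbnail);
}

foreach (['media', 'yahoo'] as $prefix) {
    $item = simplexml_load_string(sampleFeed($prefix))->channel->item;

    printf(
        "feed prefix %s: by prefix 'media' = %s, by namespace name = %s\n",
        $prefix,
        hasThumbnail($item, 'media', true) ? 'found' : 'missing',
        hasThumbnail($item, MRSS_NS, false) ? 'found' : 'missing'
    );
}
```

The lookup by prefix should only succeed for the feed that actually uses "media", while the lookup by namespace name should succeed for both.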