Getting title and meta tags from external website

MacMac picture MacMac · Sep 14, 2010 · Viewed 128.2k times · Source

I want to try figure out how to get the

<title>A common title</title>
<meta name="keywords" content="Keywords blabla" />
<meta name="description" content="This is the description" />

Even though if it's arranged in any order, I've heard of the PHP Simple HTML DOM Parser but I don't really want to use it. Is it possible for a solution except using the PHP Simple HTML DOM Parser.

preg_match will not be able to do it if it's invalid HTML?

Can cURL do something like this with preg_match?

Facebook does something like this but it's properly used by using:

<meta property="og:description" content="Description blabla" />

I want something like this so that it is possible when someone posts a link, it should retrieve the title and the meta tags. If there are no meta tags, then it it ignored or the user can set it themselves (but I'll do that later on myself).

Answer

shamittomar picture shamittomar · Sep 14, 2010

This is the way it should be:

function file_get_contents_curl($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

    $data = curl_exec($ch);
    curl_close($ch);

    return $data;
}

$html = file_get_contents_curl("http://example.com/");

//parsing begins here:
$doc = new DOMDocument();
@$doc->loadHTML($html);
$nodes = $doc->getElementsByTagName('title');

//get and display what you need:
$title = $nodes->item(0)->nodeValue;

$metas = $doc->getElementsByTagName('meta');

for ($i = 0; $i < $metas->length; $i++)
{
    $meta = $metas->item($i);
    if($meta->getAttribute('name') == 'description')
        $description = $meta->getAttribute('content');
    if($meta->getAttribute('name') == 'keywords')
        $keywords = $meta->getAttribute('content');
}

echo "Title: $title". '<br/><br/>';
echo "Description: $description". '<br/><br/>';
echo "Keywords: $keywords";