PHP DOM: parsing an HTML list into an array?

laukok · Apr 28, 2012 · Viewed 28.4k times

I have the below HTML string, and I would like to turn it into an array.

$string = '
<a href="#" class="something">1</a>
<a href="#" class="something">2</a>
<a href="#" class="something">3</a>
<a href="#" class="something">4</a>
';

Here's my current code with DOMDocument:

$dom = new DOMDocument;
$dom->loadHTML($string);
foreach( $dom->getElementsByTagName('a') as $node)
{
    $array[] = $node->nodeValue; 
}

print_r($array);

However, this gives the below output:

Array ( [0] => 1 [1] => 2 [2] => 3 [3] => 4 )

But I am looking for this result:

Array ( 
[0] => <a href="#" class="something">1</a>
[1] => <a href="#" class="something">2</a> 
[2] => <a href="#" class="something">3</a>
[3] => <a href="#" class="something">4</a>
)

Is this possible?

Answer

Ry- · Apr 28, 2012

nodeValue returns only a node's text content. To get the element's full HTML representation, pass the node to DOMDocument::saveHTML:

$string = '
<a href="#" class="something">1</a>
<a href="#" class="something">2</a>
<a href="#" class="something">3</a>
<a href="#" class="something">4</a>
';

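$dom = new DOMDocument;
$dom->loadHTML($string);

$array = array(); // initialize so print_r() works even when nothing matches
foreach ($dom->getElementsByTagName('a') as $node)
{
    // saveHTML($node) serializes just this element, tags and attributes included
    $array[] = $dom->saveHTML($node);
}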

print_r($array);

Result:

Array
(
    [0] => <a href="#" class="something">1</a>
    [1] => <a href="#" class="something">2</a>
    [2] => <a href="#" class="something">3</a>
    [3] => <a href="#" class="something">4</a>
)

Note that the $node argument to saveHTML() is only available in PHP 5.3.6 and higher.
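
For PHP versions older than 5.3.6, one possible workaround is DOMDocument::saveXML, which has accepted a node argument for longer. The sketch below is not part of the original answer: the helper name outerHtmlOfTags is hypothetical, and it assumes XML-style serialization is acceptable (for example, an empty element would come out as <a/> rather than <a></a>).

```php
<?php
// Hypothetical helper (not from the original answer): returns the outer
// markup of every element with the given tag name as an array of strings.
function outerHtmlOfTags($html, $tagName)
{
    $dom = new DOMDocument;
    $dom->loadHTML($html);

    $result = array();
    foreach ($dom->getElementsByTagName($tagName) as $node) {
        // saveXML($node) serializes only this element. Assumption: XML
        // serialization rules are acceptable for the markup being handled.
        $result[] = $dom->saveXML($node);
    }

    return $result;
}

$links = outerHtmlOfTags(
    '<a href="#" class="something">1</a><a href="#" class="something">2</a>',
    'a'
);
print_r($links);
```

For simple links like the ones in the question, this should produce the same strings as saveHTML($node). On PHP 5.3.6 and higher, saveHTML($node) remains the more natural choice because it follows HTML serialization rules.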