How to parse html table to array with symfony dom crawler

arif hidayat picture arif hidayat · Jun 28, 2016 · Viewed 9.6k times · Source

I have html table and I want to make array from that table

$html = '<table>
<tr>
    <td>satu</td>
    <td>dua</td>
</tr>
<tr>
    <td>tiga</td>
    <td>empat</td>
</tr>
</table>

My array must look like this

array(
   array(
      "satu",
      "dua",
   ),
   array(
     "tiga",
     "empat",
   )
)

I have tried the below code but could not get the array as I need

$crawler = new Crawler();
$crawler->addHTMLContent($html);
$row = array();
$tr_elements = $crawler->filterXPath('//table/tr');
foreach ($tr_elements as $tr) {
 // ???????
}

Answer

Francisco Luz picture Francisco Luz · Apr 23, 2017
$table = $crawler->filter('table')->filter('tr')->each(function ($tr, $i) {
    return $tr->filter('td')->each(function ($td, $i) {
        return trim($td->text());
    });
});

print_r($table);

The above example will give you a multidimensional array where the first layer are the table lines "tr" and the second layer are the table columns "td".

EDIT

If you got nested tables, this code will flatten them out nicely into a single dimension array.

$html = 'MY HTML HERE';
$crawler = new Crawler($html);

$flat = function(string $selector) use ($crawler) {
    $result = [];
    $crawler->filter($selector)->each(function ($table, $i) use (&$result) {
        $table->filter('tr')->each(function ($tr, $i) use (&$result) {
            $tr->filter('td')->each(function ($td, $i) use (&$result) {
                $html = trim($td->html());
                if (strpos($html, '<table') !== FALSE) return;

                $iterator = $td->getIterator()->getArrayCopy()[0];
                $address = $iterator->getNodePath();

                if (!empty($html)) $result[$address] = $html;
            });
        });
    });
    return $result;
};

// The selector gotta point to the most outwards table.
print_r($flat('#Prod fieldset div table'));