Using PHP DOMDocument to select an HTML element by its class and get its text

Abhishek Madhani · Aug 12, 2013 · Viewed 37k times

I am trying to get the text from the div with class 'review-text', using PHP's DOMDocument with the following HTML (same structure as the real page) and the following code.

However, this doesn't seem to work.

  1. HTML

    $html = '
        <div class="page-wrapper">
            <section class="page single-review" itemtype="http://schema.org/Review" itemscope="" itemprop="review">
                <article class="review clearfix">
                    <div class="review-content">
                        <div class="review-text" itemprop="reviewBody">
                        Outstanding ... 
                        </div>
                    </div>
                </article>
            </section>
        </div>
    ';
    
  2. PHP Code

        $classname = 'review-text';
        $dom = new DOMDocument;
        $dom->loadHTML($html);
        $xpath     = new DOMXPath($dom);
        $results = $xpath->query("//*[@class and contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]");
    
        if ($results->length > 0) {
            echo $review = $results->item(0)->nodeValue;
        }
    

The XPath syntax for selecting an element by class comes from this blog.

I have tried many examples from StackOverflow and online tutorials, but none seem to work. Am I missing something?

Answer

Frank Houweling · Aug 12, 2013

The following XPath query does what you want. Just replace the argument provided to $xpath->query with the following:

//div[@class="review-text"]
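
One caveat, shown here as a minimal sketch: `@class="review-text"` compares the whole attribute string, so it would miss an element whose class attribute lists more than one class, while the token-matching expression from the question still finds it. The `$html` below and the helper name `countMatches` are made up for illustration and are not part of the DOM API.

```php
<?php

// Hypothetical markup: the second div carries two classes.
$html = '<div class="review-text">A</div><div class="review-text featured">B</div>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Illustrative helper (not part of the DOM API): count nodes matching a query.
function countMatches(DOMXPath $xpath, string $query): int
{
    return $xpath->query($query)->length;
}

// Exact comparison against the full class attribute.
$exact = countMatches($xpath, '//div[@class="review-text"]');

// Token match: pad the normalized class list with spaces and look for the padded name.
$token = countMatches(
    $xpath,
    "//div[contains(concat(' ', normalize-space(@class), ' '), ' review-text ')]"
);

echo "exact: $exact, token: $token\n";
```

For the HTML in the question both forms should select the same div, since its class attribute holds only `review-text`.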

Edit: For easier development, you can test your own XPath queries online at http://www.xpathtester.com/test.

Edit2: Tested this code; it worked perfectly.

<?php

$html = '
    <div class="page-wrapper">
        <section class="page single-review" itemtype="http://schema.org/Review" itemscope="" itemprop="review">
            <article class="review clearfix">
                <div class="review-content">
                    <div class="review-text" itemprop="reviewBody">
                    Outstanding ... 
                    </div>
                </div>
            </article>
        </section>
    </div>
';

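$classname = 'review-text';
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Match any element whose class attribute is exactly $classname.
$results = $xpath->query("//*[@class='" . $classname . "']");

if ($results->length > 0) {
    // nodeValue returns the element's text, including surrounding whitespace.
    $review = $results->item(0)->nodeValue;
    echo $review;
}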

?>
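
Because the markup uses HTML5 elements such as `<section>` and `<article>`, `loadHTML()` may emit parse warnings depending on the libxml version, and `nodeValue` keeps the whitespace around the text. A slightly more defensive variant could look like the sketch below. The function name `getTextByClass` is hypothetical, and the code assumes the class name contains no quote characters.

```php
<?php

// Hypothetical helper: return the trimmed text of the first element
// carrying $classname, or null when nothing matches.
function getTextByClass(string $html, string $classname): ?string
{
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument;
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Assumes $classname contains no quotes; it is interpolated into the query.
    $results = $xpath->query(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]"
    );

    if ($results === false || $results->length === 0) {
        return null;
    }

    return trim($results->item(0)->nodeValue);
}

$html = '<section class="page"><div class="review-text" itemprop="reviewBody">
    Outstanding ...
</div></section>';

echo getTextByClass($html, 'review-text'), "\n";
```

Returning null for "no match" keeps the caller's check explicit, in the same spirit as the `$results->length > 0` test above.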