How to speed up file_get_contents?

Lucas · Dec 3, 2012 · Viewed 15.8k times

Here's my code:

$language = $_GET['soundtype'];
$word = $_GET['sound'];
$word = urlencode($word);
if ($language == 'english') {
    $url = "<the first url>";
} else if ($language == 'chinese') {
    $url = "<the second url>";
}
$opts = array(
  'http'=>array(
    'method'=>"GET",
    'header'=>"User-Agent: <my user agent>"
  )
);
$context = stream_context_create($opts);
$page = file_get_contents($url, false, $context);
header('Content-Type: audio/mpeg');
echo $page;

But I've found that this runs terribly slowly.

Are there any possible methods of optimization?

Note: $url is a remote URL.

Answer

MrCode · Dec 3, 2012

It's slow because file_get_contents() reads the entire file into $page: PHP waits until the whole file has been received before it outputs anything. In effect, you are downloading the entire file on the server side and then outputting it as a single huge string.

file_get_contents() does not support streaming or reading from an offset in a remote file. One option is to open a raw socket with fsockopen(), send the HTTP request yourself, and read the response in a loop, outputting each chunk to the browser as you read it. This will be faster because the file is streamed: the browser starts receiving data while the rest is still downloading.

Example, adapted from the Manual:

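// Open a TCP connection to the remote host (30-second connect timeout).
$fp = fsockopen("www.example.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    // Send the content type before any of the body is output.
    header('Content-Type: audio/mpeg');

    // Build and send the raw HTTP request.
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.example.com\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);

    // Forward the response as it arrives. Note that this also echoes the
    // response headers; see the caveat below.
    while (!feof($fp)) {
        echo fgets($fp, 128);
    }
    fclose($fp);
}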

The code above loops while content is still available. On each iteration it reads up to 127 bytes (fgets() stops one byte short of the given length, or earlier at the end of a line) and outputs them to the browser. The same principle will work for what you're doing. Because this is a raw request, you get the raw response with the HTTP headers included as its first few lines, so make sure you don't output them. If you do, you will end up with a corrupt file.
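One way to skip the response headers is sketched below. It is a minimal illustration, not part of the Manual's example: the helper name stream_http_body(), its parameters, and the 8192-byte chunk size are all made up for this sketch. It assumes the request has already been sent on the socket, and that the response body is not chunk-encoded. The usage comment sends the request as HTTP/1.0 on the assumption that this keeps the server from applying chunked transfer encoding, which would otherwise have to be decoded.

```php
<?php
// Hypothetical helper (not from the Manual): copies only the body of a raw
// HTTP response from $in to $out and returns the number of bytes written.
function stream_http_body($in, $out, int $chunkSize = 8192): int
{
    // The header section ends at the first empty line ("\r\n" on its own).
    while (($line = fgets($in)) !== false) {
        if (rtrim($line, "\r\n") === '') {
            break;
        }
    }

    // Everything after the blank line is the body: forward it chunk by chunk.
    $bytes = 0;
    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $written = fwrite($out, $chunk);
        if ($written === false) {
            break;
        }
        $bytes += $written;
    }
    return $bytes;
}

// Usage sketch (host and path are placeholders, as in the example above):
//
// $fp = fsockopen('www.example.com', 80, $errno, $errstr, 30);
// fwrite($fp, "GET / HTTP/1.0\r\nHost: www.example.com\r\nConnection: Close\r\n\r\n");
// header('Content-Type: audio/mpeg');
// stream_http_body($fp, fopen('php://output', 'wb'));
// fclose($fp);
```

fread() is used for the body instead of fgets() because the audio data is binary and line boundaries mean nothing there. If output buffering is enabled, the browser may still not see the data until the buffer is flushed.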