Creating a zip file on the fly from files stored on S3 using php

cubiclewar · Jul 8, 2016 · Viewed 11.2k times

I have a Laravel web app in which users can upload files. These files can be sensitive and, although they are stored on S3, they are only accessed via my web servers (streamed download). Once files are uploaded, users may wish to download a selection of them.

Previously, when users went to download a selection of files, my web server would download the files from S3, zip them locally, and then send the zip down to the client. However, once in production, the server response would frequently time out because of the file sizes.

As an alternative, I want to zip the files on the fly via ZipStream, but I haven't had much luck. The zip file either ends up containing corrupted files or is itself corrupted and incredibly small.

Is it possible to pass a stream resource for a file on S3 to ZipStream, and what is the best way to address my timeout issues?

I have tried several methods; my two most recent are as follows:

// First method using fopen
// Results in tiny corrupt zip files
if (!($fp = fopen("s3://{$bucket}/{$key}", 'r')))
{
    die('Could not open stream for reading');
}

$zip->addFileFromPath($file->orginal_filename, "s3://{$bucket}/{$key}");
fclose($fp);


// Second method: download the file from S3 before zipping
// Results in a reasonable sized zip file that is corrupt
$contents = file_get_contents("s3://{$bucket}/{$key}");

$zip->addFile($file->orginal_filename, $contents); 

Each of these sits within a loop that goes through each file. After the loop I call $zip->finish().

Note that I do not get any PHP errors, just corrupt files.

Answer

cubiclewar · Aug 22, 2016

In the end, the solution was to use signed S3 URLs and curl to provide a file stream for ZipStream, as demonstrated by s3 bucket steam zip php. The resulting code, edited from the aforementioned source, is as follows:

public function downloadZip()
{
    // ...

    $s3 = Storage::disk('s3');
    $client = $s3->getDriver()->getAdapter()->getClient();
    $client->registerStreamWrapper();
    $expiry = "+10 minutes";

    // Create a new zipstream object
    $zip = new ZipStream($zipName . '.zip');

    foreach($files as $file)
    {
        $filename = $file->original_filename;

        // We need to use a command to get a request for the S3 object
        //  and then we can get the presigned URL.
        $command = $client->getCommand('GetObject', [
            'Bucket' => config('filesystems.disks.s3.bucket'),
            'Key' => $file->path()
        ]);

        // getUri() returns a URI object, so cast it to a string for curl
        $signedUrl = (string) $client->createPresignedRequest($command, $expiry)->getUri();

        // We want to fetch the file to a file pointer so we create it here
        //  and create a curl request and store the response into the file
        //  pointer.
        // After we've fetched the file we add the file to the zip file using
        //  the file pointer and then we close the curl request and the file
        //  pointer.
        // Closing the file pointer removes the file.
        $fp = tmpfile();
        $ch = curl_init($signedUrl);
        curl_setopt($ch, CURLOPT_TIMEOUT, 120);
        curl_setopt($ch, CURLOPT_FILE, $fp);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_exec($ch);
        curl_close($ch);
        // Defensive: move the pointer back to the start in case the
        //  library reads from the current position.
        rewind($fp);
        $zip->addFileFromStream($filename, $fp);
        fclose($fp);
    }

    $zip->finish();
}

Note that this requires curl and the PHP curl extension (php-curl) to be installed and functioning on your server.
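
The temporary-file step in the loop above relies on a detail of PHP streams: once data has been written to the handle, its position sits at the end of the file, so anything that reads from the current position sees nothing until the handle is rewound. Whether ZipStream rewinds the handle itself may depend on the library version, so that part is an assumption worth verifying. The following is a minimal sketch using only built-in functions; `spoolToTempFile` is a hypothetical helper that stands in for the curl download by writing a string instead.

```php
<?php
// Hypothetical helper (not part of ZipStream or the AWS SDK): writes $data
// into a temporary file, much as curl does with CURLOPT_FILE, and returns
// the open handle.
function spoolToTempFile(string $data)
{
    $fp = tmpfile(); // read/write handle; the file is removed on fclose()
    if ($fp === false) {
        throw new RuntimeException('Could not create temporary file');
    }
    fwrite($fp, $data); // the pointer now sits at the end of the file
    return $fp;
}

$fp = spoolToTempFile('hello zip');

// Reading from the current position yields an empty string...
$beforeRewind = stream_get_contents($fp);

// ...until the pointer is moved back to the start.
rewind($fp);
$afterRewind = stream_get_contents($fp);

var_dump($beforeRewind === '', $afterRewind === 'hello zip');

fclose($fp);
```

If a zip entry comes out empty or truncated, checking the handle position (for example with ftell($fp)) before handing it to the library is a cheap first diagnostic.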