I have been searching for a good method and banging my head against the wall.
In a file-sharing service project, I have been assigned to determine the best method available for uploading large files.
After going through a lot of questions here on Stack Overflow and other forums, here is what I have found:
Increase the script's maximum execution time, along with the maximum allowed file size
This option doesn't really fit. The upload will almost always time out when the file is sent over a normal broadband connection (1-2 Mbps). Even though the PHP script only starts executing after the upload has completed, there is still no guarantee that the upload itself will not time out.
Chunked upload.
I kind of understand what I'm supposed to do here, but here's what confuses me: say a 1 GB file is being uploaded and I'm reading it in 2 MB chunks; if the upload is slow, won't the PHP script execution still time out and give an error? (A rough sketch of the kind of chunk receiver I have in mind follows below these options.)
Use other languages like Java and Perl?
Is it really more efficient to use Java or Perl for handling file uploads?
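For reference, here is a rough sketch of the kind of chunk receiver I have in mind for the chunked upload approach (the uploadId, index and chunk field names and the target path are placeholders I made up, not anything we have implemented); every 2 MB piece would arrive as its own POST request:
<?php
// receive one chunk of a larger file; the client posts each piece as a separate request
// 'uploadId', 'index' and 'chunk' are placeholder field names
$targetPath = '/tmp/upload-' . basename($_POST['uploadId']) . '.part';
$index      = (int) $_POST['index'];

// append this chunk to the partially assembled file (truncate on the first chunk)
$out = fopen($targetPath, $index === 0 ? 'wb' : 'ab');
$in  = fopen($_FILES['chunk']['tmp_name'], 'rb');
while (!feof($in)) {
    fwrite($out, fread($in, 4096));
}
fclose($in);
fclose($out);

echo json_encode(array('received' => $index));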
The method used by the client is not an issue here, as we will be shipping a client SDK and can implement the method of our choice in it. Both the client-side and server-side implementations will be decided by us.
Which method do you think is best, considering that memory usage should stay low and that there may be many concurrent uploads in progress?
How do Dropbox and similar cloud storage services handle big file uploads and still stay fast at it?
I suggest you use PHP's I/O streams together with AJAX. This keeps the memory footprint on the server low, and you can easily build an asynchronous file upload on top of it. Note that this relies on the HTML5 File API, which is only available in modern browsers.
Check out this post: https://web.archive.org/web/20170803172549/http://www.webiny.com/blog/2012/05/07/webiny-file-upload-with-html5-and-ajax-using-php-streams/
Pasting the code from the article here:
HTML
<input type="file" name="upload_files" id="upload_files" multiple="multiple">
JS
function upload(fileInputId, fileIndex)
{
    // take the file from the input
    var file = document.getElementById(fileInputId).files[fileIndex];
    var reader = new FileReader();
    reader.onloadend = function(evt)
    {
        // create XHR instance
        var xhr = new XMLHttpRequest();
        // send the file through POST
        xhr.open("POST", 'upload.php', true);

        // make sure we have the sendAsBinary method on all browsers
        XMLHttpRequest.prototype.mySendAsBinary = function(text){
            var data = new ArrayBuffer(text.length);
            var ui8a = new Uint8Array(data, 0);
            for (var i = 0; i < text.length; i++) ui8a[i] = (text.charCodeAt(i) & 0xff);

            var blob;
            if (typeof window.Blob == "function")
            {
                blob = new Blob([data]);
            } else {
                var bb = new (window.MozBlobBuilder || window.WebKitBlobBuilder || window.BlobBuilder)();
                bb.append(data);
                blob = bb.getBlob();
            }
            this.send(blob);
        }

        // let's track upload progress
        var eventSource = xhr.upload || xhr;
        eventSource.addEventListener("progress", function(e) {
            // get percentage of how much of the current file has been sent
            var position = e.position || e.loaded;
            var total = e.totalSize || e.total;
            var percentage = Math.round((position / total) * 100);
            // here you should write your own code for how you wish to process this
        });

        // state change observer - we need to know when and if the file was successfully uploaded
        xhr.onreadystatechange = function()
        {
            if (xhr.readyState == 4)
            {
                if (xhr.status == 200)
                {
                    // process success
                } else {
                    // process error
                }
            }
        };

        // start sending
        xhr.mySendAsBinary(evt.target.result);
    };
    // read the file as a binary string; alternatively you can use readAsDataURL
    reader.readAsBinaryString(file);
}
PHP
<?php
// read contents from the request body (the raw input stream)
$inputHandler = fopen('php://input', "r");
// create a temp file where data from the input stream will be saved
$fileHandler = fopen('/tmp/myfile.tmp', "w+");

// copy the input stream to the temp file in small pieces,
// so only a 4 KB buffer is held in memory at any time
while (true) {
    $buffer = fgets($inputHandler, 4096);
    if ($buffer === false || strlen($buffer) == 0) {
        fclose($inputHandler);
        fclose($fileHandler);
        return true;
    }
    fwrite($fileHandler, $buffer);
}
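One thing to watch out for: the snippet above always writes to /tmp/myfile.tmp, so two concurrent uploads would overwrite each other. A minimal sketch of a workaround, assuming the JavaScript is extended to send the original file name in a custom X-File-Name request header (the code above does not send one), is to derive a unique temporary path per request:
<?php
// X-File-Name is an assumed custom header; the JS above would need to set it
$originalName = isset($_SERVER['HTTP_X_FILE_NAME'])
    ? basename($_SERVER['HTTP_X_FILE_NAME'])
    : 'upload';
// unique temp file per request instead of the fixed /tmp/myfile.tmp
$tempPath = tempnam(sys_get_temp_dir(), 'up_');

$inputHandler = fopen('php://input', 'r');
$fileHandler  = fopen($tempPath, 'w+');
// copy the request body to the temp file without loading it all into memory
stream_copy_to_stream($inputHandler, $fileHandler);
fclose($inputHandler);
fclose($fileHandler);
// $originalName and $tempPath can now be handed off to whatever stores the file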