Streaming file upload using bottle (or flask or similar)

Tomáš Plešek picture Tomáš Plešek · Feb 23, 2013 · Viewed 10.7k times · Source

I have a REST frontend written using Python/Bottle which handles file uploads, usually large ones. The API is wirtten in such a way that:

The client sends PUT with the file as a payload. Among other things, it sends Date and Authorization headers. This is a security measure against replay attacks -- the request is singed with a temporary key, using target url, the date and several other things

Now the problem. The server accepts the request if the supplied date is in given datetime window of 15 minutes. If the upload takes long enough time, it will be longer than the allowed time delta. Now, the request authorization handling is done using decorator on bottle view method. However, bottle won't start the dispatch process unless the upload is finished, so the validation fails on longer uploads.

My question is: is there a way to explain to bottle or WSGI to handle the request immediately and stream the upload as it goes? This would be useful for me for other reasons as well. Or any other solutions? As I am writing this, WSGI middleware comes to mind, but still, I'd like external insight.

I would be willing to switch to Flask, or even other Python frameworks, as the REST frontend is quite lightweight.

Thank you

Answer

petrus-jvrensburg picture petrus-jvrensburg · Apr 15, 2013

I recommend splitting the incoming file into smaller-sized chunks on the frontend. I'm doing this to implement a pause/resume function for large file uploads in a Flask application.

Using Sebastian Tschan's jquery plugin, you can implement chunking by specifying a maxChunkSize when initializing the plugin, as in:

$('#file-select').fileupload({
    url: '/uploads/',
    sequentialUploads: true,
    done: function (e, data) {
        console.log("uploaded: " + data.files[0].name)
    },
    maxChunkSize: 1000000 // 1 MB
});

Now the client will send multiple requests when uploading large files. And your server-side code can use the Content-Range header to patch the original large file back together. For a Flask application, the view might look something like:

# Upload files
@app.route('/uploads/', methods=['POST'])
def results():

    files = request.files

    # assuming only one file is passed in the request
    key = files.keys()[0]
    value = files[key] # this is a Werkzeug FileStorage object
    filename = value.filename

    if 'Content-Range' in request.headers:
        # extract starting byte from Content-Range header string
        range_str = request.headers['Content-Range']
        start_bytes = int(range_str.split(' ')[1].split('-')[0])

        # append chunk to the file on disk, or create new
        with open(filename, 'a') as f:
            f.seek(start_bytes)
            f.write(value.stream.read())

    else:
        # this is not a chunked request, so just save the whole file
        value.save(filename)

    # send response with appropriate mime type header
    return jsonify({"name": value.filename,
                    "size": os.path.getsize(filename),
                    "url": 'uploads/' + value.filename,
                    "thumbnail_url": None,
                    "delete_url": None,
                    "delete_type": None,})

For your particular application, you will just have to make sure that the correct auth headers are still sent with each request.

Hope this helps! I was struggling with this problem for a while ;)