I'm trying to get my images thumbnailed and stored on s3 using django-storages, boto, and sorl-thumbnail. I have it working, but it's very slow, even with small images. I don't mind it being slow when I save the form and upload the images to s3, but I'd like it to display the image quickly after that.
The answer to this SO question explains that the thumbnail won't be created until first access, but that you can use get_thumbnail() to create it beforehand.
Django + S3 (boto) + Sorl Thumbnail: Suggestions for optimisation
I'm doing that, and now it seems that all entries into the thumbnail_kvstore table are created when uploading the image, rather than when it is displayed.
The problem is that the page displaying the image is still really slow. Looking at the logging panel in the debug toolbar, it looks like there is still lots of communication with s3. It seems like after the image and thumbnails are uploaded and cached, page should render quickly without communicating with s3.
What am I doing wrong? Thanks!
Update: weak hack seems to have gotten it working, but I'd love to know how to do this properly:
https://github.com/asciitaxi/sorl-thumbnail/commit/545cce3f5e719a91dd9cc21d78bb973b2211bbbf
Update: more information for @sorl
I'm working with 2 views:
ADD VIEW: In this view I submit the form to create the model with the image in it. The image is uploaded to s3. In a post_save signal, I call get_thumbnail() to generate the thumbnail before it's needed:
im = get_thumbnail(instance.image, '360x360')
DISPLAY VIEW: In this view I display the thumbnail generated in the add view:
{% thumbnail object.image "360x360" as im %}
<img src="{{ im.url }}" width="{{ im.width }}" height="{{ im.height }}">
{% endthumbnail %}
Without the patch:
ADD VIEW: creates 3 entries in the kvstore table, accesses the cache 10 times (6 sets, 4 gets), logging tab of debug toolbar says "establishing HTTP connection" 12 times
DISPLAY VIEW: still just 3 entries in the kvstore table, just 1 get from cache, but debug toolbar says "establishing HTTP connection" 3 times still
With only the change on line 122:
ADD VIEW: same as above, except the logging only says "establishing HTTP connection" 2 times DISPLAY VIEW: same as above, except the logging only says "establishing HTTP connection" 1 time
Also adding the change on line 118:
ADD VIEW: same as above, but now we are down to 2 "establishing HTTP connection" messages DISPLAY VIEW: same as above, with no logging messages at all
UPDATE: It looks like storage._setup() is called twice, and storage.url() is called once. Based on the timing, I'd say each one makes connections to s3:
1304711315.4
_setup
1304711317.84
1304711317.84
_setup
1304711320.3
1304711320.39
_url
1304711323.66
This seems to be reflected by the boto logging, which says "establishing HTTP connection" 3 times.
As the author of sorl thumbnail I am really interested in solving this if it is not working as I intended. If the key value sotre is populated it will currently store: name, storage and size. I have made the assumption that the url is based on the name and thus should not cause any storage calls. Looking at django storages, https://github.com/e-loue/django-storages/blob/master/storages/backends/s3boto.py#L214 it seems like a safe assumption to make. In your patch you have patched the read method for some reason. When creating a thumbnail a ImageFile instance is fetched from cache (if not create it) then you can of course call read which will read the file, but the intended use is .url which calls url on the storage with the cached name which inturn should be a non storage access op. Could you try to isolate your problem to exacly where in your code this storage access happends?
Also make sure you have THUMBNAIL_DEBUG on and that you have the key value store properly set up.