Resize images on the fly in CloudFront and get them in the same URL instantly: AWS CloudFront -> S3 -> Lambda -> CloudFront

katericata · Feb 17, 2017

TLDR: We have to work around CloudFront's caching of 307 redirects by creating a new cache behavior for responses coming from our Lambda function.

You will not believe how close we are to achieving this. We are badly stuck on the last step.

Business case:

Our application stores images in S3 and serves them through CloudFront to avoid geographic slowdowns around the globe. Now we want to be really flexible with the design and be able to request new image dimensions directly in the CloudFront URL. Each new image size will be created on demand and then stored in S3, so the second time it is requested it will be served quickly: it will already exist in S3 and will also be cached in CloudFront.

Let's say the user has uploaded the image chucknorris.jpg. Only the original image will be stored in S3 and will be served on our page like this:

//xxxxx.cloudfront.net/chucknorris.jpg

We have calculated that we now need to display a thumbnail of 200x200 pixels. Therefore we set the image src in our template to:

//xxxxx.cloudfront.net/chucknorris-200x200.jpg

When this new size is requested, AWS has to create it on the fly in the same bucket under the requested key. This way the image is loaded directly from the same CloudFront URL.
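To make the key convention concrete, here is a minimal sketch of how a requested key such as chucknorris-200x200.jpg could be split back into the original key and the target size. The function name, the ResizeRequest type, and the exact "<name>-<width>x<height>.<ext>" scheme are assumptions for illustration; the real Lambda may parse the key differently.

```python
import re
from typing import NamedTuple, Optional

# Assumed naming scheme (not confirmed by the post): "<name>-<width>x<height>.<ext>"
RESIZED_KEY = re.compile(
    r"^(?P<stem>.+)-(?P<width>\d+)x(?P<height>\d+)\.(?P<ext>[^./]+)$"
)


class ResizeRequest(NamedTuple):
    """Hypothetical container for the parsed request."""
    original_key: str
    width: int
    height: int


def parse_resized_key(key: str) -> Optional[ResizeRequest]:
    """Return the original key and target size, or None for a non-resized key."""
    match = RESIZED_KEY.match(key)
    if match is None:
        return None
    # Rebuild the key of the original upload by dropping the "-WxH" suffix.
    original_key = f"{match['stem']}.{match['ext']}"
    return ResizeRequest(original_key, int(match["width"]), int(match["height"]))
```

With this scheme, a key without a size suffix (the original upload) yields None, so the caller can tell "serve as is" apart from "resize first".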

I made an ugly drawing with the architecture overview and the workflow on how we are doing this in AWS:

[Image: architecture overview and workflow diagram]

Here is how the Python Lambda function ends:

return {
    'statusCode': '301',
    'headers': {'location': redirect_url},
    'body': ''
}

The problem:

If we make the Lambda function redirect to S3, it works like a charm. If we redirect to CloudFront, it goes into a redirect loop because CloudFront caches 307 (as well as 301, 302 and 303). As soon as our Lambda function redirects to CloudFront, CloudFront calls the API Gateway URL instead of fetching the image from S3:

[Image: illustration of the redirect loop described above]

I would like to create a new cache behavior in CloudFront's Behaviors settings tab. This behavior should not cache responses from Lambda or S3 (I don't know exactly what happens internally there), but should still cache any subsequent requests for this very same resized image. I am trying to set the path pattern -\d+x\d+\..+$, add the ARN of the Lambda function under "Lambda Function Association", and set the Event Type to Origin Response. In addition, I am setting the "Default TTL" to 0.

But I cannot save the behavior due to some error:

[Image: screenshot of the error]

Are we on the right track, or is the idea of this "Lambda Function Association" totally different?

Answer

katericata · Feb 19, 2017

Finally I was able to solve it. Although this is not really a structural solution, it does what we need.

First, thanks to Michael's answer, I used path patterns to match all media types. Second, the Cache Behavior page was a bit misleading to me: the Lambda association is indeed for Lambda@Edge, although none of the tooltips on the cache behavior page say so; all you see is just "Lambda". This feature cannot help us, as we do not want to extend our AWS service scope with Lambda@Edge just because of this particular problem.

Here is the solution approach:
I have defined multiple cache behaviors, one per media type that we support:

[Image: screenshot of the cache behaviors, one per media type]

For each cache behavior I set the Default TTL to be 0.
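For anyone wondering why wildcard path patterns are used here rather than a regular expression: as far as I know, CloudFront path patterns are case-sensitive and support only the * (zero or more characters) and ? (exactly one character) wildcards, not regex syntax. The sketch below only emulates that matching locally so you can test patterns before saving a behavior. The patterns in MEDIA_PATTERNS are hypothetical, since the screenshot of the real ones is not reproduced here, and the function names are my own.

```python
import re
from typing import Optional, Sequence


def pattern_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a CloudFront-style wildcard pattern into a compiled regex."""
    parts = []
    for char in path_pattern.lstrip("/"):  # a leading slash is treated as optional
        if char == "*":
            parts.append(".*")   # zero or more characters
        elif char == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(char))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)


def first_matching_behavior(path: str, patterns: Sequence[str]) -> Optional[str]:
    """Return the first pattern (in listed order) that matches the whole path."""
    candidate = path.lstrip("/")
    for pattern in patterns:
        if pattern_to_regex(pattern).fullmatch(candidate):
            return pattern
    return None  # in CloudFront the default (*) behavior would apply instead


# Hypothetical patterns, one per supported media type.
MEDIA_PATTERNS = ["*.jpg", "*.jpeg", "*.png", "*.gif"]
```

Under these rules a pattern like -\d+x\d+\..+$ is read literally, so it would not match /chucknorris-200x200.jpg, while *.jpg does.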

And the most important part: In the Lambda function, I have added a Cache-Control header to the resized images when putting them in S3:

s3_resource.Bucket(BUCKET).put_object(Key=new_key, 
                                      Body=edited_image_obj,
                                      CacheControl='max-age=12312312',
                                      ContentType=content_type)
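Why the combination of Default TTL 0 and a Cache-Control header works can be sketched with a simplified model of how CloudFront picks a cache lifetime: with no Cache-Control header from the origin, the Default TTL applies, and with max-age, the value is clamped between the Minimum and Maximum TTL. This is only an approximation under stated assumptions: it ignores s-maxage, Expires, no-cache, no-store and private, the function names are my own, and the max_ttl default of one year is an assumption about the behavior's setting.

```python
import re
from typing import Optional


def max_age_seconds(cache_control: Optional[str]) -> Optional[int]:
    """Extract max-age from a Cache-Control header value, if present."""
    if not cache_control:
        return None
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def effective_ttl(cache_control: Optional[str],
                  min_ttl: int = 0,
                  default_ttl: int = 0,
                  max_ttl: int = 31536000) -> int:
    """Approximate how long (in seconds) CloudFront would cache a response."""
    max_age = max_age_seconds(cache_control)
    if max_age is None:
        # No max-age from the origin: the behavior's Default TTL applies.
        return max(min_ttl, default_ttl)
    # max-age present: clamp it between Minimum TTL and Maximum TTL.
    return max(min_ttl, min(max_age, max_ttl))
```

In this model the redirect returned by the Lambda function, which carries only a location header, gets a TTL of 0, while the resized image stored with max-age=12312312 is cached for that long.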

To validate that everything works, I can now see that the new image dimension is served with the cache header in CloudFront:

[Image: screenshot of the response showing the cache header]