I need to set Cache-Control headers for an entire S3 bucket, covering both existing and future files, and was hoping to do it in a bucket policy. I know I can edit the existing ones, and I know how to specify them on PUT if I upload the files myself. Unfortunately, the app that uploads them cannot set the headers because it uses s3fs to copy the files there.
There are now three ways to get this done: via the AWS Console, via the aws-cli command line tool, or via the s3cmd command line tool.
Using the AWS Console is now the recommended solution. It is straightforward, but it can take some time.
(thanks to @biplob - please give him some love below)
Originally, when I created this, bucket policies were a no go, so I figured out how to do it using aws-cli, and it is pretty slick. When researching, I couldn't find any examples in the wild, so I thought I would post some of my solutions to help those in need.
NOTE: By default, aws-cli only copies a file's current metadata, EVEN IF YOU SPECIFY NEW METADATA.
To use the metadata that is specified on the command line, you need to add the '--metadata-directive REPLACE' flag. Here are some examples.
For a single file:
aws s3 cp s3://mybucket/file.txt s3://mybucket/file.txt --metadata-directive REPLACE \
--expires 2034-01-01T00:00:00Z --acl public-read --cache-control max-age=2592000,public
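To check that the new header was actually stored, one option is to read the object's metadata back with `aws s3api head-object`. This is only a minimal sketch, assuming aws-cli is installed and configured with read access; `show_cache_control` is a hypothetical helper name, and the bucket and key are the same placeholders as above.

```shell
# Hypothetical helper: print the Cache-Control value stored on one object.
#   $1 = bucket name, $2 = object key
# Assumes aws-cli is installed and configured with read access to the bucket.
show_cache_control() {
  aws s3api head-object --bucket "$1" --key "$2" \
    --query 'CacheControl' --output text
}

# Usage (placeholders from the example above):
#   show_cache_control mybucket file.txt
```

If the copy worked, this should print the value that was passed to --cache-control.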
For an entire bucket (note --recursive flag):
aws s3 cp s3://mybucket/ s3://mybucket/ --recursive --metadata-directive REPLACE \
--expires 2034-01-01T00:00:00Z --acl public-read --cache-control max-age=2592000,public
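Since the recursive copy touches every object in the bucket, it may be worth previewing it first. `aws s3 cp` accepts a `--dryrun` flag that lists the operations it would perform without running them. A rough sketch, where `preview_recache` is a hypothetical helper name and the header values are simply the ones from the example above:

```shell
# Hypothetical helper: preview the bucket-wide in-place copy without changing anything.
#   $1 = bucket name
# --dryrun makes `aws s3 cp` list the operations instead of executing them.
preview_recache() {
  aws s3 cp "s3://$1/" "s3://$1/" --recursive --dryrun \
    --metadata-directive REPLACE \
    --expires 2034-01-01T00:00:00Z --acl public-read \
    --cache-control max-age=2592000,public
}

# Usage:
#   preview_recache mybucket
```

Once the listed objects look right, run the same command without `--dryrun`.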
A little gotcha I found: if you only want to apply it to a specific file type, you need to exclude all the files, then include the ones you want.
Only jpgs and pngs:
aws s3 cp s3://mybucket/ s3://mybucket/ --exclude "*" --include "*.jpg" --include "*.png" \
--recursive --metadata-directive REPLACE --expires 2034-01-01T00:00:00Z --acl public-read \
--cache-control max-age=2592000,public
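The exclude-then-include order matters because aws-cli gives precedence to filters that appear later in the command. The same pattern can be reused to give different file types different lifetimes. This is just a sketch: `set_cache_for_ext` is a hypothetical helper name, and the max-age values in the usage lines are illustrative, not recommendations.

```shell
# Hypothetical helper: apply one Cache-Control lifetime to a single file extension.
#   $1 = bucket name, $2 = extension without the dot, $3 = max-age in seconds
# Excludes everything first, then re-includes only the wanted extension.
set_cache_for_ext() {
  aws s3 cp "s3://$1/" "s3://$1/" --recursive \
    --exclude "*" --include "*.$2" \
    --metadata-directive REPLACE --acl public-read \
    --cache-control "max-age=$3,public"
}

# Usage (illustrative values):
#   set_cache_for_ext mybucket jpg 2592000
#   set_cache_for_ext mybucket css 86400
```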
Here are some links to the manual if you need more info:
Known Issues:
"Unknown options: --metadata-directive, REPLACE"
This can be caused by an out-of-date awscli; see @eliotRosewater's answer below.
S3cmd is a "Command line tool for managing Amazon S3 and CloudFront services". While this solution requires a git pull, it might be a simpler and more comprehensive solution.
For full instructions, see @ashishyadaveee11's post below
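As a rough idea of what the s3cmd route looks like, here is a sketch that assumes an s3cmd version providing the `modify` command along with the `--add-header` and `--recursive` options. `s3cmd_set_cache` is a hypothetical helper name, and the max-age in the usage line is only an example value.

```shell
# Hypothetical helper: set Cache-Control on every object in a bucket with s3cmd.
#   $1 = bucket name, $2 = max-age in seconds
# Assumes s3cmd is installed, configured, and new enough to have `modify`.
s3cmd_set_cache() {
  s3cmd modify --recursive \
    --add-header="Cache-Control:max-age=$2" "s3://$1/"
}

# Usage (illustrative value):
#   s3cmd_set_cache mybucket 86400
```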
Hope it helps!