How to make images hosted on Amazon S3 less public but not completely private?

Jay Godse · Mar 31, 2010

I fired up a sample application that uses Amazon S3 for image hosting, and I managed to coax it into working. The application is hosted at github.com. It lets you create users with a profile photo. When you upload the photo, the web application stores it on Amazon S3 instead of your local file system. (This is very important if you host at heroku.com.)

However, when I did a "view source" in the browser of the page I noticed that the URL of the picture was an Amazon S3 URL in the S3 bucket that I assigned to the app. I cut & pasted the URL and was able to view the picture in the same browser, and in another browser in which I had no open sessions to my web app or to Amazon S3.

Is there any way that I could restrict access to that URL (and image) so that it is accessible only to browsers that are logged into my applications?

Most of the information I found about Amazon ACLs only talks about access for the owner, for groups of users authenticated with Amazon or Amazon S3, or for everybody anonymously.

EDIT----UPDATE July 7, 2010

Amazon has just announced more ways to restrict access to S3 objects and buckets. Among other ways, you can now restrict access to an S3 object by qualifying the HTTP referrer. This looks interesting...I can't wait until they update their developer documents.

Answer

davidtbernal · Mar 31, 2010

For files where privacy actually matters, we handle this as follows:

  • Files are stored with a private ACL, meaning that only an authorized agent can download (or upload) them
  • To access a file, we link to http://myapp.com/download/{s3-path}, where download corresponds to a controller (in the MVC sense)
  • ACLs are implemented as appropriate so that only logged-in users can access that controller/action
  • That controller downloads the file using the API, then streams it out to the user with correct mime-type, cache headers, file size, etc.
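The steps above can be sketched roughly as follows. This is a minimal, framework-free illustration, not the code we actually run: the `download` function, the `session` dictionary with a `user_id` entry, and the `fetch_private` callable are all hypothetical names introduced for this example. In a real application `fetch_private` would wrap whatever S3 client call you use to read a private object, and the cache lifetime shown is an arbitrary assumption.

```python
import mimetypes
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical types for this sketch: a fetcher takes an S3 key and returns
# (total size in bytes, iterable of byte chunks), or None if the key is missing.
Fetcher = Callable[[str], Optional[Tuple[int, Iterable[bytes]]]]
Response = Tuple[int, Dict[str, str], Iterator[bytes]]


def download(session: Dict[str, object], s3_path: str, fetch_private: Fetcher) -> Response:
    """Serve a private object only to logged-in users, streaming it in chunks."""
    # The access check lives in the application, not in S3.
    if not session.get("user_id"):
        return 403, {"Content-Type": "text/plain"}, iter([b"Forbidden"])

    # The app's own credentials read the object; the browser never sees an S3 URL.
    found = fetch_private(s3_path)
    if found is None:
        return 404, {"Content-Type": "text/plain"}, iter([b"Not found"])

    size, chunks = found
    content_type = mimetypes.guess_type(s3_path)[0] or "application/octet-stream"
    headers = {
        "Content-Type": content_type,
        "Content-Length": str(size),
        # "private" keeps shared caches from storing a per-user response.
        # The one-hour lifetime is an assumption for illustration.
        "Cache-Control": "private, max-age=3600",
    }
    return 200, headers, iter(chunks)
```

Passing the fetcher in as an argument keeps the access check separate from the storage client, so the controller logic can be exercised with an in-memory stand-in before it is wired to S3.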

Using this method, you end up using a lot more bandwidth than you need, but you still save on storage. For us this works out, because we tend to run out of storage much more quickly than bandwidth.

For files where privacy only sort of matters, we generate a random hash that we use for the URL. This is basically security through obscurity, and you have to be careful that your hash is sufficiently difficult to guess.
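One possible way to generate such a URL component is sketched below. It assumes Python's standard `secrets` module as the source of randomness; the `obscure_key` function name and the choice of 32 random bytes are assumptions made for this example, not a description of our production code.

```python
import os
import secrets


def obscure_key(filename: str, nbytes: int = 32) -> str:
    """Return a hard-to-guess object key that keeps the original file extension."""
    # token_urlsafe draws from the OS's cryptographic random source,
    # unlike the predictable generators in the `random` module.
    token = secrets.token_urlsafe(nbytes)
    # Keep the extension so the object is still served with a sensible type.
    extension = os.path.splitext(filename)[1].lower()
    return token + extension
```

For example, `obscure_key("me.JPG")` returns a long random string ending in `.jpg`. The object itself still has to be publicly readable, so anyone who obtains the URL can fetch it; the randomness only makes the URL impractical to guess.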

> However, when I did a "view source" in the browser of the page I noticed that the URL of the picture was an Amazon S3 URL in the S3 bucket that I assigned to the app. I cut & pasted the URL and was able to view the picture in the same browser, and in another browser in which I had no open sessions to my web app or to Amazon S3.

Keep in mind that this is no different than any image stored elsewhere in your document root. You may or may not need the kind of security you're looking for.