Amazon S3 & Checksum

Florent · Dec 23, 2011 · Viewed 14.1k times

I am trying to verify the integrity of a file that was uploaded to a bucket, but I can't find any information about this.

In the file's headers there is an "ETag", but I don't think it is an MD5 checksum.

So, how can I check that the file I uploaded to Amazon S3 is the same as the one on my computer?

Thanks. :)

Answer

svetianov · Dec 24, 2011

If you are using the REST API to upload an object (up to 5 GB) in a single operation, you can add the Content-MD5 header to your PUT request. According to the S3 documentation for PUT, the Content-MD5 header is:

The base64 encoded 128-bit MD5 digest of the message (without the headers) according to RFC 1864. This header can be used as a message integrity check to verify that the data is the same data that was originally sent. Although it is optional, we recommend using the Content-MD5 mechanism as an end-to-end integrity check.

Check this answer on how to compute a base64 encoded 128-bit MD5 digest. If you are using s3curl, you can include the computed digest in your request headers using the --contentMd5 option.
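For illustration, here is a minimal Python sketch of computing that value for a local file. The function name `content_md5` and the `chunk_size` parameter are my own and not part of any S3 tooling; the only thing taken from the documentation is that the header carries the raw 128-bit digest in base64, not the usual hex string.

```python
import base64
import hashlib


def content_md5(path, chunk_size=1024 * 1024):
    """Return the base64-encoded 128-bit MD5 digest of the file at `path`.

    Hypothetical helper: the name and chunk size are illustrative only.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    # Encode the raw 16-byte digest, not the hex representation.
    return base64.b64encode(md5.digest()).decode("ascii")
```

The returned string is what you would send as the value of the Content-MD5 header. It is always 24 characters long, since 16 bytes encode to 24 base64 characters including padding.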

If the MD5 digest computed by Amazon upon upload completion does not match the MD5 digest you provided in the Content-MD5 header, Amazon will respond with a BadDigest error code.
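If you talk to the REST API directly, you could detect this case by looking at the error body. The sketch below assumes the usual S3 XML error document with a top-level `Error` element containing a `Code` child; the helper names `s3_error_code` and `is_bad_digest` are hypothetical, and how you obtain the response body depends on your HTTP client.

```python
import xml.etree.ElementTree as ET


def s3_error_code(body):
    """Return the text of <Code> from an S3 XML error body, or None.

    Hypothetical helper; `body` is the raw response body (bytes or str).
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not XML at all, so not an S3 error document we can interpret.
        return None
    if root.tag != "Error":
        return None
    return root.findtext("Code")


def is_bad_digest(body):
    """True if the error body reports a Content-MD5 mismatch."""
    return s3_error_code(body) == "BadDigest"
```

On a BadDigest response the object was not stored as sent, so the sensible reaction is to recompute the digest from the local file and retry the PUT.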

If you are using multipart upload, the Content-MD5 header serves as an integrity check for each part individually. Once the multipart upload is finalized, Amazon does not currently provide a way to verify the integrity of the assembled file.
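As a rough sketch of the per-part check, the following Python generator computes the Content-MD5 value for each part of a local file. The name `part_md5s` and the default `part_size` are assumptions for illustration (5 MB is the minimum size S3 accepts for every part except the last); it does not perform the upload itself.

```python
import base64
import hashlib


def part_md5s(path, part_size=5 * 1024 * 1024):
    """Yield (part_number, base64 MD5 digest) for each part of the file.

    Hypothetical helper: splits the file into fixed-size parts, numbered
    from 1 as in the multipart upload API.
    """
    with open(path, "rb") as f:
        part_number = 1
        while True:
            data = f.read(part_size)
            if not data:
                break
            digest = hashlib.md5(data).digest()
            yield part_number, base64.b64encode(digest).decode("ascii")
            part_number += 1
```

Each yielded digest would go in the Content-MD5 header of the corresponding part's upload request, so a corrupted part is rejected at the time it is sent.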