AmazonS3 putObject with InputStream length example

JohnIdol · Dec 2, 2011 · Viewed 108.3k times

I am uploading a file to S3 using Java. This is what I have so far:

AmazonS3 s3 = new AmazonS3Client(new BasicAWSCredentials("XX","YY"));

List<Bucket> buckets = s3.listBuckets();

s3.putObject(new PutObjectRequest(buckets.get(0).getName(), fileName, stream, new ObjectMetadata()));

The file is uploaded, but a WARNING is logged because I am not setting the content length:

com.amazonaws.services.s3.AmazonS3Client putObject: No content length specified for stream data.  Stream contents will be buffered in memory and could result in out of memory errors.

This is a file I am uploading and the stream variable is an InputStream, from which I can get the byte array like this: IOUtils.toByteArray(stream).

So when I try to set the content length and MD5 (taken from here) like this:

// get MD5 base64 hash
MessageDigest messageDigest = MessageDigest.getInstance("MD5");
messageDigest.reset();
messageDigest.update(IOUtils.toByteArray(stream));
byte[] resultByte = messageDigest.digest();
String hashtext = new String(Hex.encodeHex(resultByte));

ObjectMetadata meta = new ObjectMetadata();
meta.setContentLength(IOUtils.toByteArray(stream).length);
meta.setContentMD5(hashtext);

It causes the following error to come back from S3:

The Content-MD5 you specified was invalid.

What am I doing wrong?

Any help appreciated!

P.S. I am on Google App Engine - I cannot write the file to disk or create a temp file because AppEngine does not support FileOutputStream.

Answer

user2413809 · May 24, 2013

The original question was never answered, and I ran into the same problem. The cause of the MD5 error is that S3 doesn't want the hex-encoded MD5 string we normally think of.

Instead, I had to do this:

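// DigestUtils and Base64 are presumably the Apache Commons Codec classes
// (org.apache.commons.codec.digest / org.apache.commons.codec.binary).
// content is a passed-in InputStream; md5() reads it to the end.
byte[] resultByte = DigestUtils.md5(content);
// Base64-encode the raw 16-byte digest, not its hex representation.
String streamMD5 = new String(Base64.encodeBase64(resultByte));
metaData.setContentMD5(streamMD5);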

Essentially, what S3 wants for the MD5 value is the Base64-encoded raw MD5 byte array, not the hex string. Once I switched to this, it worked for me.
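One more thing to watch for in the question's code: an InputStream can normally be read only once, so the second IOUtils.toByteArray(stream) call most likely returns an empty array and the content length ends up as 0. Below is a minimal sketch of reading the stream a single time and deriving both values from the same bytes. It assumes Java 8+ (for java.util.Base64) and uses only the JDK; S3UploadPrep, Prepared, prepare and newStream are hypothetical names for illustration, not part of the AWS SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper (not part of the AWS SDK): buffers a stream once and
// derives the content length and Base64 MD5 from the same bytes.
final class S3UploadPrep {

    /** Buffered payload plus the two metadata values discussed above. */
    static final class Prepared {
        final byte[] bytes;
        final long contentLength;
        final String contentMd5Base64;

        Prepared(byte[] bytes, String contentMd5Base64) {
            this.bytes = bytes;
            this.contentLength = bytes.length;
            this.contentMd5Base64 = contentMd5Base64;
        }

        /** Returns a fresh stream over the buffered bytes for the upload call. */
        InputStream newStream() {
            return new ByteArrayInputStream(bytes);
        }
    }

    /** Reads the stream exactly once, then hashes the buffered bytes. */
    static Prepared prepare(InputStream in) {
        byte[] bytes = readFully(in);
        try {
            // digest() yields the raw 16-byte MD5; Base64-encode it, not hex.
            byte[] digest = MessageDigest.getInstance("MD5").digest(bytes);
            return new Prepared(bytes, Base64.getEncoder().encodeToString(digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Copies the whole stream into a byte array. */
    private static byte[] readFully(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With the names from the question, this would be used roughly as: Prepared p = S3UploadPrep.prepare(stream); meta.setContentLength(p.contentLength); meta.setContentMD5(p.contentMd5Base64); and then p.newStream() is passed to the PutObjectRequest in place of the already-consumed stream. The whole file is still held in memory, which seems unavoidable when temp files are not an option, but the length and MD5 now describe the bytes that are actually sent.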