Upload object to AWS S3 without creating a file using aws-sdk-go

Eric Reis Figueiredo picture Eric Reis Figueiredo · Dec 3, 2017 · Viewed 7.7k times · Source

I am trying to upload an object to AWS S3 using golang sdk without needing to create a file in my system (trying to upload only the string). But I am having difficulties to accomplish that. Can anybody give me an example of how can I upload to AWS S3 without needing to create a file?

AWS Example of how to upload a file:

// Creates a S3 Bucket in the region configured in the shared config
// or AWS_REGION environment variable.
//
// Usage:
//    go run s3_upload_object.go BUCKET_NAME FILENAME
func main() {
    if len(os.Args) != 3 {
        exitErrorf("bucket and file name required\nUsage: %s bucket_name filename",
            os.Args[0])
    }

    bucket := os.Args[1]
    filename := os.Args[2]

    file, err := os.Open(filename)
    if err != nil {
        exitErrorf("Unable to open file %q, %v", err)
    }

    defer file.Close()

    // Initialize a session in us-west-2 that the SDK will use to load
    // credentials from the shared credentials file ~/.aws/credentials.
    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-west-2")},
    )

    // Setup the S3 Upload Manager. Also see the SDK doc for the Upload Manager
    // for more information on configuring part size, and concurrency.
    //
    // http://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/#NewUploader
    uploader := s3manager.NewUploader(sess)

    // Upload the file's body to S3 bucket as an object with the key being the
    // same as the filename.
    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(bucket),

        // Can also use the `filepath` standard library package to modify the
        // filename as need for an S3 object key. Such as turning absolute path
        // to a relative path.
        Key: aws.String(filename),

        // The file to be uploaded. io.ReadSeeker is preferred as the Uploader
        // will be able to optimize memory when uploading large content. io.Reader
        // is supported, but will require buffering of the reader's bytes for
        // each part.
        Body: file,
    })
    if err != nil {
        // Print the error and exit.
        exitErrorf("Unable to upload %q to %q, %v", filename, bucket, err)
    }

    fmt.Printf("Successfully uploaded %q to %q\n", filename, bucket)
}

I already tried to create the file programmatically but it is creating the file on my system and then uploading it to S3.

Answer

Eric Reis Figueiredo picture Eric Reis Figueiredo · Sep 6, 2018

In this answer, I will post all the things that worked for me that related to this question. Many thanks to @ThunderCat and @Flimzy that alerted me that the body parameter of the upload request was already an io.Reader. I will post some sample codes commenting on what I've learned from this question and how it helped me solve this problem. Perhaps this will help others like me and @AlokKumarSingh.

Case 1: You already have the data in memory (e.g. receiving data from a streaming/messaging service like Kafka, Kinesis or SQS)

func main() {
    if len(os.Args) != 3 {
        fmt.Printf(
            "bucket and file name required\nUsage: %s bucket_name filename",
            os.Args[0],
        )
    }

    bucket := os.Args[1]
    filename := os.Args[2]

    // this is your data that you have in memory
    // in this example it is hard coded but it may come from very distinct
    // sources, like streaming services for example.
    data := "Hello, world!"

    // create a reader from data data in memory
    reader := strings.NewReader(data)

    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-east-1")},
    )
    uploader := s3manager.NewUploader(sess)

    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(bucket),
        Key: aws.String(filename),
        // here you pass your reader
        // the aws sdk will manage all the memory and file reading for you
        Body: reader,
    })
    if err != nil {.
        fmt.Printf("Unable to upload %q to %q, %v", filename, bucket, err)
    }

    fmt.Printf("Successfully uploaded %q to %q\n", filename, bucket)
}

Case 2: You already have a persisted file and you want to upload it but you dont want to maintain the whole file in memory:

func main() {
    if len(os.Args) != 3 {
        fmt.Printf(
            "bucket and file name required\nUsage: %s bucket_name filename",
            os.Args[0],
        )
    }

    bucket := os.Args[1]
    filename := os.Args[2]

    // open your file
    // the trick here is that the method os.Open just returns for you a reader
    // for the desired file, so you will not maintain the whole file in memory.
    // I know this might sound obvious, but for a starter (as I was at the time
    // of the question) it is not.
    fileReader, err := os.Open(filename)
    if err != nil {
        fmt.Printf("Unable to open file %q, %v", err)
    }
    defer fileReader.Close()

    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-east-1")},
    )
    uploader := s3manager.NewUploader(sess)

    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(filename),
        // here you pass your reader
        // the aws sdk will manage all the memory and file reading for you
        Body: fileReader,
    })
    if err != nil {
        fmt.Printf("Unable to upload %q to %q, %v", filename, bucket, err)
    }

    fmt.Printf("Successfully uploaded %q to %q\n", filename, bucket)
}

Case 3: This is how I implemented it on the final version of my system, but to understand why I did it I must give you some background.

My use case evolved a bit. The upload code was going to be a function in Lambda and the files turned out to be huge. What do this changes mean: If I uploaded the file via an entry point in API Gateway attached to a Lambda function I would have to wait for the whole file to complete the upload in Lambda. Since lambda is priced by the duration and memory usage of the invocation, this could be a really big problem.

So, to solve this problem I used a pre-signed post URL for the upload. How does this affect the architecture/workflow?

Instead of upload to S3 from my backend code, I just create and authenticate a URL for posting the object to S3 in the backend and send this URL to the frontend. With that, I just implemented a multipart upload to that URL. I know that this is a lot more specific than the question, but it wasn't easy to discover this solution, so I think it would be a good idea to document it here for others.

Here is a sample of how to create that pre-signed URL in nodejs.

const AWS = require('aws-sdk');

module.exports.upload = async (event, context, callback) => {

  const s3 = new AWS.S3({ signatureVersion: 'v4' });
  const body = JSON.parse(event.body);

  const params = {
    Bucket: process.env.FILES_BUCKET_NAME,
    Fields: {
      key: body.filename,
    },
    Expires: 60 * 60
  }

  let promise = new Promise((resolve, reject) => {
    s3.createPresignedPost(params, (err, data) => {
      if (err) {
        reject(err);
      } else {
        resolve(data);
      }
    });
  })

  return await promise
    .then((data) => {
      return {
        statusCode: 200,
        body: JSON.stringify({
          message: 'Successfully created a pre-signed post url.',
          data: data,
        })
      }
    })
    .catch((err) => {
      return {
        statusCode: 400,
        body: JSON.stringify({
          message: 'An error occurred while trying to create a pre-signed post url',
          error: err,
        })
      }
    });
};

If you want to use go it's the same idea, you just have to change de sdk.