Terraform: copy/upload files to an AWS EC2 instance

pratik · May 30, 2020

We have a cron job and a shell script that we want to copy or upload to an AWS EC2 instance while creating the instance with Terraform.

We tried:

  1. The file provisioner, but it is not working, and I read that this option does not work with all Terraform versions:
      provisioner "file" {
        source      = "abc.sh"
        destination = "/home/ec2-user/basic2.sh"
      }
  2. The template_file data source:
    data "template_file" "userdata_line" {
      template = <<EOF
    #!/bin/bash
    mkdir /home/ec2-user/files2
    cd /home/ec2-user/files2
    sudo touch basic2.sh
    sudo chmod 777 basic2.sh
    base64 basic.sh |base64 -d >basic2.sh
    EOF
    }

I tried all of these options, but none of them works.
Could you please help or advise?
I am new to Terraform, so I have been struggling with this for a long time.

Answer

Martin Atkins · May 30, 2020

When starting from an AMI that has cloud-init installed (which is common in many official Linux distribution images), we can use cloud-init's write_files module to place arbitrary files into the filesystem, as long as they are small enough to fit within the constraints of the user_data argument along with all of the other cloud-init data.

As with all cloud-init modules, we configure write_files using cloud-init's YAML-based configuration format, which begins with the special marker string #cloud-config on a line of its own, followed by a YAML data structure. Because JSON is a subset of YAML, we can use Terraform's jsonencode to produce a valid value[1].

locals {
  cloud_config_config = <<-END
    #cloud-config
    ${jsonencode({
      write_files = [
        {
          path        = "/etc/example.txt"
          permissions = "0644"
          owner       = "root:root"
          encoding    = "b64"
          content     = filebase64("${path.module}/example.txt")
        },
      ]
    })}
  END
}

The write_files module can accept data in base64 format when we set encoding = "b64", so we use that in conjunction with Terraform's filebase64 function to include the contents of an external file. Other approaches are possible here, such as producing a string dynamically using Terraform templates and using base64encode to encode it as the file contents.
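The template-based alternative mentioned above could look something like the following minimal sketch. The template file name example.txt.tftpl, the greeting variable, and the local value names are hypothetical and used only for illustration:

```hcl
locals {
  # Render a template to a string first. The file name and the "greeting"
  # variable are placeholders, not part of the original example.
  example_rendered = templatefile("${path.module}/example.txt.tftpl", {
    greeting = "Hello World"
  })

  # Same structure as before, but the content comes from base64encode applied
  # to the rendered string instead of filebase64 applied to a file path.
  cloud_config_dynamic = <<-END
    #cloud-config
    ${jsonencode({
      write_files = [
        {
          path        = "/etc/example.txt"
          permissions = "0644"
          owner       = "root:root"
          encoding    = "b64"
          content     = base64encode(local.example_rendered)
        },
      ]
    })}
  END
}
```

Inside example.txt.tftpl, the variable would be referenced as ${greeting}.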

If you can express everything you want cloud-init to do in a single configuration file like the above, then you can assign local.cloud_config_config directly as your instance user_data, and cloud-init should recognize and process it on system boot:

  user_data = local.cloud_config_config
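In context, that assignment sits inside the instance resource. A minimal sketch, assuming a hypothetical ami_id variable and an arbitrary instance type, neither of which comes from the original question:

```hcl
variable "ami_id" {
  type        = string
  description = "ID of an AMI that has cloud-init installed (placeholder)."
}

resource "aws_instance" "example" {
  ami           = var.ami_id
  instance_type = "t3.micro" # assumed size; pick whatever fits your workload

  # Terraform passes this string to EC2 unchanged; cloud-init reads it on boot.
  user_data = local.cloud_config_config
}
```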

If you instead need to combine creating the file with some other actions, like running a shell script, you can use cloud-init's multipart archive format to encode multiple "files" for cloud-init to process. Terraform has a cloudinit provider that contains a data source for easily constructing a multipart archive for cloud-init:

data "cloudinit_config" "example" {
  gzip          = false
  base64_encode = false

  part {
    content_type = "text/cloud-config"
    filename     = "cloud-config.yaml"
    content      = local.cloud_config_config
  }

  part {
    content_type = "text/x-shellscript"
    filename     = "example.sh"
    content      = <<-EOT
      #!/bin/bash
      echo "Hello World"
    EOT
  }
}

This data source will produce a single string at data.cloudinit_config.example.rendered which is a multipart archive suitable for use as user_data for cloud-init:

  user_data = data.cloudinit_config.example.rendered
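Because cloudinit is a separate provider from aws, it can help to declare it explicitly so the dependency is visible in the configuration. A minimal sketch, with no version constraints chosen (adding them is left as an assumption for you to fill in):

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
    # Provides the cloudinit_config data source used above.
    cloudinit = {
      source = "hashicorp/cloudinit"
    }
  }
}
```

After adding a provider, run terraform init again so that Terraform installs it.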

EC2 limits user data to 16 kilobytes in raw form, before it is base64-encoded, so all of the data together must fit within that limit. If you need to place a large file that comes close to or exceeds that limit, it would probably be best to use another intermediate system to transfer that file, such as having Terraform write the file into an Amazon S3 bucket and having the software in your instance retrieve that data using instance profile credentials. That shouldn't be necessary for small data files used for system configuration, though.
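The S3 route could be sketched roughly as follows. This is a standalone illustration under several assumptions: the bucket already exists, the AMI ships with the AWS CLI, the AWS provider is recent enough to offer aws_s3_object (older versions call it aws_s3_bucket_object), and every name, key, and path shown is a placeholder:

```hcl
variable "bucket_name" {
  type        = string
  description = "Name of an existing S3 bucket (placeholder)."
}

variable "ami_id" {
  type        = string
  description = "AMI with cloud-init and the AWS CLI installed (placeholder)."
}

locals {
  object_key = "files/large-file.bin" # hypothetical object key
}

# Upload the file from the module directory to S3.
resource "aws_s3_object" "large_file" {
  bucket = var.bucket_name
  key    = local.object_key
  source = "${path.module}/large-file.bin"
}

# Role the instance assumes so it can read the object without static credentials.
resource "aws_iam_role" "instance" {
  name = "example-instance-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

# Allow reading only the one uploaded object.
resource "aws_iam_role_policy" "read_file" {
  name = "read-large-file"
  role = aws_iam_role.instance.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "s3:GetObject"
      Resource = "arn:aws:s3:::${var.bucket_name}/${local.object_key}"
    }]
  })
}

resource "aws_iam_instance_profile" "instance" {
  name = "example-instance-profile"
  role = aws_iam_role.instance.name
}

resource "aws_instance" "from_s3" {
  ami                  = var.ami_id
  instance_type        = "t3.micro"
  iam_instance_profile = aws_iam_instance_profile.instance.name

  # Only this short script counts against the user-data limit. Referencing
  # the object's attributes makes Terraform upload it before the instance.
  user_data = <<-EOT
    #!/bin/bash
    aws s3 cp "s3://${aws_s3_object.large_file.bucket}/${aws_s3_object.large_file.key}" /opt/large-file.bin
  EOT
}
```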

It's important to note that from the perspective of Terraform and EC2 the content of user_data is just an arbitrary string. Any issues in processing the string must be debugged within the target operating system itself, by reading the cloud-init logs to see how it interpreted the configuration and what happened when it tried to take those actions.


[1]: We could also potentially use yamlencode, but at the time I write this that function has a warning that its exact formatting may change in future Terraform versions, and that's undesirable for user_data because it would cause the instance to be replaced. If you are reading this in the future and that warning is no longer present in the yamlencode docs, consider using yamlencode instead.