Access AWS S3 from Lambda within VPC

musingsole · Sep 29, 2016

Overall, I'm pretty confused by using AWS Lambda within a VPC. The problem is Lambda is timing out while trying to access an S3 bucket. The solution seems to be a VPC Endpoint.

I've added the Lambda function to a VPC so it can access an RDS-hosted database (not shown in the code below, but functional). However, now I can't access S3, and any attempt to do so times out.

I tried creating a VPC S3 Endpoint, but nothing has changed.

VPC Configuration

I'm using the simple VPC that was created by default when I first made an EC2 instance. It has four subnets, all created by default.

VPC Route Table

| Destination | Target | Status | Propagated |
| --- | --- | --- | --- |
| 172.31.0.0/16 | local | Active | No |
| pl-63a5400a (com.amazonaws.us-east-1.s3) | vpce-b44c8bdd | Active | No |
| 0.0.0.0/0 | igw-325e6a56 | Active | No |
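As a rough cross-check of a route table like this one, the sketch below scans a route table description for routes that send a prefix list to a VPC endpoint. It assumes the dictionary shape returned by boto3's `ec2.describe_route_tables` (keys such as `Routes`, `DestinationPrefixListId` and `GatewayId`); the helper name `find_endpoint_routes` and the sample data are hypothetical, with the IDs copied from the table in the question.

```python
def find_endpoint_routes(route_table):
    """Return the routes that point a prefix list at a VPC endpoint.

    `route_table` is assumed to look like one entry of the "RouteTables"
    list from ec2.describe_route_tables().
    """
    matches = []
    for route in route_table.get("Routes", []):
        prefix_list = route.get("DestinationPrefixListId", "")
        target = route.get("GatewayId", "")
        # A gateway endpoint route pairs a pl-... destination with a vpce-... target.
        if prefix_list.startswith("pl-") and target.startswith("vpce-"):
            matches.append(route)
    return matches


# Hypothetical sample mirroring the route table shown in the question.
SAMPLE_ROUTE_TABLE = {
    "Routes": [
        {"DestinationCidrBlock": "172.31.0.0/16", "GatewayId": "local", "State": "active"},
        {"DestinationPrefixListId": "pl-63a5400a", "GatewayId": "vpce-b44c8bdd", "State": "active"},
        {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-325e6a56", "State": "active"},
    ]
}

for route in find_endpoint_routes(SAMPLE_ROUTE_TABLE):
    print(route["DestinationPrefixListId"], "->", route["GatewayId"])
```

A match only says that the endpoint route exists. The check is meaningful only for the route table associated with the subnets the Lambda function is configured to use.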

Simple S3 Download Lambda:

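import boto3
import pymysql  # used for the RDS access, which is omitted here
from io import BytesIO

def lambda_handler(event, context):
    # download_fileobj needs a binary file-like object to write into
    s3_obj = BytesIO()

    # This is the call that hangs until the Lambda times out
    return boto3.resource('s3').Bucket('marineharvester').download_fileobj('Holding - Midsummer/sample', s3_obj)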

Answer

Geoff · Jun 11, 2017

With boto3, S3 URLs are virtual-hosted-style by default, and these require internet access to be resolved to region-specific URLs. This causes the Lambda function to hang until it times out.

To resolve this, pass a Config object when creating the client, which tells boto3 to create path-style S3 URLs instead:

import boto3
from botocore.config import Config

# Ask boto3 for path-style S3 URLs instead of virtual-hosted-style ones
client = boto3.client('s3', region_name='ap-southeast-2', config=Config(s3={'addressing_style': 'path'}))

Note that the region in the call must be the region to which you are deploying the Lambda function and the VPC Endpoint.

Then you will be able to use the pl-xxxxxx prefix list for the VPC Endpoint within the Lambda's security group, and still access S3.
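Applied to the handler from the question, the change might look like the sketch below. This is a minimal illustration, not a tested deployment: the helper names `build_s3_client` and `download_to_memory` are hypothetical, the bucket and key are taken from the question, and the region is assumed to come from the `AWS_REGION` environment variable that Lambda sets, with `us-east-1` as an assumed fallback.

```python
import io
import os


def build_s3_client(region_name=None):
    """Create an S3 client that uses path-style addressing."""
    # Imported here so the helper below can be exercised without boto3 installed.
    import boto3
    from botocore.config import Config

    # Assumption: fall back to us-east-1 when AWS_REGION is not set.
    region = region_name or os.environ.get("AWS_REGION", "us-east-1")
    return boto3.client(
        "s3",
        region_name=region,
        config=Config(s3={"addressing_style": "path"}),
    )


def download_to_memory(client, bucket, key):
    """Download an object into an in-memory binary buffer and rewind it."""
    buffer = io.BytesIO()
    client.download_fileobj(bucket, key, buffer)
    buffer.seek(0)
    return buffer


def lambda_handler(event, context):
    client = build_s3_client()
    body = download_to_memory(client, "marineharvester", "Holding - Midsummer/sample")
    return {"bytes_downloaded": len(body.getvalue())}
```

Passing the client into `download_to_memory` keeps the download logic separate from the client configuration, so the addressing style and region can be changed in one place.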

Here is a working CloudFormation script that demonstrates this. It creates an S3 bucket, a Lambda function (which puts records into the bucket) associated with a VPC containing only private subnets and the VPC Endpoint, and the necessary IAM roles.