How to read an image file from an S3 bucket directly into memory?

Dims · May 18, 2017

I have the following code

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import boto3
s3 = boto3.resource('s3', region_name='us-east-2')
bucket = s3.Bucket('sentinel-s2-l1c')
object = bucket.Object('tiles/10/S/DG/2015/12/7/0/B01.jp2')
object.download_file('B01.jp2')
img=mpimg.imread('B01.jp2')
imgplot = plt.imshow(img)
plt.show(imgplot)

and it works. The problem is that it downloads the file into the current directory first. Is it possible to read the file and decode it as an image directly in RAM?

Answer

Greg Merritt · Dec 20, 2017

I would suggest using the io module to read the file directly into memory, without using a temporary file at all.

For example:

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import boto3
import io

s3 = boto3.resource('s3', region_name='us-east-2')
bucket = s3.Bucket('sentinel-s2-l1c')
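obj = bucket.Object('tiles/10/S/DG/2015/12/7/0/B01.jp2')

# download_fileobj writes bytes, so the buffer must be binary (BytesIO, not StringIO)
file_stream = io.BytesIO()
obj.download_fileobj(file_stream)
file_stream.seek(0)  # rewind so the decoder reads from the start
# The buffer has no file name, so name the format; non-PNG decoding goes through Pillow
img = mpimg.imread(file_stream, format='jp2')
# whatever you need to do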

Note that the buffer has to be an io.BytesIO, not an io.StringIO: download_fileobj writes bytes, and image data is always binary.
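
The difference between a binary and a text buffer can be checked offline, without AWS credentials. Below is a minimal sketch: `fetch_to_memory` and `FakeObject` are hypothetical names introduced only for illustration, and `FakeObject` merely imitates the `download_fileobj` method of a boto3 Object by writing bytes into whatever file-like object it is given.

```python
import io


def fetch_to_memory(s3_object):
    """Download an object into a rewound in-memory binary buffer.

    Hypothetical helper: `s3_object` is anything exposing
    download_fileobj(fileobj), such as a boto3 S3 Object.
    """
    buffer = io.BytesIO()
    s3_object.download_fileobj(buffer)
    buffer.seek(0)  # rewind so a decoder starts at the first byte
    return buffer


class FakeObject:
    """Hypothetical stand-in for a boto3 Object so the sketch runs offline."""

    def __init__(self, payload):
        self._payload = payload

    def download_fileobj(self, fileobj):
        # Like the real call, this writes bytes, not str.
        fileobj.write(self._payload)


fake = FakeObject(b"\x89PNG\r\n\x1a\n")  # the 8-byte PNG signature as sample data

buf = fetch_to_memory(fake)
print(len(buf.getvalue()), "bytes held in memory")

# A text buffer rejects bytes, which is why StringIO cannot be used here.
try:
    fake.download_fileobj(io.StringIO())
except TypeError as exc:
    print("StringIO failed:", exc)
```

With real boto3, the same helper should accept bucket.Object(...) in place of the fake, and the returned buffer can then be handed to mpimg.imread(buffer, format='jp2').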