I'm trying to list objects in an Amazon S3 bucket in Python using boto3.
It seems boto3 has two functions for listing the objects in a bucket: list_objects() and list_objects_v2().
What is the difference between the two, and what is the benefit of using one over the other?
Comparison side by side.
list_objects():
    response = client.list_objects(
        Bucket='string',
        Delimiter='string',
        EncodingType='url',
        # Marker: the key to start listing after, used to fetch the next page
        Marker='string',
        MaxKeys=123,
        Prefix='string'
    )
list_objects_v2():
    response = client.list_objects_v2(
        Bucket='string',
        Delimiter='string',
        EncodingType='url',
        MaxKeys=123,
        Prefix='string',
        # Replaces Marker for fetching the next page
        ContinuationToken='string',
        # Set to True to fetch key owner info. Default is False.
        FetchOwner=True|False,
        # This is similar to the Marker in list_objects()
        StartAfter='string'
    )
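For contrast, here is a minimal sketch of paging with list_objects() and Marker. The helper name iter_keys_v1 is hypothetical, and it assumes no Delimiter is passed, in which case S3 does not return NextMarker and the last key of each page has to be carried forward by hand.

```python
from typing import Any, Iterator


def iter_keys_v1(client: Any, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every key under `prefix` using list_objects() and Marker.

    `client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        response = client.list_objects(**kwargs)
        # "Contents" is absent when the page holds no keys.
        contents = response.get("Contents", [])
        for obj in contents:
            yield obj["Key"]
        if not response.get("IsTruncated") or not contents:
            break
        # Without a Delimiter there is no NextMarker, so the last key
        # returned becomes the Marker for the next request.
        kwargs["Marker"] = contents[-1]["Key"]
```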
Added features: because each listing call returns at most 1,000 keys, using Marker to page through a large bucket can be a headache. You have to keep track of the last key you successfully processed and pass it back as Marker. With ContinuationToken, you don't need to know the last key; you just check whether NextContinuationToken exists in the response and pass it to the next call. Each page of up to 1,000 keys can then be handed to a separate worker process while the listing loop fetches the next page, with no last-key bookkeeping.
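The ContinuationToken loop can be sketched roughly as follows. The helper name iter_keys is hypothetical, and `client` is assumed to be a boto3 S3 client created elsewhere with boto3.client("s3").

```python
from typing import Any, Iterator


def iter_keys(client: Any, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every key under `prefix`, following NextContinuationToken."""
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        response = client.list_objects_v2(**kwargs)
        # "Contents" is absent when the page holds no keys.
        for obj in response.get("Contents", []):
            yield obj["Key"]
        # The token is only present when more pages remain.
        token = response.get("NextContinuationToken")
        if token is None:
            break
        kwargs["ContinuationToken"] = token
```

If you would rather not write the loop yourself, boto3 also provides client.get_paginator("list_objects_v2"), whose paginate() call handles the token for you.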