S3 Object Expiration using boto

user2005798 · asked Feb 20, 2013 · Viewed 10.6k times

I was trying to figure out a way to clean up my S3 bucket: I want to delete all keys that are older than X days (in my case, X is 30 days).

I couldn't find a way to get S3 to delete the objects.

I used the following approaches, neither of which worked (by "worked" I mean that I tried getting the object after X days and S3 was still serving it; I was expecting an "Object not found" or "Object expired" response).

Approach 1:

    from datetime import datetime, timedelta
    from boto.s3.key import Key

    # Upload the file with an HTTP Expires header 10 seconds in the future
    k = Key(bucket)
    k.key = my_key_name
    expires = datetime.utcnow() + timedelta(seconds=10)
    expires = expires.strftime("%a, %d %b %Y %H:%M:%S GMT")
    k.set_contents_from_filename(filename, headers={'Expires': expires})

Approach 2:

    # Store the expiry time as user metadata on the key instead
    k = Key(bucket)
    k.key = "Event_" + str(key_name) + "_report"
    expires = datetime.utcnow() + timedelta(seconds=10)
    expires = expires.strftime("%a, %d %b %Y %H:%M:%S GMT")
    k.set_metadata('Expires', expires)
    k.set_contents_from_filename(filename)
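
In both cases, the check was essentially fetching the key back after the expiry time had passed, roughly like this (a sketch; the bucket name is a placeholder and `my_key_name` is the same key as in Approach 1):

    import boto

    conn = boto.connect_s3()
    bucket = conn.get_bucket('my-bucket')   # placeholder bucket name

    # Even after the Expires time has passed, the object is still served
    k = bucket.get_key(my_key_name)         # same key as in Approach 1
    print(k.get_contents_as_string())       # still returns the data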

If anyone can share code that worked for them to delete S3 objects, that would be really great.

Answer

jamesls · answered Feb 22, 2013

You can use lifecycle policies to delete objects from S3 that are older than X days. For example, suppose you have these objects:

logs/first
logs/second
logs/third
otherfile.txt

To expire everything under logs/ after 30 days, you'd say:

import boto
from boto.s3.lifecycle import (
    Lifecycle,
    Expiration,
)

lifecycle = Lifecycle()
lifecycle.add_rule(
    'rulename',
    prefix='logs/',
    status='Enabled',
    expiration=Expiration(days=30),
)

s3 = boto.connect_s3()
bucket = s3.get_bucket('boto-lifecycle-test')
bucket.configure_lifecycle(lifecycle)

You can also retrieve the lifecycle configuration:

>>> config = bucket.get_lifecycle_config()
>>> print(config[0])
<Rule: rulename>
>>> print(config[0].prefix)
logs/
>>> print(config[0].expiration)
<Expiration: in: 30 days>
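
Note that lifecycle expiration is applied asynchronously by S3 (rules are evaluated roughly once a day), so an object can still be retrievable for a short while after its expiration date passes. If you need keys older than 30 days removed right away, a minimal sketch of doing it by hand with boto might look like this (same bucket as above; the `logs/` prefix and the `last_modified` date format are assumptions based on standard S3 list responses):

import boto
from datetime import datetime, timedelta

s3 = boto.connect_s3()
bucket = s3.get_bucket('boto-lifecycle-test')

cutoff = datetime.utcnow() - timedelta(days=30)

for key in bucket.list(prefix='logs/'):
    # last_modified on listed keys is a string like '2013-02-20T10:33:26.000Z'
    last_modified = datetime.strptime(key.last_modified, '%Y-%m-%dT%H:%M:%S.%fZ')
    if last_modified < cutoff:
        bucket.delete_key(key.name)  # delete keys older than the cutoff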