I was trying to figure out a way to clean up my S3 bucket. I want to delete all the keys that are older than X days (in my case, X is 30).
I couldn't figure out a way to delete the objects in S3.
I tried the following approaches, none of which worked. (By "worked" I mean that I tried getting the object after X days and S3 was still serving it; I was expecting an "Object not found" or "Object expired" message.)
Approach 1:
k = Key(bucket)
k.key = my_key_name
expires = datetime.utcnow() + timedelta(seconds=(10))
expires = expires.strftime("%a, %d %b %Y %H:%M:%S GMT")
k.set_contents_from_filename(filename,headers={'Expires':expires})
Approach 2:
k = Key(bucket)
k.key = "Event_" + str(key_name) + "_report"
expires = datetime.utcnow() + timedelta(seconds=(10))
expires = expires.strftime("%a, %d %b %Y %H:%M:%S GMT")
k.set_meta_data('Expires', expires)
k.set_contents_from_filename(filename)
If anyone can share working code that deletes S3 objects, that would be really great.
The Expires header only sets HTTP caching metadata that is returned with the object; S3 does not delete anything because of it. To delete objects from S3 that are older than X days, you can use lifecycle policies. For example, suppose you have these objects:
logs/first
logs/second
logs/third
otherfile.txt
To expire everything under logs/ after 30 days, you'd say:
import boto
from boto.s3.lifecycle import (
    Lifecycle,
    Expiration,
)
lifecycle = Lifecycle()
lifecycle.add_rule(
    'rulename',
    prefix='logs/',
    status='Enabled',
    expiration=Expiration(days=30)
)
s3 = boto.connect_s3()
bucket = s3.get_bucket('boto-lifecycle-test')
bucket.configure_lifecycle(lifecycle)
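To see which of the four example objects that rule would cover, here is a small local sketch. It assumes the rule's prefix is compared as a plain string prefix of the key name; `keys_matching_prefix` is a hypothetical helper written for illustration, not part of boto.

```python
def keys_matching_prefix(key_names, prefix):
    """Return the key names that start with the given rule prefix."""
    return [name for name in key_names if name.startswith(prefix)]


# The four example objects from above.
example_keys = ['logs/first', 'logs/second', 'logs/third', 'otherfile.txt']

# Only the keys under logs/ fall under the rule; otherfile.txt is left alone.
covered = keys_matching_prefix(example_keys, 'logs/')
print(covered)
```

Nothing here talks to S3; it only illustrates the scope of the `prefix='logs/'` argument.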
You can also retrieve the lifecycle configuration:
>>> config = bucket.get_lifecycle_config()
>>> print(config[0])
<Rule: rulename>
>>> print(config[0].prefix)
logs/
>>> print(config[0].expiration)
<Expiration: in: 30 days>
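If you would rather sweep the bucket yourself (from a cron job, say) instead of relying on a lifecycle rule, the selection step might look roughly like the sketch below. It assumes each key exposes a `name` and a `last_modified` string of the form `2013-01-01T12:00:00.000Z`, which is what I would expect from keys yielded by boto's `bucket.list()`; check that against your own bucket before trusting it. `FakeKey` and `keys_older_than` are hypothetical names used only for this illustration.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Stand-in for a boto key so the sketch runs without touching S3 (hypothetical).
FakeKey = namedtuple('FakeKey', ['name', 'last_modified'])


def keys_older_than(keys, days, now=None):
    """Return the names of keys last modified more than `days` days before `now`."""
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=days)
    old_names = []
    for key in keys:
        # Assumed timestamp format: ISO 8601 in UTC with fractional seconds.
        modified = datetime.strptime(key.last_modified, '%Y-%m-%dT%H:%M:%S.%fZ')
        if modified < cutoff:
            old_names.append(key.name)
    return old_names


# Fixed "now" so the example is reproducible.
now = datetime(2013, 3, 1)
keys = [
    FakeKey('logs/first', '2013-01-01T12:00:00.000Z'),
    FakeKey('logs/second', '2013-02-20T08:30:00.000Z'),
]
stale = keys_older_than(keys, 30, now=now)
print(stale)
```

With a real bucket you would pass `bucket.list(prefix='logs/')` as `keys` and then hand the resulting names to `bucket.delete_keys(...)`. The lifecycle rule is still the simpler option, since S3 then does the deleting for you.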