s3cmd: searching for files based on extension and deleting them from a bucket

Martin · Jul 24, 2012 · Viewed 7.1k times

I have an S3 bucket with thousands of folders and many txt files inside those folders.

I would like to list all txt files inside the bucket so I can check whether they're removable, then remove them if they are.

Any idea how to do this with s3cmd?

Answer

Christopher · Jul 27, 2012

This is fairly simple, but depends on how sophisticated you want the check to be. Suppose you wanted to remove every text file whose filename includes 'foo':

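# List every object, keep the URI column, then keep only names that
# end in .txt and contain "foo" before passing them to s3cmd del.
s3cmd --recursive ls s3://mybucket |
    awk '{ print $4 }' | grep '\.txt$' | grep "foo" | xargs s3cmd del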
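
One caveat with the pipeline above: awk '{ print $4 }' splits on whitespace, so a key that contains a space would be cut short. A minimal sketch of a filter that keeps the whole URI is shown here. The function name list_txt_keys is hypothetical, and it assumes each listing line ends with the object's s3:// URI, as the $4 column in the answer suggests.

```shell
# Hypothetical helper: read `s3cmd ls --recursive` output on stdin and
# print the full URI of every object whose name ends in .txt.
# Assumption: the URI is the last thing on each listing line.
list_txt_keys() {
    # Print everything from the first "s3://" to the end of the line,
    # so spaces inside the key are preserved.
    awk '{ i = index($0, "s3://"); if (i > 0) print substr($0, i) }' |
        grep '\.txt$'    # escaped dot, anchored at the end of the name
}
```

With s3cmd installed, this would slot in as: s3cmd --recursive ls s3://mybucket | list_txt_keys | grep "foo". The result can then be reviewed before anything is deleted.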

If you want a more sophisticated check than grep can handle, redirect the output of the first three commands to a file, then either edit the file manually or filter it with awk, perl, or whatever your favorite tool is, and finally feed the result to s3cmd (depending on the check, you could do it all with piping, too):

s3cmd --recursive ls s3://mybucket | awk '{ print $4 }' | grep '\.txt$' > /tmp/textfiles
# Placeholder: edit /tmp/textfiles in place so only removable files remain.
magic-command-to-check-filenames /tmp/textfiles
xargs s3cmd del < /tmp/textfiles
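
Because the last step is destructive, it may be worth previewing it first. The sketch below reads the reviewed file one line at a time, which also keeps keys with spaces intact, and only calls s3cmd del when asked to. The function name delete_listed and the DRY_RUN variable are assumptions made for illustration; they are not part of s3cmd.

```shell
# Hypothetical helper: delete every URI listed in a file, one per line.
# DRY_RUN defaults to 1, so nothing is removed until DRY_RUN=0 is set.
delete_listed() {
    list_file=$1
    while IFS= read -r uri; do
        [ -n "$uri" ] || continue          # skip blank lines
        if [ "${DRY_RUN:-1}" = 1 ]; then
            printf 'would delete: %s\n' "$uri"
        else
            # Quote the URI so spaces survive; keep s3cmd off the loop's stdin.
            s3cmd del "$uri" < /dev/null
        fi
    done < "$list_file"
}
```

Running delete_listed /tmp/textfiles prints what would be removed; DRY_RUN=0 delete_listed /tmp/textfiles performs the deletion. This issues one s3cmd call per object, so it is slower than xargs on large lists.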