I have a S3 bucket with thousands of folders and many txt files inside those folders.
I would like to list all txt files inside the bucket so I can check if they're removable. Then remove them if they are.
Any idea how to do this with s3cmd?
This is fairly simple, but depends on how sophisticated you want the check to be. Suppose you wanted to remove every text file whose filename includes 'foo':
s3cmd --recursive ls s3://mybucket |
awk '{ print $4 }' | grep "*.txt" | grep "foo" | xargs s3cmd del
If you want a more sophisticated check than grep can handle, just redirect the first three commands to a file, then either manually edit the file or use or awk
or perl
or whatever your favorite tool is, then cat
the output into s3cmd
(depending on the check, you could do it all with piping, too):
s3cmd --recursive ls s3://mybucket | awk '{ print $4 }' | grep "*.txt" > /tmp/textfiles
magic-command-to-check-filenames /tmp/textfiles
cat /tmp/textfiles | xargs s3cmd del