I'm iterating over a large group of files inside a directory tree using a for loop.
While doing so, I want to monitor the progress with a progress bar in the console, so I decided to use tqdm.
Currently, my code looks like this:
    for dirPath, subdirList, fileList in tqdm(os.walk(target_dir)):
        sleep(0.01)
        dirName = dirPath.split(os.path.sep)[-1]
        for fname in fileList:
            *****
Output:
Scanning Directory....
43it [00:23, 11.24 it/s]
So, my problem is that it is not showing a progress bar. I want to know how to use it properly and get a better understanding of how it works. I'd also like to know whether there are any alternatives to tqdm that could be used here.
You can't show a percentage complete unless you know what "complete" means.
While os.walk is running, it doesn't know how many files and folders it's going to end up iterating over: the return type of os.walk has no __len__. It'd have to look all the way down the directory tree, enumerating all the files and folders, in order to count them. In other words, os.walk would have to do all of its work twice in order to tell you how many items it's going to produce, which is inefficient.
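To make the "work twice" trade-off concrete, here is a minimal sketch of the two-pass approach: one walk to count, then a second walk with the count handed to tqdm through its total parameter. The helper names count_dirs and walk_with_progress are hypothetical, chosen only for this illustration, and the sketch assumes the tree does not change between the two passes (if it does, the bar may overshoot or stop short).

```python
import os


def count_dirs(root):
    """First pass: count the (dirpath, dirnames, filenames) tuples os.walk yields."""
    return sum(1 for _ in os.walk(root))


def walk_with_progress(root):
    """Second pass: walk again, giving tqdm the total so it can draw a bar."""
    # tqdm is third-party (pip install tqdm); it is imported here so that
    # count_dirs stays usable even where tqdm is not installed.
    from tqdm import tqdm

    total = count_dirs(root)
    # With total supplied, tqdm can show a percentage instead of a bare counter.
    yield from tqdm(os.walk(root), total=total, unit="dir")
```

It could be used as a drop-in replacement for the original loop header, for example `for dirPath, subdirList, fileList in walk_with_progress(target_dir):`. The cost is exactly the one described above: the directory tree is traversed twice.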
If you're dead set on showing a progress bar, you could spool the data into an in-memory list: list(os.walk(target_dir)). I don't recommend this. If you're traversing a large directory tree, this could consume a lot of memory. Worse, if followlinks is True and you have a cyclic directory structure (with children linking to their parents), then it could end up looping forever until you run out of RAM.
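For completeness, the in-memory variant might look like the sketch below. It is shown only to illustrate the idea, with the memory caveat above still applying. The names snapshot_walk and process_tree are hypothetical, and the inner loop body is a placeholder for whatever per-file work is needed.

```python
import os


def snapshot_walk(root):
    """Materialize the whole walk into a list so that len() is available."""
    # followlinks=False is already the default; it is spelled out because
    # following symlinks in a cyclic tree could make this list grow without bound.
    return list(os.walk(root, followlinks=False))


def process_tree(root):
    """Iterate over the snapshot; tqdm reads the total from len(entries)."""
    from tqdm import tqdm  # third-party: pip install tqdm

    entries = snapshot_walk(root)
    for dir_path, subdirs, files in tqdm(entries, unit="dir"):
        for fname in files:
            pass  # hypothetical placeholder for the per-file work
```

Because a list has a length, tqdm can infer the total on its own here, so no total argument is needed.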