I want to build a program that uses some basic code to read through a folder and tell me how many files are in the folder. Here is how I do that currently:
import os

folders = ['Y:\\path1', 'Y:\\path2', 'Y:\\path3']
for stuff in folders:
    for root, dirs, files in os.walk(stuff, topdown=True):
        print("there are", len(files), "files in", root)
This works great until there are multiple folders inside the "main" folder, because it can then return a long, junky list of files due to poor folder/file management. So I would like to go down to the second level at most. Example:
Main Folder
---file_i_want
---file_i_want
---Sub_Folder
------file_i_want <--*
------file_i_want <--*
------Sub_Folder_2
---------file_i_dont_want
---------file_i_dont_want
I know how to go to only the first level with a break and with del dirs[:], taken from this post and also this post.
import os

folders = ['Y:\\path1', 'Y:\\path2', 'Y:\\path3']
for stuff in folders:
    for root, dirs, files in os.walk(stuff, topdown=True):
        print("there are", len(files), "files in", root)
        del dirs[:]  # or a break here. does the same thing.
But no matter how much I search, I can't find out how to go two layers deep. I may just not be understanding the other posts on it or something? I was thinking of something like del dirs[:2], but to no avail. Can someone guide me or explain to me how to accomplish this?
You could do it like this:
depth = 2

# [1] abspath() also normalizes the path like normpath() and removes a trailing os.sep;
#     the slice below is only accurate when there is no trailing os.sep.
# [2] abspath() also resolves "/../", "////" and ".", which os.walk would otherwise return literally.
# [3] expanduser() expands ~
# [4] expandvars() expands $HOME
stuff = os.path.abspath(os.path.expanduser(os.path.expandvars(stuff)))

for root, dirs, files in os.walk(stuff):
    if root[len(stuff):].count(os.sep) < depth:
        for f in files:
            print(os.path.join(root, f))
The key is: if root[len(stuff):].count(os.sep) < depth. It removes stuff from root, so the result is relative to stuff. Then just count the number of path separators.
The depth behaves like the -maxdepth option of the find command on Linux: with 0 no files are listed, with 1 only files in the first level are scanned, and with 2 files in the direct sub-directories are included as well.
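To make the slice-and-count test concrete, here is a small sketch that computes the value compared against depth for three paths. relative_depth is a hypothetical helper name and the folder names are made up for illustration; nothing has to exist on disk.

```python
import os

def relative_depth(stuff, root):
    """Count the path separators left in root after removing the stuff prefix."""
    return root[len(stuff):].count(os.sep)

# Made-up paths mirroring the tree in the question (no trailing separator).
stuff = "Main_Folder"
sub = os.path.join(stuff, "Sub_Folder")
sub2 = os.path.join(sub, "Sub_Folder_2")

for root in (stuff, sub, sub2):
    # With depth = 2, files are listed only where the count is below 2.
    print(root, relative_depth(stuff, root))
```

The counts are 0, 1 and 2 respectively, so with depth = 2 the files in the main folder and in Sub_Folder are listed, while those in Sub_Folder_2 are skipped.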
Of course, it still scans the full file structure, but unless it's very deep that'll work.
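If the full scan is a concern, one possible variation is to prune dirs in place once the limit is reached, using the same del dirs[:] trick from the question, so os.walk never descends further. This is only a sketch: walk_max_depth and max_depth are hypothetical names, not part of the os module.

```python
import os

def walk_max_depth(top, max_depth=2):
    """Yield (root, dirs, files) like os.walk, descending at most max_depth levels.

    max_depth=1 yields only top; max_depth=2 also yields its direct sub-folders.
    """
    if max_depth < 1:
        return  # nothing to scan
    top = os.path.abspath(top)
    for root, dirs, files in os.walk(top, topdown=True):
        # Level of root relative to top: 0 for top itself, 1 for its children, ...
        rel = os.path.relpath(root, top)
        level = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if level >= max_depth - 1:
            # Emptying dirs in place stops os.walk from going any deeper.
            del dirs[:]
        yield root, dirs, files
```

It can replace os.walk in the original loop, for example: for root, dirs, files in walk_max_depth(stuff, 2): print("there are", len(files), "files in", root). Note that dirs is already emptied at the deepest level when it is yielded.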
Another solution would be to use only os.listdir recursively (with a directory check) and a maximum recursion level, but that's a little more work than you may need. Since it's not that hard, here's one implementation:
def scanrec(root):
    rval = []

    def do_scan(start_dir, output, depth=0):
        for f in os.listdir(start_dir):
            ff = os.path.join(start_dir, f)
            if os.path.isdir(ff):
                # recurse only while fewer than 2 levels below root
                if depth < 2:
                    do_scan(ff, output, depth + 1)
            else:
                output.append(ff)

    do_scan(root, rval, 0)
    return rval

print(scanrec(stuff))  # prints the files located at most 2 directory levels below stuff
Note: calling os.listdir and then os.path.isdir on every entry costs an extra stat call per entry, so this is not optimal. From Python 3.5 on, os.scandir can avoid that extra call.
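A possible os.scandir version of the same recursion is sketched below. scan_files and max_depth are hypothetical names; the default max_depth=2 mirrors the depth<2 test in the code above, and the with form of os.scandir assumes Python 3.6 or later.

```python
import os

def scan_files(start_dir, max_depth=2):
    """Return paths of files found at most max_depth directory levels below start_dir."""
    found = []

    def do_scan(path, depth):
        with os.scandir(path) as entries:
            for entry in entries:
                # DirEntry.is_dir() can usually reuse information gathered while
                # listing the directory instead of making a separate stat call.
                if entry.is_dir():
                    if depth < max_depth:
                        do_scan(entry.path, depth + 1)
                else:
                    found.append(entry.path)

    do_scan(start_dir, 0)
    return found
```

With this counting, scan_files(stuff, max_depth=1) returns the files in the main folder and in its direct sub-folders only, which is the layout asked for in the question.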