I am trying to get the names of subdirectories with a Python 3 script on Windows 10, so I wrote the following code:
from pathlib2 import Path
p = "./path/to/target/dir"
[str(item) for item in Path(p).rglob(".")]
# Result: only the path names of the subdirectories, including the target directory itself.
This result is what I want, but I don't understand why passing this pattern to rglob returns it.
Can someone explain this?
Thanks.
Every directory in a POSIX-style filesystem contains two entries from the get-go: .., which refers to the parent directory, and ., which refers to the current directory:
$ mkdir tmp; cd tmp
tmp$ ls -a
. ..
tmp$ cd .
tmp$ # <-- still in the same directory
The notable exception is /.., which refers to the root itself, since the root has no parent.
A Path object from Python's pathlib is, when it is created, just a wrapper around a string that is assumed to point somewhere into the filesystem. It only refers to something tangible once it is resolved:
>>> Path('.')
PosixPath('.') # just a fancy string
>>> Path('.').resolve()
PosixPath('/current/working/dir') # an actual point in your filesystem
The bottom line is that /current/working/dir and /current/working/dir/. are, from the filesystem's point of view, completely equivalent, and pathlib.Path will also reflect that as soon as it is resolved.

By passing . as the pattern to rglob, you matched the . entry of every directory at or below the initial directory, that is, the link each directory holds to itself. Because pathlib collapses single-dot components when it normalizes paths, the . no longer appears in the results.
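The equivalence can be checked without touching the filesystem. This is a minimal sketch, assuming the standard-library pathlib (rather than pathlib2) and a made-up relative path some/dir; it relies on the documented behavior that single dots are collapsed when a path object is built:

```python
from pathlib import PurePosixPath

# "some/dir" is a hypothetical path; pure paths never access the filesystem.
plain = PurePosixPath("some/dir")
dotted = PurePosixPath("some/dir/.")

# The trailing "." is dropped during construction, so both objects
# compare equal and have the same string form.
print(plain == dotted)
print(str(dotted))
```

Both objects end up with the same parts, which is why a trailing . never shows up in the strings you get back.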
As a source for this behavior, see this section of PEP 428 (which serves as the specification for pathlib), where it briefly mentions path equivalence.
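If you would rather not depend on how the . pattern is interpreted, the same listing can be written with an explicit directory filter. The following is only a sketch: list_dirs is a hypothetical helper name, and it uses the standard-library pathlib instead of pathlib2:

```python
from pathlib import Path


def list_dirs(root):
    """Return root and every directory below it, as sorted strings."""
    root = Path(root)
    # rglob("*") yields every entry below root; keep only the directories.
    subdirs = [item for item in root.rglob("*") if item.is_dir()]
    # Add root itself, since rglob("*") only yields entries beneath it.
    return sorted(str(item) for item in [root, *subdirs])
```

Calling list_dirs("./path/to/target/dir") should give the target directory plus all of its subdirectories, with the intent spelled out in the code rather than hidden in the pattern.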