What expands to all files in current directory recursively?

Ramon picture Ramon · Nov 6, 2009 · Viewed 33.3k times · Source

I know **/*.ext expands to all files in all subdirectories matching *.ext, but what is a similar expansion that includes all such files in the current directory as well?

Answer

This will work in Bash 4:

ls -l {,**/}*.ext

In order for the double-asterisk glob to work, the globstar option needs to be set (default: on):

shopt -s globstar

From man bash:

    globstar
                  If set, the pattern ** used in a filename expansion con‐
                  text will match a files and zero or more directories and
                  subdirectories.  If the pattern is followed by a /, only
                  directories and subdirectories match.

Now I'm wondering if there might have once been a bug in globstar processing, because now using simply ls **/*.ext I'm getting correct results.

Regardless, I looked at the analysis kenorb did using the VLC repository and found some problems with that analysis and in my answer immediately above:

The comparisons to the output of the find command are invalid since specifying -type f doesn't include other file types (directories in particular) and the ls commands listed likely do. Also, one of the commands listed, ls -1 {,**/}*.* - which would seem to be based on mine above, only outputs names that include a dot for those files that are in subdirectories. The OP's question and my answer include a dot since what is being sought is files with a specific extension.

Most importantly, however, is that there is a special issue using the ls command with the globstar pattern **. Many duplicates arise since the pattern is expanded by Bash to all file names (and directory names) in the tree being examined. Subsequent to the expansion the ls command lists each of them and their contents if they are directories.

Example:

In our current directory is the subdirectory A and its contents:

A
└── AB
    └── ABC
        ├── ABC1
        ├── ABC2
        └── ABCD
            └── ABCD1

In that tree, ** expands to "A A/AB A/AB/ABC A/AB/ABC/ABC1 A/AB/ABC/ABC2 A/AB/ABC/ABCD A/AB/ABC/ABCD/ABCD1" (7 entries). If you do echo ** that's the exact output you'd get and each entry is represented once. However, if you do ls ** it's going to output a listing of each of those entries. So essentially it does ls A followed by ls A/AB, etc., so A/AB gets shown twice. Also, ls is going to set each subdirectory's output apart:

...
<blank line>
directory name:
content-item
content-item

So using wc -l counts all those blank lines and directory name section headings which throws off the count even farther.

This a yet another reason why you should not parse ls.

As a result of this further analysis, I recommend not using the globstar pattern in any circumstance other than iterating over a tree of files in this manner:

for entry in **
do
    something "$entry"
done

As a final comparison, I used a Bash source repository I had handy and did this:

shopt -s globstar dotglob
diff <(echo ** | tr ' ' '\n') <(find . | sed 's|\./||' | sort)
0a1
> .

I used tr to change spaces to newlines which is only valid here since no names include spaces. I used sed to remove the leading ./ from each line of output from find. I sorted the output of find since it is normally unsorted and Bash's expansion of globs is already sorted. As you can see, the only output from diff was the current directory . output by find. When I did ls ** | wc -l the output had almost twice as many lines.