Reading all files in all directories

Relative0 · Apr 15, 2013 · Viewed 19.3k times

I have the code working to read in the values of a single text file but am having difficulties reading all files from all directories and putting all of the contents together.

Here is what I have:

import os

filename = '*'
filesuffix = '*'
location = os.path.join('Test', filename + "." + filesuffix)
Document = filename
thedictionary = {}
with open(location) as f:
    file_contents = f.read().lower().split(' ')  # split the text on spaces to make a list
    for position, item in enumerate(file_contents):
        if item in thedictionary:
            thedictionary[item].append(position)
        else:
            thedictionary[item] = [position]
wordlist = (thedictionary, Document)
#print wordlist
#print thedictionary

Note that I am trying to use the wildcard * for the filename as well as for the filesuffix. I get the following error:

"IOError: [Errno 2] No such file or directory: 'Test/*.*'"

I am not sure this is even the right way to do it, but it seems that if I can somehow get the wildcards working, it should work.

I have gotten this example to work: Python - reading files from directory file not found in subdirectory (which is there)

That example is a little different, and I don't know how to update it to read all files. I am thinking that in this initial set of code:

previous_dir = os.getcwd()
os.chdir('testfilefolder')
#add something here?
for filename in os.listdir('.'):

I would need to add something, probably an outer for loop, but I don't quite know what to put in it.

Any thoughts?

Answer

Martijn Pieters · Apr 15, 2013

Python doesn't support wildcards in filenames passed to the open() call. You'll need to use the glob module instead to load files from a single level of subdirectories, or use os.walk() to walk an arbitrary directory structure.
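
As a minimal sketch of that difference: open() treats a pattern such as 'Test/*.*' as a literal filename, whereas glob.glob() expands it into the paths that actually exist. The helper name read_matching and its default pattern are made up for illustration, not part of any library.

```python
import glob
import os


def read_matching(directory, pattern='*.*'):
    """Return {path: contents} for every file in `directory` matching `pattern`."""
    contents = {}
    # glob expands the wildcard into real paths; open() would treat it literally.
    for path in glob.glob(os.path.join(directory, pattern)):
        if os.path.isfile(path):  # skip directories whose names happen to match
            with open(path) as f:
                contents[path] = f.read()
    return contents
```

Calling read_matching('Test') only covers files directly inside Test, not files in its subdirectories.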

Opening all text files in all subdirectories, one level deep:

import glob
import os

for filename in glob.iglob(os.path.join('Test', '*', '*.txt')):
    with open(filename) as f:
        # One file is open here; handle it. The next iteration presents the next file.
        contents = f.read()
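
If the code only needs to run on Python 3.5 or later, glob can also descend through any number of directory levels using the '**' pattern together with recursive=True. This is a hedged sketch, and the function name find_text_files is hypothetical.

```python
import glob
import os


def find_text_files(root):
    """Return every *.txt path under `root`, at any depth (Python 3.5+)."""
    # '**' matches zero or more directory levels when recursive=True.
    pattern = os.path.join(root, '**', '*.txt')
    return sorted(glob.glob(pattern, recursive=True))
```

On older Python versions the recursive argument does not exist, so os.walk() remains the portable choice.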

Opening all text files in an arbitrary nesting of directories:

import os
import fnmatch

for dirpath, dirs, files in os.walk('Test'):
    for filename in fnmatch.filter(files, '*.txt'):
        with open(os.path.join(dirpath, filename)) as f:
            # One file is open here; handle it. The next iteration presents the next file.
            contents = f.read()
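
To tie this back to the question, here is one possible way to combine the os.walk() loop with the word-position dictionary from the question, producing one dictionary per file. It is a sketch under assumptions: the function name build_word_index and the choice to key the result by file path are illustrative, and it uses split() with no argument (splitting on any whitespace, including newlines) rather than the question's split(' ').

```python
import fnmatch
import os


def build_word_index(root, pattern='*.txt'):
    """Map each matching file path under `root` to a {word: [positions]} dict."""
    index = {}
    for dirpath, dirs, files in os.walk(root):
        for filename in fnmatch.filter(files, pattern):
            path = os.path.join(dirpath, filename)
            with open(path) as f:
                # split() with no argument splits on any run of whitespace
                words = f.read().lower().split()
            positions = {}
            for position, word in enumerate(words):
                positions.setdefault(word, []).append(position)
            index[path] = positions
    return index
```

Keying by path keeps each file's positions separate; merging everything into a single dictionary would lose track of which file a position belongs to.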