Reading files in a particular order in Python

user1620012 · Aug 23, 2012 · Viewed 60.8k times

Let's say I have three files in a folder: file9.txt, file10.txt, and file11.txt, and I want to read them in this particular order. Can anyone help me with this?

Right now I am using the code

import glob, os
for infile in glob.glob(os.path.join( '*.txt')):
    print "Current File Being Processed is: " + infile

and it reads file10.txt first, then file11.txt, and then file9.txt.

How can I get them in the right order?

Answer

Martijn Pieters · Aug 23, 2012

Files on the filesystem are not sorted, and glob.glob() returns them in whatever order the operating system lists them. You can sort the resulting filenames yourself using the sorted() function:

for infile in sorted(glob.glob('*.txt')):
    print "Current File Being Processed is: " + infile

Note that the os.path.join call in your code is a no-op; with only one argument it doesn't do anything but return that argument unaltered.
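A quick way to check this, as a minimal sketch: compare the result of a one-argument call with the original string, and contrast it with a two-argument call. The directory name 'data' is a hypothetical placeholder, not something from the question.

```python
import os

# With a single argument there is nothing to join, so the pattern comes back unchanged.
single = os.path.join('*.txt')

# With two arguments the call does real work: it inserts the platform's path separator.
# 'data' is a hypothetical directory name used only for illustration.
joined = os.path.join('data', '*.txt')

print(single == '*.txt')                     # True
print(joined == 'data' + os.sep + '*.txt')   # True
```

So os.path.join only becomes useful here if the files live in a directory other than the current one.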

Also note that sorted() compares the filenames as strings, character by character, so file10.txt sorts before file9.txt because '1' comes before '9'. You can use a custom key function to sort the numbers numerically:

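import re
numbers = re.compile(r'(\d+)')
def numericalSort(value):
    # Split into alternating text and digit runs, e.g. ['file', '10', '.txt']
    parts = numbers.split(value)
    # Convert the digit runs (odd indices) to integers so they compare numerically
    parts[1::2] = map(int, parts[1::2])
    return parts

for infile in sorted(glob.glob('*.txt'), key=numericalSort):
    print "Current File Being Processed is: " + infile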

The numericalSort function splits out the runs of digits in a filename, turns them into actual numbers, and returns the resulting list as the sort key:

>>> files = ['file9.txt', 'file10.txt', 'file11.txt', '32foo9.txt', '32foo10.txt']
>>> sorted(files)
['32foo10.txt', '32foo9.txt', 'file10.txt', 'file11.txt', 'file9.txt']
>>> sorted(files, key=numericalSort)
['32foo9.txt', '32foo10.txt', 'file9.txt', 'file10.txt', 'file11.txt']