I read 150 Excel files in a loop, opening them with xlrd.open_workbook(), which returns a Book object. At the end, when I tried to umount the volume, I was unable to, and when I checked with lsof, I found that 6 of the files were still open:
$ lsof | grep volumename
python2 32349 deeenes mem REG 0,40 138240 181517 /.../150119.xls
python2 32349 deeenes mem REG 0,40 135168 181482 /.../150609.xls
python2 32349 deeenes mem REG 0,40 140800 181495 /.../140828.xls
python2 32349 deeenes 5r REG 0,40 140800 181495 /.../140828.xls
python2 32349 deeenes 6r REG 0,40 135168 181482 /.../150609.xls
python2 32349 deeenes 7r REG 0,40 138240 181517 /.../150119.xls
Here is the function I use to read the xls files (stripped for clarity):
import sys
import xlrd
from xlrd.biffh import XLRDError

def read_xls(xls_file, sheet = '', return_table = True):
    try:
        book = xlrd.open_workbook(xls_file, on_demand = True)
        try:
            sheet = book.sheet_by_name(sheet)
        except XLRDError:
            sheet = book.sheet_by_index(0)
        table = [[str(c.value) for c in sheet.row(i)] for i in xrange(sheet.nrows)]
        if not return_table:
            table = None
        return table
    except IOError:
        sys.stdout.write('No such file: %s\n' % xls_file)
        sys.stdout.flush()
The Book object does not have a close() method, nor does it have any open file-type objects among its attributes, except stdout. This howto does not cover this (I haven't found the official docs). I don't see how I could close the file, and it is also odd that 6 remain open after reading 150 of them.
Edit: it might be related to this, but it still should not leave files open, and I don't want to read all sheets.
If you open a workbook with on_demand = True for more economical resource use (see here for how it works), you need to call the release_resources() method at the end. As a minimal example:
import xlrd
book = xlrd.open_workbook('workbook.xls', on_demand = True)
sheet = book.sheet_by_index(0)
data = [[str(c.value) for c in sheet.row(i)] for i in xrange(sheet.nrows)]
book.release_resources()
del book
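When reading many files in a loop, it may help to make sure release_resources() runs even if something fails while the sheet is being parsed. Below is a minimal sketch of one way to do that with a small context manager. The names releasing and read_first_sheet are hypothetical helpers for illustration, not part of xlrd; the only xlrd calls used are the ones already shown above.

```python
from contextlib import contextmanager


@contextmanager
def releasing(book):
    """Yield `book`, then call its release_resources() on exit.

    Hypothetical helper: works with any object that has a
    release_resources() method, such as an xlrd Book.
    """
    try:
        yield book
    finally:
        # Runs on normal exit and when an exception is raised in the with-body.
        book.release_resources()


def read_first_sheet(path):
    """Read the first sheet of `path` into a list of rows of strings."""
    # Imported here so that `releasing` itself has no dependency on xlrd.
    import xlrd

    with releasing(xlrd.open_workbook(path, on_demand=True)) as book:
        sheet = book.sheet_by_index(0)
        # range() is used so the sketch runs on both Python 2 and Python 3.
        return [[str(c.value) for c in sheet.row(i)] for i in range(sheet.nrows)]
```

With this, a loop such as `tables = [read_first_sheet(f) for f in files]` should release each workbook before the next one is opened. If open_workbook() itself raises, there is no Book yet, so nothing needs to be released.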