I have a list of lists like:
[
[u'email', u'salutation', u'firstname', u'lastname', u'remarks', None, None, None, None, None],
[u'[email protected]', u'Mr', u'Daniel', u'Radcliffe', u'expecto patronum', None, None, None, None, None],
[u'[email protected]', u'Mr', u'Severus', u'Snape', u'Always', None, None, None, None, None],
]
I want to insert this to an excel file. It is possible to do so one by one by writing each element.
book = xlwt.Workbook(encoding="utf-8")
sheet1 = book.add_sheet("Sheet 1")
row = 0
for l in listdata:
col = 0
for e in l:
if e:
sheet1.write(row, col, e)
col+=1
row+=1
But this method does not look very efficient as the each element of the entire list has to be traversed. Is there a more efficient method to do the same in python with xlwt
?
EDIT: Fixed error in benchmark code.
You can shorten things a bit to make them more pythonic:
for i, l in enumerate(listdata):
for j, col in enumerate(l):
sheet.write(i, j, col)
But as far as I know there is no easy method to write to as there is with csv.reader
.
PS: In your supplied code, you never increment row
or col
, so you overwrite the cell at (0,0)
every iteration of the nested for loop. Careful! Using enumerate
will fix that.
As it turns out, join
ing each row together with a comma and writing it is roughly 3 times faster than using enumerate once.
Here's some test code:
import xlwt
import timeit
def wrapper(fn, *args, **kwargs):
def wrapped():
return fn(*args, **kwargs)
return wrapped
def excel_writer():
xldoc = xlwt.Workbook()
sheet1 = xldoc.add_sheet("Sheet1", cell_overwrite_ok=True)
rows = [[str(y) for y in xrange(100)] for x in xrange(10000)]
fn1 = wrapper(cell_writer, rows, sheet1)
fn2 = wrapper(row_writer, rows, sheet1)
print timeit.timeit(fn1, number=10)/10
print timeit.timeit(fn2, number=10)/10
xldoc.save('myexcel.xls')
def cell_writer(rows, sheet):
for i, row in enumerate(rows):
for j, col in enumerate(row):
sheet.write(i, j, col)
def row_writer(rows, sheet):
rows = [', '.join(row) for row in rows]
for i, strrow in enumerate(rows):
sheet.write(i, 0, strrow)
if __name__ == '__main__':
excel_writer()
with number = 1
(divided by 1 of course):
cell_writer:
15.2915050441
row_writer:
0.205128928987
with number = 10
:
cell_writer:
17.3386430596
row_writer:
0.204951626882
I attribute the big time difference to the increased speed of join
over writing to excel. The biggest bottleneck in terms of speed, of course, the excel writing.
However, be aware that the time it takes to split the cells apart in excel may outweigh the time saved with the row_writer
approach. It may also inconvenience the end user; exercise judgement!