It is quite easy to add many pandas dataframes into excel work book as long as it is different worksheets. But, it is somewhat tricky to get many dataframes into one worksheet if you want to use pandas built-in df.to_excel functionality.
# Creating Excel Writer Object from Pandas
writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter')
workbook=writer.book
worksheet=workbook.add_worksheet('Validation')
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
another_df.to_excel(writer,sheet_name='Validation',startrow=20, startcol=0)
The above code won't work. You will get the error of
Sheetname 'Validation', with case ignored, is already in use.
Now, I have experimented enough that I found a way to make it work.
writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter') # Creating Excel Writer Object from Pandas
workbook=writer.book
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
another_df.to_excel(writer,sheet_name='Validation',startrow=20, startcol=0)
This will work. So, my purpose of posting this question on stackoverflow is twofold. Firstly, I hope this will help someone if he/she is trying to put many dataframes into a single work sheet at excel.
Secondly, Can someone help me understand the difference between those two blocks of code? It appears to me that they are pretty much the same except the first block of code created worksheet called "Validation" in advance while the second does not. I get that part.
What I don't understand is why should it be any different ? Even if I don't create the worksheet in advance, this line, the line right before the last one,
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
will create a worksheet anyway. Consequently, by the time we reached the last line of code the worksheet "Validation" is already created as well in the second block of code. So, my question basically, why should the second block of code work while the first doesn't?
Please also share if there is another way to put many dataframes into excel using the built-in df.to_excel functionality !!
To create the Worksheet in advance, you need to add the created sheet to the sheets
dict:
writer.sheets['Validation'] = worksheet
Using your original code:
# Creating Excel Writer Object from Pandas
writer = pd.ExcelWriter('test.xlsx',engine='xlsxwriter')
workbook=writer.book
worksheet=workbook.add_worksheet('Validation')
writer.sheets['Validation'] = worksheet
df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)
another_df.to_excel(writer,sheet_name='Validation',startrow=20, startcol=0)
If we look at the pandas function to_excel
, it uses the writer's write_cells
function:
excel_writer.write_cells(formatted_cells, sheet_name, startrow=startrow, startcol=startcol)
So looking at the write_cells
function for xlsxwriter
:
def write_cells(self, cells, sheet_name=None, startrow=0, startcol=0):
# Write the frame cells using xlsxwriter.
sheet_name = self._get_sheet_name(sheet_name)
if sheet_name in self.sheets:
wks = self.sheets[sheet_name]
else:
wks = self.book.add_worksheet(sheet_name)
self.sheets[sheet_name] = wks
Here we can see that it checks for sheet_name
in self.sheets
, and so it needs to be added there as well.