I want to use excel files to store data elaborated with python. My problem is that I can't add sheets to an existing excel file. Here I suggest a sample code to work with in order to reach this issue
import pandas as pd
import numpy as np
path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"
x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)
x2 = np.random.randn(100, 2)
df2 = pd.DataFrame(x2)
writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df1.to_excel(writer, sheet_name = 'x1')
df2.to_excel(writer, sheet_name = 'x2')
writer.save()
writer.close()
This code saves two DataFrames to two sheets, named "x1" and "x2" respectively. If I create two new DataFrames and try to use the same code to add two new sheets, 'x3' and 'x4', the original data is lost.
import pandas as pd
import numpy as np
path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"
x3 = np.random.randn(100, 2)
df3 = pd.DataFrame(x3)
x4 = np.random.randn(100, 2)
df4 = pd.DataFrame(x4)
writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df3.to_excel(writer, sheet_name = 'x3')
df4.to_excel(writer, sheet_name = 'x4')
writer.save()
writer.close()
I want an excel file with four sheets: 'x1', 'x2', 'x3', 'x4'. I know that 'xlsxwriter' is not the only "engine", there is 'openpyxl'. I also saw there are already other people that have written about this issue, but still I can't understand how to do that.
Here a code taken from this link
import pandas
from openpyxl import load_workbook
book = load_workbook('Masterfile.xlsx')
writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])
writer.save()
They say that it works, but it is hard to figure out how. I don't understand what "ws.title", "ws", and "dict" are in this context.
Which is the best way to save "x1" and "x2", then close the file, open it again and add "x3" and "x4"?
Thank you. I believe that a complete example could be good for anyone else who have the same issue:
import pandas as pd
import numpy as np
path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"
x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)
x2 = np.random.randn(100, 2)
df2 = pd.DataFrame(x2)
writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df1.to_excel(writer, sheet_name = 'x1')
df2.to_excel(writer, sheet_name = 'x2')
writer.save()
writer.close()
Here I generate an excel file, from my understanding it does not really matter whether it is generated via the "xslxwriter" or the "openpyxl" engine.
When I want to write without loosing the original data then
import pandas as pd
import numpy as np
from openpyxl import load_workbook
path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"
book = load_workbook(path)
writer = pd.ExcelWriter(path, engine = 'openpyxl')
writer.book = book
x3 = np.random.randn(100, 2)
df3 = pd.DataFrame(x3)
x4 = np.random.randn(100, 2)
df4 = pd.DataFrame(x4)
df3.to_excel(writer, sheet_name = 'x3')
df4.to_excel(writer, sheet_name = 'x4')
writer.save()
writer.close()
this code do the job!