I have been provided with a xlsb file full of data. I want to process the data using python. I can convert it to csv using excel or open office, but I would like the whole process to be more automated. Any ideas?
Update: I took a look at this question and used the first answer:
import subprocess
subprocess.call("cscript XlsToCsv.vbs data.xlsb data.csv", shell=False)
The issue is the file contains greek letters so the encoding is not preserved. Opening the csv with Notepad++ it looks as it should, but when I try to insert into a database comes like this ���. Opening the file as csv, just to read text is displayed like this: \xc2\xc5\xcb instead of ΒΕΛ.
I realize it's an issue in encoding, but it's possible to retain the original encoding converting the xlsb file to csv ?
I've encountered this same problem and using pyxlsb does it for me:
from pyxlsb import open_workbook
with open_workbook('HugeDataFile.xlsb') as wb:
for sheetname in wb.sheets:
with wb.get_sheet(sheetname) as sheet:
for row in sheet.rows():
values = [r.v for r in row] # retrieving content
csv_line = ','.join(values) # or do your thing