How can I convert an XLSB file to CSV using Python?

IordanouGiannis · Mar 13, 2014

I have been given an xlsb file full of data that I want to process with Python. I can convert it to CSV using Excel or OpenOffice, but I would like the whole process to be more automated. Any ideas?

Update: I took a look at this question and used the first answer:

import subprocess
subprocess.call("cscript XlsToCsv.vbs data.xlsb data.csv", shell=False)

The issue is that the file contains Greek letters, and the encoding is not preserved. When I open the CSV in Notepad++ it looks as it should, but when I try to insert the data into a database it comes out like this: ���. When I open the file as CSV just to read it, the text is displayed as \xc2\xc5\xcb instead of ΒΕΛ.

I realize this is an encoding issue, but is it possible to retain the original encoding when converting the xlsb file to CSV?
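
A note on the bytes quoted above: \xc2\xc5\xcb is exactly how ΒΕΛ is encoded in the Windows Greek code page (cp1253), so the CSV was probably written in that code page and then read as something else. The sketch below assumes that is the case; the function name `reencode_csv` and the default `src_encoding` are illustrative choices, not part of the original script.

```python
def reencode_csv(src_path, dst_path, src_encoding='cp1253'):
    """Copy a text file, decoding it from src_encoding and writing UTF-8.

    Assumption: the source CSV really is in src_encoding (cp1253 here).
    """
    # newline='' keeps the original line endings untouched
    with open(src_path, 'r', encoding=src_encoding, newline='') as src, \
            open(dst_path, 'w', encoding='utf-8', newline='') as dst:
        for line in src:
            dst.write(line)


# The three bytes from the question decode to the expected Greek text
# when read as cp1253.
assert b'\xc2\xc5\xcb'.decode('cp1253') == 'ΒΕΛ'
```

If the database expects UTF-8, calling something like `reencode_csv('data.csv', 'data_utf8.csv')` before the import may be enough; if the file turns out to use a different code page, pass that as `src_encoding` instead.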

Answer

Sergio Lucero · Jan 25, 2018

I ran into the same problem, and pyxlsb handled it for me:

from pyxlsb import open_workbook

with open_workbook('HugeDataFile.xlsb') as wb:
    for sheetname in wb.sheets:
        with wb.get_sheet(sheetname) as sheet:
            for row in sheet.rows():
                # r.v is the cell value; it can be a number or None, not only a string
                values = ['' if r.v is None else str(r.v) for r in row]
                csv_line = ','.join(values)  # or do your thing
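
Joining with ',' breaks as soon as a value itself contains a comma or a quote, so for real files it is probably safer to let the csv module do the quoting and to write the output explicitly as UTF-8, which should also keep the Greek text intact. Here is a minimal sketch of that idea; `cell_values`, `write_csv` and `xlsb_sheet_to_csv` are hypothetical helper names, and it assumes pyxlsb hands back text cells as ordinary Python strings.

```python
import csv


def cell_values(row):
    """Return the values of one row, with empty cells as ''."""
    return ['' if cell.v is None else cell.v for cell in row]


def write_csv(rows, csv_path):
    """Write rows of cell-like objects (anything with a .v attribute) as UTF-8 CSV."""
    # newline='' lets the csv module control the line endings
    with open(csv_path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(cell_values(row))


def xlsb_sheet_to_csv(xlsb_path, sheetname, csv_path):
    """Convert one sheet of an xlsb workbook to a CSV file."""
    # Imported here so the helpers above can be used without pyxlsb installed
    from pyxlsb import open_workbook

    with open_workbook(xlsb_path) as wb:
        with wb.get_sheet(sheetname) as sheet:
            write_csv(sheet.rows(), csv_path)
```

Usage would look like `xlsb_sheet_to_csv('HugeDataFile.xlsb', 'Sheet1', 'Sheet1.csv')`, where 'Sheet1' stands in for whatever names `wb.sheets` reports for your workbook.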