How to properly assemble a valid xlsx file from its internal sub-components?

nick130586 · Jun 18, 2012

I'm trying to create an xlsx file programmatically on iOS. Since the internal data of an xlsx file is basically stored in separate XML files, I tried to recreate the xlsx structure with all its files and subdirectories, compress them into a zip file, and change its extension to xlsx. I use the GDataXML parser/writer to create all the necessary XML files. However, the file I get can't be opened as an xlsx file. Even if I extract all the data from a valid xlsx file, recreate each XML file manually by copying the data from the originals, and compress them manually, I still can't produce a valid xlsx file.

The questions are:

  • Is xlsx really just an archive containing XML files?
  • How do I create a valid xlsx file programmatically if I can't just compress the XML files into a zip file and change its extension to xlsx?

Answer

jmcnamara · Jun 20, 2012

In answer to your questions:

  1. XLSX is just a collection of XML files in a zip container. There is no other magic.
  2. If you unzip a valid XLSX file, re-zip it, and can't read the resulting output, then the problem is with the zipping software or with the files you re-zipped. Try a different library or utility, or check the default compression type and level that it uses and try to match them to whatever Excel uses. Also check the zip file to make sure the directory structure was maintained.
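A quick way to check the directory structure mentioned in point 2 is to list the entry names in the archive. The following is only a minimal sketch in Python (used here for illustration, not the iOS toolchain from the question). The function name `find_layout_problems` is hypothetical, and the two entries it looks for, `[Content_Types].xml` and `_rels/.rels`, are assumed to sit at the archive root as they do in files written by Excel. It is a heuristic for spotting a zipped parent directory, not a full validity check.

```python
import zipfile

# Entries assumed to sit at the archive root in a package written by Excel.
# This is a quick heuristic, not a complete validity check.
EXPECTED_ROOT_ENTRIES = ("[Content_Types].xml", "_rels/.rels")


def find_layout_problems(path):
    """Return a list of human-readable problems with the zip layout."""
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()

    problems = []
    for expected in EXPECTED_ROOT_ENTRIES:
        if expected in names:
            continue
        problems.append(f"missing {expected} at archive root")
        # A copy nested under another directory suggests that the parent
        # directory was zipped instead of its contents.
        nested = [n for n in names if n.endswith("/" + expected)]
        if nested:
            problems.append(f"found nested copy: {nested[0]}")
    return problems
```

An empty list means the expected root entries were found. A "nested copy" message such as `newdir/[Content_Types].xml` points at the parent-directory problem rather than at the XML content itself.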

Example of the contents of an xlsx file:

unzip -l example.xlsx
Archive:  example.xlsx
  Length     Date   Time    Name
 --------    ----   ----    ----
      769  10-15-14 09:23   xl/worksheets/sheet1.xml
      550  10-15-14 09:22   xl/workbook.xml
      201  10-15-14 09:22   xl/sharedStrings.xml
      ...

I regularly unzip XLSX files, make minor changes for testing and re-zip them without any issue.

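Update: The important thing is to avoid zipping the parent directory. Run zip from inside the unzipped directory so that entry names start at the package root (for example xl/workbook.xml, not newdir/xl/workbook.xml). Here is an example using the zip system utility on Linux or OS X: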

# Unzip an xlsx file into a directory.
unzip example.xlsx -d newdir

# Make some valid changes to the files.
cd newdir/
vi xl/worksheets/sheet1.xml

# Rezip the files *FROM* the unzipped directory.
# Note: you could also re-zip to the original file if required.
find . -type f | xargs zip ../newfile.xlsx

# Check the file looks okay (xdg-open is for Linux; use "open" on OS X).
cd ..
unzip -l newfile.xlsx
xdg-open newfile.xlsx
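
The same re-zip step can be done programmatically. Below is a minimal sketch using Python's standard `zipfile` module, again only as an illustration and not as the method used in the answer. The function name `zip_directory_as_xlsx` is hypothetical, and the choice of deflate compression is an assumption. The point it shows is that each entry name is computed relative to the unzipped directory, so the parent directory never appears in the archive.

```python
import os
import zipfile


def zip_directory_as_xlsx(src_dir, out_path):
    """Zip the *contents* of src_dir so entry names are relative to it."""
    out_abs = os.path.abspath(out_path)
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                if os.path.abspath(full) == out_abs:
                    continue  # do not add the output archive to itself
                # Entry name relative to src_dir, with forward slashes,
                # e.g. "xl/workbook.xml" rather than "newdir/xl/workbook.xml".
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                archive.write(full, arcname)
```

For the directory from the shell example, this would be called as `zip_directory_as_xlsx("newdir", "newfile.xlsx")`. Like the `find . -type f` pipeline, it adds only files and no separate directory entries.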