Writing fixed width, space delimited CSV output in Python

jvm · Apr 12, 2011

I would like to write a fixed width, space delimited and minimally quoted CSV file using Python's csv writer. An example of the output:

item1           item2  
"next item1"    "next item2"
anotheritem1    anotheritem2  

If I use

writer.writerow( ("{0:15s}".format(item1), "{0:15s}".format(item2)) )
...

then, with the space delimiter, the formatting breaks: quotes or escapes (depending on the csv.QUOTE_* constant) are added because of the trailing spaces that the padding puts into each item:

"item1          " "item2          "
"next item1     " "next item2     "
"anotheritem1   " "anotheritem2   "
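
For anyone who wants to reproduce this, here is a minimal sketch of the behaviour described above. The helper name write_padded and the use of io.StringIO instead of a real file are assumptions made for illustration; the 15-character padding is taken from the snippet above.

```python
import csv
import io

def write_padded(rows, **fmtparams):
    """Write rows padded to 15 characters and return the resulting text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=" ", lineterminator="\n", **fmtparams)
    for row in rows:
        writer.writerow(["{0:15s}".format(item) for item in row])
    return buf.getvalue()

rows = [("item1", "item2"), ("next item1", "next item2")]

# QUOTE_MINIMAL quotes every field, because the padding contains the delimiter.
quoted = write_padded(rows, quoting=csv.QUOTE_MINIMAL)

# QUOTE_NONE requires an escapechar and escapes each padding space instead.
escaped = write_padded(rows, quoting=csv.QUOTE_NONE, escapechar="\\")

print(quoted)
print(escaped)
```

Either way the padding ends up inside the field, which is why the columns no longer look like the desired output.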

Of course, I could format everything myself:

writer.writerow( ("{0:15s}{1:15s}".format(item1, item2)) )

but then there is not much point in using the csv writer. I would also have to handle manually the cases where a space is embedded in an item and quoting or escaping is required. In other words, it seems I would need a (non-existent) "QUOTE_ABSOLUTELYMINIMAL" csv constant that behaves like "QUOTE_MINIMAL" but also ignores trailing spaces.

Is there a way to achieve the "QUOTE_ABSOLUTELYMINIMAL" behaviour or another way to get a fixed width, space delimited CSV output using Python's CSV module?

The reason I want the fixed-width feature in a CSV file is better readability. The file will still be processed as CSV for both reading and writing, but the column structure makes it easier to read. Reading is not a problem, as the csv skipinitialspace option takes care of ignoring the extra spaces. To my surprise, writing seems to be a problem...

EDIT: I conclude that this is impossible to achieve with the current csv module. It is not a built-in option, and I cannot see any reasonable way to achieve it manually, as there seems to be no way to make Python's csv writer emit extra delimiters without quoting or escaping them. Thus, I will probably have to write my own csv writer.
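
A hand-written writer of the kind mentioned in the EDIT might look roughly like the sketch below. It is not part of the csv module: the names quote_if_needed and format_row are hypothetical, the width of 15 comes from the example above, and the quoting rule (quote when a field is empty or contains a space, a quote character or a line break) is an assumption about what "QUOTE_ABSOLUTELYMINIMAL" would mean. The idea is to quote first and pad afterwards, so the padding stays outside the quotes.

```python
import io

def quote_if_needed(value, quotechar='"'):
    """Quote only when the field would otherwise be ambiguous."""
    text = str(value)
    special = (" ", quotechar, "\r", "\n")
    if text != "" and not any(ch in text for ch in special):
        return text
    # Double embedded quote characters, as csv does with doublequote=True.
    return quotechar + text.replace(quotechar, quotechar * 2) + quotechar

def format_row(row, width=15):
    """Quote first, then pad, so the padding is never part of a field."""
    cells = [quote_if_needed(value).ljust(width) for value in row]
    # Drop the padding after the last cell to avoid trailing blanks.
    return " ".join(cells).rstrip(" ")

rows = [
    ("item1", "item2"),
    ("next item1", "next item2"),
    ("anotheritem1", "anotheritem2"),
]

out = io.StringIO()  # stands in for a real file object
for row in rows:
    out.write(format_row(row) + "\n")
print(out.getvalue(), end="")
```

Fields longer than the chosen width simply overflow their column in this sketch; whether to truncate or widen in that case is a separate design decision.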

Answer

Ethan Furman · Aug 5, 2011

The basic problem you are running into is that csv and fixed-format are basically opposing views of data storage. Making them work together is not a common practice. Also, if you only have quotes on the items with spaces in them, it will throw off the alignment on those rows:

testing     "rather hmm "
strange     "ways to    "
"store some " "csv data   "
testing     testing    

Reading that data back in gives wrong results as well:

'testing' 'rather hmm '
'strange' 'ways to    '
'store some ' 'csv data   '
'testing' 'testing' ''

Notice the extra field at the end of the last row. Given these problems, I would go with your example of

"item1          " "item2          "
"next item1     " "next item2     "
"anotheritem1   " "anotheritem2   "

which I find very readable, is easy to generate with the existing csv library, and gets correctly parsed when read back in. Here's the code I used to generate it:

import csv

class SpaceCsv(csv.Dialect):
    "csv format for exporting tables"
    delimiter = ' '
    doublequote = True
    escapechar = None
    lineterminator = '\n'
    quotechar = '"'
    skipinitialspace = True
    quoting = csv.QUOTE_MINIMAL
csv.register_dialect('space', SpaceCsv)

data = (
        ('testing    ', 'rather hmm '),
        ('strange    ', 'ways to    '),
        ('store some ', 'csv data   '),
        ('testing    ', 'testing    '),
        )

# newline='' lets the csv module control the line endings (Python 3)
with open(r'c:\tmp\fixed.csv', 'w', newline='') as temp:
    writer = csv.writer(temp, dialect='space')
    for row in data:
        writer.writerow(row)

You will, of course, need to have all your data padded to the same length, either before getting to the function that does all this, or in the function itself. Oh, and if you have numeric data you'll have to make padding allowances for that as well.
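
As a rough sketch of the padding step, assuming text should be left-justified and numbers right-justified (the helper names pad and write_fixed and the width of 11 are made up for illustration), something like this could sit in front of the writer. It uses csv.QUOTE_ALL rather than the QUOTE_MINIMAL of the dialect above, on the assumption that you want a value which exactly fills the column to be quoted like its neighbours.

```python
import csv
import io

def pad(value, width):
    """Left-justify text and right-justify numbers to a fixed width."""
    if isinstance(value, (int, float)):
        return str(value).rjust(width)
    return str(value).ljust(width)

def write_fixed(rows, stream, width=11):
    """Pad every value, then let the csv writer quote the padded fields."""
    writer = csv.writer(stream, delimiter=" ", quoting=csv.QUOTE_ALL,
                        lineterminator="\n")
    for row in rows:
        writer.writerow([pad(value, width) for value in row])

buf = io.StringIO()  # stands in for the output file
write_fixed([("testing", 42), ("store some", 3.5)], buf)
print(buf.getvalue(), end="")
```

When the file is read back, the padding is part of each quoted field, so each value still has to be stripped, and numbers have to be converted from strings.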