Python/Numpy - Save Array with Column AND Row Titles

Scott B picture Scott B · Mar 28, 2012 · Viewed 8.3k times · Source

I want to save a 2D array to a CSV file with row and column "header" information (like a table). I know that I could use the header argument to numpy.savetxt to save the column names, but is there any easy way to also include some other array (or list) as the first column of data (like row titles)?

Below is an example of how I currently do it. Is there a better way to include those row titles, perhaps some trick with savetxt I'm unaware of?

import csv
import numpy as np

data = np.arange(12).reshape(3,4)
# Add a '' for the first column because the row titles go there...
cols = ['', 'col1', 'col2', 'col3', 'col4']
rows = ['row1', 'row2', 'row3']

with open('test.csv', 'wb') as f:
   writer = csv.writer(f)
   writer.writerow(cols)
   for row_title, data_row in zip(rows, data):
      writer.writerow([row_title] + data_row.tolist())

Answer

jorgeca picture jorgeca · Mar 29, 2012

Maybe you'd prefer to do something like this:

# Column of row titles
rows = np.array(['row1', 'row2', 'row3'], dtype='|S20')[:, np.newaxis]
with open('test.csv', 'w') as f:
    np.savetxt(f, np.hstack((rows, data)), delimiter=', ', fmt='%s')

This is implicitly converting data to an array of strings, and takes about 200 ms for every million items in my computer.

The dtype '|S20' means strings of twenty characters. If it's too low, your numbers will get chopped:

>>> np.asarray([123], dtype='|S2')
array(['12'], 
  dtype='|S2')

Another option, that from my limited testing is slower, but gives you a lot more control and doesn't have the chopping issue would be using np.char.mod, like

# Column of row titles
rows = np.array(['row1', 'row2', 'row3'])[:, np.newaxis]
str_data = np.char.mod("%10.6f", data)
with open('test.csv', 'w') as f:
    np.savetxt(f, np.hstack((rows, str_data)), delimiter=', ', fmt='%s')