Structured 2D Numpy Array: setting column and row names

freebie · Jun 22, 2017 · Viewed 19.1k times

I'm trying to find a nice way to take a 2d numpy array and attach column and row names as a structured array. For example:

import numpy as np

column_names = ['a', 'b', 'c']
row_names    = ['1', '2', '3']

matrix = np.reshape((1, 2, 3, 4, 5, 6, 7, 8, 9), (3, 3))

# TODO: insert magic here

matrix['3']['a']  # 7

I've been able to set the column names like this:

matrix.dtype = [(n, matrix.dtype) for n in column_names]

This lets me do matrix[2]['a'] but now I want to rename the rows so I can do matrix['3']['a'].

Answer

MSeifert · Jun 22, 2017

As far as I know it's not possible to "name" the rows with pure structured NumPy arrays.
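One possible workaround that stays within NumPy is to keep a plain dict that maps each row name to its position and use it next to a structured array. This is only a sketch: `row_index` and `records` are hypothetical names introduced here for illustration, not NumPy features, and the lookup goes through the dict rather than through the array itself.

```python
import numpy as np
from numpy.lib.recfunctions import unstructured_to_structured

column_names = ['a', 'b', 'c']
row_names = ['1', '2', '3']

matrix = np.reshape((1, 2, 3, 4, 5, 6, 7, 8, 9), (3, 3))

# One named field per column, all sharing the matrix's element type.
record_dtype = np.dtype([(name, matrix.dtype) for name in column_names])

# 1-D structured array: one record per row of the original matrix.
records = unstructured_to_structured(matrix, dtype=record_dtype)

# Hypothetical helper (an assumption, not part of NumPy): row name -> row position.
row_index = {name: position for position, name in enumerate(row_names)}

# Translate the row name to a position first, then pick the field by name.
value = records[row_index['3']]['a']
print(value)
```

So `records['3']['a']` still does not work; the row name always has to be translated to a position first.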

But if you have pandas, it's possible to provide an "index" (which essentially acts like a "row name"):

>>> import pandas as pd
>>> import numpy as np
>>> column_names = ['a', 'b', 'c']
>>> row_names    = ['1', '2', '3']

>>> matrix = np.reshape((1, 2, 3, 4, 5, 6, 7, 8, 9), (3, 3))
>>> df = pd.DataFrame(matrix, columns=column_names, index=row_names)
>>> df
   a  b  c
1  1  2  3
2  4  5  6
3  7  8  9

>>> df['a']['3']      # first "column" then "row"
7

>>> df.loc['3', 'a']  # another way to index "row" and "column"
7
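
If you only need a single value, or want to get back to a plain array afterwards, something along these lines should work. It is a small sketch that rebuilds the same DataFrame as above; the variable names `value`, `row` and `back` are just for illustration.

```python
import numpy as np
import pandas as pd

column_names = ['a', 'b', 'c']
row_names = ['1', '2', '3']

matrix = np.reshape((1, 2, 3, 4, 5, 6, 7, 8, 9), (3, 3))
df = pd.DataFrame(matrix, columns=column_names, index=row_names)

# Scalar lookup by row label and column label.
value = df.at['3', 'a']

# A whole row as a Series whose labels are the column names.
row = df.loc['3']

# Back to a plain 2-D NumPy array; the row and column labels are dropped.
back = df.to_numpy()
```

Note that the row labels here are the strings '1', '2', '3', so `df.loc[3, 'a']` with an integer would not find the row.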