Say I have a dictionary that looks like this:
dictionary = {'A' : {'a': [1,2,3,4,5],
'b': [6,7,8,9,1]},
'B' : {'a': [2,3,4,5,6],
'b': [7,8,9,1,2]}}
and I want a dataframe that looks something like this:
A B
a b a b
0 1 6 2 7
1 2 7 3 8
2 3 8 4 9
3 4 9 5 1
4 5 1 6 2
Is there a convenient way to do this? If I try:
In [99]:
DataFrame(dictionary)
Out[99]:
A B
a [1, 2, 3, 4, 5] [2, 3, 4, 5, 6]
b [6, 7, 8, 9, 1] [7, 8, 9, 1, 2]
I get a dataframe where each element is a list. What I need is a multiindex where each level corresponds to the keys in the nested dict and the rows corresponding to each element in the list as shown above. I think I can work a very crude solution but I'm hoping there might be something a bit simpler.
Pandas wants the MultiIndex values as tuples, not nested dicts. The simplest thing is to convert your dictionary to the right format before trying to pass it to DataFrame:
>>> reform = {(outerKey, innerKey): values for outerKey, innerDict in dictionary.iteritems() for innerKey, values in innerDict.iteritems()}
>>> reform
{('A', 'a'): [1, 2, 3, 4, 5],
('A', 'b'): [6, 7, 8, 9, 1],
('B', 'a'): [2, 3, 4, 5, 6],
('B', 'b'): [7, 8, 9, 1, 2]}
>>> pandas.DataFrame(reform)
A B
a b a b
0 1 6 2 7
1 2 7 3 8
2 3 8 4 9
3 4 9 5 1
4 5 1 6 2
[5 rows x 4 columns]