Concatenate Pandas columns under new multi-index level

Zero picture Zero · May 12, 2014 · Viewed 33.2k times · Source

Given a dictionary of data frames like:

dict = {'ABC': df1, 'XYZ' : df2}   # of any length...

where each data frame has the same columns and similar index, for example:

data           Open     High      Low    Close   Volume
Date                                                   
2002-01-17  0.18077  0.18800  0.16993  0.18439  1720833
2002-01-18  0.18439  0.21331  0.18077  0.19523  2027866
2002-01-21  0.19523  0.20970  0.19162  0.20608   771149

What is the simplest way to combine all the data frames into one, with a multi-index like:

symbol         ABC                                       XYZ
data           Open     High      Low    Close   Volume  Open ...
Date                                                   
2002-01-17  0.18077  0.18800  0.16993  0.18439  1720833  ...
2002-01-18  0.18439  0.21331  0.18077  0.19523  2027866  ...
2002-01-21  0.19523  0.20970  0.19162  0.20608   771149  ...

I've tried a few methods - eg for each data frame replace the columns with a multi-index like .from_product(['ABC', columns]) and then concatenate along axis=1, without success.

Answer

Karl D. picture Karl D. · May 12, 2014

You can do it with concat (the keys argument will create the hierarchical columns index):

d = {'ABC' : df1, 'XYZ' : df2}
print pd.concat(d.values(), axis=1, keys=d.keys())


                XYZ                                          ABC           \
               Open     High      Low    Close   Volume     Open     High   
Date                                                                        
2002-01-17  0.18077  0.18800  0.16993  0.18439  1720833  0.18077  0.18800   
2002-01-18  0.18439  0.21331  0.18077  0.19523  2027866  0.18439  0.21331   
2002-01-21  0.19523  0.20970  0.19162  0.20608   771149  0.19523  0.20970   


                Low    Close   Volume  
Date                                   
2002-01-17  0.16993  0.18439  1720833  
2002-01-18  0.18077  0.19523  2027866  
2002-01-21  0.19162  0.20608   771149

Really concat wants lists so the following is equivalent:

print(pd.concat([df1, df2], axis=1, keys=['ABC', 'XYZ']))