For a dataframe like this:
    import numpy as np
    import pandas as pd

    d = {'id': [1, 1, 1, 2, 2], 'Month': [1, 2, 3, 1, 3], 'Value': [12, 23, 15, 45, 34], 'Cost': [124, 214, 1234, 1324, 234]}
    df = pd.DataFrame(d)
       Cost  Month  Value  id
    0   124      1     12   1
    1   214      2     23   1
    2  1234      3     15   1
    3  1324      1     45   2
    4   234      3     34   2
to which I apply pivot_table
    df2 = pd.pivot_table(df,
                         values=['Value', 'Cost'],
                         index=['id'],
                         columns=['Month'],
                         aggfunc=np.sum,
                         fill_value=0)
to get df2:
           Cost            Value
    Month     1    2     3     1   2   3
    id
    1       124  214  1234    12  23  15
    2      1324    0   234    45   0  34
Is there an easy way to format the resulting dataframe's column names like this?
    id  Cost1  Cost2  Cost3  Value1  Value2  Value3
     1    124    214   1234      12      23      15
     2   1324      0    234      45       0      34
If I do:
    df2.columns = [s1 + str(s2) for (s1, s2) in df2.columns.tolist()]
I get:
        Cost1  Cost2  Cost3  Value1  Value2  Value3
    id
    1     124    214   1234      12      23      15
    2    1324      0    234      45       0      34
How do I get rid of the extra level, so that id becomes an ordinary column?
Thanks!
Using clues from @chrisb's answer, this gave me exactly what I was after:
    df2.reset_index(inplace=True)
which gives:
    id  Cost1  Cost2  Cost3  Value1  Value2  Value3
     1    124    214   1234      12      23      15
     2   1324      0    234      45       0      34
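Putting the pieces together, here is a minimal end-to-end sketch of the steps above. It assumes a reasonably recent pandas and passes aggfunc='sum' as a string instead of np.sum (my substitution, to avoid the deprecation warning newer versions emit); the name flat is just illustrative.

```python
import pandas as pd

d = {'id': [1, 1, 1, 2, 2],
     'Month': [1, 2, 3, 1, 3],
     'Value': [12, 23, 15, 45, 34],
     'Cost': [124, 214, 1234, 1324, 234]}
df = pd.DataFrame(d)

# Pivot: one row per id, one (measure, month) column pair per combination.
df2 = pd.pivot_table(df,
                     values=['Value', 'Cost'],
                     index=['id'],
                     columns=['Month'],
                     aggfunc='sum',   # assumption: string alias instead of np.sum
                     fill_value=0)

# Collapse the two column levels into single names such as 'Cost1'.
# str() is needed because the Month level holds integers.
df2.columns = [s1 + str(s2) for (s1, s2) in df2.columns.tolist()]

# Turn the 'id' index back into an ordinary column.
flat = df2.reset_index()
print(flat)
```

Using reset_index() without inplace=True returns a new frame and leaves df2 untouched, which is handy if you still need the indexed version.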
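In case of multiple index columns, this post explains it well. Just to be complete, here is how. Note that str.join only accepts strings, so if a level holds non-string labels (such as the integer months above), convert them with str first:
    df2.columns = [' '.join(map(str, col)).strip() for col in df2.columns.values]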
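One way to generalize the join-based one-liner to any number of column levels and to non-string labels is a small helper. This is only a sketch: flatten_columns and its sep parameter are hypothetical names of mine, not part of pandas.

```python
import pandas as pd

def flatten_columns(frame, sep=''):
    """Return a copy of `frame` with MultiIndex columns joined into flat names.

    Hypothetical helper, not a pandas API. Empty level values (as produced
    by reset_index on a frame with MultiIndex columns) are skipped.
    """
    out = frame.copy()
    new_names = []
    for col in out.columns:
        # MultiIndex columns iterate as tuples; plain columns as scalars.
        parts = col if isinstance(col, tuple) else (col,)
        new_names.append(sep.join(str(p) for p in parts if str(p) != ''))
    out.columns = new_names
    return out

# Small illustrative frame shaped like a pivot result after reset_index().
cols = pd.MultiIndex.from_tuples([('id', ''), ('Cost', 1), ('Cost', 2)])
example = pd.DataFrame([[1, 124, 214], [2, 1324, 0]], columns=cols)

result = flatten_columns(example, sep='_')
print(list(result.columns))
```

Skipping empty parts plays the same role as the .strip() in the one-liner: it keeps a column like ('id', '') from ending up with a dangling separator.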