I have a pandas DataFrame in which one column holds a dictionary of key:value pairs in each cell, like this:
{"name":"Test Thorton","company":"Test Group","address":"10850 Test #325\r\n","city":"Test City","state_province":"CA","postal_code":"95670","country":"USA","email_address":"[email protected]","phone_number":"999-888-3333","equipment_description":"I'm a big red truck\r\n\r\nRSN# 0000","response_desired":"week","response_method":"email"}
I'm trying to parse the dictionary so that the resulting DataFrame has a new column for each key, with each row populated by the corresponding values, like this:
//Before
1 2 3 4 5
a b c d {6:y, 7:v}
//After
1 2 3 4 5 6 7
a b c d {6:y, 7:v} y v
Suggestions much appreciated.
Consider df:
import pandas as pd
df = pd.DataFrame([
['a', 'b', 'c', 'd', dict(F='y', G='v')],
['a', 'b', 'c', 'd', dict(F='y', G='v')],
], columns=list('ABCDE'))
df
A B C D E
0 a b c d {'F': 'y', 'G': 'v'}
1 a b c d {'F': 'y', 'G': 'v'}
Option 1
Use pd.Series.apply, then assign the new columns in place
df.E.apply(pd.Series)
F G
0 y v
1 y v
Assign it like this
df[['F', 'G']] = df.E.apply(pd.Series)
df.drop('E', axis=1)
A B C D F G
0 a b c d y v
1 a b c d y v
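If the keys are not known ahead of time, the column names do not have to be hard-coded. Here is a minimal sketch, assuming every cell in E holds a dict; the names `expanded` and `result` are only illustrative:

```python
import pandas as pd

df = pd.DataFrame([
    ['a', 'b', 'c', 'd', dict(F='y', G='v')],
    ['a', 'b', 'c', 'd', dict(F='y', G='v')],
], columns=list('ABCDE'))

# Expand each dict into its own set of columns; the keys become column names.
expanded = df['E'].apply(pd.Series)

# join aligns on the index, so no column names need to be spelled out.
result = df.drop(columns='E').join(expanded)
print(result)
```

Note that applying pd.Series row by row can be slow on large frames, so this is mainly a convenience for small data.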
Option 2
Pipeline the whole thing using the pd.DataFrame.assign method
df.drop('E', axis=1).assign(**pd.DataFrame(df.E.values.tolist()))
A B C D F G
0 a b c d y v
1 a b c d y v
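Two caveats are worth sketching. Building a DataFrame from the list of dicts gives it a fresh 0..n-1 index, so the new columns only line up if df has that default index too. And if the cells are JSON strings rather than dict objects (the sample in the question could be either), they need to be parsed first. The following is a rough sketch under those assumptions; the sample data and the index values 10 and 20 are made up for illustration:

```python
import json
import pandas as pd

df = pd.DataFrame(
    {'A': ['a', 'b'],
     'E': ['{"F": "y", "G": "v"}', '{"F": "x", "G": "w"}']},
    index=[10, 20],  # deliberately not the default RangeIndex
)

# Parse JSON text into dicts; skip this step if the cells are already dicts.
parsed = df['E'].apply(json.loads)

# Build the new columns, reusing df's index so the rows line up.
expanded = pd.DataFrame(parsed.tolist(), index=df.index)

result = df.drop(columns='E').join(expanded)
print(result)
```

Passing index=df.index is the important part: without it, the new columns would be aligned against labels 0 and 1 and come back as NaN.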