My dataframe has four columns with colors. I want to combine them into one column called "Colors" and use commas to separate the values.
For example, I'm trying to produce a Colors column like this:
ID Black Red Blue Green Colors
120 NaN red NaN green red, green
121 black NaN blue NaN black, blue
My code is:
df['Colors'] = df[['Black', 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x), axis=1)
But the output for ID 120 is: , red, , green
And the output for ID 121 is: black, , blue,
Found my problem! Earlier in my code, I had replaced "None" with " " (a blank string) instead of NaN, so join still saw a value in every column. After changing that, and incorporating the feedback to add [x.notnull()], it works:
df['Black'] = df['Black'].replace('None', np.nan)
df['Colors'] = df[['Black', 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x[x.notnull()]), axis=1)
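For anyone hitting the same thing, here is a minimal sketch of the cleanup applied to all four color columns at once rather than to one column. The sample data and the names color_cols and cleaned are made up for illustration; it assumes the placeholders are either the literal string "None" or whitespace-only strings.

```python
import numpy as np
import pandas as pd

# Hypothetical data: missing colors are stored as "None" or " " instead of NaN.
df = pd.DataFrame({
    "ID": [120, 121],
    "Black": [" ", "black"],
    "Red": ["red", " "],
    "Blue": ["None", "blue"],
    "Green": ["green", "None"],
})

color_cols = ["Black", "Red", "Blue", "Green"]

# Turn both kinds of placeholder into real NaN values.
cleaned = (
    df[color_cols]
    .replace("None", np.nan)                  # literal string "None"
    .replace(r"^\s*$", np.nan, regex=True)    # empty or whitespace-only strings
)

# Keep only the non-null entries of each row, then join them.
df["Colors"] = cleaned.apply(lambda x: ", ".join(x[x.notnull()]), axis=1)
print(df)
```

With the placeholders converted first, the notnull() filter has real NaN values to drop, so no stray commas are left in Colors.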
You just need to drop the NaNs in each row before joining:
df['Colors'] = df[['Black', 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x[x.notnull()]), axis=1)
ID Black Red Blue Green Colors
0 120 NaN red NaN green red, green
1 121 black NaN blue NaN black, blue
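If you want to try this yourself, the following is a self-contained sketch that rebuilds the table from the question (the DataFrame construction is my assumption about how the data looks) and also shows Series.dropna() as an equivalent way to write the x[x.notnull()] filter.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data, with real NaN values.
df = pd.DataFrame({
    "ID": [120, 121],
    "Black": [np.nan, "black"],
    "Red": ["red", np.nan],
    "Blue": [np.nan, "blue"],
    "Green": ["green", np.nan],
})

color_cols = ["Black", "Red", "Blue", "Green"]

# Boolean-mask version, as in the answer above.
df["Colors"] = df[color_cols].apply(lambda x: ", ".join(x[x.notnull()]), axis=1)

# Equivalent version using dropna() on each row.
colors_dropna = df[color_cols].apply(lambda x: ", ".join(x.dropna()), axis=1)

print(df)
print(df["Colors"].equals(colors_dropna))
```

Both versions rely on the missing entries being actual NaN; blank strings or the text "None" would pass through either filter and end up in the joined result.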