Delete column from pandas DataFrame

John · Nov 16, 2012 · Viewed 2M times

When deleting a column in a DataFrame I use:

del df['column_name']

And this works great. Why can't I use the following?

del df.column_name

Since it is possible to access the column/Series as df.column_name, I expected this to work.

Answer

LondonRob · Aug 9, 2013

The best way to do this in pandas is to use drop:

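df = df.drop('column_name', axis=1)

where axis=1 selects columns (axis=0, the default, selects rows). Pass axis as a keyword: pandas 2.0 and later no longer accept it as a second positional argument.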

To delete the column without having to reassign df you can do:

df.drop('column_name', axis=1, inplace=True)
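
To make the difference between the two forms concrete, here is a minimal sketch. The DataFrame and its column names are made up for illustration; only the drop calls come from the answer above.

```python
import pandas as pd

# Hypothetical sample data; the column names are placeholders.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "column_name": [5, 6]})

# Without inplace, drop returns a new DataFrame and leaves df untouched.
trimmed = df.drop("column_name", axis=1)
cols_after_copy = list(df.columns)      # df still has all three columns
trimmed_cols = list(trimmed.columns)    # the copy has lost 'column_name'

# With inplace=True, df itself is modified and the call returns None.
result = df.drop("column_name", axis=1, inplace=True)
cols_after_inplace = list(df.columns)
```

Because the inplace form returns None, writing df = df.drop(..., inplace=True) would replace df with None, so use one style or the other, not both.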

Finally, to drop by column position instead of by column label, for example the 1st, 2nd and 4th columns:

df = df.drop(df.columns[[0, 1, 3]], axis=1)  # df.columns is zero-based pd.Index 
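
As a rough illustration of what that line does, the sketch below splits it into two steps on a made-up five-column DataFrame (the names w, x, y, z, v are placeholders): first look up the labels at the given positions, then drop those labels.

```python
import pandas as pd

# Hypothetical five-column frame; the column names are placeholders.
df = pd.DataFrame({"w": [1], "x": [2], "y": [3], "z": [4], "v": [5]})

# Indexing df.columns with a list of positions yields the matching labels.
labels = list(df.columns[[0, 1, 3]])

# drop then removes those labels, exactly as in the label-based form.
remaining = df.drop(labels, axis=1)
remaining_cols = list(remaining.columns)
```

Note that this still drops by label under the hood, so if column labels are duplicated, every column sharing a selected label is removed.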

Dropping by label also works with a list of column names:

df.drop(['column_nameA', 'column_nameB'], axis=1, inplace=True)

Note: Since v0.21.0 (October 27, 2017), the drop() method accepts index/columns keywords as an alternative to specifying the axis.

So we can now just do:

df = df.drop(columns=['B', 'C'])
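
A small sketch to check that the keyword form matches the axis form. The sample DataFrame is an assumption made for illustration; the column names B and C are taken from the line above.

```python
import pandas as pd

# Hypothetical sample data with the columns named in the answer.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Both calls return a new DataFrame without columns B and C.
by_keyword = df.drop(columns=["B", "C"])
by_axis = df.drop(["B", "C"], axis=1)

same = by_keyword.equals(by_axis)
kept = list(by_keyword.columns)
```

Neither call modifies df, so assign the result (or pass inplace=True) if you want the change to stick.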