How to apply custom column order (on Categorical) to pandas boxplot?

Question 1

python pandas boxplot categorical-data

smci · Mar 21, 2013 · Viewed 12.6k times · Source

Answer

It's hard to say how to do this without a working example. My first guess would be to add an integer column holding the order that you want and group by that column instead.
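A rough sketch of that first guess, assuming a small invented `train` frame (the real data is not shown in the question) and a hypothetical `desired_order` list and `category_rank` helper column. The Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Invented sample data standing in for the question's `train` frame.
train = pd.DataFrame({
    "Category": ["Admin Jobs", "Travel Jobs", "Accounting & Finance Jobs"] * 4,
    "Salary": [20, 50, 35, 22, 48, 33, 19, 52, 36, 21, 49, 34],
})

# Hypothetical left-to-right order for the boxes.
desired_order = ["Travel Jobs", "Admin Jobs", "Accounting & Finance Jobs"]

# Integer helper column: position of each row's category in the desired order.
rank = {name: i for i, name in enumerate(desired_order)}
train["category_rank"] = train["Category"].map(rank)

fig, ax = plt.subplots()
# Grouping by the integer column sorts the boxes by rank.
train.boxplot(column="Salary", by="category_rank", ax=ax)
# The ticks would otherwise show 0, 1, 2, so relabel them with the names.
ax.set_xticklabels(desired_order)
```

The drawback is the extra column and the manual relabelling of the ticks.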

A simple, brute-force way would be to add each boxplot one at a time.

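import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Example data: 37 rows, one column per box
df = pd.DataFrame(np.random.rand(37, 4), columns=list('ABCD'))
columns_my_order = ['C', 'A', 'D', 'B']
fig, ax = plt.subplots()
# Draw one box per column, at the x position given by the custom order
for position, column in enumerate(columns_my_order):
    ax.boxplot(df[column], positions=[position])

# Label each position with its column name
ax.set_xticks(range(len(columns_my_order)))
ax.set_xticklabels(columns_my_order)
# Leave room to the left of the first box
ax.set_xlim(left=-0.5)
plt.show()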
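When the data is already wide (one column per box), as in the example above, a shorter alternative to the loop is to reorder the columns before plotting, since DataFrame.boxplot draws columns in the order they appear in the frame. A minimal sketch; the Agg backend line is only there so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(37, 4), columns=list("ABCD"))
columns_my_order = ["C", "A", "D", "B"]

fig, ax = plt.subplots()
# Selecting with a list returns the columns in that order,
# and boxplot() draws them left to right in frame order.
df[columns_my_order].boxplot(ax=ax)

# Read back the tick labels to check the order that was drawn.
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
print(tick_labels)
```

This does not help with the by='Category' form from the question, where the groups come from the values of a column rather than from separate columns.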

Question 2

EDIT: this question arose back with pandas ~0.13 and was made obsolete by direct support added somewhere between versions 0.15 and 0.18 (as per @Cireo's late answer).

I can get a boxplot of a salary column in a pandas DataFrame...

train.boxplot(column='Salary', by='Category', sym='')

...however I can't figure out how to define the index-order used on column 'Category' - I want to supply my own custom order, according to another criterion:

category_order_by_mean_salary = train.groupby('Category')['Salary'].mean().order().keys()

How can I apply my custom column order to the boxplot columns? (other than ugly kludging the column names with a prefix to force ordering)
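For readers on a recent pandas, here is a sketch of how the ordering could be applied with an ordered categorical. The sample data is invented, sort_values() stands in for the old Series.order() call (which newer pandas releases no longer provide), and the expectation that boxplot(by=...) follows the category order is based on how groupby treats categoricals, so check it against your version.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Invented sample data standing in for the question's `train` frame.
train = pd.DataFrame({
    "Category": ["Admin Jobs", "Travel Jobs", "Accounting & Finance Jobs"] * 4,
    "Salary": [20, 50, 35, 22, 48, 33, 19, 52, 36, 21, 49, 34],
})

# Categories sorted by ascending mean salary.
category_order_by_mean_salary = (
    train.groupby("Category")["Salary"].mean().sort_values().index
)

# Turn the string column into an ordered categorical with that order.
train["Category"] = pd.Categorical(
    train["Category"], categories=category_order_by_mean_salary, ordered=True
)

fig, ax = plt.subplots()
# Groups of a categorical column come out in category order, not alphabetically.
train.boxplot(column="Salary", by="Category", sym="", ax=ax)

# Read back the tick labels to check the order that was drawn.
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
print(tick_labels)
```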

'Category' is a string column (it really should be a categorical, but this was back in 0.13, when categoricals were third-class citizens) taking 27 distinct values: ['Accounting & Finance Jobs','Admin Jobs',...,'Travel Jobs']. So it can easily be factorized with pd.Categorical.from_array().
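As a small illustration of that factorization on a current pandas, where as far as I know from_array() is no longer available, the plain pd.Categorical constructor does the same job. The four sample values below are made up from the category names listed above.

```python
import pandas as pd

# A few made-up rows using category names from the question.
values = ["Travel Jobs", "Admin Jobs", "Accounting & Finance Jobs", "Admin Jobs"]

# Without an explicit `categories` argument the categories are the sorted
# unique values, which is exactly the default order the question wants to override.
cat = pd.Categorical(values)

categories = list(cat.categories)  # distinct labels, sorted
codes = cat.codes.tolist()         # integer code of each row's label
print(categories)
print(codes)
```

Passing categories=[...] in the desired order (with ordered=True) is what makes a custom order possible.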

On inspection, the limitation is inside pandas.tools.plotting.py:boxplot(), which converts the column object without allowing ordering:

pandas.core.frame.py.boxplot() is a passthrough to
pandas.tools.plotting.py:boxplot() which instantiates ...
matplotlib.pyplot.py:boxplot() which instantiates ...
matplotlib.axes.py:boxplot()

I suppose I could either hack up a custom version of pandas boxplot() or reach into the internals of the object, and also file an enhancement request.
