Ordering boxplot x-axis in seaborn

amaatouq · Nov 9, 2016

My dataframe round_data looks like this:

      error                         username                    task_path
0      0.02  n49vq14uhvy93i5uw33tf7s1ei07vngozrzlsr6q6cnh8w...    39.png
1      0.10  n49vq14uhvy93i5uw33tf7s1ei07vngozrzlsr6q6cnh8w...    45.png
2      0.15  n49vq14uhvy93i5uw33tf7s1ei07vngozrzlsr6q6cnh8w...    44.png
3      0.25  xdoaztndsxoxk3wycpxxkhaiew3lrsou3eafx3em58uqth...    43.png
...     ...                                                ...       ...
1170  -0.11  9qrz4829q27cu3pskups0vir0ftepql7ynpn6in9hxx3ux...    33.png
1171   0.15  9qrz4829q27cu3pskups0vir0ftepql7ynpn6in9hxx3ux...    34.png


[1198 rows x 3 columns]

I want to have a boxplot showing the error of each user sorted by their average performance. What I have is:

    ax = sns.boxplot(x="username", y="error", data=round_data,
                     whis=np.inf, color="c", ax=ax)

which results in this plot (image: boxplot).

How can I sort the x-axis (i.e., users) by mean error?

Answer

amaatouq · Nov 10, 2016

OK, I figured out the answer:

    grouped = round_data[round_data.batch==i].groupby("username")
    users_sorted_average = (
        pd.DataFrame({col: vals['absolute_error'] for col, vals in grouped})
        .mean()
        .sort_values(ascending=True)
    )

Passing the index of users_sorted_average (the usernames, sorted by mean) as the "order" parameter of the seaborn plot function gives the desired behavior:

    ax = sns.boxplot(x="username", y="error", data=round_data,
                     whis=np.inf, ax=ax, color="c",
                     order=users_sorted_average.index)
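
For anyone who wants to try the idea without the original data, here is a minimal sketch of the same approach, using a plain groupby mean instead of building an intermediate DataFrame. The toy values and the helper names order_by_mean and plot_sorted are made up for illustration; only the "username" and "error" column names come from the question.

```python
import pandas as pd

# Toy data standing in for round_data (values are invented for illustration).
toy = pd.DataFrame({
    "username": ["u1", "u1", "u2", "u2", "u3", "u3"],
    "error":    [0.30, 0.50, 0.10, 0.20, -0.10, 0.00],
})


def order_by_mean(df, group_col, value_col):
    """Return the labels of group_col sorted by ascending mean of value_col."""
    means = df.groupby(group_col)[value_col].mean()
    return means.sort_values(ascending=True).index.tolist()


def plot_sorted(df, order):
    """Draw the boxplot with the x-axis categories in the given order."""
    # seaborn is imported here so the ordering helper works without it.
    import seaborn as sns
    return sns.boxplot(x="username", y="error", data=df, order=order)


# Hypothetical usage: compute the order once, then hand it to the plot.
order = order_by_mean(toy, "username", "error")
```

Calling plot_sorted(toy, order) should then draw the boxes from lowest to highest mean error. Passing ascending=False to sort_values would reverse that.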
