plotting value_counts() in seaborn barplot

AZhao · Jul 16, 2015 · Viewed 31.4k times

I'm having trouble getting a barplot in seaborn. Here's my reproducible data:

import pandas as pd
import seaborn as sns

people = ['Hannah', 'Bethany', 'Kris', 'Alex', 'Earl', 'Lori']
reputation = ['awesome', 'cool', 'brilliant', 'meh', 'awesome', 'cool']
# One row per person (as the index), with a single 'reputation' column
df = pd.DataFrame({'reputation': reputation}, index=people)

Then I want a bar plot showing how many times each reputation value occurs. I've tried:

sns.barplot(x = 'reputation', y = df['reputation'].value_counts(), data = df, ci = None)

and

sns.barplot(x = 'reputation', y = df['reputation'].value_counts().values, data = df, ci = None)

but both return blank plots.

Any idea what I can do to get this?

Answer

BrenBarn · Jul 16, 2015

In seaborn 0.6 and later, you can use the countplot function:

seaborn.countplot(x='reputation', data=df)
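
As a fuller sketch of the countplot route, here is a self-contained version that rebuilds the question's data. The `order` argument is optional and is an addition here, used only to sort the bars from most to least frequent; the `Agg` backend line is likewise an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this in a notebook
import pandas as pd
import seaborn as sns

people = ['Hannah', 'Bethany', 'Kris', 'Alex', 'Earl', 'Lori']
reputation = ['awesome', 'cool', 'brilliant', 'meh', 'awesome', 'cool']
df = pd.DataFrame({'reputation': reputation}, index=people)

# Optional: list the categories from most to least frequent
bar_order = list(df['reputation'].value_counts().index)

# countplot does the counting itself, so only the raw column is passed
ax = sns.countplot(x='reputation', data=df, order=bar_order)
ax.set_ylabel('count')
```

Because countplot counts the rows itself, there is nothing to align by hand.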

To do it with barplot you'd need something like this:

seaborn.barplot(x=df.reputation.value_counts().index, y=df.reputation.value_counts())

You can't pass 'reputation' as a column name to x while also passing the counts in y. Passing 'reputation' for x uses every value of df.reputation (all six rows, not just the four unique values) as the x values, and seaborn has no way to align these with the four counts. Instead, pass the unique values as x and the counts as y. Both must come from the same value_counts result (call it twice, as above, or call it once and reuse the result) so that each label lines up with its count; otherwise you would have to sort the unique values and the counts consistently yourself.
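
A minimal sketch of the barplot route that computes the counts only once, assuming the same data as in the question. The variable name `counts` and the axis labels are illustrative, and the `Agg` backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this in a notebook
import pandas as pd
import seaborn as sns

people = ['Hannah', 'Bethany', 'Kris', 'Alex', 'Earl', 'Lori']
reputation = ['awesome', 'cool', 'brilliant', 'meh', 'awesome', 'cool']
df = pd.DataFrame({'reputation': reputation}, index=people)

# Compute the counts once. The index holds the unique values and the
# values hold their counts, so the two stay aligned by construction.
counts = df['reputation'].value_counts()

ax = sns.barplot(x=counts.index, y=counts.values)
ax.set_xlabel('reputation')
ax.set_ylabel('count')
```

Storing the result avoids the second value_counts call and makes it obvious that the labels and heights come from the same Series.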