Are there functions to retrieve the histogram counts of a Series in pandas?

Rafael S. Calsaverini · Jun 17, 2013

There is a method to plot Series histograms, but is there a function to retrieve the histogram counts to do further calculations on top of it?

I keep using numpy's functions to do this and converting the result to a DataFrame or Series when I need this. It would be nice to stay with pandas objects the whole time.

Answer

Andy Hayden · Jun 17, 2013

If your Series is discrete, you can use value_counts:

In [11]: s = pd.Series([1, 1, 2, 1, 2, 2, 3])

In [12]: s.value_counts()
Out[12]:
2    3
1    3
3    1
dtype: int64

For discrete data like this, s.hist() shows essentially the same information as s.value_counts().sort_index().plot(kind='bar').
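As a small sketch of using those counts for further calculations: value_counts orders by frequency, so sorting by index puts the result in value order, the way a histogram would show it. The normalization step is only an illustration, not part of the original answer.

```python
import pandas as pd

s = pd.Series([1, 1, 2, 1, 2, 2, 3])

# Counts ordered by value rather than by frequency.
counts = s.value_counts().sort_index()

# The result is an ordinary Series, so further arithmetic stays in pandas,
# e.g. converting counts to relative frequencies.
freqs = counts / counts.sum()
```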

If the Series holds floats, an awfully hacky solution is to use groupby, here flooring each value down to the nearest 0.5 (this assumes numpy is imported as np):

s.groupby(lambda i: np.floor(2*s[i]) / 2).count()
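A less hacky route for floats, sketched here under assumptions: bin the values with pd.cut and count them, or wrap np.histogram's output in a Series. The sample data and the bin edges (width 0.5) are made up for illustration and are not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
s = pd.Series([0.1, 0.4, 0.6, 0.7, 1.2, 1.4, 1.9])

# Assumed bin edges of width 0.5: [0.0, 0.5, 1.0, 1.5, 2.0].
edges = np.arange(0.0, 2.5, 0.5)

# Option 1: pd.cut assigns each value to an interval; value_counts tallies them.
# right=False gives left-closed bins, matching the floor-based groupby.
binned = pd.cut(s, bins=edges, right=False)
counts = binned.value_counts().sort_index()

# Option 2: np.histogram, wrapped into a Series indexed by each bin's left edge.
hist, _ = np.histogram(s, bins=edges)
counts_np = pd.Series(hist, index=edges[:-1])
```

Both results are plain Series, so they can be combined with other pandas objects directly. Note that np.histogram treats the last bin as closed on both ends, while pd.cut with right=False leaves it open on the right.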