In pandas 0.20.1, with the interval type, is it possible to get the left, right, or midpoint values of the intervals in a Series?
For example:
Create an interval datatype column, and perform some aggregation calculations over these intervals:
df_Stats = df.groupby(['month',pd.cut(df['Distances'], np.arange(0, 135,1))]).agg(aggregations)
This returns df_Stats with a column of interval dtype: df['Distances']
Now I want to attach the left endpoint of each interval to the result of these aggregations using a Series-level attribute:
df['LeftEnd'] = df['Distances'].left
This does not work on the whole Series. However, I can run it element-wise:
df.loc[0]['LeftEnd'] = df.loc[0]['Distances'].left
This works. Thoughts?
So pd.cut() actually creates a CategoricalIndex, with an IntervalIndex as the categories.
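A quick way to check this for yourself is a minimal sketch like the one below. It assumes a reasonably recent pandas, and the Series values are made up for illustration. On a plain Series, pd.cut() gives back a categorical Series, and the categories behind it are an IntervalIndex:

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
distances = pd.Series([0, 1, 2, 3])

# Bin into 2 equal-width intervals; the result is a categorical Series.
binned = pd.cut(distances, 2)

# The categories of that categorical are an IntervalIndex,
# which is where .left, .right and .mid live.
categories = binned.cat.categories
print(binned.dtype)
print(type(categories))
print(categories.left)
```

That is why .left is not available on the Series itself: the Series is categorical, and the interval attributes sit one level down on the categories.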
In [13]: df = pd.DataFrame({'month': [1, 1, 2, 2], 'distances': range(4), 'value': range(4)})
In [14]: df
Out[14]:
   distances  month  value
0          0      1      0
1          1      1      1
2          2      2      2
3          3      2      3
In [15]: result = df.groupby(['month', pd.cut(df.distances, 2)]).value.mean()
In [16]: result
Out[16]:
month  distances
1      (-0.003, 1.5]    0.5
2      (1.5, 3.0]       2.5
Name: value, dtype: float64
You can simply coerce them to an IntervalIndex (this also works if they are a column), then access the left, right, and mid attributes.
In [17]: pd.IntervalIndex(result.index.get_level_values('distances')).left
Out[17]: Float64Index([-0.003, 1.5], dtype='float64')
In [18]: pd.IntervalIndex(result.index.get_level_values('distances')).right
Out[18]: Float64Index([1.5, 3.0], dtype='float64')
In [19]: pd.IntervalIndex(result.index.get_level_values('distances')).mid
Out[19]: Float64Index([0.7485, 2.25], dtype='float64')
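For the column case from the question, here is a rough sketch of the same idea. It assumes a recent pandas, and the observed=True argument and the new column names left, right and mid are my own choices, not something from the original session. The grouped result is turned back into columns with reset_index(), and the interval column is coerced to an IntervalIndex:

```python
import pandas as pd

df = pd.DataFrame({'month': [1, 1, 2, 2], 'distances': range(4), 'value': range(4)})

# Same grouping as above. observed=True (an assumption for newer pandas
# versions) keeps only the month/bin combinations that actually occur.
result = (
    df.groupby(['month', pd.cut(df['distances'], 2)], observed=True)['value']
      .mean()
      .reset_index()
)

# Coerce the categorical interval column to an IntervalIndex,
# then assign its endpoint and midpoint arrays as ordinary columns.
intervals = pd.IntervalIndex(result['distances'])
result['left'] = intervals.left
result['right'] = intervals.right
result['mid'] = intervals.mid

print(result)
```

Assigning an Index to a column is positional, so the new columns line up row by row with the intervals they came from.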