pandas daily average, pandas.resample

OGMGIC picture OGMGIC · Sep 20, 2016 · Viewed 9.6k times · Source

I have a csv file similar to this

Date,Temp1,Temp2

23-Oct-09 01:00:00,21.1,22.3

23-Oct-09 04:00:00,22.3,23.8

23-Oct-09 07:00:00,21.4,21.3

23-Oct-09 10:00:00,21.5,21.6

23-Oct-09 13:00:00,22.3,23.8

23-Oct-09 16:00:00,21.4,21.3

23-Oct-09 19:00:00,21.1,22.3

23-Oct-09 22:00:00,21.4,21.3

24-Oct-09 01:00:00,22.3,23.8

24-Oct-09 04:00:00,22.3,23.8

24-Oct-09 07:00:00,21.1,22.3

24-Oct-09 10:00:00,22.3,23.8

24-Oct-09 13:00:00,21.1,22.3

24-Oct-09 16:00:00,22.3,23.8

24-Oct-09 19:00:00,21.1,22.3

24-Oct-09 22:00:00,22.3,23.8

I have read the data with:

df=pd.read_csv('data.csv', index_col=0)

and converted the index to date time

df.index=pd.to_datetime(df.index)

Now I want to take the mean of each daily temperature, I have been trying to use pd.resample as below, but have been receiving errors. I've read the pandas.resample docs and numerous examples on here and am still at a loss...

df_avg = df.resample('D', how = 'mean')

DataError: No numeric types to aggregate

I would like df_avg to be a dataframe with a datetime index and the two 2 columns. I am using pandas 0.17.1 and python 3.5.2, any help greatly appreciated!

Answer

jezrael picture jezrael · Sep 20, 2016

You need convert string columns to float first:

#add parameter parse_dates for convert to datetime first column
df=pd.read_csv('data.csv', index_col=0, parse_dates=[0])

df['Temp1'] = df.Temp1.astype(float)
df['Temp2'] = df.Temp2.astype(float)

df_avg = df.resample('D').mean()

If astype return error, problem is there are some non numeric values. So you need use to_numeric with errors='coerce' - then all 'problematic' values are converted to NaN:

df['Temp1'] = pd.to_numeric(df.Temp1, errors='coerce')
df['Temp2'] = pd.to_numeric(df.Temp2, errors='coerce')

You can also check all rows with problematic values with boolean indexing:

print df[pd.to_numeric(df.Temp1, errors='coerce').isnull()]
print df[pd.to_numeric(df.Temp2, errors='coerce').isnull()]