I have a pandas DataFrame consisting of some sensor readings taken over time like this:
diode1 diode2 diode3 diode4
Time
0.530 7 0 10 16
1.218 17 7 14 19
1.895 13 8 16 17
2.570 8 2 16 17
3.240 14 8 17 19
3.910 13 6 17 18
4.594 13 5 16 19
5.265 9 0 12 16
5.948 12 3 16 17
6.632 10 2 15 17
I have written code to add another row with the means of each column:
# List of the averages for the test.
averages = [df[key].describe()['mean'] for key in df]
indexes = df.index.tolist()
indexes.append('mean')
df.reindex(indexes)
# Adding the mean row to the bottom of the DataFrame
i = 0
for key in df:
df.set_value('mean', key, averages[i])
i += 1
This gives me the result I want, which is a DataFrame like this:
diode1 diode2 diode3 diode4
Time
0.53 7.0 0.0 10.0 16.0
1.218 17.0 7.0 14.0 19.0
1.895 13.0 8.0 16.0 17.0
2.57 8.0 2.0 16.0 17.0
3.24 14.0 8.0 17.0 19.0
3.91 13.0 6.0 17.0 18.0
4.594 13.0 5.0 16.0 19.0
5.265 9.0 0.0 12.0 16.0
5.948 12.0 3.0 16.0 17.0
6.632 10.0 2.0 15.0 17.0
mean 11.6 4.1 14.9 17.5
However, I am sure that this is not the most efficient way of adding the row. I have tried using append with the means saved as a pandas Series but ended up with something like this:
diode1 diode2 diode3 diode4 mean
0 7.0 0.0 10.0 14.0 NaN
1 9.0 0.0 10.0 15.0 NaN
2 10.0 5.0 14.0 20.0 NaN
3 6.0 0.0 7.0 14.0 NaN
4 7.0 0.0 10.0 15.0 NaN
5 7.0 0.0 8.0 14.0 NaN
6 7.0 0.0 11.0 14.0 NaN
7 7.0 0.0 2.0 11.0 NaN
8 2.0 0.0 4.0 12.0 NaN
9 4.0 0.0 0.0 6.0 NaN
10 NaN NaN NaN NaN [11.6, 4.1, 14.9, 17.5]
I was wondering if there was a more efficient means of adding a row with the index 'mean' and the averages of each column to the bottom of a pandas DataFrame.
Use loc
for setting with enlargement:
df.loc['mean'] = df.mean()
The resulting output:
diode1 diode2 diode3 diode4
Time
0.53 7.0 0.0 10.0 16.0
1.218 17.0 7.0 14.0 19.0
1.895 13.0 8.0 16.0 17.0
2.57 8.0 2.0 16.0 17.0
3.24 14.0 8.0 17.0 19.0
3.91 13.0 6.0 17.0 18.0
4.594 13.0 5.0 16.0 19.0
5.265 9.0 0.0 12.0 16.0
5.948 12.0 3.0 16.0 17.0
6.632 10.0 2.0 15.0 17.0
mean 11.6 4.1 14.9 17.5