I have a text file of temperature data that looks like this:
3438012868.0 0.0 21.7 22.6 22.5 22.5 21.2
3438012875.0 0.0 21.6 22.6 22.5 22.5 21.2
3438012881.9 0.0 21.7 22.5 22.5 22.5 21.2
3438012888.9 0.0 21.6 22.6 22.5 22.5 21.2
3438012895.8 0.0 21.6 22.5 22.6 22.5 21.3
3438012902.8 0.0 21.6 22.5 22.5 22.5 21.2
3438012909.7 0.0 21.6 22.5 22.5 22.5 21.2
3438012916.6 0.0 21.6 22.5 22.5 22.5 21.2
3438012923.6 0.0 21.6 22.6 22.5 22.5 21.2
3438012930.5 0.0 21.6 22.5 22.5 22.5 21.2
3438012937.5 0.0 21.7 22.5 22.5 22.5 21.2
3438012944.5 0.0 21.6 22.5 22.5 22.5 21.3
3438012951.4 0.0 21.6 22.5 22.5 22.5 21.2
3438012958.4 0.0 21.6 22.5 22.5 22.5 21.3
3438012965.3 0.0 21.6 22.6 22.5 22.5 21.2
3438012972.3 0.0 21.6 22.5 22.5 22.5 21.3
3438012979.2 0.0 21.6 22.6 22.5 22.5 21.2
3438012986.1 0.0 21.6 22.5 22.5 22.5 21.3
3438012993.1 0.0 21.6 22.5 22.6 22.5 21.2
3438013000.0 0.0 21.6 0.0 22.5 22.5 21.3
3438013006.9 0.0 21.6 22.6 22.5 22.5 21.2
3438013014.4 0.0 21.6 22.5 22.5 22.5 21.3
3438013021.9 0.0 21.6 22.5 22.5 22.5 21.3
3438013029.9 0.0 21.6 22.5 22.5 22.5 21.2
3438013036.9 0.0 21.6 22.6 22.5 22.5 21.2
3438013044.6 0.0 21.6 22.5 22.5 22.5 21.2
These are only the first few lines; the full file is much longer. The first column is a timestamp and the next six columns are temperature readings. I need to write a loop that averages the six readings in each row but ignores any reading of 0.0, because that just means the sensor wasn't turned on. Later in the file, the first temperature column does contain real readings. Is there a way to write an if statement, or use some other approach, to average only the non-zero numbers in a list? Right now, I have:
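from datetime import datetime, timedelta
import numpy as np

time = []
t1 = []
t2 = []
t3 = []
t4 = []
t5 = []
t6 = []
newdate = []

# Read the whole file and split it into records.
temps = open('file_path', 'r')
sepfile = temps.read().replace('\n', '').split('\r')
temps.close()

# Split each record into a timestamp and six temperature readings.
for plotpair in sepfile:
    data = plotpair.split('\t')
    time.append(float(data[0]))
    t1.append(float(data[1]))
    t2.append(float(data[2]))
    t3.append(float(data[3]))
    t4.append(float(data[4]))
    t5.append(float(data[5]))
    t6.append(float(data[6]))

# Convert each timestamp (in seconds) to a datetime by adding it to the reference date.
for data_seconds in time:
    date = datetime(1904, 1, 1, 5, 26, 2)
    delta = timedelta(seconds=data_seconds)
    newdate.append(date + delta)

# Average sensors 2 to 6 for each row; t1 is left out and zeros are still counted.
temperatures = np.array([t2, t3, t4, t5, t6]).mean(0).tolist()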
which only averages the last five readings. I'm hoping to find a better method that ignores the 0.0s and includes the first temperature column when it is non-zero.
Prior questions show you have NumPy installed. Using NumPy, you could set the zeros to NaN and then call np.nanmean to take the mean, ignoring NaNs:
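import numpy as np
# Load the whitespace-separated file ('data' is the file name) into a 2-D array.
data = np.genfromtxt('data')
# Mark the 0.0 readings as missing.
data[data == 0] = np.nan
# Average columns 1 onward (the temperatures) for each row, skipping NaNs.
means = np.nanmean(data[:, 1:], axis=1)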
yields
array([ 22.1 , 22.08 , 22.08 , 22.08 , 22.1 , 22.06 , 22.06 ,
22.06 , 22.08 , 22.06 , 22.08 , 22.08 , 22.06 , 22.08 ,
22.08 , 22.08 , 22.08 , 22.08 , 22.08 , 21.975, 22.08 ,
22.08 , 22.08 , 22.06 , 22.08 , 22.06 ])
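To check the approach without the real file, here is a minimal self-contained sketch that feeds three rows copied from the question through the same NaN-based steps, and also shows a plain-Python equivalent of the "if statement" idea the question asks about. The names sample, temps, numpy_means, plain_means and mean_nonzero are made up for this illustration, and the in-memory StringIO only stands in for the actual data file.

```python
import io
import numpy as np

# Three rows copied from the question; the middle one has a 0.0 reading.
sample = io.StringIO(
    "3438012993.1 0.0 21.6 22.5 22.6 22.5 21.2\n"
    "3438013000.0 0.0 21.6 0.0 22.5 22.5 21.3\n"
    "3438013006.9 0.0 21.6 22.6 22.5 22.5 21.2\n"
)

data = np.genfromtxt(sample)

# NumPy version: copy the temperature columns so the timestamps stay untouched,
# mark zeros as NaN, then average each row while skipping NaNs.
temps = data[:, 1:].copy()
temps[temps == 0] = np.nan
numpy_means = np.nanmean(temps, axis=1)


def mean_nonzero(values):
    """Average the non-zero entries; return None if every entry is zero."""
    kept = [v for v in values if v != 0.0]
    return sum(kept) / len(kept) if kept else None


# Plain-Python version: filter out zeros row by row, no NaNs involved.
plain_means = [mean_nonzero(row[1:]) for row in data.tolist()]

print(numpy_means)
print(plain_means)
```

Both versions should agree on these rows. One difference to keep in mind: for a row where every sensor reads 0.0, np.nanmean returns NaN and emits a RuntimeWarning, whereas the hypothetical mean_nonzero helper returns None, so pick whichever is easier to handle downstream.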