Is there a way to plot an average in gnuplot?

kirill_igum · Jul 20, 2012

Suppose I have a file, file.dat, with two columns of data. I would normally plot it with

plot "file.dat" u 1:2

I want to average over the 10 (for example) preceding points and the 10 following points, and plot the result on the same graph. I can easily do that using some external script, where I make another column:

for(i=-10;i<=10;++i)
  $3[j] += $2[j-i]

However, I'd like to know how to do it in gnuplot itself. My next step would be to do a Gaussian averaging.

Answer

andyras · Jul 21, 2012

This is, perhaps surprisingly, not built into gnuplot. There is no good way to manipulate individual data points, or ranges of data points, in gnuplot, because it processes the data as a stream.

One great thing about gnuplot is how easy it is to call external scripts and tools. If you want to use an external script to process the data from within gnuplot, you can do it like this:

plot "<script.py data.dat" u 1:2
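With the "<command" form, gnuplot runs the command and reads whatever it writes to standard output as if it were a data file. A minimal sketch of that contract, assuming whitespace-separated columns; the function name emit_columns and its parameters are made up for illustration:

```python
import sys


def emit_columns(lines, x_col=1, y_col=2, out=sys.stdout):
    """Write the chosen x and y columns (1-based) as 'x y' lines to out."""
    for line in lines:
        fields = line.split()
        # skip blank lines and '#' comment lines
        if not fields or fields[0].startswith("#"):
            continue
        out.write("{} {}\n".format(fields[x_col - 1], fields[y_col - 1]))


if __name__ == "__main__" and len(sys.argv) == 2:
    # hypothetical invocation: python3 thisfile.py data.dat
    with open(sys.argv[1]) as f:
        emit_columns(f)
```

Anything the script prints to standard output is parsed as data, so diagnostics are better sent to standard error.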

As an example, you could use the Python script below. It's sort of overkill, but you can set the parameter values either by hard-coding them in the script or on the command line.

#!/usr/bin/env python3

import sys

USAGE = """
This script takes one mandatory argument, the name of a file containing
data to be plotted.  It takes up to four optional arguments as follows:
 1) the number of points before a data point to add into average.
 2) the number of points after a data point to add into average.
 3) the column number of y data (first column is column 1)
 4) the column number of x data (first column is column 1)
"""

# show usage on stderr (stdout is read by gnuplot) if the file name is
# missing or there are too many arguments
if len(sys.argv) < 2 or len(sys.argv) > 6:
    sys.exit(USAGE)

# set variable defaults
box_back = 10   # number of points before current point to add into average
box_front = 10  # number of points after current point to add into average
y_col = 2       # column number of y data (first column is column 1)
x_col = 1       # column number of x data (first column is column 1)

# assign variables from command line arguments
input_file_name = sys.argv[1]
if len(sys.argv) > 2:
    box_back = int(sys.argv[2])
if len(sys.argv) > 3:
    box_front = int(sys.argv[3])
if len(sys.argv) > 4:
    y_col = int(sys.argv[4])
if len(sys.argv) > 5:
    x_col = int(sys.argv[5])

# make list from lines in file, skipping blank lines and '#' comments
with open(input_file_name) as f:
    lines = [l for l in f if l.strip() and not l.lstrip().startswith("#")]

# this is the number of points encompassed in the boxcar average
num_points = box_back + box_front + 1

# make sure boxcar average will work
if num_points > len(lines):
    sys.exit("ERROR: too many points for boxcar averaging.")

# parse the x and y columns once instead of re-splitting each line
xs = [float(line.split()[x_col - 1]) for line in lines]
ys = [float(line.split()[y_col - 1]) for line in lines]

# add up values for first boxcar average; this is the running sum
sum_vals = sum(ys[:num_points])
print(xs[box_back], sum_vals / num_points)

# each subsequent average differs only in the first and last points from the
# previous average.
for i in range(box_back + 1, len(lines) - box_front):
    sum_vals += ys[i + box_front]
    sum_vals -= ys[i - box_back - 1]
    print(xs[i], sum_vals / num_points)
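
The question mentions Gaussian averaging as a next step. One way the same windowed idea could be adapted is to replace the equal weights with normalized Gaussian weights. This is only a sketch under stated assumptions: the name gaussian_smooth and the parameters sigma (width of the Gaussian, in units of points) and half_width (points on each side) are illustrative choices, and, like the boxcar script, it only outputs points that have a full window.

```python
import math


def gaussian_smooth(xs, ys, sigma=3.0, half_width=10):
    """Return (x, weighted average of y) pairs for points with a full window."""
    # Gaussian weights for offsets -half_width..half_width, normalized to sum to 1
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-half_width, half_width + 1)]
    total = sum(weights)
    weights = [w / total for w in weights]

    result = []
    for i in range(half_width, len(ys) - half_width):
        window = ys[i - half_width:i + half_width + 1]
        result.append((xs[i], sum(w * y for w, y in zip(weights, window))))
    return result


if __name__ == "__main__":
    # made-up demo data: a sawtooth sampled at integer x values
    demo_x = list(range(40))
    demo_y = [float(x % 5) for x in demo_x]
    for x, y in gaussian_smooth(demo_x, demo_y):
        print(x, y)
```

Unlike the boxcar case, the weighted sum cannot be updated with a simple running total, so each window is recomputed; for typical plot-sized data that cost is unlikely to matter.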