I'm trying to plot a histogram whose bins are normalized by the number of elements in the bin.
I'm using the following
binwidth=5
bin(x,width)=width*floor(x/width) + binwidth/2.0
plot 'file' using (bin($2, binwidth)):($4) smooth freq with boxes
to get a basic histogram, but I want the value of each bin to be divided by the size of the bin. How can I go about this in gnuplot, or using external tools if necessary?
In gnuplot 4.4, functions take on a different property, in that they can execute multiple successive commands, and then return a value (see gnuplot tricks) This means that you can actually calculate the number of points, n, within the gnuplot file without having to know it in advance. This code runs for a file, "out.dat", containing one column: a list of n samples from a normal distribution:
binwidth = 0.1
set boxwidth binwidth
sum = 0
s(x) = ((sum=sum+1), 0)
bin(x, width) = width*floor(x/width) + binwidth/2.0
plot "out.dat" u ($1):(s($1))
plot "out.dat" u (bin($1, binwidth)):(1.0/(binwidth*sum)) smooth freq w boxes
The first plot statement reads through the datafile and increments sum once for each point, plotting a zero.
The second plot statement actually uses the value of sum to normalise the histogram.