Find the "peak" of a set of data

Robert Sköld · Aug 18, 2010 · Viewed 12.8k times

I have a set of data for which I'd like to find an average peak. I've done some testing in Numbers.app to see what I'm after: if I make a chart of the dataset, there is a feature called "polynomial trendline" that draws a curve through the data, and the peak of that curve looks exactly like the point/value I'm after.

So how could I programmatically calculate that curve and find that tangent on the curve?

I've been looking around on Wikipedia and found topics like "Normal distribution" and "Polynomial regression", which seem very much related, but I've always found it hard to follow the equations on Wikipedia, so I'm hoping someone here could give me a programmatic example.

Here's a couple of charts to illustrate what I'm after. The green dots are the data points and the blue line is the "polynomial trendline" (of order 6). The "peak" of that trendline is what I'm after.

[Figure: example with an even dataset]
[Figure: example with an uneven dataset]

Updated question:

After some answers I realize my question needs to be rephrased: the problem is not really how to find the peak of the curve, but how to generate that blue curve from the green points so I can find where in the dataset the "weight" lies. The goal is to get a sort of 'average maximum'.
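For reference, a minimal sketch of how a curve like that could be generated: a least-squares polynomial fit, followed by a search for the highest point of the fitted curve. This is an assumption about what a "polynomial trendline" does, not a description of how Numbers.app computes it. The names `fit_polynomial` and `find_peak` and the sample data are made up for illustration, and the code assumes more distinct x values than the polynomial order.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns a function that evaluates the curve."""
    lo, hi = min(xs), max(xs)
    mid, half = (lo + hi) / 2.0, (hi - lo) / 2.0  # assumes the xs are not all equal

    def scale(x):
        # Map x into [-1, 1] so high powers stay numerically well behaved.
        return (x - mid) / half

    n = degree + 1
    ts = [scale(x) for x in xs]
    # Normal equations (A^T A) c = A^T y, where A[i][j] = t_i ** j.
    rows = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    rhs = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(n)]
    m = [row + [b] for row, b in zip(rows, rhs)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]

    # Back substitution gives the coefficients, lowest power first.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (m[r][n] - known) / m[r][r]

    def curve(x):
        t = scale(x)
        result = 0.0
        for c in reversed(coeffs):  # Horner's rule
            result = result * t + c
        return result

    return curve


def find_peak(curve, lo, hi, samples=1000):
    """Return (x, y) at the highest point of the curve on an evenly spaced grid."""
    grid = (lo + (hi - lo) * i / samples for i in range(samples + 1))
    best_x = max(grid, key=curve)
    return best_x, curve(best_x)


if __name__ == "__main__":
    # Made-up sample data, only to show the calls.
    xs = list(range(12))
    ys = [1, 2, 4, 7, 9, 12, 13, 11, 8, 5, 3, 1]
    trend = fit_polynomial(xs, ys, degree=6)
    print(find_peak(trend, min(xs), max(xs)))
```

The peak is searched on a dense grid rather than by solving for a zero derivative, which keeps the sketch short and avoids picking a local minimum or a spurious wiggle at the edges by mistake.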

I guess another question would be "what is this particular problem actually called?" ;)

Answer

nico picture nico · Aug 18, 2010

Although the data looks like one, you're not necessarily after a normal distribution.

The topic of distribution fitting is quite complex and, unless you have some clear a priori assumptions about what your data distribution is, I would not venture there. If you do have assumptions about the type of distribution, have a look at least squares or maximum likelihood estimation methods.

However, I would suggest you rather use a Bézier spline or LOESS to "smooth" your data and then just find the maximum of the computed curve.
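To make the smoothing idea concrete, here is a rough sketch of a LOESS-style smoother: at each evaluation point it fits a straight line to the nearest data points using tricube weights, and the maximum is then read off the smoothed curve. It is a simplified illustration rather than a full LOESS implementation (no robustness iterations). The names `loess_at` and `smoothed_peak` and the `frac` parameter are assumptions of this sketch, and it expects at least three distinct x values.

```python
def loess_at(xs, ys, x0, frac=0.5):
    """Smoothed value at x0 from a locally weighted straight-line fit."""
    # Bandwidth: distance from x0 to the k-th nearest data point (k >= 3).
    # Assumes at least three distinct x values, so the bandwidth is positive.
    k = min(len(xs), max(3, round(frac * len(xs))))
    h = sorted(abs(x - x0) for x in xs)[k - 1]

    sw = swx = swy = swxx = swxy = 0.0
    for x, y in zip(xs, ys):
        dx = x - x0  # centre on x0 so the fitted intercept is the estimate
        u = abs(dx) / h
        w = (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0  # tricube weight
        sw += w
        swx += w * dx
        swy += w * y
        swxx += w * dx * dx
        swxy += w * dx * y

    denom = sw * swxx - swx ** 2
    if denom <= 1e-12 * max(1.0, sw * swxx):
        # Too few weighted points for a line: fall back to the weighted mean.
        return swy / sw
    slope = (sw * swxy - swx * swy) / denom
    return (swy - slope * swx) / sw


def smoothed_peak(xs, ys, frac=0.5, samples=500):
    """Return (x, y) at the maximum of the smoothed curve on an even grid."""
    lo, hi = min(xs), max(xs)
    grid = [lo + (hi - lo) * i / samples for i in range(samples + 1)]
    smooth = [loess_at(xs, ys, g, frac) for g in grid]
    best = max(range(len(grid)), key=smooth.__getitem__)
    return grid[best], smooth[best]


if __name__ == "__main__":
    # Made-up sample data, only to show the call.
    xs = list(range(12))
    ys = [1, 2, 4, 7, 9, 12, 13, 11, 8, 5, 3, 1]
    print(smoothed_peak(xs, ys, frac=0.5))
```

A larger `frac` gives a smoother curve and a flatter, lower peak; a smaller one follows the individual points more closely, so the value is worth tuning against a plot of the data.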

I doubt that an approach using the derivative would work here.