I have data that looks Gaussian when plotted as a histogram. I want to plot a Gaussian curve on top of the histogram to see how good the data is. I am using pyplot from matplotlib. I do NOT want to normalize the histogram: I can do the normed fit, but I am looking for an unnormalized fit. Does anyone here know how to do it?
Thanks! Abhinav Kumar
As an example:
import pylab as py
import numpy as np
from scipy import optimize

# Generate a sample of normally distributed data and histogram it (raw counts)
y = np.random.standard_normal(10000)
data = py.hist(y, bins=100)

# Equation for an unnormalized Gaussian: amplitude a, centre b, width c
def f(x, a, b, c):
    return a * py.exp(-(x - b)**2.0 / (2 * c**2))

# Generate data from bins as a set of points (bin centres and bin counts)
x = [0.5 * (data[1][i] + data[1][i+1]) for i in range(len(data[1])-1)]
y = data[0]

# Fit the Gaussian to the bin counts
popt, pcov = optimize.curve_fit(f, x, y)

# Draw the fitted curve on top of the histogram
x_fit = py.linspace(x[0], x[-1], 100)
y_fit = f(x_fit, *popt)
py.plot(x_fit, y_fit, lw=4, color="r")
py.show()
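This fits a Gaussian curve to the histogram counts. You can use pcov, the
estimated covariance matrix of the fitted parameters, to put a number on how
good the fit is: the square roots of its diagonal are the one-sigma
uncertainties on a, b and c.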
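As a rough sketch of getting numbers out of pcov, something like the following should work. The names gaussian, perr and p0 and the fixed random seed are my own choices for illustration, and I use np.histogram instead of py.hist so the snippet runs without opening a plot window.

```python
import numpy as np
from scipy import optimize

def gaussian(x, a, b, c):
    """Unnormalized Gaussian: amplitude a, centre b, width c."""
    return a * np.exp(-(x - b)**2 / (2 * c**2))

# Fixed seed is an assumption, only here to make the sketch reproducible
rng = np.random.default_rng(0)
sample = rng.standard_normal(10000)

# Raw (unnormalized) bin counts and the bin centres
counts, edges = np.histogram(sample, bins=100)
centres = 0.5 * (edges[:-1] + edges[1:])

# Starting guess taken from the data, to help the fit converge
p0 = [counts.max(), sample.mean(), sample.std()]
popt, pcov = optimize.curve_fit(gaussian, centres, counts, p0=p0)

# One-sigma uncertainties are the square roots of the diagonal of pcov
perr = np.sqrt(np.diag(pcov))
for name, value, err in zip("abc", popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

Small uncertainties relative to the fitted values suggest the parameters are well constrained, but they do not by themselves tell you the Gaussian is the right model.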
A better way to determine how well your data follows a Gaussian, or any other distribution, is the Pearson chi-squared test. It takes some practice to understand, but it is a very powerful tool.
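For what it is worth, here is a minimal sketch of such a test using scipy.stats.chisquare. The binning (24 equal bins between -3 and +3 sigma plus two open-ended tail bins), the fixed seed and all variable names are assumptions of mine. Because the mean and width are estimated from the same data, the p-value is only approximate.

```python
import numpy as np
from scipy import stats

# Fixed seed is an assumption, only here to make the sketch reproducible
rng = np.random.default_rng(0)
sample = rng.standard_normal(10000)

# Estimate the Gaussian parameters from the data
mu, sigma = sample.mean(), sample.std(ddof=1)

# Interior bin edges; the two outermost bins are open-ended tails,
# so observed and expected counts both sum to the sample size
inner_edges = mu + sigma * np.linspace(-3, 3, 25)

# Observed counts per bin (26 bins in total, including both tails)
idx = np.searchsorted(inner_edges, sample)
observed = np.bincount(idx, minlength=len(inner_edges) + 1)

# Expected counts per bin from the Gaussian CDF
cdf = stats.norm.cdf(inner_edges, loc=mu, scale=sigma)
probs = np.diff(np.concatenate(([0.0], cdf, [1.0])))
expected = probs * sample.size

# ddof=2 accounts for the two parameters estimated from the data
chi2, p_value = stats.chisquare(observed, expected, ddof=2)
print(f"chi2 = {chi2:.2f}, p-value = {p_value:.3f}")
```

A very small p-value would suggest the data is not well described by a Gaussian. The usual rule of thumb is to keep the expected count in every bin above about 5, which is why the sparse tails are merged into single bins here.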