I want to calculate the multivariate Gaussian density function for a data set I have in Python. My dataset has 21 variables and there are 75 data points.
I have calculated the covariance matrix (cov) for this which is a 21*21 array, and the mean array, m, which has shape (21,). The other input I need to use this scipy function is "Quantiles(array-like), with the last axis of x denoting the components".
I don't really understand what "quantiles" refers to here.
I wrote my quantiles input for the function as quantiles = np.array([0.0, 0.01, 0.05, 0.1, 1-0.10, 1-0.05, 1-0.01, 1.0]), but I keep getting an error when I then compute scipy.stats.multivariate_normal.pdf(quantiles, m, cov).
The error is: ValueError: operands could not be broadcast together with shapes (1,8) (21,)
Could anyone help??
I think the documentation is asking, in a rather incomprehensible way, for an array x whose last axis holds the components of the actual random vectors, i.e. the points at which you want the density evaluated. The following code works:
import numpy as np
from scipy.stats import multivariate_normal
mean = np.array([0.5, 0.1, 0.3])
cov = np.array([[0.1, 0.0, 0.0], [0.0, 1.5, 0.0], [0.0, 0.0, 0.9]])
x = np.random.uniform(size=(100, 3))
y = multivariate_normal.pdf(x, mean=mean, cov=cov)
print(y)
So build your data matrix x such that each row (the first dimension) contains one data vector and the columns (the second dimension) are your 21 separate variables. In other words, you need to put your data into a matrix of shape (75, 21). Be careful that the entries of the mean vector and the covariance matrix correspond to the correct variables, in the same order as the columns of x.
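For your shapes specifically, a minimal sketch might look like the following. The data here is randomly generated as a stand-in for your real 75 observations, and the names data, m, cov, densities and log_densities are just placeholders; I am also assuming you estimate the mean and covariance from the same array. It also shows why your original call failed: a 1-D array of 8 numbers is read as a single point with 8 components, which cannot be matched against a mean of length 21.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Stand-in for the real data set: 75 observations (rows) of 21 variables (columns).
rng = np.random.default_rng(0)
data = rng.normal(size=(75, 21))

# Mean vector of shape (21,) and covariance matrix of shape (21, 21).
# rowvar=False tells np.cov that the columns, not the rows, are the variables.
m = data.mean(axis=0)
cov = np.cov(data, rowvar=False)

# One density value per row of data, so the result has shape (75,).
densities = multivariate_normal.pdf(data, mean=m, cov=cov)
print(densities.shape)

# In 21 dimensions the densities can be very small numbers, so the
# log-density is often more convenient to work with.
log_densities = multivariate_normal.logpdf(data, mean=m, cov=cov)

# The original call fails because the last axis has length 8, not 21.
quantiles = np.array([0.0, 0.01, 0.05, 0.1, 1 - 0.10, 1 - 0.05, 1 - 0.01, 1.0])
try:
    multivariate_normal.pdf(quantiles, mean=m, cov=cov)
except ValueError as err:
    print("ValueError:", err)
```

If you only want the density at a single point, pass one vector of length 21 instead of the whole (75, 21) matrix.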