How to interpret the values returned by numpy.correlate and numpy.corrcoef?

Question 1

How to interpret the values returned by numpy.correlate and numpy.corrcoef?

python numpy scipy correlation

khan · Nov 18, 2012 · Viewed 40.2k times · Source

Answer

numpy.correlate simply returns the cross-correlation of two vectors.

If you need background on cross-correlation, start with http://en.wikipedia.org/wiki/Cross-correlation.
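For a concrete feel of what that means, here is a minimal sketch that computes the 'full' cross-correlation of two short vectors by hand and compares it with numpy.correlate. The helper name cross_correlate_full and the sample values are illustrative assumptions, not part of NumPy, and the sketch assumes real-valued inputs (complex inputs would also need a conjugate on the second argument).

```python
import numpy as np

def cross_correlate_full(a, v):
    """Hypothetical helper: sum of a[n + k] * v[n] for every lag k (real inputs only)."""
    a = np.asarray(a, dtype=float)
    v = np.asarray(v, dtype=float)
    out = []
    # Lags run from -(len(v) - 1) to len(a) - 1, matching mode='full'.
    for k in range(-(len(v) - 1), len(a)):
        total = 0.0
        for n in range(len(v)):
            # Samples that fall outside `a` are treated as zero.
            if 0 <= n + k < len(a):
                total += a[n + k] * v[n]
        out.append(total)
    return np.array(out)

a = [1, 2, 3]
v = [0, 1, 0.5]

by_hand = cross_correlate_full(a, v)
by_numpy = np.correlate(a, v, mode='full')
print(by_hand)
print(by_numpy)
```

Each output element is just a sliding dot product: shift one vector against the other, multiply the overlapping samples, and add them up. For these inputs both results contain the values 0.5, 2, 3.5, 3, 0.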

A good example might be seen by looking at the autocorrelation function (a vector cross-correlated with itself):

import numpy as np

# create a vector of Gaussian noise
vector = np.random.normal(0, 1, size=1000)

# insert a periodic signal: add 10 to every 50th sample
vector[::50] += 10

# cross-correlate the vector with itself over all possible lags
output = np.correlate(vector, vector, mode='full')

This returns a comb-like (Shah) function whose maximum occurs where the two copies of the data overlap completely. Because this is an autocorrelation, there is no "lag" between the two input signals, so the maximum sits at zero lag, which is index vector.size-1 of the 'full' output.
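To check where the peak lands, it can help to build the lag axis explicitly. The sketch below repeats the example with a seeded generator so the run is reproducible; the seed and the names lags and peak_index are assumptions added for illustration.

```python
import numpy as np

# Seeded generator (an assumption, used only to make the run repeatable).
rng = np.random.default_rng(0)
vector = rng.normal(0, 1, size=1000)
vector[::50] += 10  # periodic spike every 50 samples

output = np.correlate(vector, vector, mode='full')

# 'full' mode returns 2 * N - 1 values, one per lag from -(N - 1) to N - 1.
lags = np.arange(-(vector.size - 1), vector.size)

peak_index = int(np.argmax(output))
print(output.size)       # number of lags
print(peak_index)        # position of the maximum in the output array
print(lags[peak_index])  # lag at which the maximum occurs
```

The maximum falls at index vector.size-1, which corresponds to a lag of zero, and its value equals np.dot(vector, vector). The smaller teeth of the comb appear at lags that are multiples of 50, where the inserted spikes line up with each other again.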

If you only want the correlation where the two inputs overlap completely, use mode='valid', which is also the default. For two inputs of equal length this returns a single value.
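That single value is an unnormalized sum of products, so its size depends on the scale and length of the data. As a hedged sketch of how it relates to numpy.corrcoef, which the title question also asks about: subtracting each array's mean and dividing by the norms turns the 'valid' result into the Pearson correlation coefficient that corrcoef reports off the diagonal. The arrays a and b below are made-up test data, built so that b roughly follows a.

```python
import numpy as np

# Made-up data (an assumption for illustration): b tracks a, plus noise.
rng = np.random.default_rng(0)
a = rng.normal(0, 1, size=200)
b = 0.8 * a + rng.normal(0, 0.5, size=200)

# Equal-length inputs in 'valid' mode give one number: the dot product.
raw = np.correlate(a, b, mode='valid')

# Remove the means and divide by the norms to get a value in [-1, 1].
a0 = a - a.mean()
b0 = b - b.mean()
normalized = np.correlate(a0, b0, mode='valid')[0] / (
    np.linalg.norm(a0) * np.linalg.norm(b0)
)

# corrcoef returns a 2x2 matrix; entry [0, 1] is Pearson's r between a and b.
r = np.corrcoef(a, b)
print(raw)
print(normalized)
print(r)
```

In the corrcoef matrix the diagonal is always 1 (each array correlated with itself), and the two off-diagonal entries are equal. A value near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship.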

Question 2

I have two 1D arrays and I want to see their inter-relationships. What procedure should I use in numpy? I am using numpy.corrcoef(arrayA, arrayB) and numpy.correlate(arrayA, arrayB) and both are giving some results that I am not able to comprehend or understand.

Can somebody please shed light on how to understand and interpret those numerical results (preferably, using an example)?
