Calculate Chi-square with NA values

user2105555 · Nov 15, 2014 · Viewed 7.4k times

I want to perform a chi-squared test between two variables that contain missing data. How can I do this? I've looked this up several times across different sources, but none of the approaches I found worked.

Answer

LyzandeR · Nov 15, 2014

I assume that you want to test two categorical variables x and y for independence using Pearson's chi-squared test.

You can call the function as it is, i.e. chisq.test(x, y) or chisq.test(cont_table), where cont_table is the contingency table of x and y.
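As a minimal sketch of the two call forms, assuming made-up data (the values of x and y below are hypothetical and only chosen so that the expected counts are large enough to avoid a warning), both should give the same result:

```r
# Hypothetical example data: two categorical variables of equal length
x <- rep(c("a", "b"), times = c(20, 20))
y <- c(rep(c("u", "v"), times = c(14, 6)),
       rep(c("u", "v"), times = c(8, 12)))

# Form 1: pass the two vectors directly
res_vec <- chisq.test(x, y)

# Form 2: build the contingency table first, then pass it
cont_table <- table(x, y)
res_tab <- chisq.test(cont_table)

# Both forms test the same 2 x 2 table
print(cont_table)
print(res_vec)
print(res_tab)
```

Passing the table is handy when you want to inspect the counts before running the test.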

As the documentation of the function states (link1):

"If x is a matrix with at least two rows and columns, it is taken as a two-dimensional contingency table: the entries of x must be non-negative integers. Otherwise, x and y must be vectors or factors of the same length; cases with missing values are removed, the objects are coerced to factors, and the contingency table is computed from these."
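The behaviour described in the quote can be checked with a small sketch. The data are again hypothetical; the comparison uses complete.cases() to drop the incomplete pairs by hand and confirms that chisq.test(x, y) gives the same answer on its own:

```r
# Hypothetical data: 40 complete pairs plus two pairs with a missing value
x <- c(rep("a", 20), rep("b", 20), NA, "a")
y <- c(rep("u", 14), rep("v", 6), rep("u", 8), rep("v", 12), "u", NA)

# chisq.test() removes cases with missing values when given vectors
res_auto <- chisq.test(x, y)

# The same removal done explicitly
keep <- complete.cases(x, y)
res_manual <- chisq.test(x[keep], y[keep])

# Number of pairs actually used, and the two (identical) results
print(sum(keep))
print(res_auto)
print(res_manual)
```

Comparing sum(keep) with length(x) is a quick way to see how many observations were dropped before the test was run.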

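You cannot (and should not) run the test with the missing values left in. The statistic is computed from observed counts, so treating NA as data would lead to erroneous results. If you pass vectors, chisq.test drops the incomplete cases for you; if you build the contingency table yourself, make sure the missing cases are excluded from it first.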

Check these links for more info: link1, link2