Thursday, July 25, 2013

Chi-square Test [nonparametric test] -No24


The chi-square test is one of the representative nonparametric tests. It checks whether categorical variables are associated with each other.

To compare two discrete variables, the chi-square test measures the statistical difference between the observed counts and the counts we would expect.
It is also a statistical hypothesis test whose null hypothesis is that the two categorical variables are unrelated.

The chi-square test is quite easy to understand by running it on two different samples.

As you can see below, there are two tables of candidate preference by men and women, and I would like to compare them. Intuitively, we can guess that in case-2 there is no difference between men and women, while in case-1 their preferences differ considerably.




[Case-1]
Candidate-1 Candidate-2
man 35 8
woman 10 45

[Case-2]
Candidate-1 Candidate-2
man 10 35
woman 11 34


Let's do a test right away.

[ CASE1 Chi-test result ]

> chisq.test(zz, simulate.p.value = TRUE)

        Pearson's Chi-squared test with simulated p-value (based on 2000
        replicates)

data:  zz
X-squared = 38.8319, df = NA, p-value = 0.0004998



[ CASE2 Chi-test result ] 

> chisq.test(zz, simulate.p.value = TRUE)

        Pearson's Chi-squared test with simulated p-value (based on 2000
        replicates)

data:  zz
X-squared = 0.0621, df = NA, p-value = 1


As expected, the first test rejects the null hypothesis (p-value = 0.0004998), so we have strong evidence that men and women have different preferences.
The second test (p-value = 1) gives no evidence of a difference in preference by sex.
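
The transcripts above do not show how zz was built. Here is a minimal sketch, assuming zz was a 2x2 matrix of counts filled in from the tables. The names case1 and case2 are mine, not from the original session.

```r
# Assumption: the original `zz` was a 2x2 count matrix. `case1` and `case2`
# are illustrative names, not taken from the original session.
labels <- list(sex = c("man", "woman"),
               candidate = c("Candidate-1", "Candidate-2"))

case1 <- matrix(c(35,  8,
                  10, 45),
                nrow = 2, byrow = TRUE, dimnames = labels)
case2 <- matrix(c(10, 35,
                  11, 34),
                nrow = 2, byrow = TRUE, dimnames = labels)

# The simulated p-value can vary slightly between runs, so fix the seed.
set.seed(1)
res1 <- chisq.test(case1, simulate.p.value = TRUE)
res2 <- chisq.test(case2, simulate.p.value = TRUE)

print(res1)
print(res2)
```

Only the p-value depends on the simulation; the X-squared statistic is computed directly from the table. Note that without simulate.p.value = TRUE, chisq.test applies Yates' continuity correction to 2x2 tables by default, so it would report a different X-squared.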



* I would like to add how to get the X-squared value (X here stands for the Greek capital letter chi).
 Why don't we calculate case-1 ourselves, using the calculation attributed to Karl Pearson?

The test assumes that, if the null hypothesis is true, the observed frequencies should agree with the expected frequencies up to random variation.
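
Written out, Pearson's statistic looks roughly like this. The symbols are my own notation, not from the post: O_ij is the observed count in row i and column j, E_ij the expected count, R_i and C_j the row and column totals, and N the grand total.

```latex
\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{N}
```

The larger the gaps between observed and expected counts, the larger the statistic, and the stronger the evidence against the null hypothesis.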




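Let's calculate it. Each expected frequency is (row total × column total) / grand total; for example, the man / Candidate-1 cell is 43 × 45 / 98 ≈ 19.74.
= (35-19.74)^2/19.74 + (10-25.26)^2/25.26 + (8-23.26)^2/23.26 + (45-29.74)^2/29.74
= 38.85718483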

This value is very close to the value from R's chisq.test. The small gap comes from rounding the expected frequencies to two decimals.

X-squared = 38.8319
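
The same hand calculation can be sketched in R without rounding the expected frequencies. The variable names observed, expected and x_squared are illustrative, not from the original session.

```r
# Observed counts for case-1, in the same layout as the table in the post.
observed <- matrix(c(35,  8,
                     10, 45),
                   nrow = 2, byrow = TRUE)

# Expected count per cell = row total * column total / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson's statistic: sum over cells of (observed - expected)^2 / expected.
x_squared <- sum((observed - expected)^2 / expected)

print(round(expected, 2))
print(x_squared)
```

With unrounded expected frequencies, the result should agree with the X-squared value that chisq.test reported.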

It is interesting.
