Test for Homogeneity
In a study involving two characteristics, researchers often want to know whether the two characteristics, say “A” and “B,” are linked or independent. For such a study we have paired categorical observations of size n, which are summarized in a contingency table.
The first column of the contingency table lists the categorical values (or levels) of the characteristic “A”. The remaining columns correspond to the categorical values (or responses) of the characteristic “B” and provide the cell frequencies $n_{ij}$.
The contingency table can be visualized by a mosaic plot. The area of each tile in the mosaic plot is proportional to the number $n_{ij}$ of observations for the response of B within the level of A. Thus, homogeneity is indicated by tiles of similar size across the different levels of A.
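A numerical counterpart of this visual check is to compare, for each level of A, the share of observations falling in each response of B. The sketch below is only an illustration: the 2 x 3 table of counts and the helper name `row_proportions` are hypothetical and do not come from the text.

```python
# Hypothetical 2 x 3 contingency table (illustrative counts, not from the text):
# rows are levels of A, columns are responses of B.
table = [
    [30, 20, 10],
    [20, 20, 20],
]


def row_proportions(table):
    """Return, for each level of A, the share of each response of B."""
    proportions = []
    for row in table:
        total = sum(row)
        proportions.append([count / total for count in row])
    return proportions


props = row_proportions(table)
for level, shares in enumerate(props, start=1):
    print(f"A level {level}:", [round(s, 3) for s in shares])
```

Similar shares across the rows correspond to tiles of similar relative size in the mosaic plot; how large a difference counts as evidence against homogeneity is what the chi-square test decides.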
The null hypothesis is stated as “the two characteristics are independent.”
Let $n_{i\cdot}$ and $n_{\cdot j}$ denote the total counts of the respective values of A and B (i.e., the row and column sums in the contingency table). Under the null hypothesis, the expected frequencies for the contingency table are given by
$$E_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n}$$
where n denotes the total of the cell counts. Then the chi-square statistic is
$$X^2 = \sum_{i} \sum_{j} \frac{(n_{ij} - E_{ij})^2}{E_{ij}}$$
where $n_{ij}$ is the observed frequency in cell $(i, j)$ and the sum runs over all cells of the table.
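The expected frequencies and the chi-square statistic can be computed directly from the row and column sums. The following is a minimal Python sketch; the 2 x 3 table of counts and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical observed counts (not from the text): rows = levels of A,
# columns = responses of B.
table = [
    [30, 20, 10],
    [20, 20, 20],
]


def expected_frequencies(table):
    """Expected count per cell under independence: row sum * column sum / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


expected = expected_frequencies(table)
statistic = chi_square_statistic(table)
print(expected)             # [[25.0, 20.0, 15.0], [25.0, 20.0, 15.0]]
print(round(statistic, 4))  # 5.3333
```

For this table both rows have the same total, so the expected counts are identical across the two levels of A.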
Let $r$ and $c$ denote the number of categorical values in the row and the column, respectively. Then we compare the statistic $X^2$ with the chi-square distribution with $(r-1)(c-1)$ degrees of freedom, and construct the critical region $X^2 > \chi^2_{\alpha,(r-1)(c-1)}$, where $\chi^2_{\alpha,(r-1)(c-1)}$ is the upper-$\alpha$ critical value for the significance level $\alpha$, to determine whether the null hypothesis can be rejected or not.
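The critical-region check can be sketched in Python, assuming SciPy is available. The statistic value 16/3 (about 5.33) for a 2 x 3 table and the function name `critical_region_test` are hypothetical inputs used only for illustration; `scipy.stats.chi2.ppf` supplies the quantile of the chi-square distribution.

```python
from scipy.stats import chi2


def critical_region_test(statistic, n_rows, n_cols, alpha=0.05):
    """Compare a chi-square statistic with the upper-alpha critical value."""
    df = (n_rows - 1) * (n_cols - 1)
    # The upper-alpha critical value is the (1 - alpha) quantile.
    critical_value = float(chi2.ppf(1 - alpha, df))
    reject = bool(statistic > critical_value)
    return reject, critical_value, df


# Hypothetical statistic from a 2 x 3 table (illustrative value only).
reject, critical_value, df = critical_region_test(16 / 3, n_rows=2, n_cols=3)
print(df, round(critical_value, 3), reject)
```

With 2 degrees of freedom the 5% critical value is about 5.991, so a statistic of about 5.33 does not fall in the critical region.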
Equivalently we can reject the null hypothesis (that is, we can find dependence and evidence of association of the two characteristics) if the p-value $p$ is significant (that is, $p < \alpha$ for the chosen significance level $\alpha$).
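The p-value is the upper-tail probability of the chi-square distribution beyond the observed statistic. A minimal sketch, assuming SciPy is available; the statistic value 16/3, the 2 x 3 table dimensions, and the level 0.05 are hypothetical and used only for illustration.

```python
from scipy.stats import chi2

# Hypothetical inputs (not from the text): a statistic from a 2 x 3 table.
statistic = 16 / 3
n_rows, n_cols = 2, 3
alpha = 0.05

df = (n_rows - 1) * (n_cols - 1)
# Survival function = P(chi-square with df degrees of freedom > statistic).
p_value = float(chi2.sf(statistic, df))
reject = p_value < alpha

print(round(p_value, 4), reject)
```

Here the p-value is about 0.069, which is not below 0.05, so the null hypothesis is not rejected at that level. SciPy also provides `scipy.stats.chi2_contingency`, which computes the statistic, p-value, degrees of freedom, and expected frequencies from a table of counts in one call.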
© TTU Mathematics
