Chi-Square

Chi-square test
How to perform Chi-square tests using SPSS

The chi-square is one of the most popular statistics because it is easy to calculate and interpret. There are two kinds of chi-square tests. The first is called a one-way analysis, and the second is called a two-way analysis. The purpose of both is to determine whether the observed frequencies (counts) markedly differ from the frequencies that we would expect by chance.
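As a minimal sketch of the one-way case, the following Python function compares a single row of counts with the counts expected by chance. The function name and the counts are hypothetical, and the default of equal expected counts per category is an assumption; a different chance hypothesis would supply its own expected values.

```python
def one_way_chi_square(observed, expected=None):
    """Return the chi-square statistic for a single row of counts."""
    n = sum(observed)
    if expected is None:
        # Assumption: "chance" means every category is equally likely.
        expected = [n / len(observed)] * len(observed)
    # Sum the squared deviation of each cell, scaled by its expected count.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 90 respondents choosing among three options.
counts = [40, 30, 20]
print(one_way_chi_square(counts))
```

A row whose counts already match the expected counts yields a statistic of zero; the further the counts drift from expectation, the larger the statistic becomes.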

The observed cell frequencies are organized in rows and columns like a spreadsheet. This table of observed cell frequencies is called a //contingency// table, and the chi-square test is part of a //contingency table analysis//.

The chi-square statistic is the sum of the contributions from each of the individual cells. Every cell in a table contributes something to the overall chi-square statistic. If a given cell differs markedly from the expected frequency, then the contribution of that cell to the overall chi-square is large. If a cell is close to the expected frequency for that cell, then the contribution of that cell to the overall chi-square is small. A large chi-square statistic indicates that somewhere in the table, the observed frequencies differ markedly from the expected frequencies. It does not tell you which cell (or cells) is causing the high chi-square, only that at least one is. When a chi-square is high, you must visually examine the table to determine which cell(s) are responsible.
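Written as a formula, the sum of cell contributions takes the standard form below. The symbols are notation chosen here for illustration: O is an observed cell frequency, E is the matching expected frequency, R and C are row and column totals, and N is the grand total. The expression for E applies to the two-way test; in a one-way test the expected frequencies come directly from the chance hypothesis.

```latex
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{N}
```

Each term in the double sum is the contribution of one cell, which is why a single badly fitting cell can dominate the total.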

When there are exactly two rows and two columns, the chi-square statistic becomes inaccurate, and Yates's correction for continuity is usually applied. Statistics Calculator will automatically use Yates's correction for two-by-two tables when the expected frequency of any cell is less than 5 or the total N is less than 50. If there is only one column or one row (a one-way chi-square test), the degrees of freedom is the number of cells minus one. For a two-way chi-square, the degrees of freedom is the number of rows minus one times the number of columns minus one.
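The degrees-of-freedom rule and the continuity correction can be sketched in Python as follows. This is an illustration under stated assumptions, not the Statistics Calculator implementation: the function names and the two-by-two counts are hypothetical, and the correction shown is the common textbook form that subtracts 0.5 from each absolute difference before squaring.

```python
def expected_counts(table):
    """Expected frequencies from the row totals, column totals, and N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def degrees_of_freedom(table):
    """Cells minus one for a one-way table, else (rows - 1) * (columns - 1)."""
    rows, cols = len(table), len(table[0])
    if rows == 1 or cols == 1:
        return rows * cols - 1
    return (rows - 1) * (cols - 1)


def yates_chi_square(table):
    """Chi-square for a 2x2 table with Yates's continuity correction."""
    total = 0.0
    for obs_row, exp_row in zip(table, expected_counts(table)):
        for o, e in zip(obs_row, exp_row):
            # Shrink each absolute deviation by 0.5 before squaring.
            total += (abs(o - e) - 0.5) ** 2 / e
    return total


# Hypothetical 2x2 table with N = 40, below the N < 50 threshold.
table = [[12, 8], [5, 15]]
print(degrees_of_freedom(table), yates_chi_square(table))
```

The corrected statistic is always smaller than the uncorrected one for the same table, so the correction makes the test more conservative.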

Using the chi-square statistic and its associated degrees of freedom, the software reports the probability that the differences between the observed and expected frequencies occurred by chance. Generally, a probability of .05 or less is considered to be a significant difference. A standard spreadsheet interface is used to enter the counts for each cell. After you've finished entering the data, the program will print the chi-square, degrees of freedom and probability of chance.
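The reported probability is the upper-tail area of the chi-square distribution for the given degrees of freedom. A minimal Python sketch follows, assuming whole-number degrees of freedom and using the standard closed-form series for the tail area. The function name `chi_square_probability` is illustrative and is not taken from Statistics Calculator.

```python
import math


def chi_square_probability(chi_square, df):
    """Upper-tail probability of a chi-square value with integer df >= 1."""
    x = float(chi_square)
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^r / r! for r = 0 .. df/2 - 1.
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal-tail term plus a finite series in odd powers of sqrt(x).
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(math.sqrt(x / 2))
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


print(chi_square_probability(3.841, 1))
```

A statistic of 3.841 with one degree of freedom sits almost exactly at the .05 threshold, which makes it a convenient sanity check for any implementation.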

Use caution when interpreting the chi-square statistic if any of the expected cell frequencies are less than five. Also, use caution when the total for all cells is less than 50.

A drug manufacturing company conducted a survey of customers. The research question is: Is there a significant relationship between packaging preference (size of the bottle purchased) and economic status? There were four packaging sizes: small, medium, large, and jumbo. Economic status was: lower, middle, and upper. The following data was collected.
**__Example__**

|| || Lower || Middle || Upper ||
|| Small || 24 || 22 || 18 ||
|| Medium || 23 || 28 || 19 ||
|| Large || 18 || 27 || 29 ||
|| Jumbo || 16 || 21 || 33 ||

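Chi-square statistic = 9.743
Degrees of freedom = 6
Probability of chance = .1359

Because the probability of chance is greater than .05, the relationship between packaging preference and economic status is not statistically significant.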
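The figures for this example can be checked with a short Python sketch that rebuilds the expected frequencies from the row and column totals and records the contribution of every cell. The counts are the ones in the table above; the variable names are illustrative, and no continuity correction is applied because the table is larger than two-by-two.

```python
row_labels = ["Small", "Medium", "Large", "Jumbo"]
col_labels = ["Lower", "Middle", "Upper"]
observed = [
    [24, 22, 18],
    [23, 28, 19],
    [18, 27, 29],
    [16, 21, 33],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Contribution of each cell: (observed - expected)^2 / expected.
contributions = {}
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        contributions[(row_labels[i], col_labels[j])] = (o - e) ** 2 / e

chi_square = sum(contributions.values())
df = (len(observed) - 1) * (len(observed[0]) - 1)
largest = max(contributions, key=contributions.get)

print(f"chi-square = {chi_square:.2f}, df = {df}")
print("largest contribution:", largest)
```

Listing the contributions replaces the visual inspection of the table: here the Jumbo/Upper cell contributes the most to the total, followed by Small/Lower.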
