Contingency Table -- from Wolfram MathWorld

A contingency table, sometimes called a two-way frequency table, is a tabular arrangement with at least two rows and two columns used in statistics to present categorical data in terms of frequency counts. More precisely, a contingency table shows the observed joint frequencies of two categorical variables, with the categories of one variable arranged in rows and those of the other in columns. The intersection of a row and a column of a contingency table is called a cell.

gender	cup	cone	sundae	sandwich	other
male	592	300	204	24	80
female	410	335	180	20	55

For example, the above contingency table has two rows and five columns (not counting the header row and column) and shows the results of a random sample of adults classified by two variables, namely gender and favorite way to eat ice cream (Larson and Farber 2014). One benefit of presenting data in a contingency table is that it makes basic probability calculations easier to perform, and easier still once a summary row and column are appended to the table.

gender	cup	cone	sundae	sandwich	other	total
male	592	300	204	24	80	1200
female	410	335	180	20	55	1000
total	1002	635	384	44	135	2200

The above table is an extended version of the first table obtained by adding a summary row and column. These summaries allow easier computation of several different probability-related quantities. For example, the probability that a sampled person prefers their ice cream in a cup is $P(\text{cup}) = 1002/2200 = 501/1100 \approx 0.4555$, while the probability that a random participant is female is $P(\text{female}) = 1000/2200 = 5/11 \approx 0.4545$. What's more, computing conditional probabilities is made easier using contingency tables, e.g., the probability that a person prefers ice cream sandwiches given that the person is male is $P(\text{sandwich} \mid \text{male}) = 24/1200 = 0.02$, while the conditional probability that a person is male given that ice cream sandwiches are preferred is $P(\text{male} \mid \text{sandwich}) = 24/44 = 6/11 \approx 0.5455$.
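The marginal and conditional probabilities discussed above can be checked with a short script. This is only a minimal sketch: the names `observed`, `styles`, and `count` are illustrative choices rather than anything from the source, and exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction

# Observed counts copied from the table above.
# Rows are gender; columns follow the order given in `styles`.
styles = ["cup", "cone", "sundae", "sandwich", "other"]
observed = {
    "male":   [592, 300, 204, 24, 80],
    "female": [410, 335, 180, 20, 55],
}

# Marginal totals, i.e. the summary column and summary row.
row_totals = {g: sum(counts) for g, counts in observed.items()}
col_totals = {s: sum(observed[g][i] for g in observed)
              for i, s in enumerate(styles)}
n = sum(row_totals.values())  # overall sample size


def count(gender, style):
    """Observed frequency in the cell (gender, style). Hypothetical helper."""
    return observed[gender][styles.index(style)]


# Marginal probabilities: a marginal total divided by the sample size.
p_cup = Fraction(col_totals["cup"], n)
p_female = Fraction(row_totals["female"], n)

# Conditional probabilities: a cell count divided by the total of the
# row or column being conditioned on.
p_sandwich_given_male = Fraction(count("male", "sandwich"), row_totals["male"])
p_male_given_sandwich = Fraction(count("male", "sandwich"), col_totals["sandwich"])

for name, p in [
    ("P(cup)", p_cup),
    ("P(female)", p_female),
    ("P(sandwich | male)", p_sandwich_given_male),
    ("P(male | sandwich)", p_male_given_sandwich),
]:
    print(f"{name} = {p} (about {float(p):.4f})")
```

Note that the two conditional probabilities share the same numerator (the cell count 24) and differ only in which marginal total serves as the denominator.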

Other common statistical analyses can be performed on data given in contingency table form. For example, one useful value to know is the so-called expected frequency $E_{r,c}$ of the cell at the intersection of row $r$ and column $c$, the formula for which is given by

$$E_{r,c} = \frac{(\text{sum of row } r)\,(\text{sum of column } c)}{\text{sample size}}. \tag{1}$$

Computing $E_{1,1}$ shows that the value one would expect at cell $(1,1)$, i.e., the expected number of men who prefer to eat ice cream from a cup, is approximately

$$E_{1,1} = \frac{1200 \cdot 1002}{2200} \approx 546.55, \tag{2}$$

whereby one may deduce that men who prefer cups are over-represented in the given sample relative to this expectation (592 observed against roughly 546.55 expected). Note, too, that knowing $E_{1,1}$ automatically gives, e.g., $E_{2,1}$, without repeated application of (1):

$$E_{2,1} = (\text{total people who prefer cups}) - E_{1,1} \approx 1002 - 546.55 = 455.45. \tag{3}$$
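The expected-frequency computation can be sketched for every cell at once. The snippet below assumes the usual formula (row total times column total, divided by the sample size); the function name `expected_frequencies` is a hypothetical choice made for this illustration.

```python
# Observed counts from the table above (rows: male, female;
# columns: cup, cone, sundae, sandwich, other).
observed = [
    [592, 300, 204, 24, 80],
    [410, 335, 180, 20, 55],
]


def expected_frequencies(table):
    """Return the table of expected counts: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)

# Print each row of expected counts, rounded for display only.
for label, row in zip(["male", "female"], expected):
    print(label, [round(e, 2) for e in row])

# Each column of expected counts sums to that column's observed total,
# which is why one entry of a two-row column determines the other.
cup_total = sum(row[0] for row in observed)
print(cup_total - expected[0][0])  # expected count for (female, cup)
```

Because the expected counts in each column add up to the observed column total, subtracting one entry from that total is a valid shortcut only when the table has two rows; with more rows, the formula has to be applied to all but one cell in the column.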

One of the major benefits of computing expected frequencies is the ability to test whether the two variables being examined (in this case, gender and favorite way to eat ice cream) are actually independent, as the expected-frequency formula assumes. This is done by computing, for each cell $(r,c)$, the expected frequency $E_{r,c}$, comparing it to the observed frequency $O_{r,c}$, and then performing a chi-squared test.
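The comparison of observed and expected frequencies can be sketched as follows, assuming the standard Pearson statistic (the sum over all cells of the squared difference between observed and expected counts, divided by the expected count) with (rows - 1)(columns - 1) degrees of freedom. The function names are hypothetical. The upper-tail probability uses a closed form that holds only for an even number of degrees of freedom, which happens to cover this 2-by-5 table; a statistics library would be the more general choice.

```python
import math

# Observed counts from the table above (rows: male, female).
observed = [
    [592, 300, 204, 24, 80],
    [410, 335, 180, 20, 55],
]


def chi_squared_independence(table):
    """Return (Pearson chi-squared statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count
            stat += (obs - exp) ** 2 / exp
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def chi2_upper_tail_even_dof(x, dof):
    """P(X > x) for a chi-squared variable with an EVEN number of
    degrees of freedom: exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form requires a positive even dof")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


stat, dof = chi_squared_independence(observed)
p_value = chi2_upper_tail_even_dof(stat, dof)
print(f"chi-squared = {stat:.4f}, dof = {dof}, p-value = {p_value:.2e}")
```

For this sample the statistic comes out near 23.5 on 4 degrees of freedom, well above the commonly tabulated 5% critical value of about 9.49, so under these assumptions the sample would be read as evidence against independence of gender and serving preference.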

Another common test associated with contingency tables is the so-called homogeneity of proportions test, a form of chi-squared test used to determine whether several proportions are equal when samples are taken from different populations (Larson and Farber 2014). Worth noting is that both of the above-mentioned chi-squared tests require randomly selected observed frequencies, each with an expected frequency of at least 5. These tests play important roles throughout various branches of statistics.
