In the English language, the probability of encountering the $r$th most common word is given roughly by $P(r) = 0.1/r$ for $r$ up to 1000 or so. The law breaks down for less frequent words, since the harmonic series diverges and the probabilities would eventually sum to more than 1. Pierce's (1980, p. 87) statement that $\sum P(r) > 1$ for $r = 8727$ is incorrect. Goetz states the law as follows: the frequency of a word is inversely proportional to its statistical rank $r$ such that

$$P(r) \approx \frac{1}{r \ln(1.78 R)},$$

where $R$ is the number of different words.
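
The arithmetic behind these statements can be checked numerically. The following is a minimal sketch, assuming the commonly quoted forms $P(r) = 0.1/r$ for the simple law and $P(r) \approx 1/(r \ln(1.78 R))$ for Goetz's version; the function names are illustrative and do not come from any of the cited sources.

```python
import math


def zipf_probability(rank, scale=0.1):
    """Assumed simple form of the law: P(r) = scale / r."""
    return scale / rank


def cumulative_probability(max_rank, scale=0.1):
    """Sum of P(r) over ranks 1..max_rank (a scaled harmonic number)."""
    return sum(zipf_probability(r, scale) for r in range(1, max_rank + 1))


def first_rank_exceeding_one(scale=0.1):
    """Smallest rank at which the running total of P(r) exceeds 1."""
    total = 0.0
    rank = 0
    while total <= 1.0:
        rank += 1
        total += scale / rank
    return rank


def goetz_probability(rank, vocabulary_size):
    """Assumed form of Goetz's statement: P(r) = 1 / (r * ln(1.78 * R))."""
    return 1.0 / (rank * math.log(1.78 * vocabulary_size))


if __name__ == "__main__":
    print("sum of P(r) up to r = 1000:", cumulative_probability(1000))
    print("sum of P(r) up to r = 8727:", cumulative_probability(8727))
    print("first rank with total > 1: ", first_rank_exceeding_one())
    R = 10_000  # hypothetical vocabulary size, chosen only for illustration
    print("Goetz total over all ranks:",
          sum(goetz_probability(r, R) for r in range(1, R + 1)))
```

Under these assumptions the running total of the simple law is still below 1 at rank 8727 and crosses 1 only somewhat beyond rank 12,000. In Goetz's form the factor $\ln(1.78 R)$ acts as an approximate normalizing constant, since the harmonic number $H_R \approx \ln R + \gamma$ and $e^{\gamma} \approx 1.78$, so the probabilities over all $R$ ranks sum to very nearly 1.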