In the English language, the probability of encountering the $r$th most common word is given roughly by
$P(r) = 0.1/r$ for $r$
up to 1000 or so. The law breaks down for less frequent words,
since the harmonic series diverges. Pierce's (1980,
p. 87) statement that
$\sum P(r) > 1$ for $r = 8727$
is incorrect. Goetz states the
law as follows: The frequency of a word is inversely proportional to its statistical
rank $r$
such that $P(r) \approx 1/[r \ln(1.78 R)]$,
where $R$ is the number of different words.
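
The breakdown at large ranks can be checked numerically. The sketch below assumes the Zipf form $P(r) = 0.1/r$ and finds the first rank at which the cumulative probability exceeds 1. It also sums a Goetz-style form, taken here to be $1/[r \ln(1.78 R)]$, over all $R$ ranks to see how close the total comes to 1. The function names `harmonic`, `crossover_rank`, and `goetz_total` are illustrative and do not come from the cited sources.

```python
import math


def harmonic(n):
    """Return the n-th harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return math.fsum(1.0 / r for r in range(1, n + 1))


def crossover_rank(scale=0.1):
    """Return the smallest n for which sum_{r=1}^{n} scale/r exceeds 1.

    Assumes P(r) = scale / r; the loop terminates because the
    harmonic series diverges.
    """
    total = 0.0
    n = 0
    while scale * total <= 1.0:
        n += 1
        total += 1.0 / n
    return n


def goetz_total(R):
    """Sum of 1 / (r * ln(1.78 * R)) over r = 1..R (assumed Goetz form)."""
    return harmonic(R) / math.log(1.78 * R)


if __name__ == "__main__":
    print("cumulative probability at r = 1000:", 0.1 * harmonic(1000))
    print("first rank where the total exceeds 1:", crossover_rank())
    print("Goetz-style total for R = 10000:", goetz_total(10000))
```

With these assumptions the cumulative probability is still below 1 at $r = 1000$ and first exceeds 1 at a rank somewhat above 12000. The Goetz-style total stays very close to 1, because $\ln(1.78 R)$ is approximately $\ln R + \gamma$, the leading behaviour of the $R$th harmonic number.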