The word quantile has no fewer than two distinct meanings in probability. Specific elements $x$ in the range of a variate $X$ are called quantiles, and denoted $x$ (Evans et al. 2000, p. 5). This meaning is closely tied to the so-called quantile function, which assigns to each probability $p$ attained by a given probability density function $f = f(X)$ a value $Q_f(p)$ defined by
$$Q_f(p) = \inf\{x \in \mathbb{R} : p \le P(X \le x)\} \tag{1}$$
The $k$th $n$-tile $P_k$ is that value of $x$, say $x_k$, which corresponds to a cumulative frequency of $Nk/n$, where $N$ is the sample size (Kenney and Keeping 1962). If $n = 4$, the quantity is called a quartile, and if $n = 100$, it is called a percentile.
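As an illustration of the quantile function and of $n$-tiles, the following minimal Python sketch evaluates the smallest $x$ whose cumulative probability reaches $p$ for a finite discrete distribution. The function name `quantile_function` and the dictionary representation of the distribution are assumptions made for this example only.

```python
from fractions import Fraction as Fr

def quantile_function(pmf, p):
    """Return the smallest x with p <= P(X <= x).

    pmf is a hypothetical representation: a dict mapping each value of a
    finite discrete variate to its probability.
    """
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    cumulative = 0
    for x in sorted(pmf):
        cumulative += pmf[x]          # running value of P(X <= x)
        if p <= cumulative:
            return x
    raise ValueError("probabilities sum to less than p")

# A fair six-sided die, with exact rational probabilities.
die = {face: Fr(1, 6) for face in range(1, 7)}

# Quartiles are the n-tiles with n = 4, evaluated at p = k/4 for k = 1, 2, 3.
quartiles = [quantile_function(die, Fr(k, 4)) for k in (1, 2, 3)]
print(quartiles)
```

Exact fractions are used so that comparisons such as $p \le P(X \le x)$ are not affected by floating-point rounding at the jumps of the cumulative distribution.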
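A parametrized version of quantile is implemented in the Wolfram Language as Quantile[list, q, {{a, b}, {c, d}}], which returns
$$x_{(\lfloor r \rfloor)} + \left(x_{(\lceil r \rceil)} - x_{(\lfloor r \rfloor)}\right)\left(c + d \operatorname{frac}(r)\right) \tag{2}$$
where $x_{(i)}$ is the $i$th order statistic, $\lfloor r \rfloor$ is the floor function, $\lceil r \rceil$ is the ceiling function, $\operatorname{frac}(r)$ is the fractional part, and, with $N$ the length of the list,
$$r = a + (N + b)\,q \tag{3}$$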
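The parametrized formula can be sketched in a few lines of Python. This is an illustrative implementation rather than the Wolfram Language's own code: the name `parametrized_quantile` is hypothetical, and clamping out-of-range indices to the first or last order statistic is an assumption of this sketch.

```python
import math

def parametrized_quantile(data, q, a=0, b=0, c=1, d=0):
    """Evaluate the parametrized quantile of equations (2) and (3).

    The default parameters (0, 0, 1, 0) give the inverted empirical CDF.
    """
    if not data:
        raise ValueError("data must be non-empty")
    if not 0 <= q <= 1:
        raise ValueError("q must lie in [0, 1]")
    xs = sorted(data)                  # xs[i - 1] is the i-th order statistic
    n = len(xs)
    r = a + (n + b) * q                # equation (3)
    frac = r - math.floor(r)           # fractional part of r
    # Assumption of this sketch: indices outside 1..n are clamped to 1 or n.
    lo = min(max(math.floor(r), 1), n)
    hi = min(max(math.ceil(r), 1), n)
    # Equation (2): interpolate between neighbouring order statistics.
    return xs[lo - 1] + (xs[hi - 1] - xs[lo - 1]) * (c + d * frac)

sample = [15, 1, 10, 3, 6]
print(parametrized_quantile(sample, 0.25))              # inverted empirical CDF
print(parametrized_quantile(sample, 0.25, 0, 0, 0, 1))  # linear interpolation
```

With $c = 1$ and $d = 0$ the result is always an observed data value, whereas $c = 0$ and $d = 1$ interpolates linearly between two adjacent order statistics.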
There are a number of slightly different definitions of the quantile that are in common use, as summarized in the following table.
# | $a$ | $b$ | $c$ | $d$ | plotting position | description
Q1 | 0 | 0 | 1 | 0 | | inverted empirical CDF
Q2 | -- | -- | -- | -- | | inverted empirical CDF with averaging
Q3 | 1/2 | 0 | 0 | 0 | | observation numbered closest to $qN$
Q4 | 0 | 0 | 0 | 1 | $k/N$ | California Department of Public Works method
Q5 | 1/2 | 0 | 0 | 1 | $(k-1/2)/N$ | Hazen's model (popular in hydrology)
Q6 | 0 | 1 | 0 | 1 | $k/(N+1)$ | Weibull quantile
Q7 | 1 | -1 | 0 | 1 | $(k-1)/(N-1)$ | interpolation points divide sample range into $N-1$ intervals
Q8 | 1/3 | 1/3 | 0 | 1 | $(k-1/3)/(N+1/3)$ | unbiased median
Q9 | 3/8 | 1/4 | 0 | 1 | $(k-3/8)/(N+1/4)$ | approximate unbiased estimate for a normal distribution
The Wolfram Language's parametrization can handle all of these but Q2. In Q1, the empirical distribution function is the estimated cumulative proportion of the data set that does not exceed any specified value. Q2 is essentially the same as Q1 except that averages are taken at points of discontinuity. In Q3, the $q$th quantile is the observation numbered closest to $qN$, where $N$ is the sample size. In Q4, the interpolation points divide the sample range into $N$ intervals. In Q6, the vertices divide the sample into $N+1$ regions, each with probability $1/(N+1)$ on average. It was proposed by Weibull in 1939, and plots at the mean position. Q7 divides the range into $N-1$ intervals, of which exactly $100q\%$ lie to the left of $Q_7(q)$. Q8 plots at the median position. Q9 is used in quantile-quantile plots. If $F$ is the normal distribution function and $p_k$ is the plotting position of the order statistic $x_{(k)}$, then $F^{-1}(p_k)$ is an approximately unbiased estimate of the expected value of $x_{(k)}$.
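To make the differences between the nine definitions concrete, the following Python sketch evaluates each of them on a small sample. The parameter tuples are assumed to be the values usually quoted for these definitions, the function names are hypothetical, and Q2 is coded separately because its averaging step does not fit the four-parameter form. Exact fractions are used so that the fractional part of $r$ is not disturbed by rounding.

```python
from fractions import Fraction as Fr
import math

# Assumed (a, b, c, d) for each definition; Q2 has no such parameters.
PARAMS = {
    "Q1": (0, 0, 1, 0),
    "Q3": (Fr(1, 2), 0, 0, 0),
    "Q4": (0, 0, 0, 1),
    "Q5": (Fr(1, 2), 0, 0, 1),
    "Q6": (0, 1, 0, 1),
    "Q7": (1, -1, 0, 1),
    "Q8": (Fr(1, 3), Fr(1, 3), 0, 1),
    "Q9": (Fr(3, 8), Fr(1, 4), 0, 1),
}

def clamp(i, n):
    """Restrict an order-statistic index to the range 1..n."""
    return min(max(i, 1), n)

def quantile(data, q, a, b, c, d):
    """Four-parameter quantile with r = a + (N + b) q."""
    xs = sorted(data)
    n = len(xs)
    r = a + (n + b) * q
    lo, hi = clamp(math.floor(r), n), clamp(math.ceil(r), n)
    frac = r - math.floor(r)
    return xs[lo - 1] + (xs[hi - 1] - xs[lo - 1]) * (c + d * frac)

def quantile_q2(data, q):
    """Inverted empirical CDF, averaging at its points of discontinuity."""
    xs = sorted(data)
    n = len(xs)
    r = n * q
    j = math.floor(r)
    if r == j:
        # N q is an integer: average the two neighbouring order statistics.
        return (xs[clamp(j, n) - 1] + xs[clamp(j + 1, n) - 1]) / 2
    return xs[math.ceil(r) - 1]

sample = [1, 3, 6, 10, 15]
q = Fr(1, 4)
results = {name: quantile(sample, q, *p) for name, p in PARAMS.items()}
results["Q2"] = quantile_q2(sample, q)
for name in sorted(results):
    print(name, results[name])
```

On this sample the lower quartile ranges from 1 (Q3) to 3 (Q1, Q2, and Q7), which shows how much the choice of definition can matter for small data sets.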