PERCENT_RANK Window Function
A row's relative position within its partition, scaled to [0, 1]. First row gets 0, last row gets 1, the rest are spaced by (rank-1)/(N-1). The cleanest way to filter top-x-percent of a group.
PERCENT_RANK() calculates a row's relative position within its partition as a number between 0 and 1. The first row always gets 0, the last row always gets 1, and everything in between is spread out proportionally.
The formula it uses is straightforward: (rank − 1) / (total_rows − 1). So if there are 5 rows, the percent ranks are 0.00, 0.25, 0.50, 0.75, 1.00.
[PARTITION BY column_list]
ORDER BY column_list [ASC | DESC]
)
Like the ranking functions, PERCENT_RANK() takes no arguments. ORDER BY is required.
CREATE TABLE employees (
year INT NOT NULL,
name VARCHAR(20) NOT NULL,
amount NUMERIC NOT NULL
);
INSERT INTO employees VALUES
(2018, 'Jack', 15000),
(2018, 'Jane', 9000),
(2018, 'John', 8000),
(2018, 'Stephen', 7000),
(2018, 'Harold', 6000),
(2019, 'Yin', 20000),
(2019, 'Jack', 18000),
(2019, 'Jane', 16000),
(2019, 'John', 14000),
(2019, 'Harold', 12000);
SELECT name, amount,
PERCENT_RANK() OVER (ORDER BY amount) AS pct_rank
FROM employees
WHERE year = 2018;
| name | amount | pct_rank |
|---|---|---|
| Harold | 6000 | 0.00 |
| Stephen | 7000 | 0.25 |
| John | 8000 | 0.50 |
| Jane | 9000 | 0.75 |
| Jack | 15000 | 1.00 |
Five rows means four "gaps" between them, so each step is 0.25. Harold (the smallest) gets 0; Jack (the largest) gets 1.
SELECT year, name, amount,
ROUND(PERCENT_RANK() OVER (
PARTITION BY year
ORDER BY amount
)::numeric, 2) AS pct_rank
FROM employees
ORDER BY year, amount;
The percent rank resets in each year — every year has its own 0..1 distribution.
WITH ranked AS (
SELECT year, name, amount,
PERCENT_RANK() OVER (
PARTITION BY year
ORDER BY amount DESC -- DESC so high amounts get low pct_rank
) AS pr
FROM employees
)
SELECT * FROM ranked WHERE pr <= 0.25;
Both produce numbers between 0 and 1, but they answer different questions:
| Function | Formula | Range | First row | Last row |
|---|---|---|---|---|
| PERCENT_RANK | (rank−1)/(N−1) | 0 ≤ x ≤ 1 | 0 | 1 |
| CUME_DIST | (rows ≤ this) / N | 0 < x ≤ 1 | 1/N | 1 |
PERCENT_RANK is "where in the rank order am I, scaled to 0..1." CUME_DIST is "what fraction of rows are at or below me." They're close but not the same.
Ties get the same percent rank. Concretely, the percent rank is computed using the same rank-with-gaps that RANK() would assign — so two tied rows share the percent rank of whichever comes first.
PERCENT_RANK()returns each row's relative position as a value in [0, 1].- First row is always 0; last row is always 1.
- Formula:
(rank − 1) / (total_rows − 1). - Use it for percentile cutoffs (top 10% / bottom quartile / etc.).
- Subtly different from
CUME_DIST— pick the one that matches your definition.