Multicolumn Index
An index over two or more columns. Column order is what makes or breaks it — queries have to use a leftmost prefix of the column list. Plus the storage cost and which index types support multicolumn.
A multicolumn index (or composite index) is one index built over two or more columns. It speeds up queries that filter or sort by those columns together — and it's much more efficient than maintaining several single-column indexes that happen to all participate in the same query.
The catch: column order matters a great deal. A query has to use the leftmost prefix of the index for the index to help.
ON table_name (col_a, col_b, col_c);
Suppose you create an index on (a, b, c) in that order. The optimizer can use it for queries that filter on:
| Query filter | Index used? |
|---|---|
WHERE a = ? | Yes |
WHERE a = ? AND b = ? | Yes |
WHERE a = ? AND b = ? AND c = ? | Yes |
WHERE b = ? | Mostly no — skips the leading column |
WHERE c = ? | No |
WHERE b = ? AND c = ? | Mostly no — still skips the leading column |
WHERE a = ? is your most common query, lead with a. If you typically filter on b too, follow with b. Less-frequent columns go last.CREATE TABLE students3 (
student_id SERIAL PRIMARY KEY,
first_name TEXT,
last_name TEXT
);
-- Generate 2 million rows for a meaningful benchmark
INSERT INTO students3 (first_name, last_name)
SELECT 'First_' || g,
'Last_' || g
FROM generate_series(1, 2000000) g;
EXPLAIN ANALYZE
SELECT * FROM students3
WHERE first_name = 'First_999500'
AND last_name = 'Last_999500';
-- Sequential Scan on students3 ... rows=2000000 ... actual time=350.4 ms
PostgreSQL has to scan every row — the cost grows with the table.
CREATE INDEX idx_students3_name
ON students3 (first_name, last_name);
EXPLAIN ANALYZE
SELECT * FROM students3
WHERE first_name = 'First_999500'
AND last_name = 'Last_999500';
-- Index Scan using idx_students3_name ... actual time=0.05 ms
Several thousand times faster. The index lets PostgreSQL navigate straight to the matching row.
-- Uses the index — first_name is the leading column
SELECT * FROM students3 WHERE first_name = 'First_500';
-- Doesn't use the index efficiently — last_name is not the leading column
SELECT * FROM students3 WHERE last_name = 'Last_500';
If you frequently search on last_name alone, you'd want either a separate index on last_name, or to reorder the multicolumn index as (last_name, first_name).
| Type | Multicolumn? |
|---|---|
| B-tree | Yes — up to 32 columns |
| GIST | Yes |
| GIN | Yes |
| BRIN | Yes |
| Hash | No — single column only |
| SP-GIST | No — single column only |
A multicolumn index is bigger than a single-column index — it stores combinations of values. Adding too many columns can make the index larger than the table, slow down writes, and use up disk that could be spent elsewhere. Aim for the smallest column list that handles your hot queries.
- Multicolumn (composite) indexes index two or more columns at once.
- Order matters — the index helps queries that filter on the leftmost prefix of the columns.
- Put the column you filter on most often first.
- B-tree, GIST, GIN, and BRIN support multicolumn; Hash and SP-GIST don't.
- Bigger index, more disk, slower writes — keep the column list as small as possible.