
Indexes — Introduction

A side data structure that lets the database jump straight to the rows you want instead of scanning every page. Learn the index types PostgreSQL offers, when each fits, and the trade-offs before you sprinkle them everywhere.

What an index is

An index is a side data structure that helps the database find rows quickly. The classic analogy is the index at the back of a book: instead of flipping through every page to find a topic, you look it up in a sorted list and jump straight there. Database indexes work the same way — without one, finding a row means scanning the whole table; with one, the database can navigate directly to the matching rows.

The trade-off: faster reads in exchange for slightly slower writes (every INSERT / UPDATE / DELETE has to keep the index in sync) and some extra disk space. For most tables, that's a great deal — but only if the index actually gets used.

Without index vs with index
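
One way to see the difference is to ask the planner for its plan before and after adding an index. The sketch below assumes a hypothetical email column on the customers table; the column, the index name, and the literal value are made up for illustration. On a very small table the planner may still prefer a sequential scan even when the index exists.

SQL

-- Hypothetical column and value, for illustration only.
EXPLAIN SELECT * FROM customers WHERE email = 'ada@example.com';
-- Without an index on email, the plan usually contains
-- "Seq Scan on customers".

CREATE INDEX idx_customers_email
    ON customers (email);

-- Refresh planner statistics so the new plan reflects current data.
ANALYZE customers;

EXPLAIN SELECT * FROM customers WHERE email = 'ada@example.com';
-- On a table of meaningful size, the plan typically switches to an
-- "Index Scan" or "Bitmap Index Scan" on idx_customers_email.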

Syntax

CREATE INDEX index_name
    ON table_name
    [USING index_type]
    (column_name [, column_name, ...]);

If you omit the USING clause, PostgreSQL creates a B-tree index, which is the right choice for the vast majority of cases.

The index types PostgreSQL offers

Type	Best for	Notes
B-tree	Equality, range, ORDER BY, anything sortable	Default. Use it unless you have a reason not to.
Hash	Equality only (=)	Slightly faster than B-tree for pure equality, but rarely worth it.
GIN	Arrays, JSONB, full-text search	Many keys per row. Slower writes, fast contains-style queries.
GiST	Geometric data, ranges, full-text	Generalized for "nearest neighbour" and overlap queries.
SP-GiST	Non-balanced data (phone numbers, IPs)	Space-partitioned variant of GiST.
BRIN	Very large tables with naturally ordered data (timestamps)	Tiny size, lossy. Great for "data lake" style logs.

Example — B-tree (the default)

SQL

CREATE INDEX idx_customer_id
    ON customers (customer_id);

-- Equivalent statement with the access method spelled out.
-- Run one or the other: an index name can only be used once per schema.
CREATE INDEX idx_customer_id
    ON customers USING BTREE (customer_id);

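Equality (=), range (<, >, BETWEEN), IS NULL, and sorting (ORDER BY) can all use B-tree indexes. So can pattern matching anchored at the left (LIKE 'abc%'), although in a database that does not use the C locale the index has to be created with the text_pattern_ops operator class for that to work.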
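
As a rough illustration, each query below has a shape the planner can serve from a single-column B-tree. The first three reuse idx_customer_id from above. The name column and the idx_customer_name_pattern index are assumptions added for the example.

SQL

-- Equality and range on the indexed column
SELECT * FROM customers WHERE customer_id = 42;
SELECT * FROM customers WHERE customer_id BETWEEN 100 AND 200;

-- Sorting can read the index in order instead of sorting the result
SELECT * FROM customers ORDER BY customer_id LIMIT 10;

-- Left-anchored LIKE on a text column (hypothetical "name" column).
-- text_pattern_ops makes the index usable for LIKE when the database
-- does not use the C locale.
CREATE INDEX idx_customer_name_pattern
    ON customers (name text_pattern_ops);

SELECT * FROM customers WHERE name LIKE 'Smi%';

Whether the planner actually picks the index depends on table size and statistics, so it is worth confirming with EXPLAIN.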

Example — Hash for pure equality

SQL

CREATE INDEX idx_product_id
    ON customers USING HASH (product_id);

Hash indexes only help with =. They're rarely the best choice — B-tree handles equality nearly as fast and supports much more besides.
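
A short sketch of what that means in practice, reusing idx_product_id from above. The literal values are made up.

SQL

-- Can use the hash index: a plain equality test
SELECT * FROM customers WHERE product_id = 1001;

-- Cannot use the hash index: range comparisons and sorting
-- need an ordered structure such as a B-tree
SELECT * FROM customers WHERE product_id > 1001;
SELECT * FROM customers ORDER BY product_id;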

Example — GIN for JSONB / arrays / full-text

SQL

-- JSONB key/value lookups
CREATE INDEX idx_doc_data ON documents USING GIN (data);

-- "tag in array" lookups
CREATE INDEX idx_post_tags ON posts USING GIN (tags);

-- Full-text search
CREATE INDEX idx_post_search ON posts USING GIN (to_tsvector('english', body));

GIN ("Generalized Inverted iNdex") shines when one row contains many searchable values — like every tag in an array, every word in a body of text, or every key in a JSONB document.
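
The queries below sketch the operators these three indexes are meant to serve. The JSONB key, the tag value, and the search terms are invented for illustration, and the tags column is assumed to be of type text[].

SQL

-- JSONB containment: documents whose data includes this key/value pair
SELECT * FROM documents WHERE data @> '{"status": "published"}';

-- Array containment: posts tagged 'postgres' (tags assumed to be text[])
SELECT * FROM posts WHERE tags @> ARRAY['postgres'];

-- Full-text search: the expression has to match the indexed one,
-- including the 'english' configuration, for the index to be usable
SELECT * FROM posts
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'index & scan');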

Example — BRIN for huge time-series tables

SQL

-- Logs table where rows are inserted in time order
CREATE INDEX idx_logs_created ON logs USING BRIN (created_at);

BRIN ("Block Range INdex") stores summary information, such as the minimum and maximum value, for each range of disk blocks. The index itself is tiny, but it only helps when the physical order of rows on disk closely follows the indexed column. For an append-only time-series log table that's almost always the case.
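
A sketch of how you might check that ordering and then query through the index. pg_stats is a standard system view; the seven-day window is just an example value.

SQL

-- A correlation close to 1 or -1 means the physical row order closely
-- follows created_at. Run ANALYZE logs first so the statistic is current.
SELECT correlation
FROM pg_stats
WHERE tablename = 'logs' AND attname = 'created_at';

-- A range query that lets BRIN skip block ranges outside the window
SELECT count(*)
FROM logs
WHERE created_at >= now() - interval '7 days';

If the correlation is close to 0, a B-tree on created_at is likely the better fit.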

Trade-offs to keep in mind

Win	Cost
Faster SELECT, JOIN, ORDER BY	Disk space, sometimes more than the table itself
Faster constraint checks (UNIQUE, FOREIGN KEY)	Slower INSERT, UPDATE, DELETE
Quicker MIN/MAX/COUNT in many cases	Index may need REINDEX as it bloats over time
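
To put numbers on the disk-space cost, the built-in size functions can compare a table with its indexes. This sketch reuses the customers table and idx_customer_id from the earlier example. REINDEX with CONCURRENTLY assumes PostgreSQL 12 or later.

SQL

-- Table size vs the combined size of all its indexes
SELECT pg_size_pretty(pg_table_size('customers'))   AS table_size,
       pg_size_pretty(pg_indexes_size('customers')) AS indexes_size;

-- Rebuild a bloated index without blocking writes (PostgreSQL 12+)
REINDEX INDEX CONCURRENTLY idx_customer_id;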

💡 Don't index everything. An index that never gets used is pure overhead. Find the slow queries first, use EXPLAIN to see how they are executed, then index the columns those queries filter or join on. Later, check pg_stat_user_indexes to find indexes that aren't pulling their weight.
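
As a starting point, a query along these lines lists the least-used indexes first. The column aliases are arbitrary; the view and its columns are part of PostgreSQL's statistics system.

SQL

-- Indexes ordered by how rarely they have been scanned since the
-- statistics were last reset, largest first among ties
SELECT schemaname,
       relname      AS table_name,
       indexrelname AS index_name,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC, pg_relation_size(indexrelid) DESC;

An idx_scan of 0 is a hint, not a verdict: an index that backs a PRIMARY KEY or UNIQUE constraint is still doing its job even if no query ever reads through it.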

Indexes you get for free

Some indexes are created automatically — you don't have to ask:

Primary keys get a unique B-tree index automatically.
UNIQUE constraints create a unique B-tree index to enforce themselves.
Foreign keys do not create an index of their own. The parent side is already indexed, because the referenced columns must be a primary key or UNIQUE, but the child side is not; see the FK page for why you usually want to add the child index yourself.
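
A minimal sketch of the foreign-key case, using a hypothetical orders table. It assumes customers.customer_id is a bigint primary key, which this page does not state.

SQL

-- Hypothetical child table referencing customers
CREATE TABLE orders (
    order_id    bigint PRIMARY KEY,          -- unique index created automatically
    customer_id bigint NOT NULL
        REFERENCES customers (customer_id)   -- no index created on this column
);

-- Add the child-side index yourself
CREATE INDEX idx_orders_customer_id
    ON orders (customer_id);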

Where to go next

Unique index — enforce no-duplicates plus get a fast lookup.
Multicolumn index — index more than one column; column order matters.
Partial index — index only the rows you actually query.
Index on expression — index a computed value (e.g. LOWER(email)).
REINDEX / DROP / list — maintenance and inspection.