PL/pgSQL Bulk Processing
Master PL/pgSQL bulk processing the PostgreSQL way — set-based SQL beats Oracle-style BULK COLLECT/FORALL translations. Covers single-statement UPDATEs, UPDATE...FROM unnest() for per-row distinct values (the FORALL equivalent), array_agg for collection, multi-row VALUES INSERTs, and cursor-based batching for very large result sets.
If you're coming from Oracle PL/SQL, you know BULK COLLECT INTO and FORALL — language constructs that let you fetch many rows into an array in one go, then perform a batch DML against that array. The point: avoid the per-row context-switching overhead of "loop, fetch one row, do something, loop again."
PostgreSQL doesn't have those exact statements. What it has is even better for most cases: an SQL engine that's already optimized to operate on whole result sets at once. The PL/pgSQL equivalent of bulk operations falls into three patterns:
| Pattern | Best for |
|---|---|
| Set-based SQL — single UPDATE/INSERT covering many rows | The vast majority of use cases. Almost always the fastest. |
| Arrays + SELECT array_agg(...) | When you need the data in PL/pgSQL memory for further processing |
| Cursor + batched FETCH | When working with very large result sets that won't fit in memory |
Don't translate BULK COLLECT code mechanically into PostgreSQL. The Oracle pattern of "fetch into array, loop the array, run individual updates" is usually the wrong PostgreSQL idiom; a single set-based SQL statement is faster and clearer. Reach for arrays only when the per-row processing genuinely can't be expressed in plain SQL.

First, a small table to work with:

```sql
DROP TABLE IF EXISTS employees;
CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    salary      DECIMAL(10, 2)
);

INSERT INTO employees (salary) VALUES
    (50000.00),
    (60000.00),
    (70000.00),
    (80000.00),
    (90000.00);

SELECT * FROM employees;
```
Goal: increase every employee's salary by 10%. The PostgreSQL way — a single SQL statement that operates on the whole table:
```sql
DROP FUNCTION IF EXISTS update_all_salaries();

CREATE OR REPLACE FUNCTION update_all_salaries()
RETURNS VOID AS $$
BEGIN
    UPDATE employees
    SET salary = salary * 1.1;
END;
$$ LANGUAGE plpgsql;

SELECT update_all_salaries();

SELECT * FROM employees;
```
One UPDATE statement updated all five rows. PostgreSQL's planner executes this as a single sequential scan, updating each qualifying row as it goes; about as efficient as it gets. This is the "bulk processing" pattern in PostgreSQL: let SQL do the work.
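If you miss Oracle's SQL%ROWCOUNT for confirming how many rows a bulk statement touched, PL/pgSQL's GET DIAGNOSTICS fills that role. A minimal sketch (re-running the raise purely to demonstrate):

```sql
DO $$
DECLARE
    n INTEGER;
BEGIN
    UPDATE employees
    SET salary = salary * 1.1;

    -- ROW_COUNT reports rows affected by the most recent SQL statement
    GET DIAGNOSTICS n = ROW_COUNT;
    RAISE NOTICE 'Updated % rows', n;
END
$$;
```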
When you genuinely need the data in PL/pgSQL memory — say, to compute something across rows that requires iteration — use array_agg to collect a column into an array in one SELECT:
```sql
DO $$
DECLARE
    emp_ids      INT[];
    emp_salaries DECIMAL[];
    total        DECIMAL := 0;
BEGIN
    -- Collect both columns into arrays in one round trip
    SELECT array_agg(employee_id ORDER BY employee_id),
           array_agg(salary ORDER BY employee_id)
    INTO emp_ids, emp_salaries
    FROM employees;

    -- Walk the arrays in PL/pgSQL
    FOR i IN 1..array_length(emp_ids, 1) LOOP
        RAISE NOTICE 'Employee %: %', emp_ids[i], emp_salaries[i];
        total := total + emp_salaries[i];
    END LOOP;

    RAISE NOTICE 'Total payroll: %', total;
END
$$;
```
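As a side note, when you only need one array and no index arithmetic, PL/pgSQL's FOREACH ... IN ARRAY loop walks the elements directly. A minimal sketch of the same payroll total:

```sql
DO $$
DECLARE
    sals  DECIMAL[];
    s     DECIMAL;
    total DECIMAL := 0;
BEGIN
    SELECT array_agg(salary) INTO sals FROM employees;

    -- FOREACH iterates elements directly; no subscript bookkeeping
    FOREACH s IN ARRAY sals LOOP
        total := total + s;
    END LOOP;

    RAISE NOTICE 'Total payroll: %', total;
END
$$;
```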
Note that `SELECT SUM(salary) FROM employees` would compute the total faster than fetching everything into arrays and looping. The array pattern earns its keep when per-row processing genuinely can't be expressed in SQL: string formatting, calling external functions, complex conditional logic that depends on previous-row state, and so on.

Sometimes the value to update each row to differs per row. If you have those values in PL/pgSQL, perhaps computed from external logic, you can update them all in one SQL statement using `UPDATE ... FROM unnest(...)`:
```sql
DO $$
DECLARE
    -- Imagine these came from some external computation
    target_ids     INT[]     := ARRAY[1, 3, 5];
    target_bonuses DECIMAL[] := ARRAY[1000.00, 2000.00, 3000.00];
BEGIN
    -- Apply all the per-id updates in a single SQL statement
    UPDATE employees AS e
    SET salary = e.salary + b.bonus
    FROM unnest(target_ids, target_bonuses) AS b(emp_id, bonus)
    WHERE e.employee_id = b.emp_id;

    RAISE NOTICE 'Bonuses applied';
END
$$;

SELECT * FROM employees;
```
Three different bonus amounts applied to three different employees in a single UPDATE statement. unnest turns the parallel arrays into a virtual table of (emp_id, bonus) rows; the UPDATE joins against it. This is the FORALL equivalent in PostgreSQL — and it's just plain SQL.
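A related shortcut, for when every targeted row gets the same change rather than a per-row value: pass the id array straight into the WHERE clause with = ANY, no unnest join needed. A sketch with a hypothetical id list:

```sql
DO $$
DECLARE
    target_ids INT[] := ARRAY[2, 4];  -- hypothetical id list
BEGIN
    -- Same flat adjustment for every id in the array
    UPDATE employees
    SET salary = salary + 500
    WHERE employee_id = ANY (target_ids);
END
$$;
```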
Inserting many rows? One INSERT with multiple VALUES tuples beats a loop of single inserts by a wide margin — fewer round trips, fewer parse/plan cycles, single transaction commit:
```sql
DO $$
BEGIN
    -- One INSERT, six rows
    INSERT INTO employees (salary) VALUES
        (45000), (52000), (61000), (47000), (58000), (63000);
END
$$;

SELECT count(*) FROM employees;
```
For really large bulk loads — say, thousands of rows — even a multi-row INSERT can be beaten by COPY, PostgreSQL's bulk load command. But for hundreds of rows, multi-row VALUES is plenty.
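For reference, a COPY sketch, assuming a hypothetical /tmp/salaries.csv with one salary per line. Server-side COPY reads from the database server's filesystem and requires the appropriate privileges; from psql, the client-side \copy variant takes the same options:

```sql
-- Hypothetical path on the database server; needs file-read privileges
COPY employees (salary)
FROM '/tmp/salaries.csv'
WITH (FORMAT csv);
```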
If a result set is too large to fit in memory, walk it with a cursor, processing rows in batches. Note that PL/pgSQL's FETCH ... INTO retrieves one row at a time (the multi-row FETCH FORWARD n form exists only for SQL-level cursors), so batching here means counting rows as you fetch:
```sql
DO $$
DECLARE
    big_cursor CURSOR FOR
        SELECT employee_id, salary FROM employees ORDER BY employee_id;
    rec       RECORD;
    batch_cnt INTEGER := 0;
BEGIN
    OPEN big_cursor;
    LOOP
        FETCH big_cursor INTO rec;
        EXIT WHEN NOT FOUND;
        batch_cnt := batch_cnt + 1;

        -- Process rec.employee_id, rec.salary
        -- Every 3 rows, do something batched (commit-style work, network call, etc.)
        IF batch_cnt % 3 = 0 THEN
            RAISE NOTICE 'Processed batch ending at id %', rec.employee_id;
        END IF;
    END LOOP;
    CLOSE big_cursor;

    RAISE NOTICE 'Total rows: %', batch_cnt;
END
$$;
```
Cursor-based batching is the right pattern when memory is tight or when each batch needs its own bookkeeping (logging, progress reporting, periodic commits in a long-running migration).
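For the periodic-commit case specifically, a procedure (PostgreSQL 11+) may COMMIT between batches, which keeps each transaction small; functions cannot. A sketch of a keyset-batched raise, assuming employee_id is the batching key and a hypothetical batch size:

```sql
CREATE OR REPLACE PROCEDURE raise_salaries_in_batches(batch_size INT DEFAULT 1000)
LANGUAGE plpgsql AS $$
DECLARE
    last_id INT := 0;
    max_id  INT;
BEGIN
    SELECT max(employee_id) INTO max_id FROM employees;

    WHILE last_id < coalesce(max_id, 0) LOOP
        -- Update one id range, then COMMIT to release locks
        -- and bound the transaction size
        UPDATE employees
        SET salary = salary * 1.01
        WHERE employee_id > last_id
          AND employee_id <= last_id + batch_size;

        last_id := last_id + batch_size;
        COMMIT;  -- legal in procedures called via CALL, not in functions
    END LOOP;
END;
$$;

CALL raise_salaries_in_batches(1000);
```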
| Want to… | Best pattern |
|---|---|
| Update many rows the same way | Single UPDATE statement |
| Update many rows, each with a different value | UPDATE ... FROM unnest(arr1, arr2) |
| Insert many rows of constant data | One INSERT with multiple VALUES |
| Insert from a query | INSERT INTO target SELECT ... |
| Load from a file (very large) | COPY |
| Aggregate over rows | SELECT SUM/COUNT/AVG ... INTO var FROM ... |
| Per-row processing that needs application logic | FOR rec IN SELECT ... LOOP |
| Process a result set too large to fit in memory | Cursor + batched FETCH |
- Set-based SQL beats array-based loops in nearly every case. A single UPDATE/INSERT/DELETE that the planner sees is faster than a Python-style "fetch into list, modify, write back."
- Reach for arrays only when application logic genuinely demands it — string formatting, calling stored procedures, complex per-row decisions involving previous-row state.
- Use `UPDATE ... FROM unnest(...)` for the case where each row needs a different update value; this is PostgreSQL's `FORALL`.
- Bulk INSERT with multi-row VALUES is dramatically faster than looped single inserts. For really large loads, use `COPY`.
- Don't translate Oracle `BULK COLLECT` patterns mechanically. Re-think the problem in set-based terms first.
- Use cursors for memory-bounded processing when result sets are larger than RAM. Otherwise, plain SQL or `FOR rec IN SELECT` is simpler.
- Trust the planner. PostgreSQL is very good at executing single statements over large datasets efficiently; manual batching at the PL/pgSQL level rarely outperforms it.
- PostgreSQL doesn't have Oracle's `BULK COLLECT`/`FORALL` statements. It doesn't need them: set-based SQL is the bulk-processing primitive.
- For most "bulk operations," a single `UPDATE`, `INSERT`, or `DELETE` statement is the right answer. The planner handles the rest.
- For per-row distinct values, use `UPDATE ... FROM unnest(arr1, arr2) AS t(...)`, the equivalent of `FORALL`.
- For collecting data into PL/pgSQL memory, use `SELECT array_agg(col) INTO arr FROM ...`, the equivalent of `BULK COLLECT`.
- Reach for arrays and loops only when application logic genuinely can't be expressed in SQL.
- For result sets too large to fit in memory, walk them with a cursor in batches.