PostgreSQL stored procedures written in PL/pgSQL offer powerful database-side processing capabilities. However, when not properly optimized, they can become performance bottlenecks. In this post, I'll share some tips for identifying and fixing common slowdowns in your PL/pgSQL code.
Common PL/pgSQL Performance Issues
When working with PostgreSQL stored procedures, several patterns can lead to suboptimal performance:
1. Missing Indexes
Perhaps the most common issue isn't in the PL/pgSQL code itself but in the tables it operates on. A query that would benefit from an index but lacks one can dramatically slow down your procedures.
-- Before: Sequential scan on large table
EXPLAIN ANALYZE SELECT * FROM large_table WHERE rarely_indexed_column = 'some_value';
-- After: Add an index
CREATE INDEX idx_rarely_indexed_column ON large_table(rarely_indexed_column);
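When the procedure only ever queries a small, predictable slice of the table, a partial index keeps the index smaller and cheaper to maintain. A sketch reusing the table above, assuming a status column like the one used in the later examples:

```sql
-- Index only the rows the procedure actually filters on
CREATE INDEX idx_large_table_pending
    ON large_table (rarely_indexed_column)
    WHERE status = 'pending';  -- assumed status column
```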
2. Row-by-Row Processing (LOOP vs. Set-Based Operations)
A frequent anti-pattern is processing data row-by-row when PostgreSQL could handle it as a single set-based operation.
-- Inefficient: Row-by-row processing
FOR rec IN SELECT id FROM large_table WHERE status = 'pending' LOOP
UPDATE processed_items SET status = 'completed' WHERE id = rec.id;
END LOOP;
-- Efficient: Set-based operation
UPDATE processed_items
SET status = 'completed'
WHERE id IN (SELECT id FROM large_table WHERE status = 'pending');
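When the filtering table is large, an UPDATE ... FROM join is a common alternative to the IN subquery; the planner can often produce a better plan for the join form. A sketch using the same tables as above:

```sql
-- Equivalent set-based update, written as a join instead of IN
UPDATE processed_items p
SET status = 'completed'
FROM large_table t
WHERE t.status = 'pending'
  AND p.id = t.id;
```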
3. Excessive Function Calls
Calling functions repeatedly within loops can significantly degrade performance:
-- Inefficient: Function called for each row
FOR rec IN SELECT id FROM items LOOP
IF check_condition(rec.id) THEN -- Function call for every row
-- Do something
END IF;
END LOOP;
-- More efficient: express the condition as a set-based EXISTS
-- instead of calling a function once per row
FOR rec IN
    SELECT i.id
    FROM items i
    WHERE EXISTS (SELECT 1 FROM other_table o WHERE o.item_id = i.id)
LOOP
    -- Do something
END LOOP;
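If a helper function must remain in the query, declaring it with the correct volatility (STABLE, or IMMUTABLE where appropriate) tells the planner its result cannot change within a statement, which enables caching and, for simple SQL functions, inlining. A hypothetical sketch of such a helper:

```sql
-- Hypothetical helper, written as a plain SQL function so the
-- planner can inline it; STABLE = result is fixed within one statement
CREATE OR REPLACE FUNCTION check_condition(p_item_id bigint)
RETURNS boolean
LANGUAGE sql STABLE AS $$
    SELECT EXISTS (SELECT 1 FROM other_table o WHERE o.item_id = p_item_id);
$$;
```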
4. Inefficient Transaction Management
Creating many small transactions instead of batching changes can hurt performance. Note that every statement inside a single PL/pgSQL function already runs in one transaction, so the per-transaction overhead shown below mainly bites client-side scripts running under autocommit; the per-statement overhead of the loop applies everywhere:
-- Inefficient: one statement per row (and, under autocommit, one
-- transaction per row when issued from a client script)
FOR rec IN SELECT id FROM items_to_process LOOP
UPDATE items SET processed = true WHERE id = rec.id;
END LOOP;
-- More efficient: one set-based statement in a single transaction
BEGIN;
UPDATE items SET processed = true WHERE id IN (SELECT id FROM items_to_process);
COMMIT;
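For very large updates, the opposite adjustment sometimes helps: committing in chunks keeps locks and WAL pressure bounded. Since PostgreSQL 11, procedures (invoked with CALL) may COMMIT inside their body. A sketch, assuming a boolean processed flag on items:

```sql
-- Sketch: batch commits inside a procedure (PostgreSQL 11+)
CREATE OR REPLACE PROCEDURE process_in_batches(batch_size int DEFAULT 10000)
LANGUAGE plpgsql AS $$
DECLARE
    rows_done int;
BEGIN
    LOOP
        UPDATE items SET processed = true
        WHERE id IN (
            SELECT id FROM items
            WHERE NOT processed     -- assumed flag column
            LIMIT batch_size
        );
        GET DIAGNOSTICS rows_done = ROW_COUNT;
        EXIT WHEN rows_done = 0;
        COMMIT;  -- release locks and flush WAL between batches
    END LOOP;
END;
$$;
```

Note that COMMIT is only allowed in procedures called at the top level, not in functions.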
Measuring Performance
To identify bottlenecks, use PostgreSQL's built-in analysis tools:
- EXPLAIN ANALYZE: shows the execution plan and the actual runtime for a query
EXPLAIN ANALYZE SELECT * FROM large_table WHERE complex_condition;
- pg_stat_statements: an extension that tracks execution statistics for all SQL statements (it must be listed in shared_preload_libraries before the extension can be created)
CREATE EXTENSION pg_stat_statements;
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
-- On PostgreSQL 12 and earlier these columns are named total_time and mean_time.
- auto_explain: logs execution plans of slow queries automatically
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 250; -- log queries taking longer than 250 ms
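For quick ad-hoc timing of a single procedure call, clock_timestamp() is handy because, unlike now(), it keeps advancing within a transaction. A sketch where my_function() is a placeholder for the routine under test:

```sql
DO $$
DECLARE
    t0 timestamptz := clock_timestamp();
BEGIN
    PERFORM my_function();  -- placeholder: the function being measured
    RAISE NOTICE 'elapsed: %', clock_timestamp() - t0;
END;
$$;
```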
Conclusion
Optimizing PL/pgSQL procedures often comes down to using set-based operations over row-by-row processing, ensuring proper indexing, minimizing function calls in loops, and managing transactions efficiently.
By applying these principles and measuring performance with PostgreSQL's analytical tools, you can significantly speed up your database procedures and improve overall application performance.
Have you encountered other PL/pgSQL performance issues? Share your experiences in the comments below!