Know Before You Burn

PostgreSQL Gotchas & Traps

The most common PostgreSQL mistakes: NULL surprises, timezone traps, silent data loss, and performance landmines, with correct and incorrect examples side by side.

In SQL, NULL represents an unknown value. Any comparison with NULL returns NULL (treated as false in WHERE). This means '= NULL' never matches any row — not even rows where the value IS actually NULL. The correct operator is IS NULL.

Wrong
SELECT * FROM users WHERE deleted_at = NULL;

Result: 0 rows (always — even when deleted_at IS NULL)

Correct
SELECT * FROM users WHERE deleted_at IS NULL;

Result: Rows where deleted_at is NULL

Use IS NULL / IS NOT NULL for null checks. For equality comparisons that should treat NULL = NULL as true, use IS NOT DISTINCT FROM instead of =, and IS DISTINCT FROM instead of !=.
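
For example, IS NOT DISTINCT FROM treats two NULLs as equal where = cannot:

SELECT NULL = NULL;                    -- NULL (unknown, not true)
SELECT NULL IS NOT DISTINCT FROM NULL; -- true
SELECT 1 IS DISTINCT FROM NULL;        -- true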

When both operands of / are integer types, PostgreSQL performs integer division. The result is truncated toward zero with no warning. This silently loses precision in calculations like percentages, averages, and rates.

Wrong
SELECT 5 / 2 AS result;

Result: 2

Correct
SELECT 5.0 / 2 AS result;
-- or
SELECT 5 / 2.0;
-- or
SELECT 5::numeric / 2;

Result: 2.5

Ensure at least one operand is numeric/float. Common fix: cast the numerator: `count::numeric / total`. In aggregate contexts: `avg(col)` already returns numeric, but `sum(int_col) / count(*)` uses integer division — use `sum(col)::numeric / count(*)`.

LIKE is case-sensitive; ILIKE performs a locale-aware, case-insensitive comparison. ILIKE cannot use a plain B-tree index: it needs a functional index on lower() or the pg_trgm extension. (Even case-sensitive LIKE 'prefix%' only uses a plain B-tree index under the C collation or with text_pattern_ops.)

Wrong
-- Case-insensitive check that can't use an index:
SELECT * FROM products WHERE name ILIKE 'apple%';

Result: Correct results but full table scan on large tables

Correct
-- Create a functional index (text_pattern_ops lets LIKE prefix patterns
-- use it under non-C collations):
CREATE INDEX idx_products_name_lower ON products (lower(name) text_pattern_ops);

-- Then query:
SELECT * FROM products WHERE lower(name) LIKE lower('apple%');

Result: Same results, uses the index

For prefix matching, lower(col) LIKE lower(prefix)||'%' uses the functional index. For substring/suffix matching, use pg_trgm with a GIN index: `CREATE INDEX ON products USING gin(name gin_trgm_ops)` — then ILIKE and LIKE '%substring%' both use it.
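
A minimal trigram setup, assuming the products table above (the index name is arbitrary):

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_products_name_trgm ON products USING gin (name gin_trgm_ops);

-- Both of these can now use the index:
SELECT * FROM products WHERE name ILIKE '%apple%';
SELECT * FROM products WHERE name LIKE '%apple%';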

now(), current_timestamp, and current_date are all frozen at the moment the current transaction started. In a multi-second or multi-minute transaction, they never advance. So measuring elapsed time with two now() values captured inside the same transaction always yields zero.

Wrong
-- Inside a long transaction — this doesn't measure actual elapsed time:
SELECT now() - start_time AS elapsed FROM long_running_task;

Result: Incorrect elapsed time if called within same transaction as start_time assignment

Correct
-- Use clock_timestamp() for real wall-clock time:
SELECT clock_timestamp() - start_time AS elapsed FROM long_running_task;

-- Or timeofday() which returns a text timestamp:
SELECT timeofday();

Result: Actual wall-clock elapsed time

now() = current_timestamp = transaction start. clock_timestamp() = actual system clock, changes with each call. statement_timestamp() = start of the current SQL statement. For audit timestamps that should reflect when the row was actually written rather than when the transaction began, use clock_timestamp() in triggers.
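
A sketch of such a trigger; the audit_log table and written_at column are illustrative:

CREATE OR REPLACE FUNCTION set_written_at() RETURNS trigger AS $$
BEGIN
  -- clock_timestamp(), not now(): capture when the row is actually written
  NEW.written_at := clock_timestamp();
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_set_written_at
  BEFORE INSERT OR UPDATE ON audit_log
  FOR EACH ROW EXECUTE FUNCTION set_written_at();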

TIMESTAMP WITHOUT TIME ZONE stores exactly what you give it — no conversion, no timezone awareness. If your app inserts '2024-03-15 14:00:00' assuming UTC but the DB session is in America/New_York, TIMESTAMPTZ would store the UTC equivalent, but TIMESTAMP stores the literal value. When DST changes, TIMESTAMP values become ambiguous.

Wrong
-- Using TIMESTAMP (no timezone) for user-facing times:
CREATE TABLE events (starts_at TIMESTAMP);

Result: Ambiguous timestamps across DST boundaries and timezone changes

Correct
-- Use TIMESTAMPTZ for all user-facing datetime columns:
CREATE TABLE events (starts_at TIMESTAMPTZ);

-- Set session timezone explicitly:
SET timezone = 'UTC';

-- Or per role (ALTER ROLE ... SET timezone = 'UTC') or in the connection string options

Result: Unambiguous UTC storage, correct local display per user timezone

Rule of thumb: use TIMESTAMPTZ everywhere. Convert to local time only for display using `timezone('America/New_York', col)` or AT TIME ZONE. Store in UTC, display locally. The only valid use of TIMESTAMP (no tz) is for purely local concepts like 'business hours' where timezone context is always implicit.
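
For example, rendering a stored-in-UTC TIMESTAMPTZ in a user's zone:

-- AT TIME ZONE on a timestamptz returns the local wall-clock time (a timestamp):
SELECT starts_at AT TIME ZONE 'America/New_York' AS local_time FROM events;

-- Equivalent function form:
SELECT timezone('America/New_York', starts_at) FROM events;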

date_trunc never returns a DATE: date_trunc('month', timestamptz) returns a timestamptz at midnight on the first of the month. Comparing or grouping with a DATE column therefore requires an explicit cast.

Wrong
-- Trying to group by date alongside date_trunc result:
SELECT date_trunc('day', created_at) AS day, count(*)
FROM events
GROUP BY date_trunc('day', created_at), event_date;  -- event_date is DATE type

Result: Groups by every distinct (timestamptz, date) combination; no error is raised, and a calendar day can silently split into multiple rows

Correct
-- Cast to date if you need a DATE:
SELECT date_trunc('day', created_at)::date AS day, count(*)
FROM events
GROUP BY 1;

-- Or use created_at::date directly:
SELECT created_at::date AS day, count(*) FROM events GROUP BY 1;

Result: Correct grouping by calendar day as DATE type

date_trunc('day', ts)::date is the idiomatic way to convert a timestamp to a date. For date-only grouping in reporting queries, `created_at::date` is shorter and clearer than `date_trunc('day', created_at)::date`.

Aggregate functions process rows in an undefined order unless ORDER BY is specified. string_agg without ORDER BY may return different orderings on different runs, different PostgreSQL versions, or after VACUUM. This is also true for array_agg.

Wrong
SELECT string_agg(name, ', ') FROM users GROUP BY dept;

Result: "Bob, Alice, Charlie" — order unpredictable and may change

Correct
SELECT string_agg(name, ', ' ORDER BY name) FROM users GROUP BY dept;

Result: "Alice, Bob, Charlie" — stable alphabetical order

Always specify ORDER BY inside ordered aggregates if the output order matters. This applies to string_agg, array_agg, json_agg, jsonb_agg, and xmlagg. The ORDER BY clause goes inside the function call, not the outer query ORDER BY.
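
The same pattern for the other ordered aggregates (hired_at is an illustrative column):

SELECT dept,
       array_agg(name ORDER BY name)    AS names,
       json_agg(name ORDER BY hired_at) AS by_seniority
FROM users
GROUP BY dept;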

This is the most misunderstood COUNT behavior. COUNT(*) counts every row matching the WHERE clause. COUNT(col) counts only rows where col is NOT NULL. The difference is invisible until you have NULL values, and it can make a query look correct while silently undercounting.

Wrong
-- Trying to count total records but using a nullable column:
SELECT COUNT(email) AS total_users FROM users;

Result: May return fewer than actual row count if any emails are NULL

Correct
-- Count all rows:
SELECT COUNT(*) AS total_users FROM users;

-- Count rows where email is set (intentional NULL-skipping):
SELECT COUNT(email) AS users_with_email FROM users;

Result: Correct total row count

Use COUNT(*) for 'how many rows'. Use COUNT(col) intentionally when you specifically want to count non-null values. Use COUNT(DISTINCT col) to count unique non-null values. Never use COUNT(1) — it's identical to COUNT(*) and confuses readers.

BETWEEN is inclusive on both ends. For continuous types like timestamps, the end of one range and the start of the next overlap at the boundary. It is especially dangerous when a timestamp column is compared against date literals, because the end date is interpreted as midnight.

Wrong
-- 'March' range against a timestamp column; the end date becomes midnight:
SELECT * FROM events
WHERE created_at BETWEEN '2024-03-01' AND '2024-03-31';

Result: Misses events on March 31 after midnight; '2024-03-31' = '2024-03-31 00:00:00'

Correct
-- Use half-open interval for timestamp ranges:
SELECT * FROM events
WHERE created_at >= '2024-03-01'
  AND created_at < '2024-04-01';

Result: All events in March, including March 31 at any time

The half-open interval pattern [start, end) — >= start AND < end — is the correct approach for all timestamp range queries. It prevents double-counting at boundaries and works correctly regardless of the timestamp precision. Only use BETWEEN for true inclusive endpoints like integer IDs.
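
The pattern also composes with date_trunc, e.g. 'everything this month' without hardcoded dates:

SELECT * FROM events
WHERE created_at >= date_trunc('month', now())
  AND created_at <  date_trunc('month', now()) + interval '1 month';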

PostgreSQL's documentation says COALESCE, like CASE, evaluates only the arguments it needs. But this is a runtime guarantee with caveats: constant subexpressions may still be evaluated at planning time, and the SQL standard does not require lazy evaluation at all. COALESCE(fast_col, expensive_function()) is therefore risky when the fallback is expensive or has side effects.

Wrong
-- This may still call the expensive fallback:
SELECT COALESCE(cached_value, compute_expensive_default(row_id)) FROM data;

Result: Potentially calls compute_expensive_default() even when cached_value is not NULL

Correct
-- Use CASE to make the evaluation order explicit:
SELECT CASE WHEN cached_value IS NOT NULL
            THEN cached_value
            ELSE compute_expensive_default(row_id)
       END FROM data;

Result: compute_expensive_default() only called when cached_value IS NULL

In practice, PostgreSQL evaluates COALESCE arguments lazily at runtime, but constant subexpressions can run at plan time and other databases are free to differ. For volatile or expensive fallbacks, CASE WHEN col IS NOT NULL THEN col ELSE fallback END makes the intent and evaluation order explicit. This matters most with functions that have side effects.

Unlike virtually every programming language, SQL arrays in PostgreSQL are 1-indexed. Accessing index 0 returns NULL without an error. This is one of the most common bugs when processing arrays in PL/pgSQL functions written by developers who code in 0-indexed languages.

Wrong
SELECT (ARRAY['a','b','c'])[0];

Result: NULL — no error, just silent wrong result

Correct
SELECT (ARRAY['a','b','c'])[1];

Result: a

Remember: PostgreSQL array subscripts start at 1. array_length(arr, 1) gives the count. The last element is arr[array_length(arr, 1)]. You can technically create arrays with non-default lower bounds using the literal syntax '[0:2]={a,b,c}', but this is rarely used and confusing. Just remember: start at 1.
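
For completeness, a non-default lower bound and the functions that reveal it:

SELECT ('[0:2]={a,b,c}'::text[])[0];  -- 'a' (lower bound is 0 for this value)
SELECT array_lower(ARRAY[1,2,3], 1);  -- 1
SELECT array_upper(ARRAY[1,2,3], 1);  -- 3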

NOT IN expands to a chain of <> comparisons joined by AND. Since x <> NULL evaluates to NULL (not true), a single NULL in the list makes the whole conjunction NULL at best; it can never be true. This makes NOT IN with nullable subqueries extremely dangerous.

Wrong
-- manager_id column has some NULLs:
SELECT name FROM employees
WHERE id NOT IN (SELECT manager_id FROM employees);

Result: 0 rows — because some manager_id values are NULL

Correct
-- Use NOT EXISTS instead:
SELECT name FROM employees e1
WHERE NOT EXISTS (
  SELECT 1 FROM employees e2
  WHERE e2.manager_id = e1.id
);

-- Or filter NULLs from the subquery:
WHERE id NOT IN (
  SELECT manager_id FROM employees WHERE manager_id IS NOT NULL
);

Result: Correct set of non-managers

NOT EXISTS is the safest and most performant alternative to NOT IN for subqueries. It correctly handles NULLs and often uses a more efficient anti-join execution plan. Only use NOT IN when the subquery column is declared NOT NULL.
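
A LEFT JOIN anti-join is a third NULL-safe alternative with similar performance:

SELECT e1.name
FROM employees e1
LEFT JOIN employees e2 ON e2.manager_id = e1.id
WHERE e2.id IS NULL;  -- keep only employees nobody references as manager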

Developers from Python, JavaScript, or Java expect substring search functions to return -1 or NULL when not found. PostgreSQL's strpos() (and position()) returns 0. The check > -1 always evaluates to true since 0 > -1.

Wrong
-- This matches ALL rows, not just rows containing 'foo':
SELECT * FROM logs WHERE strpos(message, 'foo') > -1;

Result: All rows returned because strpos returns 0 (not -1), and 0 > -1 is true

Correct
-- Correct: check for > 0
SELECT * FROM logs WHERE strpos(message, 'foo') > 0;

-- Or use LIKE/ILIKE:
SELECT * FROM logs WHERE message LIKE '%foo%';

Result: Only rows where message contains 'foo'

strpos returns 0 when the substring is not found and the 1-based position when it is. Always check > 0 for 'found' and = 0 for 'not found'. The same semantics apply to position('x' IN col).

PostgreSQL resolves type mismatches between a column and a literal or parameter by inserting casts. When the cast lands on the indexed column, the index can no longer be used. A classic case: a BIGINT column compared against a NUMERIC parameter (many drivers bind decimals this way) casts the column to numeric, so the B-tree index on the bigint column is skipped.

Wrong
-- col is BIGINT but the driver binds the parameter as NUMERIC:
CREATE INDEX idx_col ON t(col);
SELECT * FROM t WHERE col = 123.0;  -- 123.0 is a numeric literal

Result: Sequential scan; col is cast to numeric, so the B-tree index is skipped

Correct
-- Pass the correct type:
SELECT * FROM t WHERE col = 123;

-- For parameterized queries, cast the parameter rather than the column:
-- WHERE col = $1::bigint, or bind the matching type in the application layer

Result: Index scan used

This is particularly common with ORMs that infer parameter types. Check query plans with EXPLAIN ANALYZE when a query seems slower than expected. Type-match your parameters to the column type. Use `EXPLAIN (ANALYZE, BUFFERS)` to confirm index usage.
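
A quick check with the table from above (the expected plans are illustrative; verify against your own data):

EXPLAIN SELECT * FROM t WHERE col = 123;    -- expect an Index Scan using idx_col
EXPLAIN SELECT * FROM t WHERE col = 123.0;  -- expect a Seq Scan: col is cast to numeric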

UPDATE target SET col = source.val FROM source WHERE target.id = source.target_id — if source has duplicate target_ids, PostgreSQL will update target using one of the source rows but doesn't specify which one. No error is raised, and the result is non-deterministic.

Wrong
-- source may have multiple rows per target.id:
UPDATE orders SET status = s.status
FROM order_updates s
WHERE orders.id = s.order_id;

Result: Non-deterministic — arbitrary source row used when duplicates exist

Correct
-- Use a subquery to explicitly pick one row:
UPDATE orders SET status = s.status
FROM (
  SELECT DISTINCT ON (order_id) order_id, status
  FROM order_updates
  ORDER BY order_id, updated_at DESC
) s
WHERE orders.id = s.order_id;

Result: Deterministic — uses latest update per order

Always ensure the FROM clause in UPDATE produces at most one row per target row. Use DISTINCT ON, a subquery with GROUP BY, or additional join conditions to enforce uniqueness. PostgreSQL raises no error or warning for this, so the only protection is enforcing uniqueness yourself, e.g. with a unique index on the source's join key.

This differs from many regex flavors where replace-all is the default. PostgreSQL's regexp_replace mirrors POSIX sed where a single replacement is the default and /g enables global replacement.

Wrong
-- Trying to strip all digits:
SELECT regexp_replace('a1b2c3', '[0-9]', '');

Result: ab2c3 — only first digit removed

Correct
SELECT regexp_replace('a1b2c3', '[0-9]', '', 'g');

Result: abc — all digits removed

The 'g' flag enables global replacement. Combine with 'i' for case-insensitive global replacement: regexp_replace(col, 'pattern', 'replacement', 'gi'). Remember: flags go in the fourth argument, not embedded in the pattern.

Regular VACUUM runs concurrently without blocking reads or DML. VACUUM FULL is different: it creates a new copy of the table file, blocks all other access, and swaps the files at the end. On a large table, this can block production traffic for minutes or hours.

Wrong
-- Reclaiming space on a busy production table:
VACUUM FULL orders;

Result: All reads and writes to 'orders' blocked for the duration — production outage

Correct
-- Use the pg_repack extension to rebuild the table online (brief locks only):
-- $ pg_repack -t orders mydb

-- Or use CLUSTER (still locks, but reorders by index):
-- Schedule during low-traffic window

-- Monitor bloat first:
SELECT n_dead_tup, n_live_tup,
  ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup,0),1) AS dead_pct
FROM pg_stat_user_tables WHERE relname = 'orders';

Result: Space reclaimed without blocking production traffic

Use regular VACUUM for routine cleanup — it runs concurrently. Only use VACUUM FULL when you need physical space reclaimed AND can afford a maintenance window. For production reclaim without locking, use the pg_repack or pgcompacttable extensions. Monitor bloat via pg_stat_user_tables before deciding.

Sequences operate outside transaction control by design: if nextval() could roll back, concurrent transactions would have to wait for each other before taking the next value, serializing all inserts. The consequence: after a failed INSERT, the sequence is still incremented. SERIAL/IDENTITY columns will show gaps after rollbacks and errors, and DELETE leaves holes in the surviving ID range as well.

Wrong
-- Assuming IDs will be gapless for auditing or ordering:
SELECT * FROM orders WHERE id BETWEEN 1001 AND 1010;
-- Expecting exactly 10 rows

Result: May return fewer rows — IDs 1003, 1007 etc. may not exist due to rolled-back transactions

Correct
-- If gapless numbers are truly required, use a counter table: the UPDATE
-- rolls back with its transaction (at the cost of serializing writers):
CREATE TABLE invoice_counter (value BIGINT NOT NULL);
INSERT INTO invoice_counter VALUES (0);
-- In the same transaction as the invoice INSERT:
UPDATE invoice_counter SET value = value + 1 RETURNING value;

-- Or accept gaps and order by created_at, not ID:
SELECT * FROM orders ORDER BY created_at;

Result: Correct expectations about ID sequences

IDs from SERIAL/BIGSERIAL/IDENTITY columns are guaranteed to be unique and increasing, but NOT gapless. Never rely on sequential IDs for business logic (invoice numbers, check numbers). Use a separate gapless counter mechanism, or better: use the timestamp for ordering and UUIDs for identity.

UNLOGGED tables do not write to the Write-Ahead Log (WAL). This makes writes 5–10x faster. However, since PostgreSQL uses WAL for crash recovery, unlogged tables cannot be recovered after a crash. They are also not replicated to standbys. After any non-clean shutdown, the table is automatically truncated to empty.

Wrong
-- Using UNLOGGED for a sessions table in production:
CREATE UNLOGGED TABLE user_sessions (
  token TEXT PRIMARY KEY,
  user_id BIGINT,
  expires_at TIMESTAMPTZ
);

Result: All sessions wiped after any server crash — every user logged out with no warning

Correct
-- Use UNLOGGED only for truly ephemeral data: caches, import staging, temp ETL:
CREATE UNLOGGED TABLE import_staging (
  raw_data JSONB
);
-- Truncate manually after ETL; data loss on crash is acceptable here

-- For sessions, use a regular table or an external store like Redis

Result: Data preserved across restarts; crash recovery works

UNLOGGED is great for staging tables, caches, and import buffers where data loss is acceptable and speed matters. Never use it for data that must survive a server restart. Also note: UNLOGGED tables cannot be read on hot-standby replicas at all; queries against them during recovery fail with an error.

When you declare a FK constraint like `REFERENCES users(id)`, PostgreSQL validates the constraint using the primary key index on users. It does NOT create an index on the orders.user_id column. Without that index, any query joining orders to users by user_id does a sequential scan of orders. Worse: when you DELETE a user, PostgreSQL must scan ALL of orders to find referencing rows — holding a lock the entire time.

Wrong
CREATE TABLE orders (
  id      BIGSERIAL PRIMARY KEY,
  user_id BIGINT REFERENCES users(id)
  -- No index on user_id!
);

Result: Seq scan on orders for user lookups; slow cascading DELETE on users table

Correct
CREATE TABLE orders (
  id      BIGSERIAL PRIMARY KEY,
  user_id BIGINT NOT NULL REFERENCES users(id)
);
-- Manually create the index:
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- Find missing FK indexes (a heuristic: checks that the first FK column appears in some index):
SELECT conrelid::regclass AS table,
       a.attname AS column
FROM pg_constraint c
JOIN pg_attribute a ON a.attrelid = c.conrelid AND a.attnum = ANY(c.conkey)
WHERE c.contype = 'f'
  AND NOT EXISTS (
    SELECT 1 FROM pg_index i
    WHERE i.indrelid = c.conrelid
      AND c.conkey[1] = ANY(i.indkey)
  );

Result: Fast FK lookups and non-blocking DELETEs on parent table

Always create an index on the foreign key column (the referencing side). This is not automatic. Use the query in the 'correct' block to audit your schema for missing FK indexes. Tools like pganalyze and pgsync also flag this automatically.

statement_timeout counts a statement's total runtime, including time spent waiting for locks, but migrations often run with it disabled or set generously, so an ALTER TABLE waiting for its AccessExclusiveLock can camp in the lock queue for a long time. While it waits behind a long-running transaction, every subsequent query on that table queues behind the ALTER. lock_timeout caps the lock wait specifically, so DDL fails fast instead of stalling all traffic.

Wrong
-- No lock_timeout, and statement_timeout disabled (common in migration tooling):
SET statement_timeout = 0;
ALTER TABLE orders ADD COLUMN notes TEXT;

Result: Migration may wait indefinitely for the lock, queueing all later queries on the table behind it

Correct
-- Set both:
SET lock_timeout = '5s';       -- fail fast if we can't get the lock
SET statement_timeout = '60s'; -- fail if the DDL itself runs too long
ALTER TABLE orders ADD COLUMN notes TEXT;

-- For zero-downtime migrations, prefer non-blocking operations:
-- (since PostgreSQL 11, ADD COLUMN with a constant DEFAULT is metadata-only)
-- For bigger changes, use pg_repack, logical replication, and cut-over strategies

Result: Migration fails fast instead of queue-blocking production traffic

Always set lock_timeout for DDL operations in production. A value of 1s–5s forces the migration to either acquire the lock quickly or abort — preventing a lock queue from forming. Use BEGIN; SET LOCAL lock_timeout = '2s'; ... COMMIT; to scope it to the transaction. Combine with retries in your migration tooling.
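
Putting the tip together, a minimal transaction-scoped migration sketch (retries belong in the tooling):

BEGIN;
SET LOCAL lock_timeout = '2s';
SET LOCAL statement_timeout = '60s';
ALTER TABLE orders ADD COLUMN notes TEXT;
COMMIT;
-- If lock_timeout expires, the transaction aborts cleanly; retry after a backoff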

PostgreSQL plans a prepared statement with the actual parameter values for its first five executions, then may switch to a cached 'generic' plan that ignores them. If the data is highly skewed (e.g., 99% of rows have status='active', 1% have status='archived'), a plan tuned for 'active' (seq scan) is terrible for 'archived' (which needs an index scan). Once the generic plan is cached and reused, one set of parameters runs well while another runs orders of magnitude slower.

Wrong
-- After enough executions the ORM's prepared statement gets a generic plan:
PREPARE get_orders AS
  SELECT * FROM orders WHERE status = $1;
EXECUTE get_orders('active');   -- first five executions are planned per value
EXECUTE get_orders('archived'); -- later executions may reuse one generic plan

Result: seq scan used for 'archived' even though an index would be 100x faster

Correct
-- Option 1: Use dynamic SQL in PL/pgSQL so each execution is planned fresh:
EXECUTE format('SELECT * FROM orders WHERE status = %L', status_val);

-- Option 2: Force per-execution planning:
SET plan_cache_mode = force_custom_plan;

-- Option 3: Disable driver-level prepared statements
-- (e.g. prepareThreshold=0 in the pgJDBC driver)

Result: Correct plan chosen per execution based on actual parameter value

After five executions, PostgreSQL considers switching a prepared statement to a generic plan that doesn't depend on parameter values, and does so when the generic plan's estimated cost looks competitive. This is often fine but can be catastrophic for skewed distributions. Set plan_cache_mode = force_custom_plan to always re-plan, at the cost of planning overhead. Monitor slow queries by parameter value, not just query text.

pg_dump uses REPEATABLE READ isolation, acquiring a transaction snapshot at the start. This is safe for consistency, but while the backup runs, autovacuum cannot remove dead tuples created after the snapshot began. On a busy database, a 2-hour pg_dump can cause significant table bloat that takes hours of additional vacuuming to clean up.

Wrong
-- Running pg_dump during peak hours on a write-heavy database:
pg_dump -Fc mydb > mydb.dump

Result: Table bloat accumulates during dump; post-backup autovacuum spike; query slowdowns

Correct
-- Schedule pg_dump during off-peak hours
-- Or use streaming replication + pg_basebackup from a replica:
pg_basebackup -h replica_host -D /backup/dir -Ft -z

-- Monitor long-running transactions that block vacuum:
SELECT pid, age(backend_xmin), state, query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;

Result: Backup runs without impacting primary vacuum cycles

For production databases, take backups from a streaming replica using pg_basebackup or pg_dump against the replica. This keeps the long-lived snapshot off the primary (mind hot_standby_feedback: enabling it pushes the vacuum-blocking back to the primary, while disabling it risks the dump being cancelled by replay conflicts). The replica lags by replication delay, which is acceptable for backup purposes. Also monitor pg_stat_activity for long-running transactions and old snapshots that block autovacuum.

Unlike thread-based databases, PostgreSQL uses one OS process per connection. Each process carries several MB of overhead even when idle, and while running may allocate up to work_mem per sort or hash operation. Large connection counts also slow down lock and snapshot management. Applications that open many connections (e.g. serverless functions, microservices) quickly exhaust max_connections (default 100), causing 'too many connections' errors.

Wrong
-- Each serverless function invocation opens its own connection:
-- 1000 concurrent Lambda invocations = 1000 PG connections
const client = new Client(); await client.connect();
// ... query ...
await client.end(); // too slow; connections pile up

Result: FATAL: sorry, too many clients already — max_connections (100) exceeded

Correct
-- Use a connection pooler like PgBouncer in transaction mode:
-- App connects to PgBouncer (port 6432), not PG directly
-- PgBouncer maintains a small pool (e.g. 20) to PG

-- Point the application's connection string at PgBouncer:
DATABASE_URL=postgresql://user:pass@pgbouncer:6432/mydb

-- Tune max_connections conservatively; a few hundred is the ceiling for most setups:
SHOW max_connections;

Result: Stable connection count; no connection exhaustion; lower RAM usage

PgBouncer (transaction mode) is the standard solution. It multiplexes many application connections onto a small pool of real PG connections. Supabase, Railway, and Neon all use this pattern. Budget memory carefully: each sort or hash in each running query can use up to work_mem, so the worst case is several multiples of work_mem per active connection. Never set max_connections > 500 without careful tuning.
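
To see where connections actually go, group pg_stat_activity by state:

SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state
ORDER BY count(*) DESC;
-- Many 'idle' rows against a high max_connections is the classic sign you need a pooler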

The PostgreSQL query planner uses statistics from pg_statistic (populated by ANALYZE) to estimate row counts and choose join strategies. After a bulk INSERT or COPY into a previously empty table, the statistics are completely wrong — the planner thinks the table has ~0 rows. This leads to seq scans instead of index scans, nested loops instead of hash joins, and other catastrophically wrong plans.

Wrong
-- Load data and immediately query without ANALYZE:
COPY orders FROM '/tmp/orders.csv' CSV HEADER;
SELECT * FROM orders o JOIN users u ON u.id = o.user_id WHERE u.country = 'US';

Result: Planner uses old stats (0 rows in orders); picks wrong join strategy; query is 100x slower

Correct
-- Always ANALYZE after bulk loads:
COPY orders FROM '/tmp/orders.csv' CSV HEADER;
ANALYZE orders;  -- update statistics immediately

-- For very large tables, VACUUM ANALYZE:
VACUUM ANALYZE orders;

-- In autovacuum: lower the threshold for large tables:
ALTER TABLE orders SET (
  autovacuum_analyze_scale_factor = 0.01  -- analyze after 1% change instead of the 10% default
);

Result: Planner sees accurate row counts; optimal join strategy chosen

Auto-analyze runs automatically but fires only after roughly 10% of a table's rows change (the default autovacuum_analyze_scale_factor). For a 100M-row table, that means 10M inserts before auto-analyze. After any ETL or bulk load, always run ANALYZE manually. In CI/CD pipelines, include ANALYZE as a post-migration step. Check pg_stat_user_tables.last_analyze to verify.
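
Verifying that statistics are fresh after a load:

SELECT relname, last_analyze, last_autoanalyze, n_live_tup
FROM pg_stat_user_tables
WHERE relname = 'orders';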

When you ENABLE ROW LEVEL SECURITY on a table, it only applies to non-owner roles. The table owner and any superuser bypass all policies by default. If your application connects to PostgreSQL as the table owner (a common setup), RLS provides zero protection. You must explicitly force it to apply to the owner with FORCE ROW LEVEL SECURITY.

Wrong
-- Table owned by 'app_user'; application connects as 'app_user':
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON orders
  USING (tenant_id = current_setting('app.tenant_id')::int);
-- Connecting as 'app_user' bypasses this policy!

Result: All tenants can see all rows — RLS silently bypassed because app connects as owner

Correct
-- Force RLS to apply even to the table owner:
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
ALTER TABLE orders FORCE ROW LEVEL SECURITY;

-- Better: use a dedicated low-privilege role for app connections:
-- CREATE ROLE app_runtime NOLOGIN;
-- GRANT SELECT, INSERT, UPDATE ON orders TO app_runtime;
-- App connects as app_runtime (not the owner)
-- RLS applies automatically since app_runtime is not the owner

Result: RLS policies enforced for all app connections

Best practice: separate the table owner role from the application runtime role. The owner should only be used for DDL (migrations). The app connects as a least-privilege role. This ensures RLS applies without needing FORCE ROW LEVEL SECURITY. If you must use the owner for app connections, FORCE ROW LEVEL SECURITY is mandatory.
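
You can smoke-test a policy from a privileged session by switching to the runtime role (names follow the example above):

SET ROLE app_runtime;
SET app.tenant_id = '42';
SELECT count(*) FROM orders;  -- should count only tenant 42's rows
RESET ROLE;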

Found a gotcha that's not listed? Let us know — community contributions welcome.