⚙

PostgreSQL System Administration Functions

Complete reference for PostgreSQL system administration functions covering configuration reloading, backend process control, advisory locks, WAL management, and visibility map operations. Includes pg_cancel_backend(), pg_terminate_backend(), pg_reload_conf(), and pg_advisory_lock(). Updated for PostgreSQL 16.

42 functions

What are PostgreSQL System Administration Functions?

PostgreSQL system administration functions are used by DBAs to manage server configuration, terminate sessions, control checkpoints, and coordinate application-level locking. pg_cancel_backend() sends an interrupt to a running query, pg_terminate_backend() forcibly disconnects a session, and pg_reload_conf() reloads postgresql.conf without a full restart. Advisory locks (pg_advisory_lock and pg_advisory_xact_lock) provide application-level mutual exclusion.

brin_summarize_new_values

PG 9.5+→ integer

Scans a table and updates BRIN index summaries for any page ranges that have not yet been summarized. Returns the number of page ranges updated. Needed after bulk inserts that bypassed the autosummarize mechanism.

DBA

Signature

brin_summarize_new_values ( index regclass ) → integer

Parameters

Parameter	Type	Description
index	regclass	Name or OID of the BRIN index

Examples

5 new page ranges summarized

sql

SELECT brin_summarize_new_values('orders_created_at_brin_idx');

→5

All ranges already summarized

sql

SELECT brin_summarize_new_values('sensor_readings_ts_idx');

→0

Many new ranges after bulk load

sql

-- After bulk COPY insert:
SELECT brin_summarize_new_values('events_time_idx');

→120

Update all BRIN indexes

sql

SELECT indexrelid::regclass, brin_summarize_new_values(indexrelid) AS ranges_updated FROM pg_index WHERE indexrelid::regclass::text LIKE '%brin%';

→(rows)

Log maintenance runs

sql

SELECT brin_summarize_new_values('logs_ts_idx') AS ranges, now() AS ran_at;

→0 | 2026-03-16 ...

⚠Anti-Pattern— Reindexing BRIN indexes after every bulk insert

REINDEX is a heavyweight operation that locks the table. brin_summarize_new_values() only processes new, unsummarized ranges without locking.

✓ Instead: Use brin_summarize_new_values() for incremental updates after bulk inserts; reserve REINDEX for BRIN corruption or major schema changes.

BRIN indexes use autosummarize to keep up with inserts, but it may lag after large bulk loads. Running brin_summarize_new_values() immediately after a bulk insert ensures the index is fully updated and queries use it efficiently.

example

COPY events FROM '/data/events_2026.csv'; SELECT brin_summarize_new_values('events_ts_brin_idx');

→BRIN index updated immediately after bulk load

Common Gotchas

⚠

LIKE is case-sensitive; ILIKE is not — and LIKE is faster

LIKE 'hello%' will not match 'Hello'. Use ILIKE for case-insensitive pattern matching, but expect a performance cost.

⚠

Arrays are 1-indexed in PostgreSQL, not 0-indexed

PostgreSQL arrays start at index 1 by default. array[0] returns NULL, not the first element — silently wrong.

⚠

Implicit type casts in indexes — your index may not be used

WHERE col = 5 may not use an index on col (text type) because the integer 5 is cast to text, preventing index use.

current_setting

PG 7.4+→ text

Returns the current value of a configuration parameter. With missing_ok=true, returns NULL instead of error for unknown settings.

DBAReliability

Signatures

current_setting ( setting_name text ) → text

current_setting ( setting_name text, missing_ok boolean ) → text

Parameters

Parameter	Type	Description
setting_name	text	GUC parameter name
missing_ok	boolean	If true, return NULL for unknown settings instead of raising an error

Examples

sql

SELECT current_setting('timezone');

→UTC

sql

SELECT current_setting('app.user_id', true);

→42 (or NULL if not set)

sql

SELECT current_setting('server_version_num')::integer;

→180000 (for PG 18)

Safe custom setting read with default fallback

sql

SELECT COALESCE(current_setting('myapp.feature_flag', true), 'off') AS flag_value;

→'off'

⚠Anti-Pattern— Using current_setting() without the missing_ok parameter in application code

current_setting('myapp.setting') raises an error if the setting doesn't exist. In application code that checks for optional custom settings, this causes uncaught exceptions.

✓ Instead: Use current_setting('myapp.setting', true) with missing_ok=true — it returns NULL instead of raising an error when the setting is absent. Use COALESCE to provide a default: COALESCE(current_setting('myapp.setting', true), 'default').

Use `current_setting('app.key', true)` with `missing_ok=true` in RLS policies and functions to safely read application context set by `set_config`. The `true` parameter prevents errors when the setting hasn't been set yet.

example

CREATE FUNCTION get_current_tenant() RETURNS bigint AS $ SELECT coalesce(current_setting('app.tenant_id', true), '0')::bigint; $ LANGUAGE SQL STABLE;

→Returns tenant ID from session context, 0 if not set

gin_clean_pending_list

PG 9.6+→ bigint

Cleans up the GIN (Generalized Inverted Index) pending list by moving entries from the pending list to the main index structure. Returns the number of pages removed from the pending list. Equivalent to VACUUM on a GIN index.

DBA

Signature

gin_clean_pending_list ( index regclass ) → bigint

Parameters

Parameter	Type	Description
index	regclass	Name or OID of the GIN index

Examples

12 pending list pages merged into main index

sql

SELECT gin_clean_pending_list('articles_search_gin_idx');

→12

Pending list already empty

sql

SELECT gin_clean_pending_list('documents_tsvector_idx');

→0

Large cleanup after bulk load

sql

-- After bulk insert into full-text search table:
SELECT gin_clean_pending_list('posts_tsv_idx');

→450

Clean all GIN indexes

sql

SELECT indexrelid::regclass, gin_clean_pending_list(indexrelid) FROM pg_index WHERE indexrelid IN (SELECT oid FROM pg_class WHERE relam = (SELECT oid FROM pg_am WHERE amname='gin'));

→(rows)

Clean and check resulting size

sql

SELECT gin_clean_pending_list('tags_gin_idx'), pg_relation_size('tags_gin_idx') AS idx_size_after;

→0 | 8192

⚠Anti-Pattern— Setting gin_pending_list_limit to 0 to avoid pending list entirely

Disabling the pending list with gin_pending_list_limit=0 makes every insert directly update the main GIN structure — dramatically slowing inserts on high-write tables.

✓ Instead: Keep the default or tuned gin_pending_list_limit and call gin_clean_pending_list() periodically (or let autovacuum handle it).

GIN indexes use a fast-update pending list (gin_pending_list_limit) that defers merging for performance. After large bulk inserts, search queries slow down as they must scan both the pending list and main index. gin_clean_pending_list() eliminates this dual-scan overhead.

example

COPY articles FROM '/data/articles.csv'; SELECT gin_clean_pending_list('articles_fts_idx');

→GIN index fully merged; full-text search at normal speed

Common Gotchas

⚠

LIKE is case-sensitive; ILIKE is not — and LIKE is faster

LIKE 'hello%' will not match 'Hello'. Use ILIKE for case-insensitive pattern matching, but expect a performance cost.

⚠

Arrays are 1-indexed in PostgreSQL, not 0-indexed

PostgreSQL arrays start at index 1 by default. array[0] returns NULL, not the first element — silently wrong.

⚠

Implicit type casts in indexes — your index may not be used

WHERE col = 5 may not use an index on col (text type) because the integer 5 is cast to text, preventing index use.

pg_advisory_lock

PG 8.2+→ void

Obtains an exclusive session-level advisory lock. The application defines what the lock key means. Blocks until the lock is available.

SecurityDBAReliability

Signatures

pg_advisory_lock ( key bigint ) → void

pg_advisory_lock ( key1 integer, key2 integer ) → void

Parameters

Parameter	Type	Description
key	bigint or two integers	Application-defined lock identifier

Examples

sql

SELECT pg_advisory_lock(12345);

→(acquires lock 12345)

sql

SELECT pg_advisory_lock(hashtext('my_job_name'));

→Named lock using hash

Per-job lock using string hash as key

sql

SELECT pg_advisory_lock(hashtext('import_job_' || job_id)) FROM jobs WHERE id = 1;

→(void)

Count advisory locks held by current session

sql

SELECT count(*) FROM pg_locks WHERE locktype = 'advisory' AND pid = pg_backend_pid();

→1

⚠Anti-Pattern— Acquiring session advisory locks without guaranteed release

Session-level advisory locks persist until explicitly released or the session ends. A crashed or pooled connection that holds advisory locks can block other sessions indefinitely.

✓ Instead: Use transaction-level advisory locks (pg_advisory_xact_lock()) which auto-release on commit/rollback. For session locks, always pair with pg_advisory_unlock() in exception handlers or use pg_advisory_unlock_all() in connection pool reset queries.

Advisory locks are ideal for preventing duplicate job execution across multiple application servers. Use `pg_try_advisory_lock` for non-blocking acquisition — if it returns false, another process has the lock.

example

-- Non-blocking: skip if already locked
IF pg_try_advisory_lock(hashtext('process_invoices')) THEN
  -- do work
  PERFORM pg_advisory_unlock(hashtext('process_invoices'));
END IF;

→Safe distributed mutex for job scheduling

pg_advisory_lock_shared

PG 8.2+→ void

Acquires a session-level shared advisory lock. Multiple sessions can hold shared locks on the same key simultaneously. Blocks if an exclusive advisory lock is held by another session. Released when the session ends.

DeveloperDBA

Signatures

pg_advisory_lock_shared ( key bigint ) → void

pg_advisory_lock_shared ( key1 integer, key2 integer ) → void

Parameters

Parameter	Type	Description
key	bigint or two integers	Application-defined lock identifier (64-bit or two 32-bit values)

Examples

Acquire shared lock on key 42

sql

SELECT pg_advisory_lock_shared(42);

→(void)

Shared lock using two-integer form (hash into 64-bit)

sql

SELECT pg_advisory_lock_shared(1001, 2002);

→(void)

Use string hash as lock key

sql

SELECT pg_advisory_lock_shared(hashtext('data_import_job'));

→(void)

3 sessions hold shared lock on key 42

sql

SELECT count(*) FROM pg_locks WHERE locktype='advisory' AND classid=0 AND objid=42 AND mode='ShareLock';

→3

Release the shared lock when done

sql

SELECT pg_advisory_unlock_shared(42);

→true

⚠Anti-Pattern— Using advisory locks as a substitute for proper table locking

Advisory locks are purely cooperative — they only work if all code paths acquire and release them correctly. Any process that skips the lock protocol bypasses coordination entirely.

✓ Instead: Use advisory locks for high-level coordination (job scheduling, cache invalidation); use SELECT ... FOR UPDATE for row-level data concurrency where automatic enforcement is needed.

Shared advisory locks allow concurrent readers while exclusive locks (pg_advisory_lock()) block all others. This mirrors reader-writer lock semantics without requiring a dedicated coordination table.

example

-- Readers: SELECT pg_advisory_lock_shared(job_id); -- Writers (exclusive):
SELECT pg_advisory_lock(job_id); -- blocks until all readers release

→Multiple readers proceed concurrently; writer waits for readers

pg_advisory_unlock

PG 8.2+→ boolean

Releases a session-level advisory lock previously acquired with pg_advisory_lock or pg_try_advisory_lock.

SecurityDBAReliability

Signatures

pg_advisory_unlock ( key bigint ) → boolean

pg_advisory_unlock_all ( ) → void

Parameters

Parameter	Type	Description
key	bigint or two integers	Lock identifier to release

Examples

sql

SELECT pg_advisory_unlock(12345);

→true (released), false (wasn't held)

sql

SELECT pg_advisory_unlock_all();

→Releases all advisory locks held by this session

Exception-safe lock/work/unlock pattern

sql

SELECT pg_advisory_lock(42); DO $ BEGIN PERFORM work(); PERFORM pg_advisory_unlock(42); EXCEPTION WHEN OTHERS THEN PERFORM pg_advisory_unlock(42); RAISE; END $;

→(void)

Release and confirm all advisory locks cleared

sql

SELECT pg_advisory_unlock(42) AS released, count(*) AS remaining_locks FROM pg_locks WHERE locktype='advisory' AND pid=pg_backend_pid();

→true | 0

⚠Anti-Pattern— Calling pg_advisory_unlock() on a lock you did not acquire

pg_advisory_unlock() returns false (not an error) if you try to release a lock you do not hold. Mismatched lock/unlock calls in complex workflows can leave locks held longer than intended or release another worker's lock if keys collide.

✓ Instead: Always track which advisory locks your session holds. Consider using a table to log acquired locks so cleanup code can release exactly the right ones. Use pg_try_advisory_lock() return value to confirm acquisition.

Session-level advisory locks persist until explicitly released or the session ends. Always pair `pg_advisory_lock` with `pg_advisory_unlock`. Use transaction-level `pg_advisory_xact_lock` if you want automatic release at transaction end.

example

-- Transaction-level: auto-released at commit/rollback
SELECT pg_advisory_xact_lock(42);

→Lock released automatically when transaction ends

pg_advisory_unlock_all

PG 8.2+→ void

Releases all session-level advisory locks held by the current session. Useful for cleanup after a lock-holding operation completes or errors. Does not affect transaction-level advisory locks.

DeveloperDBA

Signature

pg_advisory_unlock_all ( ) → void

Examples

Release all session advisory locks

sql

SELECT pg_advisory_unlock_all();

→(void)

Acquire multiple locks then release all at once

sql

SELECT pg_advisory_lock(1); SELECT pg_advisory_lock(2); SELECT pg_advisory_unlock_all();

→(void)

Count then release

sql

SELECT count(*) FROM pg_locks WHERE locktype = 'advisory' AND pid = pg_backend_pid() BEFORE pg_advisory_unlock_all();

→3

Reset connection state before returning to pool

sql

-- In connection pooling cleanup:
SELECT pg_advisory_unlock_all();

→(void)

Exception-safe cleanup pattern

sql

DO $ BEGIN PERFORM pg_advisory_lock(99); PERFORM some_operation(); EXCEPTION WHEN OTHERS THEN PERFORM pg_advisory_unlock_all(); RAISE; END $;

→(void)

⚠Anti-Pattern— Assuming advisory locks auto-release at transaction end

Session-level advisory locks (pg_advisory_lock) persist across transaction boundaries until explicitly released or the session ends. They do NOT roll back with ROLLBACK.

✓ Instead: For transaction-scoped advisory locks (auto-released on commit/rollback), use pg_advisory_xact_lock() instead of pg_advisory_lock().

When using PgBouncer or other connection pools in session mode, a connection may be returned to the pool with advisory locks still held. Call pg_advisory_unlock_all() in the pool reset statement (pool_reset_query) to guarantee clean state.

example

-- PgBouncer: server_reset_query = DISCARD ALL; SELECT pg_advisory_unlock_all();

→Connection returned to pool in clean state without stale advisory locks

pg_advisory_xact_lock

PG 8.2+→ void

Acquires a transaction-level exclusive advisory lock. Automatically released when the transaction ends (commit or rollback). Cannot be released manually — use for lock scopes that should match transaction lifetime.

DeveloperDBA

Signatures

pg_advisory_xact_lock ( key bigint ) → void

pg_advisory_xact_lock ( key1 integer, key2 integer ) → void

Parameters

Parameter	Type	Description
key	bigint or two integers	Application-defined lock identifier

Examples

Lock held for transaction duration

sql

BEGIN; SELECT pg_advisory_xact_lock(42); -- do work -- COMMIT; -- lock released automatically

→(void)

Per-user lock using string hash

sql

SELECT pg_advisory_xact_lock(hashtext('process_user_' || user_id::text)) FROM users WHERE id = $1;

→(void)

Two-integer form for combined key space

sql

SELECT pg_advisory_xact_lock(1001, 2002);

→(void)

ROLLBACK releases transaction advisory locks

sql

SELECT pg_advisory_xact_lock(42); ROLLBACK; -- Lock auto-released by rollback

→(void)

Transaction-level locks have non-null transactionid

sql

SELECT count(*) FROM pg_locks WHERE locktype='advisory' AND mode='ExclusiveLock' AND transactionid IS NOT NULL;

→1

⚠Anti-Pattern— Trying to manually release a transaction advisory lock

pg_advisory_unlock() will not release a lock acquired with pg_advisory_xact_lock(). The unlock function only works on session-level locks.

✓ Instead: Transaction advisory locks are designed to auto-release. Simply commit or rollback the transaction. If you need manual release, use pg_advisory_lock() and pg_advisory_unlock() instead.

pg_advisory_xact_lock() is almost always better than pg_advisory_lock() because it auto-releases on commit or rollback, eliminating the risk of stale locks from crashed sessions or forgotten unlocks.

example

BEGIN; SELECT pg_advisory_xact_lock(order_id) FROM orders WHERE id = $1 FOR UPDATE; -- Update order safely -- COMMIT;

→Lock automatically released at COMMIT — no manual unlock needed

Common Gotchas

⚠

now() is frozen within a transaction — use clock_timestamp() for real wall time

now() returns the start time of the current transaction, not the current wall clock time. It doesn't advance during a long transaction.

pg_backup_start

PG15+ (replaces pg_start_backup)→ pg_lsn

Starts a non-exclusive base backup. Returns the starting WAL LSN. Replaced pg_start_backup() in PostgreSQL 15. Requires superuser or pg_backup role. The session must call pg_backup_stop() to complete.

DBA

Signature

pg_backup_start ( label text [, fast boolean ] ) → pg_lsn

Parameters

Parameter	Type	Description
label	text	Label string for the backup (appears in backup_label file)
fast	boolean	If true, request immediate checkpoint rather than spread checkpoint; default false

Examples

Start backup and record starting LSN

sql

SELECT pg_backup_start('nightly_backup_2026_03_16');

→0/3000000

Start with immediate checkpoint (fast=true)

sql

SELECT pg_backup_start('hot_backup', true);

→0/5000000

Copy data directory while backup is active

sql

\! rsync -a $PGDATA/base/ /backup/base/  -- Then:

→(shell command)

Stop backup and collect stop LSN and files

sql

SELECT * FROM pg_backup_stop(true);

→(lsn, labelfile, spcmapfile)

Pattern: start, copy files, then stop

sql

SELECT pg_backup_start('pre_upgrade'), pg_sleep(0);

→(lsn)

⚠Anti-Pattern— Using pg_start_backup() in PostgreSQL 15+

pg_start_backup() is deprecated in PostgreSQL 15 (exclusive backup mode removed) and removed in PostgreSQL 17. It is also dangerous in high-concurrency setups.

✓ Instead: Use pg_backup_start() (non-exclusive) in PostgreSQL 15+ for base backups. For production, prefer pg_basebackup which handles this automatically.

pg_backup_start() marks the backup start in WAL. If the session ends without pg_backup_stop(), PostgreSQL 15+ automatically ends the backup but the backup_label file may not be written correctly. Always complete the process.

example

DO $ BEGIN PERFORM pg_backup_start('test'); PERFORM pg_sleep(0); PERFORM pg_backup_stop(); END $;

→Atomic start/stop within a transaction block

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

⚠

pg_dump is consistent but still impacts production — it holds a snapshot

pg_dump takes a consistent snapshot, but that snapshot prevents autovacuum from removing dead tuples — causing table bloat during long backups.

pg_backup_stop

PG15+ (replaces pg_stop_backup)→ record (lsn pg_lsn, labelfile text, spcmapfile text)

Finishes a non-exclusive base backup started by pg_backup_start(). Returns the stop LSN and the contents of the backup_label and tablespace_map files that must be included in the backup. Added in PostgreSQL 15.

DBA

Signature

pg_backup_stop ( [ wait_for_archive boolean ] ) → record ( lsn pg_lsn, labelfile text, spcmapfile text )

Parameters

Parameter	Type	Description
wait_for_archive	boolean	If true (default), wait for the stop WAL file to be archived before returning

Examples

Stop backup, get stop LSN and label file contents

sql

SELECT * FROM pg_backup_stop();

→(0/6000000, "START WAL...", "")

Do not wait for WAL archiving

sql

SELECT * FROM pg_backup_stop(false);

→(0/6000000, ..., ...)

Check stop LSN and label file size

sql

SELECT lsn, length(labelfile) AS label_size FROM pg_backup_stop();

→0/6000000 | 243

Save label file to a table

sql

WITH stop AS (SELECT * FROM pg_backup_stop()) SELECT stop.lsn, stop.labelfile INTO backup_label FROM stop;

→(row)

The labelfile content MUST be written to backup directory

sql

-- Write labelfile content to $BACKUP_DIR/backup_label before completing backup

→(shell step)

⚠Anti-Pattern— Discarding the labelfile return value

The labelfile returned by pg_backup_stop() is NOT automatically written anywhere. Ignoring it produces an unusable backup that cannot be recovered from.

✓ Instead: Always capture the labelfile and spcmapfile values and write them to the backup directory as backup_label and tablespace_map respectively.

The contents of labelfile and spcmapfile returned by pg_backup_stop() must be written to backup_label and tablespace_map files in your backup directory. Without them, PostgreSQL cannot perform recovery from the backup.

example

SELECT labelfile, spcmapfile FROM pg_backup_stop() \gset -- then write :labelfile to backup_dir/backup_label

→Backup label and tablespace map content for inclusion in backup

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

⚠

pg_dump is consistent but still impacts production — it holds a snapshot

pg_dump takes a consistent snapshot, but that snapshot prevents autovacuum from removing dead tuples — causing table bloat during long backups.

pg_cancel_backend

PG 8.0+→ boolean

Cancels the current query being run by a backend process (sends SIGINT). The connection remains open.

DBAReliability

Signature

pg_cancel_backend ( pid integer ) → boolean

Parameters

Parameter	Type	Description
pid	integer	Process ID of the backend whose query to cancel

Examples

sql

SELECT pg_cancel_backend(1234);

→true

sql

SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE query_start < now() - interval '30 seconds' AND state = 'active' AND pid <> pg_backend_pid();

→Cancels queries running longer than 30 seconds

Cancel queries blocked on locks for more than 5 minutes

sql

SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE state = 'active' AND wait_event_type = 'Lock' AND now() - query_start > interval '5 minutes';

→(rows)

Cancel all queries from a specific application

sql

SELECT pid, query, pg_cancel_backend(pid) AS cancelled FROM pg_stat_activity WHERE application_name = 'old-app' AND state = 'active';

→(rows cancelled)

⚠Anti-Pattern— Using pg_terminate_backend() when pg_cancel_backend() is sufficient

pg_terminate_backend() drops the entire connection. pg_cancel_backend() only cancels the current query, leaving the connection and transaction state intact. Terminating unnecessarily causes reconnect overhead.

✓ Instead: Prefer pg_cancel_backend(pid) for stuck queries — the connection stays open and the client can retry. Reserve pg_terminate_backend() for when you specifically need to close the connection.

`pg_cancel_backend` only interrupts the current query — the connection persists and the application can retry. `pg_terminate_backend` kills the entire connection. Use cancel for runaway queries in production, terminate for stuck/zombie connections.

example

-- Add to a monitoring script:
SELECT pid, query_start, query FROM pg_stat_activity WHERE state = 'active' AND query_start < now() - interval '1 min' AND query NOT LIKE 'SELECT pg_%';

→Identify long-running queries to consider cancelling

pg_column_size

PG 8.1+→ integer

Returns the number of bytes used to store the specified value (not the column definition, but an actual data value).

DBAReliability

Signature

pg_column_size ( any ) → integer

Parameters

Parameter	Type	Description
value	any	Value to measure the stored size of

Examples

sql

SELECT pg_column_size(1::integer);

→4

sql

SELECT pg_column_size('hello'::text);

→10

sql

SELECT pg_column_size('{}'::jsonb);

→20

Column metadata sizes from system catalog

sql

SELECT attname, pg_column_size(attname) AS name_bytes, pg_column_size(attdefault) AS default_bytes FROM pg_attribute WHERE attrelid = 'orders'::regclass AND attnum > 0;

→(bytes per column in catalog)

⚠Anti-Pattern— Using pg_column_size() to estimate total row storage

pg_column_size() measures the compressed size of a single value, not the space it occupies in a table row. Row headers, alignment padding, and NULL bitmaps add overhead that pg_column_size() does not include.

✓ Instead: Use pg_relation_size() for table-level storage. For per-row estimates use: SELECT sum(pg_column_size(t.*)) FROM t LIMIT 1 as a rough guide, but note alignment padding is not captured.

Use `pg_column_size` to compare how much space different representations use. For example, `jsonb` vs normalized table columns, or different numeric precision levels.

example

SELECT pg_column_size(42::int), pg_column_size(42::bigint), pg_column_size(42::numeric);

→4 | 8 | 7 — integer is most compact for small numbers

pg_create_logical_replication_slot

PG 9.4+→ record (slot_name name, lsn pg_lsn)

Creates a logical replication slot with the given name using the specified output plugin (e.g., 'pgoutput', 'test_decoding', 'wal2json'). Returns the slot name and the LSN at which the slot was created. Requires superuser or replication privilege.

DBA

Signature

pg_create_logical_replication_slot ( slot_name name, plugin name [, temporary boolean [, reserve_wal boolean]] ) → record (slot_name name, lsn pg_lsn)

Parameters

Parameter	Type	Description
slot_name	name	Unique name for the replication slot
plugin	name	Output plugin name: 'pgoutput', 'test_decoding', 'wal2json', etc.
temporary	boolean	If true, slot is dropped when session ends (optional, default false)
reserve_wal	boolean	If true, start reserving WAL immediately rather than on first use (optional, default false)

Examples

sql

SELECT * FROM pg_create_logical_replication_slot('my_cdc_slot', 'pgoutput');

→slot_name: my_cdc_slot | lsn: 0/4A2B3C4D

sql

SELECT * FROM pg_create_logical_replication_slot('test_slot', 'test_decoding', true);

→Temporary slot — dropped when session ends

sql

SELECT slot_name, plugin, confirmed_flush_lsn FROM pg_replication_slots WHERE slot_type = 'logical';

→All logical replication slots with their confirmed positions

sql

SELECT * FROM pg_create_logical_replication_slot('wal2json_slot', 'wal2json');

→slot_name: wal2json_slot | lsn: 0/4B1C2D3E

sql

SELECT pg_drop_replication_slot('test_slot');

→Slot dropped — no longer retaining WAL

⚠Anti-Pattern— Creating permanent logical slots without monitoring them

An unconsumed logical replication slot prevents WAL cleanup — the primary retains all WAL from the slot's `restart_lsn` forward. A forgotten slot can fill the pg_wal directory and cause the primary to PANIC. Always monitor `pg_replication_slots WHERE active = false`.

✓ Instead: SELECT slot_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal FROM pg_replication_slots WHERE NOT active;

Permanent logical replication slots retain WAL indefinitely if not consumed. For ad-hoc change data capture testing, create a temporary slot (`temporary = true`) — it is automatically dropped when the session ends, preventing WAL accumulation from forgotten slots.

example

SELECT * FROM pg_create_logical_replication_slot('temp_inspect', 'test_decoding', true);
SELECT * FROM pg_logical_slot_get_changes('temp_inspect', NULL, 10);

→Inspect recent changes without permanently retaining WAL

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

pg_drop_replication_slot

PG 9.4+→ void

Drops the named replication slot. The slot must not be currently in use. Once dropped, the slot no longer retains WAL, allowing it to be recycled. Must be run against the same database the logical slot was created on. Requires superuser or replication privilege.

DBA

Signature

pg_drop_replication_slot ( slot_name name ) → void

Parameters

Parameter	Type	Description
slot_name	name	Name of the replication slot to drop

Examples

sql

SELECT pg_drop_replication_slot('old_cdc_slot');

→(no output) — slot dropped

sql

SELECT slot_name, active, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained FROM pg_replication_slots WHERE NOT active;

→Inactive slots that may be safe to drop (verify first!)

sql

DO $ DECLARE r RECORD; BEGIN FOR r IN SELECT slot_name FROM pg_replication_slots WHERE NOT active AND restart_lsn < pg_current_wal_lsn() - pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0') LOOP PERFORM pg_drop_replication_slot(r.slot_name); END LOOP; END $;

→Drops all inactive slots (use with extreme caution)

sql

SELECT pg_drop_replication_slot('temp_decode_slot');

→Frees WAL retained by an abandoned development slot

⚠Anti-Pattern— Dropping replication slots on the wrong database

Logical replication slots are database-scoped. You must be connected to the same database as the slot to drop it. Connecting to a different database and calling `pg_drop_replication_slot` will raise 'replication slot ... does not exist'.

✓ Instead: \c target_database SELECT pg_drop_replication_slot('slot_name');

Before dropping a slot, verify how much WAL it is retaining with `pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)`. Dropping a slot being used by an active subscriber will break replication for that subscriber without warning.

example

SELECT slot_name, active, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal FROM pg_replication_slots;

→Review before dropping — active slots must not be dropped while in use

pg_export_snapshot

PG 9.2+→ text

Saves the current transaction snapshot and returns a snapshot identifier string. Other transactions can import this snapshot with SET TRANSACTION SNAPSHOT to see exactly the same data — essential for consistent parallel exports.

DBADeveloper

Signature

pg_export_snapshot ( ) → text

Examples

Export snapshot; other sessions can now import it

sql

BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ; SELECT pg_export_snapshot();

→00000003-00000001-1

Import snapshot for consistent view

sql

-- In another session:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '00000003-00000001-1';

→(void)

Pass snapshot to parallel workers

sql

SELECT pg_export_snapshot() AS snap_id \gset
-- Pass :snap_id to parallel worker connections

→00000003-00000002-1

Pattern for parallel consistent exports

sql

BEGIN; SELECT pg_export_snapshot(); -- Worker 1: SELECT * FROM orders WHERE id BETWEEN 1 AND 50000;
-- Worker 2: SELECT * FROM orders WHERE id BETWEEN 50001 AND 100000;

→Parallel consistent dump

Current transaction ID alongside snapshot ID

sql

SELECT txid_current(), pg_export_snapshot();

→12345 | 00000003-00000001-1

⚠Anti-Pattern— Exporting a snapshot from a READ COMMITTED transaction

pg_export_snapshot() in READ COMMITTED mode produces a snapshot that advances with each statement — workers may see different data than expected.

✓ Instead: Always use REPEATABLE READ or SERIALIZABLE isolation: BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ; SELECT pg_export_snapshot();

pg_dump --jobs uses pg_export_snapshot() internally to coordinate parallel workers. If building a custom parallel export, export one snapshot in the coordinator connection, then have all worker connections import it for a consistent multi-session read.

example

-- Coordinator: BEGIN; SELECT pg_export_snapshot() AS snap;
-- Workers: BEGIN ISOLATION LEVEL REPEATABLE READ; SET TRANSACTION SNAPSHOT :snap;
-- All workers now see the same consistent snapshot

→Consistent parallel export without locking

Common Gotchas

⚠

now() is frozen within a transaction — use clock_timestamp() for real wall time

now() returns the start time of the current transaction, not the current wall clock time. It doesn't advance during a long transaction.

⚠

pg_dump is consistent but still impacts production — it holds a snapshot

pg_dump takes a consistent snapshot, but that snapshot prevents autovacuum from removing dead tuples — causing table bloat during long backups.

pg_get_wal_replay_pause_state

PG 13+→ text

Returns the current WAL replay pause state on a standby server. Returns 'not paused' (normal recovery), 'paused' (recovery paused, not applying WAL), or 'starting' (transitioning). More informative than pg_is_wal_replay_paused() which only returns a boolean. Requires superuser or pg_monitor role.

DBA

Signature

pg_get_wal_replay_pause_state ( ) → text

Examples

sql

SELECT pg_get_wal_replay_pause_state();

→'not paused' — standby is actively applying WAL

sql

SELECT pg_get_wal_replay_pause_state();

→'paused' — standby recovery is paused via pg_wal_replay_pause()

sql

SELECT pg_get_wal_replay_pause_state();

→'starting' — transitioning pause state

sql

SELECT pg_is_in_recovery(), pg_get_wal_replay_pause_state() AS replay_state FROM pg_control_recovery() LIMIT 1;

→true | not paused — standby is active

sql

-- Use in monitoring:
SELECT CASE pg_get_wal_replay_pause_state() WHEN 'not paused' THEN 'OK' WHEN 'paused' THEN 'WARN: recovery paused' ELSE 'INFO: transitioning' END AS replay_status;

→OK — normal recovery state

⚠Anti-Pattern— Pausing WAL replay on a standby without a monitoring alert

`pg_wal_replay_pause()` is meant for brief maintenance windows (e.g., taking a consistent read of the standby). Leaving replay paused indefinitely causes WAL to accumulate on the primary, eventually exhausting `wal_keep_size` and potentially breaking replication entirely.

✓ Instead: Always set a timeout or monitoring alert when pausing replay: `SELECT pg_wal_replay_pause(); -- do work --; SELECT pg_wal_replay_resume();` in a single session that will terminate on disconnect.

During the transition between paused and unpaused, `pg_is_wal_replay_paused()` may momentarily return false even though replay hasn't fully resumed. The three-state response of `pg_get_wal_replay_pause_state()` captures the 'starting' intermediate state accurately.

example

SELECT pg_get_wal_replay_pause_state() AS state, pg_last_wal_replay_lsn() AS last_applied;

→not paused | 0/4A2B3C4D — standby is applying WAL at this LSN

pg_is_in_recovery

PG 9.0+→ boolean

Returns true if the server is currently in recovery mode (standby/replica). Returns false on a primary server. Use to write queries that behave differently on primary vs. standby.

DBA

Signature

pg_is_in_recovery ( ) → boolean

Examples

Primary server returns false

sql

SELECT pg_is_in_recovery();

→false

Display server role

sql

SELECT CASE WHEN pg_is_in_recovery() THEN 'STANDBY' ELSE 'PRIMARY' END AS role;

→PRIMARY

Replication slots only exist on primary

sql

SELECT * FROM pg_stat_replication WHERE NOT pg_is_in_recovery();

→(rows)

Recovery status and WAL lag (standby)

sql

SELECT pg_is_in_recovery(), pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();

→true | 0/5000000 | 0/4FFFF00

Guard DML operations on standby

sql

DO $ BEGIN IF pg_is_in_recovery() THEN RAISE NOTICE 'Running on standby — read-only'; END IF; END $;

→NOTICE: Running on standby — read-only

⚠Anti-Pattern— Assuming a connection is always to the primary

In HA setups with load balancing, a connection may be routed to a standby. Write without checking pg_is_in_recovery() causes errors on read-only standbys.

✓ Instead: Check pg_is_in_recovery() = false before executing write statements in monitoring or admin scripts that may connect to any node in the cluster.

After a failover, the new primary will return pg_is_in_recovery() = false. Add this check to health monitoring to detect role changes and update load balancer config automatically.

example

SELECT pg_is_in_recovery() AS is_standby, pg_last_wal_replay_lsn() AS replay_lsn, now() - pg_last_xact_replay_timestamp() AS replication_lag;

→Recovery status, position, and lag in one query

pg_is_wal_replay_paused

PG 9.1+→ boolean

Returns true if WAL replay is currently requested to pause on a standby server. Note: returns the requested state — if a paused request was made but replay has not reached a safe pause point yet, it may still return true even though replay has not stopped.

DBA

Signature

pg_is_wal_replay_paused ( ) → boolean

Examples

WAL replay is running normally

sql

SELECT pg_is_wal_replay_paused();

→false

After pause request: returns true

sql

SELECT pg_wal_replay_pause(); SELECT pg_is_wal_replay_paused();

→true

Paused standby: replay LSN behind receive LSN

sql

SELECT pg_is_wal_replay_paused(), pg_last_wal_replay_lsn(), pg_last_wal_receive_lsn();

→true | 0/5000000 | 0/6000000

Health check status label

sql

SELECT CASE WHEN pg_is_wal_replay_paused() THEN 'PAUSED - lag may grow' ELSE 'OK' END AS replay_status;

→OK

Conditionally resume if paused

sql

SELECT pg_wal_replay_resume() WHERE pg_is_wal_replay_paused();

→(row)

⚠Anti-Pattern— Leaving WAL replay paused indefinitely

Pausing WAL replay causes the standby to fall behind the primary. WAL files accumulate on the primary until pg_wal fills up and the server stops.

✓ Instead: Always set a timeout or reminder: PERFORM pg_wal_replay_pause(); -- critical window -- PERFORM pg_wal_replay_resume(); Never leave paused in automation.

Temporarily pausing WAL replay on a standby during heavy analytical queries prevents page conflicts between the WAL replay process and your queries, reducing lock waits. Resume immediately after.

example

SELECT pg_wal_replay_pause(); -- Run heavy queries; SELECT pg_wal_replay_resume();

→Analytics queries run without WAL replay interference

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

pg_logical_emit_message

PG 11+→ pg_lsn

Emits a logical replication message into the WAL stream. Messages appear in the logical replication output and can be consumed by logical decoding plugins. Useful for sending out-of-band metadata, checkpoints, or custom events to replication consumers. Requires superuser.

DBA

Signature

pg_logical_emit_message ( transactional boolean, prefix text, content text ) → pg_lsn

Parameters

Parameter	Type	Description
transactional	boolean	If true, message is wrapped in the current transaction and only appears if the transaction commits. If false, emitted immediately regardless of transaction outcome.
prefix	text	A namespace/category prefix to identify the message type
content	text	The message payload (any text, often JSON)

Examples

sql

SELECT pg_logical_emit_message(true, 'migration', '{"version": 42, "applied_at": "2025-03-16"}');

→0/4B1C2D3E — message emitted as part of current transaction

sql

SELECT pg_logical_emit_message(false, 'heartbeat', 'ping');

→0/4B1C3000 — non-transactional heartbeat message emitted immediately

sql

SELECT pg_logical_emit_message(true, 'schema_change', json_build_object('table', 'orders', 'action', 'add_column', 'column', 'notes')::text);

→LSN of the emitted DDL event message

sql

SELECT pg_logical_emit_message(false, 'checkpoint', current_timestamp::text);

→Timestamp checkpoint message for downstream consumers

⚠Anti-Pattern— Using transactional messages to send notifications instead of NOTIFY

Transactional logical messages appear in WAL and are consumed by replication subscribers. They are not a substitute for `LISTEN/NOTIFY` for real-time in-process notifications. Using them for high-frequency event notification inflates WAL size unnecessarily.

✓ Instead: Use NOTIFY for real-time application notifications; use pg_logical_emit_message for cross-cluster metadata or migration markers consumed by CDC pipelines.

Logical replication slots need to see activity to prove the consumer is alive. Emit periodic non-transactional (`transactional = false`) heartbeat messages from a cron job or pg_cron to prevent idle slots from stalling WAL cleanup.

example

SELECT pg_logical_emit_message(false, 'heartbeat', NOW()::text);

→Keeps the logical decoding stream active without a real transaction

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

pg_logical_slot_get_changes

PG 9.4+→ setof record (lsn pg_lsn, xid xid, data text)

Retrieves and consumes changes from a logical replication slot up to the specified LSN or change count. Changes are returned as decoded output from the slot's plugin (e.g., JSON for wal2json, SQL-like for test_decoding). Advancing the slot means these changes cannot be retrieved again.

DBA

Signature

pg_logical_slot_get_changes ( slot_name name, upto_lsn pg_lsn, upto_nchanges integer, VARIADIC options text[] ) → setof record (lsn pg_lsn, xid xid, data text)

Parameters

Parameter	Type	Description
slot_name	name	Name of the logical replication slot
upto_lsn	pg_lsn	Upper LSN limit (NULL for no limit)
upto_nchanges	integer	Maximum number of changes to return (NULL for no limit)
options	text[]	Plugin-specific key-value options (variadic, e.g., 'include-timestamp', 'on')

Examples

sql

SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, 10);

→10 most recent changes consumed from the slot

sql

SELECT lsn, xid, data FROM pg_logical_slot_get_changes('test_decoding_slot', NULL, NULL);

→All pending changes decoded as SQL-like statements

sql

SELECT * FROM pg_logical_slot_get_changes('my_slot', pg_current_wal_lsn(), 100, 'include-timestamp', 'on');

→Up to 100 changes up to current LSN, with timestamps

sql

SELECT count(*) AS changes_consumed FROM pg_logical_slot_get_changes('cdc_slot', NULL, NULL);

→Total pending changes consumed in one call

⚠Anti-Pattern— Calling pg_logical_slot_get_changes without handling the result

Calling `pg_logical_slot_get_changes` and discarding the results (e.g., inside a DO block without capturing the output) still advances the slot's confirmed LSN. The changes are permanently consumed and lost. Always process or store the output before calling this function.

✓ Instead: INSERT INTO change_log SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, 1000); -- Store before consuming

`pg_logical_slot_get_changes` advances the slot — changes are gone after the call. Use `pg_logical_slot_peek_changes` with the same parameters to inspect changes without consuming them, then call `get_changes` when ready to commit the advance.

example

-- Preview without consuming:
SELECT * FROM pg_logical_slot_peek_changes('my_slot', NULL, 5);
-- Then consume:
SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, 5);

→Inspect then consume in a controlled two-phase pattern

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

pg_logical_slot_peek_changes

PG 9.4+→ setof record (lsn pg_lsn, xid xid, data text)

Peeks at changes in a logical replication slot without consuming them. The slot position is not advanced, so the same changes will be returned on the next call. Identical signature to pg_logical_slot_get_changes but non-destructive.

DBA

Signature

pg_logical_slot_peek_changes ( slot_name name, upto_lsn pg_lsn, upto_nchanges integer, VARIADIC options text[] ) → setof record (lsn pg_lsn, xid xid, data text)

Parameters

Parameter	Type	Description
slot_name	name	Name of the logical replication slot
upto_lsn	pg_lsn	Upper LSN limit (NULL for no limit)
upto_nchanges	integer	Maximum changes to return (NULL for no limit)
options	text[]	Plugin-specific key-value options (variadic)

Examples

sql

SELECT * FROM pg_logical_slot_peek_changes('my_slot', NULL, 5);

→First 5 pending changes — slot position unchanged

sql

SELECT count(*) FROM pg_logical_slot_peek_changes('my_slot', NULL, NULL);

→Count of all pending changes without consuming them

sql

SELECT data FROM pg_logical_slot_peek_changes('wal2json_slot', NULL, 10) WHERE data LIKE '%orders%';

→Preview decoded changes touching the orders table

sql

-- Safe inspection pattern:
SELECT lsn, xid, data FROM pg_logical_slot_peek_changes('cdc_slot', NULL, 100);
-- If output looks correct, consume:
SELECT * FROM pg_logical_slot_get_changes('cdc_slot', NULL, 100);

→Two-phase inspect-then-consume pattern

⚠Anti-Pattern— Using peek_changes in production pipelines instead of get_changes

If a CDC pipeline uses `peek_changes` in a loop without ever calling `get_changes`, the slot position never advances. WAL continues to accumulate indefinitely until pg_wal fills up and the server panics. Peek is for inspection only — production consumers must use `get_changes`.

✓ Instead: Always use pg_logical_slot_get_changes in production consumers. Use peek_changes only for ad-hoc debugging sessions.

When diagnosing a stalled or misbehaving CDC pipeline, `pg_logical_slot_peek_changes` lets you see exactly what changes are queued without risking data loss. You can call it as many times as needed — the slot position is never advanced.

example

SELECT lsn, data FROM pg_logical_slot_peek_changes('production_cdc', NULL, 20) ORDER BY lsn;

→First 20 pending decoded changes in LSN order — safe to call repeatedly

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

pg_ls_archive_statusdir

PG 10+→ setof record (name text, size bigint, modification timestamp)

Returns a set of records describing the files in the WAL archive status directory (pg_wal/archive_status). Each file indicates the archiving status of a WAL segment: .ready (waiting to archive), .done (archived), or .failed (archive command failed). Requires superuser or pg_monitor role.

DBA

Signature

pg_ls_archive_statusdir ( ) → setof record (name text, size bigint, modification timestamp)

Examples

sql

SELECT name FROM pg_ls_archive_statusdir() WHERE name LIKE '%.ready';

→WAL segments waiting to be archived (backlog)

sql

SELECT count(*) AS pending_archive FROM pg_ls_archive_statusdir() WHERE name LIKE '%.ready';

→17 — segments not yet archived, archiving may be slow or failed

sql

SELECT count(*) FILTER (WHERE name LIKE '%.done') AS done, count(*) FILTER (WHERE name LIKE '%.ready') AS pending FROM pg_ls_archive_statusdir();

→done: 200 | pending: 3

sql

SELECT name, modification FROM pg_ls_archive_statusdir() WHERE name LIKE '%.ready' ORDER BY modification LIMIT 1;

→Oldest WAL segment waiting to archive — high age indicates stalled archiving

⚠Anti-Pattern— Deleting .ready files from archive_status to skip archiving

Manually deleting `.ready` files from `pg_wal/archive_status` silently abandons archiving of those WAL segments. The segments themselves remain but will never be archived. This creates gaps in the WAL archive that make point-in-time recovery impossible past those gaps.

✓ Instead: Fix the archive_command failure first. If you must skip archiving for specific segments, use `pg_switch_wal` after fixing the underlying issue and verify archive continuity before relying on the archive for recovery.

Under normal operation, `.ready` files appear briefly and are replaced by `.done` after a successful archive. If `.ready` files accumulate, the archive_command is failing. Correlate with `pg_stat_archiver.last_failed_wal` to see the exact error.

example

SELECT last_archived_wal, last_failed_wal, last_failed_time FROM pg_stat_archiver;

→If last_failed_time is recent and last_failed_wal matches a .ready file, archiving is actively failing

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

pg_ls_dir

PG 8.1+→ setof text

Returns the names of files and directories in a directory within the database cluster data directory. Requires superuser or pg_monitor role. Returns one row per entry.

DBA

Signature

pg_ls_dir ( dirname text [, missing_ok boolean, include_dot_dirs boolean] ) → setof text

Parameters

Parameter	Type	Description
dirname	text	Directory path relative to the data directory
missing_ok	boolean	If true, return empty set for missing directory; default false
include_dot_dirs	boolean	If true, include . and ..; default false

Examples

List WAL files

sql

SELECT * FROM pg_ls_dir('pg_wal') LIMIT 5;

→

000000010000000000000001
000000010000000000000002
...

List only WAL segment files (not history/backup files)

sql

SELECT * FROM pg_ls_dir('pg_wal') WHERE pg_ls_dir ~ '^[0-9A-F]{24}
    
    
    
  

;

→(WAL segment files)

Count WAL files (monitor for buildup)

sql

SELECT count(*) FROM pg_ls_dir('pg_wal');

→128

Check WAL archive status

sql

SELECT * FROM pg_ls_dir('pg_wal/archive_status');

→

000...done
000...ready

Missing directory: empty set not error

sql

SELECT * FROM pg_ls_dir('nonexistent', true, false);

→(0 rows)

⚠Anti-Pattern— Using pg_ls_dir() to list arbitrary OS directories

pg_ls_dir() is restricted to subdirectories of the PostgreSQL data directory for security. It cannot list /tmp, /home, or other OS paths.

✓ Instead: Use pg_ls_logdir() for log files, pg_ls_waldir() for WAL files — these are specialised, safer alternatives with additional metadata.

A rapidly growing pg_ls_dir('pg_wal') count indicates WAL archiving is failing. Alert when the count exceeds wal_keep_size / wal_segment_size WAL files.

example

SELECT count(*) AS wal_files FROM pg_ls_dir('pg_wal') WHERE pg_ls_dir ~ '^[0-9A-F]{24}
    
    
    
  

;

→Number of WAL segment files (alert if > expected max)

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

pg_ls_logdir

PG 10+→ setof record (name text, size bigint, modification timestamp)

Returns a set of records describing the files in the PostgreSQL log directory (as configured by log_directory). Each row contains the filename, size in bytes, and last modification time. Requires superuser or pg_monitor role.

DBA

Signature

pg_ls_logdir ( ) → setof record (name text, size bigint, modification timestamp)

Examples

sql

SELECT * FROM pg_ls_logdir() ORDER BY modification DESC LIMIT 5;

→name: postgresql-2025-03-16.log | size: 1048576 | modification: 2025-03-16 09:30:00

sql

SELECT sum(size) AS total_log_bytes FROM pg_ls_logdir();

→52428800 — 50 MB total log files

sql

SELECT name, pg_size_pretty(size) FROM pg_ls_logdir() ORDER BY modification DESC LIMIT 1;

→Current log file name and size

sql

SELECT count(*) AS log_file_count, pg_size_pretty(sum(size)) AS total_size FROM pg_ls_logdir();

→47 files | 1024 MB total

sql

SELECT name FROM pg_ls_logdir() WHERE modification < NOW() - INTERVAL '7 days';

→Log files older than 7 days (candidates for rotation/deletion)

⚠Anti-Pattern— Using pg_read_file on log files for real-time log tailing

Reading log files through `pg_read_file` and `pg_ls_logdir` inside the database is not a reliable log streaming mechanism. Large log entries can be read mid-write, and there is no cursor position — each call rereads from byte 0 (or a given offset). Use `pg_lsclusters`, syslog, or a log shipping solution for real-time monitoring.

✓ Instead: Use log_destination = 'csvlog' + COPY FROM to bulk-load log data, or use an external log agent (Vector, Fluentd) that reads from the filesystem.

Use `pg_ls_logdir()` in a monitoring query to alert on log files growing beyond a threshold. Runaway lock contention or checkpoint warnings can flood logs with large bursts of messages.

example

SELECT name, pg_size_pretty(size) FROM pg_ls_logdir() WHERE size > 100*1024*1024 ORDER BY modification DESC;

→Log files exceeding 100 MB — investigate for excessive logging

pg_ls_waldir

PG 10+→ setof record (name text, size bigint, modification timestamp)

Returns a set of records describing the files in the WAL (pg_wal) directory. Each row contains the filename, size in bytes, and last modification time. Useful for monitoring WAL accumulation and identifying recycled vs active segments. Requires superuser or pg_monitor role.

DBA

Signature

pg_ls_waldir ( ) → setof record (name text, size bigint, modification timestamp)

Examples

sql

SELECT count(*) AS wal_file_count, pg_size_pretty(sum(size)) AS total_wal FROM pg_ls_waldir();

→64 files | 1024 MB total WAL

sql

SELECT * FROM pg_ls_waldir() WHERE name NOT LIKE '%.history' AND name NOT LIKE '%.partial' ORDER BY modification DESC LIMIT 5;

→5 most recently modified active WAL segments

sql

SELECT pg_size_pretty(sum(size)) FROM pg_ls_waldir() WHERE modification > NOW() - INTERVAL '1 hour';

→WAL generated in the last hour

sql

SELECT name FROM pg_ls_waldir() WHERE name LIKE '%.partial';

→Incomplete WAL segments (still being written)

sql

SELECT count(*) FROM pg_ls_waldir() WHERE length(name) = 24; -- Full 24-char WAL segment names

→Number of full WAL segments (excludes history and partial files)

⚠Anti-Pattern— Using pg_ls_waldir to count active replication lag instead of pg_stat_replication

WAL directory size reflects all retained segments (for archiving, replication slots, and recycling) — not just replication lag. Use `pg_wal_lsn_diff(sent_lsn, replay_lsn)` from `pg_stat_replication` for accurate per-replica lag measurement.

✓ Instead: SELECT application_name, pg_size_pretty(pg_wal_lsn_diff(sent_lsn, replay_lsn)) AS lag FROM pg_stat_replication;

WAL files accumulate in pg_wal when archiving is enabled but the archive command is failing. Use `pg_ls_waldir()` to monitor total WAL size. If it grows beyond `wal_keep_size` + active replication slots, investigate `pg_stat_archiver.last_failed_wal`.

example

SELECT pg_size_pretty(sum(size)) AS wal_dir_size, (SELECT pg_size_pretty(setting::bigint * 1024 * 1024) FROM pg_settings WHERE name = 'wal_keep_size') AS wal_keep_size FROM pg_ls_waldir();

→2048 MB accumulated vs 256 MB configured keep — archiving may be stalled

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

pg_read_binary_file

PG 9.0+→ bytea

Returns the contents of a server-side file as bytea. Unlike pg_read_file (which returns text), this function preserves binary content exactly. Restricted to files within the data directory. Requires superuser or pg_read_server_files role.

DBA

Signatures

pg_read_binary_file ( filename text ) → bytea

pg_read_binary_file ( filename text, offset bigint, length bigint ) → bytea

pg_read_binary_file ( filename text, offset bigint, length bigint, missing_ok boolean ) → bytea

Parameters

Parameter	Type	Description
filename	text	Path to the file, relative to the data directory
offset	bigint	Starting byte offset (optional)
length	bigint	Maximum bytes to read (optional)
missing_ok	boolean	If true, return NULL instead of error when file not found (optional)

Examples

sql

SELECT pg_read_binary_file('postgresql.conf');

→\x5b706f737467726573716c5d... (binary content)

sql

SELECT length(pg_read_binary_file('pg_wal/000000010000000000000001'));

→16777216 — 16MB WAL segment

sql

SELECT pg_read_binary_file('global/pg_control', 0, 100);

→First 100 bytes of the pg_control file as bytea

sql

SELECT pg_read_binary_file('nonexistent_file', 0, 0, true);

→NULL — file not found, no error

sql

SELECT encode(pg_read_binary_file('PG_VERSION'), 'escape');

→16\n — PostgreSQL major version number as text via binary read

⚠Anti-Pattern— Reading large binary files without specifying offset/length

Calling `pg_read_binary_file` without offset/length on a large WAL segment or table file reads the entire file into memory (up to 16MB for WAL, potentially gigabytes for data files). This can cause OOM conditions or timeouts.

✓ Instead: Always use `pg_stat_file(filename).size` to check file size first, then read in chunks: `pg_read_binary_file(filename, 0, 65536)` for the first 64KB.

When you need exact binary content (e.g., reading a custom binary format, computing checksums, or extracting byte ranges), `pg_read_binary_file` is the right tool. For text files, `pg_read_file` handles encoding conversion automatically.

example

SELECT md5(pg_read_binary_file('pg_wal/000000010000000000000001')) AS wal_segment_md5;

→MD5 hash of the WAL segment for integrity verification

pg_read_file

PG 8.1+→ text

Returns the contents of a text file in the server file system. The file must be within the database cluster directory or in a directory listed in log_directory. Requires superuser or pg_read_server_files role.

DBA

Signature

pg_read_file ( filename text [, offset bigint, length bigint [, missing_ok boolean]] ) → text

Parameters

Parameter	Type	Description
filename	text	Relative path within the data directory, or absolute path allowed by pg_file_settings
offset	bigint	Byte offset to start reading (optional)
length	bigint	Maximum bytes to read (optional)
missing_ok	boolean	If true, return NULL for missing files instead of error; default false

Examples

Read PostgreSQL major version file

sql

SELECT pg_read_file('PG_VERSION');

→16

First 500 bytes of postgresql.conf

sql

SELECT pg_read_file('postgresql.conf', 0, 500);

→# PostgreSQL configuration...

Full pg_hba.conf contents

sql

SELECT pg_read_file('pg_hba.conf');

→# PostgreSQL Client Authentication...

Read current log file list

sql

SELECT pg_read_file('current_logfiles');

→stderr log/postgresql-2026-03-16.log

Return NULL instead of error for missing file

sql

SELECT pg_read_file('missing_file.txt', 0, 100, true);

→NULL

⚠Anti-Pattern— Using pg_read_file() to read arbitrary OS files

pg_read_file() is restricted to files within the data directory or log_directory for security. Attempting to read /etc/passwd or other system files will fail.

✓ Instead: Use pg_read_file() only for PostgreSQL config and log files. For general file access, use server-side PL/Python or PL/Perl extensions if available and authorised.

Combine pg_read_file() with current_logfiles to read the current PostgreSQL log directly from SQL — no shell access needed. Useful in restricted environments.

example

SELECT pg_read_file((SELECT regexp_replace(setting, '^stderr ', '') FROM pg_settings WHERE name='log_directory') || '/' || (SELECT split_part(pg_read_file('current_logfiles'), E'\n', 1) SPLIT_PART ' ' 2));

→Current PostgreSQL log file contents

pg_reload_conf

PG 8.0+→ boolean

Sends a SIGHUP signal to the server, causing it to reload configuration files (postgresql.conf, pg_hba.conf). Requires superuser privilege.

SecurityDBA

Signature

pg_reload_conf ( ) → boolean

Examples

sql

SELECT pg_reload_conf();

→true (if successful)

Reload then check for pending-restart settings

sql

SELECT pg_reload_conf(); SELECT name, setting, pending_restart FROM pg_settings WHERE pending_restart;

→true | (any settings needing restart)

Reload with timestamp for audit trail

sql

SELECT pg_reload_conf(), now() AS reloaded_at;

→true | 2026-03-16 10:00:00

Verify config values after reload

sql

SELECT name, setting, source FROM pg_settings WHERE source = 'configuration file' AND name IN ('work_mem','max_connections','shared_buffers');

→(current values from postgresql.conf)

⚠Anti-Pattern— Reloading config without verifying the change took effect

pg_reload_conf() sends SIGHUP but returns immediately. Some settings (requiring restart) silently do not apply until a full restart; others may fail validation.

✓ Instead: After pg_reload_conf(), query pg_settings WHERE pending_restart = true to check if any settings still need a restart. Check pg_log for configuration errors.

Many `postgresql.conf` settings can be changed without a server restart — just reload with `pg_reload_conf()`. Check `pg_settings.context = 'sighup'` to identify reloadable parameters.

example

SELECT name, setting FROM pg_settings WHERE context = 'sighup' AND name LIKE 'log_%';

→Log settings changeable without restart

pg_replication_slot_advance

PG 9.5+→ record (slot_name name, end_lsn pg_lsn)

Advances the confirmed flush LSN of a replication slot forward to at least upto_lsn, allowing WAL before that point to be recycled. Use with caution — advancing past unconsumed changes permanently discards them. Requires superuser or replication privilege.

DBA

Signature

pg_replication_slot_advance ( slot_name name, upto_lsn pg_lsn ) → record (slot_name name, end_lsn pg_lsn)

Parameters

Parameter	Type	Description
slot_name	name	Name of the replication slot to advance
upto_lsn	pg_lsn	LSN to advance the slot to (changes before this LSN are discarded)

Examples

sql

SELECT * FROM pg_replication_slot_advance('cdc_slot', pg_current_wal_lsn());

→slot_name: cdc_slot | end_lsn: 0/4B1C2D3E — slot advanced to current position

sql

SELECT confirmed_flush_lsn, restart_lsn FROM pg_replication_slots WHERE slot_name = 'cdc_slot';

→Shows the slot's current confirmed and restart LSN before advancing

sql

SELECT * FROM pg_replication_slot_advance('old_slot', '0/5A000000');

→slot_name: old_slot | end_lsn: 0/5A000000 — slot advanced to specific LSN

sql

SELECT slot_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_before FROM pg_replication_slots WHERE slot_name = 'cdc_slot';

→Check retained WAL before advancing to confirm the expected reduction

⚠Anti-Pattern— Advancing a logical slot past unconsumed changes without subscriber resync

Advancing a slot's LSN permanently discards all changes before that point. The subscriber will not see those changes and will silently diverge from the publisher's data. Always coordinate with the subscriber team before advancing slots in production.

✓ Instead: Prefer fixing the slow consumer, increasing `wal_keep_size`, or adding disk space. Only advance the slot if you accept data loss on the subscriber and have a resync plan.

When a logical slot has fallen far behind and pg_wal is filling up, but you don't want to drop the slot and lose the subscriber entirely, advance the slot to the current LSN. The subscriber will see a gap and need to be re-synced, but replication can resume.

example

-- Step 1: Check how far behind
SELECT pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) FROM pg_replication_slots WHERE slot_name = 'my_slot';
-- Step 2: Advance
SELECT * FROM pg_replication_slot_advance('my_slot', pg_current_wal_lsn());

→Slot advanced; WAL before current LSN can now be recycled

pg_rotate_logfile

PG 8.0+→ boolean

Forces log file rotation when using PostgreSQL's built-in logging collector. Creates a new log file immediately.

DBAReliability

Signature

pg_rotate_logfile ( ) → boolean

Examples

sql

SELECT pg_rotate_logfile();

→true

Force rotation with timestamp

sql

SELECT pg_rotate_logfile(), now() AS rotated_at;

→true | 2026-03-16 10:00:00

Rotate then confirm new log file created

sql

SELECT pg_rotate_logfile(); SELECT * FROM pg_ls_logdir() ORDER BY modification DESC LIMIT 3;

→(3 most recent log files)

Verify new empty log file was created

sql

SELECT name, size, modification FROM pg_ls_logdir() ORDER BY modification DESC LIMIT 1;

→postgresql-2026-03-16.log | 0 | 2026-03-16 10:00:00

⚠Anti-Pattern— Using pg_rotate_logfile() as the only log management strategy

pg_rotate_logfile() only creates a new log file. Without periodic deletion of old files, log directories fill up. On systems with log_rotation_size or log_rotation_age set, rotation happens automatically.

✓ Instead: Set log_rotation_age and log_rotation_size in postgresql.conf for automatic rotation. Use pg_rotate_logfile() only for on-demand rotation before taking a log archive.

Call `pg_rotate_logfile()` before long maintenance windows to start a clean log file. Makes it easier to extract maintenance-period logs for analysis.

example

SELECT pg_rotate_logfile(); -- Start fresh log before VACUUM FULL or migration

→New log file created at current timestamp

pg_start_backup

PG 8.0+; deprecated in PG 15; removed in PG 17→ pg_lsn

Starts an exclusive base backup (legacy API). Puts the server into backup mode so that a filesystem copy can be taken. Deprecated in PostgreSQL 15 and removed in PostgreSQL 17 in favour of pg_backup_start(). Requires superuser or pg_monitor role.

DBA

Signature

pg_start_backup ( label text [, fast boolean [, exclusive boolean]] ) → pg_lsn

Parameters

Parameter	Type	Description
label	text	Arbitrary label string to identify this backup
fast	boolean	If true, requests an immediate checkpoint instead of a spread checkpoint (optional, default false)
exclusive	boolean	If true (default), exclusive backup; if false, non-exclusive (optional)

Examples

sql

SELECT pg_start_backup('nightly-backup-2025-03-16');

→0/4A2B3C4D — starting LSN of the backup

sql

SELECT pg_start_backup('prod-backup', true);

→0/4A2B3C4D — immediate checkpoint, then backup begins

sql

-- After taking the filesystem copy:
SELECT * FROM pg_stop_backup();

→lsn: 0/4B1C2D3E | labelfile: ... | spcmapfile: ...

sql

-- PG15+ preferred alternative:
SELECT * FROM pg_backup_start('nightly-backup-2025-03-16');

→lsn: 0/4A2B3C4D | labelfile text

⚠Anti-Pattern— Leaving a server in backup mode without calling pg_stop_backup

If `pg_start_backup()` is called but `pg_stop_backup()` is never called (e.g., due to a backup script crash), the server remains in backup mode: WAL files are retained indefinitely, filling the pg_wal directory. Always use a try/finally pattern to guarantee pg_stop_backup is called.

✓ Instead: Use pg_backup_start()/pg_backup_stop() on PG15+, or ensure pg_stop_backup() is always called, even on error. Consider using tools like pgBackRest or pg_basebackup that handle this automatically.

`pg_start_backup()` was removed in PostgreSQL 17. For new backup tooling on PG15+, use `pg_backup_start(label)` and `pg_backup_stop()`. The new API is always non-exclusive and thread-safe.

example

-- PG 15+ non-exclusive backup API:
SELECT * FROM pg_backup_start('my-backup');
-- ... copy files ...
SELECT * FROM pg_backup_stop(true);

→Clean backup with WAL archiving wait

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

⚠

pg_dump is consistent but still impacts production — it holds a snapshot

pg_dump takes a consistent snapshot, but that snapshot prevents autovacuum from removing dead tuples — causing table bloat during long backups.

pg_stat_file

PG 8.1+→ record (size bigint, access timestamptz, modification timestamptz, change timestamptz, creation timestamptz, isdir boolean)

Returns metadata about a file or directory in the server file system: size, access time, modification time, change time, creation time (NULL on some OSes), and whether it is a directory.

DBA

Signature

pg_stat_file ( filename text [, missing_ok boolean] ) → record (size bigint, access timestamptz, modification timestamptz, change timestamptz, creation timestamptz, isdir boolean)

Parameters

Parameter	Type	Description
filename	text	Path relative to the data directory
missing_ok	boolean	If true, return NULL record for missing files; default false

Examples

Metadata for PG_VERSION file

sql

SELECT * FROM pg_stat_file('PG_VERSION');

→(2, 2026-03-16 08:00:00, 2026-01-01, 2026-01-01, NULL, false)

Config file size and last modification time

sql

SELECT size, modification FROM pg_stat_file('postgresql.conf');

→28672 | 2026-03-15 14:30:00

First WAL file size (should be ~16MB)

sql

SELECT size/1024/1024 AS mb, modification FROM pg_stat_file('pg_wal/' || (SELECT * FROM pg_ls_dir('pg_wal') LIMIT 1));

→16 | 2026-03-16 09:00:00

How long since postgresql.conf was changed

sql

SELECT now() - modification AS config_age FROM pg_stat_file('postgresql.conf');

→4:30:00

NULL record for missing file

sql

SELECT * FROM pg_stat_file('missing.txt', true);

→NULL

⚠Anti-Pattern— Using pg_stat_file() to check arbitrary OS file sizes

Like other server file functions, pg_stat_file() is restricted to the PostgreSQL data directory. You cannot use it to check disk space or arbitrary file sizes outside the cluster.

✓ Instead: Use pg_database_size(), pg_tablespace_size(), pg_relation_size(), or pg_total_relation_size() for storage monitoring within PostgreSQL.

Check the modification time of postgresql.conf before issuing pg_reload_conf() to verify the config was actually saved recently (guards against sending SIGHUP after an accidental no-op config change).

example

SELECT modification > now() - interval '5 minutes' AS recently_modified FROM pg_stat_file('postgresql.conf');

→true if config was changed in last 5 minutes

Common Gotchas

⚠

now() is frozen within a transaction — use clock_timestamp() for real wall time

now() returns the start time of the current transaction, not the current wall clock time. It doesn't advance during a long transaction.

⚠

Timezone trap: TIMESTAMP vs TIMESTAMPTZ store and display differently

TIMESTAMP stores local time with no timezone info. TIMESTAMPTZ stores UTC and converts to session timezone on display. Using TIMESTAMP leads to silent data loss across timezone changes.

⚠

date_trunc returns a timestamp, not a date — type mismatches in GROUP BY

date_trunc('day', now()) returns a TIMESTAMPTZ, not a DATE. Grouping by date_trunc and a DATE column compares different types.

pg_stop_backup (legacy)

PG 8.0+; deprecated in PG 15; removed in PG 17→ pg_lsn (exclusive) or record (lsn pg_lsn, labelfile text, spcmapfile text) (non-exclusive)

Stops a base backup started with pg_start_backup() (legacy API). When exclusive is false, returns a record with LSN, backup label contents, and tablespace map contents. Deprecated in PG15 in favour of pg_backup_stop().

DBA

Signatures

pg_stop_backup ( ) → pg_lsn

pg_stop_backup ( exclusive boolean [, wait_for_archive boolean] ) → record (lsn pg_lsn, labelfile text, spcmapfile text)

Parameters

Parameter	Type	Description
exclusive	boolean	Must match the mode used in pg_start_backup (optional)
wait_for_archive	boolean	If true, wait for WAL archive to complete before returning (optional, default true)

Examples

sql

SELECT pg_stop_backup();

→0/4B1C2D3E — ending LSN (exclusive backup mode)

sql

SELECT lsn, labelfile FROM pg_stop_backup(false, true);

→lsn: 0/4B1C2D3E | labelfile: START WAL LOCATION: ...

sql

SELECT pg_stop_backup(false, false);

→Returns immediately without waiting for WAL archive

sql

-- PG15+ replacement:
SELECT * FROM pg_backup_stop(true);

→lsn | labelfile | spcmapfile — clean API

⚠Anti-Pattern— Not waiting for WAL archive completion before verifying backup integrity

Calling `pg_stop_backup(false, false)` returns immediately without waiting for WAL segments to be archived. If you then immediately test the backup, the WAL needed for recovery to the backup end LSN may not yet be archived, making the backup unrecoverable.

✓ Instead: Use `pg_stop_backup(false, true)` (default) to wait for archive completion, or verify archive with `pg_walfile_name(end_lsn)` and check the archive directory.

In non-exclusive mode, `pg_stop_backup(false)` returns the backup label and tablespace map as text. Save these to files named `backup_label` and `tablespace_map` in the backup root — they are required for recovery to work correctly.

example

\copy (SELECT labelfile FROM pg_stop_backup(false)) TO '/backup/backup_label'

→Backup label file saved for recovery use

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

⚠

pg_dump is consistent but still impacts production — it holds a snapshot

pg_dump takes a consistent snapshot, but that snapshot prevents autovacuum from removing dead tuples — causing table bloat during long backups.

pg_switch_wal

PG 10+ (renamed from pg_switch_xlog)→ pg_lsn

Forces a switch to the next WAL segment file. Returns the ending WAL LSN plus 1 of the just-completed segment. Useful to ensure the current WAL segment gets archived. Requires superuser or pg_switch_server_state role.

DBA

Signature

pg_switch_wal ( ) → pg_lsn

Examples

Force WAL segment switch, returns LSN after switch

sql

SELECT pg_switch_wal();

→0/3000000

Switch and see new WAL file

sql

SELECT pg_switch_wal(), pg_walfile_name(pg_current_wal_lsn());

→0/4000000 | 000000010000000000000003

Conditional: only switch on primary

sql

SELECT pg_switch_wal() FROM generate_series(1,1) WHERE pg_is_in_recovery() = false;

→(row)

Used after pg_backup_stop() to ensure archiving

sql

SELECT pg_switch_wal();  -- Then wait for archive_status directory to show .done

→0/5000000

Check how full the last segment was (16MB = full)

sql

SELECT lsn - '0/0'::pg_lsn AS bytes_in_segment FROM pg_switch_wal() lsn;

→16777216

⚠Anti-Pattern— Calling pg_switch_wal() during peak write load unnecessarily

Forcing WAL switches creates more WAL files to archive. Each switch causes I/O and archive overhead. Call it only when explicitly needed (after backup stops, before planned downtime).

✓ Instead: Let PostgreSQL switch WAL segments naturally via wal_rotation_size / archive_timeout. Use pg_switch_wal() only for explicit archiving needs.

After pg_backup_stop(), the current WAL segment may not be full and therefore not archived yet. Call pg_switch_wal() to force segment rotation so the last segment gets archived, making your backup immediately restorable.

example

SELECT pg_backup_stop(); SELECT pg_switch_wal();

→Backup completed and final WAL segment archived

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

⚠

pg_dump is consistent but still impacts production — it holds a snapshot

pg_dump takes a consistent snapshot, but that snapshot prevents autovacuum from removing dead tuples — causing table bloat during long backups.

pg_terminate_backend

PG 8.0+→ boolean

Terminates a backend process (disconnects the client). Returns true if the signal was sent successfully. Requires superuser or pg_signal_backend role.

SecurityDBA

Signature

pg_terminate_backend ( pid integer [, timeout bigint] ) → boolean

Parameters

Parameter	Type	Description
pid	integer	Process ID of the backend to terminate
timeout	bigint	Optional: wait up to this many milliseconds for termination to complete

Examples

sql

SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle' AND query_start < now() - interval '1 hour';

→Terminates long-idle sessions

sql

SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'old_db' AND pid <> pg_backend_pid();

→Disconnect all other sessions from a database

Terminate all connections to a database before DROP DATABASE

sql

SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'old_db' AND pid != pg_backend_pid();

→(rows)

Identify long-running queries before deciding to terminate

sql

SELECT pid, now() - query_start AS duration, query FROM pg_stat_activity WHERE state = 'active' AND now() - query_start > interval '30 minutes';

→(long-running queries)

⚠Anti-Pattern— Terminating backends without identifying the root cause first

Blindly terminating connections that appear stuck may mask underlying issues (long-running queries, deadlocks, missing indexes). The same problem will recur.

✓ Instead: First query pg_stat_activity to understand what the backend is doing and why. Use pg_cancel_backend() for queries (leaves connection open); use pg_terminate_backend() only when the connection itself must be closed.

`pg_cancel_backend` sends SIGINT (cancels the current query but keeps the connection). `pg_terminate_backend` sends SIGTERM (kills the entire connection). Try cancel first, then terminate if needed.

example

SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE query_start < now() - interval '5 min' AND state = 'active';

→Cancels long-running queries gracefully

pg_try_advisory_lock

PG 8.2+→ boolean

Attempts to acquire a session-level advisory lock. Returns true if successful, false if the lock is not available (non-blocking).

SecurityDBAReliability

Signatures

pg_try_advisory_lock ( key bigint ) → boolean

pg_try_advisory_lock ( key1 integer, key2 integer ) → boolean

Parameters

Parameter	Type	Description
key	bigint or two integers	Application-defined lock identifier

Examples

sql

SELECT pg_try_advisory_lock(42);

→true (if lock acquired), false (if already locked)

sql

SELECT * FROM jobs WHERE pg_try_advisory_lock(id) LIMIT 1;

→Selects and locks one unlocked job

Skip job if another worker is already processing it

sql

SELECT CASE WHEN pg_try_advisory_lock(1234) THEN 'Processing' ELSE 'Skipping — another worker has this' END;

→'Processing'

Attempt lock and record result

sql

SELECT pid, pg_try_advisory_lock(99) AS got_lock FROM generate_series(1,1);

→12345 | true

⚠Anti-Pattern— Polling pg_try_advisory_lock() in a loop to wait for a lock

pg_try_advisory_lock() is non-blocking by design — it returns false immediately if the lock is taken. Polling it in a loop wastes CPU and creates thundering herd problems.

✓ Instead: Use pg_advisory_lock() when you must wait (it blocks cleanly). Use pg_try_advisory_lock() only when you want to skip processing if another worker already has the lock — not as a polling mechanism.

Use `pg_try_advisory_lock(id)` in a SELECT to implement a concurrent job queue: `SELECT id FROM jobs WHERE status = 'pending' AND pg_try_advisory_lock(id) LIMIT 1 FOR UPDATE SKIP LOCKED`. Multiple workers can safely claim different jobs.

example

SELECT id FROM pending_jobs WHERE pg_try_advisory_lock(id) ORDER BY created_at LIMIT 1;

→Claims one unclaimed job per worker call

pg_wal_lsn_diff

PG 10+→ numeric

Calculates the difference in bytes between two WAL LSNs. Returns lsn1 minus lsn2 as a numeric value. Positive result means lsn1 is ahead of lsn2 (more WAL has been generated). Used to measure replication lag, WAL accumulation, and backup sizes.

DBA

Signature

pg_wal_lsn_diff ( lsn1 pg_lsn, lsn2 pg_lsn ) → numeric

Parameters

Parameter	Type	Description
lsn1	pg_lsn	First (typically later) LSN
lsn2	pg_lsn	Second (typically earlier) LSN

Examples

sql

SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), pg_current_wal_flush_lsn()) AS unflushed_bytes;

→0 — all WAL has been flushed to disk

sql

SELECT pg_size_pretty(pg_wal_lsn_diff(sent_lsn, replay_lsn)) AS replica_lag FROM pg_stat_replication WHERE application_name = 'standby1';

→2432 kB — bytes of WAL the replica is behind

sql

SELECT pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), checkpoint_lsn)) AS since_checkpoint FROM pg_control_checkpoint();

→5 MB — WAL generated since the last checkpoint

sql

SELECT pg_wal_lsn_diff(end_lsn, start_lsn) AS backup_wal_bytes FROM pg_stat_archiver LIMIT 1;

→Bytes of WAL generated during a backup window

sql

SELECT pg_size_pretty(pg_wal_lsn_diff(write_lsn, flush_lsn)) AS write_ahead_unflushed FROM pg_stat_replication;

→WAL written but not yet flushed on the replica

⚠Anti-Pattern— Comparing LSN strings directly instead of using pg_wal_lsn_diff

LSNs are formatted as hex segments separated by a slash (e.g., '0/4A2B3C4D'). Comparing them as strings gives lexicographic order, not numeric WAL position order. For any arithmetic or magnitude comparison, use `pg_wal_lsn_diff`.

✓ Instead: SELECT pg_wal_lsn_diff('0/4A2B3C4D', '0/4A2B1234') AS bytes_apart; -- Numeric comparison, not string

Join `pg_stat_replication` with `pg_wal_lsn_diff` to measure replication lag in bytes. Compare `sent_lsn - replay_lsn` for total lag, or `flush_lsn - replay_lsn` for the gap between flushed and applied WAL on the replica.

example

SELECT application_name, pg_size_pretty(pg_wal_lsn_diff(sent_lsn, replay_lsn)) AS replay_lag FROM pg_stat_replication;

→standby1 | 1024 kB — bytes behind the primary

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

pg_wal_replay_pause

PG 9.1+→ void

Requests a pause of WAL replay on a standby server. Replay stops at the next safe opportunity. Must be run on a standby (pg_is_in_recovery() = true). Requires superuser or pg_wal_replay_pause role.

DBA

Signature

pg_wal_replay_pause ( ) → void

Examples

Request WAL replay pause

sql

SELECT pg_wal_replay_pause();

→(void)

Pause and verify

sql

SELECT pg_wal_replay_pause(); SELECT pg_sleep(1); SELECT pg_is_wal_replay_paused();

→true

Only pause on standby

sql

SELECT pg_wal_replay_pause() WHERE pg_is_in_recovery();

→(row or no row)

Consistent read window during pause

sql

-- Run heavy analytical queries here (point-in-time snapshot while paused) -- Then:

→(queries)

Resume after use — always do this!

sql

SELECT pg_wal_replay_resume();

→(void)

⚠Anti-Pattern— Pausing WAL replay in automation without a guaranteed resume

If a script that calls pg_wal_replay_pause() crashes or exits without calling pg_wal_replay_resume(), the standby stays paused indefinitely, growing the replication lag.

✓ Instead: Wrap in a DO block with exception handling, or use advisory locks with a monitoring alert that fires if pg_is_wal_replay_paused() is true for more than N seconds.

By pausing WAL replay, you freeze the standby at a point-in-time LSN. All queries during the pause see a consistent snapshot of the database as of that LSN — useful for consistent analytical reads without streaming lag.

example

SELECT pg_wal_replay_pause(); -- Freeze standby at current LSN
SELECT * FROM large_report_query; -- Consistent snapshot
SELECT pg_wal_replay_resume();  -- Resume immediately

→Consistent read window without impacting primary

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

pg_wal_replay_resume

PG 9.1+→ void

Resumes WAL replay on a standby where it was paused with pg_wal_replay_pause(). Safe to call even if replay is not paused. Must be run on a standby.

DBA

Signature

pg_wal_replay_resume ( ) → void

Examples

Resume replay (safe even if not paused)

sql

SELECT pg_wal_replay_resume();

→(void)

Conditionally resume only if paused

sql

SELECT pg_wal_replay_resume() WHERE pg_is_wal_replay_paused();

→(conditional void)

Before/resume/after check

sql

SELECT pg_is_wal_replay_paused(); SELECT pg_wal_replay_resume(); SELECT pg_is_wal_replay_paused();

→true | (void) | false

Guard: only call on standby

sql

SELECT pg_wal_replay_resume() WHERE pg_is_in_recovery();

→(row)

Exception-safe pause/resume block

sql

DO $ BEGIN PERFORM pg_wal_replay_pause(); -- do work -- PERFORM pg_wal_replay_resume(); EXCEPTION WHEN OTHERS THEN PERFORM pg_wal_replay_resume(); RAISE; END $;

→(void)

⚠Anti-Pattern— Assuming pg_wal_replay_resume() fails if replay is not paused

pg_wal_replay_resume() is idempotent — calling it when replay is already running does nothing harmful. This makes it safe to call unconditionally in cleanup code.

✓ Instead: Call pg_wal_replay_resume() unconditionally in FINALLY/EXCEPTION blocks without wrapping in a pg_is_wal_replay_paused() check.

The most important use of pg_wal_replay_resume() is in EXCEPTION blocks to guarantee replay is resumed even on error, preventing indefinite standby lag.

example

DO $ BEGIN PERFORM pg_wal_replay_pause(); -- analytics window PERFORM run_my_reports(); PERFORM pg_wal_replay_resume(); EXCEPTION WHEN OTHERS THEN PERFORM pg_wal_replay_resume(); RAISE; END $;

→Replay always resumed, even on failure

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

pg_walfile_name

PG 10+ (renamed from pg_xlogfile_name)→ text

Converts a WAL LSN (Log Sequence Number) to the corresponding WAL filename in the pg_wal directory. The filename encodes timeline, segment high, and segment low bytes.

DBA

Signature

pg_walfile_name ( lsn pg_lsn ) → text

Parameters

Parameter	Type	Description
lsn	pg_lsn	WAL location to convert to a filename

Examples

Current WAL file being written

sql

SELECT pg_walfile_name(pg_current_wal_lsn());

→000000010000000000000003

WAL filename for a specific LSN

sql

SELECT pg_walfile_name('0/5000000'::pg_lsn);

→000000010000000000000005

Filename and offset for backup stop LSN

sql

SELECT pg_walfile_name(lsn), pg_walfile_name_offset(lsn) FROM pg_backup_stop();

→(..., (filename, offset))

File and exact LSN offset together

sql

SELECT pg_walfile_name(pg_current_wal_lsn()), pg_current_wal_lsn();

→00000001000000000000000A | 0/A123456

Verify current WAL file exists in pg_wal directory

sql

SELECT pg_ls_waldir() WHERE name = pg_walfile_name(pg_current_wal_lsn());

→(row)

⚠Anti-Pattern— Hardcoding WAL filenames in scripts

WAL filenames change with every segment switch and restart. Scripts that hardcode WAL filenames become stale within seconds on active servers.

✓ Instead: Always derive WAL filenames dynamically: pg_walfile_name(pg_current_wal_lsn()) or pg_walfile_name(specific_lsn).

After pg_switch_wal(), check that the previous segment has been archived by looking for a .done or .ready file in pg_wal/archive_status/ matching the segment returned by pg_walfile_name().

example

SELECT pg_walfile_name(pg_current_wal_lsn() - 1::bigint::pg_lsn);  -- get previous segment name

→Name of the just-archived WAL segment

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

⚠

pg_dump is consistent but still impacts production — it holds a snapshot

pg_dump takes a consistent snapshot, but that snapshot prevents autovacuum from removing dead tuples — causing table bloat during long backups.

pg_walfile_name_offset

PG 10+→ record (file_name text, file_offset integer)

Converts a WAL LSN (Log Sequence Number) to the WAL filename and byte offset within that file. Useful for determining which WAL file corresponds to an LSN during backup and recovery operations.

DBA

Signature

pg_walfile_name_offset ( lsn pg_lsn ) → record (file_name text, file_offset integer)

Parameters

Parameter	Type	Description
lsn	pg_lsn	WAL Log Sequence Number to convert

Examples

sql

SELECT * FROM pg_walfile_name_offset(pg_current_wal_lsn());

→file_name: 000000010000000000000047 | file_offset: 524288

sql

SELECT file_name FROM pg_walfile_name_offset('0/4A2B3C4D');

→000000010000000000000004 — WAL file containing this LSN

sql

SELECT (pg_walfile_name_offset(lsn)).file_name AS wal_file FROM pg_backup_start('test-backup');

→WAL file where backup starts

sql

SELECT pg_walfile_name_offset(pg_current_wal_flush_lsn()) AS flush_position;

→file_name and offset for the current flush position

sql

SELECT file_name, round(file_offset::numeric / (16*1024*1024) * 100, 1) AS pct_full FROM pg_walfile_name_offset(pg_current_wal_lsn());

→Current WAL file name and how full it is (percentage)

⚠Anti-Pattern— Using pg_walfile_name_offset output on a standby for primary WAL locations

On a standby server, `pg_walfile_name_offset` converts LSNs relative to the standby's current timeline. If the standby has diverged from the primary's timeline after a failover, the WAL file names will differ from those on the primary.

✓ Instead: Run pg_walfile_name_offset on the server whose WAL files you are working with. Do not translate primary LSNs on a standby or vice versa after a timeline change.

Get the WAL file names for both the backup start and stop LSNs using `pg_walfile_name_offset`. All WAL files between those names (inclusive) must be archived or retained for the backup to be recoverable.

example

SELECT (pg_walfile_name_offset(start_lsn)).file_name AS start_wal, (pg_walfile_name_offset(end_lsn)).file_name AS end_wal FROM backup_info;

→Start and end WAL filenames defining the recovery window

Common Gotchas

⚠

Unlogged tables are wiped on crash or unclean shutdown

UNLOGGED tables skip WAL and are extremely fast, but ALL data is truncated on crash or unclean shutdown — silently.

⚠

pg_dump is consistent but still impacts production — it holds a snapshot

pg_dump takes a consistent snapshot, but that snapshot prevents autovacuum from removing dead tuples — causing table bloat during long backups.

set_config

PG 7.4+→ text

Sets a configuration parameter to the given value. If is_local is true, the setting reverts at the end of the transaction.

DBAReliability

Signature

set_config ( parameter_name text, new_value text, is_local boolean ) → text

Parameters

Parameter	Type	Description
parameter_name	text	GUC parameter name (same as SET command)
new_value	text	New value for the parameter
is_local	boolean	If true, only applies for current transaction (SET LOCAL)

Examples

sql

SELECT set_config('search_path', 'myschema, public', false);

→myschema, public

sql

SELECT set_config('app.current_user_id', '42', true);

→42 (reverts after transaction)

sql

SELECT set_config('timezone', 'America/New_York', false);

→America/New_York

Set session-level search_path

sql

SELECT set_config('search_path', 'myapp,public', false);

→'myapp,public'

⚠Anti-Pattern— Using set_config() for settings that should persist beyond the transaction

set_config(name, value, true) is transaction-local — it rolls back on ROLLBACK. set_config(name, value, false) is session-level but doesn't persist across reconnects.

✓ Instead: For permanent changes use ALTER SYSTEM SET or edit postgresql.conf + pg_reload_conf(). For per-role defaults use ALTER ROLE user SET parameter = value. Use set_config() only for session or transaction scoping.

Use custom GUC variables (with `app.` prefix) set via `set_config('app.current_user_id', id, true)` to pass application context to RLS policies and triggers — without changing the database role.

example

-- In application connection setup:
SELECT set_config('app.user_id', $1::text, false);
-- In RLS policy:
CREATE POLICY user_rows ON data USING (user_id = current_setting('app.user_id')::bigint);

→Multi-tenant RLS without changing database roles

PostgreSQL System Administration Functions

Jump to function

What are PostgreSQL System Administration Functions?

brin_summarize_new_values

Signature

Parameters

Examples

Common Gotchas

current_setting

Signatures

Parameters

Examples

gin_clean_pending_list

Signature

Parameters

Examples

Common Gotchas

pg_advisory_lock

Signatures

Parameters

Examples

pg_advisory_lock_shared

Signatures

Parameters

Examples

pg_advisory_unlock

Signatures

Parameters

Examples

pg_advisory_unlock_all

Signature

Examples

pg_advisory_xact_lock

Signatures

Parameters

Examples

Common Gotchas

pg_backup_start

Signature

Parameters

Examples

Common Gotchas

pg_backup_stop

Signature

Parameters

Examples

Common Gotchas

pg_cancel_backend

Signature

Parameters

Examples

pg_column_size

Signature

Parameters

Examples

pg_create_logical_replication_slot

Signature

Parameters

Examples

Common Gotchas

pg_drop_replication_slot

Signature

Parameters

Examples

pg_export_snapshot

Signature

Examples

Common Gotchas

pg_get_wal_replay_pause_state

Signature

Examples

pg_is_in_recovery

Signature

Examples

pg_is_wal_replay_paused

Signature

Examples

Common Gotchas

pg_logical_emit_message

Signature