The PostgreSQL Functions I Actually Use Every Day
After building pgref.dev and cataloguing over 400 PostgreSQL functions, most days I use about 12 of them. The Pareto principle applies brutally to SQL. The rest exist for edge cases, for niche domains, for the one person at a financial institution who genuinely needs width_bucket(). The functions below are the ones I reach for without thinking.
coalesce()
I reach for coalesce() every time I have a nullable column that should display something sensible. That's most nullable columns.
SELECT
order_id,
coalesce(shipping_name, billing_name, 'Unknown Customer') AS display_name
FROM orders;
For years I wrote CASE WHEN shipping_name IS NULL THEN billing_name ELSE shipping_name END. That's ten tokens to do what coalesce does in one call. The moment I made the switch I never went back.
The gotcha: coalesce evaluates its arguments left to right and stops at the first non-null. That is short-circuit evaluation, but only from left to right. If you write coalesce(cheap_column, expensive_function()), the function runs every time cheap_column is null — a fallback branch you assume is dead comes alive the moment the data changes. I learned this the hard way when a coalesce branch that "never runs" started running under load and took a query from 10ms to 800ms. Put the cheap arguments first and the expensive fallbacks last.
nullif()
The most underrated function in PostgreSQL. I use nullif(a, b) constantly for avoiding divide-by-zero errors and normalising empty strings to NULL.
-- Avoid divide-by-zero
SELECT
total_revenue / nullif(total_orders, 0) AS avg_order_value
FROM monthly_summary;
-- Treat empty string the same as NULL
SELECT nullif(trim(notes), '') AS clean_notes FROM tickets;
nullif(a, b) returns NULL if a = b, otherwise returns a. It solves an annoying class of problems cleanly.
concat_ws()
The || concatenation operator burned me twice before I switched to concat_ws(). The issue: || propagates NULL. If any operand is NULL, the whole expression is NULL.
-- This returns NULL if middle_name is NULL
SELECT first_name || ' ' || middle_name || ' ' || last_name FROM users;
-- This skips NULL components gracefully
SELECT concat_ws(' ', first_name, middle_name, last_name) FROM users;
concat_ws takes a separator as the first argument, then any number of values, and skips NULLs entirely. For name assembly, address formatting, tag concatenation — anything where components are optional — it's exactly what you want.
date_trunc()
date_trunc() is how I group time-series data. You pass a precision string and a timestamp; it zeroes out everything below that precision.
SELECT
date_trunc('month', created_at) AS month,
count(*) AS signups
FROM users
GROUP BY 1
ORDER BY 1;
The trap I hit: date_trunc respects timezones only if you tell it which one. On a server set to UTC, date_trunc('day', created_at) gives you UTC days. If your users are in Sydney, that's wrong by ten hours. The fix:
date_trunc('day', created_at AT TIME ZONE 'Australia/Sydney')
I've debugged two separate "the daily report is off by a day" incidents that turned out to be this exact problem.
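On PostgreSQL 14 and later there's a tidier option: date_trunc accepts a time zone as an optional third argument, which does the same conversion without the AT TIME ZONE dance. A sketch using the signups query from above, assuming created_at is a timestamptz:

```sql
-- PostgreSQL 14+: pass the zone directly to date_trunc
SELECT
  date_trunc('day', created_at, 'Australia/Sydney') AS local_day,
  count(*) AS signups
FROM users
GROUP BY 1
ORDER BY 1;
```

The three-argument form only exists for timestamptz input; on older versions, or for plain timestamp columns, the AT TIME ZONE rewrite above is still the way.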
generate_series()
generate_series() generates rows. That sounds trivial until you need to fill gaps in date-grouped data.
-- Revenue by day, with zeros for days with no orders
SELECT
d::date AS day,
coalesce(sum(o.total), 0) AS revenue
FROM generate_series(
'2024-01-01'::timestamp,
'2024-01-31'::timestamp,
'1 day'
) AS d
LEFT JOIN orders o ON o.created_at::date = d::date
GROUP BY d
ORDER BY d;
Without generate_series, a GROUP BY on the orders table skips days with no orders entirely. The reporting client sees the chart with missing dates and files a bug. I've done this LEFT JOIN dance on every reporting project I've worked on.
row_number() vs rank()
I still get these wrong occasionally. row_number() gives every row a unique sequential integer regardless of ties. rank() gives tied rows the same number and then skips.
-- Get the single most recent order per customer
SELECT * FROM (
SELECT
*,
row_number() OVER (PARTITION BY customer_id ORDER BY created_at DESC) AS rn
FROM orders
) sub
WHERE rn = 1;
If I use rank() here instead of row_number(), and a customer has two orders at the exact same timestamp, both get rank = 1 and I get two rows back. For deduplication, always use row_number().
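The difference is easiest to see side by side on a throwaway inline table with a tie — no real table needed:

```sql
SELECT
  score,
  row_number() OVER (ORDER BY score DESC) AS rn,
  rank()       OVER (ORDER BY score DESC) AS rnk
FROM (VALUES (90), (90), (80)) AS t(score);
-- row_number: 1, 2, 3 — every row gets a unique integer
-- rank:       1, 1, 3 — the tie shares 1, then skips to 3
```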
string_agg()
string_agg() concatenates values across rows. The part that took me a while to find: you can ORDER BY inside the aggregate.
SELECT
project_id,
string_agg(tag_name, ', ' ORDER BY tag_name) AS tags
FROM project_tags
GROUP BY project_id;
Without the ORDER BY, the order is arbitrary and changes between queries. On a project where a test was checking the exact output of a tags column, this caused intermittent failures for a week before I found it.
regexp_replace()
regexp_replace() is how I clean data from external sources — phone numbers, reference codes, user input.
-- Strip non-digit characters from phone numbers
SELECT regexp_replace(phone, '[^0-9]', '', 'g') FROM contacts;
The 'g' flag means "replace all matches." Without it, only the first match is replaced. Some regex APIs — Python's re.sub, for one — replace every match by default. PostgreSQL doesn't. I forget this about twice a year.
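A minimal demonstration on a string literal, no table required:

```sql
SELECT regexp_replace('a-b-c', '-', '+');      -- 'a+b-c'  (first match only)
SELECT regexp_replace('a-b-c', '-', '+', 'g'); -- 'a+b+c'  (every match)
```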
jsonb_build_object() and jsonb_set()
I reach for jsonb_build_object() whenever I'm assembling API responses or audit log entries inside a query.
SELECT
jsonb_build_object(
'user_id', u.id,
'email', u.email,
'plan', s.plan_name,
'expires_at', s.expires_at
) AS user_summary
FROM users u
JOIN subscriptions s ON s.user_id = u.id;
The thing I stopped doing: using -> and ->> in WHERE clauses on large tables without a GIN index. I had a query filtering on metadata->>'source' = 'api' over 2 million rows that was doing a sequential scan. Adding a generated column fixed it in five minutes.
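The generated-column fix looks roughly like this — table and column names are illustrative, not from a real schema, and STORED generated columns need PostgreSQL 12 or later:

```sql
-- Materialise the JSON key as a plain column, then index it
ALTER TABLE events
  ADD COLUMN source text GENERATED ALWAYS AS (metadata->>'source') STORED;

CREATE INDEX events_source_idx ON events (source);

-- Filters on the new column can use an ordinary b-tree index:
SELECT count(*) FROM events WHERE source = 'api';
```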
left() and right()
left(str, n) and right(str, n) are small but I use them constantly for truncation and prefix extraction.
-- Truncate long descriptions for display
SELECT left(description, 150) || '…' AS preview FROM articles;
-- Extract file extension
SELECT right(filename, 4) AS ext FROM uploads;
The functions I thought I'd use constantly but almost never do
When I started building pgref.dev I assumed geometric functions would come up regularly — point, circle, polygon, the distance operators. They haven't. Not once in a production query. Same with the XML functions. array_agg I expected to use daily. I use it maybe once a month. Most of the time string_agg or a JSON approach fits better.
If I lost every page on pgref.dev except one, I'd keep the Conditional Functions page. coalesce and nullif together solve more problems than any other pair of functions I know.
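To show why: combined, they handle the "NULL or blank" case in a single expression. A sketch against the users table from earlier — display_name is a hypothetical column here:

```sql
-- Fall back to email when display_name is NULL *or* empty after trimming
SELECT coalesce(nullif(trim(display_name), ''), email) AS shown_name
FROM users;
```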