🔍

PostgreSQL Full-Text Search Functions

Complete reference for PostgreSQL full-text search functions covering tsvector document vectors, tsquery query parsing, relevance ranking, and headline generation. Includes GIN index usage, text search configuration, and dictionary-based normalisation. Updated for PostgreSQL 16.

9 functions

What are PostgreSQL Full-Text Search Functions?

PostgreSQL full-text search functions convert documents and queries into tsvector and tsquery types for efficient lexeme-based searching. to_tsvector() normalises text into searchable lexemes, to_tsquery() parses a search query, ts_rank() provides relevance scoring, and ts_headline() generates highlighted result snippets. Full-text search with GIN indexes is significantly faster than ILIKE or regex for large text datasets.

to_tsvector

PG 8.3+tsvector

Converts a text document to a tsvector, normalizing words to their lexeme forms and removing stop words.

DeveloperData EngDBA

Signatures

to_tsvector ( document text ) → tsvector
to_tsvector ( config regconfig, document text ) → tsvector

Parameters

ParameterTypeDescription
configregconfigText search configuration (e.g., 'english', 'simple'). Defaults to the server default configuration.
documenttextText document to parse and normalize

Examples

sql
SELECT to_tsvector('english', 'The quick brown fox jumps over the lazy dog');
'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2
sql
SELECT to_tsvector('simple', 'Hello World');
'hello':1 'world':2
Anti-PatternComputing to_tsvector at query time instead of storing it

Calling `to_tsvector` inside a WHERE clause forces PostgreSQL to re-parse and normalize every row on every query, making full-table scans the only option. Store the vector in a generated column and index it instead.

✓ Instead: -- Bad: recomputed on every query SELECT * FROM articles WHERE to_tsvector('english', body) @@ plainto_tsquery('search'); -- Good: generated column + GIN index ALTER TABLE articles ADD COLUMN tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', coalesce(body,''))) STORED; CREATE INDEX articles_tsv_gin ON articles USING GIN(tsv); SELECT * FROM articles WHERE tsv @@ plainto_tsquery('english', 'search');

Pass the language explicitly: `to_tsvector('english', ...)`. The default config may not match your content language, leading to poor search quality. Use 'simple' to skip stemming for proper nouns or code.

example
ALTER TABLE articles ADD COLUMN search_vector tsvector GENERATED ALWAYS AS (to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''))) STORED;
Auto-maintained search vector column

to_tsquery

PG 8.3+tsquery

Converts a search query text to a tsquery, applying stemming and stop word removal. Input must use tsquery operators (&, |, !, <->).

DeveloperData EngDBA

Signatures

to_tsquery ( query text ) → tsquery
to_tsquery ( config regconfig, query text ) → tsquery

Parameters

ParameterTypeDescription
configregconfigText search configuration
querytextSearch query with operators: & (AND), | (OR), ! (NOT), <-> (phrase)

Examples

sql
SELECT to_tsquery('english', 'jumping & dogs');
'jump' & 'dog'
sql
SELECT to_tsquery('english', 'fat | cat');
'fat' | 'cat'
sql
SELECT to_tsquery('cat & !dog');
'cat' & !'dog'
Anti-PatternPassing raw user input to to_tsquery

`to_tsquery` requires strict tsquery syntax — a bare space or unsupported character raises an error at runtime. Never pass unvalidated user input to it. Use `websearch_to_tsquery` or `plainto_tsquery` for user-facing search.

✓ Instead: -- Bad: user types 'dogs running' → ERROR: syntax error in tsquery SELECT * FROM articles WHERE tsv @@ to_tsquery('english', user_input); -- Good: safe for arbitrary user input SELECT * FROM articles WHERE tsv @@ websearch_to_tsquery('english', user_input);

`to_tsquery` requires proper tsquery syntax — passing raw user input will error on spaces and special chars. Use `plainto_tsquery` (implicit AND) or `websearch_to_tsquery` (Google-like syntax) for user input.

example
SELECT * FROM articles WHERE search_vector @@ websearch_to_tsquery('english', 'running fast dogs');
Articles about running, fast, and dogs

plainto_tsquery

PG 8.3+tsquery

Converts plain text to a tsquery, treating all words as an implicit AND. Safe for user-supplied input — no special syntax needed.

DeveloperData EngDBA

Signatures

plainto_tsquery ( query text ) → tsquery
plainto_tsquery ( config regconfig, query text ) → tsquery

Parameters

ParameterTypeDescription
configregconfigText search configuration
querytextPlain text search terms (spaces = AND)

Examples

sql
SELECT plainto_tsquery('english', 'quick brown fox');
'quick' & 'brown' & 'fox'
sql
SELECT * FROM docs WHERE body_tsv @@ plainto_tsquery('english', $1);
Full text search from user input
Anti-PatternUsing plainto_tsquery when phrase or OR search is needed

`plainto_tsquery` always treats all words as AND — users cannot express 'cat OR dog' or exact phrases. Offer `websearch_to_tsquery` for search boxes where users expect Google-style syntax.

✓ Instead: -- Bad: user types 'cat OR dog' → treated as AND, finds nothing with all three words SELECT * FROM docs WHERE tsv @@ plainto_tsquery('english', 'cat OR dog'); -- Parsed as: 'cat' & 'or' & 'dog' (or just 'cat' & 'dog' if 'or' is stop word) -- Good: respects OR keyword SELECT * FROM docs WHERE tsv @@ websearch_to_tsquery('english', 'cat OR dog');

`plainto_tsquery` is ideal for basic search boxes where users type words separated by spaces. For more advanced syntax (quoted phrases, minus for exclusion), use `websearch_to_tsquery`.

example
SELECT title, ts_rank(search_vector, q) AS rank FROM articles, plainto_tsquery('english', $1) AS q WHERE search_vector @@ q ORDER BY rank DESC LIMIT 10;
Top 10 ranked search results

websearch_to_tsquery

PG 11+tsquery

Converts a Google-like search string to a tsquery. Supports quoted phrases, unquoted AND, OR, and minus for NOT.

DeveloperData EngDBA

Signatures

websearch_to_tsquery ( query text ) → tsquery
websearch_to_tsquery ( config regconfig, query text ) → tsquery

Parameters

ParameterTypeDescription
configregconfigText search configuration
querytextSearch query with web-search syntax: OR for OR, -word for NOT, "phrase" for phrase

Examples

sql
SELECT websearch_to_tsquery('english', 'quick brown fox');
'quick' & 'brown' & 'fox'
sql
SELECT websearch_to_tsquery('english', 'cat OR dog');
'cat' | 'dog'
sql
SELECT websearch_to_tsquery('english', '"quick fox"');
'quick' <-> 'fox'
Anti-PatternUsing to_tsquery instead of websearch_to_tsquery for user-facing search inputs

`to_tsquery` raises an error on plain user input (spaces, special characters). For any search box that accepts free-form text, `websearch_to_tsquery` is the safe, correct choice — it never throws on unexpected input.

✓ Instead: -- Bad: crashes if user types 'quick brown fox' (space not valid in to_tsquery) SELECT * FROM articles WHERE tsv @@ to_tsquery('english', user_input); -- Good: handles any user input gracefully SELECT * FROM articles WHERE tsv @@ websearch_to_tsquery('english', user_input);

`websearch_to_tsquery` is the most user-friendly query parser — it handles common web search idioms without the strict syntax requirements of `to_tsquery`. Never errors on invalid input; just ignores unsupported tokens.

example
SELECT title FROM posts WHERE tsv @@ websearch_to_tsquery('english', user_input) ORDER BY ts_rank(tsv, websearch_to_tsquery('english', user_input)) DESC LIMIT 20;
Ranked search results from user-friendly input

ts_rank

PG 8.3+float4

Calculates a relevance score for a tsvector vs a tsquery. ts_rank_cd uses cover density, which rewards compact term clustering.

DeveloperData EngDBA

Signatures

ts_rank ( [weights float4[], ] tsvector, tsquery [, normalization integer] ) → float4
ts_rank_cd ( [weights float4[], ] tsvector, tsquery [, normalization integer] ) → float4

Parameters

ParameterTypeDescription
weightsfloat4[]Optional weights for D, C, B, A label classes [D, C, B, A]
tsvectortsvectorDocument vector
tsquerytsquerySearch query
normalizationintegerBitmask controlling length normalization (0=no norm, 1=doc length, 2=unique words, etc.)

Examples

sql
SELECT ts_rank(to_tsvector('english', body), to_tsquery('english', 'cat')) FROM docs ORDER BY ts_rank DESC;
Ranked results by relevance
sql
SELECT ts_rank_cd(search_vector, query) AS rank FROM articles, websearch_to_tsquery('english', 'database index') AS query WHERE search_vector @@ query ORDER BY rank DESC;
Cover-density ranked results
Anti-PatternCalling ts_rank without a WHERE @@ clause — ranking every row

`ts_rank` does not filter rows — it only scores them. Without a `WHERE tsv @@ query` clause, PostgreSQL computes a rank for every row in the table (including non-matching rows), then sorts all of them. Always pair `ts_rank` with a `@@` filter.

✓ Instead: -- Bad: scores and sorts every row in the table SELECT title, ts_rank(tsv, to_tsquery('english', 'database')) AS rank FROM articles ORDER BY rank DESC; -- Good: filter first, then rank SELECT title, ts_rank(tsv, q) AS rank FROM articles, to_tsquery('english', 'database') AS q WHERE tsv @@ q ORDER BY rank DESC LIMIT 20;

Without normalization, longer documents rank higher just because they contain more words. Use `ts_rank(vec, query, 1)` (divide by document length) or `ts_rank(vec, query, 2)` (divide by unique lexemes) to normalize.

example
SELECT title, ts_rank(tsv, q, 2) AS rank FROM articles, to_tsquery('english', 'postgresql') AS q WHERE tsv @@ q ORDER BY rank DESC LIMIT 5;
Top 5 most relevant articles, normalized by unique words

ts_headline

PG 8.3+text

Generates a highlighted summary of a document showing search term matches in context (like a search snippet).

DeveloperData EngDBA

Signatures

ts_headline ( document text, query tsquery [, options text] ) → text
ts_headline ( config regconfig, document text, query tsquery [, options text] ) → text

Parameters

ParameterTypeDescription
configregconfigText search configuration
documenttextOriginal document text
querytsqueryThe search query
optionstextComma-separated options: StartSel, StopSel, MaxWords, MinWords, ShortWord, HighlightAll, MaxFragments, FragmentDelimiter

Examples

sql
SELECT ts_headline('english', body, to_tsquery('english', 'database'), 'StartSel=<b>, StopSel=</b>') FROM docs WHERE id = 1;
"...uses a <b>database</b> to store..."
sql
SELECT ts_headline('english', content, q, 'MaxFragments=2, FragmentDelimiter=" ... "') FROM articles, to_tsquery('english', 'search index') q WHERE tsv @@ q;
Search snippet with highlights
Anti-PatternRunning ts_headline on the full document body without limiting result set first

`ts_headline` is CPU-intensive — it re-parses and tokenises the raw document text. Applying it to thousands of rows returned before a LIMIT multiplies the cost. Always paginate or LIMIT before calling `ts_headline`.

✓ Instead: -- Bad: ts_headline runs on every matching row before LIMIT SELECT ts_headline('english', body, q) FROM articles, ... WHERE tsv @@ q; -- Good: limit first using a CTE or subquery, then generate headlines WITH ranked AS ( SELECT id, ts_rank(tsv, q) AS rank FROM articles, websearch_to_tsquery('english', $1) AS q WHERE tsv @@ q ORDER BY rank DESC LIMIT 10 ) SELECT a.title, ts_headline('english', a.body, websearch_to_tsquery('english', $1), 'StartSel=<mark>, StopSel=</mark>, MaxFragments=2') FROM articles a JOIN ranked r ON a.id = r.id ORDER BY r.rank DESC;

Set `MaxFragments=2` or more to get multiple context fragments like Google search snippets. Use `FragmentDelimiter` to set the separator (e.g., ' ... '). Always wrap match terms in HTML tags for display.

example
SELECT ts_headline('english', body, q, 'StartSel=<mark>, StopSel=</mark>, MaxFragments=3') FROM articles, plainto_tsquery('english', $1) AS q WHERE tsv @@ q;
HTML-ready search snippets with highlighted terms

tsvector || tsvector

PG 8.3+tsvector

Concatenates two tsvectors, combining their lexemes and positions. Useful for building vectors from multiple columns.

DeveloperData EngDBA

Signature

tsvector || tsvector → tsvector

Parameters

ParameterTypeDescription
tsvector1tsvectorFirst tsvector
tsvector2tsvectorSecond tsvector to concatenate

Examples

sql
SELECT to_tsvector('fat cat') || to_tsvector('sat on a mat');
'cat':2 'fat':1 'mat':6 'sat':3
sql
SELECT setweight(to_tsvector('english', title), 'A') || setweight(to_tsvector('english', body), 'B') FROM articles;
Title words have higher weight than body
Anti-PatternConcatenating tsvectors at query time instead of storing in a generated column

Building `setweight(...) || setweight(...)` inside the WHERE clause forces every concatenation to happen at query time on every row, preventing GIN index use. Store the pre-computed concatenation in a generated column.

✓ Instead: -- Bad: concatenation at query time — full table scan, no index benefit SELECT * FROM articles WHERE (setweight(to_tsvector('english', title), 'A') || setweight(to_tsvector('english', body), 'B')) @@ plainto_tsquery('english', $1); -- Good: store and index the concatenated vector ALTER TABLE articles ADD COLUMN tsv tsvector GENERATED ALWAYS AS ( setweight(to_tsvector('english', coalesce(title,'')), 'A') || setweight(to_tsvector('english', coalesce(body,'')), 'B') ) STORED; CREATE INDEX articles_tsv_gin ON articles USING GIN(tsv);

Use `setweight(to_tsvector(title), 'A') || setweight(to_tsvector(body), 'B')` to build a weighted vector. ts_rank will score title matches higher than body matches automatically.

example
SELECT setweight(to_tsvector('english', coalesce(title,'')), 'A') || setweight(to_tsvector('english', coalesce(description,'')), 'B') AS tsv FROM products;
Weighted tsvector: title terms outrank description terms

setweight

PG 8.3+tsvector

Assigns a weight ('A', 'B', 'C', or 'D') to each lexeme in a tsvector. Used to give different importance to different parts of a document.

DeveloperData EngDBA

Signature

setweight ( tsvector, "char" ) → tsvector

Parameters

ParameterTypeDescription
tsvectortsvectorVector to assign weights to
weight_label"char"'A' (highest) through 'D' (lowest, default)

Examples

sql
SELECT setweight(to_tsvector('english', 'hot chocolate'), 'A');
'chocol':2A 'hot':1A
sql
SELECT setweight(to_tsvector(title), 'A') || setweight(to_tsvector(tags_str), 'B') || setweight(to_tsvector(body), 'C') FROM posts;
Multi-field weighted vector
Anti-PatternApplying setweight after concatenation — weights are lost

Calling `setweight(tsvector1 || tsvector2, 'A')` applies the weight to the already-concatenated result, overwriting any individual weights. Apply `setweight` to each part before concatenating.

✓ Instead: -- Bad: sets 'A' on everything — individual weights lost SELECT setweight(to_tsvector('english', title) || to_tsvector('english', body), 'A'); -- Good: weight each part independently, then concatenate SELECT setweight(to_tsvector('english', title), 'A') || setweight(to_tsvector('english', body), 'B');

For multi-column search with weights, compute once in a generated column: `GENERATED ALWAYS AS (setweight(to_tsvector(title), 'A') || setweight(to_tsvector(body), 'B')) STORED`. Index the generated column with GIN.

example
CREATE INDEX ON articles USING GIN(search_vector);
-- Generated column computed above
SELECT * FROM articles WHERE search_vector @@ q ORDER BY ts_rank(search_vector, q) DESC;
Indexed weighted full-text search

phraseto_tsquery

PG 8.3+tsquery

Converts text to a tsquery that requires the words to appear in sequence (phrase search) using the <-> operator.

DeveloperData EngDBA

Signatures

phraseto_tsquery ( query text ) → tsquery
phraseto_tsquery ( config regconfig, query text ) → tsquery

Parameters

ParameterTypeDescription
configregconfigText search configuration
querytextPhrase to search for

Examples

sql
SELECT phraseto_tsquery('english', 'quick brown fox');
'quick' <-> 'brown' <-> 'fox'
sql
SELECT * FROM articles WHERE tsv @@ phraseto_tsquery('english', 'machine learning');
Articles with 'machine' immediately followed by 'learning'
Anti-PatternUsing plainto_tsquery when phrase order matters

`plainto_tsquery('english', 'machine learning')` produces `'machin' & 'learn'` — it matches documents with both words anywhere, not necessarily adjacent. Use `phraseto_tsquery` when the exact sequence of words is required.

✓ Instead: -- Bad: matches docs with 'machine' and 'learning' anywhere, in any order SELECT * FROM docs WHERE tsv @@ plainto_tsquery('english', 'machine learning'); -- Good: requires adjacent sequence SELECT * FROM docs WHERE tsv @@ phraseto_tsquery('english', 'machine learning');

Use `phraseto_tsquery` when the user is searching for a specific phrase rather than individual words. This ensures 'machine learning' matches the compound term, not documents with 'machine' and 'learning' far apart.

example
SELECT * FROM docs WHERE tsv @@ phraseto_tsquery('english', 'primary key constraint');
Documents with the exact phrase 'primary key constraint'