character not in repertoire

Category: Data Exception

Versions: All Postgres versions

What this means

SQLSTATE 22021 is raised when a character in an input string is not valid in the current database encoding or the target character repertoire. This commonly occurs during encoding conversion or when client and server encodings are mismatched.

Why it happens

1. Sending data encoded in a different character set than the database expects
2. Inserting bytes that are not valid in the database encoding (e.g., invalid UTF-8 sequences)
3. A client_encoding setting that does not match the data, causing the conversion to fail

How to reproduce

Send a value containing bytes that are invalid in the current database encoding.

trigger — this will ERROR

-- In a UTF-8 database, a raw Latin-1 byte (0xE9, "é") decoded as UTF-8 without conversion
SELECT convert_from('\xe9'::bytea, 'UTF8');

ERROR: invalid byte sequence for encoding "UTF8": 0xe9
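
The byte named in the error message can be examined outside the database. The Python sketch below is an illustration only: is_valid_utf8 is a hypothetical helper, and Python's strict UTF-8 decoder is assumed to approximate the server's validation rather than reproduce it exactly.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "é" is one byte in Latin-1 but two bytes in UTF-8.
latin1_bytes = "é".encode("latin-1")  # b"\xe9"
utf8_bytes = "é".encode("utf-8")      # b"\xc3\xa9"

# 0xE9 on its own starts a multi-byte UTF-8 sequence that never completes,
# so the Latin-1 form is rejected while the UTF-8 form is accepted.
latin1_ok = is_valid_utf8(latin1_bytes)
utf8_ok = is_valid_utf8(utf8_bytes)
```

Running a check like this over an input file before loading it shows whether the data is already UTF-8 or needs one of the fixes that follow.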

Fix 1: Set client_encoding to match the actual encoding of the data

When the client sends data in an encoding other than the database's (here, Latin-1 into a UTF-8 database).

fix

SET client_encoding = 'LATIN1';

Why this works

Postgres converts data from client_encoding to the database encoding. Setting it correctly allows the server to perform the conversion instead of failing.
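
The conversion step can be modelled in a few lines of Python. This is a conceptual sketch, not the server's implementation: to_database_encoding is a hypothetical name, and the Python codec names "latin-1" and "utf-8" stand in for the Postgres encodings LATIN1 and UTF8.

```python
def to_database_encoding(
    data: bytes,
    client_encoding: str,
    database_encoding: str = "utf-8",
) -> bytes:
    """Model the conversion: decode with the client encoding, re-encode for the database."""
    text = data.decode(client_encoding)
    return text.encode(database_encoding)


# With the client encoding declared correctly, 0xE9 becomes the UTF-8 pair 0xC3 0xA9.
converted = to_database_encoding(b"caf\xe9", "latin-1")
```

If the declared client encoding is wrong, the decode step either fails or produces the wrong characters, which is why the setting has to describe the bytes the client actually sends.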

Fix 2: Clean invalid bytes before insert

When importing data from external sources with mixed encodings.

fix

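-- Assumes input_col is a bytea column holding the raw Latin-1 bytes
SELECT convert_from(input_col, 'LATIN1') FROM staging;

Why this works

convert_from decodes a bytea value from the named source encoding into the database encoding, so the raw bytes must be staged as bytea and the source encoding must be named correctly. Round-tripping a text column through convert_to and convert_from with the same encoding changes nothing, because a text value is already valid in the database encoding. The related convert(bytea, src_encoding, dest_encoding) function converts between two encodings. None of these functions strips or replaces invalid sequences; they raise an error instead, so bytes that are invalid in the source encoding have to be repaired before loading.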
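
One way to replace invalid sequences is to do it on the client before the data reaches the database. The sketch below is a minimal Python example under stated assumptions: replace_invalid_utf8 is a hypothetical helper, the input is meant to be UTF-8, and substituting U+FFFD (the Unicode replacement character) is acceptable for the data in question.

```python
from typing import Tuple


def replace_invalid_utf8(raw: bytes) -> Tuple[str, int]:
    """Decode raw as UTF-8, substituting U+FFFD for each invalid sequence.

    Returns the cleaned text and the number of U+FFFD characters in it.
    The count also includes any U+FFFD already present in valid input.
    """
    text = raw.decode("utf-8", errors="replace")
    return text, text.count("\ufffd")


# The stray Latin-1 byte 0xE9 is replaced; the rest of the value is kept.
cleaned, replaced = replace_invalid_utf8(b"caf\xe9 ok")
```

Replacement is lossy, so logging the rows with a non-zero count makes it possible to go back to the source and recover the original characters where they matter.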

What not to do

✗

Set client_encoding to SQL_ASCII to bypass conversion

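Why it's wrong: SQL_ASCII disables encoding conversion, so bytes are passed through uninterpreted. In a UTF-8 database the server still validates the incoming bytes, so the error remains. In a database whose own encoding is SQL_ASCII, arbitrary bytes are accepted and values in different encodings end up stored side by side with nothing to tell them apart, which silently corrupts data.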

Sources

📚 Official docs: https://www.postgresql.org/docs/current/errcodes-appendix.html

📚 Feature docs: https://www.postgresql.org/docs/current/multibyte.html

🔧 Source ref: Class 22 — Data Exception

Confidence assessment

✅ HIGH confidence

Standard SQLSTATE for encoding violations. Behaviour consistent across all Postgres versions.