From MySQL to PostgreSQL: A Migration Survival Guide
Moving from MySQL to PostgreSQL is easier than moving the other way. That's the good news. The bad news is "easier" still means a week of fixing things that worked fine yesterday, and a few of them will catch you completely off-guard.
This isn't about migration tooling — pgloader, AWS DMS, and similar tools handle the data transport. This is about the SQL and behaviour differences that break your application code after the data arrives.
Syntax changes that will break your queries
AUTO_INCREMENT → SERIAL or IDENTITY
-- MySQL
CREATE TABLE users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(255)
);
-- PostgreSQL (preferred, SQL standard)
CREATE TABLE users (
    id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email TEXT
);
SERIAL is PostgreSQL-specific shorthand that still works. GENERATED ALWAYS AS IDENTITY is the SQL:2003 standard syntax preferred for new tables.
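One consequence worth knowing before your first data load: GENERATED ALWAYS rejects explicit values for the id column, which breaks code (and dump replays) that insert ids directly. A sketch of the workaround, with the setval resync hedged as an assumption about how you loaded the data:

```sql
-- GENERATED ALWAYS rejects an explicit id...
INSERT INTO users (id, email) VALUES (42, 'a@example.com');  -- ERROR
-- ...unless you override it explicitly (useful when replaying a MySQL dump)
INSERT INTO users (id, email) OVERRIDING SYSTEM VALUE VALUES (42, 'a@example.com');
-- after a manual load, resync the backing sequence so new inserts don't collide
SELECT setval(pg_get_serial_sequence('users', 'id'), (SELECT MAX(id) FROM users));
```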
LIMIT syntax
-- MySQL: LIMIT offset, count (order is offset first)
SELECT * FROM products LIMIT 20, 10; -- skip 20, get 10
-- PostgreSQL
SELECT * FROM products LIMIT 10 OFFSET 20;
MySQL's LIMIT x, y form is a syntax error in PostgreSQL, so that breakage is loud. The quiet trap is argument order: MySQL puts the offset first, while PostgreSQL's LIMIT takes the count. Code that generates these clauses dynamically and swaps the keywords but not the arguments returns the wrong page with no error.
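PostgreSQL also accepts the SQL-standard form, which sidesteps the argument-order trap because each number is labelled. Note the ORDER BY: without it, neither database guarantees which rows a page contains.

```sql
SELECT * FROM products
ORDER BY id
OFFSET 20 ROWS FETCH FIRST 10 ROWS ONLY;
```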
Backtick quoting → double quotes
-- MySQL: backticks for identifiers
SELECT `order`, `group` FROM `user`;
-- PostgreSQL: double quotes
SELECT "order", "group" FROM "user";
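A related gotcha: PostgreSQL folds unquoted identifiers to lowercase, whereas MySQL's case handling depends on the platform. If your migration tool creates mixed-case names with quotes, every later query must quote them too:

```sql
CREATE TABLE "Users" (id INT);
SELECT * FROM Users;   -- ERROR: relation "users" does not exist
SELECT * FROM "Users"; -- works
```

The least painful fix is to create everything lowercase and never quote identifiers at all.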
IFNULL → COALESCE
SELECT COALESCE(shipping_address, billing_address) FROM orders;
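Unlike two-argument IFNULL, COALESCE accepts any number of fallbacks and returns the first non-null one (the literal default here is illustrative):

```sql
SELECT COALESCE(shipping_address, billing_address, 'no address on file')
FROM orders;
```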
GROUP_CONCAT → string_agg
-- MySQL
SELECT project_id,
       GROUP_CONCAT(tag ORDER BY tag SEPARATOR ', ')
FROM tags
GROUP BY project_id;
-- PostgreSQL
SELECT project_id,
       string_agg(tag, ', ' ORDER BY tag)
FROM tags
GROUP BY project_id;
ON DUPLICATE KEY UPDATE → ON CONFLICT DO UPDATE
-- MySQL
INSERT INTO page_views (url, views) VALUES ($1, 1)
ON DUPLICATE KEY UPDATE views = views + 1;
-- PostgreSQL
INSERT INTO page_views (url, views) VALUES ($1, 1)
ON CONFLICT (url) DO UPDATE SET views = page_views.views + 1;
PostgreSQL's version is more explicit: you name the column (or constraint) that can conflict. When the update needs values from the row you tried to insert, reference them through the EXCLUDED pseudo-table (e.g. EXCLUDED.views).
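A sketch of EXCLUDED in action, reusing the page_views table from the example above. Here the incoming row's value, not a constant, drives the update:

```sql
INSERT INTO page_views (url, views) VALUES ($1, $2)
ON CONFLICT (url) DO UPDATE
SET views = page_views.views + EXCLUDED.views;
```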
Behaviour changes that will break your logic
These don't produce errors. They quietly give wrong results.
Case sensitivity in string comparisons
MySQL's default collations are case-insensitive (the _ci suffix), so 'Alice' = 'alice' is true. PostgreSQL string comparisons are case-sensitive by default. Queries that matched rows on MySQL may return different row counts in PostgreSQL with no error. Normalise case at write time, compare with lower() on both sides, use ILIKE for patterns, or switch the column to citext.
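The options, from least to most invasive (citext is an extension bundled with PostgreSQL; the table and column names are illustrative):

```sql
-- per-query: compare case-insensitively
-- (an index on lower(email) keeps this fast: CREATE INDEX ON users (lower(email));)
SELECT * FROM users WHERE lower(email) = lower($1);

-- pattern matching: ILIKE is case-insensitive LIKE
SELECT * FROM users WHERE email ILIKE '%@example.com';

-- or make the column itself case-insensitive
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE users ALTER COLUMN email TYPE citext;
```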
GROUP BY strictness
This took a team I was working with a full day to track down. With ONLY_FULL_GROUP_BY disabled (it has been on by default since MySQL 5.7.5, but plenty of installations turn it off), MySQL lets you GROUP BY a subset of columns and select non-aggregated columns that aren't in the GROUP BY, picking arbitrary values from each group. PostgreSQL never does.
-- MySQL: sometimes works (picks arbitrary values from the group)
SELECT id, name, department, MAX(salary)
FROM employees
GROUP BY department;
-- PostgreSQL: ERROR
-- column "employees.id" must appear in the GROUP BY clause
-- or be used in an aggregate function
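If the MySQL query actually meant "one representative row per department" (say, the top earner), PostgreSQL has an explicit idiom for that instead of arbitrary picks:

```sql
-- one row per department: the highest-paid employee
SELECT DISTINCT ON (department) id, name, department, salary
FROM employees
ORDER BY department, salary DESC;
```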
Transaction behaviour on error
PostgreSQL aborts the entire transaction when an error occurs. After any error in a transaction, every subsequent statement fails with ERROR: current transaction is aborted, commands ignored until end of transaction block.
Code that relies on "if one insert fails, skip it and continue" needs to be rewritten to use savepoints:
BEGIN;
SAVEPOINT before_insert;
INSERT INTO orders VALUES (1, 'alice');  -- suppose this statement fails
ROLLBACK TO SAVEPOINT before_insert;     -- undoes only the failed statement
INSERT INTO orders VALUES (2, 'bob');    -- transaction is still usable
COMMIT;                                  -- bob's row is committed
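When the only error you're skipping is a duplicate key, you don't need savepoints at all (this assumes id carries the unique constraint):

```sql
INSERT INTO orders (id, customer) VALUES (1, 'alice')
ON CONFLICT (id) DO NOTHING;
```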
Empty string vs NULL
PostgreSQL treats '' as a valid non-null value, never as NULL. If your MySQL data uses '' and NULL interchangeably to mean "no value" (easy to end up with under MySQL's non-strict SQL modes), code that queries WHERE col IS NULL to find "empty" values will silently miss the empty strings.
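A sketch of a one-time cleanup, assuming you want NULL to be the single "no value" representation (table and column names illustrative):

```sql
-- normalise existing data
UPDATE customers SET phone = NULL WHERE phone = '';
-- keep it that way (NULL still passes; only '' is rejected)
ALTER TABLE customers ADD CONSTRAINT phone_not_empty CHECK (phone <> '');
```

The non-destructive alternative is NULLIF(phone, '') at read time, but fixing the data once is usually cleaner.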
Schema migration specifics
| MySQL type | PostgreSQL equivalent | Notes |
|---|---|---|
| VARCHAR(n) | TEXT or VARCHAR(n) | PG's TEXT has no performance penalty over VARCHAR |
| TINYINT(1) | BOOLEAN | MySQL uses TINYINT(1) for booleans |
| DATETIME | TIMESTAMPTZ | Use TIMESTAMPTZ, not plain TIMESTAMP |
| JSON | JSONB | JSONB is binary-stored and indexable; always prefer it |
| BLOB | BYTEA | Binary data |
| LONGTEXT | TEXT | PG TEXT has no length limit |
| AUTO_INCREMENT | GENERATED ALWAYS AS IDENTITY | Or SERIAL (legacy) |
| ENUM('a','b') | TEXT with CHECK constraint | PG enums are harder to alter later |
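Putting the table above to work, a representative MySQL definition and its translation (table and column names illustrative):

```sql
-- MySQL
CREATE TABLE events (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    payload JSON,
    created DATETIME,
    is_read TINYINT(1)
);
-- PostgreSQL
CREATE TABLE events (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    payload JSONB,
    created TIMESTAMPTZ,
    is_read BOOLEAN
);
```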
One important MySQL-specific note: tables using the MyISAM storage engine don't enforce foreign keys. If your MySQL tables were MyISAM, you may have orphaned rows that will cause foreign key violations when you create the constraints in PostgreSQL. Clean them up first:
-- Find orphaned order_items with no matching order
SELECT oi.id FROM order_items oi
LEFT JOIN orders o ON o.id = oi.order_id
WHERE o.id IS NULL;
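Once you've reviewed the orphans, the same anti-join deletes them. The NOT EXISTS form below works whether you clean up on the MySQL side before export or on the PostgreSQL side before adding the constraint:

```sql
DELETE FROM order_items
WHERE NOT EXISTS (
    SELECT 1 FROM orders WHERE orders.id = order_items.order_id
);
```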
The things that are genuinely better in PostgreSQL
CTEs that actually optimise
Before PostgreSQL 12, CTEs were optimisation fences: each WITH clause was evaluated in full, in isolation. Since PG 12, the planner can inline CTEs and optimise them like subqueries, which often makes complex multi-CTE queries substantially faster. If you actually relied on the fence, you can still request it with AS MATERIALIZED.
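For example, to force one-time evaluation of an expensive aggregation that's referenced later (query shape illustrative):

```sql
WITH lifetime_totals AS MATERIALIZED (
    SELECT customer_id, sum(total) AS lifetime
    FROM orders
    GROUP BY customer_id
)
SELECT * FROM lifetime_totals WHERE lifetime > 1000;
```

Without MATERIALIZED, PG 12+ is free to inline the CTE and push the filter down, which is usually what you want anyway.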
Window functions from the start
MySQL only got window functions in version 8.0 (2018). If your MySQL installation is older, you've been writing ugly self-joins for things PostgreSQL handles cleanly.
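For example, a per-customer running total that needs a correlated self-join on pre-8.0 MySQL is one clause in PostgreSQL (column names illustrative):

```sql
SELECT customer_id, placed_at, total,
       SUM(total) OVER (PARTITION BY customer_id ORDER BY placed_at) AS running_total
FROM orders;
```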
JSONB
MySQL's JSON type is also stored in an internal binary format (since 5.7), but indexing its contents means creating generated columns first. PostgreSQL's JSONB can be indexed directly with a GIN index, and the operator set (@>, ?, jsonpath) is richer. It isn't close.
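A sketch of what that buys you (table, column, and keys illustrative):

```sql
CREATE INDEX idx_events_payload ON events USING GIN (payload);
-- containment query, served by the GIN index
SELECT * FROM events WHERE payload @> '{"type": "signup"}';
```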
EXPLAIN ANALYZE with BUFFERS
EXPLAIN (ANALYZE, BUFFERS) shows the plan, actual execution time, estimated vs actual row counts, and buffer hit/miss statistics. Debugging performance in PostgreSQL is genuinely easier.
Check the pgref.dev MariaDB pages if you're migrating from MariaDB specifically — there are additional divergences worth knowing about.