Complete reference for MariaDB aggregate functions covering COUNT, SUM, AVG, MAX, MIN, GROUP_CONCAT, JSON_ARRAYAGG, STDDEV, VARIANCE, and bit aggregates. Every function includes syntax, GROUP BY examples, and comparisons with MySQL and PostgreSQL aggregate behaviour. Updated for MariaDB 11.x.
MariaDB aggregate functions compute a single result from multiple rows and are used with GROUP BY clauses. GROUP_CONCAT() is one of MariaDB's most distinctive aggregates — it concatenates grouped values into a comma-separated string and supports ORDER BY within the aggregate and a SEPARATOR clause. MariaDB also supports JSON_ARRAYAGG() and JSON_OBJECTAGG() for building JSON directly from queries, mirroring PostgreSQL's json_agg() and json_object_agg().
COUNT
MariaDB 5.5+→ BIGINT
Returns the number of rows matching the query, or the number of non-NULL values in a column.
Signatures
COUNT(*)
COUNT([DISTINCT] expr)
Parameters
Parameter
Type
Description
expr
ANY
Column or expression to count; * counts all rows regardless of NULLs
Examples
Count all rows including NULLs
sql
SELECTCOUNT(*) FROMorders;
→-- 1500
Count non-NULL values only
sql
SELECTCOUNT(amount) FROMorders;
→-- 1480
Count distinct non-NULL values
sql
SELECTCOUNT(DISTINCTcustomer_id) FROMorders;
→-- 342
With GROUP BY
sql
SELECTdept, COUNT(*) FROMemployeesGROUPBYdept;
→-- grouped row counts per dept
NULL handling — COUNT(expr) ignores NULLs
sql
SELECTCOUNT(NULL);
→-- 0
⚠Anti-Pattern— Confusing COUNT(*) and COUNT(col)
COUNT(col) silently skips NULLs, so it can return a smaller number than COUNT(*) without any warning.
✓ Instead: Be explicit: use COUNT(*) for total rows and COUNT(col) when you specifically want non-NULL counts.
COUNT(*) counts all rows including those with NULL values in every column. COUNT(col) only counts rows where col is not NULL. Always be explicit about which you need to avoid silent undercounting.
Returns the sum of all non-NULL values in an expression.
Signature
SUM([DISTINCT] expr)
Parameters
Parameter
Type
Description
expr
NUMERIC
Numeric column or expression to sum
Examples
Sum all non-NULL values
sql
SELECTSUM(amount) FROMorders;
→-- 98432.50
With GROUP BY
sql
SELECTdept, SUM(salary) FROMemployeesGROUPBYdept;
→-- sum per dept
DISTINCT values only
sql
SELECTSUM(DISTINCTprice) FROMproducts;
→-- sum of unique prices
Filtered with WHERE
sql
SELECTSUM(amount) FROMordersWHEREstatus='paid';
→-- filtered sum
Returns NULL when all values are NULL
sql
SELECTSUM(NULL);
→-- NULL
⚠Anti-Pattern— Assuming SUM returns 0 on empty sets
SUM of an empty set or all-NULL column returns NULL, which can cause unexpected NULLs in calculations.
✓ Instead: Use COALESCE(SUM(col), 0) to safely default to zero.
SUM returns NULL (not 0) when the result set is empty or all values are NULL. This causes downstream calculations to also return NULL. Wrap with COALESCE(SUM(col), 0) whenever a zero default is more appropriate.
Returns the arithmetic mean of all non-NULL values in an expression.
Signature
AVG([DISTINCT] expr)
Parameters
Parameter
Type
Description
expr
NUMERIC
Numeric column or expression to average
Examples
Average of all non-NULL scores
sql
SELECTAVG(score) FROMtest_results;
→-- 78.45
With GROUP BY
sql
SELECTdept, AVG(salary) FROMemployeesGROUPBYdept;
→-- average salary per dept
DISTINCT values only
sql
SELECTAVG(DISTINCTprice) FROMproducts;
→-- average of unique prices
Filtered with WHERE
sql
SELECTAVG(amount) FROMordersWHEREyear=2024;
→-- filtered average
Returns NULL when all values are NULL
sql
SELECTAVG(NULL);
→-- NULL
⚠Anti-Pattern— Treating NULLs as zeros in averages
AVG automatically skips NULLs, so the denominator is only the count of non-NULL rows, which may not reflect your intent.
✓ Instead: If NULLs represent zero values, use AVG(COALESCE(col, 0)) to include them in the divisor.
AVG only divides the sum by the count of non-NULL rows. If NULLs in your column represent zero (e.g., no activity), the resulting average will be higher than the true mean. Use AVG(COALESCE(col, 0)) to include the zero-value rows in the denominator.
⚠Anti-Pattern— Using ORDER BY + LIMIT instead of MAX
SELECT col FROM t ORDER BY col DESC LIMIT 1 is slower and more fragile than SELECT MAX(col) FROM t.
✓ Instead: Use MAX(col) for a single maximum value; it uses index optimizations and handles NULLs correctly.
MAX(col) uses index statistics for optimization and handles NULLs correctly. It is more efficient and semantically clear than ORDER BY col DESC LIMIT 1 when you only need the maximum value.
⚠Anti-Pattern— Using ORDER BY + LIMIT instead of MIN
SELECT col FROM t ORDER BY col ASC LIMIT 1 is less efficient than SELECT MIN(col) FROM t.
✓ Instead: Use MIN(col) which benefits from index lookups and handles NULLs correctly.
MIN(created_at) per group is the canonical way to find the first event or record for each category. Use it in GROUP BY queries instead of a correlated subquery for better performance.
When concatenated output exceeds group_concat_max_len the result is silently truncated, corrupting the data without any error.
✓ Instead: Set group_concat_max_len to an appropriate size, or verify output length with CHAR_LENGTH() after the query.
The default group_concat_max_len is 1024 bytes. When building comma-separated ID lists or long tag strings, results are silently truncated at this limit. Set it at session level before the query when you expect longer output.
SELECTBIT_AND(val) FROM (SELECT6valUNIONALLSELECT5) t;
→-- 4
Identical values return that value
sql
SELECTBIT_AND(val) FROM (SELECT15valUNIONALLSELECT15) t;
→-- 15
Empty set returns all-bits-set (AND identity element)
sql
SELECTBIT_AND(NULL);
→-- 18446744073709551615
⚠Anti-Pattern— Using BIT_AND to find any set bit
BIT_AND finds bits that are set in ALL rows; if you want bits set in ANY row, you need BIT_OR.
✓ Instead: Use BIT_OR to find bits present in at least one row, and BIT_AND for bits present in every row.
BIT_AND is perfect for finding the intersection of permission bitmasks — a bit is set in the result only if every row has that bit set. Use it to find the minimum effective permissions for a user across all their roles.
⚠Anti-Pattern— Using BIT_OR to check that all rows have a bit set
BIT_OR returns a non-zero result if any row has the bit set, not all rows — it cannot confirm universal bit presence.
✓ Instead: Use BIT_AND to confirm a bit is set in every row.
BIT_OR is the union of all bitmasks in the group — a bit is set if any row has it set. Use it to build a combined permission or feature-flag bitmask from multiple roles or events.
SELECTBIT_XOR(val) FROM (SELECT5valUNIONALLSELECT3) t;
→-- 6
XOR of identical values cancels out
sql
SELECTBIT_XOR(val) FROM (SELECT7valUNIONALLSELECT7) t;
→-- 0
Empty set returns 0 (XOR identity element)
sql
SELECTBIT_XOR(NULL);
→-- 0
⚠Anti-Pattern— Relying on BIT_XOR for collision-resistant checksums
XOR is order-independent and cancels duplicate values, so two different datasets can produce the same XOR result.
✓ Instead: Use BIT_XOR only for quick change-detection hints; use cryptographic hashes (MD5, SHA2) for integrity guarantees.
XOR-ing all row checksums in a table produces a single value that changes if any row is added, removed, or modified (assuming the checksum is a good hash). Two identical scans returning the same BIT_XOR indicates no changes between runs.
⚠Anti-Pattern— Using population std dev on a sample
STD/STDDEV_POP divides by N, which underestimates variability when your data is a sample from a larger population.
✓ Instead: Use STDDEV_SAMP (divides by N-1) when working with a sample rather than the full population.
STD/STDDEV_POP divides by N (the whole population). Use it when your rows represent all possible observations (e.g., all employees in the company). For survey samples or A/B test subsets, use STDDEV_SAMP (divides by N-1).
SELECTSTDDEV(val) FROM (SELECT2valUNIONALLSELECT4UNIONALLSELECT4UNIONALLSELECT4UNIONALLSELECT5UNIONALLSELECT5UNIONALLSELECT7UNIONALLSELECT9) t;
→-- 2
Returns NULL when all values are NULL
sql
SELECTSTDDEV(NULL);
→-- NULL
⚠Anti-Pattern— Confusing STDDEV with STDDEV_SAMP
STDDEV computes the population standard deviation (divides by N), while STDDEV_SAMP divides by N-1 for sample correction.
✓ Instead: Use STDDEV_SAMP for sample data; it applies Bessel's correction (N-1) for an unbiased estimate.
Be careful when porting queries from PostgreSQL: PostgreSQL's STDDEV() maps to sample standard deviation (N-1), while MariaDB's STDDEV() maps to population standard deviation (N). Use explicit STDDEV_POP or STDDEV_SAMP to avoid cross-database surprises.
When rows are a sample, STDDEV_POP slightly underestimates the true population standard deviation.
✓ Instead: Use STDDEV_SAMP for sample data; it applies Bessel's correction (N-1) for an unbiased estimate.
If your table contains all census records, all company employees, or all product prices (not a sample), STDDEV_POP gives the exact population standard deviation. Combining it with AVG() reveals the spread around the mean.
SELECTSTDDEV_SAMP(val) FROM (SELECT10valUNIONALLSELECT20) t;
→-- 7.07
Returns NULL for single-row sample (N-1 = 0)
sql
SELECTSTDDEV_SAMP(val) FROM (SELECT42val) t;
→-- NULL
⚠Anti-Pattern— Expecting a result from a single-row sample
STDDEV_SAMP with only one row returns NULL because the variance formula is undefined for N=1.
✓ Instead: Add a HAVING COUNT(*) > 1 guard when using STDDEV_SAMP per group to avoid NULL results from singleton groups.
STDDEV_SAMP requires at least two rows because its denominator is N-1. A single-row group returns NULL, which can silently produce NULL in subsequent calculations. Add HAVING COUNT(*) >= 2 to filter out degenerate groups.
⇄ vs PostgreSQL: PostgreSQL STDDEV_SAMP is identical; PostgreSQL's plain STDDEV() is also an alias for STDDEV_SAMP (unlike MariaDB where STDDEV = STDDEV_POP).
VAR_POP divides by N, which gives a biased (too-small) variance estimate when your data is a sample.
✓ Instead: Use VAR_SAMP (divides by N-1) for sample data to get an unbiased variance estimate.
Variance and standard deviation convey the same information. If you need both, compute VAR_POP once and derive STDDEV_POP as SQRT(VAR_POP(col)) to avoid running two aggregate passes over the data.
SELECTVAR_SAMP(x) FROM (SELECT2xUNIONALLSELECT4UNIONALLSELECT6) t;
→-- 4
Returns NULL for single-row input (N-1 = 0)
sql
SELECTVAR_SAMP(val) FROM (SELECT42val) t;
→-- NULL
⚠Anti-Pattern— Confusing VAR_SAMP with VARIANCE
VARIANCE is an alias for VAR_POP in MariaDB, not VAR_SAMP — using them interchangeably leads to the wrong denominator.
✓ Instead: Explicitly use VAR_SAMP for sample variance and VAR_POP or VARIANCE for population variance.
Like STDDEV_SAMP, VAR_SAMP uses N-1 as the denominator and returns NULL when there is only one row. This is mathematically correct but can propagate unexpected NULLs into downstream calculations.
⚠Anti-Pattern— Assuming VARIANCE equals VAR_SAMP across databases
In PostgreSQL, VARIANCE() is an alias for VAR_SAMP, but in MariaDB it aliases VAR_POP — the same function name yields different results across databases.
✓ Instead: Use the explicit names VAR_POP or VAR_SAMP in cross-database code to avoid portability bugs.
In MariaDB, VARIANCE() maps to population variance (VAR_POP). In PostgreSQL, VARIANCE() maps to sample variance (VAR_SAMP). Using the explicit name VAR_POP in cross-database code removes ambiguity entirely.
⚠Anti-Pattern— Using GROUP_CONCAT for structured data
GROUP_CONCAT produces a plain string that requires manual parsing; any value containing the separator character breaks the result.
✓ Instead: Use JSON_ARRAYAGG when downstream consumers need a structured JSON array that handles special characters safely.
JSON_ARRAYAGG without ORDER BY returns elements in an unspecified order that can change between query executions. Always add ORDER BY to produce deterministic JSON arrays that are safe to compare, cache, or unit-test.
SELECTJSON_OBJECTAGG(k, v) FROM (SELECT'a'k, 1vUNIONALLSELECT'a', 2) t;
→-- '{"a":2}'
NULL values become JSON null
sql
SELECTJSON_OBJECTAGG(k, NULL) FROM (SELECT'x'k) t;
→-- '{"x":null}'
⚠Anti-Pattern— Relying on key order or deduplication
JSON_OBJECTAGG does not guarantee key order, and duplicate keys silently drop earlier values.
✓ Instead: Ensure key uniqueness before aggregation; if order matters, pre-sort in a subquery (though JSON object key order is not guaranteed by spec).
JSON_OBJECTAGG does not error on duplicate keys; it keeps the last value encountered within the group. If your key column may have duplicates, pre-aggregate or add a DISTINCT to prevent silent data loss.
⚠Anti-Pattern— Using ANY_VALUE to silence errors on non-uniform data
ANY_VALUE suppresses the ONLY_FULL_GROUP_BY error but silently returns an arbitrary value when the column is not functionally dependent on the GROUP BY key.
✓ Instead: Use MIN(), MAX(), or a subquery to get a deterministic value; only use ANY_VALUE when all group values are truly equivalent.
ANY_VALUE tells the optimizer 'any value from this group is acceptable'. Use it only when all values in the group are truly equivalent, such as a denormalized email that is the same for all rows per customer_id. Add a comment to prevent future developers from treating it as a bug.
example
SELECTcustomer_id, ANY_VALUE(email) ASemail/* same for all rows per customer */, COUNT(*) ASordersFROMordersGROUPBYcustomer_id;
⚠Anti-Pattern— Using AVG as a proxy for the median
AVG is heavily influenced by extreme outliers; in skewed distributions (income, response times) the average can be significantly higher than the typical value.
✓ Instead: Use MEDIAN for a better measure of central tendency in skewed data, and AVG only when the distribution is approximately symmetric.
In distributions with extreme outliers (response times, salaries, prices), the mean can be significantly higher than the 'typical' value. MEDIAN gives the 50th percentile — the value that half the rows are below — which better represents central tendency in skewed data.