Learn about Common Table Expressions (CTEs) in MariaDB Server. This section explains how to use CTEs for complex, readable, and reusable subqueries, simplifying data selection and manipulation.
WITH [RECURSIVE] table_reference [(columns_list)] AS (
SELECT ...
)
[CYCLE cycle_column_list RESTRICT]
SELECT ...
The WITH
keyword signifies a Common Table Expression (CTE). It allows you to refer to a subquery expression many times in a query, as if having a temporary table that only exists for the duration of a query.
There are two kinds of CTEs:
Recursive (signified by the RECURSIVE
keyword).
You can use table_reference
as any normal table in the external SELECT
part. You can also use WITH
in subqueries, as well as with EXPLAIN and SELECT.
Poorly-formed recursive CTEs can in theory cause infinite loops. The max_recursive_iterations system variable limits the number of recursions.
The CYCLE
clause enables CTE cycle detection, avoiding excessive or infinite loops,
MariaDB supports a relaxed, non-standard grammar.
The SQL Standard permits a CYCLE
clause, as follows:
WITH RECURSIVE ... (
...
)
CYCLE <cycle column list>
SET <cycle mark column> TO <cycle mark value> DEFAULT <non-cycle mark value>
USING <path column>
where all clauses are mandatory. MariaDB does not support this, but permits a non-standard relaxed grammar, as follows:
WITH RECURSIVE ... (
...
)
CYCLE <cycle column list> RESTRICT
With the use of CYCLE ... RESTRICT
it makes no difference whether the CTE uses UNION ALL
or UNION DISTINCT
anymore. UNION ALL
means "all rows, but without cycles", which is exactly what the CYCLE
clause enables. And UNION DISTINCT
means all rows should be different, which, again, is what will happen — as uniqueness is enforced over a subset of columns, complete rows will automatically all be different.
CYCLE ... RESTRICT
is not available.
Below is an example with the WITH
at the top level:
WITH t AS (SELECT a FROM t1 WHERE b >= 'c')
SELECT * FROM t2, t WHERE t2.c = t.a;
The example below uses WITH
in a subquery:
SELECT t1.a, t1.b FROM t1, t2
WHERE t1.a > t2.c
AND t2.c IN(WITH t AS (SELECT * FROM t1 WHERE t1.a < 5)
SELECT t2.c FROM t2, t WHERE t2.c = t.a);
Below is an example of a Recursive CTE:
WITH RECURSIVE ancestors AS
( SELECT * FROM folks
WHERE name="Alex"
UNION
SELECT f.*
FROM folks AS f, ancestors AS a
WHERE f.id = a.father OR f.id = a.mother )
SELECT * FROM ancestors;
Consider the following structure and data:
CREATE TABLE t1 (from_ int, to_ int);
INSERT INTO t1 VALUES (1,2), (1,100), (2,3), (3,4), (4,1);
SELECT * FROM t1;
+-------+------+
| from_ | to_ |
+-------+------+
| 1 | 2 |
| 1 | 100 |
| 2 | 3 |
| 3 | 4 |
| 4 | 1 |
+-------+------+
Given the above, the following query would theoretically result in an infinite loop due to the last record in t1 (note that max_recursive_iterations is set to 10 for the purposes of this example, to avoid the excessive number of cycles):
SET max_recursive_iterations=10;
WITH RECURSIVE cte (depth, from_, to_) AS (
SELECT 0,1,1 UNION DISTINCT SELECT depth+1, t1.from_, t1.to_
FROM t1, cte WHERE t1.from_ = cte.to_
)
SELECT * FROM cte;
+-------+-------+------+
| depth | from_ | to_ |
+-------+-------+------+
| 0 | 1 | 1 |
| 1 | 1 | 2 |
| 1 | 1 | 100 |
| 2 | 2 | 3 |
| 3 | 3 | 4 |
| 4 | 4 | 1 |
| 5 | 1 | 2 |
| 5 | 1 | 100 |
| 6 | 2 | 3 |
| 7 | 3 | 4 |
| 8 | 4 | 1 |
| 9 | 1 | 2 |
| 9 | 1 | 100 |
| 10 | 2 | 3 |
+-------+-------+------+
However, the CYCLE ... RESTRICT
clause can overcome this:
WITH RECURSIVE cte (depth, from_, to_) AS (
SELECT 0,1,1 UNION SELECT depth+1, t1.from_, t1.to_
FROM t1, cte WHERE t1.from_ = cte.to_
)
CYCLE from_, to_ RESTRICT
SELECT * FROM cte;
+-------+-------+------+
| depth | from_ | to_ |
+-------+-------+------+
| 0 | 1 | 1 |
| 1 | 1 | 2 |
| 1 | 1 | 100 |
| 2 | 2 | 3 |
| 3 | 3 | 4 |
| 4 | 4 | 1 |
+-------+-------+------+
This page is licensed: CC BY-SA / Gnu FDL
Common Table Expressions (CTEs) are a standard SQL feature, and are essentially temporary named result sets. There are two kinds of CTEs: Non-Recursive, which this article covers; and Recursive.
The WITH keyword signifies a CTE. It is given a name, followed by a body (the main query):
CTEs are similar to derived tables:
WITH engineers AS
( SELECT * FROM employees
WHERE dept = 'Engineering' )
SELECT * FROM engineers
WHERE ...
SELECT * FROM
( SELECT * FROM employees
WHERE dept = 'Engineering' ) AS engineers
WHERE
...
A non-recursive CTE is basically a query-local VIEW. There are several advantages and caveats to them. The syntax is more readable than a nested FROM (SELECT ...)
.
A CTE can refer to another and it can be referenced from multiple places.
Using this format makes for a more readable SQL than a nested FROM(SELECT ...)
clause:
WITH engineers AS (
SELECT * FROM employees
WHERE dept IN('Development','Support') ),
eu_engineers AS ( SELECT * FROM engineers WHERE country IN('NL',...) )
SELECT
...
FROM eu_engineers;
This can be an 'anti-self join', for example:
WITH engineers AS (
SELECT * FROM employees
WHERE dept IN('Development','Support') )
SELECT * FROM engineers E1
WHERE NOT EXISTS
(SELECT 1 FROM engineers E2
WHERE E2.country=E1.country
AND E2.name <> E1.name );
Or, for year-over-year comparisons, for example:
WITH sales_product_year AS (
SELECT product, YEAR(ship_date) AS year,
SUM(price) AS total_amt
FROM item_sales
GROUP BY product, year )
SELECT *
FROM sales_product_year CUR,
sales_product_year PREV,
WHERE CUR.product=PREV.product
AND CUR.year=PREV.year + 1
AND CUR.total_amt > PREV.total_amt
Another use is to compare individuals against their group. Below is an example of how this might be executed:
WITH sales_product_year AS (
SELECT product,
YEAR(ship_date) AS year,
SUM(price) AS total_amt
FROM item_sales
GROUP BY product, year
)
SELECT *
FROM sales_product_year S1
WHERE
total_amt >
(SELECT 0.1 * SUM(total_amt)
FROM sales_product_year S2
WHERE S2.year = S1.year)
This page is licensed: CC BY-SA / Gnu FDL
Common Table Expressions (CTEs) are a standard SQL feature, and are essentially temporary named result sets. CTEs first appeared in the SQL standard in 1999, and the first implementations began appearing in 2007.
There are two kinds of CTEs:
Recursive, which this article covers.
SQL is generally poor at recursive structures.
CTEs permit a query to reference itself. A recursive CTE will repeatedly execute subsets of the data until it obtains the complete result set. This makes it particularly useful for handing hierarchical or tree-structured data. max_recursive_iterations avoids infinite loops.
WITH RECURSIVE signifies a recursive CTE. It is given a name, followed by a body (the main query) as follows:
Given the following structure:
First execute the anchor part of the query:
Next, execute the recursive part of the query:
WITH recursive R AS (
SELECT anchor_data
UNION [all]
SELECT recursive_part
FROM R, ...
)
SELECT ...
Compute anchor_data.
Compute recursive_part to get the new data.
If (new data is non-empty) goto 2.
As currently implemented by MariaDB and by the SQL Standard, data may be truncated if not correctly cast. It is necessary to CAST the column to the correct width if the CTE's recursive part produces wider values for a column than the CTE's nonrecursive part. Some other DBMS give an error in this situation, and MariaDB's behavior may change in future - see MDEV-12325. See the examples below.
Sample data:
CREATE TABLE bus_routes (origin varchar(50), dst varchar(50));
INSERT INTO bus_routes VALUES
('New York', 'Boston'),
('Boston', 'New York'),
('New York', 'Washington'),
('Washington', 'Boston'),
('Washington', 'Raleigh');
Now, we want to return the bus destinations with New York as the origin:
WITH RECURSIVE bus_dst as (
SELECT origin as dst FROM bus_routes WHERE origin='New York'
UNION
SELECT bus_routes.dst FROM bus_routes JOIN bus_dst ON bus_dst.dst= bus_routes.origin
)
SELECT * FROM bus_dst;
+------------+
| dst |
+------------+
| New York |
| Boston |
| Washington |
| Raleigh |
+------------+
The above example is computed as follows:
First, the anchor data is calculated:
Starting from New York.
Boston and Washington are added.
Next, the recursive part:
Starting from Boston and then Washington.
Raleigh is added.
UNION excludes nodes that are already present.
This time, we are trying to get bus routes such as “New York -> Washington -> Raleigh”.
Using the same sample data as the previous example:
WITH RECURSIVE paths (cur_path, cur_dest) AS (
SELECT origin, origin FROM bus_routes WHERE origin='New York'
UNION
SELECT CONCAT(paths.cur_path, ',', bus_routes.dst), bus_routes.dst
FROM paths
JOIN bus_routes
ON paths.cur_dest = bus_routes.origin AND
NOT FIND_IN_SET(bus_routes.dst, paths.cur_path)
)
SELECT * FROM paths;
+-----------------------------+------------+
| cur_path | cur_dest |
+-----------------------------+------------+
| New York | New York |
| New York,Boston | Boston |
| New York,Washington | Washington |
| New York,Washington,Boston | Boston |
| New York,Washington,Raleigh | Raleigh |
+-----------------------------+------------+
In the following example, data is truncated because the results are not specifically cast to a wide enough type:
WITH RECURSIVE tbl AS (
SELECT NULL AS col
UNION
SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)
SELECT col FROM tbl
+------+
| col |
+------+
| NULL |
| |
+------+
Explicitly use CAST to overcome this:
WITH RECURSIVE tbl AS (
SELECT CAST(NULL AS CHAR(50)) AS col
UNION SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)
SELECT * FROM tbl;
+---------------------+
| col |
+---------------------+
| NULL |
| THIS NEVER SHOWS UP |
+---------------------+
This page is licensed: CC BY-SA / Gnu FDL