All pages
Powered by GitBook
1 of 4

Common Table Expressions (CTE)

Learn about Common Table Expressions (CTEs) in MariaDB Server. This section explains how to use CTEs for complex, readable, and reusable subqueries, simplifying data selection and manipulation.

WITH

Syntax

WITH [RECURSIVE] table_reference [(columns_list)] AS  (
  SELECT ...
)
[CYCLE cycle_column_list RESTRICT]
SELECT ...

Description

The WITH keyword signifies a Common Table Expression (CTE). It allows you to refer to a subquery expression many times in a query, as if having a temporary table that only exists for the duration of a query.

There are two kinds of CTEs:

  • Non-Recursive.

  • Recursive (signified by the RECURSIVE keyword).

You can use table_reference as any normal table in the external SELECT part. You can also use WITH in subqueries, as well as with EXPLAIN and SELECT.

Poorly-formed recursive CTEs can in theory cause infinite loops. The max_recursive_iterations system variable limits the number of recursions.

CYCLE ... RESTRICT

The CYCLE clause enables CTE cycle detection, avoiding excessive or infinite loops, MariaDB supports a relaxed, non-standard grammar. The SQL Standard permits a CYCLE clause, as follows:

WITH RECURSIVE ... (
  ...
)
CYCLE <cycle column list>
SET <cycle mark column> TO <cycle mark value> DEFAULT <non-cycle mark value>
USING <path column>

where all clauses are mandatory. MariaDB does not support this, but permits a non-standard relaxed grammar, as follows:

WITH RECURSIVE ... (
  ...
)
CYCLE <cycle column list> RESTRICT

With the use of CYCLE ... RESTRICT it makes no difference whether the CTE uses UNION ALL or UNION DISTINCT anymore. UNION ALL means "all rows, but without cycles", which is exactly what the CYCLE clause enables. And UNION DISTINCT means all rows should be different, which, again, is what will happen — as uniqueness is enforced over a subset of columns, complete rows will automatically all be different.

CYCLE ... RESTRICT is not available.

Examples

Below is an example with the WITH at the top level:

WITH t AS (SELECT a FROM t1 WHERE b >= 'c') 
  SELECT * FROM t2, t WHERE t2.c = t.a;

The example below uses WITH in a subquery:

SELECT t1.a, t1.b FROM t1, t2
  WHERE t1.a > t2.c 
     AND t2.c IN(WITH t AS (SELECT * FROM t1 WHERE t1.a < 5)
                SELECT t2.c FROM t2, t WHERE t2.c = t.a);

Below is an example of a Recursive CTE:

WITH RECURSIVE ancestors AS 
 ( SELECT * FROM folks
   WHERE name="Alex"
   UNION
   SELECT f.*
   FROM folks AS f, ancestors AS a
   WHERE f.id = a.father OR f.id = a.mother )
SELECT * FROM ancestors;

Consider the following structure and data:

CREATE TABLE t1 (from_ int, to_ int);
INSERT INTO t1 VALUES (1,2), (1,100), (2,3), (3,4), (4,1);
SELECT * FROM t1;
+-------+------+
| from_ | to_  |
+-------+------+
|     1 |    2 |
|     1 |  100 |
|     2 |    3 |
|     3 |    4 |
|     4 |    1 |
+-------+------+

Given the above, the following query would theoretically result in an infinite loop due to the last record in t1 (note that max_recursive_iterations is set to 10 for the purposes of this example, to avoid the excessive number of cycles):

SET max_recursive_iterations=10;

WITH RECURSIVE cte (depth, from_, to_) AS ( 
  SELECT 0,1,1 UNION DISTINCT SELECT depth+1, t1.from_, t1.to_ 
    FROM t1, cte  WHERE t1.from_ = cte.to_ 
) 
SELECT * FROM cte;
+-------+-------+------+
| depth | from_ | to_  |
+-------+-------+------+
|     0 |     1 |    1 |
|     1 |     1 |    2 |
|     1 |     1 |  100 |
|     2 |     2 |    3 |
|     3 |     3 |    4 |
|     4 |     4 |    1 |
|     5 |     1 |    2 |
|     5 |     1 |  100 |
|     6 |     2 |    3 |
|     7 |     3 |    4 |
|     8 |     4 |    1 |
|     9 |     1 |    2 |
|     9 |     1 |  100 |
|    10 |     2 |    3 |
+-------+-------+------+

However, the CYCLE ... RESTRICT clause can overcome this:

WITH RECURSIVE cte (depth, from_, to_) AS ( 
  SELECT 0,1,1 UNION SELECT depth+1, t1.from_, t1.to_ 
    FROM t1, cte WHERE t1.from_ = cte.to_ 
) 
CYCLE from_, to_ RESTRICT 
SELECT * FROM cte;
+-------+-------+------+
| depth | from_ | to_  |
+-------+-------+------+
|     0 |     1 |    1 |
|     1 |     1 |    2 |
|     1 |     1 |  100 |
|     2 |     2 |    3 |
|     3 |     3 |    4 |
|     4 |     4 |    1 |
+-------+-------+------+

See Also

  • Non-Recursive Common Table Expressions Overview

  • Recursive Common Table Expressions Overview

This page is licensed: CC BY-SA / Gnu FDL

Non-Recursive Common Table Expressions Overview

Common Table Expressions (CTEs) are a standard SQL feature, and are essentially temporary named result sets. There are two kinds of CTEs: Non-Recursive, which this article covers; and Recursive.

Non-Recursive CTEs

The WITH keyword signifies a CTE. It is given a name, followed by a body (the main query):

cte_syntax

CTEs are similar to derived tables:

WITH engineers AS 
   ( SELECT * FROM employees
     WHERE dept = 'Engineering' )

SELECT * FROM engineers
WHERE ...
SELECT * FROM
   ( SELECT * FROM employees
     WHERE dept = 'Engineering' ) AS engineers
WHERE
...

A non-recursive CTE is basically a query-local VIEW. There are several advantages and caveats to them. The syntax is more readable than a nested FROM (SELECT ...). A CTE can refer to another and it can be referenced from multiple places.

A CTE referencing Another CTE

Using this format makes for a more readable SQL than a nested FROM(SELECT ...) clause:

WITH engineers AS (
SELECT * FROM employees
WHERE dept IN('Development','Support') ),
eu_engineers AS ( SELECT * FROM engineers WHERE country IN('NL',...) )
SELECT
...
FROM eu_engineers;

Multiple Uses of a CTE

This can be an 'anti-self join', for example:

WITH engineers AS (
SELECT * FROM employees
WHERE dept IN('Development','Support') )

SELECT * FROM engineers E1
WHERE NOT EXISTS
   (SELECT 1 FROM engineers E2
    WHERE E2.country=E1.country
    AND E2.name <> E1.name );

Or, for year-over-year comparisons, for example:

WITH sales_product_year AS (
SELECT product, YEAR(ship_date) AS year,
SUM(price) AS total_amt
FROM item_sales
GROUP BY product, year )

SELECT *
FROM sales_product_year CUR,
sales_product_year PREV,
WHERE CUR.product=PREV.product 
AND  CUR.year=PREV.year + 1 
AND CUR.total_amt > PREV.total_amt

Another use is to compare individuals against their group. Below is an example of how this might be executed:

WITH sales_product_year AS (
SELECT product,
YEAR(ship_date) AS year,
SUM(price) AS total_amt
FROM item_sales
GROUP BY product, year
)

SELECT * 
FROM sales_product_year S1
WHERE
total_amt > 
    (SELECT 0.1 * SUM(total_amt)
     FROM sales_product_year S2
     WHERE S2.year = S1.year)

This page is licensed: CC BY-SA / Gnu FDL

Recursive Common Table Expressions Overview

Common Table Expressions (CTEs) are a standard SQL feature, and are essentially temporary named result sets. CTEs first appeared in the SQL standard in 1999, and the first implementations began appearing in 2007.

There are two kinds of CTEs:

  • Non-recursive;

  • Recursive, which this article covers.

SQL is generally poor at recursive structures.

trees_and_graphs

CTEs permit a query to reference itself. A recursive CTE will repeatedly execute subsets of the data until it obtains the complete result set. This makes it particularly useful for handing hierarchical or tree-structured data. max_recursive_iterations avoids infinite loops.

Syntax example

WITH RECURSIVE signifies a recursive CTE. It is given a name, followed by a body (the main query) as follows:

rcte_syntax
cte_syntax

Computation

Given the following structure:

rcte_computation

First execute the anchor part of the query:

rcte1

Next, execute the recursive part of the query:

rcte_computation_2

Summary

WITH recursive R AS (
  SELECT anchor_data
  UNION [all]
  SELECT recursive_part
  FROM R, ...
)
SELECT ...
  1. Compute anchor_data.

  2. Compute recursive_part to get the new data.

  3. If (new data is non-empty) goto 2.

CAST to avoid truncating data

As currently implemented by MariaDB and by the SQL Standard, data may be truncated if not correctly cast. It is necessary to CAST the column to the correct width if the CTE's recursive part produces wider values for a column than the CTE's nonrecursive part. Some other DBMS give an error in this situation, and MariaDB's behavior may change in future - see MDEV-12325. See the examples below.

Examples

Transitive closure - determining bus destinations

Sample data:

tc_1
CREATE TABLE bus_routes (origin varchar(50), dst varchar(50));
INSERT INTO bus_routes VALUES 
  ('New York', 'Boston'), 
  ('Boston', 'New York'), 
  ('New York', 'Washington'), 
  ('Washington', 'Boston'), 
  ('Washington', 'Raleigh');

Now, we want to return the bus destinations with New York as the origin:

WITH RECURSIVE bus_dst as ( 
    SELECT origin as dst FROM bus_routes WHERE origin='New York' 
  UNION
    SELECT bus_routes.dst FROM bus_routes JOIN bus_dst ON bus_dst.dst= bus_routes.origin 
) 
SELECT * FROM bus_dst;
+------------+
| dst        |
+------------+
| New York   |
| Boston     |
| Washington |
| Raleigh    |
+------------+

The above example is computed as follows:

First, the anchor data is calculated:

  • Starting from New York.

  • Boston and Washington are added.

Next, the recursive part:

  • Starting from Boston and then Washington.

  • Raleigh is added.

  • UNION excludes nodes that are already present.

Computing paths - determining bus routes

This time, we are trying to get bus routes such as “New York -> Washington -> Raleigh”.

Using the same sample data as the previous example:

WITH RECURSIVE paths (cur_path, cur_dest) AS (
    SELECT origin, origin FROM bus_routes WHERE origin='New York' 
  UNION
    SELECT CONCAT(paths.cur_path, ',', bus_routes.dst), bus_routes.dst 
     FROM paths
     JOIN bus_routes 
       ON paths.cur_dest = bus_routes.origin AND 
         NOT FIND_IN_SET(bus_routes.dst, paths.cur_path)
) 
SELECT * FROM paths;
+-----------------------------+------------+
| cur_path                    | cur_dest   |
+-----------------------------+------------+
| New York                    | New York   |
| New York,Boston             | Boston     |
| New York,Washington         | Washington |
| New York,Washington,Boston  | Boston     |
| New York,Washington,Raleigh | Raleigh    |
+-----------------------------+------------+

CAST to avoid data truncation

In the following example, data is truncated because the results are not specifically cast to a wide enough type:

WITH RECURSIVE tbl AS (
  SELECT NULL AS col
  UNION
  SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)
SELECT col FROM tbl
+------+
| col  |
+------+
| NULL |
|      |
+------+

Explicitly use CAST to overcome this:

WITH RECURSIVE tbl AS (
  SELECT CAST(NULL AS CHAR(50)) AS col
  UNION SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)  
SELECT * FROM tbl;
+---------------------+
| col                 |
+---------------------+
| NULL                |
| THIS NEVER SHOWS UP |
+---------------------+

This page is licensed: CC BY-SA / Gnu FDL