2 of 58

Data Manipulation (DML)

Learn data manipulation language (DML) statements in MariaDB Server. This section covers SQL commands for inserting, updating, deleting, and selecting data within your databases.

Changing & Deleting Data

Learn to change and delete data in MariaDB Server. This section covers UPDATE and DELETE SQL statements, enabling you to modify existing records and remove unwanted information efficiently.

DELETE

Syntax

Single-table syntax:

DELETE [LOW_PRIORITY] [QUICK] [IGNORE] 
  FROM tbl_name [PARTITION (partition_list)]
  [FOR PORTION OF PERIOD FROM expr1 TO expr2]
  [AS alias]                    -- from MariaDB 11.6
  [WHERE where_condition]
  [ORDER BY ...]
  [LIMIT row_count]
  [RETURNING select_expr 
    [, select_expr ...]]

Multiple-table syntax:

DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
    tbl_name[.*] [, tbl_name[.*]] ...
    FROM table_references
    [WHERE where_condition]

Or:

DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
    FROM tbl_name[.*] [, tbl_name[.*]] ...
    USING table_references
    [WHERE where_condition]

Trimming history:

DELETE HISTORY
  FROM tbl_name [PARTITION (partition_list)]
  [BEFORE SYSTEM_TIME [TIMESTAMP|TRANSACTION] expression]

Description

Option

Description

LOW_PRIORITY

Wait until all SELECT's are done before starting the statement. Used with storage engines that uses table locking (MyISAM, Aria etc). See HIGH_PRIORITY and LOW_PRIORITY clauses for details.

QUICK

Signal the storage engine that it should expect that a lot of rows are deleted. The storage engine can do things to speed up the DELETE like ignoring merging of data blocks until all rows are deleted from the block (instead of when a block is half full). This speeds up things at the expanse of lost space in data blocks. At least MyISAM and Aria support this feature.

IGNORE

Don't stop the query even if a not-critical error occurs (like data overflow). See How IGNORE works for a full description.

For the single-table syntax, the DELETE statement deletes rows from tbl_name and returns a count of the number of deleted rows. This count can be obtained by calling the ROW_COUNT() function. TheWHERE clause, if given, specifies the conditions that identify which rows to delete. With no WHERE clause, all rows are deleted. If the ORDER BY clause is specified, the rows are deleted in the order that is specified. The LIMIT clause places a limit on the number of rows that can be deleted.

For the multiple-table syntax, DELETE deletes from eachtbl_name the rows that satisfy the conditions. In this case,ORDER BY and LIMIT> cannot be used. A DELETE can also reference tables which are located in different databases; see Identifier Qualifiers for the syntax.

where_condition is an expression that evaluates to true for each row to be deleted. It is specified as described in SELECT.

Currently, you cannot delete from a table and select from the same table in a subquery.

You need the DELETE privilege on a table to delete rows from it. You need only the SELECT privilege for any columns that are only read, such as those named in the WHERE clause. See GRANT.

As stated, a DELETE statement with no WHERE clause deletes all rows. A faster way to do this, when you do not need to know the number of deleted rows, is to use TRUNCATE TABLE. However, within a transaction or if you have a lock on the table,TRUNCATE TABLE cannot be used whereas DELETE can. See TRUNCATE TABLE, and LOCK.

AS

Single-table DELETE statements support aliases. For example:

CREATE TABLE t1 (c1 INT);
INSERT INTO t1 VALUES (1), (2);

DELETE FROM t1 AS a1 WHERE a1.c1 = 2;

Single-table DELETE statements do not support aliases.

PARTITION

See Partition Pruning and Selection for details.

FOR PORTION OF

See Application Time Periods - Deletion by Portion.

RETURNING

It is possible to return a resultset of the deleted rows for a single table to the client by using the syntax DELETE ... RETURNING select_expr [, select_expr2 ...]]

Any of SQL expression that can be calculated from a single row fields is allowed. Subqueries are allowed. The AS keyword is allowed, so it is possible to use aliases.

The use of aggregate functions is not allowed. RETURNING cannot be used in multi-table DELETEs.

Same Source and Target Table

It is possible to delete from a table with the same source and target. For example:

DELETE FROM t1 WHERE c1 IN (SELECT b.c1 FROM t1 b WHERE b.c2=0);

DELETE HISTORY

You can use DELETE HISTORY to delete historical information from System-versioned tables.

Examples

How to use the ORDER BY and LIMIT clauses:

DELETE FROM page_hit ORDER BY TIMESTAMP LIMIT 1000000;

How to use the RETURNING clause:

DELETE FROM t RETURNING f1;
+------+
| f1   |
+------+
|    5 |
|   50 |
|  500 |
+------+

The following statement joins two tables: one is only used to satisfy a WHERE condition, but no row is deleted from it; rows from the other table are deleted, instead.

DELETE post FROM blog INNER JOIN post WHERE blog.id = post.blog_id;

Deleting from the Same Source and Target

CREATE TABLE t1 (c1 INT, c2 INT);
DELETE FROM t1 WHERE c1 IN (SELECT b.c1 FROM t1 b WHERE b.c2=0);

The statement returns:

Query OK, 0 rows affected (0.00 sec)

The statement returns:

ERROR 1093 (HY000): Table 't1' is specified twice, both as a target for 'DELETE' 
  AND AS a separate source FOR

HIGH_PRIORITY and LOW_PRIORITY

The InnoDB storage engine uses row-level locking to ensure data integrity. However some storage engines (such as MEMORY, MyISAM, Aria and MERGE) lock the whole table to prevent conflicts. These storage engines use two separate queues to remember pending statements; one is for SELECTs and the other one is for write statements (INSERT, DELETE, UPDATE). By default, the latter has a higher priority.

To give write operations a lower priority, the low_priority_updates server system variable can be set to ON. The option is available on both the global and session levels, and it can be set at startup or via the SET statement.

When too many table locks have been set by write statements, some pending SELECTs are executed. The maximum number of write locks that can be acquired before this happens is determined by the max_write_lock_count server system variable, which is dynamic.

If write statements have a higher priority (default), the priority of individual write statements (INSERT, REPLACE, UPDATE, DELETE) can be changed via the LOW_PRIORITY attribute, and the priority of a SELECT statement can be raised via the HIGH_PRIORITY attribute. Also, LOCK TABLES supports a LOW_PRIORITY attribute for WRITE locks.

If read statements have a higher priority, the priority of an INSERT can be changed via the HIGH_PRIORITY attribute. However, the priority of other write statements cannot be raised individually.

The use of LOW_PRIORITY or HIGH_PRIORITY for an INSERT prevents Concurrent Inserts from being used.

REPLACE

The RETURNING clause was introduced in MariaDB 10.5.0.

Syntax

REPLACE [LOW_PRIORITY | DELAYED]
 [INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
 {VALUES | VALUE} ({expr | DEFAULT},...),(...),...
[RETURNING select_expr 
      [, select_expr ...]]

Or:

REPLACE [LOW_PRIORITY | DELAYED]
    [INTO] tbl_name [PARTITION (partition_list)]
    SET col={expr | DEFAULT}, ...
[RETURNING select_expr 
      [, select_expr ...]]

Or:

REPLACE [LOW_PRIORITY | DELAYED]
    [INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
    SELECT ...
[RETURNING select_expr 
      [, select_expr ...]]

Description

REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or aUNIQUE index, the old row is deleted before the new row is inserted. If the table has more than one UNIQUE keys, it is possible that the new row conflicts with more than one row. In this case, all conflicting rows will be deleted.

The table name can be specified in the form db_name.tbl_name or, if a default database is selected, in the form tbl_name (see Identifier Qualifiers). This allows to use REPLACE ... SELECT to copy rows between different databases.

Basically it works like this:

BEGIN;
SELECT 1 FROM t1 WHERE KEY=# FOR UPDATE;
IF FOUND-ROW
  DELETE FROM t1 WHERE KEY=# ;
ENDIF
INSERT INTO t1 VALUES (...);
END;

The above can be replaced with:

REPLACE INTO t1 VALUES (...)

REPLACE is a MariaDB/MySQL extension to the SQL standard. It either inserts, or deletes and inserts. For other MariaDB/MySQL extensions to standard SQL --- that also handle duplicate values --- see IGNORE and INSERT ON DUPLICATE KEY UPDATE.

Note that unless the table has a PRIMARY KEY orUNIQUE index, using a REPLACE statement makes no sense. It becomes equivalent to INSERT, because there is no index to be used to determine whether a new row duplicates another.

Values for all columns are taken from the values. See Partition Pruning and Selection for details. Specified in theREPLACE statement. Any missing columns are set to their default values, just as happens for INSERT. You cannot refer to values from the current row and use them in the new row. If you use an assignment such as 'SET col = col + 1', the reference to the column name on the right hand side is treated asDEFAULT(col), so the assignment is equivalent to'SET col = DEFAULT(col) + 1'.

To use REPLACE, you must have both theINSERT and DELETE privileges for the table.

There are some issues you should be aware of, before using REPLACE:

If there is an AUTO_INCREMENT field, a new value will be generated.
If there are foreign keys, ON DELETE action will be activated by REPLACE.
Triggers on DELETE and INSERT will be activated by REPLACE.

To avoid some of these behaviors, you can use INSERT ... ON DUPLICATE KEY UPDATE.

This statement activates INSERT and DELETE triggers. See Trigger Overview for details.

PARTITION

See Partition Pruning and Selection for details.

REPLACE RETURNING

REPLACE ... RETURNING returns a resultset of the replaced rows. This returns the listed columns for all the rows that are replaced, or alternatively, the specified SELECT expression. Any SQL expressions which can be calculated can be used in the select expression for the RETURNING clause, including virtual columns and aliases, expressions which use various operators such as bitwise, logical and arithmetic operators, string functions, date-time functions, numeric functions, control flow functions, secondary functions and stored functions. Along with this, statements which have subqueries and prepared statements can also be used.

Examples

Simple REPLACE statement:

REPLACE INTO t2 VALUES (1,'Leopard'),(2,'Dog') RETURNING id2, id2+id2 
as Total ,id2|id2, id2&&id2;
+-----+-------+---------+----------+
| id2 | Total | id2|id2 | id2&&id2 |
+-----+-------+---------+----------+
|   1 |     2 |       1 |        1 |
|   2 |     4 |       2 |        1 |
+-----+-------+---------+----------+

Using stored functions in RETURNING:

DELIMITER |
CREATE FUNCTION f(arg INT) RETURNS INT
    BEGIN
      RETURN (SELECT arg+arg);
    END|

DELIMITER ;
PREPARE stmt FROM "REPLACE INTO t2 SET id2=3, animal2='Fox' RETURNING f2(id2),
UPPER(animal2)";

EXECUTE stmt;
+---------+----------------+
| f2(id2) | UPPER(animal2) |
+---------+----------------+
|       6 | FOX            |
+---------+----------------+

Subqueries in the statement:

REPLACE INTO t1 SELECT * FROM t2 RETURNING (SELECT id2 FROM t2 WHERE 
id2 IN (SELECT id2 FROM t2 WHERE id2=1)) AS new_id;
+--------+
| new_id |
+--------+
|      1 |
|      1 |
|      1 |
|      1 |
+--------+

Subqueries in the RETURNING clause that return more than one row or column cannot be used..

Aggregate functions cannot be used in the RETURNING clause. Since aggregate functions work on a set of values and if the purpose is to get the row count, ROW_COUNT() with SELECT can be used, or it can be used in REPLACE...SELECT.

REPLACE ... RETURNING returns a result set of the replaced rows. This returns the listed columns for all the rows that are replaced, or alternatively, the specified SELECT expression. Any SQL expressions which can be calculated can be used in the select expression for the RETURNING clause, including virtual columns and aliases, expressions which use various operators such as bitwise, logical and arithmetic operators, string functions, date-time functions, numeric functions, control flow functions, secondary functions and stored functions. Along with this, statements which have subqueries and prepared statements can also be used.

Examples

Simple REPLACE statement:

REPLACE INTO t2 VALUES (1,'Leopard'),(2,'Dog') RETURNING id2, id2+id2 
as Total ,id2|id2, id2&&id2;
+-----+-------+---------+----------+
| id2 | Total | id2|id2 | id2&&id2 |
+-----+-------+---------+----------+
|   1 |     2 |       1 |        1 |
|   2 |     4 |       2 |        1 |
+-----+-------+---------+----------+

Using stored functions in RETURNING:

DELIMITER |
CREATE FUNCTION f(arg INT) RETURNS INT
    BEGIN
      RETURN (SELECT arg+arg);
    END|

DELIMITER ;
PREPARE stmt FROM "REPLACE INTO t2 SET id2=3, animal2='Fox' RETURNING f2(id2),
UPPER(animal2)";

EXECUTE stmt;
+---------+----------------+
| f2(id2) | UPPER(animal2) |
+---------+----------------+
|       6 | FOX            |
+---------+----------------+

Subqueries in the statement:

REPLACE INTO t1 SELECT * FROM t2 RETURNING (SELECT id2 FROM t2 WHERE 
id2 IN (SELECT id2 FROM t2 WHERE id2=1)) AS new_id;
+--------+
| new_id |
+--------+
|      1 |
|      1 |
|      1 |
|      1 |
+--------+

Subqueries in the RETURNING clause that return more than one row or column cannot be used..

Aggregate functions cannot be used in the RETURNING clause. Since aggregate functions work on a set of values and if the purpose is to get the row count, ROW_COUNT() with SELECT can be used, or it can be used in REPLACE...SELECT...RETURNING if the table in the RETURNING clause is not the same as the REPLACE table. SELECT...RETURNING if the table in the RETURNING clause is not the same as the REPLACE table.

REPLACE...RETURNING

REPLACE ... RETURNING was added in MariaDB 10.5.0, and returns a result set of the replaced rows.

Syntax

REPLACE [LOW_PRIORITY | DELAYED]
 [INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
 {VALUES | VALUE} ({expr | DEFAULT},...),(...),...
[RETURNING select_expr 
      [, select_expr ...]]

Or:

REPLACE [LOW_PRIORITY | DELAYED]
    [INTO] tbl_name [PARTITION (partition_list)]
    SET col={expr | DEFAULT}, ...
[RETURNING select_expr 
      [, select_expr ...]]

Or:

REPLACE [LOW_PRIORITY | DELAYED]
    [INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
    SELECT ...
[RETURNING select_expr 
      [, select_expr ...]]

Description

REPLACE ... RETURNING returns a result set of the replaced rows. The statement returns the listed columns for all the rows that are replaced, or alternatively, the specified SELECT expression. Any SQL expressions which can be calculated can be used in the select expression for the RETURNING clause, including virtual columns and aliases, expressions which use various operators such as bitwise, logical and arithmetic operators, string functions, date-time functions, numeric functions, control flow functions, secondary functions and stored functions. Along with this, statements which have subqueries and prepared statements can also be used.

Examples

Simple REPLACE statement:

REPLACE INTO t2 VALUES (1,'Leopard'),(2,'Dog') RETURNING id2, id2+id2 
AS Total ,id2|id2, id2&&id2;
+-----+-------+---------+----------+
| id2 | Total | id2|id2 | id2&&id2 |
+-----+-------+---------+----------+
|   1 |     2 |       1 |        1 |
|   2 |     4 |       2 |        1 |
+-----+-------+---------+----------+

Using stored functions in RETURNING:

DELIMITER |
CREATE FUNCTION f(arg INT) RETURNS INT
    BEGIN
      RETURN (SELECT arg+arg);
    END|

DELIMITER ;
PREPARE stmt FROM "REPLACE INTO t2 SET id2=3, animal2='Fox' RETURNING f2(id2),
UPPER(animal2)";

EXECUTE stmt;
+---------+----------------+
| f2(id2) | UPPER(animal2) |
+---------+----------------+
|       6 | FOX            |
+---------+----------------+

Subqueries in the statement:

REPLACE INTO t1 SELECT * FROM t2 RETURNING (SELECT id2 FROM t2 WHERE 
id2 IN (SELECT id2 FROM t2 WHERE id2=1)) AS new_id;
+--------+
| new_id |
+--------+
|      1 |
|      1 |
|      1 |
|      1 |
+--------+

Subqueries in the RETURNING clause that return more than one row or column cannot be used..

UPDATE

Syntax

Single-table syntax:

UPDATE [LOW_PRIORITY] [IGNORE] table_reference 
  [PARTITION (partition_list)]
  [FOR PORTION OF period FROM expr1 TO expr2]
  SET col1={expr1|DEFAULT} [,col2={expr2|DEFAULT}] ...
  [WHERE where_condition]
  [ORDER BY ...]
  [LIMIT row_count]

Multiple-table syntax:

UPDATE [LOW_PRIORITY] [IGNORE] table_references
    SET col1={expr1|DEFAULT} [, col2={expr2|DEFAULT}] ...
    [WHERE where_condition]

Description

For the single-table syntax, the UPDATE statement updates columns of existing rows in the named table with new values. TheSET clause indicates which columns to modify and the values they should be given. Each value can be given as an expression, or the keywordDEFAULT to set a column explicitly to its default value. TheWHERE clause, if given, specifies the conditions that identify which rows to update. With no WHERE clause, all rows are updated. If the ORDER BY clause is specified, the rows are updated in the order that is specified. The LIMIT clause places a limit on the number of rows that can be updated.

Both clauses can be used with multiple-table updates.

Both clauses can be used with multiple-table updates. For the multiple-table syntax, UPDATE updates rows in each table named in table_references that satisfy the conditions. In this case, ORDER BY and LIMIT could not be used.

An UPDATE can also reference tables which are located in different databases; see Identifier Qualifiers for the syntax.

where_condition is an expression that evaluates to true for each row to be updated.

table_references and where_condition are as specified as described in SELECT.

For single-table updates, assignments are evaluated in left-to-right order, while for multi-table updates, there is no guarantee of a particular order. If the SIMULTANEOUS_ASSIGNMENT sql_mode is set, UPDATE statements evaluate all assignments simultaneously.

You need the UPDATE privilege only for columns referenced in an UPDATE that are actually updated. You need only the SELECT privilege for any columns that are read but not modified. See GRANT.

The UPDATE statement supports the following modifiers:

If you use the LOW_PRIORITY keyword, execution of the UPDATE is delayed until no other clients are reading from the table. This affects only storage engines that use only table-level locking (MyISAM, MEMORY, MERGE). See HIGH_PRIORITY and LOW_PRIORITY clauses for details.
If you use the IGNORE keyword, the update statement does not abort even if errors occur during the update. Rows for which duplicate-key conflicts occur are not updated. Rows for which columns are updated to values that would cause data conversion errors are updated to the closest valid values instead.

PARTITION

See Partition Pruning and Selection for details.

FOR PORTION OF

See Application Time Periods - Updating by Portion.

UPDATE Statements With the Same Source and Target

UPDATE statements may have the same source and target. For example, given the following table:

DROP TABLE t1;
CREATE TABLE t1 (c1 INT, c2 INT);
INSERT INTO t1 VALUES (10,10), (20,20);

UPDATE t1 SET c1=c1+1 WHERE c2=(SELECT MAX(c2) FROM t1);

SELECT * FROM t1;
+------+------+
| c1   | c2   |
+------+------+
|   10 |   10 |
|   21 |   20 |
+------+------+

Example

Single-table syntax:

UPDATE table_name SET column1 = value1, column2 = value2 WHERE id=100;

Multiple-table syntax:

UPDATE tab1, tab2 SET tab1.column1 = value1, tab1.column2 = value2 WHERE tab1.id = tab2.id;

Inserting & Loading Data

Learn to insert and load data into MariaDB Server. This section covers INSERT and LOAD DATA SQL statements, enabling you to efficiently add new records to your databases.

Concurrent Inserts

The MyISAM storage engine supports concurrent inserts. This feature allows SELECT statements to be executed during INSERT operations, reducing contention.

Whether concurrent inserts can be used or not depends on the value of the concurrent_insert server system variable:

NEVER (0) disables concurrent inserts.
AUTO (1) allows concurrent inserts only when the target table has no free blocks (no data in the middle of the table has been deleted after the last OPTIMIZE TABLE). This is the default.
ALWAYS (2) always enables concurrent inserts, in which case new rows are added at the end of a table if the table is being used by another thread.

If the binary log is used, CREATE TABLE ... SELECT and INSERT ... SELECT statements cannot use concurrent inserts. These statements acquire a read lock on the table, so concurrent inserts will need to wait. This way the log can be safely used to restore data.

Concurrent inserts are not used by replicas with the row based replication (see binary log formats).

If an INSERT statement contain the HIGH_PRIORITY clause, concurrent inserts cannot be used. INSERT ... DELAYED is usually unneeded if concurrent inserts are enabled.

LOAD DATA INFILE uses concurrent inserts if the CONCURRENT keyword is specified and concurrent_insert is not NEVER. This makes the statement slower (even if no other sessions access the table) but reduces contention.

LOCK TABLES allows non-conflicting concurrent inserts if a READ LOCAL lock is used. Concurrent inserts are not allowed if the LOCAL keyword is omitted.

Notes

The decision to enable concurrent insert for a table is done when the table is opened. If you change the value of concurrent_insert it will only affect new opened tables. If you want it to work for also for tables in use or cached, you should do FLUSH TABLES after setting the variable.

IGNORE

The IGNORE option tells the server to ignore some common errors.

IGNORE can be used with the following statements:

DELETE
INSERT (see also INSERT IGNORE)
LOAD DATA INFILE
UPDATE
ALTER TABLE
CREATE TABLE ... SELECT
INSERT ... SELECT

The logic used:

Variables out of ranges are replaced with the maximum/minimum value.
SQL_MODEs STRICT_TRANS_TABLES, STRICT_ALL_TABLES, NO_ZERO_IN_DATE, NO_ZERO_DATE are ignored.
Inserting NULL in a NOT NULL field will insert 0 ( in a numerical field), 0000-00-00 ( in a date field) or an empty string ( in a character field).
Rows that cause a duplicate key error or break a foreign key constraint are not inserted, updated, or deleted.

The following errors are ignored:

Error number

Symbolic error name

Description

1022

ER_DUP_KEY

Can't write; duplicate key in table '%s'

1048

ER_BAD_NULL_ERROR

Column '%s' cannot be null

1062

ER_DUP_ENTRY

Duplicate entry '%s' for key %d

1242

ER_SUBQUERY_NO_1_ROW

Subquery returns more than 1 row

1264

ER_WARN_DATA_OUT_OF_RANGE

Out of range value for column '%s' at row %ld

1265

WARN_DATA_TRUNCATED

Data truncated for column '%s' at row %ld

1292

ER_TRUNCATED_WRONG_VALUE

Truncated incorrect %s value: '%s'

1366

ER_TRUNCATED_WRONG_VALUE_FOR_FIELD

Incorrect integer value

1369

ER_VIEW_CHECK_FAILED

CHECK OPTION failed '%s.%s'

1451

ER_ROW_IS_REFERENCED_2

Cannot delete or update a parent row

1452

ER_NO_REFERENCED_ROW_2

Cannot add or update a child row: a foreign key constraint fails (%s)

1526

ER_NO_PARTITION_FOR_GIVEN_VALUE

Table has no partition for value %s

1586

ER_DUP_ENTRY_WITH_KEY_NAME

Duplicate entry '%s' for key '%s'

1591

ER_NO_PARTITION_FOR_GIVEN_VALUE_SILENT

Table has no partition for some existing values

1748

ER_ROW_DOES_NOT_MATCH_GIVEN_PARTITION_SET

Found a row not matching the given partition set

Ignored errors normally generate a warning.

A property of the IGNORE clause consists in causing transactional engines and non-transactional engines (like InnoDB and Aria) to behave the same way. For example, normally a multi-row insert which tries to violate a UNIQUE contraint is completely rolled back on InnoDB, but might be partially executed on Aria. With the IGNORE clause, the statement will be partially executed in both engines.

Duplicate key errors also generate warnings. The OLD_MODE server variable can be used to prevent this.

_{This page is licensed: CC BY-SA / Gnu FDL}

INSERT - Default & Duplicate Values

Default Values

If the SQL_MODE contains STRICT_TRANS_TABLES and you are inserting into a transactional table (like InnoDB), or if the SQL_MODE contains STRICT_ALL_TABLES, all NOT NULL columns which do not have a DEFAULT value (and are not AUTO_INCREMENT) must be explicitly referenced in INSERT statements. If not, an error like this is produced:

ERROR 1364 (HY000): Field 'col' doesn't have a default value

In all other cases, if a NOT NULL column without a DEFAULT value is not referenced, an empty value will be inserted (for example, 0 for INTEGER columns and '' for CHAR columns). See NULL Values in MariaDB:Inserting for examples.

If a NOT NULL column having a DEFAULT value is not referenced, NULL will be inserted.

If a NULL column having a DEFAULT value is not referenced, its default value will be inserted. It is also possible to explicitly assign the default value using the DEFAULT keyword or the DEFAULT() function.

If the DEFAULT keyword is used but the column does not have a DEFAULT value, an error like this is produced:

ERROR 1364 (HY000): Field 'col' doesn't have a default value

Duplicate Values

By default, if you try to insert a duplicate row and there is a UNIQUE index, INSERT stops and an error like this is produced:

ERROR 1062 (23000): Duplicate entry 'dup_value' for key 'col'

To handle duplicates you can use the IGNORE clause, INSERT ON DUPLICATE KEY UPDATE or the REPLACE statement. Note that the IGNORE and DELAYED options are ignored when you use ON DUPLICATE KEY UPDATE.

INSERT DELAYED

Syntax

INSERT DELAYED ...

Description

The DELAYED option for the INSERT statement is a MariaDB/MySQL extension to standard SQL that is very useful if you have clients that cannot or need not wait for the INSERT to complete. This is a common situation when you use MariaDB for logging and you also periodically run SELECT and UPDATE statements that take a long time to complete.

When a client uses INSERT DELAYED, it gets an okay from the server at once, and the row is queued to be inserted when the table is not in use by any other thread.

Another major benefit of using INSERT DELAYED is that inserts from many clients are bundled together and written in one block. This is much faster than performing many separate inserts.

Note that INSERT DELAYED is slower than a normalINSERT if the table is not otherwise in use. There is also the additional overhead for the server to handle a separate thread for each table for which there are delayed rows. This means that you should useINSERT DELAYED only when you are really sure that you need it.

The queued rows are held only in memory until they are inserted into the table. This means that if you terminate mariadbd forcibly (for example, with kill -9) or if mariadbd dies unexpectedly, any queued rows that have not been written to disk are lost.

The number of concurrent INSERT DELAYED threads is limited by the max_delayed_threads server system variables. If it is set to 0, INSERT DELAYED is disabled. The session value can be equal to the global value, or 0 to disable this statement for the current session. If this limit has been reached, the DELAYED clause will be silently ignore for subsequent statements (no error will be produced).

Limitations

There are some limitations on the use of DELAYED:

INSERT DELAYED works only with MyISAM, MEMORY, ARCHIVE, and BLACKHOLE tables. If you execute INSERT DELAYED with another storage engine, you will get an error like this: ERROR 1616 (HY000): DELAYED option not supported for table 'tab_name'
For MyISAM tables, if there are no free blocks in the middle of the data file, concurrent SELECT and INSERT statements are supported. Under these circumstances, you very seldom need to use INSERT DELAYED with MyISAM.
INSERT DELAYED should be used only forINSERT statements that specify value lists. The server ignores DELAYED for INSERT ... SELECT or INSERT ... ON DUPLICATE KEY UPDATE statements.
Because the INSERT DELAYED statement returns immediately, before the rows are inserted, you cannot useLAST_INSERT_ID() to get theAUTO_INCREMENT value that the statement might generate.
DELAYED rows are not visible to SELECT statements until they actually have been inserted.
After INSERT DELAYED, ROW_COUNT() returns the number of the rows you tried to insert, not the number of the successful writes.
DELAYED is ignored on slave replication servers, so thatINSERT DELAYED is treated as a normal INSERT on slaves. This is becauseDELAYED could cause the slave to have different data than the master. INSERT DELAYED statements are not safe for replication.
Pending INSERT DELAYED statements are lost if a table is write locked and ALTER TABLE is used to modify the table structure.
INSERT DELAYED is not supported for views. If you try, you will get an error like this: ERROR 1347 (HY000): 'view_name' is not BASE TABLE
INSERT DELAYED is not supported for partitioned tables.
INSERT DELAYED is not supported within stored programs.
INSERT DELAYED does not work with triggers.
INSERT DELAYED does not work if there is a check constraint in place.
INSERT DELAYED does not work if skip-new mode is active.

INSERT IGNORE

Ignoring Errors

Normally INSERT stops and rolls back when it encounters an error.

By using the IGNORE keyword all errors are converted to warnings, which will not stop inserts of additional rows.

Invalid values are changed to the closest valid value and inserted, with a warning produced.

The IGNORE and DELAYED options are ignored when you use ON DUPLICATE KEY UPDATE.

Warnings are issued for duplicate key errors when using IGNORE. You can get the old behavior if you set OLD_MODE to NO_DUP_KEY_WARNINGS_WITH_IGNORE.

No warnings are issued for duplicate key errors when using IGNORE.

See IGNORE for a full description of effects.

Examples

CREATE TABLE t1 (x INT UNIQUE);

INSERT INTO t1 VALUES(1),(2);

INSERT INTO t1 VALUES(2),(3);
ERROR 1062 (23000): Duplicate entry '2' for key 'x'
SELECT * FROM t1;
+------+
| x    |
+------+
|    1 |
|    2 |
+------+

INSERT IGNORE INTO t1 VALUES(2),(3);
Query OK, 1 row affected, 1 warning (0.04 sec)

SHOW WARNINGS;
+---------+------+---------------------------------+
| Level   | Code | Message                         |
+---------+------+---------------------------------+
| Warning | 1062 | Duplicate entry '2' for key 'x' |
+---------+------+---------------------------------+

SELECT * FROM t1;
+------+
| x    |
+------+
|    1 |
|    2 |
|    3 |
+------+

Converting values:

CREATE OR REPLACE TABLE t2(id INT, t VARCHAR(2) NOT NULL, n INT NOT NULL);

INSERT INTO t2(id) VALUES (1),(2);
ERROR 1364 (HY000): Field 't' doesn't have a default value

INSERT IGNORE INTO t2(id) VALUES (1),(2);
Query OK, 2 rows affected, 2 warnings (0.026 sec)
Records: 2  Duplicates: 0  Warnings: 2

SHOW WARNINGS;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1364 | Field 't' doesn't have a default value |
| Warning | 1364 | Field 'n' doesn't have a default value |
+---------+------+----------------------------------------+

SELECT * FROM t2;
+------+---+---+
| id   | t | n |
+------+---+---+
|    1 |   | 0 |
|    2 |   | 0 |
+------+---+---+

See INSERT ON DUPLICATE KEY UPDATE for further examples using that syntax.

INSERT ON DUPLICATE KEY UPDATE

Syntax

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
  [INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
  {VALUES | VALUE} ({expr | DEFAULT},...),(...),...
  [ ON DUPLICATE KEY UPDATE
    col=expr
      [, col=expr] ... ]

Or:

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [PARTITION (partition_list)]
    SET col={expr | DEFAULT}, ...
    [ ON DUPLICATE KEY UPDATE
      col=expr
        [, col=expr] ... ]

Or:

INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
    SELECT ...
    [ ON DUPLICATE KEY UPDATE
      col=expr
        [, col=expr] ... ]

Description

INSERT ... ON DUPLICATE KEY UPDATE is a MariaDB/MySQL extension to the INSERT statement that, if it finds a duplicate unique or primary key, will instead perform an UPDATE.

The row/s affected value is reported as 1 if a row is inserted, and 2 if a row is updated, unless the API's CLIENT_FOUND_ROWS flag is set.

If more than one unique index is matched, only the first is updated. It is not recommended to use this statement on tables with more than one unique index.

If the table has an AUTO_INCREMENT primary key and the statement inserts or updates a row, the LAST_INSERT_ID() function returns its AUTO_INCREMENT value.

The VALUES() function can only be used in a ON DUPLICATE KEY UPDATE clause and has no meaning in any other context. It returns the column values from the INSERT portion of the statement. This function is particularly useful for multi-rows inserts.

The IGNORE and DELAYED options are ignored when you use ON DUPLICATE KEY UPDATE.

See Partition Pruning and Selection for details on the PARTITION clause.

This statement activates INSERT and UPDATE triggers. See Trigger Overview for details.

Examples

CREATE TABLE ins_duplicate (id INT PRIMARY KEY, animal VARCHAR(30));
INSERT INTO ins_duplicate VALUES (1,'Aardvark'), (2,'Cheetah'), (3,'Zebra');

If there is no existing key, the statement runs as a regular INSERT:

INSERT INTO ins_duplicate VALUES (4,'Gorilla') 
  ON DUPLICATE KEY UPDATE animal='Gorilla';
Query OK, 1 row affected (0.07 sec)

SELECT * FROM ins_duplicate;
+----+----------+
| id | animal   |
+----+----------+
|  1 | Aardvark |
|  2 | Cheetah  |
|  3 | Zebra    |
|  4 | Gorilla  |
+----+----------+

A regular INSERT with a primary key value of 1 will fail, due to the existing key:

INSERT INTO ins_duplicate VALUES (1,'Antelope');
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'

However, we can use an INSERT ON DUPLICATE KEY UPDATE instead:

INSERT INTO ins_duplicate VALUES (1,'Antelope') 
  ON DUPLICATE KEY UPDATE animal='Antelope';
Query OK, 2 rows affected (0.09 sec)

Note that there are two rows reported as affected, but this refers only to the UPDATE.

SELECT * FROM ins_duplicate;
+----+----------+
| id | animal   |
+----+----------+
|  1 | Antelope |
|  2 | Cheetah  |
|  3 | Zebra    |
|  4 | Gorilla  |
+----+----------+

Adding a second unique column:

ALTER TABLE ins_duplicate ADD id2 INT;
UPDATE ins_duplicate SET id2=id+10;
ALTER TABLE ins_duplicate ADD UNIQUE KEY(id2);

Where two rows match the unique keys match, only the first is updated. This can be unsafe and is not recommended unless you are certain what you are doing.

INSERT INTO ins_duplicate VALUES (2,'Lion',13) 
  ON DUPLICATE KEY UPDATE animal='Lion';
Query OK, 2 rows affected (0.004 sec)

SELECT * FROM ins_duplicate;
+----+----------+------+
| id | animal   | id2  |
+----+----------+------+
|  1 | Antelope |   11 |
|  2 | Lion     |   12 |
|  3 | Zebra    |   13 |
|  4 | Gorilla  |   14 |
+----+----------+------+

Although the third row with an id of 3 has an id2 of 13, which also matched, it was not updated.

Changing id to an auto_increment field. If a new row is added, the auto_increment is moved forward. If the row is updated, it remains the same.

ALTER TABLE `ins_duplicate` CHANGE `id` `id` INT( 11 ) NOT NULL AUTO_INCREMENT;
ALTER TABLE ins_duplicate DROP id2;
SELECT Auto_increment FROM INFORMATION_SCHEMA.TABLES 
  WHERE TABLE_NAME='ins_duplicate';
+----------------+
| Auto_increment |
+----------------+
|              5 |
+----------------+

INSERT INTO ins_duplicate VALUES (2,'Leopard') 
  ON DUPLICATE KEY UPDATE animal='Leopard';
Query OK, 2 rows affected (0.00 sec)

SELECT Auto_increment FROM INFORMATION_SCHEMA.TABLES 
  WHERE TABLE_NAME='ins_duplicate';
+----------------+
| Auto_increment |
+----------------+
|              5 |
+----------------+

INSERT INTO ins_duplicate VALUES (5,'Wild Dog') 
  ON DUPLICATE KEY UPDATE animal='Wild Dog';
Query OK, 1 row affected (0.09 sec)

SELECT * FROM ins_duplicate;
+----+----------+
| id | animal   |
+----+----------+
|  1 | Antelope |
|  2 | Leopard  |
|  3 | Zebra    |
|  4 | Gorilla  |
|  5 | Wild Dog |
+----+----------+

SELECT Auto_increment FROM INFORMATION_SCHEMA.TABLES 
  WHERE TABLE_NAME='ins_duplicate';
+----------------+
| Auto_increment |
+----------------+
|              6 |
+----------------+

Refering to column values from the INSERT portion of the statement:

INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
    ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);

See the VALUES() function for more.

INSERT SELECT

Syntax

INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    SELECT ...
    [ ON DUPLICATE KEY UPDATE col_name=expr, ... ]

Description

With INSERT ... SELECT, you can quickly insert many rows into a table from one or more other tables. For example:

INSERT INTO tbl_temp2 (fld_id)
  SELECT tbl_temp1.fld_order_id
  FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;

tbl_name can also be specified in the form db_name.tbl_name (see Identifier Qualifiers). This allows to copy rows between different databases.

If the new table has a primary key or UNIQUE indexes, you can use IGNORE to handle duplicate key errors during the query. The newer values will not be inserted if an identical value already exists.

REPLACE can be used instead of INSERT to prevent duplicates on UNIQUE indexes by deleting old values. In that case, ON DUPLICATE KEY UPDATE cannot be used.

INSERT ... SELECT works for tables which already exist. To create a table for a given resultset, you can use CREATE TABLE ... SELECT.

INSERT

Syntax

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
 [INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
 {VALUES | VALUE} ({expr | DEFAULT},...),(...),...
 [ ON DUPLICATE KEY UPDATE
   col=expr
     [, col=expr] ... ] [RETURNING select_expr 
      [, select_expr ...]]

Or:

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [PARTITION (partition_list)]
    SET col={expr | DEFAULT}, ...
    [ ON DUPLICATE KEY UPDATE
      col=expr
        [, col=expr] ... ] [RETURNING select_expr 
      [, select_expr ...]]

Or:

INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
    SELECT ...
    [ ON DUPLICATE KEY UPDATE
      col=expr
        [, col=expr] ... ] [RETURNING select_expr 
      [, select_expr ...]]

The INSERT statement is used to insert new rows into an existing table. The INSERT ... VALUES and INSERT ... SET forms of the statement insert rows based on explicitly specified values. The INSERT ... SELECT form inserts rows selected from another table or tables. INSERT ... SELECT is discussed further in the INSERT ... SELECT article.

The table name can be specified in the form db_name.tbl_name or, if a default database is selected, in the form tbl_name (see Identifier Qualifiers). This allows to use INSERT ... SELECT to copy rows between different databases.

The PARTITION clause can be used in both the INSERT and the SELECT part. See Partition Pruning and Selection for details.

The RETURNING clause can be used.

The RETURNING clause is not available.

The columns list is optional. It specifies which values are explicitly inserted, and in which order. If this clause is not specified, all values must be explicitly specified, in the same order they are listed in the table definition.

The list of value follow the VALUES or VALUE keyword (which are interchangeable, regardless how much values you want to insert), and is wrapped by parenthesis. The values must be listed in the same order as the columns list. It is possible to specify more than one list to insert more than one rows with a single statement. If many rows are inserted, this is a speed optimization.

For one-row statements, the SET clause may be more simple, because you don't need to remember the columns order. All values are specified in the form col = expr.

Values can also be specified in the form of a SQL expression or subquery. However, the subquery cannot access the same table that is named in the INTO clause.

If you use the LOW_PRIORITY keyword, execution of the INSERT is delayed until no other clients are reading from the table. If you use the HIGH_PRIORITY keyword, the statement has the same priority as SELECTs. This affects only storage engines that use only table-level locking (MyISAM, MEMORY, MERGE). However, if one of these keywords is specified, concurrent inserts cannot be used. See HIGH_PRIORITY and LOW_PRIORITY clauses for details.

INSERT DELAYED

For more details on the DELAYED option, see INSERT DELAYED.

HIGH PRIORITY and LOW PRIORITY

See HIGH_PRIORITY and LOW_PRIORITY.

Defaults and Duplicate Values

See INSERT - Default & Duplicate Values for details..

INSERT IGNORE

See INSERT IGNORE.

INSERT ON DUPLICATE KEY UPDATE

See INSERT ON DUPLICATE KEY UPDATE.

Examples

Specifying the column names:

INSERT INTO person (first_name, last_name) VALUES ('John', 'Doe');

Inserting more than 1 row at a time:

INSERT INTO tbl_name VALUES (1, "row 1"), (2, "row 2");

Using the SET clause:

INSERT INTO person SET first_name = 'John', last_name = 'Doe';

SELECTing from another table:

INSERT INTO contractor SELECT * FROM person WHERE status = 'c';

See INSERT ON DUPLICATE KEY UPDATE and INSERT IGNORE for further examples.

INSERT ... RETURNING

INSERT ... RETURNING returns a result set of the inserted rows.

It returns the listed columns for all the rows that are inserted, or alternatively, the specified SELECT expression. Any SQL expressions which can be calculated can be used in the select expression for the RETURNING clause, including virtual columns and aliases, expressions which use various operators such as bitwise, logical and arithmetic operators, string functions, date-time functions, numeric functions, control flow functions, secondary functions and stored functions. Along with this, statements which have subqueries and prepared statements can also be used.

Examples

Simple INSERT statement:

INSERT INTO t2 VALUES (1,'Dog'),(2,'Lion'),(3,'Tiger'),(4,'Leopard') 
RETURNING id2,id2+id2,id2&id2,id2||id2;
+-----+---------+---------+----------+
| id2 | id2+id2 | id2&id2 | id2||id2 |
+-----+---------+---------+----------+
|   1 |       2 |       1 |        1 |
|   2 |       4 |       2 |        1 |
|   3 |       6 |       3 |        1 |
|   4 |       8 |       4 |        1 |
+-----+---------+---------+----------+

Using stored functions in RETURNING

DELIMITER |
CREATE FUNCTION f(arg INT) RETURNS INT
    BEGIN
       RETURN (SELECT arg+arg);
    END|

DELIMITER ;

PREPARE stmt FROM "INSERT INTO t1 SET id1=1, animal1='Bear' RETURNING f(id1), UPPER(animal1)";

EXECUTE stmt;
+---------+----------------+
| f(id1)  | UPPER(animal1) |
+---------+----------------+
|       2 | BEAR           |
+---------+----------------+

Subqueries in the RETURNING clause that return more than one row or column cannot be used.

Aggregate functions cannot be used in the RETURNING clause. Since aggregate functions work on a set of values, and if the purpose is to get the row count, ROW_COUNT() with SELECT can be used or it can be used in INSERT...SELECT...RETURNING if the table in the RETURNING clause is not the same as the INSERT table.

INSERT...RETURNING

INSERT ... RETURNING was added in MariaDB 10.5.0, and returns a result set of the inserted rows.

Syntax

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
 [INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
 {VALUES | VALUE} ({expr | DEFAULT},...),(...),...
 [ ON DUPLICATE KEY UPDATE
   col=expr
     [, col=expr] ... ] [RETURNING select_expr 
      [, select_expr ...]]

Or:

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [PARTITION (partition_list)]
    SET col={expr | DEFAULT}, ...
    [ ON DUPLICATE KEY UPDATE
      col=expr
        [, col=expr] ... ] [RETURNING select_expr 
      [, select_expr ...]]

Or:

INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
    SELECT ...
    [ ON DUPLICATE KEY UPDATE
      col=expr
        [, col=expr] ... ] [RETURNING select_expr 
      [, select_expr ...]]

Description

INSERT ... RETURNING returns a resultset of the inserted rows.

Examples

Simple INSERT statements:

CREATE OR REPLACE TABLE t2 (id INT, animal VARCHAR(20), t TIMESTAMP);

INSERT INTO t2 (id) VALUES (2),(3) RETURNING id,t;
+------+---------------------+
| id   | t                   |
+------+---------------------+
|    2 | 2021-04-28 00:59:32 |
|    3 | 2021-04-28 00:59:32 |
+------+---------------------+

INSERT INTO t2(id,animal) VALUES (1,'Dog'),(2,'Lion'),(3,'Tiger'),(4,'Leopard')  
  RETURNING id,id+id,id&id,id||id;
+------+-------+-------+--------+
| id   | id+id | id&id | id||id |
+------+-------+-------+--------+
|    1 |     2 |     1 |      1 |
|    2 |     4 |     2 |      1 |
|    3 |     6 |     3 |      1 |
|    4 |     8 |     4 |      1 |
+------+-------+-------+--------+

Using stored functions in RETURNING:

DELIMITER |
CREATE FUNCTION f(arg INT) RETURNS INT
    BEGIN
       RETURN (SELECT arg+arg);
    END|

DELIMITER ;

PREPARE stmt FROM "INSERT INTO t1 SET id1=1, animal1='Bear' RETURNING f(id1), UPPER(animal1)";

EXECUTE stmt;
+---------+----------------+
| f(id1)  | UPPER(animal1) |
+---------+----------------+
|       2 | BEAR           |
+---------+----------------+

Subqueries in the RETURNING clause that return more than one row or column cannot be used.

Aggregate functions cannot be used in the RETURNING clause. Since aggregate functions work on a set of values, and if the purpose is to get the row count, ROW_COUNT()with SELECT can be used or it can be used in INSERT...SELECT...RETURNING if the table in the RETURNING clause is not the same as the INSERT table.

LOAD Data into Tables or Index

Learn to load data into MariaDB tables or indexes using LOAD DATA. This section details efficient methods for bulk importing data from external files, optimizing for large datasets.

LOAD DATA INFILE

Syntax

LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE tbl_name
    [CHARACTER SET charset_name]
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'CHAR']
        [ESCAPED BY 'CHAR']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    [IGNORE number {LINES|ROWS}]
    [(col_name_or_user_var,...)]
    [SET col_name = expr,...]

Description

LOAD DATA INFILE is unsafe for statement-based replication.

Reads rows from a text file into the designated table on the database at a very high speed. The file name must be given as a literal string.

Files are written to disk using the SELECT INTO OUTFILE statement. You can then read the files back into a table using the LOAD DATA INFILE statement. The FIELDS and LINES clauses are the same in both statements and by default fields are expected to be terminated with tabs () and lines with newlines (). These clauses are optional, but if both are specified then the FIELDS clause must precede LINES.

Executing this statement activates INSERT triggers.

One must have the FILE privilege to be able to execute LOAD DATA INFILE. This is to ensure normal users cannot read system files. LOAD DATA LOCAL INFILE does not have this requirement.

If the secure_file_priv system variable is set (by default it is not), the loaded file must be present in the specified directory.

Note that MariaDB's systemd unit file restricts access to /home, /root, and /run/user by default. See Configuring access to home directories.

LOAD DATA LOCAL INFILE

When you execute the LOAD DATA INFILE statement, MariaDB Server attempts to read the input file from its own file system. By contrast, when you execute the LOAD DATA LOCAL INFILE statement, the client attempts to read the input file from its file system, and it sends the contents of the input file to the MariaDB Server. This allows you to load files from the client's local file system into the database.

If you don't want to permit this operation (perhaps for security reasons), you can disable the LOAD DATA LOCAL INFILE statement on either the server or the client.

The LOAD DATA LOCAL INFILE statement can be disabled on the server by setting the local_infile system variable to 0.
The LOAD DATA LOCAL INFILE statement can be disabled on the client. If you are using MariaDB Connector/C, this can be done by unsetting the CLIENT_LOCAL_FILES capability flag with the mysql_real_connect function or by unsetting the MYSQL_OPT_LOCAL_INFILE option with mysql_optionsv function. If you are using a different client or client library, then see the documentation for your specific client or client library to determine how it handles the LOAD DATA LOCAL INFILE statement.
The LOAD DATA LOCAL INFILE strict modes like STRICT_TRANS_TABLES are disabled with keyword "local". (MDEV-11235)

If the LOAD DATA LOCAL INFILE statement is disabled by either the server or the client and if the user attempts to execute it, then the server will cause the statement to fail with the following error message:

The used command is not allowed with this MariaDB version

Note that it is not entirely accurate to say that the MariaDB version does not support the command. It would be more accurate to say that the MariaDB configuration does not support the command. See MDEV-20500 for more information.

From MariaDB 10.5.2, the error message is more accurate:

The used command is not allowed because the MariaDB server or client 
  has disabled the local infile capability

REPLACE and IGNORE

If you load data from a file into a table that already contains data and has a primary key, you may encounter issues where the statement attempts to insert a row with a primary key that already exists. When this happens, the statement fails with Error 1064, protecting the data already on the table. If you want MariaDB to overwrite duplicates, use the REPLACE keyword.

The REPLACE keyword works like the REPLACE statement. Here, the statement attempts to load the data from the file. If the row does not exist, it adds it to the table. If the row contains an existing primary key, it replaces the table data. That is, in the event of a conflict, it assumes the file contains the desired row.

This operation can cause a degradation in load speed by a factor of 20 or more if the part that has already been loaded is larger than the capacity of the InnoDB Buffer Pool. This happens because it causes a lot of turnaround in the buffer pool.

Use the IGNORE keyword when you want to skip any rows that contain a conflicting primary key. Here, the statement attempts to load the data from the file. If the row does not exist, it adds it to the table. If the row contains an existing primary key, it ignores the addition request and moves on to the next. That is, in the event of a conflict, it assumes the table contains the desired row.

`IGNORE number {LINES|ROWS}`

The IGNORE number LINES syntax can be used to ignore a number of rows from the beginning of the file. Most often this is needed when the file starts with one row that includes the column headings.

Character Sets

When the statement opens the file, it attempts to read the contents using the default character-set, as defined by the character_set_database system variable.

In the cases where the file was written using a character-set other than the default, you can specify the character-set to use with the CHARACTER SET clause in the statement. It ignores character-sets specified by the SET NAMES statement and by the character_set_client system variable. Setting the CHARACTER SET clause to a value of binary indicates "no conversion."

The statement interprets all fields in the file as having the same character-set, regardless of the column data type. To properly interpret file contents, you must ensure that it was written with the correct character-set. If you write a data file with mariadb-dump -T or with the SELECT INTO OUTFILE statement with the mariadb client, be sure to use the --default-character-set option, so that the output is written with the desired character-set.

When using mixed character sets, use the CHARACTER SET clause in both SELECT INTO OUTFILE and LOAD DATA INFILE to ensure that MariaDB correctly interprets the escape sequences.

The character_set_filesystem system variable controls the interpretation of the filename. It is currently not possible to load data files that use the ucs2 character set.

Preprocessing Inputs

col_name_or_user_var can be a column name, or a user variable. In the case of a variable, the SET statement can be used to preprocess the value before loading into the table.

Priority and Concurrency

In storage engines that perform table-level locking (MyISAM, MEMORY and MERGE), using the LOW_PRIORITY keyword, MariaDB delays insertions until no other clients are reading from the table. Alternatively, when using the MyISAM storage engine, you can use the CONCURRENT keyword to perform concurrent insertion.

The LOW_PRIORITY and CONCURRENT keywords are mutually exclusive. They cannot be used in the same statement.

Progress Reporting

The LOAD DATA INFILE statement supports progress reporting. You may find this useful when dealing with long-running operations. Using another client you can issue a SHOW PROCESSLIST query to check the progress of the data load.

Using mariadb-import

MariaDB ships with a separate utility for loading data from files: mariadb-import. It operates by sending LOAD DATA INFILE statements to the server.

MariaDB ships with a separate utility for loading data from files: mysqlimport . It operates by sending LOAD DATA INFILE statements to the server.

Using mariadb-import you can compress the file using the --compress option, to get better performance over slow networks, providing both the client and server support the compressed protocol. Use the --local option to load from the local file system.

Indexing

In cases where the storage engine supports ALTER TABLE... DISABLE KEYS statements (MyISAM and Aria), the LOAD DATA INFILE statement automatically disables indexes during the execution.

Examples

You have a file with this content (note the separator is ',', not tab, which is the default):

2,2
3,3
4,4
5,5
6,8

CREATE TABLE t1 (a INT, b INT, c INT, d INT, PRIMARY KEY (a));
LOAD DATA LOCAL INFILE 
 '/tmp/loaddata7.dat' INTO TABLE t1 FIELDS TERMINATED BY ',' (a,b) SET c=a+b;
SELECT * FROM t1;
+------+------+------+
| a    | b    | c    |
+------+------+------+
|    2 |    2 |    4 |
|    3 |    3 |    6 |
|    4 |    4 |    8 |
|    5 |    5 |   10 |
|    6 |    8 |   14 |
+------+------+------+

Another example, given the following data (the separator is a tab):

1       a
2       b

The value of the first column is doubled before loading:

LOAD DATA INFILE 'ld.txt' INTO TABLE ld (@i,v) SET i=@i*2;

SELECT * FROM ld;
+------+------+
| i    | v    |
+------+------+
|    2 | a    |
|    4 | b    |
+------+------+

LOAD INDEX

Syntax

LOAD INDEX INTO CACHE
  tbl_index_list [, tbl_index_list] ...

tbl_index_list:
  tbl_name
    [[INDEX|KEY] (index_name[, index_name] ...)]
    [IGNORE LEAVES]

Description

The LOAD INDEX INTO CACHE statement preloads a table index into the key cache to which it has been assigned by an explicit CACHE INDEX statement, or into the default key cache otherwise.LOAD INDEX INTO CACHE is used only for MyISAM or Aria tables.

The IGNORE LEAVES modifier causes only blocks for the nonleaf nodes of the index to be preloaded.

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

LOAD XML

Syntax

LOAD XML [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE [db_name.]tbl_name
    [CHARACTER SET charset_name]
    [ROWS IDENTIFIED BY '<tagname>']
    [IGNORE number {LINES | ROWS}]
    [(column_or_user_var,...)]
    [SET col_name = expr,...]

Description

The LOAD XML statement reads data from an XML file into a table. Thefile_name must be given as a literal string. The tagname in the optional ROWS IDENTIFIED BY clause must also be given as a literal string, and must be surrounded by angle brackets (< and >).

LOAD XML acts as the complement of running the mariadb client in XML output mode (that is, starting the client with the --xml option). To write data from a table to an XML file, use a command such as the following one from the system shell:

shell> mariadb --xml -e 'SELECT * FROM mytable' > file.xml

To read the file back into a table, use LOAD XML INFILE. By default, the element is considered to be the equivalent of a database table row; this can be changed using the ROWS IDENTIFIED BY clause.

This statement supports three different XML formats:

Column names as attributes and column values as attribute values:

<row column1="value1" column2="value2" .../>

Column names as tags and column values as the content of these tags:

<row>
  <column1>value1</column1>
  <column2>value2</column2>
</row>

Column names are the name attributes of tags, and values are the contents of these tags:

<row>
  <field name='column1'>value1</field>
  <field name='column2'>value2</field>
</row>

This is the format used by other tools, such as mariadb-dump.

All 3 formats can be used in the same XML file; the import routine automatically detects the format for each row and interprets it correctly. Tags are matched based on the tag or attribute name and the column name.

The following clauses work essentially the same way for LOAD XML as they do for LOAD DATA:

LOW_PRIORITY or CONCURRENT
LOCAL
REPLACE or IGNORE
CHARACTER SET
(column_or_user_var,...)
SET

See LOAD DATA for more information about these clauses.

The IGNORE number LINES or IGNORE number ROWS clause causes the first number rows in the XML file to be skipped. It is analogous to the LOAD DATA statement's IGNORE ... LINES clause.

If the LOW_PRIORITY keyword is used, insertions are delayed until no other clients are reading from the table. The CONCURRENT keyword allows the use of concurrent inserts. These clauses cannot be specified together.

This statement activates INSERT triggers.

Selecting Data

Learn to select data in MariaDB Server using SQL. This section covers various SELECT statement clauses, including WHERE, GROUP BY, and ORDER BY, to retrieve and filter your data effectively.

DUAL

Description

You can use DUAL instead of a table name in situations where no tables are referenced, such as the following SELECT statement:

SELECT 1 + 1 FROM DUAL;
+-------+
| 1 + 1 |
+-------+
|     2 |
+-------+

DUAL is purely for the convenience of people who require that all SELECT statements should haveFROM and possibly other clauses. MariaDB ignores the clauses. MariaDB does not require FROM DUAL if no tables are referenced.

FROM DUAL could be used when you only SELECT computed values, but require a WHERE clause, perhaps to test that a script correctly handles empty resultsets:

SELECT 1 FROM DUAL WHERE FALSE;
Empty set (0.00 sec)

FOR UPDATE

InnoDB supports row-level locking. Selected rows can be locked using LOCK IN SHARE MODE or FOR UPDATE. In both cases, a lock is acquired on the rows read by the query, and it will be released when the current transaction is committed.

The FOR UPDATE clause of SELECT applies only when autocommit is set to 0 or the SELECT is enclosed in a transaction. A lock is acquired on the rows, and other transactions are prevented from writing the rows, acquire locks, and from reading them (unless their isolation level is READ UNCOMMITTED).

If autocommit is set to 1, the LOCK IN SHARE MODE and FOR UPDATE clauses have no effect in InnoDB. For non-transactional storage engines like MyISAM and ARIA, a table level lock will be taken even if autocommit is set to 1.

If the isolation level is set to SERIALIZABLE, all plain SELECT statements are converted to SELECT ... LOCK IN SHARE MODE.

Example

SELECT * FROM trans WHERE period=2001 FOR UPDATE;

GROUP BY

Use the GROUP BY clause in a SELECT statement to group rows together that have the same value in one or more column, or the same computed value using expressions with any functions and operators except grouping functions. When you use a GROUP BY clause, you will get a single result row for each group of rows that have the same value for the expression given in GROUP BY.

When grouping rows, grouping values are compared as if by the = operator. For string values, the = operator ignores trailing whitespace and may normalize characters and ignore case, depending on the collation in use.

You can use any of the grouping functions in your select expression. Their values will be calculated based on all the rows that have been grouped together for each result row. If you select a non-grouped column or a value computed from a non-grouped column, it is undefined which row the returned value is taken from. This is not permitted if the ONLY_FULL_GROUP_BY SQL_MODE is used.

You can use multiple expressions in the GROUP BY clause, separated by commas. Rows are grouped together if they match on each of the expressions.

You can also use a single integer as the grouping expression. If you use an integer n, the results will be grouped by the nth column in the select expression.

The WHERE clause is applied before the GROUP BY clause. It filters non-aggregated rows before the rows are grouped together. To filter grouped rows based on aggregate values, use the HAVING clause. The HAVING clause takes any expression and evaluates it as a boolean, just like the WHERE clause. You can use grouping functions in the HAVING clause. As with the select expression, if you reference non-grouped columns in the HAVING clause, the behavior is undefined.

By default, if a GROUP BY clause is present, the rows in the output will be sorted by the expressions used in the GROUP BY. You can also specify ASC or DESC (ascending, descending) after those expressions, like in ORDER BY. The default is ASC.

If you want the rows to be sorted by another field, you can add an explicit ORDER BY. If you don't want the result to be ordered, you can add ORDER BY NULL.

WITH ROLLUP

The WITH ROLLUP modifier adds extra rows to the result set that represent super-aggregate summaries. For a full description with examples, see SELECT WITH ROLLUP.

GROUP BY Examples

Consider the following table that records how many times each user has played and won a game:

CREATE TABLE plays (name VARCHAR(16), plays INT, wins INT);
INSERT INTO plays VALUES 
  ("John", 20, 5), 
  ("Robert", 22, 8), 
  ("Wanda", 32, 8), 
  ("Susan", 17, 3);

Get a list of win counts along with a count:

SELECT wins, COUNT(*) FROM plays GROUP BY wins;
+------+----------+
| wins | COUNT(*) |
+------+----------+
|    3 |        1 |
|    5 |        1 |
|    8 |        2 |
+------+----------+
3 rows in set (0.00 sec)

The GROUP BY expression can be a computed value, and can refer back to an identifier specified with AS. Get a list of win averages along with a count:

SELECT (wins / plays) AS winavg, COUNT(*) FROM plays GROUP BY winavg;
+--------+----------+
| winavg | COUNT(*) |
+--------+----------+
| 0.1765 |        1 |
| 0.2500 |        2 |
| 0.3636 |        1 |
+--------+----------+
3 rows in set (0.00 sec)

You can use any grouping function in the select expression. For each win average as above, get a list of the average play count taken to get that average:

SELECT (wins / plays) AS winavg, AVG(plays) FROM plays 
  GROUP BY winavg;
+--------+------------+
| winavg | AVG(plays) |
+--------+------------+
| 0.1765 |    17.0000 |
| 0.2500 |    26.0000 |
| 0.3636 |    22.0000 |
+--------+------------+
3 rows in set (0.00 sec)

You can filter on aggregate information using the HAVING clause. The HAVING clause is applied after GROUP BY and allows you to filter on aggregate data that is not available to the WHERE clause. Restrict the above example to results that involve an average number of plays over 20:

SELECT (wins / plays) AS winavg, AVG(plays) FROM plays 
  GROUP BY winavg HAVING AVG(plays) > 20;
+--------+------------+
| winavg | AVG(plays) |
+--------+------------+
| 0.2500 |    26.0000 |
| 0.3636 |    22.0000 |
+--------+------------+
2 rows in set (0.00 sec)

LIMIT

Description

Use the LIMIT clause to restrict the number of returned rows. When you use a single integer n with LIMIT, the first n rows will be returned. Use the ORDER BY clause to control which rows come first. You can also select a number of rows after an offset using either of the following:

LIMIT offset, row_count
LIMIT row_count OFFSET offset

When you provide an offset m with a limit n, the first m rows will be ignored, and the following n rows will be returned.

Executing an UPDATE with the LIMIT clause is not safe for replication. LIMIT 0 is an exception to this rule (see MDEV-6170).

There is a LIMIT ROWS EXAMINED optimization which provides the means to terminate the execution of SELECT statements which examine too many rows, and thus use too many resources. See LIMIT ROWS EXAMINED.

Multi-Table Updates

It is possible to use LIMIT (or ORDER BY) in a multi-table UPDATE statement.

It is not possible to use LIMIT (or ORDER BY) in a multi-table UPDATE statement.

GROUP_CONCAT

It is possible to use LIMIT with GROUP_CONCAT().

It is not possible to use LIMIT with GROUP_CONCAT().

Examples

CREATE TABLE members (name VARCHAR(20));
INSERT INTO members VALUES('Jagdish'),('Kenny'),('Rokurou'),('Immaculada');

SELECT * FROM members;
+------------+
| name       |
+------------+
| Jagdish    |
| Kenny      |
| Rokurou    |
| Immaculada |
+------------+

Select the first two names (no ordering specified):

SELECT * FROM members LIMIT 2;
+---------+
| name    |
+---------+
| Jagdish |
| Kenny   |
+---------+

All the names in alphabetical order:

SELECT * FROM members ORDER BY name;
+------------+
| name       |
+------------+
| Immaculada |
| Jagdish    |
| Kenny      |
| Rokurou    |
+------------+

The first two names, ordered alphabetically:

SELECT * FROM members ORDER BY name LIMIT 2;
+------------+
| name       |
+------------+
| Immaculada |
| Jagdish    |
+------------+

The third name, ordered alphabetically (the first name would be offset zero, so the third is offset two):

SELECT * FROM members ORDER BY name LIMIT 2,1;
+-------+
| name  |
+-------+
| Kenny |
+-------+

From MariaDB 10.3.2, LIMIT can be used in a multi-table update:

CREATE TABLE warehouse (product_id INT, qty INT);
INSERT INTO warehouse VALUES (1,100),(2,100),(3,100),(4,100);

CREATE TABLE store (product_id INT, qty INT);
INSERT INTO store VALUES (1,5),(2,5),(3,5),(4,5);

UPDATE warehouse,store SET warehouse.qty = warehouse.qty-2, store.qty = store.qty+2 
  WHERE (warehouse.product_id = store.product_id AND store.product_id  >= 1) 
    ORDER BY store.product_id DESC LIMIT 2;

SELECT * FROM warehouse;
+------------+------+
| product_id | qty  |
+------------+------+
|          1 |  100 |
|          2 |  100 |
|          3 |   98 |
|          4 |   98 |
+------------+------+

SELECT * FROM store;
+------------+------+
| product_id | qty  |
+------------+------+
|          1 |    5 |
|          2 |    5 |
|          3 |    7 |
|          4 |    7 |
+------------+------+

When using LIMIT with GROUP_CONCAT, you can simplify certain queries. Consider this table:

CREATE TABLE d (dd DATE, cc INT);

INSERT INTO d VALUES ('2017-01-01',1);
INSERT INTO d VALUES ('2017-01-02',2);
INSERT INTO d VALUES ('2017-01-04',3);

The following query works fine, but is rather complex:

SELECT SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",dd,cc) 
ORDER BY cc DESC),",",1) FROM d;
+----------------------------------------------------------------------------+
| SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",dd,cc) ORDER BY cc DESC),",",1) |
+----------------------------------------------------------------------------+
| 2017-01-04:3                                                               |
+----------------------------------------------------------------------------+

It can be simplified to this:

SELECT GROUP_CONCAT(CONCAT_WS(":",dd,cc) 
ORDER BY cc DESC LIMIT 1) FROM d;
+-------------------------------------------------------------+
| GROUP_CONCAT(CONCAT_WS(":",dd,cc) ORDER BY cc DESC LIMIT 1) |
+-------------------------------------------------------------+
| 2017-01-04:3                                                |
+-------------------------------------------------------------+

LOCK IN SHARE MODE

InnoDB supports row-level locking. Selected rows can be locked using LOCK IN SHARE MODE or FOR UPDATE. In both cases, a lock is acquired on the rows read by the query, and it will be released when the current transaction is committed.

When LOCK IN SHARE MODE is specified in a SELECT statement, MariaDB will wait until all transactions that have modified the rows are committed. Then, a write lock is acquired. All transactions can read the rows, but if they want to modify them, they have to wait until your transaction is committed.

If autocommit is set to 1 (the default), the LOCK IN SHARE MODE and FOR UPDATE clauses have no effect in InnoDB. For non-transactional storage engines like MyISAM and ARIA, a table level lock will be taken even if autocommit is set to 1.

Optimizer Hints

Optimizer hints are options available that affect the execution plan.

SELECT Modifiers

HIGH PRIORITY

HIGH_PRIORITY gives the statement a higher priority. If the table is locked, high priority SELECTs will be executed as soon as the lock is released, even if other statements are queued. HIGH_PRIORITY applies only if the storage engine only supports table-level locking (MyISAM, MEMORY, MERGE). See HIGH_PRIORITY and LOW_PRIORITY clauses for details.

SQL_CACHE / SQL_NO_CACHE

If the query_cache_type system variable is set to 2 or DEMAND, and the current statement is cacheable, SQL_CACHE causes the query to be cached and SQL_NO_CACHE causes the query not to be cached. For UNIONs, SQL_CACHE or SQL_NO_CACHE should be specified for the first query. See also The Query Cache for more detail and a list of the types of statements that aren't cacheable.

SQL_BUFFER_RESULT

SQL_BUFFER_RESULT forces the optimizer to use a temporary table to process the result. This is useful to free locks as soon as possible.

SQL_SMALL_RESULT / SQL_BIG_RESULT

SQL_SMALL_RESULT and SQL_BIG_RESULT tell the optimizer whether the result is very big or not. Usually, GROUP BY and DISTINCT operations are performed using a temporary table. Only if the result is very big, using a temporary table is not convenient. The optimizer automatically knows if the result is too big, but you can force the optimizer to use a temporary table with SQL_SMALL_RESULT, or avoid the temporary table using SQL_BIG_RESULT.

STRAIGHT_JOIN

STRAIGHT_JOIN applies to the JOIN queries, and tells the optimizer that the tables must be read in the order they appear in the SELECT. For const and system table this options is sometimes ignored.

SQL_CALC_FOUND_ROWS

SQL_CALC_FOUND_ROWS is only applied when using the LIMIT clause. If this option is used, MariaDB will count how many rows would match the query, without the LIMIT clause. That number can be retrieved in the next query, using FOUND_ROWS().

USE/FORCE/IGNORE INDEX

USE INDEX, FORCE INDEX and IGNORE INDEX constrain the query planning to a specific index. For further information about some of these options, see How to force query plans.

MariaDB starting with 12.0

Expanded Optimizer Hints

Syntax

Hints are placed after the main statement verb.

UPDATE /*+ hints */ table ...;
DELETE /*+ hints */ FROM table... ;
SELECT /*+ hints */  ...

They can also appear after the SELECT keyword in any subquery:

SELECT * FROM t1 WHERE a IN (SELECT /*+ hints */ ...)

There can be one or more hints separated with space:

hints:  hint hint ...

Description

Each individual hint is hint name and arguments. In case there are no arguments, the () brackets are still present:

hint:  hint_name([arguments])

Incorrect hints produce warnings (a setting to make them errors is not implemented yet). Hints that are not ignored are kept in the query text (you can see them in SHOW PROCESSLIST, Slow Query Log, EXPLAIN EXTENDED) Hints that were incorrect and were ignored are removed from there.

Hint Hierarchy

Hints can be

global - they apply to whole query;
table-level - they apply to a table;
index-level - they apply to an index in a table.

Table-Level Hints

hint_name([table_name [table_name [,...]] )

Index-Level Hints

Index-level hints apply to indexes. Possible syntax variants are:

hint_name(table_name [index_name [, index_name] ...])

hint_name(table_name@query_block [index_name [, index_name] ...])

hint_name(@query_block  table_name [index_name [, index_name] ...])

Query Block Naming

The QB_NAME hint is used to assign a name to the query block the hint is in. The Query Block is either a SELECT or a top-level construct of UPDATE or DELETE statement.

SELECT /*+ QB_NAME(foo) */ select_list FROM ...

The name can then can be used

to refer to the query block;
to refer to a table in the query block as table_name@query_block_name.

Query block scope is the whole statement. It is invalid to use the same name for multiple query blocks. You can refer to the query block "down into subquery", "down into derived table", "up to the parent" and "to a right sibling in the UNION". You cannot refer "to a left sibling in a UNION".

Hints inside views are not supported, yet. You can neither use hints in VIEW definitions, nor control query plans inside non-merged views. (This is because QB_NAME binding are done "early", before we know that some tables are views.)

SELECT#N NAMES

Besides the given name, any query block is given a name select#n. You can see it when running EXPLAIN EXTENDED:

Note 1003 SELECT /*+ NO_RANGE_OPTIMIZATION(t3@select#1 PRIMARY) */ ...

It is not possible to use it in the hint text:

SELECT /*+ BKA(tbl1@`select#1`) */ 1 FROM tbl1 ...;

QB_NAME in CTEs

Hints that control @name will control the first use of the CTE (common table expression).

Effect of Optimizer Hints

The optimizer can be controlled by

server variables - optimizer_switch, join_cache_level, and so forth;
old-style hints;
new-style hints.

Old-style hints do not overlap with server variable settings.

New-style hints are more specific than server variable settings, so they override the server variable settings.

Hints are "narrowly interpreted" and "best effort" - if a hint dictates to do something, for example:

SELECT  /*+ MRR(t1 t1_index1) */  ... FROM t1 ...

It means: When considering a query plan that involves using t1_index1 in a way that one can use MRR, use MRR. If the query planning is such that use of t1_index1 doesn't allow to use MRR, it won't be used.

The optimizer may also consider using t1_index2 and pick that over using t1_index1. In such cases, the hint is effectively ignored and no warning is given.

List of Hints

NO_RANGE_OPTIMIZATION

An index-level hint that disables range optimization for certain index(es):

SELECT /*+ NO_RANGE_OPTIMIZATION(tbl index1 index2) */  * FROM tbl ...

NO_ICP

An index-level hint that disables Index Condition Pushdown for the indexes. ICP+BKA is disabled as well.

SELECT /*+ NO_ICP(tbl index1 index2) */  * FROM tbl ...

MRR and NO_MRR

Index-level hints to force or disable use of MRR.

SELECT /*+ MRR(tbl index1 index2) */  * FROM tbl ... 

SELECT /*+ NO_MRR(tbl index1 index2) */  * FROM tbl ...

This controls:

MRR optimization for range access (mdev35483-mrr-is-narrow.sql);
BKA mdev35483-mrr-controls-bka-partially.sql.

BKA() and NO_BKA()

Query block or table-level hints.

BKA() also enables MRR to make BKA possible. (This is different from session variables, where you need to enable MRR separately). This also enables BKAH.

BNL() and NO_BNL()

Controls BNL-H.

The implementation is "BNL() hint effectively increases join_cache_level up to 4 " .. for the table(s) it applies to.

MAX_EXECUTION_TIME()

Global-level hint to limit query execution time

SELECT /*+ MAX_EXECUTION_TIME(milliseconds) */ ...  ;

A query that doesn't finish in the time specified will be aborted with an error. If @@max_statement_time is set, the hint will be ignored and a warning produced. Note that this contradicts the stated principle that "new-style hints are more specific than server variable settings, so they override the server variable settings".

Subquery Hints

SUBQUERY Hint

Query block-level hint.

SUBQUERY([@query_block_name] MATERIALIZATION)

SUBQUERY([@query_block_name] INTOEXISTS)

This controls non-semi-join subqueries. The parameter specifies which subquery to use. Use of this hint disables conversion of subquery into semi-join.

SEMIJOIN and NO_SEMIJOIN

Query block-level hints.

This controls the conversion of subqueries to semi-joins and which semi-join strategies are allowed.

[NO_]SEMIJOIN([@query_block_name] [strategy [, strategy] ...])

where the strategy is one of DUPSWEEDOUT, FIRSTMATCH, LOOSESCAN, MATERIALIZATION.

Expanded optimizer hints are not available.

ORDER BY

Description

Use the ORDER BY clause to order a resultset, such as that are returned from a SELECT statement. You can specify just a column or use any expression with functions. If you are using the GROUP BY clause, you can use grouping functions in ORDER BY. Ordering is done after grouping.

You can use multiple ordering expressions, separated by commas. Rows will be sorted by the first expression, then by the second expression if they have the same value for the first, and so on.

You can use the keywords ASC and DESC after each ordering expression to force that ordering to be ascending or descending, respectively. Ordering is ascending by default.

You can also use a single integer as the ordering expression. If you use an integer n, the results will be ordered by the nth column in the select expression.

When string values are compared, they are compared as if by the STRCMP function. STRCMP ignores trailing whitespace and may normalize characters and ignore case, depending on the collation in use.

Duplicated entries in the ORDER BY clause are removed.

ORDER BY can also be used to order the activities of a DELETE or UPDATE statement (usually with the LIMIT clause).

It is possible to use ORDER BY (or LIMIT) in a multi-table UPDATE statement.

It is not possible to use ORDER BY (or LIMIT) in a multi-table UPDATE statement.

MariaDB allows packed sort keys and values of non-sorted fields in the sort buffer. This can make filesort temporary files much smaller when VARCHAR, CHAR or BLOB columns are used, notably speeding up some ORDER BY sorts.

MariaDB does not allow packed sort keys and values of non-sorted fields in the sort buffer.

Examples

CREATE TABLE seq (i INT, x VARCHAR(1));
INSERT INTO seq VALUES (1,'a'), (2,'b'), (3,'b'), (4,'f'), (5,'e');

SELECT * FROM seq ORDER BY i;
+------+------+
| i    | x    |
+------+------+
|    1 | a    |
|    2 | b    |
|    3 | b    |
|    4 | f    |
|    5 | e    |
+------+------+

SELECT * FROM seq ORDER BY i DESC;
+------+------+
| i    | x    |
+------+------+
|    5 | e    |
|    4 | f    |
|    3 | b    |
|    2 | b    |
|    1 | a    |
+------+------+

SELECT * FROM seq ORDER BY x,i;
+------+------+
| i    | x    |
+------+------+
|    1 | a    |
|    2 | b    |
|    3 | b    |
|    5 | e    |
|    4 | f    |
+------+------+

ORDER BY in an UPDATE statement, in conjunction with LIMIT:

UPDATE seq SET x='z' WHERE x='b' ORDER BY i DESC LIMIT 1;

SELECT * FROM seq;
+------+------+
| i    | x    |
+------+------+
|    1 | a    |
|    2 | b    |
|    3 | z    |
|    4 | f    |
|    5 | e    |
+------+------+

ORDER BY can be used in a multi-table update:

CREATE TABLE warehouse (product_id INT, qty INT);
INSERT INTO warehouse VALUES (1,100),(2,100),(3,100),(4,100);

CREATE TABLE store (product_id INT, qty INT);
INSERT INTO store VALUES (1,5),(2,5),(3,5),(4,5);

UPDATE warehouse,store SET warehouse.qty = warehouse.qty-2, store.qty = store.qty+2 
  WHERE (warehouse.product_id = store.product_id AND store.product_id  >= 1) 
    ORDER BY store.product_id DESC LIMIT 2;

SELECT * FROM warehouse;
+------------+------+
| product_id | qty  |
+------------+------+
|          1 |  100 |
|          2 |  100 |
|          3 |   98 |
|          4 |   98 |
+------------+------+

SELECT * FROM store;
+------------+------+
| product_id | qty  |
+------------+------+
|          1 |    5 |
|          2 |    5 |
|          3 |    7 |
|          4 |    7 |
+------------+------+

PROCEDURE

The PROCEDURE clause of SELECT passes the whole result set to a Procedure which will process it. These Procedures are not Stored Procedures, and can only be written in the C language, so it is necessary to recompile the server.

Currently, the only available procedure is ANALYSE, which examines the resultset and suggests the optimal datatypes for each column. It is defined in the sql/sql_analyse.cc file, and can be used as an example to create more Procedures.

This clause cannot be used in a view definition.

SELECT

Syntax

SELECT
    [/*+ hints */]
    [ALL | DISTINCT | DISTINCTROW]
    [HIGH_PRIORITY]
    [STRAIGHT_JOIN]
    [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
    [SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
    select_expr [, select_expr ...]
    [ FROM table_references
      [WHERE where_condition]
      [GROUP BY {col_name | expr | position} [ASC | DESC], ... [WITH ROLLUP]]
      [HAVING where_condition]
      [ORDER BY {col_name | expr | position} [ASC | DESC], ...]
      [LIMIT {[offset,] row_count | row_count OFFSET offset  
      [ROWS EXAMINED rows_limit] } |
        [OFFSET start { ROW | ROWS }]
        [FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES }] ]
      procedure|[PROCEDURE procedure_name(argument_list)]
      [INTO OUTFILE 'file_name' [CHARACTER SET charset_name] [export_options] |
        INTO DUMPFILE 'file_name' | INTO var_name [, var_name] ]
      [FOR UPDATE lock_option | LOCK IN SHARE MODE lock_option]
export_options:
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
lock_option:
    [WAIT n | NOWAIT | SKIP LOCKED]

[/*+ hints */] syntax is available from MariaDB 11.8.

[/*+ hints */] syntax is unavailable.

Description

SELECT is used to retrieve rows selected from one or more tables, and can include UNION statements and subqueries.

Each select_expr expression indicates a column or data that you want to retrieve. You must have at least one select expression. See Select Expressions below.
The FROM clause indicates the table or tables from which to retrieve rows. Use either a single table name or a JOIN expression. See JOIN for details. If no table is involved, FROM DUAL can be specified.
Each table can also be specified as db_name.tabl_name. Each column can also be specified as tbl_name.col_name or even db_name.tbl_name.col_name. This allows one to write queries which involve multiple databases. See Identifier Qualifiers for syntax details.
The WHERE clause, if given, indicates the condition or conditions that rows must satisfy to be selected. The where_condition is an expression that evaluates to true for each row to be selected. The statement selects all rows if there is no WHERE clause.
- In the WHERE clause, you can use any of the functions and operators that MariaDB supports, except for aggregate (summary) functions. See Functions and Operators and Functions and Modifiers for use with GROUP BY (aggregate).
Use the ORDER BY clause to order the results.
Use the LIMIT clause to restrict the results to only a certain number of rows, optionally with an offset.
Use the GROUP BY and HAVING clauses to group rows together when they have columns or computed values in common.

SELECT can also be used to retrieve rows computed without reference to any table.

Select Expressions

A SELECT statement must contain one or more select expressions, separated by commas. Each select expression can be one of the following:

The name of a column.
Any expression using functions and operators.
* to select all columns from all tables in the FROM clause.
tbl_name.* to select all columns from just the table tbl_name.

When specifying a column, you can either use just the column name or qualify the column name with the name of the table using tbl_name.col_name. The qualified form is useful if you are joining multiple tables in the FROM clause. If you do not qualify the column names when selecting from multiple tables, MariaDB will try to find the column in each table. It is an error if that column name exists in multiple tables.

You can quote column names using backticks. If you are qualifying column names with table names, quote each part separately as tbl_name`.`col_name.

If you use any grouping functions in any of the select expressions, all rows in your results will be implicitly grouped, as if you had used GROUP BY NULL. GROUP BY NULL being an expression behaves specially such that the entire result set is treated as a group.

DISTINCT

A query may produce some identical rows. By default, all rows are retrieved, even when their values are the same. To explicitly specify that you want to retrieve identical rows, use the ALL option. If you want duplicates to be removed from the result set, use the DISTINCT option. DISTINCTROW is a synonym for DISTINCT. See also COUNT DISTINCT and SELECT UNIQUE in Oracle mode.

INTO

The INTO clause is used to specify that the query results should be written to a file or variable.

SELECT INTO OUTFILE - formatting and writing the result to an external file.
SELECT INTO DUMPFILE - binary-safe writing of the unformatted results to an external file.
SELECT INTO Variable - selecting and setting variables.

The reverse of SELECT INTO OUTFILE is LOAD DATA.

LIMIT

Restricts the number of returned rows. See LIMIT and LIMIT ROWS EXAMINED for details.

See LOCK IN SHARE MODE and FOR UPDATE for details on the respective locking clauses.

OFFSET ... FETCH

See SELECT ... OFFSET ... FETCH.

The clause doesn't exist.

ORDER BY

Order a result set. See ORDER BY for details.

PARTITION

Specifies to the optimizer which partitions are relevant for the query. Other partitions will not be read. See Partition Pruning and Selection for details.

PROCEDURE

Passes the whole result set to a C Procedure. See PROCEDURE and PROCEDURE ANALYSE (the only built-in procedure not requiring the server to be recompiled).

SKIP LOCKED

This causes rows that couldn't be locked (LOCK IN SHARE MODE or FOR UPDATE) to be excluded from the result set. An explicit NOWAIT is implied here. This is only implemented on InnoDB tables and ignored otherwise.

The clause doesn't exist.

Optimizer Hints

These include HIGH_PRIORITY, STRAIGHT_JOIN, SQL_SMALL_RESULT | SQL_BIG_RESULT, SQL_BUFFER_RESULT, SQL_CACHE | SQL_NO_CACHE, and SQL_CALC_FOUND_ROWS.

See Optimizer Hints for details.

max_statement_time clause

By using max_statement_time in conjunction with SET STATEMENT, it is possible to limit the execution time of individual queries. For example:

SET STATEMENT max_statement_time=100 FOR 
  SELECT field1 FROM table_name ORDER BY field1;

WAIT/NOWAIT

Set the lock wait timeout. See WAIT and NOWAIT.

Examples

SELECT f1,f2 FROM t1 WHERE (f3<=10) AND (f4='y');

See Getting Data from MariaDB (Beginner tutorial), or the various sub-articles, for more examples.

SELECT INTO DUMPFILE

Syntax

SELECT ... INTO DUMPFILE 'file_path'

Description

SELECT ... INTO DUMPFILE is a SELECT clause which writes the resultset into a single unformatted row, without any separators, in a file. The results will not be returned to the client.

file_path can be an absolute path, or a relative path starting from the data directory. It can only be specified as a string literal, not as a variable. However, the statement can be dynamically composed and executed as a prepared statement to work around this limitation.

This statement is binary-safe and so is particularly useful for writing BLOB values to file. It can be used, for example, to copy an image or an audio document from the database to a file.

The file must not exist. It cannot be overwritten. A user needs the FILE privilege to run this statement. Also, MariaDB needs permission to write files in the specified location. If the secure_file_priv system variable is set to a non-empty directory name, the file can only be written to that directory.

The character_set_filesystem system variable has controlled interpretation of file names that are given as literal strings.

The character_set_filesystem system variable does not have controlled interpretation of file names that are given as literal strings.

Example

SELECT _utf8'Hello world!' INTO DUMPFILE '/tmp/world';

SELECT LOAD_FILE('/tmp/world') AS world;
+--------------+
| world        |
+--------------+
| Hello world! |
+--------------+

SELECT INTO OUTFILE

Syntax

SELECT ... INTO OUTFILE 'file_name'
        [CHARACTER SET charset_name]
        [export_options]

export_options:
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]

Description

SELECT INTO OUTFILE writes the resulting rows to a file, and allows the use of column and row terminators to specify a particular output format. The default is to terminate fields with tabs () and lines with newlines ().

The LOAD DATA INFILE statement complements SELECT INTO OUTFILE.

Character Sets

The CHARACTER SET clause specifies the character set in which the results are to be written. Without the clause, no conversion takes place (the binary character set). In this case, if there are multiple character sets, the output will contain these too, and may not easily be able to be reloaded.

In cases where you have two servers using different character-sets, using SELECT INTO OUTFILE to transfer data from one to the other can have unexpected results. To ensure that MariaDB correctly interprets the escape sequences, use the CHARACTER SET clause on both the SELECT INTO OUTFILE statement and the subsequent LOAD DATA INFILE statement.

Example

The following example produces a file in the CSV format:

SELECT customer_id, firstname, surname FROM customer
  INTO OUTFILE '/exportdata/customers.txt'
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n';

The following ANSI syntax is also supported for simple SELECT without UNION :

SELECT customer_id, firstname, surname INTO OUTFILE '/exportdata/customers.txt'
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n'
  FROM customers;

If you want to use the ANSI syntax with UNION or similar construct you have to use the syntax:

SELECT  * INTO OUTFILE "/tmp/skr3" 
FROM (SELECT * FROM t1 UNION SELECT * FROM t1);

SELECT ... OFFSET ... FETCH

SELECT ... OFFSET ... FETCH is available from MariaDB 10.6.

Syntax

OFFSET start { ROW | ROWS }
FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES }

Description

The OFFSET clause allows one to return only those elements of a resultset that come after a specified offset. The FETCH clause specifies the number of rows to return, while ONLY or WITH TIES specifies whether or not to also return any further results that tie for last place according to the ordered resultset.

Either the singular ROW or the plural ROWS can be used after the OFFSET and FETCH clauses; the choice has no impact on the results.

FIRST and NEXT give the same result.

In the case of WITH TIES, an ORDER BY clause is required, otherwise an error will be returned.

SELECT i FROM t1 FETCH FIRST 2 ROWS WITH TIES;
ERROR 4180 (HY000): FETCH ... WITH TIES requires ORDER BY clause to be present

Examples

Given a table with 6 rows:

CREATE OR REPLACE TABLE t1 (i INT);
INSERT INTO t1 VALUES (1),(2),(3),(4), (4), (5);
SELECT i FROM t1 ORDER BY i ASC;
+------+
| i    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    4 |
|    5 |
+------+

OFFSET 2 allows one to skip the first two results:

SELECT i FROM t1 ORDER BY i ASC OFFSET 2 ROWS;
+------+
| i    |
+------+
|    3 |
|    4 |
|    4 |
|    5 |
+------+

FETCH FIRST 3 ROWS ONLY limits the results to three rows only:

SELECT i FROM t1 ORDER BY i ASC OFFSET 1 ROWS FETCH FIRST 3 ROWS ONLY;
+------+
| i    |
+------+
|    2 |
|    3 |
|    4 |
+------+

The same outcome can also be achieved with the LIMIT clause:

SELECT i FROM t1 ORDER BY i ASC LIMIT 3 OFFSET 1;
+------+
| i    |
+------+
|    2 |
|    3 |
|    4 |
+------+

WITH TIES ensures the tied result 4 is also returned:

SELECT i FROM t1 ORDER BY i ASC OFFSET 1 ROWS FETCH FIRST 3 ROWS WITH TIES;
+------+
| i    |
+------+
|    2 |
|    3 |
|    4 |
|    4 |
+------+

SELECT WITH ROLLUP

Syntax

See SELECT.

Description

The WITH ROLLUP modifier adds extra rows to the result set that represent super-aggregate summaries. The super-aggregated column is represented by a NULL value. Multiple aggregates over different columns will be added if there are multiple GROUP BY columns.

The LIMIT clause can be used at the same time, and is applied after the WITH ROLLUP rows have been added.

WITH ROLLUP cannot be used with ORDER BY. Some sorting is still possible by using ASC or DESC clauses with the GROUP BY column, although the super-aggregate rows will always be added last.

Examples

These examples use the following sample table

CREATE TABLE booksales ( 
  country VARCHAR(35), genre ENUM('fiction','non-fiction'), year YEAR, sales INT);

INSERT INTO booksales VALUES
  ('Senegal','fiction',2014,12234), ('Senegal','fiction',2015,15647),
  ('Senegal','non-fiction',2014,64980), ('Senegal','non-fiction',2015,78901),
  ('Paraguay','fiction',2014,87970), ('Paraguay','fiction',2015,76940),
  ('Paraguay','non-fiction',2014,8760), ('Paraguay','non-fiction',2015,9030);

The WITH ROLLUP modifier in this example adds an extra row that aggregates both years:

SELECT year, SUM(sales) FROM booksales GROUP BY year;
+------+------------+
| year | SUM(sales) |
+------+------------+
| 2014 |     173944 |
| 2015 |     180518 |
+------+------------+
2 rows in set (0.08 sec)

SELECT year, SUM(sales) FROM booksales GROUP BY year WITH ROLLUP;
+------+------------+
| year | SUM(sales) |
+------+------------+
| 2014 |     173944 |
| 2015 |     180518 |
| NULL |     354462 |
+------+------------+

Each time the genre, the year, or the country change, another super-aggregate row is added:

SELECT country, year, genre, SUM(sales) 
  FROM booksales GROUP BY country, year, genre;
+----------+------+-------------+------------+
| country  | year | genre       | SUM(sales) |
+----------+------+-------------+------------+
| Paraguay | 2014 | fiction     |      87970 |
| Paraguay | 2014 | non-fiction |       8760 |
| Paraguay | 2015 | fiction     |      76940 |
| Paraguay | 2015 | non-fiction |       9030 |
| Senegal  | 2014 | fiction     |      12234 |
| Senegal  | 2014 | non-fiction |      64980 |
| Senegal  | 2015 | fiction     |      15647 |
| Senegal  | 2015 | non-fiction |      78901 |
+----------+------+-------------+------------+

SELECT country, year, genre, SUM(sales) 
  FROM booksales GROUP BY country, year, genre WITH ROLLUP;
+----------+------+-------------+------------+
| country  | year | genre       | SUM(sales) |
+----------+------+-------------+------------+
| Paraguay | 2014 | fiction     |      87970 |
| Paraguay | 2014 | non-fiction |       8760 |
| Paraguay | 2014 | NULL        |      96730 |
| Paraguay | 2015 | fiction     |      76940 |
| Paraguay | 2015 | non-fiction |       9030 |
| Paraguay | 2015 | NULL        |      85970 |
| Paraguay | NULL | NULL        |     182700 |
| Senegal  | 2014 | fiction     |      12234 |
| Senegal  | 2014 | non-fiction |      64980 |
| Senegal  | 2014 | NULL        |      77214 |
| Senegal  | 2015 | fiction     |      15647 |
| Senegal  | 2015 | non-fiction |      78901 |
| Senegal  | 2015 | NULL        |      94548 |
| Senegal  | NULL | NULL        |     171762 |
| NULL     | NULL | NULL        |     354462 |
+----------+------+-------------+------------+

The LIMIT clause, applied after WITH ROLLUP:

SELECT country, year, genre, SUM(sales) 
  FROM booksales GROUP BY country, year, genre WITH ROLLUP LIMIT 4;
+----------+------+-------------+------------+
| country  | year | genre       | SUM(sales) |
+----------+------+-------------+------------+
| Paraguay | 2014 | fiction     |      87970 |
| Paraguay | 2014 | non-fiction |       8760 |
| Paraguay | 2014 | NULL        |      96730 |
| Paraguay | 2015 | fiction     |      76940 |
+----------+------+-------------+------------+

Sorting by year descending:

SELECT country, year, genre, SUM(sales) 
  FROM booksales GROUP BY country, year DESC, genre WITH ROLLUP;
+----------+------+-------------+------------+
| country  | year | genre       | SUM(sales) |
+----------+------+-------------+------------+
| Paraguay | 2015 | fiction     |      76940 |
| Paraguay | 2015 | non-fiction |       9030 |
| Paraguay | 2015 | NULL        |      85970 |
| Paraguay | 2014 | fiction     |      87970 |
| Paraguay | 2014 | non-fiction |       8760 |
| Paraguay | 2014 | NULL        |      96730 |
| Paraguay | NULL | NULL        |     182700 |
| Senegal  | 2015 | fiction     |      15647 |
| Senegal  | 2015 | non-fiction |      78901 |
| Senegal  | 2015 | NULL        |      94548 |
| Senegal  | 2014 | fiction     |      12234 |
| Senegal  | 2014 | non-fiction |      64980 |
| Senegal  | 2014 | NULL        |      77214 |
| Senegal  | NULL | NULL        |     171762 |
| NULL     | NULL | NULL        |     354462 |
+----------+------+-------------+------------+

Common Table Expressions (CTE)

Learn about Common Table Expressions (CTEs) in MariaDB Server. This section explains how to use CTEs for complex, readable, and reusable subqueries, simplifying data selection and manipulation.

WITH

Syntax

WITH [RECURSIVE] table_reference [(columns_list)] AS  (
  SELECT ...
)
[CYCLE cycle_column_list RESTRICT]
SELECT ...

Description

The WITH keyword signifies a Common Table Expression (CTE). It allows you to refer to a subquery expression many times in a query, as if having a temporary table that only exists for the duration of a query.

There are two kinds of CTEs:

Non-Recursive.
Recursive (signified by the RECURSIVE keyword).

You can use table_reference as any normal table in the external SELECT part. You can also use WITH in subqueries, as well as with EXPLAIN and SELECT.

Poorly-formed recursive CTEs can in theory cause infinite loops. The max_recursive_iterations system variable limits the number of recursions.

CYCLE ... RESTRICT

The CYCLE clause enables CTE cycle detection, avoiding excessive or infinite loops, MariaDB supports a relaxed, non-standard grammar. The SQL Standard permits a CYCLE clause, as follows:

WITH RECURSIVE ... (
  ...
)
CYCLE <cycle column list>
SET <cycle mark column> TO <cycle mark value> DEFAULT <non-cycle mark value>
USING <path column>

where all clauses are mandatory. MariaDB does not support this, but permits a non-standard relaxed grammar, as follows:

WITH RECURSIVE ... (
  ...
)
CYCLE <cycle column list> RESTRICT

With the use of CYCLE ... RESTRICT it makes no difference whether the CTE uses UNION ALL or UNION DISTINCT anymore. UNION ALL means "all rows, but without cycles", which is exactly what the CYCLE clause enables. And UNION DISTINCT means all rows should be different, which, again, is what will happen — as uniqueness is enforced over a subset of columns, complete rows will automatically all be different.

CYCLE ... RESTRICT is not available.

Examples

Below is an example with the WITH at the top level:

WITH t AS (SELECT a FROM t1 WHERE b >= 'c') 
  SELECT * FROM t2, t WHERE t2.c = t.a;

The example below uses WITH in a subquery:

SELECT t1.a, t1.b FROM t1, t2
  WHERE t1.a > t2.c 
     AND t2.c IN(WITH t AS (SELECT * FROM t1 WHERE t1.a < 5)
                SELECT t2.c FROM t2, t WHERE t2.c = t.a);

Below is an example of a Recursive CTE:

WITH RECURSIVE ancestors AS 
 ( SELECT * FROM folks
   WHERE name="Alex"
   UNION
   SELECT f.*
   FROM folks AS f, ancestors AS a
   WHERE f.id = a.father OR f.id = a.mother )
SELECT * FROM ancestors;

Consider the following structure and data:

CREATE TABLE t1 (from_ int, to_ int);
INSERT INTO t1 VALUES (1,2), (1,100), (2,3), (3,4), (4,1);
SELECT * FROM t1;
+-------+------+
| from_ | to_  |
+-------+------+
|     1 |    2 |
|     1 |  100 |
|     2 |    3 |
|     3 |    4 |
|     4 |    1 |
+-------+------+

Given the above, the following query would theoretically result in an infinite loop due to the last record in t1 (note that max_recursive_iterations is set to 10 for the purposes of this example, to avoid the excessive number of cycles):

SET max_recursive_iterations=10;

WITH RECURSIVE cte (depth, from_, to_) AS ( 
  SELECT 0,1,1 UNION DISTINCT SELECT depth+1, t1.from_, t1.to_ 
    FROM t1, cte  WHERE t1.from_ = cte.to_ 
) 
SELECT * FROM cte;
+-------+-------+------+
| depth | from_ | to_  |
+-------+-------+------+
|     0 |     1 |    1 |
|     1 |     1 |    2 |
|     1 |     1 |  100 |
|     2 |     2 |    3 |
|     3 |     3 |    4 |
|     4 |     4 |    1 |
|     5 |     1 |    2 |
|     5 |     1 |  100 |
|     6 |     2 |    3 |
|     7 |     3 |    4 |
|     8 |     4 |    1 |
|     9 |     1 |    2 |
|     9 |     1 |  100 |
|    10 |     2 |    3 |
+-------+-------+------+

However, the CYCLE ... RESTRICT clause can overcome this:

WITH RECURSIVE cte (depth, from_, to_) AS ( 
  SELECT 0,1,1 UNION SELECT depth+1, t1.from_, t1.to_ 
    FROM t1, cte WHERE t1.from_ = cte.to_ 
) 
CYCLE from_, to_ RESTRICT 
SELECT * FROM cte;
+-------+-------+------+
| depth | from_ | to_  |
+-------+-------+------+
|     0 |     1 |    1 |
|     1 |     1 |    2 |
|     1 |     1 |  100 |
|     2 |     2 |    3 |
|     3 |     3 |    4 |
|     4 |     4 |    1 |
+-------+-------+------+

Non-Recursive Common Table Expressions Overview

Common Table Expressions (CTEs) are a standard SQL feature, and are essentially temporary named result sets. There are two kinds of CTEs: Non-Recursive, which this article covers; and Recursive.

Non-Recursive CTEs

The WITH keyword signifies a CTE. It is given a name, followed by a body (the main query):

CTEs are similar to derived tables:

WITH engineers AS 
   ( SELECT * FROM employees
     WHERE dept = 'Engineering' )

SELECT * FROM engineers
WHERE ...

SELECT * FROM
   ( SELECT * FROM employees
     WHERE dept = 'Engineering' ) AS engineers
WHERE
...

A non-recursive CTE is basically a query-local VIEW. There are several advantages and caveats to them. The syntax is more readable than a nested FROM (SELECT ...). A CTE can refer to another and it can be referenced from multiple places.

A CTE referencing Another CTE

Using this format makes for a more readable SQL than a nested FROM(SELECT ...) clause:

WITH engineers AS (
SELECT * FROM employees
WHERE dept IN('Development','Support') ),
eu_engineers AS ( SELECT * FROM engineers WHERE country IN('NL',...) )
SELECT
...
FROM eu_engineers;

Multiple Uses of a CTE

This can be an 'anti-self join', for example:

WITH engineers AS (
SELECT * FROM employees
WHERE dept IN('Development','Support') )

SELECT * FROM engineers E1
WHERE NOT EXISTS
   (SELECT 1 FROM engineers E2
    WHERE E2.country=E1.country
    AND E2.name <> E1.name );

Or, for year-over-year comparisons, for example:

WITH sales_product_year AS (
SELECT product, YEAR(ship_date) AS year,
SUM(price) AS total_amt
FROM item_sales
GROUP BY product, year )

SELECT *
FROM sales_product_year CUR,
sales_product_year PREV,
WHERE CUR.product=PREV.product 
AND  CUR.year=PREV.year + 1 
AND CUR.total_amt > PREV.total_amt

Another use is to compare individuals against their group. Below is an example of how this might be executed:

WITH sales_product_year AS (
SELECT product,
YEAR(ship_date) AS year,
SUM(price) AS total_amt
FROM item_sales
GROUP BY product, year
)

SELECT * 
FROM sales_product_year S1
WHERE
total_amt > 
    (SELECT 0.1 * SUM(total_amt)
     FROM sales_product_year S2
     WHERE S2.year = S1.year)

_{This page is licensed: CC BY-SA / Gnu FDL}

Recursive Common Table Expressions Overview

Common Table Expressions (CTEs) are a standard SQL feature, and are essentially temporary named result sets. CTEs first appeared in the SQL standard in 1999, and the first implementations began appearing in 2007.

There are two kinds of CTEs:

Non-recursive;
Recursive, which this article covers.

SQL is generally poor at recursive structures.

CTEs permit a query to reference itself. A recursive CTE will repeatedly execute subsets of the data until it obtains the complete result set. This makes it particularly useful for handing hierarchical or tree-structured data. max_recursive_iterations avoids infinite loops.

Syntax example

WITH RECURSIVE signifies a recursive CTE. It is given a name, followed by a body (the main query) as follows:

Computation

Given the following structure:

First execute the anchor part of the query:

Next, execute the recursive part of the query:

Summary

WITH RECURSIVE R AS (
  SELECT anchor_data
  UNION [all]
  SELECT recursive_part
  FROM R, ...
)
SELECT ...

Compute anchor_data.
Compute recursive_part to get the new data.
If (new data is non-empty) goto 2.

CAST to avoid truncating data

As currently implemented by MariaDB and by the SQL Standard, data may be truncated if not correctly cast. It is necessary to CAST the column to the correct width if the CTE's recursive part produces wider values for a column than the CTE's nonrecursive part. Some other DBMS give an error in this situation, and MariaDB's behavior may change in future - see MDEV-12325. See the examples below.

Examples

Transitive closure - determining bus destinations

Sample data:

CREATE TABLE bus_routes (origin VARCHAR(50), dst VARCHAR(50));
INSERT INTO bus_routes VALUES 
  ('New York', 'Boston'), 
  ('Boston', 'New York'), 
  ('New York', 'Washington'), 
  ('Washington', 'Boston'), 
  ('Washington', 'Raleigh');

Now, we want to return the bus destinations with New York as the origin:

WITH RECURSIVE bus_dst as ( 
    SELECT origin as dst FROM bus_routes WHERE origin='New York' 
  UNION
    SELECT bus_routes.dst FROM bus_routes JOIN bus_dst ON bus_dst.dst= bus_routes.origin 
) 
SELECT * FROM bus_dst;
+------------+
| dst        |
+------------+
| New York   |
| Boston     |
| Washington |
| Raleigh    |
+------------+

The above example is computed as follows:

First, the anchor data is calculated:

Starting from New York.
Boston and Washington are added.

Next, the recursive part:

Starting from Boston and then Washington.
Raleigh is added.
UNION excludes nodes that are already present.

Computing paths - determining bus routes

This time, we are trying to get bus routes such as “New York -> Washington -> Raleigh”.

Using the same sample data as the previous example:

WITH RECURSIVE paths (cur_path, cur_dest) AS (
    SELECT origin, origin FROM bus_routes WHERE origin='New York' 
  UNION
    SELECT CONCAT(paths.cur_path, ',', bus_routes.dst), bus_routes.dst 
     FROM paths
     JOIN bus_routes 
       ON paths.cur_dest = bus_routes.origin AND 
         NOT FIND_IN_SET(bus_routes.dst, paths.cur_path)
) 
SELECT * FROM paths;
+-----------------------------+------------+
| cur_path                    | cur_dest   |
+-----------------------------+------------+
| New York                    | New York   |
| New York,Boston             | Boston     |
| New York,Washington         | Washington |
| New York,Washington,Boston  | Boston     |
| New York,Washington,Raleigh | Raleigh    |
+-----------------------------+------------+

CAST to avoid data truncation

In the following example, data is truncated because the results are not specifically cast to a wide enough type:

WITH RECURSIVE tbl AS (
  SELECT NULL AS col
  UNION
  SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)
SELECT col FROM tbl
+------+
| col  |
+------+
| NULL |
|      |
+------+

Explicitly use CAST to overcome this:

WITH RECURSIVE tbl AS (
  SELECT CAST(NULL AS CHAR(50)) AS col
  UNION SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)  
SELECT * FROM tbl;
+---------------------+
| col                 |
+---------------------+
| NULL                |
| THIS NEVER SHOWS UP |
+---------------------+

_{This page is licensed: CC BY-SA / Gnu FDL}

Joins & Subqueries

Learn about joins and subqueries in MariaDB Server. This section details how to combine data from multiple tables and nest queries to perform complex data retrieval operations efficiently.

EXCEPT

The result of EXCEPT contains all records of the left SELECT result set except records which are in right SELECT result set. In other words, it is the subtraction of two result sets.

MINUS is a synonym when SQL_MODE=ORACLE is set.

MINUS is a synonym is not available.

Syntax

SELECT ...
(INTERSECT [ALL | DISTINCT] | EXCEPT [ALL | DISTINCT] | UNION [ALL | DISTINCT]) 
  SELECT ...
[(INTERSECT [ALL | DISTINCT] | EXCEPT [ALL | DISTINCT] | UNION [ALL | DISTINCT]) 
  SELECT ...]
[ORDER BY [{col_name | expr | position} [ASC | DESC] 
  [, {col_name | expr | position} [ASC | DESC] ...]]]
[LIMIT {[offset,] row_count | row_count OFFSET offset}
| OFFSET start { ROW | ROWS }
| FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES } ]

Brackets for explicit operation precedence are not supported; use a subquery in the FROM clause as a workaround.

Description

MariaDB supports EXCEPT and INTERSECT in addition to UNION.

The queries before and after EXCEPT must be SELECT or VALUES statements.

All behavior for naming columns, ORDER BY and LIMIT is the same as for UNION. Note that the alternative SELECT ... OFFSET ... FETCH syntax is only supported. This allows us to use the WITH TIES clause.

EXCEPT implicitly supposes a DISTINCT operation.

The result of EXCEPT is all records of the left SELECT result except records which are in right SELECT result set, i.e. it is subtraction of two result sets.

EXCEPT and UNION have the same operation precedence and INTERSECT has a higher precedence, unless running in Oracle mode, in which case all three have the same precedence.

Parentheses

Parentheses can be used to specify precedence. Before this, a syntax error would be returned.

MariaDB starting with 10.5.0

ALL/DISTINCT

EXCEPT ALL and EXCEPT DISTINCT . The ALL operator leaves duplicates intact, while the DISTINCT operator removes duplicates. DISTINCT is the default behavior if neither operator is supplied.

Only EXCEPT DISTINCT is available.

Examples

Show customers which are not employees:

(SELECT e_name AS name, email FROM customers)
EXCEPT
(SELECT c_name AS name, email FROM employees);

Difference between UNION, EXCEPT and INTERSECT:

CREATE TABLE seqs (i INT);
INSERT INTO seqs VALUES (1),(2),(2),(3),(3),(4),(5),(6);

SELECT i FROM seqs WHERE i <= 3 UNION SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
+------+

SELECT i FROM seqs WHERE i <= 3 UNION ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
|    2 |
|    3 |
|    3 |
|    3 |
|    3 |
|    4 |
|    5 |
|    6 |
+------+

SELECT i FROM seqs WHERE i <= 3 EXCEPT SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
+------+

SELECT i FROM seqs WHERE i <= 3 EXCEPT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
|    2 |
+------+

SELECT i FROM seqs WHERE i <= 3 INTERSECT SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    3 |
+------+

SELECT i FROM seqs WHERE i <= 3 INTERSECT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    3 |
|    3 |
+------+

Parentheses for specifying precedence:

CREATE OR REPLACE TABLE t1 (a INT);
CREATE OR REPLACE TABLE t2 (b INT);
CREATE OR REPLACE TABLE t3 (c INT);

INSERT INTO t1 VALUES (1),(2),(3),(4);
INSERT INTO t2 VALUES (5),(6);
INSERT INTO t3 VALUES (1),(6);

((SELECT a FROM t1) UNION (SELECT b FROM t2)) EXCEPT (SELECT c FROM t3);
+------+
| a    |
+------+
|    2 |
|    3 |
|    4 |
|    5 |
+------+

(SELECT a FROM t1) UNION ((SELECT b FROM t2) EXCEPT (SELECT c FROM t3));
+------+
| a    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
+------+

Here is an example that makes use of the SEQUENCE storage engine and the VALUES statement, to generate a numeric sequence and remove some arbitrary numbers from it:

(SELECT seq FROM seq_1_to_10) EXCEPT VALUES (2), (3), (4);
+-----+
| seq |
+-----+
|   1 |
|   5 |
|   6 |
|   7 |
|   8 |
|   9 |
|  10 |
+-----+

INTERSECT

The result of an intersect is the intersection of right and left SELECT results, i.e. only records that are present in both result sets will be included in the result of the operation.

Syntax

SELECT ...
(INTERSECT [ALL | DISTINCT] | EXCEPT [ALL | DISTINCT] | UNION [ALL | DISTINCT]) SELECT ...
[(INTERSECT [ALL | DISTINCT] | EXCEPT [ALL | DISTINCT] | UNION [ALL | DISTINCT]) SELECT ...]
[ORDER BY [column [, column ...]]]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]

Description

MariaDB has supported INTERSECT (as well as EXCEPT) in addition to UNION since MariaDB 10.3.

All behavior for naming columns, ORDER BY and LIMIT is the same as for UNION.

INTERSECT implicitly supposes a DISTINCT operation.

The result of an intersect is the intersection of right and left SELECT results, i.e. only records that are present in both result sets will be included in the result of the operation.

INTERSECT has higher precedence than UNION and EXCEPT (unless running running in Oracle mode, in which case all three have the same precedence). If possible it will be executed linearly, but if not, it will be translated to a subquery in the FROM clause:

(SELECT a,b FROM t1)
UNION
(SELECT c,d FROM t2)
INTERSECT
(SELECT e,f FROM t3)
UNION
(SELECT 4,4);

will be translated to:

(SELECT a,b FROM t1)
UNION
SELECT c,d FROM
  ((SELECT c,d FROM t2)
   INTERSECT
   (SELECT e,f FROM t3)) dummy_subselect
UNION
(SELECT 4,4)

Parentheses

Parentheses can be used to specify precedence.

Parentheses cannot be used to specify precedence.

MariaDB starting with 10.5.0

ALL/DISTINCT

INTERSECT ALL and INTERSECT DISTINCT . The ALL operator leaves duplicates intact, while the DISTINCT operator removes duplicates. DISTINCT is the default behavior if neither operator is supplied.

DISTINCT is the only behavior available.

Examples

Show customers which are employees:

(SELECT e_name AS name, email FROM employees)
INTERSECT
(SELECT c_name AS name, email FROM customers);

Difference between UNION, EXCEPT and INTERSECT:

CREATE TABLE seqs (i INT);
INSERT INTO seqs VALUES (1),(2),(2),(3),(3),(4),(5),(6);

SELECT i FROM seqs WHERE i <= 3 UNION SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
+------+

SELECT i FROM seqs WHERE i <= 3 UNION ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
|    2 |
|    3 |
|    3 |
|    3 |
|    3 |
|    4 |
|    5 |
|    6 |
+------+

SELECT i FROM seqs WHERE i <= 3 EXCEPT SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
+------+

SELECT i FROM seqs WHERE i <= 3 EXCEPT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
|    2 |
+------+

SELECT i FROM seqs WHERE i <= 3 INTERSECT SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    3 |
+------+

SELECT i FROM seqs WHERE i <= 3 INTERSECT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    3 |
|    3 |
+------+

Parentheses for specifying precedence:

CREATE OR REPLACE TABLE t1 (a INT);
CREATE OR REPLACE TABLE t2 (b INT);
CREATE OR REPLACE TABLE t3 (c INT);

INSERT INTO t1 VALUES (1),(2),(3),(4);
INSERT INTO t2 VALUES (5),(6);
INSERT INTO t3 VALUES (1),(6);

((SELECT a FROM t1) UNION (SELECT b FROM t2)) INTERSECT (SELECT c FROM t3);
+------+
| a    |
+------+
|    1 |
|    6 |
+------+

(SELECT a FROM t1) UNION ((SELECT b FROM t2) INTERSECT (SELECT c FROM t3));
+------+
| a    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    6 |
+------+

MINUS

MINUS is available starting from 10.6.1.

MINUS is a synonym for EXCEPT when SQL_MODE=ORACLE is set.

CREATE TABLE seqs (i INT);
INSERT INTO seqs VALUES (1),(2),(2),(3),(3),(4),(5),(6);

SET SQL_MODE='ORACLE';

SELECT i FROM seqs WHERE i <= 3 MINUS SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
+------+

_{This page is licensed: CC BY-SA / Gnu FDL}

Precedence Control in Table Operations

You can control the ordering of execution on table operations using parentheses.

Syntax

(  expression )
[ORDER BY [column[, column...]]]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]

Description

Using parentheses in your SQL allows you to control the order of execution for SELECT statements and Table Value Constructor, including UNION, EXCEPT, and INTERSECT operations. MariaDB executes the parenthetical expression before the rest of the statement. You can then use ORDER BY and LIMIT clauses the further organize the result-set.

Note: In practice, the Optimizer may rearrange the exact order in which MariaDB executes different parts of the statement. When it calculates the result-set, however, it returns values as though the parenthetical expression were executed first.

Example

CREATE TABLE test.t1 (num INT);

INSERT INTO test.t1 VALUES (1),(2),(3);

(SELECT * FROM test.t1 
 UNION 
 VALUES (10)) 
INTERSECT 
VALUES (1),(3),(10),(11);
+------+
| num  |
+------+
|    1 |
|    3 |
|   10 |
+------+

((SELECT * FROM test.t1 
  UNION 
  VALUES (10)) 
 INTERSECT 
 VALUES (1),(3),(10),(11)) 
ORDER BY 1 DESC;
+------+
| num  |
+------+
|   10 |
|    3 |
|    1 |
+------+

_{This page is licensed: CC BY-SA / Gnu FDL}

UNION

UNION is used to combine the results from multiple SELECT statements into a single result set.

Syntax

SELECT ...
UNION [ALL | DISTINCT] SELECT ...
[UNION [ALL | DISTINCT] SELECT ...]
[ORDER BY [column [, column ...]]]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]

Description

UNION is used to combine the results from multiple SELECT statements into a single result set.

The column names from the first SELECT statement are used as the column names for the results returned. Selected columns listed in corresponding positions of each SELECT statement should have the same data type. (For example, the first column selected by the first statement should have the same type as the first column selected by the other statements.)

If they don't, the type and length of the columns in the result take into account the values returned by all of the SELECTs, so there is no need for explicit casting. Note that currently this is not the case for recursive CTEs - see MDEV-12325.

Table names can be specified as db_name.tbl_name. This permits writing UNIONs which involve multiple databases. See Identifier Qualifiers for syntax details.

UNION queries cannot be used with aggregate functions.

EXCEPT and UNION have the same operation precedence and INTERSECT has a higher precedence, unless running in Oracle mode, in which case all three have the same precedence.

ALL/DISTINCT

The ALL keyword causes duplicate rows to be preserved. The DISTINCT keyword (the default if the keyword is omitted) causes duplicate rows to be removed by the results.

UNION ALL and UNION DISTINCT can both be present in a query. In this case, UNION DISTINCT will override any UNION ALL to its left.

The server can in most cases execute UNION ALL without creating a temporary table (see MDEV-334).

ORDER BY and LIMIT

Individual SELECT statements can contain their own ORDER BY and LIMIT clauses. In this case, the individual queries need to be wrapped between parentheses. However, this does not affect the order of the UNION, so they only are useful to limit the record read by one SELECT.

The UNION can have global ORDER BY and LIMIT clauses, which affect the whole result set. If the columns retrieved by individual SELECT statements have an alias (AS), the ORDER BY must use that alias, not the real column names.

HIGH_PRIORITY

Specifying a query as HIGH_PRIORITY does not work inside a UNION. If applied to the first SELECT, it is ignored. Applying to a later SELECT results in a syntax error:

ERROR 1234 (42000): Incorrect usage/placement of 'HIGH_PRIORITY'

SELECT ... INTO ...

Individual SELECT statements cannot be written INTO DUMPFILE or INTO OUTFILE. If the last SELECT statement specifies INTO DUMPFILE or INTO OUTFILE, the entire result of the UNION will be written. Placing the clause after any other SELECT will result in a syntax error.

If the result is a single row, SELECT ... INTO @var_name can also be used.

Parentheses

Parentheses can be used to specify precedence.

Parentheses cannot be used to specify precedence.

Examples

UNION between tables having different column names:

(SELECT e_name AS name, email FROM employees)
UNION
(SELECT c_name AS name, email FROM customers);

Specifying the UNION global order and limiting total rows:

(SELECT name, email FROM employees)
UNION
(SELECT name, email FROM customers)
ORDER BY name LIMIT 10;

Adding a constant row:

(SELECT 'John Doe' AS name, 'john.doe@example.net' AS email)
UNION
(SELECT name, email FROM customers);

Differing types:

SELECT CAST('x' AS CHAR(1)) UNION SELECT REPEAT('y',4);
+----------------------+
| CAST('x' AS CHAR(1)) |
+----------------------+
| x                    |
| yyyy                 |
+----------------------+

Returning the results in order of each individual SELECT by use of a sort column:

(SELECT 1 AS sort_column, e_name AS name, email FROM employees)
UNION
(SELECT 2, c_name AS name, email FROM customers) ORDER BY sort_column;

Difference between UNION, EXCEPT and INTERSECT:

CREATE TABLE seqs (i INT);
INSERT INTO seqs VALUES (1),(2),(2),(3),(3),(4),(5),(6);

SELECT i FROM seqs WHERE i <= 3 UNION SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
+------+

SELECT i FROM seqs WHERE i <= 3 UNION ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
|    2 |
|    3 |
|    3 |
|    3 |
|    3 |
|    4 |
|    5 |
|    6 |
+------+

SELECT i FROM seqs WHERE i <= 3 EXCEPT SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
+------+

SELECT i FROM seqs WHERE i <= 3 EXCEPT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    1 |
|    2 |
|    2 |
+------+

SELECT i FROM seqs WHERE i <= 3 INTERSECT SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    3 |
+------+

SELECT i FROM seqs WHERE i <= 3 INTERSECT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i    |
+------+
|    3 |
|    3 |
+------+

CREATE OR REPLACE TABLE t1 (a INT);
CREATE OR REPLACE TABLE t2 (b INT);
CREATE OR REPLACE TABLE t3 (c INT);

INSERT INTO t1 VALUES (1),(2),(3),(4);
INSERT INTO t2 VALUES (5),(6);
INSERT INTO t3 VALUES (1),(6);

((SELECT a FROM t1) UNION (SELECT b FROM t2)) INTERSECT (SELECT c FROM t3);
+------+
| a    |
+------+
|    1 |
|    6 |
+------+

(SELECT a FROM t1) UNION ((SELECT b FROM t2) INTERSECT (SELECT c FROM t3));
+------+
| a    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    6 |
+------+

Joins

Learn about SQL joins. This section details various join types (INNER, LEFT, RIGHT, FULL) to combine rows from two or more tables based on related columns, enabling complex data retrieval.

Comma vs JOIN

A query to retrieve the list of phone numbers for clients who ordered in the last two weeks might be written in a couple of ways. Here are two:

SELECT *
FROM
  clients,
  orders,
  phoneNumbers
WHERE
  clients.id = orders.clientId
  AND clients.id = phoneNumbers.clientId
  AND orderPlaced >= NOW() - INTERVAL 2 WEEK;

SELECT *
FROM
  clients
  INNER JOIN orders ON clients.id = orders.clientId
  INNER JOIN phoneNumbers ON clients.id = phoneNumbers.clientId
WHERE
  orderPlaced >= NOW() - INTERVAL 2 WEEK;

Does it make a difference? Not much as written. But you should use the second form, for these reasons:

Readability. Once the WHERE clause contains more than two conditions, it becomes tedious to pick out the difference between business logic (only dates in the last two weeks) and relational logic (which fields relate clients to orders). Using the JOIN syntax with an ON clause makes the WHERE list shorter, and makes it very easy to see how tables relate to each other.
Flexibility. Let's say we need to see all clients even if they don't have a phone number in the system. With the second version, it's easy; just change INNER JOIN phoneNumbers to LEFT JOIN phoneNumbers. Try that with the first version, and recent MySQL versions will issue a syntax error because of the change in precedence between the comma operator and the JOIN keyword. The solution is to rearrange the FROM clause or add parentheses to override the precedence, and that quickly becomes frustrating.
Portability. If your queries use standard SQL syntax, you will have an easier time switching to a different database should the need ever arise.

JOIN Syntax

Description

MariaDB supports the following JOIN syntaxes for the table_references part of SELECT statements and multiple-table DELETE and UPDATE statements:

table_references:
    table_reference [, table_reference] ...

table_reference:
    table_factor
  | join_table

table_factor (<= MariaDB 11.6):
    tbl_name [PARTITION (partition_list)]
        [query_system_time_period_specification] [[AS] alias] [index_hint_list]
  | table_subquery [query_system_time_period_specification] [AS] alias
  | ( table_references )
  | { ON table_reference LEFT OUTER JOIN table_reference
        ON conditional_expr }

table_factor (>= MariaDB 11.7):
    tbl_name [PARTITION (partition_list)]
        [query_system_time_period_specification] [[AS] alias] [index_hint_list]
  | table_subquery [query_system_time_period_specification] [AS] alias [(column_name_list)] 
  | ( table_references )
  | { ON table_reference LEFT OUTER JOIN table_reference
        ON conditional_expr }

join_table:
    table_reference [INNER | CROSS] JOIN table_factor [join_condition]
  | table_reference STRAIGHT_JOIN table_factor
  | table_reference STRAIGHT_JOIN table_factor ON conditional_expr
  | table_reference {LEFT|RIGHT} [OUTER] JOIN table_reference join_condition
  | table_reference NATURAL [{LEFT|RIGHT} [OUTER]] JOIN table_factor

join_condition:
    ON conditional_expr
  | USING (column_list)

query_system_time_period_specification:
    FOR SYSTEM_TIME AS OF point_in_time
  | FOR SYSTEM_TIME BETWEEN point_in_time AND point_in_time
  | FOR SYSTEM_TIME FROM point_in_time TO point_in_time
  | FOR SYSTEM_TIME ALL

point_in_time:
    [TIMESTAMP] expression
  | TRANSACTION expression

index_hint_list:
    index_hint [, index_hint] ...

index_hint:
    USE {INDEX|KEY}
      [{FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
  | IGNORE {INDEX|KEY}
      [{FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
  | FORCE {INDEX|KEY}
      [{FOR {JOIN|ORDER BY|GROUP BY}] (index_list)

index_list:
    index_name [, index_name] ...

A table reference is also known as a join expression.

Each table can also be specified as db_name.tabl_name. This allows to write queries which involve multiple databases. See Identifier Qualifiers for syntax details.

The syntax of table_factor is an extension to the SQL Standard. The latter accepts only table_reference, not a list of them inside a pair of parentheses.

This is a conservative extension if we consider each comma in a list of table_reference items as equivalent to an inner join. For example, this query:

SELECT * FROM t1 LEFT JOIN (t2, t3, t4)
                 ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)

Is equivalent to:

SELECT * FROM t1 LEFT JOIN (t2 CROSS JOIN t3 CROSS JOIN t4)
                 ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)

In MariaDB, CROSS JOIN is a syntactic equivalent toINNER JOIN (they can replace each other). In standard SQL, they are not equivalent. INNER JOIN is used with anON clause, CROSS JOIN is used otherwise.

In general, parentheses can be ignored in join expressions containing only inner join operations. MariaDB also supports nested joins (see Nested Join Optimization).

Subqueries

A table subquery is specified as a parenthesized query and must contain a following derived table name (specified as alias in the above syntax specification).

You can optionally specify a list of column names in parenthesis.

SELECT ic1, ic2, ic3 FROM
      (
        SELECT c1, c2, c3 FROM t1 GROUP BY c1
      ) dt2 (ic1, ic2, ic3)
    JOIN t2 ON t2.c1 = dt2.ic1
    WHERE c2 > 0
    GROUP BY ic1

Here, the table subquery for t1 will be materialized and named dt2, with column names ic1, ic2, ic3. These column names are used outside the subquery.

You cannot optionally specify a list of column names in parenthesis.

System-Versioned Tabled

See System-versioned tables for more information about the FOR SYSTEM_TIME syntax.

Index Hints

Index hints can be specified to affect how the MariaDB optimizer makes use of indexes. For more information, see How to force query plans.

Examples

SELECT left_tbl.*
  FROM left_tbl LEFT JOIN right_tbl ON left_tbl.id = right_tbl.id
  WHERE right_tbl.id IS NULL;

_{This page is licensed: GPLv2, originally from} _{fill_help_tables.sql}

More Advanced Joins

This article is a continuation of the JOIN syntax page. If you're getting started with JOIN statements, review that page first.

The Employee Database

Let us begin by using an example employee database of a fairly small family business, which does not anticipate expanding in the future.

First, we create the table that will hold all of the employees and their contact information:

CREATE TABLE `Employees` (
  `ID` TINYINT(3) UNSIGNED NOT NULL AUTO_INCREMENT,
  `First_Name` VARCHAR(25) NOT NULL,
  `Last_Name` VARCHAR(25) NOT NULL,
  `Position` VARCHAR(25) NOT NULL,
  `Home_Address` VARCHAR(50) NOT NULL,
  `Home_Phone` VARCHAR(12) NOT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=MyISAM;

Next, we add a few employees to the table:

INSERT INTO `Employees` (`First_Name`, `Last_Name`, `Position`, `Home_Address`, `Home_Phone`)
  VALUES
  ('Mustapha', 'Mond', 'Chief Executive Officer', '692 Promiscuous Plaza', '326-555-3492'),
  ('Henry', 'Foster', 'Store Manager', '314 Savage Circle', '326-555-3847'),
  ('Bernard', 'Marx', 'Cashier', '1240 Ambient Avenue', '326-555-8456'),
  ('Lenina', 'Crowne', 'Cashier', '281 Bumblepuppy Boulevard', '328-555-2349'),
  ('Fanny', 'Crowne', 'Restocker', '1023 Bokanovsky Lane', '326-555-6329'),
  ('Helmholtz', 'Watson', 'Janitor', '944 Soma Court', '329-555-2478');

Now, we create a second table, containing the hours which each employee clocked in and out during the week:

CREATE TABLE `Hours` (
  `ID` TINYINT(3) UNSIGNED NOT NULL,
  `Clock_In` DATETIME NOT NULL,
  `Clock_Out` DATETIME NOT NULL
) ENGINE=MyISAM;

Finally, although it is a lot of information, we add a full week of hours for each of the employees into the second table that we created:

INSERT INTO `Hours`
  VALUES
  ('1', '2005-08-08 07:00:42', '2005-08-08 17:01:36'),
  ('1', '2005-08-09 07:01:34', '2005-08-09 17:10:11'),
  ('1', '2005-08-10 06:59:56', '2005-08-10 17:09:29'),
  ('1', '2005-08-11 07:00:17', '2005-08-11 17:00:47'),
  ('1', '2005-08-12 07:02:29', '2005-08-12 16:59:12'),
  ('2', '2005-08-08 07:00:25', '2005-08-08 17:03:13'),
  ('2', '2005-08-09 07:00:57', '2005-08-09 17:05:09'),
  ('2', '2005-08-10 06:58:43', '2005-08-10 16:58:24'),
  ('2', '2005-08-11 07:01:58', '2005-08-11 17:00:45'),
  ('2', '2005-08-12 07:02:12', '2005-08-12 16:58:57'),
  ('3', '2005-08-08 07:00:12', '2005-08-08 17:01:32'),
  ('3', '2005-08-09 07:01:10', '2005-08-09 17:00:26'),
  ('3', '2005-08-10 06:59:53', '2005-08-10 17:02:53'),
  ('3', '2005-08-11 07:01:15', '2005-08-11 17:04:23'),
  ('3', '2005-08-12 07:00:51', '2005-08-12 16:57:52'),
  ('4', '2005-08-08 06:54:37', '2005-08-08 17:01:23'),
  ('4', '2005-08-09 06:58:23', '2005-08-09 17:00:54'),
  ('4', '2005-08-10 06:59:14', '2005-08-10 17:00:12'),
  ('4', '2005-08-11 07:00:49', '2005-08-11 17:00:34'),
  ('4', '2005-08-12 07:01:09', '2005-08-12 16:58:29'),
  ('5', '2005-08-08 07:00:04', '2005-08-08 17:01:43'),
  ('5', '2005-08-09 07:02:12', '2005-08-09 17:02:13'),
  ('5', '2005-08-10 06:59:39', '2005-08-10 17:03:37'),
  ('5', '2005-08-11 07:01:26', '2005-08-11 17:00:03'),
  ('5', '2005-08-12 07:02:15', '2005-08-12 16:59:02'),
  ('6', '2005-08-08 07:00:12', '2005-08-08 17:01:02'),
  ('6', '2005-08-09 07:03:44', '2005-08-09 17:00:00'),
  ('6', '2005-08-10 06:54:19', '2005-08-10 17:03:31'),
  ('6', '2005-08-11 07:00:05', '2005-08-11 17:02:57'),
  ('6', '2005-08-12 07:02:07', '2005-08-12 16:58:23');

Working with the Employee Database

Now that we have a cleanly structured database to work with, let us begin this tutorial by stepping up one notch from the last tutorial and filtering our information a little.

Filtering by Name

Earlier in the week, an anonymous employee reported that Helmholtz came into work almost four minutes late; to verify this, we will begin our investigation by filtering out employees whose first names are "Helmholtz":

SELECT
  `Employees`.`First_Name`,
  `Employees`.`Last_Name`,
  `Hours`.`Clock_In`,
  `Hours`.`Clock_Out`
FROM `Employees`
INNER JOIN `Hours` ON `Employees`.`ID` = `Hours`.`ID`
WHERE `Employees`.`First_Name` = 'Helmholtz';

The result looks like this:

+------------+-----------+---------------------+---------------------+
| First_Name | Last_Name | Clock_In            | Clock_Out           |
+------------+-----------+---------------------+---------------------+
| Helmholtz  | Watson    | 2005-08-08 07:00:12 | 2005-08-08 17:01:02 |
| Helmholtz  | Watson    | 2005-08-09 07:03:44 | 2005-08-09 17:00:00 |
| Helmholtz  | Watson    | 2005-08-10 06:54:19 | 2005-08-10 17:03:31 |
| Helmholtz  | Watson    | 2005-08-11 07:00:05 | 2005-08-11 17:02:57 |
| Helmholtz  | Watson    | 2005-08-12 07:02:07 | 2005-08-12 16:58:23 |
+------------+-----------+---------------------+---------------------+
5 rows in set (0.00 sec)

This is obviously more information than we care to trudge through, considering we only care about when he arrived past 7:00:59 on any given day within this week; thus, we need to add a couple more conditions to our WHERE clause.

Filtering by Name, Date and Time

In the following example, we will filter out all of the times which Helmholtz clocked in that were before 7:01:00 and during the work week that lasted from the 8th to the 12th of August:

SELECT
  `Employees`.`First_Name`,
  `Employees`.`Last_Name`,
  `Hours`.`Clock_In`,
  `Hours`.`Clock_Out`
FROM `Employees`
INNER JOIN `Hours` ON `Employees`.`ID` = `Hours`.`ID`
WHERE `Employees`.`First_Name` = 'Helmholtz'
AND DATE_FORMAT(`Hours`.`Clock_In`, '%Y-%m-%d') >= '2005-08-08'
AND DATE_FORMAT(`Hours`.`Clock_In`, '%Y-%m-%d') <= '2005-08-12'
AND DATE_FORMAT(`Hours`.`Clock_In`, '%H:%i:%S') > '07:00:59';

The result looks like this:

+------------+-----------+---------------------+---------------------+
| First_Name | Last_Name | Clock_In            | Clock_Out           |
+------------+-----------+---------------------+---------------------+
| Helmholtz  | Watson    | 2005-08-09 07:03:44 | 2005-08-09 17:00:00 |
| Helmholtz  | Watson    | 2005-08-12 07:02:07 | 2005-08-12 16:58:23 |
+------------+-----------+---------------------+---------------------+
2 rows in set (0.00 sec)

By merely adding a few more conditions, we eliminated all of the irrelevant information; Helmholtz was late to work on the 9th and the 12th of August.

Displaying Total Work Hours per Day

Suppose you would like to—based on the information stored in both of our tables in the employee database—develop a quick list of the total hours each employee has worked for each day recorded; a simple way to estimate the time each employee worked per day is exemplified below:

SELECT
  `Employees`.`ID`,
  `Employees`.`First_Name`,
  `Employees`.`Last_Name`,
  `Hours`.`Clock_In`,
  `Hours`.`Clock_Out`,
DATE_FORMAT(`Hours`.`Clock_Out`, '%T')-DATE_FORMAT(`Hours`.`Clock_In`, '%T') 
AS 'Total_Hours'
FROM `Employees` 
INNER JOIN `Hours` ON `Employees`.`ID` = `Hours`.`ID`;

The result (limited to 10 rows) looks like this:

+----+------------+-----------+---------------------+---------------------+-------------+
| ID | First_Name | Last_Name | Clock_In            | Clock_Out           | Total_Hours |
+----+------------+-----------+---------------------+---------------------+-------------+
|  1 | Mustapha   | Mond      | 2005-08-08 07:00:42 | 2005-08-08 17:01:36 |          10 |
|  1 | Mustapha   | Mond      | 2005-08-09 07:01:34 | 2005-08-09 17:10:11 |          10 |
|  1 | Mustapha   | Mond      | 2005-08-10 06:59:56 | 2005-08-10 17:09:29 |          11 |
|  1 | Mustapha   | Mond      | 2005-08-11 07:00:17 | 2005-08-11 17:00:47 |          10 |
|  1 | Mustapha   | Mond      | 2005-08-12 07:02:29 | 2005-08-12 16:59:12 |           9 |
|  2 | Henry      | Foster    | 2005-08-08 07:00:25 | 2005-08-08 17:03:13 |          10 |
|  2 | Henry      | Foster    | 2005-08-09 07:00:57 | 2005-08-09 17:05:09 |          10 |
|  2 | Henry      | Foster    | 2005-08-10 06:58:43 | 2005-08-10 16:58:24 |          10 |
|  2 | Henry      | Foster    | 2005-08-11 07:01:58 | 2005-08-11 17:00:45 |          10 |
|  2 | Henry      | Foster    | 2005-08-12 07:02:12 | 2005-08-12 16:58:57 |           9 |
+----+------------+-----------+---------------------+---------------------+-------------+
10 rows in set (0.00 sec)

Subqueries

Learn about subqueries in MariaDB Server. This section details how to use nested queries to perform complex data retrieval, filtering, and manipulation operations within a single SQL statement.

Subqueries and ALL

Subqueries using the ALL keyword will return true if the comparison returns true for each row returned by the subquery, or the subquery returns no rows.

Syntax

scalar_expression comparison_operator ALL <Table subquery>

scalar_expression may be any expression that evaluates to a single value.
comparison_operator may be any one of: =, >, <, >=, <=, <> or !=

ALL returns:

NULL if the comparison operator returns NULL for at least one row returned by the Table subquery or scalar_expression returns NULL.
FALSE if the comparison operator returns FALSE for at least one row returned by the Table subquery.
TRUE if the comparison operator returns TRUE for all rows returned by the Table subquery, or if Table subquery returns no rows.

NOT IN is an alias for <> ALL.

Examples

CREATE TABLE sq1 (num TINYINT);

CREATE TABLE sq2 (num2 TINYINT);

INSERT INTO sq1 VALUES(100);

INSERT INTO sq2 VALUES(40),(50),(60);

SELECT * FROM sq1 WHERE num > ALL (SELECT * FROM sq2);
+------+
| num  |
+------+
|  100 |
+------+

Since 100 > all of 40,50 and 60, the evaluation is true and the row is returned.

Adding a second row to sq1, where the evaluation for that record is false:

INSERT INTO sq1 VALUES(30);

SELECT * FROM sq1 WHERE num > ALL (SELECT * FROM sq2);
+------+
| num  |
+------+
|  100 |
+------+

Adding a new row to sq2, causing all evaluations to be false:

INSERT INTO sq2 VALUES(120);

SELECT * FROM sq1 WHERE num > ALL (SELECT * FROM sq2);
Empty set (0.00 sec)

When the subquery returns no results, the evaluation is still true:

SELECT * FROM sq1 WHERE num > ALL (SELECT * FROM sq2 WHERE num2 > 300);
+------+
| num  |
+------+
|  100 |
|   30 |
+------+

Evaluating against a NULL will cause the result to be unknown, or not true, and therefore return no rows:

INSERT INTO sq2 VALUES (NULL);

SELECT * FROM sq1 WHERE num > ALL (SELECT * FROM sq2);

_{This page is licensed: CC BY-SA / Gnu FDL}

Subqueries and ANY

Subqueries using the ANY keyword will return true if the comparison returns true for at least one row returned by the subquery.

Syntax

The required syntax for an ANY or SOME quantified comparison is:

scalar_expression comparison_operator ANY <Table subquery>

Or:

scalar_expression comparison_operator SOME <Table subquery>

scalar_expression may be any expression that evaluates to a single value.
comparison_operator may be any one of =, >, <, >=, <=, <> or !=.

ANY returns:

TRUE if the comparison operator returns TRUE for at least one row returned by the Table subquery.
FALSE if the comparison operator returns FALSE for all rows returned by the Table subquery, or Table subquery has zero rows.
NULL if the comparison operator returns NULL for at least one row returned by the Table subquery and doesn't returns TRUE for any of them, or if scalar_expression returns NULL.

SOME is a synonym for ANY, and IN is a synonym for = ANY .

Examples

CREATE TABLE sq1 (num TINYINT);

CREATE TABLE sq2 (num2 TINYINT);

INSERT INTO sq1 VALUES(100);

INSERT INTO sq2 VALUES(40),(50),(120);

SELECT * FROM sq1 WHERE num > ANY (SELECT * FROM sq2);
+------+
| num  |
+------+
|  100 |
+------+

100 is greater than two of the three values, and so the expression evaluates as true.

SOME is a synonym for ANY:

SELECT * FROM sq1 WHERE num < SOME (SELECT * FROM sq2);
+------+
| num  |
+------+
|  100 |
+------+

IN is a synonym for = ANY, and here there are no matches, so no results are returned:

SELECT * FROM sq1 WHERE num IN (SELECT * FROM sq2);
Empty set (0.00 sec)

INSERT INTO sq2 VALUES(100);
Query OK, 1 row affected (0.05 sec)

SELECT * FROM sq1 WHERE num <> ANY (SELECT * FROM sq2);
+------+
| num  |
+------+
|  100 |
+------+

Reading this query, the results may be counter-intuitive. It may seem to read as SELECT * FROM sq1 WHERE num does not match any results in sq2. Since it does match 100, it could seem that the results are incorrect. However, the query returns a result if the match does not match any of sq2. Since 100 already does not match 40, the expression evaluates to true immediately, regardless of the 100's matching. It may be more easily readable to use SOME in a case such as this:

SELECT * FROM sq1 WHERE num <> SOME (SELECT * FROM sq2);
+------+
| num  |
+------+
|  100 |
+------+

_{This page is licensed: CC BY-SA / Gnu FDL}

Subqueries and EXISTS

Syntax

SELECT ... WHERE EXISTS <Table subquery>

Description

Subqueries using the EXISTS keyword will return true if the subquery returns any rows. Conversely, subqueries using NOT EXISTS will return true only if the subquery returns no rows from the table.

EXISTS subqueries ignore the columns specified by the SELECT of the subquery, since they're not relevant. For example,

SELECT col1 FROM t1 WHERE EXISTS (SELECT * FROM t2);

and

SELECT col1 FROM t1 WHERE EXISTS (SELECT col2 FROM t2);

produce identical results.

Examples

CREATE TABLE sq1 (num TINYINT);

CREATE TABLE sq2 (num2 TINYINT);

INSERT INTO sq1 VALUES(100);

INSERT INTO sq2 VALUES(40),(50),(60);

SELECT * FROM sq1 WHERE EXISTS (SELECT * FROM sq2 WHERE num2>50);
+------+
| num  |
+------+
|  100 |
+------+

SELECT * FROM sq1 WHERE NOT EXISTS (SELECT * FROM sq2 GROUP BY num2 HAVING MIN(num2)=40);
Empty set (0.00 sec)

_{This page is licensed: CC BY-SA / Gnu FDL}

Subqueries and JOINs

A subquery can quite often, but not in all cases, be rewritten as a JOIN.

Rewriting Subqueries as JOINS

A subquery using IN can be rewritten with the DISTINCT keyword, for example:

SELECT * FROM table1 WHERE col1 IN (SELECT col1 FROM table2);

can be rewritten as:

SELECT DISTINCT table1.* FROM table1, table2 WHERE table1.col1=table2.col1;

NOT IN or NOT EXISTS queries can also be rewritten. For example, these two queries returns the same result:

SELECT * FROM table1 WHERE col1 NOT IN (SELECT col1 FROM table2);
SELECT * FROM table1 WHERE NOT EXISTS (SELECT col1 FROM table2 
         WHERE table1.col1=table2.col1);

and both can be rewritten as:

SELECT table1.* FROM table1 LEFT JOIN table2 ON table1.id=table2.id 
       WHERE table2.id IS NULL;

Subqueries that can be rewritten as a LEFT JOIN are sometimes more efficient.

Using Subqueries instead of JOINS

There are some scenarios, though, which call for subqueries rather than joins:

When you want duplicates, but not false duplicates. Suppose Table_1 has three rows — {1,1,2} — and Table_2 has two rows — {1,2,2}. If you need to list the rows in Table_1 which are also in Table_2, only this subquery-based SELECT statement will give the right answer (1,1,2):

SELECT Table_1.column_1 
FROM   Table_1 
WHERE  Table_1.column_1 IN 
  (SELECT Table_2.column_1 
    FROM   Table_2);

This SQL statement won't work:

SELECT Table_1.column_1 
FROM   Table_1,Table_2 
WHERE  Table_1.column_1 = Table_2.column_1;

because the result will be {1,1,2,2} — and the duplication of 2 is an error. This SQL statement won't work either:

SELECT DISTINCT Table_1.column_1 
FROM   Table_1,Table_2 
WHERE  Table_1.column_1 = Table_2.column_1;

because the result will be {1,2} — and the removal of the duplicated 1 is an error too.
When the outermost statement is not a query. The SQL statement:

UPDATE Table_1 SET column_1 = (SELECT column_1 FROM Table_2);

can't be expressed using a join unless some rare SQL3 features are used.
When the join is over an expression. The SQL statement:

SELECT * FROM Table_1 
WHERE column_1 + 5 =
  (SELECT MAX(column_1) FROM Table_2);

is hard to express with a join. In fact, the only way we can think of is this SQL statement:

SELECT Table_1.*
FROM   Table_1, 
      (SELECT MAX(column_1) AS max_column_1 FROM Table_2) AS Table_2
WHERE  Table_1.column_1 + 5 = Table_2.max_column_1;

which still involves a parenthesized query, so nothing is gained from the transformation.
When you want to see the exception. For example, suppose the question is: Which books are longer than Das Kapital? These two queries are effectively almost the same:

SELECT DISTINCT Bookcolumn_1.*                     
FROM   Books AS Bookcolumn_1 JOIN Books AS Bookcolumn_2 USING(page_count) 
WHERE  title = 'Das Kapital';

SELECT DISTINCT Bookcolumn_1.* 
FROM   Books AS Bookcolumn_1 
WHERE  Bookcolumn_1.page_count > 
  (SELECT DISTINCT page_count 
  FROM   Books AS Bookcolumn_2 
  WHERE  title = 'Das Kapital');

The difference is between these two SQL statements is, if there are two editions of Das Kapital (with different page counts), then the self-join example will return the books which are longer than the shortest edition of Das Kapital. That might be the wrong answer, since the original question didn't ask for "... longer than ANY book named Das Kapital" (it seems to contain a false assumption that there's only one edition).

_{This page is licensed: CC BY-SA / Gnu FDL}

Subqueries in a FROM Clause (Derived Tables)

Although subqueries are more commonly placed in a WHERE clause, they can also form part of the FROM clause. Such subqueries are commonly called derived tables.

If a subquery is used in this way, you must also use an AS clause to name the result of the subquery.

ORACLE mode

MariaDB starting with 10.6.0

Anonymous subqueries in a FROM clause (no AS clause) are permitted in ORACLE mode.

Anonymous subqueries in a FROM clause (no AS clause) are not permitted in ORACLE mode.

Correlation Column List

MariaDB starting with 11.7.0

It is possible to assign column names in the derived table name syntax element.

It is not possible to assign column names in the derived table name syntax element.

Examples

CREATE TABLE student (name CHAR(10), test CHAR(10), score TINYINT); 

INSERT INTO student VALUES 
  ('Chun', 'SQL', 75), ('Chun', 'Tuning', 73), 
  ('Esben', 'SQL', 43), ('Esben', 'Tuning', 31), 
  ('Kaolin', 'SQL', 56), ('Kaolin', 'Tuning', 88), 
  ('Tatiana', 'SQL', 87), ('Tatiana', 'Tuning', 83);

Assume that, given the data above, you want to return the average total for all students. In other words, the average of Chun's 148 (75+73), Esben's 74 (43+31), etc.

You cannot do the following:

SELECT AVG(SUM(score)) FROM student GROUP BY name;
ERROR 1111 (HY000): Invalid use of group function

A subquery in the FROM clause is however permitted:

SELECT AVG(sq_sum) FROM (SELECT SUM(score) AS sq_sum FROM student GROUP BY name) AS t;
+-------------+
| AVG(sq_sum) |
+-------------+
|    134.0000 |
+-------------+

The following is permitted:

SELECT * FROM (SELECT 1 FROM DUAL), (SELECT 2 FROM DUAL);

In this example, the second column of the derived table dt is used both within (WHERE c2 > 0), and outside, (WHERE a2 > 10), the specification. Both conditions apply to t1.c2.

CREATE OR REPLACE TABLE t1(c1 INT, c2 INT, c3 INT);

SELECT a1, a2 FROM (SELECT c1, c2, c3 FROM t1 WHERE c2 > 0) AS dt (a1, a2, a3);

_{This page is licensed: CC BY-SA / Gnu FDL}

Row Subqueries

A row subquery is a subquery returning a single row, as opposed to a scalar subquery, which returns a single column from a row, or a literal.

Examples

CREATE TABLE staff (name VARCHAR(10), age TINYINT);

CREATE TABLE customer (name VARCHAR(10), age TINYINT);

INSERT INTO staff VALUES ('Bilhah',37), ('Valerius',61), ('Maia',25);

INSERT INTO customer VALUES ('Thanasis',48), ('Valerius',61), ('Brion',51);

SELECT * FROM staff WHERE (name,age) = (SELECT name,age FROM customer WHERE name='Valerius');
+----------+------+
| name     | age  |
+----------+------+
| Valerius |   61 |
+----------+------+

Finding all rows in one table and also in another:

SELECT name,age FROM staff WHERE (name,age) IN (SELECT name,age FROM customer);
+----------+------+
| name     | age  |
+----------+------+
| Valerius |   61 |
+----------+------+

_{This page is licensed: CC BY-SA / Gnu FDL}

Scalar Subqueries

A scalar subquery is a subquery that returns a single value. This is the simplest form of a subquery, and can be used in most places a literal or single column value is valid.

The data type, length and character set and collation are all taken from the result returned by the subquery. The result of a subquery can always be NULL, that is, no result returned. Even if the original value is defined as NOT NULL, this is disregarded.

A subquery cannot be used where only a literal is expected, for example LOAD DATA INFILE expects a literal string containing the file name, and LIMIT requires a literal integer.

Examples

CREATE TABLE sq1 (num TINYINT);

CREATE TABLE sq2 (num TINYINT);

INSERT INTO sq1 VALUES (1);

INSERT INTO sq2 VALUES (10* (SELECT num FROM sq1));

SELECT * FROM sq2;
+------+
| num  |
+------+
|   10 |
+------+

Inserting a second row means the subquery is no longer a scalar, and this particular query is not valid:

INSERT INTO sq1 VALUES (2);

INSERT INTO sq2 VALUES (10* (SELECT num FROM sq1));
ERROR 1242 (21000): Subquery returns more than 1 row

No rows in the subquery, so the scalar is NULL:

INSERT INTO sq2 VALUES (10* (SELECT num FROM sq3 WHERE num='3'));

SELECT * FROM sq2;
+------+
| num  |
+------+
|   10 |
| NULL |
+------+

A more traditional scalar subquery, as part of a WHERE clause:

SELECT * FROM sq1 WHERE num = (SELECT MAX(num)/10 FROM sq2); 
+------+
| num  |
+------+
|    1 |
+------+

_{This page is licensed: CC BY-SA / Gnu FDL}

Subquery Limitations

There are a number of limitations regarding subqueries, which are discussed below.

The following tables and data will be used in the examples that follow:

CREATE TABLE staff(name VARCHAR(10),age TINYINT);

CREATE TABLE customer(name VARCHAR(10),age TINYINT);

INSERT INTO staff VALUES 
('Bilhah',37), ('Valerius',61), ('Maia',25);

INSERT INTO customer VALUES 
('Thanasis',48), ('Valerius',61), ('Brion',51);

ORDER BY and LIMIT

To use ORDER BY or limit LIMIT in subqueries both must be used.. For example:

SELECT * FROM staff WHERE name IN (SELECT name FROM customer ORDER BY name);
+----------+------+
| name     | age  |
+----------+------+
| Valerius |   61 |
+----------+------+

is valid, but

SELECT * FROM staff WHERE name IN (SELECT NAME FROM customer ORDER BY name LIMIT 1);
ERROR 1235 (42000): This version of MariaDB doesn't 
  yet support 'LIMIT & IN/ALL/ANY/SOME subquery'

is not.

Modifying and Selecting from the Same Table

It's not possible to both modify and select from the same table in a subquery. For example:

DELETE FROM staff WHERE name = (SELECT name FROM staff WHERE age=61);
ERROR 1093 (HY000): Table 'staff' is specified twice, both 
  as a target for 'DELETE' and as a separate source for data

Row Comparison Operations

There is only partial support for row comparison operations. The expression in

expr op {ALL|ANY|SOME} subquery,

must be scalar and the subquery can only return a single column.

However, because of the way IN is implemented (it is rewritten as a sequence of = comparisons and AND), the expression in

expression [NOT] IN subquery

is permitted to be an n-tuple and the subquery can return rows of n-tuples.

For example:

SELECT * FROM staff WHERE (name,age) NOT IN (
  SELECT name,age FROM customer WHERE age >=51]
);
+--------+------+
| name   | age  |
+--------+------+
| Bilhah |   37 |
| Maia   |   25 |
+--------+------+

is permitted, but

SELECT * FROM staff WHERE (name,age) = ALL (
  SELECT name,age FROM customer WHERE age >=51
);
ERROR 1241 (21000): Operand should contain 1 column(s)

is not.

Correlated Subqueries

Subqueries in the FROM clause cannot be correlated subqueries. They cannot be evaluated for each row of the outer query since they are evaluated to produce a result set during when the query is executed.

Stored Functions

A subquery can refer to a stored function which modifies data. This is an extension to the SQL standard, but can result in indeterminate outcomes. For example, take:

SELECT ... WHERE x IN (SELECT f() ...);

where f() inserts rows. The function f() could be executed a different number of times depending on how the optimizer chooses to handle the query.

This sort of construct is therefore not safe to use in replication that is not row-based, as there could be different results on the master and the slave.

_{This page is licensed: CC BY-SA / Gnu FDL}

Data Manipulation (DML)

Changing & Deleting Data

DELETE

Syntax

Description

AS

PARTITION

FOR PORTION OF

RETURNING

Same Source and Target Table

DELETE HISTORY

Examples

Deleting from the Same Source and Target

See Also

HIGH_PRIORITY and LOW_PRIORITY

See Also

REPLACE

Syntax

Description

PARTITION

REPLACE RETURNING

Examples

Examples

See Also

REPLACE...RETURNING

Syntax

Description

Examples

See Also

UPDATE

Syntax

Description

PARTITION

FOR PORTION OF

UPDATE Statements With the Same Source and Target

Example

See Also

Inserting & Loading Data

Concurrent Inserts

Notes

See Also

IGNORE

INSERT - Default & Duplicate Values

Default Values

Duplicate Values

See Also

INSERT DELAYED

Syntax

Description

Limitations

See Also

INSERT IGNORE

Ignoring Errors

Examples

See Also

INSERT ON DUPLICATE KEY UPDATE

Syntax

Description

Examples

See Also

INSERT SELECT

Syntax

Description

See Also

INSERT

Syntax

INSERT DELAYED

HIGH PRIORITY and LOW PRIORITY

Defaults and Duplicate Values

INSERT IGNORE

INSERT ON DUPLICATE KEY UPDATE

Examples

INSERT ... RETURNING

Examples

See Also

INSERT...RETURNING

Syntax

Description

Examples

See Also

`IGNORE number {LINES|ROWS}`