Learn data manipulation language (DML) statements in MariaDB Server. This section covers SQL commands for inserting, updating, deleting, and selecting data within your databases.
Learn to change and delete data in MariaDB Server. This section covers UPDATE and DELETE SQL statements, enabling you to modify existing records and remove unwanted information efficiently.
Single-table syntax:
DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
FROM tbl_name [PARTITION (partition_list)]
[FOR PORTION OF PERIOD FROM expr1 TO expr2]
[AS alias] -- from MariaDB 11.6
[WHERE where_condition]
[ORDER BY ...]
[LIMIT row_count]
[RETURNING select_expr
[, select_expr ...]]
Multiple-table syntax:
DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
tbl_name[.*] [, tbl_name[.*]] ...
FROM table_references
[WHERE where_condition]
Or:
DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
FROM tbl_name[.*] [, tbl_name[.*]] ...
USING table_references
[WHERE where_condition]
Trimming history:
DELETE HISTORY
FROM tbl_name [PARTITION (partition_list)]
[BEFORE SYSTEM_TIME [TIMESTAMP|TRANSACTION] expression]
LOW_PRIORITY
Wait until all SELECT's are done before starting the statement. Used with storage engines that uses table locking (MyISAM, Aria etc). See HIGH_PRIORITY and LOW_PRIORITY clauses for details.
QUICK
Signal the storage engine that it should expect that a lot of rows are deleted. The storage engine can do things to speed up the DELETE like ignoring merging of data blocks until all rows are deleted from the block (instead of when a block is half full). This speeds up things at the expanse of lost space in data blocks. At least MyISAM and Aria support this feature.
IGNORE
Don't stop the query even if a not-critical error occurs (like data overflow). See How IGNORE works for a full description.
For the single-table syntax, the DELETE
statement deletes rows from tbl_name and returns a count of the number of deleted rows. This count can be obtained by calling the ROW_COUNT() function. TheWHERE
clause, if given, specifies the conditions that identify which rows to delete. With no WHERE
clause, all rows are deleted. If the ORDER BY clause is specified, the rows are deleted in the order that is specified. The LIMIT clause places a limit on the number of rows that can be deleted.
For the multiple-table syntax, DELETE
deletes from eachtbl_name
the rows that satisfy the conditions. In this case,ORDER BY and LIMIT> cannot be used. A DELETE
can also reference tables which are located in different databases; see Identifier Qualifiers for the syntax.
where_condition
is an expression that evaluates to true for each row to be deleted. It is specified as described in SELECT.
Currently, you cannot delete from a table and select from the same table in a subquery.
You need the DELETE
privilege on a table to delete rows from it. You need only the SELECT
privilege for any columns that are only read, such as those named in the WHERE
clause. SeeGRANT.
As stated, a DELETE
statement with no WHERE
clause deletes all rows. A faster way to do this, when you do not need to know the number of deleted rows, is to use TRUNCATE TABLE
. However,
within a transaction or if you have a lock on the table,TRUNCATE TABLE
cannot be used whereas DELETE
can. See TRUNCATE TABLE, andLOCK.
Single-table DELETE
statements support aliases. For example:
CREATE TABLE t1 (c1 INT);
INSERT INTO t1 VALUES (1), (2);
DELETE FROM t1 AS a1 WHERE a1.c1 = 2;
Single-table DELETE
statements do not support aliases.
See Partition Pruning and Selection for details.
See Application Time Periods - Deletion by Portion.
It is possible to return a resultset of the deleted rows for a single table to the client by using the syntax DELETE ... RETURNING select_expr [, select_expr2 ...]]
Any of SQL expression that can be calculated from a single row fields is allowed. Subqueries are allowed. The AS keyword is allowed, so it is possible to use aliases.
The use of aggregate functions is not allowed. RETURNING cannot be used in multi-table DELETEs.
It is possible to delete from a table with the same source and target. For example:
DELETE FROM t1 WHERE c1 IN (SELECT b.c1 FROM t1 b WHERE b.c2=0);
You can use DELETE HISTORY
to delete historical information from System-versioned tables.
How to use the ORDER BY and LIMIT clauses:
DELETE FROM page_hit ORDER BY TIMESTAMP LIMIT 1000000;
How to use the RETURNING clause:
DELETE FROM t RETURNING f1;
+------+
| f1 |
+------+
| 5 |
| 50 |
| 500 |
+------+
The following statement joins two tables: one is only used to satisfy a WHERE condition, but no row is deleted from it; rows from the other table are deleted, instead.
DELETE post FROM blog INNER JOIN post WHERE blog.id = post.blog_id;
CREATE TABLE t1 (c1 INT, c2 INT);
DELETE FROM t1 WHERE c1 IN (SELECT b.c1 FROM t1 b WHERE b.c2=0);
The statement returns:
Query OK, 0 rows affected (0.00 sec)
The statement returns:
ERROR 1093 (HY000): Table 't1' is specified twice, both as a target for 'DELETE'
AND AS a separate source FOR
Returning clause (video)
This page is licensed: GPLv2, originally from fill_help_tables.sql
The InnoDB storage engine uses row-level locking to ensure data integrity. However some storage engines (such as MEMORY, MyISAM, Aria and MERGE) lock the whole table to prevent conflicts. These storage engines use two separate queues to remember pending statements; one is for SELECTs and the other one is for write statements (INSERT, DELETE, UPDATE). By default, the latter has a higher priority.
To give write operations a lower priority, the low_priority_updates server system variable can be set to ON
. The option is available on both the global and session levels, and it can be set at startup or via the SET statement.
When too many table locks have been set by write statements, some pending SELECTs are executed. The maximum number of write locks that can be acquired before this happens is determined by the max_write_lock_count server system variable, which is dynamic.
If write statements have a higher priority (default), the priority of individual write statements (INSERT, REPLACE, UPDATE, DELETE) can be changed via the LOW_PRIORITY
attribute, and the priority of a SELECT
statement can be raised via the HIGH_PRIORITY
attribute. Also, LOCK TABLES supports a LOW_PRIORITY
attribute for WRITE
locks.
If read statements have a higher priority, the priority of an INSERT
can be changed via the HIGH_PRIORITY
attribute. However, the priority of other write statements cannot be raised individually.
The use of LOW_PRIORITY
or HIGH_PRIORITY
for an INSERT
prevents Concurrent Inserts from being used.
This page is licensed: CC BY-SA / Gnu FDL
REPLACE [LOW_PRIORITY | DELAYED]
[INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
{VALUES | VALUE} ({expr | DEFAULT},...),(...),...
[RETURNING select_expr
[, select_expr ...]]
Or:
REPLACE [LOW_PRIORITY | DELAYED]
[INTO] tbl_name [PARTITION (partition_list)]
SET col={expr | DEFAULT}, ...
[RETURNING select_expr
[, select_expr ...]]
Or:
REPLACE [LOW_PRIORITY | DELAYED]
[INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
SELECT ...
[RETURNING select_expr
[, select_expr ...]]
REPLACE
works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY
or aUNIQUE
index, the old row is deleted before the new row is inserted. If the table has more than one UNIQUE
keys, it is possible that the new row conflicts with more than one row. In this case, all conflicting rows will be deleted.
The table name can be specified in the form db_name
.tbl_name
or, if a default database is selected, in the form tbl_name
(see Identifier Qualifiers). This allows to use REPLACE ... SELECT to copy rows between different databases.
Basically it works like this:
BEGIN;
SELECT 1 FROM t1 WHERE key=# FOR UPDATE;
IF found-row
DELETE FROM t1 WHERE key=# ;
ENDIF
INSERT INTO t1 VALUES (...);
END;
The above can be replaced with:
REPLACE INTO t1 VALUES (...)
REPLACE
is a MariaDB/MySQL extension to the SQL standard. It either inserts, or deletes and inserts. For other MariaDB/MySQL extensions to standard SQL --- that also handle duplicate values --- see IGNORE and INSERT ON DUPLICATE KEY UPDATE.
Note that unless the table has a PRIMARY KEY
orUNIQUE
index, using a REPLACE
statement makes no sense. It becomes equivalent to INSERT
, because there is no index to be used to determine whether a new row duplicates another.
Values for all columns are taken from the values. See Partition Pruning and Selection for details. Specified in theREPLACE
statement. Any missing columns are set to their default values, just as happens for INSERT
. You cannot refer to values from the current row and use them in the new row. If you use an assignment such as 'SET col = col + 1'
, the reference to the column name on the right hand side is treated asDEFAULT(col)
, so the assignment is equivalent to'SET col = DEFAULT(col) + 1'
.
To use REPLACE
, you must have both theINSERT
and DELETE
privileges for the table.
There are some issues you should be aware of, before using REPLACE
:
If there is an AUTO_INCREMENT field, a new value will be generated.
If there are foreign keys, ON DELETE
action will be activated by REPLACE
.
Triggers on DELETE
and INSERT
will be activated by REPLACE
.
To avoid some of these behaviors, you can use INSERT ... ON DUPLICATE KEY UPDATE
.
This statement activates INSERT
and DELETE
triggers. See Trigger Overview for details.
See Partition Pruning and Selection for details.
REPLACE ... RETURNING
returns a resultset of the replaced rows. This returns the listed columns for all the rows that are replaced, or alternatively, the specified SELECT
expression. Any SQL expressions which can be calculated can be used in the select expression for the RETURNING
clause, including virtual columns and aliases, expressions which use various operators such as bitwise, logical and arithmetic operators, string functions, date-time functions, numeric functions, control flow functions, secondary functions and stored functions. Along with this, statements which have subqueries and prepared statements can also be used.
Simple REPLACE
statement:
REPLACE INTO t2 VALUES (1,'Leopard'),(2,'Dog') RETURNING id2, id2+id2
as Total ,id2|id2, id2&&id2;
+-----+-------+---------+----------+
| id2 | Total | id2|id2 | id2&&id2 |
+-----+-------+---------+----------+
| 1 | 2 | 1 | 1 |
| 2 | 4 | 2 | 1 |
+-----+-------+---------+----------+
Using stored functions in RETURNING
:
DELIMITER |
CREATE FUNCTION f(arg INT) RETURNS INT
BEGIN
RETURN (SELECT arg+arg);
END|
DELIMITER ;
PREPARE stmt FROM "REPLACE INTO t2 SET id2=3, animal2='Fox' RETURNING f2(id2),
UPPER(animal2)";
EXECUTE stmt;
+---------+----------------+
| f2(id2) | UPPER(animal2) |
+---------+----------------+
| 6 | FOX |
+---------+----------------+
Subqueries in the statement:
REPLACE INTO t1 SELECT * FROM t2 RETURNING (SELECT id2 FROM t2 WHERE
id2 IN (SELECT id2 FROM t2 WHERE id2=1)) AS new_id;
+--------+
| new_id |
+--------+
| 1 |
| 1 |
| 1 |
| 1 |
+--------+
Subqueries in the RETURNING
clause that return more than one row or column cannot be used..
Aggregate functions cannot be used in the RETURNING clause. Since aggregate functions work on a set of values and if the purpose is to get the row count, ROW_COUNT()
with SELECT
can be used, or it can be used in REPLACE...SELECT
.
REPLACE ... RETURNING
returns a result set of the replaced rows. This returns the listed columns for all the rows that are replaced, or alternatively, the specified SELECT
expression. Any SQL expressions which can be calculated can be used in the select expression for the RETURNING
clause, including virtual columns and aliases, expressions which use various operators such as bitwise, logical and arithmetic operators, string functions, date-time functions, numeric functions, control flow functions, secondary functions and stored functions. Along with this, statements which have subqueries and prepared statements can also be used.
Simple REPLACE
statement:
REPLACE INTO t2 VALUES (1,'Leopard'),(2,'Dog') RETURNING id2, id2+id2
as Total ,id2|id2, id2&&id2;
+-----+-------+---------+----------+
| id2 | Total | id2|id2 | id2&&id2 |
+-----+-------+---------+----------+
| 1 | 2 | 1 | 1 |
| 2 | 4 | 2 | 1 |
+-----+-------+---------+----------+
Using stored functions in RETURNING
:
DELIMITER |
CREATE FUNCTION f(arg INT) RETURNS INT
BEGIN
RETURN (SELECT arg+arg);
END|
DELIMITER ;
PREPARE stmt FROM "REPLACE INTO t2 SET id2=3, animal2='Fox' RETURNING f2(id2),
UPPER(animal2)";
EXECUTE stmt;
+---------+----------------+
| f2(id2) | UPPER(animal2) |
+---------+----------------+
| 6 | FOX |
+---------+----------------+
Subqueries in the statement:
REPLACE INTO t1 SELECT * FROM t2 RETURNING (SELECT id2 FROM t2 WHERE
id2 IN (SELECT id2 FROM t2 WHERE id2=1)) AS new_id;
+--------+
| new_id |
+--------+
| 1 |
| 1 |
| 1 |
| 1 |
+--------+
Subqueries in the RETURNING
clause that return more than one row or column cannot be used..
Aggregate functions cannot be used in the RETURNING
clause. Since aggregate functions work on a set of values and if the purpose is to get the row count, ROW_COUNT()
with SELECT
can be used, or it can be used in REPLACE...SELECT...RETURNING
if the table in the RETURNING
clause is not the same as the REPLACE
table. SELECT...RETURNING
if the table in the RETURNING
clause is not the same as the REPLACE
table.
INSERT DELAYED for details on the DELAYED
clause
This page is licensed: GPLv2, originally from fill_help_tables.sql
REPLACE [LOW_PRIORITY | DELAYED]
[INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
{VALUES | VALUE} ({expr | DEFAULT},...),(...),...
[RETURNING select_expr
[, select_expr ...]]
Or:
REPLACE [LOW_PRIORITY | DELAYED]
[INTO] tbl_name [PARTITION (partition_list)]
SET col={expr | DEFAULT}, ...
[RETURNING select_expr
[, select_expr ...]]
Or:
REPLACE [LOW_PRIORITY | DELAYED]
[INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
SELECT ...
[RETURNING select_expr
[, select_expr ...]]
REPLACE ... RETURNING
returns a result set of the replaced rows. The statement returns the listed columns for all the rows that are replaced, or alternatively, the specified SELECT
expression. Any SQL expressions which can be calculated can be used in the select expression for the RETURNING clause, including virtual columns and aliases, expressions which use various operators such as bitwise, logical and arithmetic operators, string functions, date-time functions, numeric functions, control flow functions, secondary functions and stored functions. Along with this, statements which have subqueries and prepared statements can also be used.
Simple REPLACE
statement:
REPLACE INTO t2 VALUES (1,'Leopard'),(2,'Dog') RETURNING id2, id2+id2
AS Total ,id2|id2, id2&&id2;
+-----+-------+---------+----------+
| id2 | Total | id2|id2 | id2&&id2 |
+-----+-------+---------+----------+
| 1 | 2 | 1 | 1 |
| 2 | 4 | 2 | 1 |
+-----+-------+---------+----------+
Using stored functions in RETURNING
:
DELIMITER |
CREATE FUNCTION f(arg INT) RETURNS INT
BEGIN
RETURN (SELECT arg+arg);
END|
DELIMITER ;
PREPARE stmt FROM "REPLACE INTO t2 SET id2=3, animal2='Fox' RETURNING f2(id2),
UPPER(animal2)";
EXECUTE stmt;
+---------+----------------+
| f2(id2) | UPPER(animal2) |
+---------+----------------+
| 6 | FOX |
+---------+----------------+
Subqueries in the statement:
REPLACE INTO t1 SELECT * FROM t2 RETURNING (SELECT id2 FROM t2 WHERE
id2 IN (SELECT id2 FROM t2 WHERE id2=1)) AS new_id;
+--------+
| new_id |
+--------+
| 1 |
| 1 |
| 1 |
| 1 |
+--------+
Subqueries in the RETURNING
clause that return more than one row or column cannot be used..
Aggregate functions cannot be used in the RETURNING
clause. Since aggregate functions work on a set of values and if the purpose is to get the row count, ROW_COUNT()
with SELECT
can be used, or it can be used in REPLACE...SELECT...RETURNING
if the table in the RETURNING
clause is not the same as the REPLACE
table.
Returning clause (video)
This page is licensed: CC BY-SA / Gnu FDL
Single-table syntax:
UPDATE [LOW_PRIORITY] [IGNORE] table_reference
[PARTITION (partition_list)]
[FOR PORTION OF period FROM expr1 TO expr2]
SET col1={expr1|DEFAULT} [,col2={expr2|DEFAULT}] ...
[WHERE where_condition]
[ORDER BY ...]
[LIMIT row_count]
Multiple-table syntax:
UPDATE [LOW_PRIORITY] [IGNORE] table_references
SET col1={expr1|DEFAULT} [, col2={expr2|DEFAULT}] ...
[WHERE where_condition]
For the single-table syntax, the UPDATE
statement updates columns of existing rows in the named table with new values. TheSET
clause indicates which columns to modify and the values they should be given. Each value can be given as an expression, or the keywordDEFAULT
to set a column explicitly to its default value. TheWHERE
clause, if given, specifies the conditions that identify which rows to update. With no WHERE
clause, all rows are updated. If the ORDER BY clause is specified, the rows are
updated in the order that is specified. The LIMIT clause places a limit on the number of rows that can be updated.
Both clauses can be used with multiple-table updates.
An UPDATE
can also reference tables which are located in different databases; see Identifier Qualifiers for the syntax.
where_condition
is an expression that evaluates to true for each row to be updated.
table_references
and where_condition
are as specified as described in SELECT.
For single-table updates, assignments are evaluated in left-to-right order, while for multi-table updates, there is no guarantee of a particular order. If the SIMULTANEOUS_ASSIGNMENT
sql_mode is set, UPDATE statements evaluate all assignments simultaneously.
You need the UPDATE
privilege only for columns referenced in an UPDATE
that are actually updated. You need only the SELECT privilege for any columns that are read but not modified. See GRANT.
The UPDATE
statement supports the following modifiers:
If you use the LOW_PRIORITY
keyword, execution of the UPDATE
is delayed until no other clients are reading from the table. This affects only storage engines that use only table-level locking (MyISAM, MEMORY, MERGE). See HIGH_PRIORITY and LOW_PRIORITY clauses for details.
If you use the IGNORE
keyword, the update statement does not abort even if errors occur during the update. Rows for which duplicate-key conflicts occur are not updated. Rows for which columns are
updated to values that would cause data conversion errors are updated to the closest valid values instead.
See Partition Pruning and Selection for details.
See Application Time Periods - Updating by Portion.
UPDATE
statements may have the same source and target. For example, given the following table:
DROP TABLE t1;
CREATE TABLE t1 (c1 INT, c2 INT);
INSERT INTO t1 VALUES (10,10), (20,20);
UPDATE t1 SET c1=c1+1 WHERE c2=(SELECT MAX(c2) FROM t1);
SELECT * FROM t1;
+------+------+
| c1 | c2 |
+------+------+
| 10 | 10 |
| 21 | 20 |
+------+------+
Single-table syntax:
UPDATE table_name SET column1 = value1, column2 = value2 WHERE id=100;
Multiple-table syntax:
UPDATE tab1, tab2 SET tab1.column1 = value1, tab1.column2 = value2 WHERE tab1.id = tab2.id;
This page is licensed: GPLv2, originally from fill_help_tables.sql
Learn to insert and load data into MariaDB Server. This section covers INSERT and LOAD DATA SQL statements, enabling you to efficiently add new records to your databases.
The MyISAM storage engine supports concurrent inserts. This feature allows SELECT statements to be executed during INSERT operations, reducing contention.
Whether concurrent inserts can be used or not depends on the value of the concurrent_insert server system variable:
NEVER
(0) disables concurrent inserts.
AUTO
(1) allows concurrent inserts only when the target table has no free blocks (no data in the middle of the table has been deleted after the last OPTIMIZE TABLE). This is the default.
ALWAYS
(2) always enables concurrent inserts, in which case new rows are added at the end of a table if the table is being used by another thread.
If the binary log is used, CREATE TABLE ... SELECT and INSERT ... SELECT statements cannot use concurrent inserts. These statements acquire a read lock on the table, so concurrent inserts will need to wait. This way the log can be safely used to restore data.
Concurrent inserts are not used by replicas with the row based replication (see binary log formats).
If an INSERT statement contain the HIGH_PRIORITY clause, concurrent inserts cannot be used. INSERT ... DELAYED is usually unneeded if concurrent inserts are enabled.
LOAD DATA INFILE uses concurrent inserts if the CONCURRENT
keyword is specified and concurrent_insert is not NEVER
. This makes the statement slower (even if no other sessions access the table) but reduces contention.
LOCK TABLES allows non-conflicting concurrent inserts if a READ LOCAL
lock is used. Concurrent inserts are not allowed if the LOCAL
keyword is omitted.
The decision to enable concurrent insert for a table is done when the table is opened. If you change the value of concurrent_insert it will only affect new opened tables. If you want it to work for also for tables in use or cached, you should do FLUSH TABLES after setting the variable.
This page is licensed: CC BY-SA / Gnu FDL
The IGNORE
option tells the server to ignore some common errors.
IGNORE
can be used with the following statements:
INSERT (see also INSERT IGNORE)
The logic used:
Variables out of ranges are replaced with the maximum/minimum value.
SQL_MODEs STRICT_TRANS_TABLES
, STRICT_ALL_TABLES
, NO_ZERO_IN_DATE
, NO_ZERO_DATE
are ignored.
Inserting NULL
in a NOT NULL
field will insert 0 ( in a numerical field), 0000-00-00 ( in a date field) or an empty string ( in a character field).
Rows that cause a duplicate key error or break a foreign key constraint are not inserted, updated, or deleted.
The following errors are ignored:
1022
ER_DUP_KEY
Can't write; duplicate key in table '%s'
1048
ER_BAD_NULL_ERROR
Column '%s' cannot be null
1062
ER_DUP_ENTRY
Duplicate entry '%s' for key %d
1242
ER_SUBQUERY_NO_1_ROW
Subquery returns more than 1 row
1264
ER_WARN_DATA_OUT_OF_RANGE
Out of range value for column '%s' at row %ld
1265
WARN_DATA_TRUNCATED
Data truncated for column '%s' at row %ld
1292
ER_TRUNCATED_WRONG_VALUE
Truncated incorrect %s value: '%s'
1366
ER_TRUNCATED_WRONG_VALUE_FOR_FIELD
Incorrect integer value
1369
ER_VIEW_CHECK_FAILED
CHECK OPTION failed '%s.%s'
1451
ER_ROW_IS_REFERENCED_2
Cannot delete or update a parent row
1452
ER_NO_REFERENCED_ROW_2
Cannot add or update a child row: a foreign key constraint fails (%s)
1526
ER_NO_PARTITION_FOR_GIVEN_VALUE
Table has no partition for value %s
1586
ER_DUP_ENTRY_WITH_KEY_NAME
Duplicate entry '%s' for key '%s'
1591
ER_NO_PARTITION_FOR_GIVEN_VALUE_SILENT
Table has no partition for some existing values
1748
ER_ROW_DOES_NOT_MATCH_GIVEN_PARTITION_SET
Found a row not matching the given partition set
Ignored errors normally generate a warning.
A property of the IGNORE
clause consists in causing transactional engines and non-transactional engines (like InnoDB and Aria) to behave the same way. For example, normally a multi-row insert which tries to violate a UNIQUE
contraint is completely rolled back on InnoDB, but might be partially executed on Aria. With the IGNORE
clause, the statement will be partially executed in both engines.
Duplicate key errors also generate warnings. The OLD_MODE server variable can be used to prevent this.
This page is licensed: CC BY-SA / Gnu FDL
If the SQL_MODE contains STRICT_TRANS_TABLES
and you are inserting into a transactional table (like InnoDB), or if the SQL_MODE contains STRICT_ALL_TABLES
, all NOT NULL
columns which do not have a DEFAULT
value (and are not AUTO_INCREMENT) must be explicitly referenced in INSERT
statements. If not, an error like this is produced:
ERROR 1364 (HY000): Field 'col' doesn't have a default value
In all other cases, if a NOT NULL
column without a DEFAULT
value is not referenced, an empty value will be inserted (for example, 0 for INTEGER
columns and '' for CHAR
columns). See NULL Values in MariaDB:Inserting for examples.
If a NOT NULL
column having a DEFAULT
value is not referenced, NULL
will be inserted.
If a NULL
column having a DEFAULT
value is not referenced, its default value will be inserted. It is also possible to explicitly assign the default value using the DEFAULT
keyword or the DEFAULT() function.
If the DEFAULT
keyword is used but the column does not have a DEFAULT
value, an error like this is produced:
ERROR 1364 (HY000): Field 'col' doesn't have a default value
By default, if you try to insert a duplicate row and there is a UNIQUE
index, INSERT
stops and an error like this is produced:
ERROR 1062 (23000): Duplicate entry 'dup_value' for key 'col'
To handle duplicates you can use the IGNORE clause, INSERT ON DUPLICATE KEY UPDATE or the REPLACE statement. Note that the IGNORE and DELAYED options are ignored when you use ON DUPLICATE KEY UPDATE.
This page is licensed: CC BY-SA / Gnu FDL
INSERT DELAYED ...
The DELAYED
option for the INSERT statement is a MariaDB/MySQL extension to standard SQL that is very useful if you have clients that cannot or need not wait for the INSERT
to complete. This is a common situation when you use MariaDB for logging and you also periodically run SELECT
and UPDATE
statements that take a long time to complete.
When a client uses INSERT DELAYED
, it gets an okay from the server at once, and the row is queued to be inserted when the table is not in use by any other thread.
Another major benefit of using INSERT DELAYED
is that inserts from many clients are bundled together and written in one block. This is much faster than performing many separate inserts.
Note that INSERT DELAYED
is slower than a normalINSERT
if the table is not otherwise in use. There is also the additional overhead for the server to handle a separate thread for each table for which there are delayed rows. This means that you should useINSERT DELAYED
only when you are really sure that you need it.
The queued rows are held only in memory until they are inserted into the table. This means that if you terminate mariadbd forcibly (for example, with kill -9) or if mariadbd dies unexpectedly, any queued rows that have not been written to disk are lost.
The number of concurrent INSERT DELAYED
threads is limited by the max_delayed_threads server system variables. If it is set to 0, INSERT DELAYED
is disabled. The session value can be equal to the global value, or 0 to disable this statement for the current session. If this limit has been reached, the DELAYED
clause will be silently ignore for subsequent statements (no error will be produced).
There are some limitations on the use of DELAYED
:
For MyISAM tables, if there are no free blocks in the middle of the data file, concurrent SELECT and INSERT statements are supported. Under these circumstances, you very seldom need to use INSERT DELAYED
with MyISAM.
INSERT DELAYED
should be used only forINSERT
statements that specify value lists. The server
ignores DELAYED
for INSERT ... SELECT
or INSERT ... ON DUPLICATE KEY UPDATE
statements.
Because the INSERT DELAYED
statement returns immediately, before the rows are inserted, you cannot useLAST_INSERT_ID()
to get theAUTO_INCREMENT
value that the statement might generate.
DELAYED
rows are not visible to SELECT
statements until they actually have been inserted.
After INSERT DELAYED
, ROW_COUNT() returns the number of the rows you tried to insert, not the number of the successful writes.
DELAYED
is ignored on slave replication servers, so thatINSERT DELAYED
is treated as a normal INSERT
on slaves. This is becauseDELAYED
could cause the slave to have different data than
the master. INSERT DELAYED
statements are not safe for replication.
Pending INSERT DELAYED
statements are lost if a table is write locked and ALTER TABLE is used to modify the table structure.
INSERT DELAYED
is not supported for views. If you try, you will get an error like this: ERROR 1347 (HY000): 'view_name' is not BASE TABLE
INSERT DELAYED
is not supported for partitioned tables.
INSERT DELAYED
is not supported within stored programs.
INSERT DELAYED
does not work with triggers.
INSERT DELAYED
does not work if there is a check constraint in place.
INSERT DELAYED
does not work if skip-new mode is active.
This page is licensed: GPLv2, originally from fill_help_tables.sql
Normally INSERT stops and rolls back when it encounters an error.
By using the IGNORE keyword all errors are converted to warnings, which will not stop inserts of additional rows.
Invalid values are changed to the closest valid value and inserted, with a warning produced.
The IGNORE and DELAYED options are ignored when you use ON DUPLICATE KEY UPDATE.
Warnings are issued for duplicate key errors when using IGNORE
. You can get the old behavior if you set OLD_MODE to NO_DUP_KEY_WARNINGS_WITH_IGNORE
.
No warnings are issued for duplicate key errors when using IGNORE
.
See IGNORE for a full description of effects.
CREATE TABLE t1 (x INT UNIQUE);
INSERT INTO t1 VALUES(1),(2);
INSERT INTO t1 VALUES(2),(3);
ERROR 1062 (23000): Duplicate entry '2' for key 'x'
SELECT * FROM t1;
+------+
| x |
+------+
| 1 |
| 2 |
+------+
INSERT IGNORE INTO t1 VALUES(2),(3);
Query OK, 1 row affected, 1 warning (0.04 sec)
SHOW WARNINGS;
+---------+------+---------------------------------+
| Level | Code | Message |
+---------+------+---------------------------------+
| Warning | 1062 | Duplicate entry '2' for key 'x' |
+---------+------+---------------------------------+
SELECT * FROM t1;
+------+
| x |
+------+
| 1 |
| 2 |
| 3 |
+------+
Converting values:
CREATE OR REPLACE TABLE t2(id INT, t VARCHAR(2) NOT NULL, n INT NOT NULL);
INSERT INTO t2(id) VALUES (1),(2);
ERROR 1364 (HY000): Field 't' doesn't have a default value
INSERT IGNORE INTO t2(id) VALUES (1),(2);
Query OK, 2 rows affected, 2 warnings (0.026 sec)
Records: 2 Duplicates: 0 Warnings: 2
SHOW WARNINGS;
+---------+------+----------------------------------------+
| Level | Code | Message |
+---------+------+----------------------------------------+
| Warning | 1364 | Field 't' doesn't have a default value |
| Warning | 1364 | Field 'n' doesn't have a default value |
+---------+------+----------------------------------------+
SELECT * FROM t2;
+------+---+---+
| id | t | n |
+------+---+---+
| 1 | | 0 |
| 2 | | 0 |
+------+---+---+
See INSERT ON DUPLICATE KEY UPDATE for further examples using that syntax.
This page is licensed: CC BY-SA / Gnu FDL
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
{VALUES | VALUE} ({expr | DEFAULT},...),(...),...
[ ON DUPLICATE KEY UPDATE
col=expr
[, col=expr] ... ]
Or:
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [PARTITION (partition_list)]
SET col={expr | DEFAULT}, ...
[ ON DUPLICATE KEY UPDATE
col=expr
[, col=expr] ... ]
Or:
INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
SELECT ...
[ ON DUPLICATE KEY UPDATE
col=expr
[, col=expr] ... ]
INSERT ... ON DUPLICATE KEY UPDATE is a MariaDB/MySQL extension to the INSERT statement that, if it finds a duplicate unique or primary key, will instead perform an UPDATE.
The row/s affected value is reported as 1 if a row is inserted, and 2 if a row is updated, unless the API's CLIENT_FOUND_ROWS
flag is set.
If more than one unique index is matched, only the first is updated. It is not recommended to use this statement on tables with more than one unique index.
If the table has an AUTO_INCREMENT primary key and the statement inserts or updates a row, the LAST_INSERT_ID() function returns its AUTO_INCREMENT value.
The VALUES() function can only be used in a ON DUPLICATE KEY UPDATE
clause and has no meaning in any other context. It returns the column values from the INSERT
portion of the statement. This function is particularly useful for multi-rows inserts.
The IGNORE and DELAYED options are ignored when you use ON DUPLICATE KEY UPDATE
.
See Partition Pruning and Selection for details on the PARTITION clause.
This statement activates INSERT and UPDATE triggers. See Trigger Overview for details.
See also a similar statement, REPLACE.
CREATE TABLE ins_duplicate (id INT PRIMARY KEY, animal VARCHAR(30));
INSERT INTO ins_duplicate VALUES (1,'Aardvark'), (2,'Cheetah'), (3,'Zebra');
If there is no existing key, the statement runs as a regular INSERT:
INSERT INTO ins_duplicate VALUES (4,'Gorilla')
ON DUPLICATE KEY UPDATE animal='Gorilla';
Query OK, 1 row affected (0.07 sec)
SELECT * FROM ins_duplicate;
+----+----------+
| id | animal |
+----+----------+
| 1 | Aardvark |
| 2 | Cheetah |
| 3 | Zebra |
| 4 | Gorilla |
+----+----------+
A regular INSERT with a primary key value of 1 will fail, due to the existing key:
INSERT INTO ins_duplicate VALUES (1,'Antelope');
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'
However, we can use an INSERT ON DUPLICATE KEY UPDATE instead:
INSERT INTO ins_duplicate VALUES (1,'Antelope')
ON DUPLICATE KEY UPDATE animal='Antelope';
Query OK, 2 rows affected (0.09 sec)
Note that there are two rows reported as affected, but this refers only to the UPDATE.
SELECT * FROM ins_duplicate;
+----+----------+
| id | animal |
+----+----------+
| 1 | Antelope |
| 2 | Cheetah |
| 3 | Zebra |
| 4 | Gorilla |
+----+----------+
Adding a second unique column:
ALTER TABLE ins_duplicate ADD id2 INT;
UPDATE ins_duplicate SET id2=id+10;
ALTER TABLE ins_duplicate ADD UNIQUE KEY(id2);
Where two rows match the unique keys match, only the first is updated. This can be unsafe and is not recommended unless you are certain what you are doing.
INSERT INTO ins_duplicate VALUES (2,'Lion',13)
ON DUPLICATE KEY UPDATE animal='Lion';
Query OK, 2 rows affected (0.004 sec)
SELECT * FROM ins_duplicate;
+----+----------+------+
| id | animal | id2 |
+----+----------+------+
| 1 | Antelope | 11 |
| 2 | Lion | 12 |
| 3 | Zebra | 13 |
| 4 | Gorilla | 14 |
+----+----------+------+
Although the third row with an id of 3 has an id2 of 13, which also matched, it was not updated.
Changing id to an auto_increment field. If a new row is added, the auto_increment is moved forward. If the row is updated, it remains the same.
ALTER TABLE `ins_duplicate` CHANGE `id` `id` INT( 11 ) NOT NULL AUTO_INCREMENT;
ALTER TABLE ins_duplicate DROP id2;
SELECT Auto_increment FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME='ins_duplicate';
+----------------+
| Auto_increment |
+----------------+
| 5 |
+----------------+
INSERT INTO ins_duplicate VALUES (2,'Leopard')
ON DUPLICATE KEY UPDATE animal='Leopard';
Query OK, 2 rows affected (0.00 sec)
SELECT Auto_increment FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME='ins_duplicate';
+----------------+
| Auto_increment |
+----------------+
| 5 |
+----------------+
INSERT INTO ins_duplicate VALUES (5,'Wild Dog')
ON DUPLICATE KEY UPDATE animal='Wild Dog';
Query OK, 1 row affected (0.09 sec)
SELECT * FROM ins_duplicate;
+----+----------+
| id | animal |
+----+----------+
| 1 | Antelope |
| 2 | Leopard |
| 3 | Zebra |
| 4 | Gorilla |
| 5 | Wild Dog |
+----+----------+
SELECT Auto_increment FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME='ins_duplicate';
+----------------+
| Auto_increment |
+----------------+
| 6 |
+----------------+
Refering to column values from the INSERT portion of the statement:
INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
See the VALUES() function for more.
This page is licensed: CC BY-SA / Gnu FDL
INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [(col_name,...)]
SELECT ...
[ ON DUPLICATE KEY UPDATE col_name=expr, ... ]
With INSERT ... SELECT
, you can quickly insert many rows into a table from one or more other tables. For example:
INSERT INTO tbl_temp2 (fld_id)
SELECT tbl_temp1.fld_order_id
FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;
tbl_name
can also be specified in the form db_name
.tbl_name
(see Identifier Qualifiers). This allows to copy rows between different databases.
If the new table has a primary key or UNIQUE indexes, you can use IGNORE to handle duplicate key errors during the query. The newer values will not be inserted if an identical value already exists.
REPLACE can be used instead of INSERT
to prevent duplicates on UNIQUE
indexes by deleting old values. In that case, ON DUPLICATE KEY UPDATE
cannot be used.
INSERT ... SELECT
works for tables which already exist. To create a table for a given resultset, you can use CREATE TABLE ... SELECT.
This page is licensed: GPLv2, originally from fill_help_tables.sql
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
{VALUES | VALUE} ({expr | DEFAULT},...),(...),...
[ ON DUPLICATE KEY UPDATE
col=expr
[, col=expr] ... ] [RETURNING select_expr
[, select_expr ...]]
Or:
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [PARTITION (partition_list)]
SET col={expr | DEFAULT}, ...
[ ON DUPLICATE KEY UPDATE
col=expr
[, col=expr] ... ] [RETURNING select_expr
[, select_expr ...]]
Or:
INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
SELECT ...
[ ON DUPLICATE KEY UPDATE
col=expr
[, col=expr] ... ] [RETURNING select_expr
[, select_expr ...]]
The INSERT
statement is used to insert new rows into an existing table. The INSERT ... VALUES
and INSERT ... SET
forms of the statement insert rows based on explicitly specified values. The INSERT ... SELECT
form inserts rows selected from another table or tables. INSERT ... SELECT
is discussed further in the INSERT ... SELECT article.
The table name can be specified in the form db_name
.tbl_name
or, if a default database is selected, in the form tbl_name
(see Identifier Qualifiers). This allows to use INSERT ... SELECT to copy rows between different databases.
The PARTITION clause can be used in both the INSERT and the SELECT part. See Partition Pruning and Selection for details.
The RETURNING
clause can be used.
The RETURNING
clause is not available.
The columns list is optional. It specifies which values are explicitly inserted, and in which order. If this clause is not specified, all values must be explicitly specified, in the same order they are listed in the table definition.
The list of value follow the VALUES
or VALUE
keyword (which are interchangeable, regardless how much values you want to insert), and is wrapped by parenthesis. The values must be listed in the same order as the columns list. It is possible to specify more than one list to insert more than one rows with a single statement. If many rows are inserted, this is a speed optimization.
For one-row statements, the SET
clause may be more simple, because you don't need to remember the columns order. All values are specified in the form col
= expr
.
Values can also be specified in the form of a SQL expression or subquery. However, the subquery cannot access the same table that is named in the INTO
clause.
If you use the LOW_PRIORITY
keyword, execution of the INSERT
is delayed until no other clients are reading from the table. If you use the HIGH_PRIORITY
keyword, the statement has the same priority as SELECT
s. This affects only storage engines that use only table-level locking (MyISAM, MEMORY, MERGE). However, if one of these keywords is specified, concurrent inserts cannot be used. See HIGH_PRIORITY and LOW_PRIORITY clauses for details.
For more details on the DELAYED
option, see INSERT DELAYED.
See HIGH_PRIORITY and LOW_PRIORITY.
See INSERT - Default & Duplicate Values for details..
See INSERT IGNORE.
See INSERT ON DUPLICATE KEY UPDATE.
Specifying the column names:
INSERT INTO person (first_name, last_name) VALUES ('John', 'Doe');
Inserting more than 1 row at a time:
INSERT INTO tbl_name VALUES (1, "row 1"), (2, "row 2");
Using the SET
clause:
INSERT INTO person SET first_name = 'John', last_name = 'Doe';
SELECTing from another table:
INSERT INTO contractor SELECT * FROM person WHERE status = 'c';
See INSERT ON DUPLICATE KEY UPDATE and INSERT IGNORE for further examples.
INSERT ... RETURNING
returns a result set of the inserted rows.
It returns the listed columns for all the rows that are inserted, or alternatively, the specified SELECT
expression. Any SQL expressions which can be calculated can be used in the select expression for the RETURNING
clause, including virtual columns and aliases, expressions which use various operators such as bitwise, logical and arithmetic operators, string functions, date-time functions, numeric functions, control flow functions, secondary functions and stored functions. Along with this, statements which have subqueries and prepared statements can also be used.
Simple INSERT statement:
INSERT INTO t2 VALUES (1,'Dog'),(2,'Lion'),(3,'Tiger'),(4,'Leopard')
RETURNING id2,id2+id2,id2&id2,id2||id2;
+-----+---------+---------+----------+
| id2 | id2+id2 | id2&id2 | id2||id2 |
+-----+---------+---------+----------+
| 1 | 2 | 1 | 1 |
| 2 | 4 | 2 | 1 |
| 3 | 6 | 3 | 1 |
| 4 | 8 | 4 | 1 |
+-----+---------+---------+----------+
Using stored functions in RETURNING
DELIMITER |
CREATE FUNCTION f(arg INT) RETURNS INT
BEGIN
RETURN (SELECT arg+arg);
END|
DELIMITER ;
PREPARE stmt FROM "INSERT INTO t1 SET id1=1, animal1='Bear' RETURNING f(id1), UPPER(animal1)";
EXECUTE stmt;
+---------+----------------+
| f(id1) | UPPER(animal1) |
+---------+----------------+
| 2 | BEAR |
+---------+----------------+
Subqueries in the RETURNING
clause that return more than one row or column cannot be used.
Aggregate functions cannot be used in the RETURNING
clause. Since aggregate functions work on a set of values, and if the purpose is to get the row count, ROW_COUNT()
with SELECT
can be used or it can be used in INSERT...SELECT...RETURNING
if the table in the RETURNING clause is not the same as the INSERT
table.
REPLACE Equivalent to DELETE
+ INSERT
of conflicting row.
This page is licensed: GPLv2, originally from fill_help_tables.sql
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
{VALUES | VALUE} ({expr | DEFAULT},...),(...),...
[ ON DUPLICATE KEY UPDATE
col=expr
[, col=expr] ... ] [RETURNING select_expr
[, select_expr ...]]
Or:
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [PARTITION (partition_list)]
SET col={expr | DEFAULT}, ...
[ ON DUPLICATE KEY UPDATE
col=expr
[, col=expr] ... ] [RETURNING select_expr
[, select_expr ...]]
Or:
INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [PARTITION (partition_list)] [(col,...)]
SELECT ...
[ ON DUPLICATE KEY UPDATE
col=expr
[, col=expr] ... ] [RETURNING select_expr
[, select_expr ...]]
INSERT ... RETURNING
returns a resultset of the inserted rows.
It returns the listed columns for all the rows that are inserted, or alternatively, the specified SELECT
expression. Any SQL expressions which can be calculated can be used in the select expression for the RETURNING
clause, including virtual columns and aliases, expressions which use various operators such as bitwise, logical and arithmetic operators, string functions, date-time functions, numeric functions, control flow functions, secondary functions and stored functions. Along with this, statements which have subqueries and prepared statements can also be used.
Simple INSERT statements:
CREATE OR REPLACE TABLE t2 (id INT, animal VARCHAR(20), t TIMESTAMP);
INSERT INTO t2 (id) VALUES (2),(3) RETURNING id,t;
+------+---------------------+
| id | t |
+------+---------------------+
| 2 | 2021-04-28 00:59:32 |
| 3 | 2021-04-28 00:59:32 |
+------+---------------------+
INSERT INTO t2(id,animal) VALUES (1,'Dog'),(2,'Lion'),(3,'Tiger'),(4,'Leopard')
RETURNING id,id+id,id&id,id||id;
+------+-------+-------+--------+
| id | id+id | id&id | id||id |
+------+-------+-------+--------+
| 1 | 2 | 1 | 1 |
| 2 | 4 | 2 | 1 |
| 3 | 6 | 3 | 1 |
| 4 | 8 | 4 | 1 |
+------+-------+-------+--------+
Using stored functions in RETURNING
:
DELIMITER |
CREATE FUNCTION f(arg INT) RETURNS INT
BEGIN
RETURN (SELECT arg+arg);
END|
DELIMITER ;
PREPARE stmt FROM "INSERT INTO t1 SET id1=1, animal1='Bear' RETURNING f(id1), UPPER(animal1)";
EXECUTE stmt;
+---------+----------------+
| f(id1) | UPPER(animal1) |
+---------+----------------+
| 2 | BEAR |
+---------+----------------+
Subqueries in the RETURNING
clause that return more than one row or column cannot be used.
Aggregate functions cannot be used in the RETURNING
clause. Since aggregate functions work on a set of values, and if the purpose is to get the row count, ROW_COUNT()
with SELECT
can be used or it can be used in INSERT...SELECT...RETURNING
if the table in the RETURNING
clause is not the same as the INSERT
table.
Returning clause (video)
This page is licensed: CC BY-SA / Gnu FDL
Learn to load data into MariaDB tables or indexes using LOAD DATA. This section details efficient methods for bulk importing data from external files, optimizing for large datasets.
LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
[REPLACE | IGNORE]
INTO TABLE tbl_name
[CHARACTER SET charset_name]
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'CHAR']
[ESCAPED BY 'CHAR']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
[IGNORE number {LINES|ROWS}]
[(col_name_or_user_var,...)]
[SET col_name = expr,...]
LOAD DATA INFILE
is unsafe for statement-based replication.
Reads rows from a text file into the designated table on the database at a very high speed. The file name must be given as a literal string.
Files are written to disk using the SELECT INTO OUTFILE statement. You can then read the files back into a table using the LOAD DATA INFILE
statement. The FIELDS
and LINES
clauses are the same in both statements and by default fields are expected to be terminated with tabs () and lines with newlines (). These clauses are optional, but if both are specified then the FIELDS
clause must precede LINES
.
Executing this statement activates INSERT
triggers.
One must have the FILE privilege to be able to execute LOAD DATA INFILE. This is to ensure normal users cannot read system files. LOAD DATA LOCAL INFILE does not have this requirement.
If the secure_file_priv system variable is set (by default it is not), the loaded file must be present in the specified directory.
Note that MariaDB's systemd unit file restricts access to /home
, /root
, and /run/user
by default. See Configuring access to home directories.
When you execute the LOAD DATA INFILE
statement, MariaDB Server attempts to read the input file from its own file system. By contrast, when you execute the LOAD DATA LOCAL INFILE
statement, the client attempts to read the input file from its file system, and it sends the contents of the input file to the MariaDB Server. This allows you to load files from the client's local file system into the database.
If you don't want to permit this operation (perhaps for security reasons), you can disable the LOAD DATA LOCAL INFILE
statement on either the server or the client.
The LOAD DATA LOCAL INFILE
statement can be disabled on the server by setting the local_infile system variable to 0
.
The LOAD DATA LOCAL INFILE
statement can be disabled on the client. If you are using MariaDB Connector/C, this can be done by unsetting the CLIENT_LOCAL_FILES
capability flag with the mysql_real_connect function or by unsetting the MYSQL_OPT_LOCAL_INFILE
option with mysql_optionsv function. If you are using a different client or client library, then see the documentation for your specific client or client library to determine how it handles the LOAD DATA LOCAL INFILE
statement.
The LOAD DATA LOCAL INFILE
strict modes like STRICT_TRANS_TABLES
are disabled with keyword "local". (MDEV-11235)
If the LOAD DATA LOCAL INFILE
statement is disabled by either the server or the client and if the user attempts to execute it, then the server will cause the statement to fail with the following error message:
The used command is not allowed with this MariaDB version
Note that it is not entirely accurate to say that the MariaDB version does not support the command. It would be more accurate to say that the MariaDB configuration does not support the command. See MDEV-20500 for more information.
From MariaDB 10.5.2, the error message is more accurate:
The used command is not allowed because the MariaDB server or client
has disabled the local infile capability
If you load data from a file into a table that already contains data and has a primary key, you may encounter issues where the statement attempts to insert a row with a primary key that already exists. When this happens, the statement fails with Error 1064, protecting the data already on the table. If you want MariaDB to overwrite duplicates, use the REPLACE
keyword.
The REPLACE
keyword works like the REPLACE statement. Here, the statement attempts to load the data from the file. If the row does not exist, it adds it to the table. If the row contains an existing primary key, it replaces the table data. That is, in the event of a conflict, it assumes the file contains the desired row.
This operation can cause a degradation in load speed by a factor of 20 or more if the part that has already been loaded is larger than the capacity of the InnoDB Buffer Pool. This happens because it causes a lot of turnaround in the buffer pool.
Use the IGNORE
keyword when you want to skip any rows that contain a conflicting primary key. Here, the statement attempts to load the data from the file. If the row does not exist, it adds it to the table. If the row contains an existing primary key, it ignores the addition request and moves on to the next. That is, in the event of a conflict, it assumes the table contains the desired row.
IGNORE number {LINES|ROWS}
The IGNORE number LINES
syntax can be used to ignore a number of rows from the beginning of the file. Most often this is needed when the file starts with one row that includes the column headings.
When the statement opens the file, it attempts to read the contents using the default character-set, as defined by the character_set_database system variable.
In the cases where the file was written using a character-set other than the default, you can specify the character-set to use with the CHARACTER SET
clause in the statement. It ignores character-sets specified by the SET NAMES statement and by the character_set_client system variable. Setting the CHARACTER SET
clause to a value of binary
indicates "no conversion."
The statement interprets all fields in the file as having the same character-set, regardless of the column data type. To properly interpret file contents, you must ensure that it was written with the correct character-set. If you write a data file with mariadb-dump -T or with the SELECT INTO OUTFILE statement with the mariadb client, be sure to use the --default-character-set
option, so that the output is written with the desired character-set.
When using mixed character sets, use the CHARACTER SET
clause in both SELECT INTO OUTFILE and LOAD DATA INFILE
to ensure that MariaDB correctly interprets the escape sequences.
The character_set_filesystem system variable controls the interpretation of the filename.
It is currently not possible to load data files that use the ucs2
character set.
col_name_or_user_var can be a column name, or a user variable. In the case of a variable, the SET statement can be used to preprocess the value before loading into the table.
In storage engines that perform table-level locking (MyISAM, MEMORY and MERGE), using the LOW_PRIORITY keyword, MariaDB delays insertions until no other clients are reading from the table. Alternatively, when using the MyISAM storage engine, you can use the CONCURRENT keyword to perform concurrent insertion.
The LOW_PRIORITY
and CONCURRENT
keywords are mutually exclusive. They cannot be used in the same statement.
The LOAD DATA INFILE
statement supports progress reporting. You may find this useful when dealing with long-running operations. Using another client you can issue a SHOW PROCESSLIST query to check the progress of the data load.
MariaDB ships with a separate utility for loading data from files: mariadb-import. It operates by sending LOAD DATA INFILE
statements to the server.
MariaDB ships with a separate utility for loading data from files: mysqlimport
. It operates by sending LOAD DATA INFILE
statements to the server.
Using mariadb-import you can compress the file using the --compress
option, to get better performance over slow networks, providing both the client and server support the compressed protocol. Use the --local
option to load from the local file system.
In cases where the storage engine supports ALTER TABLE... DISABLE KEYS statements (MyISAM and Aria), the LOAD DATA INFILE
statement automatically disables indexes during the execution.
You have a file with this content (note the separator is ',', not tab, which is the default):
2,2
3,3
4,4
5,5
6,8
CREATE TABLE t1 (a INT, b INT, c INT, d INT, PRIMARY KEY (a));
LOAD DATA LOCAL INFILE
'/tmp/loaddata7.dat' INTO TABLE t1 FIELDS TERMINATED BY ',' (a,b) SET c=a+b;
SELECT * FROM t1;
+------+------+------+
| a | b | c |
+------+------+------+
| 2 | 2 | 4 |
| 3 | 3 | 6 |
| 4 | 4 | 8 |
| 5 | 5 | 10 |
| 6 | 8 | 14 |
+------+------+------+
Another example, given the following data (the separator is a tab):
1 a
2 b
The value of the first column is doubled before loading:
LOAD DATA INFILE 'ld.txt' INTO TABLE ld (@i,v) SET i=@i*2;
SELECT * FROM ld;
+------+------+
| i | v |
+------+------+
| 2 | a |
| 4 | b |
+------+------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
LOAD INDEX INTO CACHE
tbl_index_list [, tbl_index_list] ...
tbl_index_list:
tbl_name
[[INDEX|KEY] (index_name[, index_name] ...)]
[IGNORE LEAVES]
The LOAD INDEX INTO CACHE
statement preloads a table index into the key cache to which it has been assigned by an explicit CACHE INDEX statement, or into the default key cache otherwise.LOAD INDEX INTO CACHE
is used only for MyISAM or Aria tables.
The IGNORE LEAVES
modifier causes only blocks for the nonleaf nodes of the index to be preloaded.
This page is licensed: GPLv2, originally from fill_help_tables.sql
LOAD XML [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
[REPLACE | IGNORE]
INTO TABLE [db_name.]tbl_name
[CHARACTER SET charset_name]
[ROWS IDENTIFIED BY '<tagname>']
[IGNORE number {LINES | ROWS}]
[(column_or_user_var,...)]
[SET col_name = expr,...]
The LOAD XML statement reads data from an XML file into a table. Thefile_name
must be given as a literal string. The tagname
in the optional ROWS IDENTIFIED BY
clause must also be given as a literal
string, and must be surrounded by angle brackets (< and >).
LOAD XML acts as the complement of running the mariadb client in XML output mode (that is, starting the client with the --xml option). To write data from a table to an XML file, use a command such as the following one from the system shell:
shell> mariadb --xml -e 'SELECT * FROM mytable' > file.xml
To read the file back into a table, use LOAD XML INFILE. By default, the element is considered to be the equivalent of a database table row; this can be changed using the ROWS IDENTIFIED BY clause.
This statement supports three different XML formats:
Column names as attributes and column values as attribute values:
<row column1="value1" column2="value2" .../>
Column names as tags and column values as the content of these tags:
<row>
<column1>value1</column1>
<column2>value2</column2>
</row>
Column names are the name attributes of tags, and values are the contents of these tags:
<row>
<field name='column1'>value1</field>
<field name='column2'>value2</field>
</row>
This is the format used by other tools, such as mariadb-dump.
All 3 formats can be used in the same XML file; the import routine automatically detects the format for each row and interprets it correctly. Tags are matched based on the tag or attribute name and the column name.
The following clauses work essentially the same way for LOAD XML as they do for LOAD DATA:
LOW_PRIORITY
or CONCURRENT
LOCAL
REPLACE
or IGNORE
CHARACTER SET
(column_or_user_var,...)
SET
See LOAD DATA for more information about these clauses.
The IGNORE
number LINES
or IGNORE
number ROWS
clause causes the first number rows in the XML file to be skipped. It is analogous to the LOAD DATA
statement's IGNORE ... LINES
clause.
If the LOW_PRIORITY keyword is used, insertions are delayed until no other clients are reading from the table. The CONCURRENT
keyword allows the use of concurrent inserts. These clauses cannot be specified together.
This statement activates INSERT
triggers.
The CONNECT storage engine has an XML table type.
This page is licensed: CC BY-SA / Gnu FDL
Learn to select data in MariaDB Server using SQL. This section covers various SELECT statement clauses, including WHERE, GROUP BY, and ORDER BY, to retrieve and filter your data effectively.
You can use DUAL
instead of a table name in situations where no tables are referenced, such as the following SELECT statement:
SELECT 1 + 1 FROM DUAL;
+-------+
| 1 + 1 |
+-------+
| 2 |
+-------+
DUAL
is purely for the convenience of people who require that all SELECT
statements should haveFROM
and possibly other clauses. MariaDB ignores the clauses. MariaDB does not require FROM DUAL
if no tables are referenced.
FROM DUAL could be used when you only SELECT computed values, but require a WHERE
clause, perhaps to test that a script correctly handles empty resultsets:
SELECT 1 FROM DUAL WHERE FALSE;
Empty set (0.00 sec)
This page is licensed: GPLv2, originally from fill_help_tables.sql
InnoDB supports row-level locking. Selected rows can be locked using LOCK IN SHARE MODE or FOR UPDATE. In both cases, a lock is acquired on the rows read by the query, and it will be released when the current transaction is committed.
The FOR UPDATE
clause of SELECT applies only when autocommit is set to 0 or the SELECT
is enclosed in a transaction. A lock is acquired on the rows, and other transactions are prevented from writing the rows, acquire locks, and from reading them (unless their isolation level is READ UNCOMMITTED).
If autocommit
is set to 1, the LOCK IN SHARE MODE and FOR UPDATE
clauses have no effect in InnoDB. For non-transactional storage engines like MyISAM and ARIA, a table level lock will be taken even if autocommit is set to 1.
If the isolation level is set to SERIALIZABLE, all plain SELECT
statements are converted to SELECT ... LOCK IN SHARE MODE
.
SELECT * FROM trans WHERE period=2001 FOR UPDATE;
This page is licensed: CC BY-SA / Gnu FDL
Use the GROUP BY
clause in a SELECT statement to group rows together that have the same value in one or more column, or the same computed value using expressions with any functions and operators exceptgrouping functions. When you use a GROUP BY
clause, you will get a single result row for each group of rows that have the same value for the expression given in GROUP BY
.
When grouping rows, grouping values are compared as if by the = operator. For string values, the =
operator ignores trailing whitespace and may normalize characters and ignore case, depending on the collation in use.
You can use any of the grouping functions in your select expression. Their values will be calculated based on all the rows that have been grouped together for each result row. If you select a non-grouped column or a value computed from a non-grouped column, it is undefined which row the returned value is taken from. This is not permitted if the ONLY_FULL_GROUP_BY
SQL_MODE is used.
You can use multiple expressions in the GROUP BY
clause, separated by commas.
Rows are grouped together if they match on each of the expressions.
You can also use a single integer as the grouping expression. If you use an integer n, the results will be grouped by the nth column in the select expression.
The WHERE
clause is applied before the GROUP BY
clause. It filters non-aggregated
rows before the rows are grouped together. To filter grouped rows based on aggregate values,
use the HAVING
clause. The HAVING
clause takes any expression and evaluates it as
a boolean, just like the WHERE
clause. You can use grouping functions in the HAVING
clause. As with the select expression, if you reference non-grouped columns in the HAVING
clause, the behavior is undefined.
By default, if a GROUP BY
clause is present, the rows in the output will be sorted by the expressions used in the GROUP BY
. You can also specify ASC
or DESC
(ascending, descending) after those expressions, like in ORDER BY. The default is ASC
.
If you want the rows to be sorted by another field, you can add an explicit ORDER BY. If you don't want the result to be ordered, you can add ORDER BY NULL.
The WITH ROLLUP
modifier adds extra rows to the result set that represent super-aggregate summaries. For a full description with examples, see SELECT WITH ROLLUP.
Consider the following table that records how many times each user has played and won a game:
CREATE TABLE plays (name VARCHAR(16), plays INT, wins INT);
INSERT INTO plays VALUES
("John", 20, 5),
("Robert", 22, 8),
("Wanda", 32, 8),
("Susan", 17, 3);
Get a list of win counts along with a count:
SELECT wins, COUNT(*) FROM plays GROUP BY wins;
+------+----------+
| wins | COUNT(*) |
+------+----------+
| 3 | 1 |
| 5 | 1 |
| 8 | 2 |
+------+----------+
3 rows in set (0.00 sec)
The GROUP BY
expression can be a computed value, and can refer back to an identifier
specified with AS
. Get a list of win averages along with a count:
SELECT (wins / plays) AS winavg, COUNT(*) FROM plays GROUP BY winavg;
+--------+----------+
| winavg | COUNT(*) |
+--------+----------+
| 0.1765 | 1 |
| 0.2500 | 2 |
| 0.3636 | 1 |
+--------+----------+
3 rows in set (0.00 sec)
You can use any grouping function in the select expression. For each win average as above, get a list of the average play count taken to get that average:
SELECT (wins / plays) AS winavg, AVG(plays) FROM plays
GROUP BY winavg;
+--------+------------+
| winavg | AVG(plays) |
+--------+------------+
| 0.1765 | 17.0000 |
| 0.2500 | 26.0000 |
| 0.3636 | 22.0000 |
+--------+------------+
3 rows in set (0.00 sec)
You can filter on aggregate information using the HAVING
clause. The HAVING
clause is applied after GROUP BY
and allows you to filter on aggregate data that is not available to the WHERE
clause. Restrict the above example to results that involve an average number of plays over 20:
SELECT (wins / plays) AS winavg, AVG(plays) FROM plays
GROUP BY winavg HAVING AVG(plays) > 20;
+--------+------------+
| winavg | AVG(plays) |
+--------+------------+
| 0.2500 | 26.0000 |
| 0.3636 | 22.0000 |
+--------+------------+
2 rows in set (0.00 sec)
This page is licensed: CC BY-SA / Gnu FDL
Use the LIMIT
clause to restrict the number of returned rows. When you use a single integer n with LIMIT
, the first n rows will be returned. Use the ORDER BY clause to control which rows come first. You can also select a number of rows after an offset using either of the following:
LIMIT offset, row_count
LIMIT row_count OFFSET offset
When you provide an offset m with a limit n, the first m rows will be ignored, and the following n rows will be returned.
Executing an UPDATE with the LIMIT
clause is not safe for replication. LIMIT 0
is an exception to this rule (see MDEV-6170).
There is a LIMIT ROWS EXAMINED optimization which provides the means to terminate the execution of SELECT statements which examine too many rows, and thus use too many resources. See LIMIT ROWS EXAMINED.
It is possible to use LIMIT
with GROUP_CONCAT().
It is not possible to use LIMIT
with GROUP_CONCAT().
CREATE TABLE members (name VARCHAR(20));
INSERT INTO members VALUES('Jagdish'),('Kenny'),('Rokurou'),('Immaculada');
SELECT * FROM members;
+------------+
| name |
+------------+
| Jagdish |
| Kenny |
| Rokurou |
| Immaculada |
+------------+
Select the first two names (no ordering specified):
SELECT * FROM members LIMIT 2;
+---------+
| name |
+---------+
| Jagdish |
| Kenny |
+---------+
All the names in alphabetical order:
SELECT * FROM members ORDER BY name;
+------------+
| name |
+------------+
| Immaculada |
| Jagdish |
| Kenny |
| Rokurou |
+------------+
The first two names, ordered alphabetically:
SELECT * FROM members ORDER BY name LIMIT 2;
+------------+
| name |
+------------+
| Immaculada |
| Jagdish |
+------------+
The third name, ordered alphabetically (the first name would be offset zero, so the third is offset two):
SELECT * FROM members ORDER BY name LIMIT 2,1;
+-------+
| name |
+-------+
| Kenny |
+-------+
From MariaDB 10.3.2, LIMIT
can be used in a multi-table update:
CREATE TABLE warehouse (product_id INT, qty INT);
INSERT INTO warehouse VALUES (1,100),(2,100),(3,100),(4,100);
CREATE TABLE store (product_id INT, qty INT);
INSERT INTO store VALUES (1,5),(2,5),(3,5),(4,5);
UPDATE warehouse,store SET warehouse.qty = warehouse.qty-2, store.qty = store.qty+2
WHERE (warehouse.product_id = store.product_id AND store.product_id >= 1)
ORDER BY store.product_id DESC LIMIT 2;
SELECT * FROM warehouse;
+------------+------+
| product_id | qty |
+------------+------+
| 1 | 100 |
| 2 | 100 |
| 3 | 98 |
| 4 | 98 |
+------------+------+
SELECT * FROM store;
+------------+------+
| product_id | qty |
+------------+------+
| 1 | 5 |
| 2 | 5 |
| 3 | 7 |
| 4 | 7 |
+------------+------+
When using LIMIT
with GROUP_CONCAT, you can simplify certain queries. Consider this table:
CREATE TABLE d (dd DATE, cc INT);
INSERT INTO d VALUES ('2017-01-01',1);
INSERT INTO d VALUES ('2017-01-02',2);
INSERT INTO d VALUES ('2017-01-04',3);
The following query works fine, but is rather complex:
SELECT SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",dd,cc)
ORDER BY cc DESC),",",1) FROM d;
+----------------------------------------------------------------------------+
| SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(":",dd,cc) ORDER BY cc DESC),",",1) |
+----------------------------------------------------------------------------+
| 2017-01-04:3 |
+----------------------------------------------------------------------------+
It can be simplified to this:
SELECT GROUP_CONCAT(CONCAT_WS(":",dd,cc)
ORDER BY cc DESC LIMIT 1) FROM d;
+-------------------------------------------------------------+
| GROUP_CONCAT(CONCAT_WS(":",dd,cc) ORDER BY cc DESC LIMIT 1) |
+-------------------------------------------------------------+
| 2017-01-04:3 |
+-------------------------------------------------------------+
OFFSET ... FETCH Like limit, but also support WITH TIES
This page is licensed: CC BY-SA / Gnu FDL
InnoDB supports row-level locking. Selected rows can be locked using LOCK IN SHARE MODE
or FOR UPDATE. In both cases, a lock is acquired on the rows read by the query, and it will be released when the current transaction is committed.
When LOCK IN SHARE MODE
is specified in a SELECT statement, MariaDB will wait until all transactions that have modified the rows are committed. Then, a write lock is acquired. All transactions can read the rows, but if they want to modify them, they have to wait until your transaction is committed.
If autocommit is set to 1 (the default), the LOCK IN SHARE MODE and FOR UPDATE clauses have no effect in InnoDB. For non-transactional storage engines like MyISAM and ARIA, a table level lock will be taken even if autocommit is set to 1.
This page is licensed: CC BY-SA / Gnu FDL
Optimizer hints are options available that affect the execution plan.
HIGH_PRIORITY
gives the statement a higher priority. If the table is locked, high priority SELECT
s will be executed as soon as the lock is released, even if other statements are queued. HIGH_PRIORITY
applies only if the storage engine only supports table-level locking (MyISAM
, MEMORY
, MERGE
). See HIGH_PRIORITY and LOW_PRIORITY clauses for details.
If the query_cache_type system variable is set to 2 or DEMAND
, and the current statement is cacheable, SQL_CACHE
causes the query to be cached and SQL_NO_CACHE
causes the query not to be cached. For UNION
s, SQL_CACHE
or SQL_NO_CACHE
should be specified for the first query. See also The Query Cache for more detail and a list of the types of statements that aren't cacheable.
SQL_BUFFER_RESULT
forces the optimizer to use a temporary table to process the result. This is useful to free locks as soon as possible.
SQL_SMALL_RESULT
and SQL_BIG_RESULT
tell the optimizer whether the result is very big or not. Usually, GROUP BY
and DISTINCT
operations are performed using a temporary table. Only if the result is very big, using a temporary table is not convenient. The optimizer automatically knows if the result is too big, but you can force the optimizer to use a temporary table with SQL_SMALL_RESULT
, or avoid the temporary table using SQL_BIG_RESULT
.
STRAIGHT_JOIN
applies to the JOIN queries, and tells the optimizer that the tables must be read in the order they appear in the SELECT
. For const
and system
table this options is sometimes ignored.
SQL_CALC_FOUND_ROWS
is only applied when using the LIMIT
clause. If this option is used, MariaDB will count how many rows would match the query, without the LIMIT
clause. That number can be retrieved in the next query, using FOUND_ROWS().
USE INDEX
, FORCE INDEX
and IGNORE INDEX
constrain the query planning to a specific index. For further information about some of these options, see How to force query plans.
MariaDB starting with 12.0
Hints are placed after the main statement verb.
UPDATE /*+ hints */ table ...;
DELETE /*+ hints */ FROM table... ;
SELECT /*+ hints */ ...
They can also appear after the SELECT
keyword in any subquery:
SELECT * FROM t1 WHERE a IN (SELECT /*+ hints */ ...)
There can be one or more hints separated with space:
hints: hint hint ...
Each individual hint is hint name and arguments. In case there are no arguments, the () brackets are still present:
hint: hint_name([arguments])
Incorrect hints produce warnings (a setting to make them errors is not implemented yet). Hints that are not ignored are kept in the query text (you can see them in SHOW PROCESSLIST, Slow Query Log, EXPLAIN EXTENDED) Hints that were incorrect and were ignored are removed from there.
Hints can be
global - they apply to whole query;
table-level - they apply to a table;
index-level - they apply to an index in a table.
Table-Level Hints
hint_name([table_name [table_name [,...]] )
Index-Level Hints
Index-level hints apply to indexes. Possible syntax variants are:
hint_name(table_name [index_name [, index_name] ...])
hint_name(table_name@query_block [index_name [, index_name] ...])
hint_name(@query_block table_name [index_name [, index_name] ...])
The QB_NAME
hint is used to assign a name to the query block the hint is in. The Query Block is either a SELECT
or a top-level construct of UPDATE
or DELETE
statement.
SELECT /*+ QB_NAME(foo) */ select_list FROM ...
The name can then can be used
to refer to the query block;
to refer to a table in the query block as table_name@query_block_name
.
Query block scope is the whole statement. It is invalid to use the same name for multiple query blocks. You can refer to the query block "down into subquery", "down into derived table", "up to the parent" and "to a right sibling in the UNION". You cannot refer "to a left sibling in a UNION".
Hints inside views are not supported, yet. You can neither use hints in VIEW
definitions, nor control query plans inside non-merged views. (This is because QB_NAME
binding are done "early", before we know that some tables are views.)
SELECT#N NAMES
Besides the given name, any query block is given a name select#n. You can see it when running EXPLAIN EXTENDED
:
Note 1003 SELECT /*+ NO_RANGE_OPTIMIZATION(t3@select#1 PRIMARY) */ ...
It is not possible to use it in the hint text:
SELECT /*+ BKA(tbl1@`select#1`) */ 1 FROM tbl1 ...;
QB_NAME in CTEs
Hints that control @name
will control the first use of the CTE (common table expression).
The optimizer can be controlled by
server variables - optimizer_switch, join_cache_level, and so forth;
old-style hints;
new-style hints.
Old-style hints do not overlap with server variable settings.
New-style hints are more specific than server variable settings, so they override the server variable settings.
Hints are "narrowly interpreted" and "best effort" - if a hint dictates to do something, for example:
SELECT /*+ MRR(t1 t1_index1) */ ... FROM t1 ...
It means: When considering a query plan that involves using t1_index1
in a way that one can use MRR
, use MRR
. If the query planning is such that use of t1_index1
doesn't allow to use MRR
, it won't be used.
The optimizer may also consider using t1_index2
and pick that over using t1_index1
. In such cases, the hint is effectively ignored and no warning is given.
An index-level hint that disables range optimization for certain index(es):
SELECT /*+ NO_RANGE_OPTIMIZATION(tbl index1 index2) */ * FROM tbl ...
An index-level hint that disables Index Condition Pushdown for the indexes. ICP+BKA is disabled as well.
SELECT /*+ NO_ICP(tbl index1 index2) */ * FROM tbl ...
Index-level hints to force or disable use of MRR.
SELECT /*+ MRR(tbl index1 index2) */ * FROM tbl ...
SELECT /*+ NO_MRR(tbl index1 index2) */ * FROM tbl ...
This controls:
MRR optimization for range access (mdev35483-mrr-is-narrow.sql);
BKA mdev35483-mrr-controls-bka-partially.sql.
Query block or table-level hints.
BKA() also enables MRR to make BKA possible. (This is different from session variables, where you need to enable MRR separately). This also enables BKAH.
Controls BNL-H.
The implementation is "BNL() hint effectively increases join_cache_level up to 4 " .. for the table(s) it applies to.
Global-level hint to limit query execution time
SELECT /*+ MAX_EXECUTION_TIME(milliseconds) */ ... ;
A query that doesn't finish in the time specified will be aborted with an error.
If @@max_statement_time
is set, the hint will be ignored and a warning produced. Note that this contradicts the stated principle that "new-style hints are more specific than server variable settings, so they override the server variable settings".
Query block-level hint.
SUBQUERY([@query_block_name] MATERIALIZATION)
SUBQUERY([@query_block_name] INTOEXISTS)
This controls non-semi-join subqueries. The parameter specifies which subquery to use. Use of this hint disables conversion of subquery into semi-join.
Query block-level hints.
This controls the conversion of subqueries to semi-joins and which semi-join strategies are allowed.
[NO_]SEMIJOIN([@query_block_name] [strategy [, strategy] ...])
where the strategy is one of DUPSWEEDOUT, FIRSTMATCH, LOOSESCAN, MATERIALIZATION.
Expanded optimizer hints are not available.
This page is licensed: CC BY-SA / Gnu FDL
Use the ORDER BY
clause to order a resultset, such as that are returned from a SELECT
statement. You can specify just a column or use any expression with functions. If you are
using the GROUP BY
clause, you can use grouping functions in ORDER BY
. Ordering is done after grouping.
You can use multiple ordering expressions, separated by commas. Rows will be sorted by the first expression, then by the second expression if they have the same value for the first, and so on.
You can use the keywords ASC
and DESC
after each ordering expression to force that ordering to be ascending or descending, respectively. Ordering is ascending by default.
You can also use a single integer as the ordering expression. If you use an integer n, the results will be ordered by the nth column in the select expression.
When string values are compared, they are compared as if by the STRCMP function. STRCMP
ignores trailing whitespace and may normalize characters and ignore case, depending on the collation in use.
Duplicated entries in the ORDER BY
clause are removed.
ORDER BY
can also be used to order the activities of a DELETE or UPDATE statement (usually with the LIMIT clause).
MariaDB allows packed sort keys and values of non-sorted fields in the sort buffer. This can make filesort temporary files much smaller when VARCHAR
, CHAR
or BLOB
columns are used, notably speeding up some ORDER BY
sorts.
MariaDB does not allow packed sort keys and values of non-sorted fields in the sort buffer.
CREATE TABLE seq (i INT, x VARCHAR(1));
INSERT INTO seq VALUES (1,'a'), (2,'b'), (3,'b'), (4,'f'), (5,'e');
SELECT * FROM seq ORDER BY i;
+------+------+
| i | x |
+------+------+
| 1 | a |
| 2 | b |
| 3 | b |
| 4 | f |
| 5 | e |
+------+------+
SELECT * FROM seq ORDER BY i DESC;
+------+------+
| i | x |
+------+------+
| 5 | e |
| 4 | f |
| 3 | b |
| 2 | b |
| 1 | a |
+------+------+
SELECT * FROM seq ORDER BY x,i;
+------+------+
| i | x |
+------+------+
| 1 | a |
| 2 | b |
| 3 | b |
| 5 | e |
| 4 | f |
+------+------+
ORDER BY in an UPDATE statement, in conjunction with LIMIT:
UPDATE seq SET x='z' WHERE x='b' ORDER BY i DESC LIMIT 1;
SELECT * FROM seq;
+------+------+
| i | x |
+------+------+
| 1 | a |
| 2 | b |
| 3 | z |
| 4 | f |
| 5 | e |
+------+------+
ORDER BY
can be used in a multi-table update:
CREATE TABLE warehouse (product_id INT, qty INT);
INSERT INTO warehouse VALUES (1,100),(2,100),(3,100),(4,100);
CREATE TABLE store (product_id INT, qty INT);
INSERT INTO store VALUES (1,5),(2,5),(3,5),(4,5);
UPDATE warehouse,store SET warehouse.qty = warehouse.qty-2, store.qty = store.qty+2
WHERE (warehouse.product_id = store.product_id AND store.product_id >= 1)
ORDER BY store.product_id DESC LIMIT 2;
SELECT * FROM warehouse;
+------------+------+
| product_id | qty |
+------------+------+
| 1 | 100 |
| 2 | 100 |
| 3 | 98 |
| 4 | 98 |
+------------+------+
SELECT * FROM store;
+------------+------+
| product_id | qty |
+------------+------+
| 1 | 5 |
| 2 | 5 |
| 3 | 7 |
| 4 | 7 |
+------------+------+
This page is licensed: CC BY-SA / Gnu FDL
The PROCEDURE
clause of SELECT passes the whole result set to a Procedure which will process it. These Procedures are not Stored Procedures, and can only be written in the C language, so it is necessary to recompile the server.
Currently, the only available procedure is ANALYSE, which examines the resultset and suggests the optimal datatypes for each column. It is defined in the sql/sql_analyse.cc
file, and can be used as an example to create more Procedures.
This clause cannot be used in a view definition.
This page is licensed: CC BY-SA / Gnu FDL
SELECT
[/*+ hints */]
[ALL | DISTINCT | DISTINCTROW]
[HIGH_PRIORITY]
[STRAIGHT_JOIN]
[SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
[SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
select_expr [, select_expr ...]
[ FROM table_references
[WHERE where_condition]
[GROUP BY {col_name | expr | position} [ASC | DESC], ... [WITH ROLLUP]]
[HAVING where_condition]
[ORDER BY {col_name | expr | position} [ASC | DESC], ...]
[LIMIT {[offset,] row_count | row_count OFFSET offset
[ROWS EXAMINED rows_limit] } |
[OFFSET start { ROW | ROWS }]
[FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES }] ]
procedure|[PROCEDURE procedure_name(argument_list)]
[INTO OUTFILE 'file_name' [CHARACTER SET charset_name] [export_options] |
INTO DUMPFILE 'file_name' | INTO var_name [, var_name] ]
[FOR UPDATE lock_option | LOCK IN SHARE MODE lock_option]
export_options:
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
lock_option:
[WAIT n | NOWAIT | SKIP LOCKED]
[/*+ hints */]
syntax is available from MariaDB 11.8.
[/*+ hints */]
syntax is unavailable.
SELECT
is used to retrieve rows selected from one or more tables, and can include UNION statements and subqueries.
Each select_expr
expression indicates a column or data that you want to retrieve. You must have at least one select expression. See Select Expressions below.
Each table can also be specified as db_name
.tabl_name
. Each column can also be specified as tbl_name
.col_name
or even db_name
.tbl_name
.col_name
. This allows one to write queries which involve multiple databases. See Identifier Qualifiers for syntax details.
The WHERE
clause, if given, indicates the condition or conditions that rows must satisfy to be selected. The where_condition
is an expression that evaluates to true for each row to be selected. The statement selects all rows if there is no WHERE clause.
In the WHERE
clause, you can use any of the functions and operators that MariaDB supports, except for aggregate (summary) functions. See Functions and Operators and Functions and Modifiers for use with GROUP BY (aggregate).
Use the ORDER BY clause to order the results.
Use the LIMIT clause to restrict the results to only a certain number of rows, optionally with an offset.
Use the GROUP BY and HAVING
clauses to group rows together when they have columns or computed values in common.
SELECT
can also be used to retrieve rows computed without reference to any table.
A SELECT
statement must contain one or more select expressions, separated by commas. Each select expression can be one of the following:
The name of a column.
Any expression using functions and operators.
*
to select all columns from all tables in the FROM
clause.
tbl_name.*
to select all columns from just the table tbl_name
.
When specifying a column, you can either use just the column name or qualify the column name with the name of the table using tbl_name.col_name
. The qualified form is useful if you are joining multiple tables in the FROM
clause. If you do not qualify the column names when selecting from multiple tables, MariaDB will try to find the column in each table. It is an error if that column name exists in multiple tables.
You can quote column names using backticks. If you are qualifying column names with table names, quote each part separately as tbl_name`.`col_name
.
If you use any grouping functions in any of the select expressions, all rows in your results will be implicitly grouped, as if you had used GROUP BY NULL
. GROUP BY NULL
being an expression behaves specially such that the entire result set is treated as a group.
A query may produce some identical rows. By default, all rows are retrieved, even when their values are the same. To explicitly specify that you want to retrieve identical rows, use the ALL
option. If you want duplicates to be removed from the result set, use the DISTINCT
option. DISTINCTROW
is a synonym for DISTINCT
. See also COUNT DISTINCT and SELECT UNIQUE in Oracle mode.
The INTO
clause is used to specify that the query results should be written to a file or variable.
SELECT INTO OUTFILE - formatting and writing the result to an external file.
SELECT INTO DUMPFILE - binary-safe writing of the unformatted results to an external file.
SELECT INTO Variable - selecting and setting variables.
The reverse of SELECT INTO OUTFILE
is LOAD DATA.
Restricts the number of returned rows. See LIMIT and LIMIT ROWS EXAMINED for details.
See LOCK IN SHARE MODE and FOR UPDATE for details on the respective locking clauses.
The clause doesn't exist.
Order a result set. See ORDER BY for details.
Specifies to the optimizer which partitions are relevant for the query. Other partitions will not be read. See Partition Pruning and Selection for details.
Passes the whole result set to a C Procedure. See PROCEDURE and PROCEDURE ANALYSE (the only built-in procedure not requiring the server to be recompiled).
This causes rows that couldn't be locked (LOCK IN SHARE MODE or FOR UPDATE) to be excluded from the result set. An explicit NOWAIT
is implied here. This is only implemented on InnoDB tables and ignored otherwise.
The clause doesn't exist.
These include HIGH_PRIORITY, STRAIGHT_JOIN, SQL_SMALL_RESULT | SQL_BIG_RESULT, SQL_BUFFER_RESULT, SQL_CACHE | SQL_NO_CACHE, and SQL_CALC_FOUND_ROWS.
See Optimizer Hints for details.
By using max_statement_time in conjunction with SET STATEMENT, it is possible to limit the execution time of individual queries. For example:
SET STATEMENT max_statement_time=100 FOR
SELECT field1 FROM table_name ORDER BY field1;
Set the lock wait timeout. See WAIT and NOWAIT.
SELECT f1,f2 FROM t1 WHERE (f3<=10) AND (f4='y');
See Getting Data from MariaDB (Beginner tutorial), or the various sub-articles, for more examples.
Getting Data from MariaDB (Beginner tutorial)
This page is licensed: GPLv2, originally from fill_help_tables.sql
SELECT ... INTO DUMPFILE 'file_path'
SELECT ... INTO DUMPFILE
is a SELECT clause which writes the resultset into a single unformatted row, without any separators, in a file. The results will not be returned to the client.
file_path can be an absolute path, or a relative path starting from the data directory. It can only be specified as a string literal, not as a variable. However, the statement can be dynamically composed and executed as a prepared statement to work around this limitation.
This statement is binary-safe and so is particularly useful for writing BLOB values to file. It can be used, for example, to copy an image or an audio document from the database to a file.
The file must not exist. It cannot be overwritten. A user needs the FILE privilege to run this statement. Also, MariaDB needs permission to write files in the specified location. If the secure_file_priv system variable is set to a non-empty directory name, the file can only be written to that directory.
The character_set_filesystem system variable has controlled interpretation of file names that are given as literal strings.
The character_set_filesystem system variable does not have controlled interpretation of file names that are given as literal strings.
SELECT _utf8'Hello world!' INTO DUMPFILE '/tmp/world';
SELECT LOAD_FILE('/tmp/world') AS world;
+--------------+
| world |
+--------------+
| Hello world! |
+--------------+
This page is licensed: CC BY-SA / Gnu FDL
SELECT ... INTO OUTFILE 'file_name'
[CHARACTER SET charset_name]
[export_options]
export_options:
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
SELECT INTO OUTFILE
writes the resulting rows to a file, and allows the use of column and row terminators to specify a particular output format. The default is to terminate fields with tabs () and lines with newlines ().
The file must not exist. It cannot be overwritten. A user needs the FILE privilege to run this statement. Also, MariaDB needs permission to write files in the specified location. If the secure_file_priv system variable is set to a non-empty directory name, the file can only be written to that directory.
The LOAD DATA INFILE statement complements SELECT INTO OUTFILE
.
The CHARACTER SET
clause specifies the character set in which the results are to be written. Without the clause, no conversion takes place (the binary character set). In this case, if there are multiple character sets, the output will contain these too, and may not easily be able to be reloaded.
In cases where you have two servers using different character-sets, using SELECT INTO OUTFILE
to transfer data from one to the other can have unexpected results. To ensure that MariaDB correctly interprets the escape sequences, use the CHARACTER SET
clause on both the SELECT INTO OUTFILE
statement and the subsequent LOAD DATA INFILE statement.
The following example produces a file in the CSV format:
SELECT customer_id, firstname, surname from customer
INTO OUTFILE '/exportdata/customers.txt'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
The following ANSI syntax is also supported for simple SELECT
without UNION
:
SELECT customer_id, firstname, surname INTO OUTFILE '/exportdata/customers.txt'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM customers;
If you want to use the ANSI syntax with UNION
or similar construct you have to use the syntax:
SELECT * INTO OUTFILE "/tmp/skr3"
FROM (SELECT * FROM t1 UNION SELECT * FROM t1);
LOAD_DATA() function
This page is licensed: CC BY-SA / Gnu FDL
OFFSET start { ROW | ROWS }
FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES }
The OFFSET
clause allows one to return only those elements of a resultset that come after a specified offset. The FETCH
clause specifies the number of rows to return, while ONLY
or WITH TIES
specifies whether or not to also return any further results that tie for last place according to the ordered resultset.
Either the singular ROW
or the plural ROWS
can be used after the OFFSET
and FETCH
clauses; the choice has no impact on the results.
FIRST
and NEXT
give the same result.
In the case of WITH TIES
, an ORDER BY clause is required, otherwise an error will be returned.
SELECT i FROM t1 FETCH FIRST 2 ROWS WITH TIES;
ERROR 4180 (HY000): FETCH ... WITH TIES requires ORDER BY clause to be present
Given a table with 6 rows:
CREATE OR REPLACE TABLE t1 (i INT);
INSERT INTO t1 VALUES (1),(2),(3),(4), (4), (5);
SELECT i FROM t1 ORDER BY i ASC;
+------+
| i |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 4 |
| 5 |
+------+
OFFSET 2
allows one to skip the first two results:
SELECT i FROM t1 ORDER BY i ASC OFFSET 2 ROWS;
+------+
| i |
+------+
| 3 |
| 4 |
| 4 |
| 5 |
+------+
FETCH FIRST 3 ROWS ONLY
limits the results to three rows only:
SELECT i FROM t1 ORDER BY i ASC OFFSET 1 ROWS FETCH FIRST 3 ROWS ONLY;
+------+
| i |
+------+
| 2 |
| 3 |
| 4 |
+------+
The same outcome can also be achieved with the LIMIT clause:
SELECT i FROM t1 ORDER BY i ASC LIMIT 3 OFFSET 1;
+------+
| i |
+------+
| 2 |
| 3 |
| 4 |
+------+
WITH TIES
ensures the tied result 4
is also returned:
SELECT i FROM t1 ORDER BY i ASC OFFSET 1 ROWS FETCH FIRST 3 ROWS WITH TIES;
+------+
| i |
+------+
| 2 |
| 3 |
| 4 |
| 4 |
+------+
This page is licensed: CC BY-SA / Gnu FDL
See SELECT.
The WITH ROLLUP
modifier adds extra rows to the result set that represent super-aggregate summaries. The super-aggregated column is represented by a NULL
value. Multiple aggregates over different columns will be added if there are multiple GROUP BY
columns.
The LIMIT clause can be used at the same time, and is applied after the WITH ROLLUP
rows have been added.
WITH ROLLUP
cannot be used with ORDER BY. Some sorting is still possible by using ASC
or DESC
clauses with the GROUP BY
column, although the super-aggregate rows will always be added last.
These examples use the following sample table
CREATE TABLE booksales (
country VARCHAR(35), genre ENUM('fiction','non-fiction'), year YEAR, sales INT);
INSERT INTO booksales VALUES
('Senegal','fiction',2014,12234), ('Senegal','fiction',2015,15647),
('Senegal','non-fiction',2014,64980), ('Senegal','non-fiction',2015,78901),
('Paraguay','fiction',2014,87970), ('Paraguay','fiction',2015,76940),
('Paraguay','non-fiction',2014,8760), ('Paraguay','non-fiction',2015,9030);
The WITH ROLLUP
modifier in this example adds an extra row that aggregates both years:
SELECT year, SUM(sales) FROM booksales GROUP BY year;
+------+------------+
| year | SUM(sales) |
+------+------------+
| 2014 | 173944 |
| 2015 | 180518 |
+------+------------+
2 rows in set (0.08 sec)
SELECT year, SUM(sales) FROM booksales GROUP BY year WITH ROLLUP;
+------+------------+
| year | SUM(sales) |
+------+------------+
| 2014 | 173944 |
| 2015 | 180518 |
| NULL | 354462 |
+------+------------+
Each time the genre, the year, or the country change, another super-aggregate row is added:
SELECT country, year, genre, SUM(sales)
FROM booksales GROUP BY country, year, genre;
+----------+------+-------------+------------+
| country | year | genre | SUM(sales) |
+----------+------+-------------+------------+
| Paraguay | 2014 | fiction | 87970 |
| Paraguay | 2014 | non-fiction | 8760 |
| Paraguay | 2015 | fiction | 76940 |
| Paraguay | 2015 | non-fiction | 9030 |
| Senegal | 2014 | fiction | 12234 |
| Senegal | 2014 | non-fiction | 64980 |
| Senegal | 2015 | fiction | 15647 |
| Senegal | 2015 | non-fiction | 78901 |
+----------+------+-------------+------------+
SELECT country, year, genre, SUM(sales)
FROM booksales GROUP BY country, year, genre WITH ROLLUP;
+----------+------+-------------+------------+
| country | year | genre | SUM(sales) |
+----------+------+-------------+------------+
| Paraguay | 2014 | fiction | 87970 |
| Paraguay | 2014 | non-fiction | 8760 |
| Paraguay | 2014 | NULL | 96730 |
| Paraguay | 2015 | fiction | 76940 |
| Paraguay | 2015 | non-fiction | 9030 |
| Paraguay | 2015 | NULL | 85970 |
| Paraguay | NULL | NULL | 182700 |
| Senegal | 2014 | fiction | 12234 |
| Senegal | 2014 | non-fiction | 64980 |
| Senegal | 2014 | NULL | 77214 |
| Senegal | 2015 | fiction | 15647 |
| Senegal | 2015 | non-fiction | 78901 |
| Senegal | 2015 | NULL | 94548 |
| Senegal | NULL | NULL | 171762 |
| NULL | NULL | NULL | 354462 |
+----------+------+-------------+------------+
The LIMIT
clause, applied after WITH ROLLUP
:
SELECT country, year, genre, SUM(sales)
FROM booksales GROUP BY country, year, genre WITH ROLLUP LIMIT 4;
+----------+------+-------------+------------+
| country | year | genre | SUM(sales) |
+----------+------+-------------+------------+
| Paraguay | 2014 | fiction | 87970 |
| Paraguay | 2014 | non-fiction | 8760 |
| Paraguay | 2014 | NULL | 96730 |
| Paraguay | 2015 | fiction | 76940 |
+----------+------+-------------+------------+
Sorting by year descending:
SELECT country, year, genre, SUM(sales)
FROM booksales GROUP BY country, year DESC, genre WITH ROLLUP;
+----------+------+-------------+------------+
| country | year | genre | SUM(sales) |
+----------+------+-------------+------------+
| Paraguay | 2015 | fiction | 76940 |
| Paraguay | 2015 | non-fiction | 9030 |
| Paraguay | 2015 | NULL | 85970 |
| Paraguay | 2014 | fiction | 87970 |
| Paraguay | 2014 | non-fiction | 8760 |
| Paraguay | 2014 | NULL | 96730 |
| Paraguay | NULL | NULL | 182700 |
| Senegal | 2015 | fiction | 15647 |
| Senegal | 2015 | non-fiction | 78901 |
| Senegal | 2015 | NULL | 94548 |
| Senegal | 2014 | fiction | 12234 |
| Senegal | 2014 | non-fiction | 64980 |
| Senegal | 2014 | NULL | 77214 |
| Senegal | NULL | NULL | 171762 |
| NULL | NULL | NULL | 354462 |
+----------+------+-------------+------------+
This page is licensed: CC BY-SA / Gnu FDL
Learn about Common Table Expressions (CTEs) in MariaDB Server. This section explains how to use CTEs for complex, readable, and reusable subqueries, simplifying data selection and manipulation.
WITH [RECURSIVE] table_reference [(columns_list)] AS (
SELECT ...
)
[CYCLE cycle_column_list RESTRICT]
SELECT ...
The WITH
keyword signifies a Common Table Expression (CTE). It allows you to refer to a subquery expression many times in a query, as if having a temporary table that only exists for the duration of a query.
There are two kinds of CTEs:
Recursive (signified by the RECURSIVE
keyword).
You can use table_reference
as any normal table in the external SELECT
part. You can also use WITH
in subqueries, as well as with EXPLAIN and SELECT.
Poorly-formed recursive CTEs can in theory cause infinite loops. The max_recursive_iterations system variable limits the number of recursions.
The CYCLE
clause enables CTE cycle detection, avoiding excessive or infinite loops,
MariaDB supports a relaxed, non-standard grammar.
The SQL Standard permits a CYCLE
clause, as follows:
WITH RECURSIVE ... (
...
)
CYCLE <cycle column list>
SET <cycle mark column> TO <cycle mark value> DEFAULT <non-cycle mark value>
USING <path column>
where all clauses are mandatory. MariaDB does not support this, but permits a non-standard relaxed grammar, as follows:
WITH RECURSIVE ... (
...
)
CYCLE <cycle column list> RESTRICT
With the use of CYCLE ... RESTRICT
it makes no difference whether the CTE uses UNION ALL
or UNION DISTINCT
anymore. UNION ALL
means "all rows, but without cycles", which is exactly what the CYCLE
clause enables. And UNION DISTINCT
means all rows should be different, which, again, is what will happen — as uniqueness is enforced over a subset of columns, complete rows will automatically all be different.
CYCLE ... RESTRICT
is not available.
Below is an example with the WITH
at the top level:
WITH t AS (SELECT a FROM t1 WHERE b >= 'c')
SELECT * FROM t2, t WHERE t2.c = t.a;
The example below uses WITH
in a subquery:
SELECT t1.a, t1.b FROM t1, t2
WHERE t1.a > t2.c
AND t2.c IN(WITH t AS (SELECT * FROM t1 WHERE t1.a < 5)
SELECT t2.c FROM t2, t WHERE t2.c = t.a);
Below is an example of a Recursive CTE:
WITH RECURSIVE ancestors AS
( SELECT * FROM folks
WHERE name="Alex"
UNION
SELECT f.*
FROM folks AS f, ancestors AS a
WHERE f.id = a.father OR f.id = a.mother )
SELECT * FROM ancestors;
Consider the following structure and data:
CREATE TABLE t1 (from_ int, to_ int);
INSERT INTO t1 VALUES (1,2), (1,100), (2,3), (3,4), (4,1);
SELECT * FROM t1;
+-------+------+
| from_ | to_ |
+-------+------+
| 1 | 2 |
| 1 | 100 |
| 2 | 3 |
| 3 | 4 |
| 4 | 1 |
+-------+------+
Given the above, the following query would theoretically result in an infinite loop due to the last record in t1 (note that max_recursive_iterations is set to 10 for the purposes of this example, to avoid the excessive number of cycles):
SET max_recursive_iterations=10;
WITH RECURSIVE cte (depth, from_, to_) AS (
SELECT 0,1,1 UNION DISTINCT SELECT depth+1, t1.from_, t1.to_
FROM t1, cte WHERE t1.from_ = cte.to_
)
SELECT * FROM cte;
+-------+-------+------+
| depth | from_ | to_ |
+-------+-------+------+
| 0 | 1 | 1 |
| 1 | 1 | 2 |
| 1 | 1 | 100 |
| 2 | 2 | 3 |
| 3 | 3 | 4 |
| 4 | 4 | 1 |
| 5 | 1 | 2 |
| 5 | 1 | 100 |
| 6 | 2 | 3 |
| 7 | 3 | 4 |
| 8 | 4 | 1 |
| 9 | 1 | 2 |
| 9 | 1 | 100 |
| 10 | 2 | 3 |
+-------+-------+------+
However, the CYCLE ... RESTRICT
clause can overcome this:
WITH RECURSIVE cte (depth, from_, to_) AS (
SELECT 0,1,1 UNION SELECT depth+1, t1.from_, t1.to_
FROM t1, cte WHERE t1.from_ = cte.to_
)
CYCLE from_, to_ RESTRICT
SELECT * FROM cte;
+-------+-------+------+
| depth | from_ | to_ |
+-------+-------+------+
| 0 | 1 | 1 |
| 1 | 1 | 2 |
| 1 | 1 | 100 |
| 2 | 2 | 3 |
| 3 | 3 | 4 |
| 4 | 4 | 1 |
+-------+-------+------+
This page is licensed: CC BY-SA / Gnu FDL
Common Table Expressions (CTEs) are a standard SQL feature, and are essentially temporary named result sets. There are two kinds of CTEs: Non-Recursive, which this article covers; and Recursive.
The WITH keyword signifies a CTE. It is given a name, followed by a body (the main query):
CTEs are similar to derived tables:
WITH engineers AS
( SELECT * FROM employees
WHERE dept = 'Engineering' )
SELECT * FROM engineers
WHERE ...
SELECT * FROM
( SELECT * FROM employees
WHERE dept = 'Engineering' ) AS engineers
WHERE
...
A non-recursive CTE is basically a query-local VIEW. There are several advantages and caveats to them. The syntax is more readable than a nested FROM (SELECT ...)
.
A CTE can refer to another and it can be referenced from multiple places.
Using this format makes for a more readable SQL than a nested FROM(SELECT ...)
clause:
WITH engineers AS (
SELECT * FROM employees
WHERE dept IN('Development','Support') ),
eu_engineers AS ( SELECT * FROM engineers WHERE country IN('NL',...) )
SELECT
...
FROM eu_engineers;
This can be an 'anti-self join', for example:
WITH engineers AS (
SELECT * FROM employees
WHERE dept IN('Development','Support') )
SELECT * FROM engineers E1
WHERE NOT EXISTS
(SELECT 1 FROM engineers E2
WHERE E2.country=E1.country
AND E2.name <> E1.name );
Or, for year-over-year comparisons, for example:
WITH sales_product_year AS (
SELECT product, YEAR(ship_date) AS year,
SUM(price) AS total_amt
FROM item_sales
GROUP BY product, year )
SELECT *
FROM sales_product_year CUR,
sales_product_year PREV,
WHERE CUR.product=PREV.product
AND CUR.year=PREV.year + 1
AND CUR.total_amt > PREV.total_amt
Another use is to compare individuals against their group. Below is an example of how this might be executed:
WITH sales_product_year AS (
SELECT product,
YEAR(ship_date) AS year,
SUM(price) AS total_amt
FROM item_sales
GROUP BY product, year
)
SELECT *
FROM sales_product_year S1
WHERE
total_amt >
(SELECT 0.1 * SUM(total_amt)
FROM sales_product_year S2
WHERE S2.year = S1.year)
This page is licensed: CC BY-SA / Gnu FDL
Common Table Expressions (CTEs) are a standard SQL feature, and are essentially temporary named result sets. CTEs first appeared in the SQL standard in 1999, and the first implementations began appearing in 2007.
There are two kinds of CTEs:
Recursive, which this article covers.
SQL is generally poor at recursive structures.
CTEs permit a query to reference itself. A recursive CTE will repeatedly execute subsets of the data until it obtains the complete result set. This makes it particularly useful for handing hierarchical or tree-structured data. max_recursive_iterations avoids infinite loops.
WITH RECURSIVE signifies a recursive CTE. It is given a name, followed by a body (the main query) as follows:
Given the following structure:
First execute the anchor part of the query:
Next, execute the recursive part of the query:
WITH recursive R AS (
SELECT anchor_data
UNION [all]
SELECT recursive_part
FROM R, ...
)
SELECT ...
Compute anchor_data.
Compute recursive_part to get the new data.
If (new data is non-empty) goto 2.
As currently implemented by MariaDB and by the SQL Standard, data may be truncated if not correctly cast. It is necessary to CAST the column to the correct width if the CTE's recursive part produces wider values for a column than the CTE's nonrecursive part. Some other DBMS give an error in this situation, and MariaDB's behavior may change in future - see MDEV-12325. See the examples below.
Sample data:
CREATE TABLE bus_routes (origin varchar(50), dst varchar(50));
INSERT INTO bus_routes VALUES
('New York', 'Boston'),
('Boston', 'New York'),
('New York', 'Washington'),
('Washington', 'Boston'),
('Washington', 'Raleigh');
Now, we want to return the bus destinations with New York as the origin:
WITH RECURSIVE bus_dst as (
SELECT origin as dst FROM bus_routes WHERE origin='New York'
UNION
SELECT bus_routes.dst FROM bus_routes JOIN bus_dst ON bus_dst.dst= bus_routes.origin
)
SELECT * FROM bus_dst;
+------------+
| dst |
+------------+
| New York |
| Boston |
| Washington |
| Raleigh |
+------------+
The above example is computed as follows:
First, the anchor data is calculated:
Starting from New York.
Boston and Washington are added.
Next, the recursive part:
Starting from Boston and then Washington.
Raleigh is added.
UNION excludes nodes that are already present.
This time, we are trying to get bus routes such as “New York -> Washington -> Raleigh”.
Using the same sample data as the previous example:
WITH RECURSIVE paths (cur_path, cur_dest) AS (
SELECT origin, origin FROM bus_routes WHERE origin='New York'
UNION
SELECT CONCAT(paths.cur_path, ',', bus_routes.dst), bus_routes.dst
FROM paths
JOIN bus_routes
ON paths.cur_dest = bus_routes.origin AND
NOT FIND_IN_SET(bus_routes.dst, paths.cur_path)
)
SELECT * FROM paths;
+-----------------------------+------------+
| cur_path | cur_dest |
+-----------------------------+------------+
| New York | New York |
| New York,Boston | Boston |
| New York,Washington | Washington |
| New York,Washington,Boston | Boston |
| New York,Washington,Raleigh | Raleigh |
+-----------------------------+------------+
In the following example, data is truncated because the results are not specifically cast to a wide enough type:
WITH RECURSIVE tbl AS (
SELECT NULL AS col
UNION
SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)
SELECT col FROM tbl
+------+
| col |
+------+
| NULL |
| |
+------+
Explicitly use CAST to overcome this:
WITH RECURSIVE tbl AS (
SELECT CAST(NULL AS CHAR(50)) AS col
UNION SELECT "THIS NEVER SHOWS UP" AS col FROM tbl
)
SELECT * FROM tbl;
+---------------------+
| col |
+---------------------+
| NULL |
| THIS NEVER SHOWS UP |
+---------------------+
This page is licensed: CC BY-SA / Gnu FDL
Learn about joins and subqueries in MariaDB Server. This section details how to combine data from multiple tables and nest queries to perform complex data retrieval operations efficiently.
The result of EXCEPT
contains all records of the left SELECT
result set except records which are in right SELECT
result set. In other words, it is the subtraction of two result sets.
MINUS
is a synonym when SQL_MODE=ORACLE is set.
MINUS
is a synonym is not available.
SELECT ...
(INTERSECT [ALL | DISTINCT] | EXCEPT [ALL | DISTINCT] | UNION [ALL | DISTINCT])
SELECT ...
[(INTERSECT [ALL | DISTINCT] | EXCEPT [ALL | DISTINCT] | UNION [ALL | DISTINCT])
SELECT ...]
[ORDER BY [{col_name | expr | position} [ASC | DESC]
[, {col_name | expr | position} [ASC | DESC] ...]]]
[LIMIT {[offset,] row_count | row_count OFFSET offset}
| OFFSET start { ROW | ROWS }
| FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES } ]
Brackets for explicit operation precedence are not supported; use a subquery in the FROM
clause as a workaround.
MariaDB supports EXCEPT
and INTERSECT in addition to UNION.
The queries before and after EXCEPT
must be SELECT or VALUES statements.
All behavior for naming columns, ORDER BY
and LIMIT
is the same as for UNION. Note that the alternative SELECT ... OFFSET ... FETCH syntax is only supported. This allows us to use the WITH TIES
clause.
EXCEPT
implicitly supposes a DISTINCT
operation.
The result of EXCEPT
is all records of the left SELECT
result except records which are in right SELECT
result set, i.e. it is subtraction of two result sets.
EXCEPT
and UNION
have the same operation precedence and INTERSECT
has a higher precedence, unless running in Oracle mode, in which case all three have the same precedence.
Parentheses can be used to specify precedence. Before this, a syntax error would be returned.
MariaDB starting with 10.5.0
EXCEPT ALL
and EXCEPT DISTINCT
. The ALL
operator leaves duplicates intact, while the DISTINCT
operator removes duplicates. DISTINCT
is the default behavior if neither operator is supplied.
Only EXCEPT DISTINCT
is available.
Show customers which are not employees:
(SELECT e_name AS name, email FROM customers)
EXCEPT
(SELECT c_name AS name, email FROM employees);
Difference between UNION, EXCEPT and INTERSECT:
CREATE TABLE seqs (i INT);
INSERT INTO seqs VALUES (1),(2),(2),(3),(3),(4),(5),(6);
SELECT i FROM seqs WHERE i <= 3 UNION SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
+------+
SELECT i FROM seqs WHERE i <= 3 UNION ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
| 2 |
| 3 |
| 3 |
| 3 |
| 3 |
| 4 |
| 5 |
| 6 |
+------+
SELECT i FROM seqs WHERE i <= 3 EXCEPT SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
+------+
SELECT i FROM seqs WHERE i <= 3 EXCEPT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
| 2 |
+------+
SELECT i FROM seqs WHERE i <= 3 INTERSECT SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 3 |
+------+
SELECT i FROM seqs WHERE i <= 3 INTERSECT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 3 |
| 3 |
+------+
Parentheses for specifying precedence:
CREATE OR REPLACE TABLE t1 (a INT);
CREATE OR REPLACE TABLE t2 (b INT);
CREATE OR REPLACE TABLE t3 (c INT);
INSERT INTO t1 VALUES (1),(2),(3),(4);
INSERT INTO t2 VALUES (5),(6);
INSERT INTO t3 VALUES (1),(6);
((SELECT a FROM t1) UNION (SELECT b FROM t2)) EXCEPT (SELECT c FROM t3);
+------+
| a |
+------+
| 2 |
| 3 |
| 4 |
| 5 |
+------+
(SELECT a FROM t1) UNION ((SELECT b FROM t2) EXCEPT (SELECT c FROM t3));
+------+
| a |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+------+
Here is an example that makes use of the SEQUENCE storage engine and the VALUES statement, to generate a numeric sequence and remove some arbitrary numbers from it:
(SELECT seq FROM seq_1_to_10) EXCEPT VALUES (2), (3), (4);
+-----+
| seq |
+-----+
| 1 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
+-----+
Get Set for Set Theory: UNION, INTERSECT and EXCEPT in SQL (video tutorial)
This page is licensed: CC BY-SA / Gnu FDL
The result of an intersect is the intersection of right and left SELECT
results, i.e. only records that are present in both result sets will be included in the result of the operation.
SELECT ...
(INTERSECT [ALL | DISTINCT] | EXCEPT [ALL | DISTINCT] | UNION [ALL | DISTINCT]) SELECT ...
[(INTERSECT [ALL | DISTINCT] | EXCEPT [ALL | DISTINCT] | UNION [ALL | DISTINCT]) SELECT ...]
[ORDER BY [column [, column ...]]]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]
MariaDB has supported INTERSECT
(as well as EXCEPT) in addition to UNION since MariaDB 10.3.
All behavior for naming columns, ORDER BY
and LIMIT
is the same as for UNION.
INTERSECT
implicitly supposes a DISTINCT
operation.
The result of an intersect is the intersection of right and left SELECT
results, i.e. only records that are present in both result sets will be included in the result of the operation.
INTERSECT
has higher precedence than UNION
and EXCEPT
(unless running running in Oracle mode, in which case all three have the same precedence). If possible it will be executed linearly, but if not, it will be translated to a subquery in the FROM
clause:
(SELECT a,b FROM t1)
UNION
(SELECT c,d FROM t2)
INTERSECT
(SELECT e,f FROM t3)
UNION
(SELECT 4,4);
will be translated to:
(SELECT a,b FROM t1)
UNION
SELECT c,d FROM
((SELECT c,d FROM t2)
INTERSECT
(SELECT e,f FROM t3)) dummy_subselect
UNION
(SELECT 4,4)
Parentheses can be used to specify precedence.
Parentheses cannot be used to specify precedence.
MariaDB starting with 10.5.0
INTERSECT ALL
and INTERSECT DISTINCT
. The ALL
operator leaves duplicates intact, while the DISTINCT
operator removes duplicates. DISTINCT
is the default behavior if neither operator is supplied.
DISTINCT
is the only behavior available.
Show customers which are employees:
(SELECT e_name AS name, email FROM employees)
INTERSECT
(SELECT c_name AS name, email FROM customers);
Difference between UNION, EXCEPT and INTERSECT
:
CREATE TABLE seqs (i INT);
INSERT INTO seqs VALUES (1),(2),(2),(3),(3),(4),(5),(6);
SELECT i FROM seqs WHERE i <= 3 UNION SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
+------+
SELECT i FROM seqs WHERE i <= 3 UNION ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
| 2 |
| 3 |
| 3 |
| 3 |
| 3 |
| 4 |
| 5 |
| 6 |
+------+
SELECT i FROM seqs WHERE i <= 3 EXCEPT SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
+------+
SELECT i FROM seqs WHERE i <= 3 EXCEPT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
| 2 |
+------+
SELECT i FROM seqs WHERE i <= 3 INTERSECT SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 3 |
+------+
SELECT i FROM seqs WHERE i <= 3 INTERSECT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 3 |
| 3 |
+------+
Parentheses for specifying precedence:
CREATE OR REPLACE TABLE t1 (a INT);
CREATE OR REPLACE TABLE t2 (b INT);
CREATE OR REPLACE TABLE t3 (c INT);
INSERT INTO t1 VALUES (1),(2),(3),(4);
INSERT INTO t2 VALUES (5),(6);
INSERT INTO t3 VALUES (1),(6);
((SELECT a FROM t1) UNION (SELECT b FROM t2)) INTERSECT (SELECT c FROM t3);
+------+
| a |
+------+
| 1 |
| 6 |
+------+
(SELECT a FROM t1) UNION ((SELECT b FROM t2) INTERSECT (SELECT c FROM t3));
+------+
| a |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 6 |
+------+
Get Set for Set Theory: UNION, INTERSECT and EXCEPT in SQL (video tutorial)
This page is licensed: CC BY-SA / Gnu FDL
MINUS
is a synonym for EXCEPT when SQL_MODE=ORACLE is set.
CREATE TABLE seqs (i INT);
INSERT INTO seqs VALUES (1),(2),(2),(3),(3),(4),(5),(6);
SET SQL_MODE='ORACLE';
SELECT i FROM seqs WHERE i <= 3 MINUS SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
+------+
This page is licensed: CC BY-SA / Gnu FDL
You can control the ordering of execution on table operations using parentheses.
( expression )
[ORDER BY [column[, column...]]]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]
Using parentheses in your SQL allows you to control the order of execution for SELECT statements and Table Value Constructor, including UNION, EXCEPT, and INTERSECT operations. MariaDB executes the parenthetical expression before the rest of the statement. You can then use ORDER BY and LIMIT clauses the further organize the result-set.
Note: In practice, the Optimizer may rearrange the exact order in which MariaDB executes different parts of the statement. When it calculates the result-set, however, it returns values as though the parenthetical expression were executed first.
CREATE TABLE test.t1 (num INT);
INSERT INTO test.t1 VALUES (1),(2),(3);
(SELECT * FROM test.t1
UNION
VALUES (10))
INTERSECT
VALUES (1),(3),(10),(11);
+------+
| num |
+------+
| 1 |
| 3 |
| 10 |
+------+
((SELECT * FROM test.t1
UNION
VALUES (10))
INTERSECT
VALUES (1),(3),(10),(11))
ORDER BY 1 DESC;
+------+
| num |
+------+
| 10 |
| 3 |
| 1 |
+------+
This page is licensed: CC BY-SA / Gnu FDL
UNION
is used to combine the results from multiple SELECT statements into a single result set.
SELECT ...
UNION [ALL | DISTINCT] SELECT ...
[UNION [ALL | DISTINCT] SELECT ...]
[ORDER BY [column [, column ...]]]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]
UNION
is used to combine the results from multiple SELECT statements into a single result set.
The column names from the first SELECT
statement are used as the column names for the results returned. Selected columns listed in corresponding positions of each SELECT statement should have the same data type. (For example, the first column selected by the first statement should have the same type as the first column selected by the other statements.)
If they don't, the type and length of the columns in the result take into account the values returned by all of the SELECTs, so there is no need for explicit casting. Note that currently this is not the case for recursive CTEs - see MDEV-12325.
Table names can be specified as db_name
.tbl_name
. This permits writing UNION
s which involve multiple databases. See Identifier Qualifiers for syntax details.
UNION queries cannot be used with aggregate functions.
EXCEPT
and UNION
have the same operation precedence and INTERSECT
has a higher precedence, unless running in Oracle mode, in which case all three have the same precedence.
The ALL
keyword causes duplicate rows to be preserved. The DISTINCT
keyword (the default if the keyword is omitted) causes duplicate rows to be removed by the results.
UNION ALL
and UNION DISTINCT
can both be present in a query. In this case, UNION
DISTINCT will override any UNION ALL
to its left.
The server can in most cases execute UNION ALL
without creating a temporary table (see MDEV-334).
Individual SELECT
statements can contain their own ORDER BY and LIMIT clauses. In this case, the individual queries need to be wrapped between parentheses. However, this does not affect the order of the UNION, so they only are useful to limit the record read by one SELECT
.
The UNION
can have global ORDER BY and LIMIT clauses, which affect the whole result set. If the columns retrieved by individual SELECT statements have an alias (AS), the ORDER BY
must use that alias, not the real column names.
Specifying a query as HIGH_PRIORITY does not work inside a UNION
. If applied to the first SELECT
, it is ignored. Applying to a later SELECT
results in a syntax error:
ERROR 1234 (42000): Incorrect usage/placement of 'HIGH_PRIORITY'
Individual SELECT
statements cannot be written INTO DUMPFILE or INTO OUTFILE. If the last SELECT
statement specifies INTO DUMPFILE
or INTO OUTFILE
, the entire result of the UNION
will be written. Placing the clause after any other SELECT
will result in a syntax error.
If the result is a single row, SELECT ... INTO @var_name can also be used.
Parentheses can be used to specify precedence.
Parentheses cannot be used to specify precedence.
UNION
between tables having different column names:
(SELECT e_name AS name, email FROM employees)
UNION
(SELECT c_name AS name, email FROM customers);
Specifying the UNION
global order and limiting total rows:
(SELECT name, email FROM employees)
UNION
(SELECT name, email FROM customers)
ORDER BY name LIMIT 10;
Adding a constant row:
(SELECT 'John Doe' AS name, 'john.doe@example.net' AS email)
UNION
(SELECT name, email FROM customers);
Differing types:
SELECT CAST('x' AS CHAR(1)) UNION SELECT REPEAT('y',4);
+----------------------+
| CAST('x' AS CHAR(1)) |
+----------------------+
| x |
| yyyy |
+----------------------+
Returning the results in order of each individual SELECT
by use of a sort column:
(SELECT 1 AS sort_column, e_name AS name, email FROM employees)
UNION
(SELECT 2, c_name AS name, email FROM customers) ORDER BY sort_column;
Difference between UNION, EXCEPT and INTERSECT:
CREATE TABLE seqs (i INT);
INSERT INTO seqs VALUES (1),(2),(2),(3),(3),(4),(5),(6);
SELECT i FROM seqs WHERE i <= 3 UNION SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
+------+
SELECT i FROM seqs WHERE i <= 3 UNION ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
| 2 |
| 3 |
| 3 |
| 3 |
| 3 |
| 4 |
| 5 |
| 6 |
+------+
SELECT i FROM seqs WHERE i <= 3 EXCEPT SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
+------+
SELECT i FROM seqs WHERE i <= 3 EXCEPT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 1 |
| 2 |
| 2 |
+------+
SELECT i FROM seqs WHERE i <= 3 INTERSECT SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 3 |
+------+
SELECT i FROM seqs WHERE i <= 3 INTERSECT ALL SELECT i FROM seqs WHERE i>=3;
+------+
| i |
+------+
| 3 |
| 3 |
+------+
CREATE OR REPLACE TABLE t1 (a INT);
CREATE OR REPLACE TABLE t2 (b INT);
CREATE OR REPLACE TABLE t3 (c INT);
INSERT INTO t1 VALUES (1),(2),(3),(4);
INSERT INTO t2 VALUES (5),(6);
INSERT INTO t3 VALUES (1),(6);
((SELECT a FROM t1) UNION (SELECT b FROM t2)) INTERSECT (SELECT c FROM t3);
+------+
| a |
+------+
| 1 |
| 6 |
+------+
(SELECT a FROM t1) UNION ((SELECT b FROM t2) INTERSECT (SELECT c FROM t3));
+------+
| a |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 6 |
+------+
Get Set for Set Theory: UNION, INTERSECT and EXCEPT in SQL (video tutorial)
This page is licensed: GPLv2, originally from fill_help_tables.sql
Learn about SQL joins. This section details various join types (INNER, LEFT, RIGHT, FULL) to combine rows from two or more tables based on related columns, enabling complex data retrieval.
A query to retrieve the list of phone numbers for clients who ordered in the last two weeks might be written in a couple of ways. Here are two:
SELECT *
FROM
clients,
orders,
phoneNumbers
WHERE
clients.id = orders.clientId
AND clients.id = phoneNumbers.clientId
AND orderPlaced >= NOW() - INTERVAL 2 WEEK;
SELECT *
FROM
clients
INNER JOIN orders ON clients.id = orders.clientId
INNER JOIN phoneNumbers ON clients.id = phoneNumbers.clientId
WHERE
orderPlaced >= NOW() - INTERVAL 2 WEEK;
Does it make a difference? Not much as written. But you should use the second form, for these reasons:
Readability. Once the WHERE
clause contains more than two conditions, it becomes tedious to pick out the difference between business logic (only dates in the last two weeks) and relational logic (which fields relate clients to orders). Using the JOIN
syntax with an ON
clause makes the WHERE
list shorter, and makes it very easy to see how tables relate to each other.
Flexibility. Let's say we need to see all clients even if they don't have a phone number in the system. With the second version, it's easy; just change INNER JOIN phoneNumbers
to LEFT JOIN phoneNumbers
. Try that with the first version, and recent MySQL versions will issue a syntax error because of the change in precedence between the comma operator and the JOIN
keyword. The solution is to rearrange the FROM
clause or add parentheses to override the precedence, and that quickly becomes frustrating.
Portability. If your queries use standard SQL syntax, you will have an easier time switching to a different database should the need ever arise.
"MySQL joins: ON vs. USING vs. Theta-style" — A blog entry about this topic.
The initial version of this article was copied, with permission, from Comma_vs_JOIN on 2012-10-05.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB supports the following JOIN
syntaxes for the table_references
part of SELECT statements and multiple-table DELETE and UPDATE statements:
table_references:
table_reference [, table_reference] ...
table_reference:
table_factor
| join_table
table_factor (<= MariaDB 11.6):
tbl_name [PARTITION (partition_list)]
[query_system_time_period_specification] [[AS] alias] [index_hint_list]
| table_subquery [query_system_time_period_specification] [AS] alias
| ( table_references )
| { ON table_reference LEFT OUTER JOIN table_reference
ON conditional_expr }
table_factor (>= MariaDB 11.7):
tbl_name [PARTITION (partition_list)]
[query_system_time_period_specification] [[AS] alias] [index_hint_list]
| table_subquery [query_system_time_period_specification] [AS] alias [(column_name_list)]
| ( table_references )
| { ON table_reference LEFT OUTER JOIN table_reference
ON conditional_expr }
join_table:
table_reference [INNER | CROSS] JOIN table_factor [join_condition]
| table_reference STRAIGHT_JOIN table_factor
| table_reference STRAIGHT_JOIN table_factor ON conditional_expr
| table_reference {LEFT|RIGHT} [OUTER] JOIN table_reference join_condition
| table_reference NATURAL [{LEFT|RIGHT} [OUTER]] JOIN table_factor
join_condition:
ON conditional_expr
| USING (column_list)
query_system_time_period_specification:
FOR SYSTEM_TIME AS OF point_in_time
| FOR SYSTEM_TIME BETWEEN point_in_time AND point_in_time
| FOR SYSTEM_TIME FROM point_in_time TO point_in_time
| FOR SYSTEM_TIME ALL
point_in_time:
[TIMESTAMP] expression
| TRANSACTION expression
index_hint_list:
index_hint [, index_hint] ...
index_hint:
USE {INDEX|KEY}
[{FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
| IGNORE {INDEX|KEY}
[{FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
| FORCE {INDEX|KEY}
[{FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
index_list:
index_name [, index_name] ...
A table reference is also known as a join expression.
Each table can also be specified as db_name
.tabl_name
. This allows to write queries which involve multiple databases. See Identifier Qualifiers for syntax details.
The syntax of table_factor
is an extension to the SQL Standard. The latter accepts only table_reference
, not a list of them inside a pair of parentheses.
This is a conservative extension if we consider each comma in a list of table_reference items as equivalent to an inner join. For example, this query:
SELECT * FROM t1 LEFT JOIN (t2, t3, t4)
ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)
Is equivalent to:
SELECT * FROM t1 LEFT JOIN (t2 CROSS JOIN t3 CROSS JOIN t4)
ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)
In general, parentheses can be ignored in join expressions containing only inner join operations. MariaDB also supports nested joins (see Nested Join Optimization).
A table subquery is specified as a parenthesized query and must contain a following derived table name (specified as alias in the above syntax specification).
You can optionally specify a list of column names in parenthesis.
SELECT ic1, ic2, ic3 FROM
(
SELECT c1, c2, c3 FROM t1 GROUP BY c1
) dt2 (ic1, ic2, ic3)
JOIN t2 ON t2.c1 = dt2.ic1
WHERE c2 > 0
GROUP BY ic1
Here, the table subquery for t1 will be materialized and named dt2, with column names ic1, ic2, ic3. These column names are used outside the subquery.
You cannot optionally specify a list of column names in parenthesis.
See also Correlation Column List.
See System-versioned tables for more information about the FOR SYSTEM_TIME
syntax.
Index hints can be specified to affect how the MariaDB optimizer makes use of indexes. For more information, see How to force query plans.
SELECT left_tbl.*
FROM left_tbl LEFT JOIN right_tbl ON left_tbl.id = right_tbl.id
WHERE right_tbl.id IS NULL;
This page is licensed: GPLv2, originally from fill_help_tables.sql
This article is a continuation of the JOIN
syntax page. If you're getting started with JOIN
statements, review that page first.
Let us begin by using an example employee database of a fairly small family business, which does not anticipate expanding in the future.
First, we create the table that will hold all of the employees and their contact information:
CREATE TABLE `Employees` (
`ID` TINYINT(3) UNSIGNED NOT NULL AUTO_INCREMENT,
`First_Name` VARCHAR(25) NOT NULL,
`Last_Name` VARCHAR(25) NOT NULL,
`Position` VARCHAR(25) NOT NULL,
`Home_Address` VARCHAR(50) NOT NULL,
`Home_Phone` VARCHAR(12) NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=MyISAM;
Next, we add a few employees to the table:
INSERT INTO `Employees` (`First_Name`, `Last_Name`, `Position`, `Home_Address`, `Home_Phone`)
VALUES
('Mustapha', 'Mond', 'Chief Executive Officer', '692 Promiscuous Plaza', '326-555-3492'),
('Henry', 'Foster', 'Store Manager', '314 Savage Circle', '326-555-3847'),
('Bernard', 'Marx', 'Cashier', '1240 Ambient Avenue', '326-555-8456'),
('Lenina', 'Crowne', 'Cashier', '281 Bumblepuppy Boulevard', '328-555-2349'),
('Fanny', 'Crowne', 'Restocker', '1023 Bokanovsky Lane', '326-555-6329'),
('Helmholtz', 'Watson', 'Janitor', '944 Soma Court', '329-555-2478');
Now, we create a second table, containing the hours which each employee clocked in and out during the week:
CREATE TABLE `Hours` (
`ID` TINYINT(3) UNSIGNED NOT NULL,
`Clock_In` DATETIME NOT NULL,
`Clock_Out` DATETIME NOT NULL
) ENGINE=MyISAM;
Finally, although it is a lot of information, we add a full week of hours for each of the employees into the second table that we created:
INSERT INTO `Hours`
VALUES
('1', '2005-08-08 07:00:42', '2005-08-08 17:01:36'),
('1', '2005-08-09 07:01:34', '2005-08-09 17:10:11'),
('1', '2005-08-10 06:59:56', '2005-08-10 17:09:29'),
('1', '2005-08-11 07:00:17', '2005-08-11 17:00:47'),
('1', '2005-08-12 07:02:29', '2005-08-12 16:59:12'),
('2', '2005-08-08 07:00:25', '2005-08-08 17:03:13'),
('2', '2005-08-09 07:00:57', '2005-08-09 17:05:09'),
('2', '2005-08-10 06:58:43', '2005-08-10 16:58:24'),
('2', '2005-08-11 07:01:58', '2005-08-11 17:00:45'),
('2', '2005-08-12 07:02:12', '2005-08-12 16:58:57'),
('3', '2005-08-08 07:00:12', '2005-08-08 17:01:32'),
('3', '2005-08-09 07:01:10', '2005-08-09 17:00:26'),
('3', '2005-08-10 06:59:53', '2005-08-10 17:02:53'),
('3', '2005-08-11 07:01:15', '2005-08-11 17:04:23'),
('3', '2005-08-12 07:00:51', '2005-08-12 16:57:52'),
('4', '2005-08-08 06:54:37', '2005-08-08 17:01:23'),
('4', '2005-08-09 06:58:23', '2005-08-09 17:00:54'),
('4', '2005-08-10 06:59:14', '2005-08-10 17:00:12'),
('4', '2005-08-11 07:00:49', '2005-08-11 17:00:34'),
('4', '2005-08-12 07:01:09', '2005-08-12 16:58:29'),
('5', '2005-08-08 07:00:04', '2005-08-08 17:01:43'),
('5', '2005-08-09 07:02:12', '2005-08-09 17:02:13'),
('5', '2005-08-10 06:59:39', '2005-08-10 17:03:37'),
('5', '2005-08-11 07:01:26', '2005-08-11 17:00:03'),
('5', '2005-08-12 07:02:15', '2005-08-12 16:59:02'),
('6', '2005-08-08 07:00:12', '2005-08-08 17:01:02'),
('6', '2005-08-09 07:03:44', '2005-08-09 17:00:00'),
('6', '2005-08-10 06:54:19', '2005-08-10 17:03:31'),
('6', '2005-08-11 07:00:05', '2005-08-11 17:02:57'),
('6', '2005-08-12 07:02:07', '2005-08-12 16:58:23');
Now that we have a cleanly structured database to work with, let us begin this tutorial by stepping up one notch from the last tutorial and filtering our information a little.
Earlier in the week, an anonymous employee reported that Helmholtz came into work almost four minutes late; to verify this, we will begin our investigation by filtering out employees whose first names are "Helmholtz":
SELECT
`Employees`.`First_Name`,
`Employees`.`Last_Name`,
`Hours`.`Clock_In`,
`Hours`.`Clock_Out`
FROM `Employees`
INNER JOIN `Hours` ON `Employees`.`ID` = `Hours`.`ID`
WHERE `Employees`.`First_Name` = 'Helmholtz';
The result looks like this:
+------------+-----------+---------------------+---------------------+
| First_Name | Last_Name | Clock_In | Clock_Out |
+------------+-----------+---------------------+---------------------+
| Helmholtz | Watson | 2005-08-08 07:00:12 | 2005-08-08 17:01:02 |
| Helmholtz | Watson | 2005-08-09 07:03:44 | 2005-08-09 17:00:00 |
| Helmholtz | Watson | 2005-08-10 06:54:19 | 2005-08-10 17:03:31 |
| Helmholtz | Watson | 2005-08-11 07:00:05 | 2005-08-11 17:02:57 |
| Helmholtz | Watson | 2005-08-12 07:02:07 | 2005-08-12 16:58:23 |
+------------+-----------+---------------------+---------------------+
5 rows in set (0.00 sec)
This is obviously more information than we care to trudge through, considering we only care about when he arrived past 7:00:59 on any given day within this week; thus, we need to add a couple more conditions to our WHERE
clause.
In the following example, we will filter out all of the times which Helmholtz clocked in that were before 7:01:00 and during the work week that lasted from the 8th to the 12th of August:
SELECT
`Employees`.`First_Name`,
`Employees`.`Last_Name`,
`Hours`.`Clock_In`,
`Hours`.`Clock_Out`
FROM `Employees`
INNER JOIN `Hours` ON `Employees`.`ID` = `Hours`.`ID`
WHERE `Employees`.`First_Name` = 'Helmholtz'
AND DATE_FORMAT(`Hours`.`Clock_In`, '%Y-%m-%d') >= '2005-08-08'
AND DATE_FORMAT(`Hours`.`Clock_In`, '%Y-%m-%d') <= '2005-08-12'
AND DATE_FORMAT(`Hours`.`Clock_In`, '%H:%i:%S') > '07:00:59';
The result looks like this:
+------------+-----------+---------------------+---------------------+
| First_Name | Last_Name | Clock_In | Clock_Out |
+------------+-----------+---------------------+---------------------+
| Helmholtz | Watson | 2005-08-09 07:03:44 | 2005-08-09 17:00:00 |
| Helmholtz | Watson | 2005-08-12 07:02:07 | 2005-08-12 16:58:23 |
+------------+-----------+---------------------+---------------------+
2 rows in set (0.00 sec)
By merely adding a few more conditions, we eliminated all of the irrelevant information; Helmholtz was late to work on the 9th and the 12th of August.
Suppose you would like to—based on the information stored in both of our tables in the employee database—develop a quick list of the total hours each employee has worked for each day recorded; a simple way to estimate the time each employee worked per day is exemplified below:
SELECT
`Employees`.`ID`,
`Employees`.`First_Name`,
`Employees`.`Last_Name`,
`Hours`.`Clock_In`,
`Hours`.`Clock_Out`,
DATE_FORMAT(`Hours`.`Clock_Out`, '%T')-DATE_FORMAT(`Hours`.`Clock_In`, '%T')
AS 'Total_Hours'
FROM `Employees`
INNER JOIN `Hours` ON `Employees`.`ID` = `Hours`.`ID`;
The result (limited to 10 rows) looks like this:
+----+------------+-----------+---------------------+---------------------+-------------+
| ID | First_Name | Last_Name | Clock_In | Clock_Out | Total_Hours |
+----+------------+-----------+---------------------+---------------------+-------------+
| 1 | Mustapha | Mond | 2005-08-08 07:00:42 | 2005-08-08 17:01:36 | 10 |
| 1 | Mustapha | Mond | 2005-08-09 07:01:34 | 2005-08-09 17:10:11 | 10 |
| 1 | Mustapha | Mond | 2005-08-10 06:59:56 | 2005-08-10 17:09:29 | 11 |
| 1 | Mustapha | Mond | 2005-08-11 07:00:17 | 2005-08-11 17:00:47 | 10 |
| 1 | Mustapha | Mond | 2005-08-12 07:02:29 | 2005-08-12 16:59:12 | 9 |
| 2 | Henry | Foster | 2005-08-08 07:00:25 | 2005-08-08 17:03:13 | 10 |
| 2 | Henry | Foster | 2005-08-09 07:00:57 | 2005-08-09 17:05:09 | 10 |
| 2 | Henry | Foster | 2005-08-10 06:58:43 | 2005-08-10 16:58:24 | 10 |
| 2 | Henry | Foster | 2005-08-11 07:01:58 | 2005-08-11 17:00:45 | 10 |
| 2 | Henry | Foster | 2005-08-12 07:02:12 | 2005-08-12 16:58:57 | 9 |
+----+------------+-----------+---------------------+---------------------+-------------+
10 rows in set (0.00 sec)
The first version of this article was copied, with permission, from More_Advanced_Joins on 2012-10-05.
This page is licensed: CC BY-SA / Gnu FDL
Learn about subqueries in MariaDB Server. This section details how to use nested queries to perform complex data retrieval, filtering, and manipulation operations within a single SQL statement.
Subqueries using the ALL keyword will return true
if the comparison returns true
for each row returned by the subquery, or the subquery returns no rows.
scalar_expression comparison_operator ALL <Table subquery>
scalar_expression
may be any expression that evaluates to a single value.
comparison_operator
may be any one of: =
, >
, <
, >=
, <=
, <>
or !=
ALL
returns:
NULL
if the comparison operator returns NULL
for at least one row returned by the Table subquery or scalar_expression returns NULL
.
FALSE
if the comparison operator returns FALSE
for at least one row returned by the Table subquery.
TRUE
if the comparison operator returns TRUE
for all rows returned by the Table subquery, or if Table subquery returns no rows.
NOT IN
is an alias for <> ALL
.
CREATE TABLE sq1 (num TINYINT);
CREATE TABLE sq2 (num2 TINYINT);
INSERT INTO sq1 VALUES(100);
INSERT INTO sq2 VALUES(40),(50),(60);
SELECT * FROM sq1 WHERE num > ALL (SELECT * FROM sq2);
+------+
| num |
+------+
| 100 |
+------+
Since 100
> all of 40
,50
and 60
, the evaluation is true and the row is returned.
Adding a second row to sq1, where the evaluation for that record is false:
INSERT INTO sq1 VALUES(30);
SELECT * FROM sq1 WHERE num > ALL (SELECT * FROM sq2);
+------+
| num |
+------+
| 100 |
+------+
Adding a new row to sq2, causing all evaluations to be false:
INSERT INTO sq2 VALUES(120);
SELECT * FROM sq1 WHERE num > ALL (SELECT * FROM sq2);
Empty set (0.00 sec)
When the subquery returns no results, the evaluation is still true:
SELECT * FROM sq1 WHERE num > ALL (SELECT * FROM sq2 WHERE num2 > 300);
+------+
| num |
+------+
| 100 |
| 30 |
+------+
Evaluating against a NULL will cause the result to be unknown, or not true, and therefore return no rows:
INSERT INTO sq2 VALUES (NULL);
SELECT * FROM sq1 WHERE num > ALL (SELECT * FROM sq2);
This page is licensed: CC BY-SA / Gnu FDL
Subqueries using the ANY keyword will return true
if the comparison returns true
for at least one row returned by the subquery.
The required syntax for an ANY
or SOME
quantified comparison is:
scalar_expression comparison_operator ANY <Table subquery>
Or:
scalar_expression comparison_operator SOME <Table subquery>
scalar_expression
may be any expression that evaluates to a single value.
comparison_operator
may be any one of =
, >
, <
, >=
, <=
, <>
or !=
.
ANY
returns:
TRUE
if the comparison operator returns TRUE
for at least one row returned by the Table subquery.
FALSE
if the comparison operator returns FALSE
for all rows returned by the Table subquery, or Table subquery has zero rows.
NULL
if the comparison operator returns NULL
for at least one row returned by the Table subquery and doesn't returns TRUE
for any of them, or if scalar_expression returns NULL
.
SOME
is a synonym for ANY
, and IN
is a synonym for = ANY
.
CREATE TABLE sq1 (num TINYINT);
CREATE TABLE sq2 (num2 TINYINT);
INSERT INTO sq1 VALUES(100);
INSERT INTO sq2 VALUES(40),(50),(120);
SELECT * FROM sq1 WHERE num > ANY (SELECT * FROM sq2);
+------+
| num |
+------+
| 100 |
+------+
100
is greater than two of the three values, and so the expression evaluates as true.
SOME is a synonym for ANY:
SELECT * FROM sq1 WHERE num < SOME (SELECT * FROM sq2);
+------+
| num |
+------+
| 100 |
+------+
IN
is a synonym for = ANY
, and here there are no matches, so no results are returned:
SELECT * FROM sq1 WHERE num IN (SELECT * FROM sq2);
Empty set (0.00 sec)
INSERT INTO sq2 VALUES(100);
Query OK, 1 row affected (0.05 sec)
SELECT * FROM sq1 WHERE num <> ANY (SELECT * FROM sq2);
+------+
| num |
+------+
| 100 |
+------+
Reading this query, the results may be counter-intuitive. It may seem to read as SELECT * FROM
sq1 WHERE
num does not match any results in sq2. Since it does match 100, it could seem that the results are incorrect. However, the query returns a result if the match does not match any of sq2. Since 100
already does not match 40
, the expression evaluates to true immediately, regardless of the 100's matching. It may be more easily readable to use SOME
in a case such as this:
SELECT * FROM sq1 WHERE num <> SOME (SELECT * FROM sq2);
+------+
| num |
+------+
| 100 |
+------+
This page is licensed: CC BY-SA / Gnu FDL
SELECT ... WHERE EXISTS <Table subquery>
Subqueries using the EXISTS
keyword will return true
if the subquery returns any rows. Conversely, subqueries using NOT EXISTS
will return true
only if the subquery returns no rows from the table.
EXISTS subqueries ignore the columns specified by the SELECT of the subquery, since they're not relevant. For example,
SELECT col1 FROM t1 WHERE EXISTS (SELECT * FROM t2);
and
SELECT col1 FROM t1 WHERE EXISTS (SELECT col2 FROM t2);
produce identical results.
CREATE TABLE sq1 (num TINYINT);
CREATE TABLE sq2 (num2 TINYINT);
INSERT INTO sq1 VALUES(100);
INSERT INTO sq2 VALUES(40),(50),(60);
SELECT * FROM sq1 WHERE EXISTS (SELECT * FROM sq2 WHERE num2>50);
+------+
| num |
+------+
| 100 |
+------+
SELECT * FROM sq1 WHERE NOT EXISTS (SELECT * FROM sq2 GROUP BY num2 HAVING MIN(num2)=40);
Empty set (0.00 sec)
This page is licensed: CC BY-SA / Gnu FDL
A subquery can quite often, but not in all cases, be rewritten as a JOIN.
A subquery using IN
can be rewritten with the DISTINCT
keyword, for example:
SELECT * FROM table1 WHERE col1 IN (SELECT col1 FROM table2);
can be rewritten as:
SELECT DISTINCT table1.* FROM table1, table2 WHERE table1.col1=table2.col1;
NOT IN
or NOT EXISTS
queries can also be rewritten. For example, these two queries returns the same result:
SELECT * FROM table1 WHERE col1 NOT IN (SELECT col1 FROM table2);
SELECT * FROM table1 WHERE NOT EXISTS (SELECT col1 FROM table2
WHERE table1.col1=table2.col1);
and both can be rewritten as:
SELECT table1.* FROM table1 LEFT JOIN table2 ON table1.id=table2.id
WHERE table2.id IS NULL;
Subqueries that can be rewritten as a LEFT JOIN
are sometimes more efficient.
There are some scenarios, though, which call for subqueries rather than joins:
When you want duplicates, but not false duplicates. Suppose Table_1
has three rows — {1
,1
,2
}
— and Table_2
has two rows
— {1
,2
,2
}. If you need to list the rows
in Table_1
which are also in Table_2
, only this subquery-based SELECT
statement will give the right answer (1
,1
,2
):
SELECT Table_1.column_1
FROM Table_1
WHERE Table_1.column_1 IN
(SELECT Table_2.column_1
FROM Table_2);
This SQL statement won't work:
SELECT Table_1.column_1
FROM Table_1,Table_2
WHERE Table_1.column_1 = Table_2.column_1;
because the result will be {1
,1
,2
,2
}
— and the duplication of 2 is an error. This SQL statement won't work either:
SELECT DISTINCT Table_1.column_1
FROM Table_1,Table_2
WHERE Table_1.column_1 = Table_2.column_1;
because the result will be {1
,2
} — and the removal of the duplicated 1 is an error too.
When the outermost statement is not a query. The SQL statement:
UPDATE Table_1 SET column_1 = (SELECT column_1 FROM Table_2);
can't be expressed using a join unless some rare SQL3 features are used.
When the join is over an expression. The SQL statement:
SELECT * FROM Table_1
WHERE column_1 + 5 =
(SELECT MAX(column_1) FROM Table_2);
is hard to express with a join. In fact, the only way we can think of is this SQL statement:
SELECT Table_1.*
FROM Table_1,
(SELECT MAX(column_1) AS max_column_1 FROM Table_2) AS Table_2
WHERE Table_1.column_1 + 5 = Table_2.max_column_1;
which still involves a parenthesized query, so nothing is gained from the transformation.
When you want to see the exception. For example, suppose the question is: Which books are longer than Das Kapital? These two queries are effectively almost the same:
SELECT DISTINCT Bookcolumn_1.*
FROM Books AS Bookcolumn_1 JOIN Books AS Bookcolumn_2 USING(page_count)
WHERE title = 'Das Kapital';
SELECT DISTINCT Bookcolumn_1.*
FROM Books AS Bookcolumn_1
WHERE Bookcolumn_1.page_count >
(SELECT DISTINCT page_count
FROM Books AS Bookcolumn_2
WHERE title = 'Das Kapital');
The difference is between these two SQL statements is, if there are two editions of Das Kapital (with different page counts), then the self-join example will return the books which are longer than the shortest edition of Das Kapital. That might be the wrong answer, since the original question didn't ask for "... longer than ANY
book named Das Kapital" (it seems to contain a false assumption that there's only one edition).
This page is licensed: CC BY-SA / Gnu FDL
Although subqueries are more commonly placed in a WHERE clause, they can also form part of the FROM clause. Such subqueries are commonly called derived tables.
If a subquery is used in this way, you must also use an AS clause to name the result of the subquery.
MariaDB starting with 10.6.0
Anonymous subqueries in a FROM clause (no AS clause) are permitted in ORACLE mode.
Anonymous subqueries in a FROM clause (no AS
clause) are not permitted in ORACLE mode.
MariaDB starting with 11.7.0
It is possible to assign column names in the derived table name syntax element.
It is not possible to assign column names in the derived table name syntax element.
CREATE TABLE student (name CHAR(10), test CHAR(10), score TINYINT);
INSERT INTO student VALUES
('Chun', 'SQL', 75), ('Chun', 'Tuning', 73),
('Esben', 'SQL', 43), ('Esben', 'Tuning', 31),
('Kaolin', 'SQL', 56), ('Kaolin', 'Tuning', 88),
('Tatiana', 'SQL', 87), ('Tatiana', 'Tuning', 83);
Assume that, given the data above, you want to return the average total for all students. In other words, the average of Chun's 148 (75+73), Esben's 74 (43+31), etc.
You cannot do the following:
SELECT AVG(SUM(score)) FROM student GROUP BY name;
ERROR 1111 (HY000): Invalid use of group function
A subquery in the FROM clause is however permitted:
SELECT AVG(sq_sum) FROM (SELECT SUM(score) AS sq_sum FROM student GROUP BY name) AS t;
+-------------+
| AVG(sq_sum) |
+-------------+
| 134.0000 |
+-------------+
The following is permitted:
SELECT * FROM (SELECT 1 FROM DUAL), (SELECT 2 FROM DUAL);
In this example, the second column of the derived table dt
is used both within (WHERE
c2 > 0), and outside, (WHERE
a2 > 10), the specification. Both conditions apply to t1.c2.
CREATE OR REPLACE TABLE t1(c1 INT, c2 INT, c3 INT);
SELECT a1, a2 FROM (SELECT c1, c2, c3 FROM t1 WHERE c2 > 0) AS dt (a1, a2, a3);
This page is licensed: CC BY-SA / Gnu FDL
A row subquery is a subquery returning a single row, as opposed to a scalar subquery, which returns a single column from a row, or a literal.
CREATE TABLE staff (name VARCHAR(10), age TINYINT);
CREATE TABLE customer (name VARCHAR(10), age TINYINT);
INSERT INTO staff VALUES ('Bilhah',37), ('Valerius',61), ('Maia',25);
INSERT INTO customer VALUES ('Thanasis',48), ('Valerius',61), ('Brion',51);
SELECT * FROM staff WHERE (name,age) = (SELECT name,age FROM customer WHERE name='Valerius');
+----------+------+
| name | age |
+----------+------+
| Valerius | 61 |
+----------+------+
Finding all rows in one table and also in another:
SELECT name,age FROM staff WHERE (name,age) IN (SELECT name,age FROM customer);
+----------+------+
| name | age |
+----------+------+
| Valerius | 61 |
+----------+------+
This page is licensed: CC BY-SA / Gnu FDL
A scalar subquery is a subquery that returns a single value. This is the simplest form of a subquery, and can be used in most places a literal or single column value is valid.
The data type, length and character set and collation are all taken from the result returned by the subquery. The result of a subquery can always be NULL, that is, no result returned. Even if the original value is defined as NOT NULL, this is disregarded.
A subquery cannot be used where only a literal is expected, for example LOAD DATA INFILE expects a literal string containing the file name, and LIMIT requires a literal integer.
CREATE TABLE sq1 (num TINYINT);
CREATE TABLE sq2 (num TINYINT);
INSERT INTO sq1 VALUES (1);
INSERT INTO sq2 VALUES (10* (SELECT num FROM sq1));
SELECT * FROM sq2;
+------+
| num |
+------+
| 10 |
+------+
Inserting a second row means the subquery is no longer a scalar, and this particular query is not valid:
INSERT INTO sq1 VALUES (2);
INSERT INTO sq2 VALUES (10* (SELECT num FROM sq1));
ERROR 1242 (21000): Subquery returns more than 1 row
No rows in the subquery, so the scalar is NULL:
INSERT INTO sq2 VALUES (10* (SELECT num FROM sq3 WHERE num='3'));
SELECT * FROM sq2;
+------+
| num |
+------+
| 10 |
| NULL |
+------+
A more traditional scalar subquery, as part of a WHERE
clause:
SELECT * FROM sq1 WHERE num = (SELECT MAX(num)/10 FROM sq2);
+------+
| num |
+------+
| 1 |
+------+
This page is licensed: CC BY-SA / Gnu FDL
There are a number of limitations regarding subqueries, which are discussed below.
The following tables and data will be used in the examples that follow:
CREATE TABLE staff(name VARCHAR(10),age TINYINT);
CREATE TABLE customer(name VARCHAR(10),age TINYINT);
INSERT INTO staff VALUES
('Bilhah',37), ('Valerius',61), ('Maia',25);
INSERT INTO customer VALUES
('Thanasis',48), ('Valerius',61), ('Brion',51);
To use ORDER BY or limit LIMIT in subqueries both must be used.. For example:
SELECT * FROM staff WHERE name IN (SELECT name FROM customer ORDER BY name);
+----------+------+
| name | age |
+----------+------+
| Valerius | 61 |
+----------+------+
is valid, but
SELECT * FROM staff WHERE name IN (SELECT NAME FROM customer ORDER BY name LIMIT 1);
ERROR 1235 (42000): This version of MariaDB doesn't
yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
is not.
It's not possible to both modify and select from the same table in a subquery. For example:
DELETE FROM staff WHERE name = (SELECT name FROM staff WHERE age=61);
ERROR 1093 (HY000): Table 'staff' is specified twice, both
as a target for 'DELETE' and as a separate source for data
There is only partial support for row comparison operations. The expression in
expr op {ALL|ANY|SOME} subquery,
must be scalar and the subquery can only return a single column.
However, because of the way IN
is implemented (it is rewritten as a sequence of =
comparisons and AND
), the expression in
expression [NOT] IN subquery
is permitted to be an n-tuple and the subquery can return rows of n-tuples.
For example:
SELECT * FROM staff WHERE (name,age) NOT IN (
SELECT name,age FROM customer WHERE age >=51]
);
+--------+------+
| name | age |
+--------+------+
| Bilhah | 37 |
| Maia | 25 |
+--------+------+
is permitted, but
SELECT * FROM staff WHERE (name,age) = ALL (
SELECT name,age FROM customer WHERE age >=51
);
ERROR 1241 (21000): Operand should contain 1 column(s)
is not.
Subqueries in the FROM
clause cannot be correlated subqueries. They cannot be evaluated for each row of the outer query since they are evaluated to produce a result set during when the query is executed.
A subquery can refer to a stored function which modifies data. This is an extension to the SQL standard, but can result in indeterminate outcomes. For example, take:
SELECT ... WHERE x IN (SELECT f() ...);
where f() inserts rows. The function f() could be executed a different number of times depending on how the optimizer chooses to handle the query.
This sort of construct is therefore not safe to use in replication that is not row-based, as there could be different results on the master and the slave.
This page is licensed: CC BY-SA / Gnu FDL