Learn about string functions in MariaDB Server. This section details SQL functions for manipulating, searching, and formatting text strings, essential for data cleansing and presentation.
ASCII(str)
Returns the numeric ASCII value of the leftmost character of the string argument. Returns 0
if the given string is empty and NULL
if it is NULL
.
ASCII()
works for 8-bit characters.
SELECT ASCII(9);
+----------+
| ASCII(9) |
+----------+
| 57 |
+----------+
SELECT ASCII('9');
+------------+
| ASCII('9') |
+------------+
| 57 |
+------------+
SELECT ASCII('abc');
+--------------+
| ASCII('abc') |
+--------------+
| 97 |
+--------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
BIN(N)
Returns a string representation of the binary value of the given longlong (that is, BIGINT) number. This is equivalent to CONV(N,10,2). The argument should be positive. If it is a FLOAT
, it will be truncated. Returns NULL
if the argument is NULL
.
SELECT BIN(12);
+---------+
| BIN(12) |
+---------+
| 1100 |
+---------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
This page describes the BINARY operator. For details about the data type, see Binary Data Type.
BINARY
The BINARY
operator casts the string following it to a binary string. This is an easy way to force a column comparison to be done byte by byte rather than character by character. This causes the comparison to be case sensitive even if the column isn't defined as BINARY or BLOB.
BINARY
also causes trailing spaces to be significant.
SELECT 'a' = 'A';
+-----------+
| 'a' = 'A' |
+-----------+
| 1 |
+-----------+
SELECT BINARY 'a' = 'A';
+------------------+
| BINARY 'a' = 'A' |
+------------------+
| 0 |
+------------------+
SELECT 'a' = 'a ';
+------------+
| 'a' = 'a ' |
+------------+
| 1 |
+------------+
SELECT BINARY 'a' = 'a ';
+-------------------+
| BINARY 'a' = 'a ' |
+-------------------+
| 0 |
+-------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
BIT_LENGTH(str)
Returns the length of the given string argument in bits. If the argument is not a string, it will be converted to string. If the argument is NULL
, it returns NULL
.
SELECT BIT_LENGTH('text');
+--------------------+
| BIT_LENGTH('text') |
+--------------------+
| 32 |
+--------------------+
SELECT BIT_LENGTH('');
+----------------+
| BIT_LENGTH('') |
+----------------+
| 0 |
+----------------+
PostgreSQL and Sybase support BIT_LENGTH()
.
This page is licensed: GPLv2, originally from fill_help_tables.sql
CAST(expr AS type)
The CAST()
function takes a value of one type and produces a value of another type, similar to the CONVERT() function.
The type can be one of the following values:
[DECIMAL(M[,D])]
Short for SIGNED INTEGER
SIGNED [INTEGER]
UNSIGNED [INTEGER]
VARCHAR (in Oracle mode)
The main difference between CAST
and CONVERT() is that CONVERT(expr,type) is ODBC syntax while CAST(expr as type)
and CONVERT(... USING ...) are SQL92 syntax.
You can use the CAST()
function with the INTERVAL
keyword.
This introduced an incompatibility with previous versions of MariaDB, and all versions of MySQL (see the example below).
Simple casts:
SELECT CAST("abc" AS BINARY);
SELECT CAST("1" AS UNSIGNED INTEGER);
SELECT CAST(123 AS CHAR CHARACTER SET utf8)
Note that when one casts to CHAR without specifying the character set, the collation_connection character set collation will be used. When used with CHAR CHARACTER SET
, the default collation for that character set will be used.
SELECT COLLATION(CAST(123 AS CHAR));
+------------------------------+
| COLLATION(CAST(123 AS CHAR)) |
+------------------------------+
| latin1_swedish_ci |
+------------------------------+
SELECT COLLATION(CAST(123 AS CHAR CHARACTER SET utf8));
+-------------------------------------------------+
| COLLATION(CAST(123 AS CHAR CHARACTER SET utf8)) |
+-------------------------------------------------+
| utf8_general_ci |
+-------------------------------------------------+
If you also want to change the collation, you have to use the COLLATE
operator:
SELECT COLLATION(CAST(123 AS CHAR CHARACTER SET utf8)
COLLATE utf8_unicode_ci);
+-------------------------------------------------------------------------+
| COLLATION(CAST(123 AS CHAR CHARACTER SET utf8) COLLATE utf8_unicode_ci) |
+-------------------------------------------------------------------------+
| utf8_unicode_ci |
+-------------------------------------------------------------------------+
Using CAST()
to order an ENUM field as a CHAR rather than the internal numerical value:
CREATE TABLE enum_list (enum_field enum('c','a','b'));
INSERT INTO enum_list (enum_field)
VALUES('c'),('a'),('c'),('b');
SELECT * FROM enum_list
ORDER BY enum_field;
+------------+
| enum_field |
+------------+
| c |
| c |
| a |
| b |
+------------+
SELECT * FROM enum_list
ORDER BY CAST(enum_field AS CHAR);
+------------+
| enum_field |
+------------+
| a |
| b |
| c |
| c |
+------------+
The following will trigger warnings, since x'aa'
and 'X'aa'
doesn't behave as a number. In all versions of MySQL, no warnings are triggered since they did erroneously behave as a number:
SELECT CAST(0xAA AS UNSIGNED), CAST(x'aa' AS UNSIGNED), CAST(X'aa' AS UNSIGNED);
+------------------------+-------------------------+-------------------------+
| CAST(0xAA AS UNSIGNED) | CAST(x'aa' AS UNSIGNED) | CAST(X'aa' AS UNSIGNED) |
+------------------------+-------------------------+-------------------------+
| 170 | 0 | 0 |
+------------------------+-------------------------+-------------------------+
1 row in set, 2 warnings (0.00 sec)
Warning (Code 1292): Truncated incorrect INTEGER value: '\xAA'
Warning (Code 1292): Truncated incorrect INTEGER value: '\xAA'
Casting to intervals:
SELECT CAST(2019-01-04 AS INTERVAL DAY_SECOND(2)) AS "Cast";
+-------------+
| Cast |
+-------------+
| 00:20:14.00 |
+-------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
CHAR(N,... [USING charset_name])
CHAR()
interprets each argument as an INT and returns a string consisting of the characters given by the code values of those integers. NULL
values are skipped. By default, CHAR()
returns a binary string. To produce a string in a given character set, use the optional USING
clause:
SELECT CHARSET(CHAR(0x65)), CHARSET(CHAR(0x65 USING utf8));
+---------------------+--------------------------------+
| CHARSET(CHAR(0x65)) | CHARSET(CHAR(0x65 USING utf8)) |
+---------------------+--------------------------------+
| binary | utf8 |
+---------------------+--------------------------------+
If USING
is given and the result string is illegal for the given character set, a warning is issued. Also, if strict SQL mode is enabled, the result from CHAR()
becomes NULL
.
SELECT CHAR(77,97,114,'105',97,'68',66);
+----------------------------------+
| CHAR(77,97,114,'105',97,'68',66) |
+----------------------------------+
| MariaDB |
+----------------------------------+
SELECT CHAR(77,77.3,'77.3');
+----------------------+
| CHAR(77,77.3,'77.3') |
+----------------------+
| MMM |
+----------------------+
1 row in set, 1 warning (0.00 sec)
Warning (Code 1292): Truncated incorrect INTEGER value: '77.3'
ASCII() - Return ASCII value of first character
ORD() - Return value for character in single or multi-byte character sets
CHR - Similar, Oracle-compatible, function
This page is licensed: GPLv2, originally from fill_help_tables.sql
CHAR_LENGTH(str)
CHARACTER_LENGTH(str)
Returns the length of the given string argument, measured in characters. A multi-byte character counts as a single character. This means that for a string containing five two-byte characters, LENGTH() (or OCTET_LENGTH() in Oracle mode) returns 10, whereas CHAR_LENGTH()
returns 5. If the argument is NULL
, it returns NULL
.
If the argument is not a string value, it is converted into a string.
It is synonymous with the CHARACTER_LENGTH()
function.
SELECT CHAR_LENGTH('MariaDB');
+------------------------+
| CHAR_LENGTH('MariaDB') |
+------------------------+
| 7 |
+------------------------+
When Oracle mode is not set:
SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π') | LENGTH('π') | LENGTHB('π') | OCTET_LENGTH('π') |
+-------------------+--------------+---------------+--------------------+
| 1 | 2 | 2 | 2 |
+-------------------+--------------+---------------+--------------------+
In Oracle mode:
SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π') | LENGTH('π') | LENGTHB('π') | OCTET_LENGTH('π') |
+-------------------+--------------+---------------+--------------------+
| 1 | 1 | 2 | 2 |
+-------------------+--------------+---------------+--------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
CHARACTER_LENGTH(str)
CHARACTER_LENGTH()
is a synonym for CHAR_LENGTH().
This page is licensed: GPLv2, originally from fill_help_tables.sql
CHR(N)
CHR()
interprets each argument N as an integer and returns a VARCHAR(1) string consisting of the character given by the code values of the integer. The character set and collation of the string are set according to the values of the character_set_database and collation_database system variables.
CHR()
is similar to the CHAR() function, but only accepts a single argument.
CHR()
is available in all sql_modes.
SELECT CHR(67);
+---------+
| CHR(67) |
+---------+
| C |
+---------+
SELECT CHR('67');
+-----------+
| CHR('67') |
+-----------+
| C |
+-----------+
SELECT CHR('C');
+----------+
| CHR('C') |
+----------+
| |
+----------+
1 row in set, 1 warning (0.000 sec)
SHOW WARNINGS;
+---------+------+----------------------------------------+
| Level | Code | Message |
+---------+------+----------------------------------------+
| Warning | 1292 | Truncated incorrect INTEGER value: 'C' |
+---------+------+----------------------------------------+
ASCII() - Return ASCII value of first character
ORD() - Return value for character in single or multi-byte character sets
CHAR() - Similar function which accepts multiple integers
This page is licensed: CC BY-SA / Gnu FDL
CONCAT(str1,str2,...)
Returns the string that results from concatenating the arguments. May have one or more arguments. If all arguments are non-binary strings, the result is a non-binary string. If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent binary string form; if you want to avoid that, you can use an explicit type cast, as in this example:
SELECT CONCAT(CAST(int_col AS CHAR), char_col);
CONCAT()
returns NULL
if any argument is NULL
.
A NULL
parameter hides all information contained in other parameters from the result. Sometimes this is not desirable; to avoid this, you can:
Use the CONCAT_WS() function with an empty separator, because that function is NULL
-safe.
Use IFNULL() to turn NULLs into empty strings.
In Oracle mode, CONCAT
ignores null.
SELECT CONCAT('Ma', 'ria', 'DB');
+---------------------------+
| CONCAT('Ma', 'ria', 'DB') |
+---------------------------+
| MariaDB |
+---------------------------+
SELECT CONCAT('Ma', 'ria', NULL, 'DB');
+---------------------------------+
| CONCAT('Ma', 'ria', NULL, 'DB') |
+---------------------------------+
| NULL |
+---------------------------------+
SELECT CONCAT(42.0);
+--------------+
| CONCAT(42.0) |
+--------------+
| 42.0 |
+--------------+
Using IFNULL()
to handle NULL
values:
SELECT CONCAT('The value of @v is: ', IFNULL(@v, ''));
+------------------------------------------------+
| CONCAT('The value of @v is: ', IFNULL(@v, '')) |
+------------------------------------------------+
| The value of @v is: |
+------------------------------------------------+
In Oracle mode:
SELECT CONCAT('Ma', 'ria', NULL, 'DB');
+---------------------------------+
| CONCAT('Ma', 'ria', NULL, 'DB') |
+---------------------------------+
| MariaDB |
+---------------------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
CONCAT_WS(separator,str1,str2,...)
CONCAT_WS()
stands for Concatenate With Separator and is a special form of CONCAT(). The first argument is the separator for the rest of the arguments. The separator is added between the strings to be concatenated. The separator can be a string, as can the rest of the arguments.
If the separator is NULL
, the result is NULL
; all other NULL
values are skipped. This makes CONCAT_WS()
suitable when you want to concatenate some values and avoid losing all information if one of them is NULL
.
SELECT CONCAT_WS(',','First name','Second name','Last Name');
+-------------------------------------------------------+
| CONCAT_WS(',','First name','Second name','Last Name') |
+-------------------------------------------------------+
| First name,Second name,Last Name |
+-------------------------------------------------------+
SELECT CONCAT_WS('-','Floor',NULL,'Room');
+------------------------------------+
| CONCAT_WS('-','Floor',NULL,'Room') |
+------------------------------------+
| Floor-Room |
+------------------------------------+
In some cases, remember to include a space in the separator string:
SET @a = 'gnu', @b = 'penguin', @c = 'sea lion';
Query OK, 0 rows affected (0.00 sec)
SELECT CONCAT_WS(', ', @a, @b, @c);
+-----------------------------+
| CONCAT_WS(', ', @a, @b, @c) |
+-----------------------------+
| gnu, penguin, sea lion |
+-----------------------------+
Using CONCAT_WS()
to handle NULL
s:
SET @a = 'a', @b = NULL, @c = 'c';
SELECT CONCAT_WS('', @a, @b, @c);
+---------------------------+
| CONCAT_WS('', @a, @b, @c) |
+---------------------------+
| ac |
+---------------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
CONVERT(expr,type), CONVERT(expr USING transcoding_name)
The CONVERT()
and CAST() functions take a value of one type and produce a value of another type.
The type can be one of the following values:
[DECIMAL(M[,D])]
Short for SIGNED INTEGER
SIGNED [INTEGER]
UNSIGNED [INTEGER]
VARCHAR (in Oracle mode)
Note that in MariaDB, INT
and INTEGER
are the same thing.
BINARY
produces a string with the BINARY data type. If the optional length is given, BINARY(N)
causes the cast to use no more than N
bytes of the argument. Values shorter than the given number in bytes are padded with 0x00 bytes to make them equal the length value.
CHAR(N)
causes the cast to use no more than the number of characters given in the argument.
The main difference between the CAST() and CONVERT()
is that CONVERT(expr,type)
is ODBC syntax while CAST(expr as type) and CONVERT(... USING ...)
are SQL92 syntax.
CONVERT()
with USING
is used to convert data between different character sets. In MariaDB, transcoding names are the same as the corresponding character set names. For example, this statement converts the string 'abc' in the default character set to the corresponding string in the utf8 character set:
SELECT CONVERT('abc' USING utf8);
SELECT enum_col FROM tbl_name
ORDER BY CAST(enum_col AS CHAR);
Converting a BINARY to string to permit the LOWER function to work:
SET @x = 'AardVark';
SET @x = BINARY 'AardVark';
SELECT LOWER(@x), LOWER(CONVERT (@x USING latin1));
+-----------+----------------------------------+
| LOWER(@x) | LOWER(CONVERT (@x USING latin1)) |
+-----------+----------------------------------+
| AardVark | aardvark |
+-----------+----------------------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
ELT(N, str1[, str2, str3,...])
Takes a numeric argument and a series of string arguments. Returns the string that corresponds to the given numeric position. For instance, it returns str1
if N
is 1, str2
if N
is 2, and so on. If the numeric argument is a FLOAT, MariaDB rounds it to the nearest INTEGER. If the numeric argument is less than 1, greater than the total number of arguments, or not a number, ELT()
returns NULL
. It must have at least two arguments.
It is complementary to the FIELD() function.
SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo');
+------------------------------------+
| ELT(1, 'ej', 'Heja', 'hej', 'foo') |
+------------------------------------+
| ej |
+------------------------------------+
SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo');
+------------------------------------+
| ELT(4, 'ej', 'Heja', 'hej', 'foo') |
+------------------------------------+
| foo |
+------------------------------------+
FIND_IN_SET() function. Returns the position of a string in a set of strings.
FIELD() function. Returns the index position of a string in a list.
This page is licensed: GPLv2, originally from fill_help_tables.sql
EXPORT_SET(bits, on, off[, separator[, number_of_bits]])
Takes a minimum of three arguments. Returns a string where each bit in the given bits
argument is returned, with the string values given for on
and off
.
Bits are examined from right to left, (from low-order to high-order bits). Strings are added to the result from left to right, separated by a separator string (defaults as ',
'). You can optionally limit the number of bits the EXPORT_SET()
function examines using the number_of_bits
option.
If any of the arguments are set as NULL
, the function returns NULL
.
SELECT EXPORT_SET(5,'Y','N',',',4);
+-----------------------------+
| EXPORT_SET(5,'Y','N',',',4) |
+-----------------------------+
| Y,N,Y,N |
+-----------------------------+
SELECT EXPORT_SET(6,'1','0',',',10);
+------------------------------+
| EXPORT_SET(6,'1','0',',',10) |
+------------------------------+
| 0,1,1,0,0,0,0,0,0,0 |
+------------------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
EXTRACTVALUE(xml_frag, xpath_expr)
The EXTRACTVALUE()
function takes two string arguments: a fragment of XML markup and an XPath expression, (also known as a locator). It returns the text (That is, CDDATA), of the first text node which is a child of the element or elements matching the XPath expression.
In cases where a valid XPath expression does not match any text nodes in a valid XML fragment, (including the implicit /text()
expression), the EXTRACTVALUE()
function returns an empty string.
When either the XML fragment or the XPath expression is NULL
, the EXTRACTVALUE()
function returns NULL
. When the XML fragment is invalid, it raises a warning Code 1525:
Warning (Code 1525): Incorrect XML value: 'parse error at line 1 pos 11: unexpected END-OF-INPUT'
When the XPath value is invalid, it generates an Error 1105:
ERROR 1105 (HY000): XPATH syntax error: ')'
This function is the equivalent of performing a match using the XPath expression after appending /text()
. In other words:
SELECT
EXTRACTVALUE('<cases><case>example</case></cases>', '/cases/case')
AS 'Base Example',
EXTRACTVALUE('<cases><case>example</case></cases>', '/cases/case/text()')
AS 'text() Example';
+--------------+----------------+
| Base Example | text() Example |
+--------------+----------------+
| example | example |
+--------------+----------------+
When EXTRACTVALUE()
returns multiple matches, it returns the content of the first child text node of each matching element, in the matched order, as a single, space-delimited string.
By design, the EXTRACTVALUE()
function makes no distinction between a match on an empty element and no match at all. If you need to determine whether no matching element was found in the XML fragment or if an element was found that contained no child text nodes, use the XPath count()
function.
For instance, when looking for a value that exists, but contains no child text nodes, you would get a count of the number of matching instances:
SELECT
EXTRACTVALUE('<cases><case/></cases>', '/cases/case')
AS 'Empty Example',
EXTRACTVALUE('<cases><case/></cases>', 'count(/cases/case)')
AS 'count() Example';
+---------------+-----------------+
| Empty Example | count() Example |
+---------------+-----------------+
| | 1 |
+---------------+-----------------+
Alternatively, when looking for a value that doesn't exist, count()
returns 0.
SELECT
EXTRACTVALUE('<cases><case/></cases>', '/cases/person')
AS 'No Match Example',
EXTRACTVALUE('<cases><case/></cases>', 'count(/cases/person)')
AS 'count() Example';
+------------------+-----------------+
| No Match Example | count() Example |
+------------------+-----------------+
| | 0|
+------------------+-----------------+
Important: The EXTRACTVALUE()
function only returns CDDATA
. It does not return tags that the element might contain or the text that these child elements contain.
SELECT
EXTRACTVALUE('<cases><case>Person<email>x@example.com</email></case></cases>', '/cases')
AS Case;
+--------+
| Case |
+--------+
| Person |
+--------+
Note, in the above example, while the XPath expression matches to the parent <case>
instance, it does not return the contained <email>
tag or its content.
SELECT
ExtractValue('<a>ccc<b>ddd</b></a>', '/a') AS val1,
ExtractValue('<a>ccc<b>ddd</b></a>', '/a/b') AS val2,
ExtractValue('<a>ccc<b>ddd</b></a>', '//b') AS val3,
ExtractValue('<a>ccc<b>ddd</b></a>', '/b') AS val4,
ExtractValue('<a>ccc<b>ddd</b><b>eee</b></a>', '//b') AS val5;
+------+------+------+------+---------+
| val1 | val2 | val3 | val4 | val5 |
+------+------+------+------+---------+
| ccc | ddd | ddd | | ddd eee |
+------+------+------+------+---------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
FIELD(pattern, str1[,str2,...])
Returns the index position of the string or number matching the given pattern. Returns 0
in the event that none of the arguments match the pattern. Raises an Error 1582 if not given at least two arguments.
When all arguments given to the FIELD()
function are strings, they are treated as case-insensitive. When all the arguments are numbers, they are treated as numbers. Otherwise, they are treated as doubles.
If the given pattern occurs more than once, the FIELD()
function only returns the index of the first instance. If the given pattern is NULL
, the function returns 0
, as a NULL
pattern always fails to match.
This function is complementary to the ELT() function.
SELECT FIELD('ej', 'Hej', 'ej', 'Heja', 'hej', 'foo')
AS 'Field Results';
+---------------+
| Field Results |
+---------------+
| 2 |
+---------------+
SELECT FIELD('fo', 'Hej', 'ej', 'Heja', 'hej', 'foo')
AS 'Field Results';
+---------------+
| Field Results |
+---------------+
| 0 |
+---------------+
SELECT FIELD(1, 2, 3, 4, 5, 1) AS 'Field Results';
+---------------+
| Field Results |
+---------------+
| 5 |
+---------------+
SELECT FIELD(NULL, 2, 3) AS 'Field Results';
+---------------+
| Field Results |
+---------------+
| 0 |
+---------------+
SELECT FIELD('fail') AS 'Field Results';
Error 1582 (42000): Incorrect parameter count in call
to native function 'field'
ELT() function. Returns the N'th element from a set of strings.
This page is licensed: GPLv2, originally from fill_help_tables.sql
FIND_IN_SET(pattern, strlist)
Returns the index position where the given pattern occurs in a string list. The first argument is the pattern you want to search for. The second argument is a string containing comma-separated variables. If the second argument is of the SET data-type, the function is optimized to use bit arithmetic.
If the pattern does not occur in the string list or if the string list is an empty string, the function returns 0
. If either argument is NULL
, the function returns NULL
. The function does not return the correct result if the pattern contains a comma (",
") character.
SELECT FIND_IN_SET('b','a,b,c,d') AS "Found Results";
+---------------+
| Found Results |
+---------------+
| 2 |
+---------------+
ELT() function. Returns the N'th element from a set of strings.
This page is licensed: GPLv2, originally from fill_help_tables.sql
FORMAT(num, decimal_position[, locale])
Formats the given number for display as a string, adding separators to appropriate position and rounding the results to the given decimal position. For instance, it would format 15233.345
to 15,233.35
.
If the given decimal position is 0
, it rounds to return no decimal point or fractional part. You can optionally specify a locale value to format numbers to the pattern appropriate for the given region.
SELECT FORMAT(1234567890.09876543210, 4) AS 'Format';
+--------------------+
| Format |
+--------------------+
| 1,234,567,890.0988 |
+--------------------+
SELECT FORMAT(1234567.89, 4) AS 'Format';
+----------------+
| Format |
+----------------+
| 1,234,567.8900 |
+----------------+
SELECT FORMAT(1234567.89, 0) AS 'Format';
+-----------+
| Format |
+-----------+
| 1,234,568 |
+-----------+
SELECT FORMAT(123456789,2,'rm_CH') AS 'Format';
+----------------+
| Format |
+----------------+
| 123'456'789,00 |
+----------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
FROM_BASE64(str)
Decodes the given base-64 encode string, returning the result as a binary string. Returns NULL
if the given string is NULL
or if it's invalid.
It is the reverse of the TO_BASE64 function.
There are numerous methods to base-64 encode a string. MariaDB uses the following:
It encodes alphabet value 64 as '+
'.
It encodes alphabet value 63 as '/
'.
It codes output in groups of four printable characters. Each three byte of data encoded uses four characters. If the final group is incomplete, it pads the difference with the '=
' character.
It divides long output, adding a new line very 76 characters.
In decoding, it recognizes and ignores newlines, carriage returns, tabs and space whitespace characters.
SELECT TO_BASE64('Maria') AS 'Input';
+-----------+
| Input |
+-----------+
| TWFyaWE= |
+-----------+
SELECT FROM_BASE64('TWFyaWE=') AS 'Output';
+--------+
| Output |
+--------+
| Maria |
+--------+
This page is licensed: CC BY-SA / Gnu FDL
HEX(N_or_S)
If N_or_S
is a number, returns a string representation of the hexadecimal value of N
, where N
is a longlong
(BIGINT) number. This is equivalent to CONV(N,10,16).
If N_or_S
is a string, returns a hexadecimal string representation ofN_or_S
where each byte of each character in N_or_S
is converted to two hexadecimal digits. If N_or_S
is NULL
, returns NULL
. The inverse of this operation is performed by the UNHEX()
function.
HEX()
with an INET6 argument returns a hexadecimal representation of the underlying 16-byte binary string.
SELECT HEX(255);
+----------+
| HEX(255) |
+----------+
| FF |
+----------+
SELECT 0x4D617269614442;
+------------------+
| 0x4D617269614442 |
+------------------+
| MariaDB |
+------------------+
SELECT HEX('MariaDB');
+----------------+
| HEX('MariaDB') |
+----------------+
| 4D617269614442 |
+----------------+
SELECT HEX(CAST('2001:db8::ff00:42:8329' AS INET6));
+----------------------------------------------+
| HEX(CAST('2001:db8::ff00:42:8329' AS INET6)) |
+----------------------------------------------+
| 20010DB8000000000000FF0000428329 |
+----------------------------------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
INSERT(str,pos,len,newstr)
Returns the string str
, with the substring beginning at position pos
and len
characters long replaced by the string newstr
. Returns the original string if pos
is not within the length of the string.
Replaces the rest of the string from position pos
if len
is not within the length of the rest of the string. Returns NULL
if any argument is NULL
.
SELECT INSERT('Quadratic', 3, 4, 'What');
+-----------------------------------+
| INSERT('Quadratic', 3, 4, 'What') |
+-----------------------------------+
| QuWhattic |
+-----------------------------------+
SELECT INSERT('Quadratic', -1, 4, 'What');
+------------------------------------+
| INSERT('Quadratic', -1, 4, 'What') |
+------------------------------------+
| Quadratic |
+------------------------------------+
SELECT INSERT('Quadratic', 3, 100, 'What');
+-------------------------------------+
| INSERT('Quadratic', 3, 100, 'What') |
+-------------------------------------+
| QuWhat |
+-------------------------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
INSTR(str,substr)
Returns the position of the first occurrence of substring substr in string str. This is the same as the two-argument form of LOCATE(), except that the order of the arguments is reversed.
INSTR()
performs a case-insensitive search.
If any argument is NULL
, returns NULL
.
SELECT INSTR('foobarbar', 'bar');
+---------------------------+
| INSTR('foobarbar', 'bar') |
+---------------------------+
| 4 |
+---------------------------+
SELECT INSTR('My', 'Maria');
+----------------------+
| INSTR('My', 'Maria') |
+----------------------+
| 0 |
+----------------------+
LOCATE() ; Returns the position of a string within a string
SUBSTRING_INDEX() ; Returns the substring from string before count occurrences of a delimiter
This page is licensed: GPLv2, originally from fill_help_tables.sql
LCASE(str)
LCASE()
is a synonym for LOWER().
This page is licensed: GPLv2, originally from fill_help_tables.sql
LEFT(str,len)
Returns the leftmost len
characters from the string str
, or NULL
if any argument is NULL
.
SELECT LEFT('MariaDB', 5);
+--------------------+
| LEFT('MariaDB', 5) |
+--------------------+
| Maria |
+--------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
LENGTH(str)
Returns the length of the string str
.
In the default mode, when Oracle mode is not set, the length is measured in bytes. In this case, a multi-byte character counts as multiple bytes. This means that for a string containing five two-byte characters, LENGTH()
returns 10, whereas CHAR_LENGTH() returns 5.
When running Oracle mode, the length is measured in characters, and LENGTH
is a synonym for CHAR_LENGTH().
If str
is not a string value, it is converted into a string. If str
is NULL
, the function returns NULL
.
SELECT LENGTH('MariaDB');
+-------------------+
| LENGTH('MariaDB') |
+-------------------+
| 7 |
+-------------------+
When Oracle mode is not set:
SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π') | LENGTH('π') | LENGTHB('π') | OCTET_LENGTH('π') |
+-------------------+--------------+---------------+--------------------+
| 1 | 2 | 2 | 2 |
+-------------------+--------------+---------------+--------------------+
In Oracle mode:
SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π') | LENGTH('π') | LENGTHB('π') | OCTET_LENGTH('π') |
+-------------------+--------------+---------------+--------------------+
| 1 | 1 | 2 | 2 |
+-------------------+--------------+---------------+--------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
LENGTHB(str)
LENGTHB()
returns the length of the given string, in bytes. When Oracle mode is not set, this is a synonym for LENGTH.
A multi-byte character counts as multiple bytes. This means that for a string containing five two-byte characters, LENGTHB()
returns 10, whereas CHAR_LENGTH() returns 5.
If str
is not a string value, it is converted into a string. If str
is NULL
, the function returns NULL
.
When Oracle mode is not set:
SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π') | LENGTH('π') | LENGTHB('π') | OCTET_LENGTH('π') |
+-------------------+--------------+---------------+--------------------+
| 1 | 2 | 2 | 2 |
+-------------------+--------------+---------------+--------------------+
In Oracle mode:
SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π') | LENGTH('π') | LENGTHB('π') | OCTET_LENGTH('π') |
+-------------------+--------------+---------------+--------------------+
| 1 | 1 | 2 | 2 |
+-------------------+--------------+---------------+--------------------+
This page is licensed: CC BY-SA / Gnu FDL
expr LIKE pat [ESCAPE 'escape_char']
expr NOT LIKE pat [ESCAPE 'escape_char']
Tests whether expr matches the pattern pat. Returns either 1 (TRUE
) or 0 (FALSE
).
Both expr and pat may be any valid expression and are evaluated to strings.
Patterns may use the following wildcard characters:
%
matches any number of characters, including zero.
_
matches any single character.
Use NOT LIKE
to test if a string does not match a pattern. This is equivalent to using
the NOT operator on the entire LIKE
expression.
If either the expression or the pattern is NULL
, the result is NULL
.
LIKE
performs case-insensitive substring matches if the collation for the expression and pattern is case-insensitive. For case-sensitive matches, declare either argument to use a binary collation using collate, or coerce either of them to a BINARY string using CAST. Use SHOW COLLATION to get a list of
available collations. Collations ending in _bin
are case-sensitive.
Numeric arguments are coerced to binary strings.
The _
wildcard matches a single character, not byte. It will only match a multi-byte character
if it is valid in the expression's character set. For example, _
will match _utf8"€"
, but it
will not match _latin1"€"
because the Euro sign is not a valid latin1 character. If necessary,
use CONVERT to use the expression in a different character set.
If you need to match the characters _
or %
, you must escape them. By default, you can prefix the wildcard characters the backslash character \
to escape them. The backslash is used both to encode special characters like newlines when a string is parsed as well as to escape wildcards in a pattern after parsing. Thus, to match an actual backslash, you sometimes need to double-escape it as "\``\``\``\"
.
To avoid difficulties with the backslash character, you can change the wildcard escape character using ESCAPE
in a LIKE
expression. The argument to ESCAPE
must be a single-character string.
Select the days that begin with "T":
CREATE TABLE t1 (d VARCHAR(16));
INSERT INTO t1 VALUES
("Monday"), ("Tuesday"), ("Wednesday"),
("Thursday"), ("Friday"), ("Saturday"), ("Sunday");
SELECT * FROM t1 WHERE d LIKE "T%";
SELECT * FROM t1 WHERE d LIKE "T%";
+----------+
| d |
+----------+
| Tuesday |
| Thursday |
+----------+
Select the days that contain the substring "es":
SELECT * FROM t1 WHERE d LIKE "%es%";
SELECT * FROM t1 WHERE d LIKE "%es%";
+-----------+
| d |
+-----------+
| Tuesday |
| Wednesday |
+-----------+
Select the six-character day names:
SELECT * FROM t1 WHERE d like "___day";
SELECT * FROM t1 WHERE d like "___day";
+---------+
| d |
+---------+
| Monday |
| Friday |
| Sunday |
+---------+
With the default collations, LIKE
is case-insensitive:
SELECT * FROM t1 WHERE d like "t%";
SELECT * FROM t1 where d like "t%";
+----------+
| d |
+----------+
| Tuesday |
| Thursday |
+----------+
Use collate to specify a binary collation, forcing case-sensitive matches:
SELECT * FROM t1 WHERE d like "t%" COLLATE latin1_bin;
SELECT * FROM t1 WHERE d like "t%" COLLATE latin1_bin;
Empty SET (0.00 sec)
You can include functions and operators in the expression to match. Select dates based on their day name:
CREATE TABLE t2 (d DATETIME);
INSERT INTO t2 VALUES
("2007-01-30 21:31:07"),
("1983-10-15 06:42:51"),
("2011-04-21 12:34:56"),
("2011-10-30 06:31:41"),
("2011-01-30 14:03:25"),
("2004-10-07 11:19:34");
SELECT * FROM t2 WHERE DAYNAME(d) LIKE "T%";
SELECT * FROM t2 WHERE DAYNAME(d) LIKE "T%";
+------------------+
| d |
+------------------+
| 2007-01-30 21:31 |
| 2011-04-21 12:34 |
| 2004-10-07 11:19 |
+------------------+
3 rows in set, 7 warnings (0.00 sec)
MariaDB can use indexes for LIKE
on string columns in the case where the LIKE doesn't start with %
or _
.
You can set the optimizer_use_condition_selectivity variable to 5. If this is done, then the optimizer will read optimizer_selectivity_sampling_limit rows to calculate the selectivity of the LIKE
expression before starting to calculate the query plan. This can help speed up some LIKE
queries by providing the optimizer with more information about your data.
For searches on text columns, with results sorted by relevance, see full-text indexes.
For more complex searches and operations on strings, you can use regular expressions, which were enhanced in MariaDB 10 (see PCRE Regular Expressions).
This page is licensed: GPLv2, originally from fill_help_tables.sql
LOAD_FILE(file_name)
Reads the file and returns the file contents as a string. To use this function, the file must be located on the server host, you must specify the full path name to the file, and you must have the FILE privilege. The file must be readable by all and it must be less than the size, in bytes, of the max_allowed_packet system variable. If the secure_file_priv system variable is set to a non-empty directory name, the file to be loaded must be located in that directory.
If the file does not exist or cannot be read because one of the preceding conditions is not satisfied, the function returns NULL
.
The character_set_filesystem system variable has controlled interpretation of file names that are given as literal strings.
Statements using the LOAD_FILE()
function are not safe for statement based replication. This is because the slave will execute the LOAD_FILE()
command itself. If the file doesn't exist on the slave, the function will return NULL
.
UPDATE t SET blob_col=LOAD_FILE('/tmp/picture') WHERE id=1;
This page is licensed: GPLv2, originally from fill_help_tables.sql
LOCATE(substr,str), LOCATE(substr,str,pos)
The first syntax returns the position of the first occurrence of substring substr
in string str
. The second syntax returns the position of the first occurrence of substring substr
in string str
, starting at position pos
. Returns 0 if substr
is not in str
.
LOCATE()
performs a case-insensitive search.
If any argument is NULL
, returns NULL.
INSTR() is the same as the two-argument form of LOCATE()
, except that the order of the arguments is reversed.
SELECT LOCATE('bar', 'foobarbar');
+----------------------------+
| LOCATE('bar', 'foobarbar') |
+----------------------------+
| 4 |
+----------------------------+
SELECT LOCATE('My', 'Maria');
+-----------------------+
| LOCATE('My', 'Maria') |
+-----------------------+
| 0 |
+-----------------------+
SELECT LOCATE('bar', 'foobarbar', 5);
+-------------------------------+
| LOCATE('bar', 'foobarbar', 5) |
+-------------------------------+
| 7 |
+-------------------------------+
INSTR() ; Returns the position of a string within a string
SUBSTRING_INDEX() ; Returns the substring from string before count occurrences of a delimiter
This page is licensed: GPLv2, originally from fill_help_tables.sql
LOWER(str)
LCASE(str)
Returns the string str
with all characters changed to lowercase according to the current character set mapping. The default is latin1 (cp1252 West European).
LCASE
is a synonym for LOWER
.
SELECT LOWER('QUADRATICALLY');
+------------------------+
| LOWER('QUADRATICALLY') |
+------------------------+
| quadratically |
+------------------------+
LOWER()
(and UPPER()) are ineffective when applied to binary strings (BINARY, VARBINARY, BLOB).
To perform letter case conversion, CONVERT the string to a non-binary string:
SET @str = BINARY 'North Carolina';
SELECT LOWER(@str), LOWER(CONVERT(@str USING latin1));
+----------------+-----------------------------------+
| LOWER(@str) | LOWER(CONVERT(@str USING latin1)) |
+----------------+-----------------------------------+
| North Carolina | north carolina |
+----------------+-----------------------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
LPAD(str, len [,padstr])
Returns the string str
, left-padded with the string padstr
to a length of len
characters. If str
is longer than len
, the return value is shortened to len
characters. If padstr
is omitted, the LPAD function pads spaces.
Returns NULL
if given a NULL
argument. If the result is empty (zero length), returns either an empty string or with SQL_MODE=Oracle, NULL
.
The Oracle mode version of the function can be accessed outside of Oracle mode by using LPAD_ORACLE
as the function name.
SELECT LPAD('hello',10,'.');
+----------------------+
| LPAD('hello',10,'.') |
+----------------------+
| .....hello |
+----------------------+
SELECT LPAD('hello',2,'.');
+---------------------+
| LPAD('hello',2,'.') |
+---------------------+
| he |
+---------------------+
With the pad string defaulting to space:
SELECT LPAD('hello',10);
+------------------+
| LPAD('hello',10) |
+------------------+
| hello |
+------------------+
Oracle mode:
SELECT LPAD('',0),LPAD_ORACLE('',0);
+------------+-------------------+
| LPAD('',0) | LPAD_ORACLE('',0) |
+------------+-------------------+
| | NULL |
+------------+-------------------+
RPAD - Right-padding instead of left-padding.
This page is licensed: GPLv2, originally from fill_help_tables.sql
LTRIM(str)
Returns the string str
with leading space characters removed.
Returns NULL
if given a NULL
argument. If the result is empty, returns either an empty string, or with SQL_MODE=Oracle, NULL
.
The Oracle mode version of the function can be accessed outside of Oracle mode by using LTRIM_ORACLE
as the function name.
SELECT QUOTE(LTRIM(' MariaDB '));
+-------------------------------+
| QUOTE(LTRIM(' MariaDB ')) |
+-------------------------------+
| 'MariaDB ' |
+-------------------------------+
Oracle mode version:
SELECT LTRIM(''),LTRIM_ORACLE('');
+-----------+------------------+
| LTRIM('') | LTRIM_ORACLE('') |
+-----------+------------------+
| | NULL |
+-----------+------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
MAKE_SET(bits,str1,str2,...)
Returns a set value (a string containing substrings separated by "," characters) consisting of the strings that have the corresponding bit in bits set. str1
corresponds to bit 0, str2
to bit 1, and so on. NULL
values in str1
, str2
, ... are not appended to the result.
SELECT MAKE_SET(1,'a','b','c');
+-------------------------+
| MAKE_SET(1,'a','b','c') |
+-------------------------+
| a |
+-------------------------+
SELECT MAKE_SET(1 | 4,'hello','nice','world');
+----------------------------------------+
| MAKE_SET(1 | 4,'hello','nice','world') |
+----------------------------------------+
| hello,world |
+----------------------------------------+
SELECT MAKE_SET(1 | 4,'hello','nice',NULL,'world');
+---------------------------------------------+
| MAKE_SET(1 | 4,'hello','nice',NULL,'world') |
+---------------------------------------------+
| hello |
+---------------------------------------------+
SELECT QUOTE(MAKE_SET(0,'a','b','c'));
+--------------------------------+
| QUOTE(MAKE_SET(0,'a','b','c')) |
+--------------------------------+
| '' |
+--------------------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
MATCH (col1,col2,...) AGAINST (expr [search_modifier])
A special construct used to perform a fulltext search on a fulltext index.
See Fulltext Index Overview for a full description, and Full-text Indexes for more articles on the topic.
CREATE TABLE ft_myisam(copy TEXT,FULLTEXT(copy)) ENGINE=MyISAM;
INSERT INTO ft_myisam(copy) VALUES ('Once upon a time'), ('There was a wicked witch'),
('Who ate everybody up');
SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('wicked');
+--------------------------+
| copy |
+--------------------------+
| There was a wicked witch |
+--------------------------+
SELECT id, body, MATCH (title,body) AGAINST
('Security implications of running MySQL as root'
IN NATURAL LANGUAGE MODE) AS score
FROM articles WHERE MATCH (title,body) AGAINST
('Security implications of running MySQL as root'
IN NATURAL LANGUAGE MODE);
+----+-------------------------------------+-----------------+
| id | body | score |
+----+-------------------------------------+-----------------+
| 4 | 1. Never run mysqld as root. 2. ... | 1.5219271183014 |
| 6 | When configured properly, MySQL ... | 1.3114095926285 |
+----+-------------------------------------+-----------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
MID(str,pos,len)
MID(str,pos,len) is a synonym for SUBSTRING(str,pos,len).
SELECT MID('abcd',4,1);
+-----------------+
| MID('abcd',4,1) |
+-----------------+
| d |
+-----------------+
SELECT MID('abcd',2,2);
+-----------------+
| MID('abcd',2,2) |
+-----------------+
| bc |
+-----------------+
A negative starting position:
SELECT MID('abcd',-2,4);
+------------------+
| MID('abcd',-2,4) |
+------------------+
| cd |
+------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
NATURAL_SORT_KEY(str)
The NATURAL_SORT_KEY
function is used for sorting that is closer to natural sorting. Strings are sorted in alphabetical order, while numbers are treated in a way such that, for example, 10
is greater than 2
, whereas in other forms of sorting, 2
would be greater than 10
, just like z
is greater than ya
.
There are multiple natural sort implementations, differing in the way they handle leading zeroes, fractions, i18n, negatives, decimals and so on.
MariaDB's implementation ignores leading zeroes when performing the sort.
You can use also use NATURAL_SORT_KEY
with generated columns. The value is not stored permanently in the table. When using a generated column, the virtual column must be longer than the base column to cater for embedded numbers in the string and MDEV-24582.
CREATE TABLE t1 (c TEXT);
INSERT INTO t1 VALUES ('b1'),('a2'),('a11'),('a1');
SELECT c FROM t1;
+------+
| c |
+------+
| b1 |
| a2 |
| a11 |
| a1 |
+------+
SELECT c FROM t1 ORDER BY c;
+------+
| c |
+------+
| a1 |
| a11 |
| a2 |
| b1 |
+------+
Unsorted, regular sort and natural sort:
TRUNCATE t1;
INSERT INTO t1 VALUES
('5.5.31'),('10.7.0'),('10.2.1'),
('10.1.22'),('10.3.32'),('10.2.12');
SELECT c FROM t1;
+---------+
| c |
+---------+
| 5.5.31 |
| 10.7.0 |
| 10.2.1 |
| 10.1.22 |
| 10.3.32 |
| 10.2.12 |
+---------+
SELECT c FROM t1 ORDER BY c;
+---------+
| c |
+---------+
| 10.1.22 |
| 10.2.1 |
| 10.2.12 |
| 10.3.32 |
| 10.7.0 |
| 5.5.31 |
+---------+
SELECT c FROM t1 ORDER BY NATURAL_SORT_KEY(c);
+---------+
| c |
+---------+
| 5.5.31 |
| 10.1.22 |
| 10.2.1 |
| 10.2.12 |
| 10.3.32 |
| 10.7.0 |
+---------+
Sorting IPs, unsorted, regular sort and natural sort::
TRUNCATE t1;
INSERT INTO t1 VALUES
('192.167.3.1'),('192.167.1.12'),('100.200.300.400'),
('100.50.60.70'),('100.8.9.9'),('127.0.0.1'),('0.0.0.0');
SELECT c FROM t1;
+-----------------+
| c |
+-----------------+
| 192.167.3.1 |
| 192.167.1.12 |
| 100.200.300.400 |
| 100.50.60.70 |
| 100.8.9.9 |
| 127.0.0.1 |
| 0.0.0.0 |
+-----------------+
SELECT c FROM t1 ORDER BY c;
+-----------------+
| c |
+-----------------+
| 0.0.0.0 |
| 100.200.300.400 |
| 100.50.60.70 |
| 100.8.9.9 |
| 127.0.0.1 |
| 192.167.1.12 |
| 192.167.3.1 |
+-----------------+
SELECT c FROM t1 ORDER BY NATURAL_SORT_KEY(c);
+-----------------+
| c |
+-----------------+
| 0.0.0.0 |
| 100.8.9.9 |
| 100.50.60.70 |
| 100.200.300.400 |
| 127.0.0.1 |
| 192.167.1.12 |
| 192.167.3.1 |
+-----------------+
Using with a generated column:
CREATE TABLE t(c VARCHAR(3), k VARCHAR(4) AS (NATURAL_SORT_KEY(c)) INVISIBLE);
INSERT INTO t(c) VALUES ('b1'),('a2'),('a11'),('a10');
SELECT * FROM t ORDER by k;
+------+
| c |
+------+
| a2 |
| a10 |
| a11 |
| b1 |
+------+
Note that if the virtual column is not longer, results may not be as expected:
CREATE TABLE t2(c VARCHAR(3), k VARCHAR(3) AS (NATURAL_SORT_KEY(c)) INVISIBLE);
INSERT INTO t2(c) VALUES ('b1'),('a2'),('a11'),('a10');
SELECT * FROM t2 ORDER by k;
+------+
| c |
+------+
| a2 |
| a11 |
| a10 |
| b1 |
+------+
Ignoring leading zeroes can lead to undesirable results in certain contexts. For example:
CREATE TABLE t3 (a VARCHAR(4));
INSERT INTO t3 VALUES
('a1'), ('a001'), ('a10'), ('a001'), ('a10'),
('a01'), ('a01'), ('a01b'), ('a01b'), ('a1');
SELECT a FROM t3 ORDER BY a;
+------+
| a |
+------+
| a001 |
| a001 |
| a01 |
| a01 |
| a01b |
| a01b |
| a1 |
| a1 |
| a10 |
| a10 |
+------+
10 rows in set (0.000 sec)
SELECT a FROM t3 ORDER BY NATURAL_SORT_KEY(a);
+------+
| a |
+------+
| a1 |
| a01 |
| a01 |
| a001 |
| a001 |
| a1 |
| a01b |
| a01b |
| a10 |
| a10 |
+------+
This may not be what we were hoping for in a 'natural' sort. A workaround is to sort by both NATURAL_SORT_KEY
and regular sort.
SELECT a FROM t3 ORDER BY NATURAL_SORT_KEY(a), a;
+------+
| a |
+------+
| a001 |
| a001 |
| a01 |
| a01 |
| a1 |
| a1 |
| a01b |
| a01b |
| a10 |
| a10 |
+------+
This page is licensed: CC BY-SA / Gnu FDL
expr NOT LIKE pat [ESCAPE 'escape_char']
This is the same as NOT (expr LIKE pat [ESCAPE 'escape_char']).
This page is licensed: GPLv2, originally from fill_help_tables.sql
expr NOT REGEXP pat, expr NOT RLIKE pat
This is the same as NOT (expr REGEXP pat).
This page is licensed: GPLv2, originally from fill_help_tables.sql
OCTET_LENGTH(str)
OCTET_LENGTH()
returns the length of the given string, in octets (bytes). This is a synonym for LENGTHB(), and, when Oracle mode is not set, a synonym for LENGTH().
A multi-byte character counts as multiple bytes. This means that for a string containing five two-byte characters, OCTET_LENGTH()
returns 10, whereas CHAR_LENGTH() returns 5.
If str
is not a string value, it is converted into a string. If str
is NULL
, the function returns NULL
.
When Oracle mode is not set:
SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π') | LENGTH('π') | LENGTHB('π') | OCTET_LENGTH('π') |
+-------------------+--------------+---------------+--------------------+
| 1 | 2 | 2 | 2 |
+-------------------+--------------+---------------+--------------------+
In Oracle mode:
SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π') | LENGTH('π') | LENGTHB('π') | OCTET_LENGTH('π') |
+-------------------+--------------+---------------+--------------------+
| 1 | 1 | 2 | 2 |
+-------------------+--------------+---------------+--------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
ORD(str)
If the leftmost character of the string str
is a multi-byte character, returns the code for that character, calculated from the numeric values of its constituent bytes using this formula:
(1st byte code)
+ (2nd byte code x 256)
+ (3rd byte code x 256 x 256) ...
If the leftmost character is not a multi-byte character, ORD()
returns the same value as the ASCII() function.
SELECT ORD('2');
+----------+
| ORD('2') |
+----------+
| 50 |
+----------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
POSITION(substr IN str)
POSITION(substr IN str)
is a synonym for LOCATE(substr,str).
The function is part of ODBC 3.0.
This page is licensed: GPLv2, originally from fill_help_tables.sql
QUOTE(str)
Quotes a string to produce a result that can be used as a properly escaped data value in an SQL statement. The string is returned enclosed by single quotes and with each instance of single quote ("'
"), backslash ("\
"),ASCII NUL
, and Control-Z preceded by a backslash. If the argument is NULL
, the return value is the word "NULL
" without enclosing single quotes.
SELECT QUOTE("Don't!");
+-----------------+
| QUOTE("Don't!") |
+-----------------+
| 'Don\'t!' |
+-----------------+
SELECT QUOTE(NULL);
+-------------+
| QUOTE(NULL) |
+-------------+
| NULL |
+-------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
REPEAT(str,count)
Returns a string consisting of the string str
repeated count
times. Ifcount
is less than 1, returns an empty string. Returns NULL if str
orcount
are NULL
.
SELECT QUOTE(REPEAT('MariaDB ',4));
+------------------------------------+
| QUOTE(REPEAT('MariaDB ',4)) |
+------------------------------------+
| 'MariaDB MariaDB MariaDB MariaDB ' |
+------------------------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
REPLACE(str,from_str,to_str)
Returns the string str
with all occurrences of the string from_str
replaced by the string to_str
. REPLACE()
performs a case-sensitive match when searching for from_str
.
SELECT REPLACE('www.mariadb.org', 'w', 'Ww');
+---------------------------------------+
| REPLACE('www.mariadb.org', 'w', 'Ww') |
+---------------------------------------+
| WwWwWw.mariadb.org |
+---------------------------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
REVERSE(str)
Returns the string str
with the order of the characters reversed.
SELECT REVERSE('desserts');
+---------------------+
| REVERSE('desserts') |
+---------------------+
| stressed |
+---------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
RIGHT(str,len)
Returns the rightmost len
characters from the string str
, or NULL
if any argument is NULL
.
SELECT RIGHT('MariaDB', 2);
+---------------------+
| RIGHT('MariaDB', 2) |
+---------------------+
| DB |
+---------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
RPAD(str, len [, padstr])
Returns the string str
, right-padded with the string padstr
to a length of len
characters. If str
is longer than len
, the return value is shortened to len
characters. If padstr
is omitted, the RPAD function pads spaces.
Returns NULL
if given a NULL
argument. If the result is empty (a length of zero), returns either an empty string, or, with SQL_MODE=Oracle, NULL
.
The Oracle mode version of the function can be accessed outside of Oracle mode by using RPAD_ORACLE
as the function name.
SELECT RPAD('hello',10,'.');
+----------------------+
| RPAD('hello',10,'.') |
+----------------------+
| hello..... |
+----------------------+
SELECT RPAD('hello',2,'.');
+---------------------+
| RPAD('hello',2,'.') |
+---------------------+
| he |
+---------------------+
With the pad string defaulting to space:
SELECT RPAD('hello',30);
+--------------------------------+
| RPAD('hello',30) |
+--------------------------------+
| hello |
+--------------------------------+
Oracle mode:
SELECT RPAD('',0),RPAD_ORACLE('',0);
+------------+-------------------+
| RPAD('',0) | RPAD_ORACLE('',0) |
+------------+-------------------+
| | NULL |
+------------+-------------------+
LPAD - Left-padding instead of right-padding.
This page is licensed: GPLv2, originally from fill_help_tables.sql
RTRIM(str)
Returns the string str
with trailing space characters removed.
Returns NULL if given a NULL argument. If the result is empty, returns either an empty string, or, with SQL_MODE=Oracle, NULL
.
The Oracle mode version of the function can be accessed outside of Oracle mode by using RTRIM_ORACLE
as the function name.
SELECT QUOTE(RTRIM('MariaDB '));
+-----------------------------+
| QUOTE(RTRIM('MariaDB ')) |
+-----------------------------+
| 'MariaDB' |
+-----------------------------+
Oracle mode:
SELECT RTRIM(''),RTRIM_ORACLE('');
+-----------+------------------+
| RTRIM('') | RTRIM_ORACLE('') |
+-----------+------------------+
| | NULL |
+-----------+------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
The SFORMAT
function takes an input string and a formatting specification and returns the string formatted using the rules the user passed in the specification.
It uses the fmtlib library for Python-like (as well as Rust, C++20, etc) string formatting.
Only fmtlib 7.0.0+ is supported.
There is no native support for temporal and decimal values:
TIME_RESULT
is handled as STRING_RESULT
.
DECIMAL_RESULT
is handled as REAL_RESULT
.
SELECT SFORMAT("The answer is {}.", 42);
+----------------------------------+
| SFORMAT("The answer is {}.", 42) |
+----------------------------------+
| The answer is 42. |
+----------------------------------+
CREATE TABLE test_sformat(mdb_release char(6), mdev int, feature char(20));
INSERT INTO test_sformat VALUES('10.7.0', 25015, 'Python style sformat'),
('10.7.0', 4958, 'UUID');
SELECT * FROM test_sformat;
+-------------+-------+----------------------+
| mdb_release | mdev | feature |
+-------------+-------+----------------------+
| 10.7.0 | 25015 | Python style sformat |
| 10.7.0 | 4958 | UUID |
+-------------+-------+----------------------+
SELECT SFORMAT('MariaDB Server {} has a preview for MDEV-{} which is about {}',
mdb_release, mdev, feature) AS 'Preview Release Examples'
FROM test_sformat;
+----------------------------------------------------------------------------------------+
| Preview Release Examples |
+----------------------------------------------------------------------------------------+
| MariaDB Server 10.7.0 has a preview for MDEV-25015 which is about Python style sformat |
| MariaDB Server 10.7.0 has a preview for MDEV-4958 which is about UUID |
+----------------------------------------------------------------------------------------+
This page is licensed: CC BY-SA / Gnu FDL
SOUNDEX(str)
Returns a soundex string from str
. Two strings that sound almost the same should have identical soundex strings. A standard soundex string is four characters long, but the SOUNDEX()
function returns an arbitrarily long string. You can use SUBSTRING()
on the result to get a standard soundex string. All non-alphabetic characters in str
are ignored. All international alphabetic characters outside the A-Z range are treated as vowels.
Important: When using SOUNDEX()
, you should be aware of the following details:
This function, as currently implemented, is intended to work well with strings that are in the English language only. Strings in other languages may not produce reasonable results.
This function implements the original Soundex algorithm, not the more popular enhanced version (also described by D. Knuth). The difference is that original version discards vowels first and duplicates second, whereas the enhanced version discards duplicates first and vowels second.
SOUNDEX('Hello');
+------------------+
| SOUNDEX('Hello') |
+------------------+
| H400 |
+------------------+
SELECT SOUNDEX('MariaDB');
+--------------------+
| SOUNDEX('MariaDB') |
+--------------------+
| M631 |
+--------------------+
SELECT SOUNDEX('Knowledgebase');
+--------------------------+
| SOUNDEX('Knowledgebase') |
+--------------------------+
| K543212 |
+--------------------------+
SELECT givenname, surname FROM users WHERE SOUNDEX(givenname) = SOUNDEX("robert");
+-----------+---------+
| givenname | surname |
+-----------+---------+
| Roberto | Castro |
+-----------+---------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
expr1 SOUNDS LIKE expr2
This is the same as SOUNDEX(expr1) = SOUNDEX(expr2)
.
SELECT givenname, surname FROM users WHERE givenname SOUNDS LIKE "robert";
+-----------+---------+
| givenname | surname |
+-----------+---------+
| Roberto | Castro |
+-----------+---------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
SPACE(N)
Returns a string consisting of N
space characters. If N
is NULL
, returns NULL
.
SELECT QUOTE(SPACE(6));
+-----------------+
| QUOTE(SPACE(6)) |
+-----------------+
| ' ' |
+-----------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
STRCMP(expr1,expr2)
STRCMP()
returns 0
if the strings are the same, -1
if the first argument is smaller than the second according to the current sort order, and 1
if the strings are otherwise not the same. Returns NULL
is either argument is NULL
.
SELECT STRCMP('text', 'text2');
+-------------------------+
| STRCMP('text', 'text2') |
+-------------------------+
| -1 |
+-------------------------+
SELECT STRCMP('text2', 'text');
+-------------------------+
| STRCMP('text2', 'text') |
+-------------------------+
| 1 |
+-------------------------+
SELECT STRCMP('text', 'text');
+------------------------+
| STRCMP('text', 'text') |
+------------------------+
| 0 |
+------------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
SUBSTR()
is a synonym for SUBSTRING().
This page is licensed: GPLv2, originally from fill_help_tables.sql
SUBSTRING(str,pos),
SUBSTRING(str FROM pos),
SUBSTRING(str,pos,len),
SUBSTRING(str FROM pos FOR len)
SUBSTR(str,pos),
SUBSTR(str FROM pos),
SUBSTR(str,pos,len),
SUBSTR(str FROM pos FOR len)
The forms without a len
argument return a substring from string str
starting at position pos
.
The forms with a len
argument return a substring len
characters long from string str
, starting at position pos
.
The forms that use FROM
are standard SQL syntax.
It is also possible to use a negative value for pos
. In this case, the beginning of the substring is pos
characters from the end of the string, rather than the beginning. A negative value may be used for pos
in any of the forms of this function.
By default, the position of the first character in the string from which the substring is to be extracted is reckoned as 1. For Oracle compatibility, when sql_mode
is set to 'oracle
', position zero is treated as position 1 (although the first character is still reckoned as 1).
If any argument is NULL
, returns NULL
.
The optimizer can take advantage of queries of the format SUBSTR(col, 1, n) = const_str.
The optimizer cannot take advantage of queries of the format SUBSTR(col, 1, n) = const_str.
SELECT SUBSTRING('Knowledgebase',5);
+------------------------------+
| SUBSTRING('Knowledgebase',5) |
+------------------------------+
| ledgebase |
+------------------------------+
SELECT SUBSTRING('MariaDB' FROM 6);
+-----------------------------+
| SUBSTRING('MariaDB' FROM 6) |
+-----------------------------+
| DB |
+-----------------------------+
SELECT SUBSTRING('Knowledgebase',3,7);
+--------------------------------+
| SUBSTRING('Knowledgebase',3,7) |
+--------------------------------+
| owledge |
+--------------------------------+
SELECT SUBSTRING('Knowledgebase', -4);
+--------------------------------+
| SUBSTRING('Knowledgebase', -4) |
+--------------------------------+
| base |
+--------------------------------+
SELECT SUBSTRING('Knowledgebase', -8, 4);
+-----------------------------------+
| SUBSTRING('Knowledgebase', -8, 4) |
+-----------------------------------+
| edge |
+-----------------------------------+
SELECT SUBSTRING('Knowledgebase' FROM -8 FOR 4);
+------------------------------------------+
| SUBSTRING('Knowledgebase' FROM -8 FOR 4) |
+------------------------------------------+
| edge |
+------------------------------------------+
Oracle mode:
SELECT SUBSTR('abc',0,3);
+-------------------+
| SUBSTR('abc',0,3) |
+-------------------+
| |
+-------------------+
SELECT SUBSTR('abc',1,2);
+-------------------+
| SUBSTR('abc',1,2) |
+-------------------+
| ab |
+-------------------+
SET sql_mode='oracle';
SELECT SUBSTR('abc',0,3);
+-------------------+
| SUBSTR('abc',0,3) |
+-------------------+
| abc |
+-------------------+
SELECT SUBSTR('abc',1,2);
+-------------------+
| SUBSTR('abc',1,2) |
+-------------------+
| ab |
+-------------------+
INSTR() - Returns the position of a string within a string
LOCATE() - Returns the position of a string within a string
SUBSTRING_INDEX() - Returns a string based on substring
This page is licensed: GPLv2, originally from fill_help_tables.sql
SUBSTRING_INDEX(str,delim,count)
Returns the substring from string str
before count occurrences of the delimiter delim
. If count
is positive, everything to the left of the final delimiter (counting from the left) is returned. If count
is negative, everything to the right of the final delimiter (counting from the right) is returned. SUBSTRING_INDEX()
performs a case-sensitive match when searching for delim
.
If any argument is NULL
, returns NULL
.
For example:
SUBSTRING_INDEX('www.mariadb.org', '.', 2)
It means "Return all of the characters up to the 2nd occurrence of ."
SELECT SUBSTRING_INDEX('www.mariadb.org', '.', 2);
+--------------------------------------------+
| SUBSTRING_INDEX('www.mariadb.org', '.', 2) |
+--------------------------------------------+
| www.mariadb |
+--------------------------------------------+
SELECT SUBSTRING_INDEX('www.mariadb.org', '.', -2);
+---------------------------------------------+
| SUBSTRING_INDEX('www.mariadb.org', '.', -2) |
+---------------------------------------------+
| mariadb.org |
+---------------------------------------------+
INSTR() - Returns the position of a string within a string
LOCATE() - Returns the position of a string within a string
SUBSTRING() - Returns a string based on position
This page is licensed: GPLv2, originally from fill_help_tables.sql
TO_BASE64(str)
Converts the string argument str
to its base-64 encoded form, returning the result as a character string in the connection character set and collation.
The argument str
will be converted to string first if it is not a string. A NULL
argument will return a NULL
result.
The reverse function, FROM_BASE64(), decodes an encoded base-64 string.
There are a numerous different methods to base-64 encode a string. The following are used by MariaDB and MySQL:
Alphabet value 64 is encoded as '+'.
Alphabet value 63 is encoded as '/'.
Encoding output is made up of groups of four printable characters, with each three bytes of data encoded using four characters. If the final group is not complete, it is padded with '=' characters to make up a length of four.
To divide long output, a newline is added after every 76 characters.
Decoding will recognize and ignore newlines, carriage returns, tabs, and spaces.
SELECT TO_BASE64('Maria');
+--------------------+
| TO_BASE64('Maria') |
+--------------------+
| TWFyaWE= |
+--------------------+
This page is licensed: CC BY-SA / Gnu FDL
TO_CHAR(expr[, fmt])
The TO_CHAR
function converts an expr of type date, datetime, time or timestamp to a string. The optional fmt argument supports YYY/YYY/YY/RRRR/RR/MM/MON/MONTH/MI/DD/DY/HH/HH12/HH24/SS
and special characters. The default value is YYYY-MM-DD HH24:MI:SS
.
In Oracle, TO_CHAR
can also be used to convert numbers to strings, but this is not supported in MariaDB and will give an error.
SELECT TO_CHAR('1980-01-11 04:50:39', 'YYYY-MM-DD');
+----------------------------------------------+
| TO_CHAR('1980-01-11 04:50:39', 'YYYY-MM-DD') |
+----------------------------------------------+
| 1980-01-11 |
+----------------------------------------------+
SELECT TO_CHAR('1980-01-11 04:50:39', 'HH24-MI-SS');
+----------------------------------------------+
| TO_CHAR('1980-01-11 04:50:39', 'HH24-MI-SS') |
+----------------------------------------------+
| 04-50-39 |
+----------------------------------------------+
SELECT TO_CHAR('00-01-01 00:00:00', 'YY-MM-DD HH24:MI:SS');
+-----------------------------------------------------+
| TO_CHAR('00-01-01 00:00:00', 'YY-MM-DD HH24:MI:SS') |
+-----------------------------------------------------+
| 00-01-01 00:00:00 |
+-----------------------------------------------------+
SELECT TO_CHAR('99-12-31 23:59:59', 'YY-MM-DD HH24:MI:SS');
+-----------------------------------------------------+
| TO_CHAR('99-12-31 23:59:59', 'YY-MM-DD HH24:MI:SS') |
+-----------------------------------------------------+
| 99-12-31 23:59:59 |
+-----------------------------------------------------+
SELECT TO_CHAR('9999-12-31 23:59:59', 'YY-MM-DD HH24:MI:SS');
+-------------------------------------------------------+
| TO_CHAR('9999-12-31 23:59:59', 'YY-MM-DD HH24:MI:SS') |
+-------------------------------------------------------+
| 99-12-31 23:59:59 |
+-------------------------------------------------------+
SELECT TO_CHAR('21-01-03 08:30:00', 'Y-MONTH-DY HH:MI:SS');
+-----------------------------------------------------+
| TO_CHAR('21-01-03 08:30:00', 'Y-MONTH-DY HH:MI:SS') |
+-----------------------------------------------------+
| 1-January -Sun 08:30:00 |
+-----------------------------------------------------+
This page is licensed: CC BY-SA / Gnu FDL
TRIM_ORACLE([{BOTH | LEADING | TRAILING} [remstr] FROM] str), TRIM([remstr FROM] str)
Returns the string str
with all remstr
prefixes or suffixes removed. If none of the specifiers BOTH
, LEADING
, or TRAILING
is given, BOTH
is assumed. remstr
is optional and, if not specified, spaces are removed.
Returns NULL
if given a NULL
argument. If the result is empty, returns either an empty string, or, with SQL_MODE=Oracle, NULL
. SQL_MODE=Oracle
is not set by default.
The Oracle mode version of the function can be accessed in any mode by using TRIM_ORACLE
as the function name.
SELECT TRIM(' bar ')\G
*************************** 1. row ***************************
TRIM(' bar '): bar
SELECT TRIM(LEADING 'x' FROM 'xxxbarxxx')\G
*************************** 1. row ***************************
TRIM(LEADING 'x' FROM 'xxxbarxxx'): barxxx
SELECT TRIM(BOTH 'x' FROM 'xxxbarxxx')\G
*************************** 1. row ***************************
TRIM(BOTH 'x' FROM 'xxxbarxxx'): bar
SELECT TRIM(TRAILING 'xyz' FROM 'barxxyz')\G
*************************** 1. row ***************************
TRIM(TRAILING 'xyz' FROM 'barxxyz'): barx
With SQL_MODE=Oracle not set:
SELECT TRIM(''),TRIM_ORACLE('');
+----------+-----------------+
| TRIM('') | TRIM_ORACLE('') |
+----------+-----------------+
| | NULL |
+----------+-----------------+
With SQL_MODE=Oracle set:
SELECT TRIM(''),TRIM_ORACLE('');
+----------+-----------------+
| TRIM('') | TRIM_ORACLE('') |
+----------+-----------------+
| NULL | NULL |
+----------+-----------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
TRIM_ORACLE
is a synonym for the Oracle mode version of the TRIM function, and is available in all modes.
This page is licensed: CC BY-SA / Gnu FDL
Implicit type conversion takes place when MariaDB is using operands or different types, in order to make the operands compatible.
It is best practice not to rely upon implicit conversion; rather use CAST to explicitly convert types.
If either argument is NULL, the result of the comparison is NULL unless the NULL-safe <=> equality comparison operator is used.
If both arguments are integers, they are compared as integers.
If both arguments are strings, they are compared as strings.
If one argument is decimal and the other argument is decimal or integer, they are compared as decimals.
If one argument is decimal and the other argument is a floating point, they are compared as floating point values.
If one argument is string and the other argument is integer, they are compared as decimals.
If a hexadecimal argument is not compared to a number, it is treated as a binary string.
If a constant is compared to a TIMESTAMP or DATETIME, the constant is converted to a timestamp, unless used as an argument to the IN function.
In other cases, arguments are compared as floating point, or real, numbers.
Note that if a string column is being compared with a numeric value, MariaDB will not use the index on the column, as there are numerous alternatives that may evaluate as equal (see examples below).
Converting a string to a number:
SELECT 15+'15';
+---------+
| 15+'15' |
+---------+
| 30 |
+---------+
Converting a number to a string:
SELECT CONCAT(15,'15');
+-----------------+
| CONCAT(15,'15') |
+-----------------+
| 1515 |
+-----------------+
Floating point number errors:
SELECT '9746718491924563214' = 9746718491924563213;
+---------------------------------------------+
| '9746718491924563214' = 9746718491924563213 |
+---------------------------------------------+
| 1 |
+---------------------------------------------+
Numeric equivalence with strings:
SELECT '5' = 5;
+---------+
| '5' = 5 |
+---------+
| 1 |
+---------+
SELECT ' 5' = 5;
+------------+
| ' 5' = 5 |
+------------+
| 1 |
+------------+
SELECT ' 5 ' = 5;
+--------------+
| ' 5 ' = 5 |
+--------------+
| 1 |
+--------------+
1 row in set, 1 warning (0.000 sec)
SHOW WARNINGS;
+-------+------+--------------------------------------------+
| Level | Code | Message |
+-------+------+--------------------------------------------+
| Note | 1292 | Truncated incorrect DOUBLE value: ' 5 ' |
+-------+------+--------------------------------------------+
As a result of the above, MariaDB cannot use the index when comparing a string with a numeric value in the example below:
CREATE TABLE t (a VARCHAR(10), b VARCHAR(10), INDEX idx_a (a));
INSERT INTO t VALUES
('1', '1'), ('2', '2'), ('3', '3'),
('4', '4'), ('5', '5'), ('1', '5');
EXPLAIN SELECT * FROM t WHERE a = '3' \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t
type: ref
possible_keys: idx_a
key: idx_a
key_len: 13
ref: const
rows: 1
Extra: Using index condition
EXPLAIN SELECT * FROM t WHERE a = 3 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t
type: ALL
possible_keys: idx_a
key: NULL
key_len: NULL
ref: NULL
rows: 6
Extra: Using where
Implicit type conversion also takes place on dyadic arithmetic operations (+,-,*,/). MariaDB chooses the minimum data type that is guaranteed to fit the result and converts both arguments to the result data type.
For addition (+), subtraction (-) and multiplication (*), the result data type is chosen as follows:
If either of the arguments is an approximate number (float, double), the result is double.
If either of the arguments is a string (char, varchar, text), the result is double.
If either of the arguments is a decimal number, the result is decimal.
If either of the arguments is of a temporal type with a non-zero fractional second precision (time(N), datetime(N), timestamp(N)), the result is decimal.
If either of the arguments is of a temporal type with a zero fractional second precision (time(0), date, datetime(0), timestamp(0)), the result may vary between int, int unsigned, bigint or bigint unsigned, depending on the exact data type combination.
If both arguments are integer numbers (tinyint, smallint, mediumint, bigint), the result may vary between int, int unsigned, bigint or bigint unsigned, depending of the exact data types and their signs.
For division (/), the result data type is chosen as follows:
If either of the arguments is an approximate number (float, double), the result is double.
If either of the arguments is a string (char, varchar, text), the result is double.
Otherwise, the result is decimal.
Note, the above rules mean that when an argument of a temporal data type appears in addition or subtraction, it's treated as a number by default.
SELECT TIME'10:20:30' + 1;
+--------------------+
| TIME'10:20:30' + 1 |
+--------------------+
| 102031 |
+--------------------+
In order to do temporal addition or subtraction instead, use the DATE_ADD() or DATE_SUB() functions, or an INTERVAL expression as the second argument:
SELECT TIME'10:20:30' + INTERVAL 1 SECOND;
+------------------------------------+
| TIME'10:20:30' + INTERVAL 1 SECOND |
+------------------------------------+
| 10:20:31 |
+------------------------------------+
SELECT "2.2" + 3;
+-----------+
| "2.2" + 3 |
+-----------+
| 5.2 |
+-----------+
SELECT 2.2 + 3;
+---------+
| 2.2 + 3 |
+---------+
| 5.2 |
+---------+
SELECT 2.2 / 3;
+---------+
| 2.2 / 3 |
+---------+
| 0.73333 |
+---------+
SELECT "2.2" / 3;
+--------------------+
| "2.2" / 3 |
+--------------------+
| 0.7333333333333334 |
+--------------------+
This page is licensed: CC BY-SA / Gnu FDL
UCASE(str)
UCASE()
is a synonym for UPPER().
This page is licensed: GPLv2, originally from fill_help_tables.sql
UNHEX(str)
Performs the inverse operation of HEX(str). That is, it interprets each pair of hexadecimal digits in the argument as a number and converts it to the character represented by the number. The resulting characters are returned as a binary string.
If str
is NULL
, UNHEX()
returns NULL
.
SELECT HEX('MariaDB');
+----------------+
| HEX('MariaDB') |
+----------------+
| 4D617269614442 |
+----------------+
SELECT UNHEX('4D617269614442');
+-------------------------+
| UNHEX('4D617269614442') |
+-------------------------+
| MariaDB |
+-------------------------+
SELECT 0x4D617269614442;
+------------------+
| 0x4D617269614442 |
+------------------+
| MariaDB |
+------------------+
SELECT UNHEX(HEX('string'));
+----------------------+
| UNHEX(HEX('string')) |
+----------------------+
| string |
+----------------------+
SELECT HEX(UNHEX('1267'));
+--------------------+
| HEX(UNHEX('1267')) |
+--------------------+
| 1267 |
+--------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
UpdateXML(xml_target, xpath_expr, new_xml)
This function replaces a single portion of a given fragment of XML markupxml_target
with a new XML fragment new_xml
, and then returns the changed XML. The portion of xml_target
that is replaced matches an XPath expression xpath_expr
supplied by the user. If no expression matching xpath_expr
is found, or if multiple matches are found, the function returns the original xml_target
XML fragment. All three arguments should be strings.
SELECT
UpdateXML('<a><b>ccc</b><d></d></a>', '/a', '<e>fff</e>') AS val1,
UpdateXML('<a><b>ccc</b><d></d></a>', '/b', '<e>fff</e>') AS val2,
UpdateXML('<a><b>ccc</b><d></d></a>', '//b', '<e>fff</e>') AS val3,
UpdateXML('<a><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val4,
UpdateXML('<a><d></d><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val5
\G
*************************** 1. row ***************************
val1: <e>fff</e>
val2: <a><b>ccc</b><d></d></a>
val3: <a><e>fff</e><d></d></a>
val4: <a><b>ccc</b><e>fff</e></a>
val5: <a><d></d><b>ccc</b><d></d></a>
1 row in set (0.00 sec)
This page is licensed: GPLv2, originally from fill_help_tables.sql
UPPER(str)
UCASE(str)
Returns the string str
with all characters changed to uppercase according to the current character set mapping. The default is latin1 (cp1252 West European).
UCASE
is a synonym.
SELECT UPPER(surname), givenname FROM users ORDER BY surname;
+----------------+------------+
| UPPER(surname) | givenname |
+----------------+------------+
| ABEL | Jacinto |
| CASTRO | Robert |
| COSTA | Phestos |
| MOSCHELLA | Hippolytos |
+----------------+------------+
UPPER()
is ineffective when applied to binary strings (BINARY, VARBINARY, BLOB). The description of LOWER() shows how to perform lettercase conversion of binary strings.
This page is licensed: GPLv2, originally from fill_help_tables.sql
WEIGHT_STRING(str [AS {CHAR|BINARY}(N)] [LEVEL levels] [flags])
levels: N [ASC|DESC|REVERSE] [, N [ASC|DESC|REVERSE]] ...
Returns a binary string representing the string's sorting and comparison value. A string with a lower result means that for sorting purposes the string appears before a string with a higher result.
WEIGHT_STRING()
is particularly useful when adding new collations, for testing purposes.
If str
is a non-binary string (CHAR, VARCHAR or TEXT), WEIGHT_STRING returns the string's collation weight. If str
is a binary string (BINARY, VARBINARY or BLOB), the return value is simply the input value, since the weight for each byte in a binary string is the byte value.
WEIGHT_STRING()
returns NULL
if given a NULL
input.
The optional AS clause permits casting the input string to a binary or non-binary string, as well as to a particular length.
AS BINARY(N)
measures the length in bytes rather than characters, and right pads with 0x00 bytes to the desired length.
AS CHAR(N)
measures the length in characters, and right pads with spaces to the desired length.
N has a minimum value of 1, and if it is less than the length of the input string, the string is truncated without warning.
The optional LEVEL
clause specifies that the return value should contain weights for specific collation levels. The levels
specifier can either be a single integer, a comma-separated list of integers, or a range of integers separated by a dash (whitespace is ignored). Integers can range from 1 to a maximum of 6, dependent on the collation, and need to be listed in ascending order.
If the LEVEL
clause is no provided, a default of 1 to the maximum for the collation is assumed.
If the q
is specified without using a range, an optional modifier is permitted.
ASC
, the default, returns the weights without any modification.
DESC
returns bitwise-inverted weights.
REVERSE
returns the weights in reverse order.
The examples below use the HEX() function to represent non-printable results in hexadecimal format.
SELECT HEX(WEIGHT_STRING('x'));
+-------------------------+
| HEX(WEIGHT_STRING('x')) |
+-------------------------+
| 0058 |
+-------------------------+
SELECT HEX(WEIGHT_STRING('x' AS BINARY(4)));
+--------------------------------------+
| HEX(WEIGHT_STRING('x' AS BINARY(4))) |
+--------------------------------------+
| 78000000 |
+--------------------------------------+
SELECT HEX(WEIGHT_STRING('x' AS CHAR(4)));
+------------------------------------+
| HEX(WEIGHT_STRING('x' AS CHAR(4))) |
+------------------------------------+
| 0058002000200020 |
+------------------------------------+
SELECT HEX(WEIGHT_STRING(0xaa22ee LEVEL 1));
+--------------------------------------+
| HEX(WEIGHT_STRING(0xaa22ee LEVEL 1)) |
+--------------------------------------+
| AA22EE |
+--------------------------------------+
SELECT HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 DESC));
+-------------------------------------------+
| HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 DESC)) |
+-------------------------------------------+
| 55DD11 |
+-------------------------------------------+
SELECT HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 REVERSE));
+----------------------------------------------+
| HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 REVERSE)) |
+----------------------------------------------+
| EE22AA |
+----------------------------------------------+
This page is licensed: CC BY-SA / Gnu FDL
Learn about regular expression functions in MariaDB Server. This section details SQL functions for powerful pattern matching and manipulation of string data using regular expressions.
Regular Expressions allow MariaDB to perform complex pattern matching on a string. In many cases, the simple pattern matching provided by LIKE is sufficient. LIKE
performs two kinds of matches:
_
- the underscore, matching a single character
%
- the percentage sign, matching any number of characters.
In other cases you may need more control over the returned matches, and will need to use regular expressions.
Regular expression matches are performed with the REGEXP function. RLIKE
is a synonym for REGEXP
.
Comparisons are performed on the byte value, so characters that are treated as equivalent by a collation, but do not have the same byte-value, such as accented characters, could evaluate as unequal.
Without any special characters, a regular expression match is true if the characters match. The match is case-insensitive, except in the case of BINARY strings.
SELECT 'Maria' REGEXP 'Maria';
+------------------------+
| 'Maria' REGEXP 'Maria' |
+------------------------+
| 1 |
+------------------------+
SELECT 'Maria' REGEXP 'maria';
+------------------------+
| 'Maria' REGEXP 'maria' |
+------------------------+
| 1 |
+------------------------+
SELECT BINARY 'Maria' REGEXP 'maria';
+-------------------------------+
| BINARY 'Maria' REGEXP 'maria' |
+-------------------------------+
| 0 |
+-------------------------------+
Note that the word being matched must match the whole pattern:
SELECT 'Maria' REGEXP 'Mari';
+-----------------------+
| 'Maria' REGEXP 'Mari' |
+-----------------------+
| 1 |
+-----------------------+
SELECT 'Mari' REGEXP 'Maria';
+-----------------------+
| 'Mari' REGEXP 'Maria' |
+-----------------------+
| 0 |
+-----------------------+
The first returns true because the pattern "Mari" exists in the expression "Maria". When the order is reversed, the result is false, as the pattern "Maria" does not exist in the expression "Mari"
A match can be performed against more than one word with the |
character. For example:
SELECT 'Maria' REGEXP 'Monty|Maria';
+------------------------------+
| 'Maria' REGEXP 'Monty|Maria' |
+------------------------------+
| 1 |
+------------------------------+
The above examples introduce the syntax, but are not very useful on their own. It's the special characters that give regular expressions their power.
^
matches the beginning of a string (inside square brackets it can also mean NOT
- see below):
SELECT 'Maria' REGEXP '^Ma';
+----------------------+
| 'Maria' REGEXP '^Ma' |
+----------------------+
| 1 |
+----------------------+
$
matches the end of a string:
SELECT 'Maria' REGEXP 'ia$';
+----------------------+
| 'Maria' REGEXP 'ia$' |
+----------------------+
| 1 |
+----------------------+
.
matches any single character:
SELECT 'Maria' REGEXP 'Ma.ia';
+------------------------+
| 'Maria' REGEXP 'Ma.ia' |
+------------------------+
| 1 |
+------------------------+
SELECT 'Maria' REGEXP 'Ma..ia';
+-------------------------+
| 'Maria' REGEXP 'Ma..ia' |
+-------------------------+
| 0 |
+-------------------------+
x*
matches zero or more of a character x
. In the examples below, it's the r
character.
SELECT 'Maria' REGEXP 'Mar*ia';
+-------------------------+
| 'Maria' REGEXP 'Mar*ia' |
+-------------------------+
| 1 |
+-------------------------+
SELECT 'Maia' REGEXP 'Mar*ia';
+------------------------+
| 'Maia' REGEXP 'Mar*ia' |
+------------------------+
| 1 |
+------------------------+
SELECT 'Marrria' REGEXP 'Mar*ia';
+---------------------------+
| 'Marrria' REGEXP 'Mar*ia' |
+---------------------------+
| 1 |
+---------------------------+
x+
matches one or more of a character x
. In the examples below, it's the r
character.
SELECT 'Maria' REGEXP 'Mar+ia';
+-------------------------+
| 'Maria' REGEXP 'Mar+ia' |
+-------------------------+
| 1 |
+-------------------------+
SELECT 'Maia' REGEXP 'Mar+ia';
+------------------------+
| 'Maia' REGEXP 'Mar+ia' |
+------------------------+
| 0 |
+------------------------+
SELECT 'Marrria' REGEXP 'Mar+ia';
+---------------------------+
| 'Marrria' REGEXP 'Mar+ia' |
+---------------------------+
| 1 |
+---------------------------+
x?
matches zero or one of a character x
. In the examples below, it's the r
character.
SELECT 'Maria' REGEXP 'Mar?ia';
+-------------------------+
| 'Maria' REGEXP 'Mar?ia' |
+-------------------------+
| 1 |
+-------------------------+
SELECT 'Maia' REGEXP 'Mar?ia';
+------------------------+
| 'Maia' REGEXP 'Mar?ia' |
+------------------------+
| 1 |
+------------------------+
SELECT 'Marrria' REGEXP 'Mar?ia';
+---------------------------+
| 'Marrria' REGEXP 'Mar?ia' |
+---------------------------+
| 0 |
+---------------------------+
(xyz)
- combine a sequence, for example (xyz)+
or (xyz)*
SELECT 'Maria' REGEXP '(ari)+';
+-------------------------+
| 'Maria' REGEXP '(ari)+' |
+-------------------------+
| 1 |
+-------------------------+
x{n}
and x{m,n}
This notation is used to match many instances of the x
. In the case of x{n}
the match must be exactly that many times. In the case of x{m,n}
, the match can occur from m
to n
times. For example, to match zero or one instance of the string ari
(which is identical to (ari)?
), the following can be used:
SELECT 'Maria' REGEXP '(ari){0,1}';
+-----------------------------+
| 'Maria' REGEXP '(ari){0,1}' |
+-----------------------------+
| 1 |
+-----------------------------+
[xy]
groups characters for matching purposes. For example, to match either the p
or the r
character:
SELECT 'Maria' REGEXP 'Ma[pr]ia';
+---------------------------+
| 'Maria' REGEXP 'Ma[pr]ia' |
+---------------------------+
| 1 |
+---------------------------+
The square brackets also permit a range match, for example, to match any character from a-z, [a-z]
is used. Numeric ranges are also permitted.
SELECT 'Maria' REGEXP 'Ma[a-z]ia';
+----------------------------+
| 'Maria' REGEXP 'Ma[a-z]ia' |
+----------------------------+
| 1 |
+----------------------------+
The following does not match, as r
falls outside of the range a-p
.
SELECT 'Maria' REGEXP 'Ma[a-p]ia';
+----------------------------+
| 'Maria' REGEXP 'Ma[a-p]ia' |
+----------------------------+
| 0 |
+----------------------------+
^
The ^
character means does NOT
match, for example:
SELECT 'Maria' REGEXP 'Ma[^p]ia';
+---------------------------+
| 'Maria' REGEXP 'Ma[^p]ia' |
+---------------------------+
| 1 |
+---------------------------+
SELECT 'Maria' REGEXP 'Ma[^r]ia';
+---------------------------+
| 'Maria' REGEXP 'Ma[^r]ia' |
+---------------------------+
| 0 |
+---------------------------+
The [
and ]
characters on their own can be literally matched inside a []
block, without escaping, as long as they immediately match the opening bracket:
SELECT '[Maria' REGEXP '[[]';
+-----------------------+
| '[Maria' REGEXP '[[]' |
+-----------------------+
| 1 |
+-----------------------+
SELECT '[Maria' REGEXP '[]]';
+-----------------------+
| '[Maria' REGEXP '[]]' |
+-----------------------+
| 0 |
+-----------------------+
SELECT ']Maria' REGEXP '[]]';
+-----------------------+
| ']Maria' REGEXP '[]]' |
+-----------------------+
| 1 |
+-----------------------+
SELECT ']Maria' REGEXP '[]a]';
+------------------------+
| ']Maria' REGEXP '[]a]' |
+------------------------+
| 1 |
+------------------------+
Incorrect order, so no match:
SELECT ']Maria' REGEXP '[a]]';
+------------------------+
| ']Maria' REGEXP '[a]]' |
+------------------------+
| 0 |
+------------------------+
The -
character can also be matched in the same way:
SELECT '-Maria' REGEXP '[1-10]';
+--------------------------+
| '-Maria' REGEXP '[1-10]' |
+--------------------------+
| 0 |
+--------------------------+
SELECT '-Maria' REGEXP '[-1-10]';
+---------------------------+
| '-Maria' REGEXP '[-1-10]' |
+---------------------------+
| 1 |
+---------------------------+
The :<: and :>: patterns match the beginning and the end of a word respectively. For example:
SELECT 'How do I upgrade MariaDB?' REGEXP '[[:<:]]MariaDB[[:>:]]';
+------------------------------------------------------------+
| 'How do I upgrade MariaDB?' REGEXP '[[:<:]]MariaDB[[:>:]]' |
+------------------------------------------------------------+
| 1 |
+------------------------------------------------------------+
SELECT 'How do I upgrade MariaDB?' REGEXP '[[:<:]]Maria[[:>:]]';
+----------------------------------------------------------+
| 'How do I upgrade MariaDB?' REGEXP '[[:<:]]Maria[[:>:]]' |
+----------------------------------------------------------+
| 0 |
+----------------------------------------------------------+
There are a number of shortcuts to match particular preset character classes. These are matched with the [:character_class:]
pattern (inside a []
set). The following character classes exist:
alnum
Alphanumeric
alpha
Alphabetic
blank
Whitespace
cntrl
Control characters
digit
Digits
graph
Graphic characters
lower
Lowercase alphabetic
Graphic or space characters
punct
Punctuation
space
Space, tab, newline, and carriage return
upper
Uppercase alphabetic
xdigit
Hexadecimal digit
For example:
SELECT 'Maria' REGEXP 'Mar[[:alnum:]]*';
+--------------------------------+
| 'Maria' REGEXP 'Mar[:alnum:]*' |
+--------------------------------+
| 1 |
+--------------------------------+
Remember that matches are by default case-insensitive, unless a binary string is used, so the following example, specifically looking for an uppercase, counter-intuitively matches a lowercase character:
SELECT 'Mari' REGEXP 'Mar[[:upper:]]+';
+---------------------------------+
| 'Mari' REGEXP 'Mar[[:upper:]]+' |
+---------------------------------+
| 1 |
+---------------------------------+
SELECT BINARY 'Mari' REGEXP 'Mar[[:upper:]]+';
+----------------------------------------+
| BINARY 'Mari' REGEXP 'Mar[[:upper:]]+' |
+----------------------------------------+
| 0 |
+----------------------------------------+
There are also number of shortcuts to match particular preset character names. These are matched with the [.character.]
pattern (inside a []
set). The following character classes exist:
NUL
0
SOH
001
STX
002
ETX
003
EOT
004
ENQ
005
ACK
006
BEL
007
alert
007
BS
010
backspace
'\b'
HT
011
tab
'\t'
LF
012
newline
'\n'
VT
013
vertical-tab
'\v'
FF
014
form-feed
'\f'
CR
015
carriage-return
'\r'
SO
016
SI
017
DLE
020
DC1
021
DC2
022
DC3
023
DC4
024
NAK
025
SYN
026
ETB
027
CAN
030
EM
031
SUB
032
ESC
033
IS4
034
FS
034
IS3
035
GS
035
IS2
036
RS
036
IS1
037
US
037
space
' '
exclamation-mark
'!'
quotation-mark
'"'
number-sign
'#'
dollar-sign
'$'
percent-sign
'%'
ampersand
'&'
apostrophe
'''
left-parenthesis
'('
right-parenthesis
')'
asterisk
'*'
plus-sign
'+'
comma
','
hyphen
'-'
hyphen-minus
'-'
period
'.'
full-stop
'.'
slash
'/'
solidus
'/'
zero
'0'
one
'1'
two
'2'
three
'3'
four
'4'
five
'5'
six
'6'
seven
'7'
eight
'8'
nine
'9'
colon
':'
semicolon
';'
less-than-sign
'<'
equals-sign
'='
greater-than-sign
'>'
question-mark
'?'
commercial-at
'@'
left-square-bracket
'['
backslash
''
reverse-solidus
''
right-square-bracket
']'
circumflex
'^'
circumflex-accent
'^'
underscore
'_'
low-line
'_'
grave-accent
'`'
left-brace
'{'
left-curly-bracket
'{'
vertical-line
'
right-brace
'}'
right-curly-bracket
'}'
tilde
''
DEL
177
For example:
SELECT '|' REGEXP '[[.vertical-line.]]';
+----------------------------------+
| '|' REGEXP '[[.vertical-line.]]' |
+----------------------------------+
| 1 |
+----------------------------------+
The true power of regular expressions is unleashed when the above is combined, to form more complex examples. Regular expression's reputation for complexity stems from the seeming complexity of multiple combined regular expressions, when in reality, it's simply a matter of understanding the characters and how they apply:
The first example fails to match, as while the Ma
matches, either i
or r
only matches once before the ia
characters at the end.
SELECT 'Maria' REGEXP 'Ma[ir]{2}ia';
+------------------------------+
| 'Maria' REGEXP 'Ma[ir]{2}ia' |
+------------------------------+
| 0 |
+------------------------------+
This example matches, as either i
or r
match exactly twice after the Ma
, in this case one r
and one i
.
SELECT 'Maria' REGEXP 'Ma[ir]{2}';
+----------------------------+
| 'Maria' REGEXP 'Ma[ir]{2}' |
+----------------------------+
| 1 |
+----------------------------+
With the large number of special characters, care needs to be taken to properly escape characters. Two backslash characters, `` (one for the MariaDB parser, one for the regex library), are required to properly escape a character. For example:
To match the literal (Ma
:
SELECT '(Maria)' REGEXP '(Ma';
ERROR 1139 (42000): Got error 'parentheses not balanced' from regexp
SELECT '(Maria)' REGEXP '\(Ma';
ERROR 1139 (42000): Got error 'parentheses not balanced' from regexp
SELECT '(Maria)' REGEXP '\\(Ma';
+--------------------------+
| '(Maria)' REGEXP '\\(Ma' |
+--------------------------+
| 1 |
+--------------------------+
To match r+
: The first two examples are incorrect, as they match r
one or more times, not r+
:
SELECT 'Mar+ia' REGEXP 'r+';
+----------------------+
| 'Mar+ia' REGEXP 'r+' |
+----------------------+
| 1 |
+----------------------+
SELECT 'Maria' REGEXP 'r+';
+---------------------+
| 'Maria' REGEXP 'r+' |
+---------------------+
| 1 |
+---------------------+
SELECT 'Maria' REGEXP 'r\\+';
+-----------------------+
| 'Maria' REGEXP 'r\\+' |
+-----------------------+
| 0 |
+-----------------------+
SELECT 'Maria' REGEXP 'r+';
+---------------------+
| 'Maria' REGEXP 'r+' |
+---------------------+
| 1 |
+---------------------+
This page is licensed: CC BY-SA / Gnu FDL
MariaDB uses the PCRE library, which significantly improves the power of the REGEXP/RLIKE operator.
The switch to PCRE added a number of features, including recursive patterns, named capture, look-ahead and look-behind assertions, non-capturing groups, non-greedy quantifiers, Unicode character properties, extended syntax for characters and character classes, multi-line matching, and many other.
These functions work with regular expressions: REGEXP_REPLACE(), REGEXP_INSTR(), and REGEXP_SUBSTR().
Also, REGEXP/RLIKE
, and the new functions, work correctly with all multi-byte character sets supported by MariaDB, including East-Asian character sets (big5, gb2313, gbk, eucjp, eucjpms, cp932, ujis, euckr), and Unicode character sets (utf8, utf8mb4, ucs2, utf16, utf16le, utf32).
REGEXP_REPLACE(subject, pattern, replace) - Replaces all occurrences of a pattern.
REGEXP_INSTR(subject, pattern) - Position of the first appearance of a regex .
REGEXP_SUBSTR(subject,pattern) - Returns the matching part of a string.
See the individual articles for more details and examples.
In most cases PCRE is backward compatible with the old POSIX 1003.2 compliant regexp library (see Regular Expressions Overview), so you won't need to change your applications that use SQL queries with the REGEXP/RLIKE predicate.
This section briefly describes the most important extended PCRE features. For more details please refer to the documentation on the PCRE site, or to the documentation which is bundled in the /pcre/doc/html/ directory of a MariaDB sources distribution. The pages pcresyntax.html and pcrepattern.html should be a good start. Regular-Expressions.Info is another good resource to learn about PCRE and regular expressions generally.
PCRE supports the following escape sequences to match special characters:
\a
0x07 (BEL)
\cx
"control-x", where x is any ASCII character
\e
0x1B (escape)
\f
0x0C (form feed)
0x0A (newline)
0x0D (carriage return)
0x09 (TAB)
\ddd
character with octal code ddd
\xhh
character with hex code hh
\x{hhh..}
character with hex code hhh..
Note, the backslash characters (here, and in all examples in the sections below) must be escaped with another backslash, unless you're using the SQL_MODE NO_BACKSLASH_ESCAPES
.
This example tests if a character has hex code 0x61:
SELECT 'a' RLIKE '\\x{61}';
-> 1
PCRE supports the standard POSIX character classes such as alnum
, alpha
, blank
, cntrl
, digit
, graph
, lower
, print
, punct
, space
, upper
, xdigit
, with the following additional classes:
ascii
any ASCII character (0x00..0x7F)
word
any "word" character (a letter, a digit, or an underscore)
This example checks if the string consists of ASCII characters only:
SELECT 'abc' RLIKE '^[[:ascii:]]+$';
-> 1
Generic character types complement the POSIX character classes and serve to simplify writing patterns:
\d
a decimal digit (same as [:digit:])
\D
a character that is not a decimal digit
\h
a horizontal white space character
\H
a character that is not a horizontal white space character
a character that is not a new line
a newline sequence
\s
a white space character
\S
a character that is not a white space character
\v
a vertical white space character
\V
a character that is not a vertical white space character
\w
a "word" character (same as [:word:])
\W
a "non-word" character
This example checks if the string consists of "word" characters only:
SELECT 'abc' RLIKE '^\\w+$';
-> 1
\p{xx}
is a character with the xx
property, and \P{xx}
is a character without the xx
property.
The property names represented by xx
above are limited to the Unicode script names, the general category properties, and "Any", which matches any character (including newline). Those that are not part of an identified script are lumped together as "Common".
General Category Properties For \p and \P
C
Other
Cc
Control
Cf
Format
Cn
Unassigned
Co
Private use
Cs
Surrogate
L
Letter
Ll
Lower case letter
Lm
Modifier letter
Lo
Other letter
Lt
Title case letter
Lu
Upper case letter
L&
Ll, Lu, or Lt
M
Mark
Mc
Spacing mark
Me
Enclosing mark
Mn
Non-spacing mark
N
Number
Nd
Decimal number
Nl
Letter number
No
Other number
P
Punctuation
Pc
Connector punctuation
Pd
Dash punctuation
Pe
Close punctuation
Pf
Final punctuation
Pi
Initial punctuation
Po
Other punctuation
Ps
Open punctuation
S
Symbol
Sc
Currency symbol
Sk
Modifier symbol
Sm
Mathematical symbol
So
Other symbol
Z
Separator
Zl
Line separator
Zp
Paragraph separator
Zs
Space separator
This example checks if the string consists only of characters with property N (number):
SELECT '1¼①' RLIKE '^\\p{N}+$';
-> 1
Special Category Properties For \p and \P
Xan
Alphanumeric: union of properties L and N
Xps
POSIX space: property Z or tab, NL, VT, FF, CR
Xsp
Perl space: property Z or tab, NL, FF, CR
Xuc
A character than can be represented by a Universal Character Name
Xwd
Perl word: property Xan or underscore
The property Xuc
matches any character that can be represented by a Universal Character Name (in C++ and other programming languages). These include $
, @
, ```, and all characters with Unicode code points greater than U+00A0
, excluding the surrogates U+D800
..U+DFFF
.
Script Names For \p and \P
Arabic, Armenian, Avestan, Balinese, Bamum, Batak, Bengali, Bopomofo, Brahmi, Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Chakma, Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic, Deseret, Devanagari, Egyptian_Hieroglyphs, Ethiopic, Georgian, Glagolitic, Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew, Hiragana, Imperial_Aramaic, Inherited, Inscriptional_Pahlavi, Inscriptional_Parthian, Javanese, Kaithi, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer, Lao, Latin, Lepcha, Limbu, Linear_B, Lisu, Lycian, Lydian, Malayalam, Mandaic, Meetei_Mayek, Meroitic_Cursive, Meroitic_Hieroglyphs, Miao, Mongolian, Myanmar, New_Tai_Lue, Nko, Ogham, Old_Italic, Old_Persian, Old_South_Arabian, Old_Turkic, Ol_Chiki, Oriya, Osmanya, Phags_Pa, Phoenician, Rejang, Runic, Samaritan, Saurashtra, Sharada, Shavian, Sinhala, Sora_Sompeng, Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa, Tai_Le, Tai_Tham, Tai_Viet, Takri, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh, Ugaritic, Vai, Yi.
This example checks if the string consists only of Greek characters:
SELECT 'ΣΦΩ' RLIKE '^\\p{Greek}+$';
-> 1
The \X
escape sequence matches a character sequence that makes an "extended grapheme cluster", i.e. a composite character that consists of multiple Unicode code points.
One of the examples of a composite character can be a letter followed by non-spacing accent marks. This example demonstrates that U+0045 LATIN CAPITAL LETTER E
followed by U+0302 COMBINING CIRCUMFLEX ACCENT
followed by U+0323 COMBINING DOT BELOW
together form an extended grapheme cluster:
SELECT _ucs2 0x004503020323 RLIKE '^\\X$';
-> 1
See the PCRE documentation for the other types of extended grapheme clusters.
An assertion specifies a certain condition that must match at a particular point, but without consuming characters from the subject string. In addition to the standard POSIX simple assertions ^
(that matches at the beginning of a line) and $
(that matches at the end of a line), PCRE supports a number of other assertions:
\b
matches at a word boundary
\B
matches when not at a word boundary
\A
matches at the start of the subject
\Z
matches at the end of the subject, also matches before a newline at the end of the subject
\z
matches only at the end of the subject
\G
matches at the first matching position in the subject
This example cuts a word that consists only of 3 characters from a string:
SELECT REGEXP_SUBSTR('---abcd---xyz---', '\\b\\w{3}\\b');
-> xyz
Notice that the two \b
assertions checked the word boundaries but did not get into the matching pattern.
The \b
assertions work well in the beginning and the end of the subject string:
SELECT REGEXP_SUBSTR('xyz', '\\b\\w{3}\\b');
-> xyz
By default, the ^
and $
assertions have the same meaning with \A
, \Z
, and \z
. However, the meanings of ^
and $
can change in multiline mode (see below). By contrast, the meanings of \A
, \Z
, and \z
are always the same; they are independent of the multiline mode.
A number of options that control the default match behavior can be changed within the pattern by a sequence of option letters enclosed between (?
and )
.
(?i)
case insensitive match
(?m)
multiline mode
(?s)
dotall mode (dot matches newline characters)
(?x)
extended (ignore white space)
(?U)
ungreedy (lazy) match
(?J)
allow named subpatterns with duplicate names
(?X)
extra PCRE functionality (e.g. force error on unknown escaped character)
(?-...)
unset option(s)
For example, (?im)
sets case insensitive multiline matching.
A hyphen followed by the option letters unset the options. For example, (?-im)
means case sensitive single line match.
A combined setting and unsetting is also possible, e.g. (?im-sx)
.
If an option is set outside of subpattern parentheses, the option applies to the remainder of the pattern that follows the option. If an option is set inside a subpattern, it applies to the part of this subpattern that follows the option.
In this example the pattern (?i)m((?-i)aria)db
matches the words MariaDB
, Mariadb
, mariadb
, but not MARIADB
:
SELECT 'MariaDB' RLIKE '(?i)m((?-i)aria)db';
-> 1
SELECT 'mariaDB' RLIKE '(?i)m((?-i)aria)db';
-> 1
SELECT 'Mariadb' RLIKE '(?i)m((?-i)aria)db';
-> 1
SELECT 'MARIADB' RLIKE '(?i)m((?-i)aria)db';
-> 0
This example demonstrates that the (?x)
option makes the regexp engine ignore all white spaces in the pattern (other than in a class).
SELECT 'ab' RLIKE '(?x)a b';
-> 1
Note, putting spaces into a pattern in combination with the (?x) option can be useful to split different logical parts of a complex pattern, to make it more readable.
Multiline matching changes the meaning of ^
and $
from "the beginning of the subject string" and "the end of the subject string" to "the beginning of any line in the subject string" and "the end of any line in the subject string" respectively.
This example checks if the subject string contains two consequent lines that fully consist of digits:
SELECT 'abc\n123\n456\nxyz\n' RLIKE '(?m)^\\d+\\R\\d+$';
-> 1
Notice the (?m)
option in the beginning of the pattern, which switches to the multiline matching mode.
PCRE supports five line break conventions:
CR (\r)
- a single carriage return character
LF (\n)
- a single linefeed character
CRLF (\r\n)
- a carriage return followed by a linefeed
any of the previous three
any Unicode newline sequence
By default, the newline convention is set to any Unicode newline sequence, which includes:
LF
(U+000A, carriage return)
CR
(U+000D, carriage return)
CRLF
(a carriage return followed by a linefeed)
VT
(U+000B, vertical tab)
FF
(U+000C, form feed)
NEL
(U+0085, next line)
LS
(U+2028, line separator)
PS
(U+2029, paragraph separator)
The newline convention can be set by starting a pattern with one of the following sequences:
(*CR)
carriage return
(*LF)
linefeed
(*CRLF)
carriage return followed by linefeed
(*ANYCRLF)
any of the previous three
(*ANY)
all Unicode newline sequences
The newline conversion affects the ^
and $
assertions, the interpretation of the dot metacharacter, and the behavior of .
Note, the new line convention does not affect the meaning of .
This example demonstrates that the dot metacharacter matches , because it is not a newline sequence anymore:
SELECT 'a\nb' RLIKE '(*CR)a.b';
-> 1
By default, the escape sequence matches any Unicode newline sequences.
The meaning of can be set by starting a pattern with one of the following sequences:
(*BSR_ANYCRLF)
any of CR, LF or CRLF
(*BSR_UNICODE)
any Unicode newline sequence
It's possible to include comments inside a pattern. Comments do not participate in the pattern matching. Comments start at the (?
SELECT 'ab12' RLIKE 'ab(?#expect digits)12';
-> 1
POSIX uses the backslash to remove a special meaning from a character. PCRE introduces a syntax to remove special meaning from a sequence of characters. The characters inside \Q
... \E
are treated literally, without their special meaning.
This example checks if the string matches a dollar sign followed by a parenthesized name (a variable reference in some languages):
SELECT '$(abc)' RLIKE '^\\Q$(\\E\\w+\\Q)\\E$';
-> 1
Note that the leftmost dollar sign and the parentheses are used literally, while the rightmost dollar sign is still used to match the end of the string.
The escape sequence \K
causes any previously matched characters to be excluded from the final matched sequence. For example, the pattern: (foo)\Kbar
matches foobar
, but reports that it has matched bar
. This feature is similar to a look-behind assertion. However, in this case, the part of the subject before the real match does not have to be of fixed length:
SELECT REGEXP_SUBSTR('aaa123', '[a-z]*\\K[0-9]*');
-> 123
The question mark and the colon after the opening parenthesis create a non-capturing group: (?:...)
.
This example removes an optional article from a word, for example for better sorting of the results.
SELECT REGEXP_REPLACE('The King','(?:the|an|a)[^a-z]([a-z]+)','\\1');
-> King
Note that the articles are listed inside the left parentheses using the alternation operator |
but they do not produce a captured subpattern, so the word followed by the article is referenced by '1'
in the third argument to the function. Using non-capturing groups can be useful to save numbers on the sup-patterns that won't be used in the third argument of REGEXP_REPLACE(), as well as for performance purposes.
By default, the repetition quantifiers ?
, *
, +
and {n,m}
are "greedy", that is, they try to match as much as possible. Adding a question mark after a repetition quantifier makes it "non-greedy", so the pattern matches the minimum number of times possible.
This example cuts C comments from a line:
SELECT REGEXP_REPLACE('/* Comment1 */ i+= 1; /* Comment2 */', '/[*].*?[*]/','');
-> i+= 1;
The pattern without the non-greedy flag to the quantifier /[*].*[*]/
would match the entire string between the leftmost /*
and the rightmost */
.
A sequence inside (?>
...)
makes an atomic group. Backtracking inside an atomic group is prevented once it has matched; however, backtracking past to the previous items works normally.
Consider the pattern \d+foo
applied to the subject string 123bar
. Once the engine scans 123
and fails on the letter b
, it would normally backtrack to 2
and try to match again, then fail and backtrack to 1
and try to match and fail again, and finally fail the entire pattern. In case of an atomic group (?>\d+)foo
with the same subject string 123bar
, the engine gives up immediately after the first failure to match foo
. An atomic group with a quantifier can match all or nothing.
Atomic groups produce faster false results (i.e. in case when a long subject string does not match the pattern), because the regexp engine saves performance on backtracking. However, don't hurry to put everything into atomic groups. This example demonstrates the difference between atomic and non-atomic match:
SELECT 'abcc' RLIKE 'a(?>bc|b)c' AS atomic1;
-> 1
SELECT 'abc' RLIKE 'a(?>bc|b)c' AS atomic2;
-> 0
SELECT 'abcc' RLIKE 'a(bc|b)c' AS non_atomic1;
-> 1
SELECT 'abc' RLIKE 'a(bc|b)c' AS non_atomic2;
-> 1
The non-atomic pattern matches both abbc
and abc
, while the atomic pattern matches abbc
only.
The atomic group (?>bc|b)
in the above example can be "translated" as "if there is bc
, then don't try to match as b
". So b
can match only if bc
is not found.
Atomic groups are not capturing. To make an atomic group capturing, put it into parentheses:
SELECT REGEXP_REPLACE('abcc','a((?>bc|b))c','\\1');
-> bc
An atomic group which ends with a quantifier can be rewritten using a so called "possessive quantifier" syntax by putting an additional +
sign following the quantifier.
The pattern (?>\d+)foo
from the previous section's example can be rewritten as \d++foo
.
Backreferences match the same text as previously matched by a capturing group. Backreferences can be written using:
a backslash followed by a digit
the \g
escape sequence followed by a positive or negative number
the \g
escape sequence followed by a positive or negative number enclosed in braces
The following backreferences are identical and refer to the first capturing group:
\1
\g1
\g{1}
This example demonstrates a pattern that matches "sense and sensibility" and "response and responsibility", but not "sense and responsibility":
SELECT 'sense and sensibility' RLIKE '(sens|respons)e and \\1ibility';
-> 1
This example removes doubled words that can unintentionally creep in when you edit a text in a text editor:
SELECT REGEXP_REPLACE('using using the the regexp regexp',
'\\b(\\w+)\\s+\\1\\b','\\1');
-> USING the regexp
Note that all double words were removed, in the beginning, in the middle and in the end of the subject string.
A negative number in a \g
sequence means a relative reference. Relative references can be helpful in long patterns, and also in patterns that are created by joining fragments together that contain references within themselves. The sequence \g{-1}
is a reference to the most recently started capturing subpattern before \g
.
In this example \g{-1}
is equivalent to \2
:
SELECT 'abc123def123' RLIKE '(abc(123)def)\\g{-1}';
-> 1
SELECT 'abc123def123' RLIKE '(abc(123)def)\\2';
-> 1
Using numeric backreferences for capturing groups can be hard to track in a complicated regular expression. Also, the numbers can change if an expression is modified. To overcome these difficulties, PCRE supports named subpatterns.
A subpattern can be named in one of three ways: (?<name>
...)
or (?'name'
...)
as in Perl, or (?P<name>
...)
as in Python. References to capturing subpatterns from other parts of the pattern, can be made by name as well as by number.
Backreferences to a named subpattern can be written using the .NET syntax \k{name}
, the Perl syntax \k<name>
or \k'name'
or \g{name}
, or the Python syntax (?P=name)
.
This example tests if the string is a correct HTML tag:
SELECT '<a href="../">Up</a>' RLIKE '<(?<tag>[a-z][a-z0-9]*)[^>]*>[^<]*</(?P=tag)>';
-> 1
Look-ahead and look-behind assertions serve to specify the context for the searched regular expression pattern. Note that the assertions only check the context, they do not capture anything themselves!
This example finds the letter which is not followed by another letter (negative look-ahead):
SELECT REGEXP_SUBSTR('ab1','[a-z](?![a-z])');
-> b
This example finds the letter which is followed by a digit (positive look-ahead):
SELECT REGEXP_SUBSTR('ab1','[a-z](?=[0-9])');
-> b
This example finds the letter which does not follow a digit character (negative look-behind):
SELECT REGEXP_SUBSTR('1ab','(?<![0-9])[a-z]');
-> b
This example finds the letter which follows another letter character (positive look-behind):
SELECT REGEXP_SUBSTR('1ab','(?<=[a-z])[a-z]');
-> b
Note that look-behind assertions can only be of fixed length; you cannot have repetition operators or alternations with different lengths:
SELECT 'aaa' RLIKE '(?<=(a|bc))a';
ERROR 1139 (42000): Got error 'lookbehind assertion is not fixed length at offset 10' from regexp
PCRE supports a special syntax to recourse the entire pattern or its individual subpatterns:
(?R)
Recourse the entire pattern
(?n)
call subpattern by absolute number
(?+n)
call subpattern by relative number
(?-n)
call subpattern by relative number
(?&name)
call subpattern by name (Perl)
(?P>name)
call subpattern by name (Python)
\g
call subpattern by name (Oniguruma)
\g'name'
call subpattern by name (Oniguruma)
\g
call subpattern by absolute number (Oniguruma)
\g'n'
call subpattern by absolute number (Oniguruma)
\g<+n>
call subpattern by relative number
\g<-n>
call subpattern by relative number
\g'+n'
call subpattern by relative number
\g'-n'
call subpattern by relative number
This example checks for a correct additive arithmetic expression consisting of numbers, unary plus and minus, binary plus and minus, and parentheses:
SELECT '1+2-3+(+(4-1)+(-2)+(+1))' RLIKE '^(([+-]?(\\d+|[(](?1)[)]))(([+-](?1))*))$';
-> 1
The recursion is done using (?1)
to call for the first parenthesized subpattern, which includes everything except the leading ^
and the trailing $
.
The regular expression in the above example implements the following BNF grammar:
<expression> ::= <term> [(<sign> <term>)...]
<term> ::= [ <sign> ] <primary>
<primary> ::= <number> | <left paren> <expression> <right paren>
<sign> ::= <plus sign> | <minus sign>
Use the (?(DEFINE)
...)
syntax to define subpatterns that can be referenced from elsewhere.
This example defines a subpattern with the name letters
that matches one or more letters, which is further reused two times:
SELECT 'abc123xyz' RLIKE '^(?(DEFINE)(?<letters>[a-z]+))(?&letters)[0-9]+(?&letters)$';
-> 1
The above example can also be rewritten to define the digit part as a subpattern as well:
SELECT 'abc123xyz' RLIKE
'^(?(DEFINE)(?<letters>[a-z]+)(?<digits>[0-9]+))(?&letters)(?&digits)(?&letters)$';
-> 1
There are two forms of conditional subpatterns:
(?(condition)yes-pattern)
(?(condition)yes-pattern|no-pattern)
The yes-pattern
is used if the condition is satisfied, otherwise the no-pattern
(if any) is used.
Conditions With Subpattern References
If a condition consists of a number, it makes a condition with a subpattern reference. Such a condition is true if a capturing subpattern corresponding to the number has previously matched.
This example finds an optionally parenthesized number in a string:
SELECT REGEXP_SUBSTR('a(123)b', '([(])?[0-9]+(?(1)[)])');
-> (123)
The ([(])?
part makes a capturing subpattern that matches an optional opening parenthesis; the [0-9]+
part matches a number, and the (?(1)[)])
part matches a closing parenthesis, but only if the opening parenthesis has been previously found.
Other Kinds of Conditions
The other possible condition kinds are: recursion references and assertions. See the PCRE documentation for details.
PCRE correctly works with zero bytes in the subject strings:
SELECT 'a\0b' RLIKE '^a.b$';
-> 1
Zero bytes, however, are not supported literally in the pattern strings and should be escaped using the \xhh
or \x{hh}
syntax:
SELECT 'a\0b' RLIKE '^a\\x{00}b$';
-> 1
PCRE provides other extended features that were not covered in this document, such as duplicate subpattern numbers, backtracking control, breaking utf-8 sequences into individual bytes, setting the match limit, setting the recursion limit, optimization control, recursion conditions, assertion conditions and more types of extended grapheme clusters. Please refer to thePCRE documentation for details.
Enhanced regex was implemented as a GSoC 2013 project by Sudheera Palihakkara.
The default_regex_flags variable was introduced to address the remaining incompatibilities between PCRE and the old regex library. Here are some examples of its usage:
The default behaviour (multiline match is off)
SELECT 'a\nb\nc' RLIKE '^b$';
+---------------------------+
| '(?m)a\nb\nc' RLIKE '^b$' |
+---------------------------+
| 0 |
+---------------------------+
Enabling the multiline option using the PCRE option syntax:
SELECT 'a\nb\nc' RLIKE '(?m)^b$';
+---------------------------+
| 'a\nb\nc' RLIKE '(?m)^b$' |
+---------------------------+
| 1 |
+---------------------------+
Enabling the miltiline option using default_regex_flags
SET default_regex_flags='MULTILINE';
SELECT 'a\nb\nc' RLIKE '^b$';
+-----------------------+
| 'a\nb\nc' RLIKE '^b$' |
+-----------------------+
| 1 |
+-----------------------+
This page is licensed: CC BY-SA / Gnu FDL
expr REGEXP pat, expr RLIKE pat
Performs a pattern match of a string expression expr
against a patternpat
. The pattern can be an extended regular expression. See Regular Expressions Overview for details on the syntax for
regular expressions (see also PCRE Regular Expressions).
Returns 1
if expr
matches pat
or 0
if it doesn't match. If either expr
or pat
are NULL
, the result is NULL
.
The negative form NOT REGEXP also exists, as an alias for NOT (string REGEXP pattern)
. RLIKE
and NOT RLIKE
are synonyms for REGEXP
and NOT REGEXP
, originally provided for mSQL compatibility.
The pattern need not be a literal string. For example, it can be specified as a string expression or table column.
Note: Because MariaDB uses the C escape syntax in strings (for example, "\n" to represent the newline character), you must double any "" that you use in your REGEXP
strings.
REGEXP
is not case sensitive, except when used with binary strings.
The default_regex_flags variable addresses the remaining compatibilities between PCRE and the old regex library.
SELECT 'Monty!' REGEXP 'm%y%%';
+-------------------------+
| 'Monty!' REGEXP 'm%y%%' |
+-------------------------+
| 0 |
+-------------------------+
SELECT 'Monty!' REGEXP '.*';
+----------------------+
| 'Monty!' REGEXP '.*' |
+----------------------+
| 1 |
+----------------------+
SELECT 'new*\n*line' REGEXP 'new\\*.\\*line';
+---------------------------------------+
| 'new*\n*line' REGEXP 'new\\*.\\*line' |
+---------------------------------------+
| 1 |
+---------------------------------------+
SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A';
+----------------+-----------------------+
| 'a' REGEXP 'A' | 'a' REGEXP BINARY 'A' |
+----------------+-----------------------+
| 1 | 0 |
+----------------+-----------------------+
SELECT 'a' REGEXP '^[a-d]';
+---------------------+
| 'a' REGEXP '^[a-d]' |
+---------------------+
| 1 |
+---------------------+
MariaDB uses the default_regex_flags variable to address the remaining compatibilities between PCRE and the old regex library.
The default behavior (multiline match is off)
SELECT 'a\nb\nc' RLIKE '^b$';
+---------------------------+
| '(?m)a\nb\nc' RLIKE '^b$' |
+---------------------------+
| 0 |
+---------------------------+
Enabling the multiline option using the PCRE option syntax:
SELECT 'a\nb\nc' RLIKE '(?m)^b$';
+---------------------------+
| 'a\nb\nc' RLIKE '(?m)^b$' |
+---------------------------+
| 1 |
+---------------------------+
Enabling the multiline option using default_regex_flags
SET default_regex_flags='MULTILINE';
SELECT 'a\nb\nc' RLIKE '^b$';
+-----------------------+
| 'a\nb\nc' RLIKE '^b$' |
+-----------------------+
| 1 |
+-----------------------+
This page is licensed: GPLv2, originally from fill_help_tables.sql
REGEXP_INSTR(subject, pattern)
Returns the position of the first occurrence of the regular expression pattern
in the string subject
, or 0 if pattern was not found.
The positions start with 1 and are measured in characters (i.e. not in bytes), which is important for multi-byte character sets. You can cast a multi-byte character set to BINARY to get offsets in bytes.
The function follows the case sensitivity rules of the effective collation. Matching is performed case insensitively for case insensitive collations, and case sensitively for case sensitive collations and for binary data.
The collation case sensitivity can be overwritten using the (?i) and (?-i) PCRE flags.
MariaDB uses the PCRE regular expression library for enhanced regular expression performance, and REGEXP_INSTR
was introduced as part of this enhancement.
SELECT REGEXP_INSTR('abc','b');
-> 2
SELECT REGEXP_INSTR('abc','x');
-> 0
SELECT REGEXP_INSTR('BJÖRN','N');
-> 5
Casting a multi-byte character set as BINARY to get offsets in bytes:
SELECT REGEXP_INSTR(BINARY 'BJÖRN','N') AS cast_utf8_to_binary;
-> 6
Case sensitivity:
SELECT REGEXP_INSTR('ABC','b');
-> 2
SELECT REGEXP_INSTR('ABC' COLLATE utf8_bin,'b');
-> 0
SELECT REGEXP_INSTR(BINARY'ABC','b');
-> 0
SELECT REGEXP_INSTR('ABC','(?-i)b');
-> 0
SELECT REGEXP_INSTR('ABC' COLLATE utf8_bin,'(?i)b');
-> 2
This page is licensed: CC BY-SA / Gnu FDL
REGEXP_REPLACE(subject, pattern, replace)
REGEXP_REPLACE
returns the string subject
with all occurrences of the regular expression pattern
replaced by the string replace
. If no occurrences are found, then subject
is returned as is.
The replace string can have backreferences to the subexpressions in the form \N, where N is a number from 1 to 9.
The function follows the case sensitivity rules of the effective collation. Matching is performed case insensitively for case insensitive collations, and case sensitively for case sensitive collations and for binary data.
The collation case sensitivity can be overwritten using the (?i) and (?-i) PCRE flags.
MariaDB uses the PCRE regular expression library for enhanced regular expression performance, and REGEXP_REPLACE
was introduced as part of this enhancement.
The default_regex_flags variable addresses the remaining compatibilities between PCRE and the old regex library.
SELECT REGEXP_REPLACE('ab12cd','[0-9]','') AS remove_digits;
-> abcd
SELECT REGEXP_REPLACE('<html><head><title>title</title><body>body</body></htm>', '<.+?>',' ')
AS strip_html;
-> title body
Backreferences to the subexpressions in the form , where N is a number from 1 to 9:
SELECT REGEXP_REPLACE('James Bond','^(.*) (.*)$','\\2, \\1') AS reorder_name;
-> Bond, James
Case insensitive and case sensitive matches:
SELECT REGEXP_REPLACE('ABC','b','-') AS case_insensitive;
-> A-C
SELECT REGEXP_REPLACE('ABC' COLLATE utf8_bin,'b','-') AS case_sensitive;
-> ABC
SELECT REGEXP_REPLACE(BINARY 'ABC','b','-') AS binary_data;
-> ABC
Overwriting the collation case sensitivity using the (?i) and (?-i) PCRE flags.
SELECT REGEXP_REPLACE('ABC','(?-i)b','-') AS force_case_sensitive;
-> ABC
SELECT REGEXP_REPLACE(BINARY 'ABC','(?i)b','-') AS force_case_insensitive;
-> A-C
This page is licensed: CC BY-SA / Gnu FDL
REGEXP_SUBSTR(subject,pattern)
Returns the part of the string subject
that matches the regular expression pattern
, or an empty string if pattern
was not found.
The function follows the case sensitivity rules of the effective collation. Matching is performed case insensitively for case insensitive collations, and case sensitively for case sensitive collations and for binary data.
The collation case sensitivity can be overwritten using the (?i) and (?-i) PCRE flags.
MariaDB uses the PCRE regular expression library for enhanced regular expression performance, and REGEXP_SUBSTR
was introduced as part of this enhancement.
The default_regex_flags variable addresses the remaining compatibilities between PCRE and the old regex library.
SELECT REGEXP_SUBSTR('ab12cd','[0-9]+');
-> 12
SELECT REGEXP_SUBSTR(
'See https://mariadb.org/en/foundation/ for details',
'https?://[^/]*');
-> https://mariadb.org
SELECT REGEXP_SUBSTR('ABC','b');
-> B
SELECT REGEXP_SUBSTR('ABC' COLLATE utf8_bin,'b');
->
SELECT REGEXP_SUBSTR(BINARY'ABC','b');
->
SELECT REGEXP_SUBSTR('ABC','(?i)b');
-> B
SELECT REGEXP_SUBSTR('ABC' COLLATE utf8_bin,'(?+i)b');
-> B
This page is licensed: CC BY-SA / Gnu FDL
expr REGEXP pat, expr RLIKE pat
RLIKE
is a synonym for REGEXP.
This page is licensed: CC BY-SA / Gnu FDL