2 of 75

String Functions

Learn about string functions in MariaDB Server. This section details SQL functions for manipulating, searching, and formatting text strings, essential for data cleansing and presentation.

ASCII

Syntax

ASCII(str)

Description

Returns the numeric ASCII value of the leftmost character of the string argument. Returns 0 if the given string is empty and NULL if it is NULL.

ASCII() works for 8-bit characters.

Examples

SELECT ASCII(9);
+----------+
| ASCII(9) |
+----------+
|       57 |
+----------+

SELECT ASCII('9');
+------------+
| ASCII('9') |
+------------+
|         57 |
+------------+

SELECT ASCII('abc');
+--------------+
| ASCII('abc') |
+--------------+
|           97 |
+--------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

BIN

Syntax

BIN(N)

Description

Returns a string representation of the binary value of the given longlong (that is, BIGINT) number. This is equivalent to CONV(N,10,2). The argument should be positive. If it is a FLOAT, it will be truncated. Returns NULL if the argument is NULL.

Examples

SELECT BIN(12);
+---------+
| BIN(12) |
+---------+
| 1100    |
+---------+

BINARY Operator

This page describes the BINARY operator. For details about the data type, see Binary Data Type.

Syntax

BINARY

Description

The BINARY operator casts the string following it to a binary string. This is an easy way to force a column comparison to be done byte by byte rather than character by character. This causes the comparison to be case sensitive even if the column isn't defined as BINARY or BLOB.

BINARY also causes trailing spaces to be significant.

Examples

SELECT 'a' = 'A';
+-----------+
| 'a' = 'A' |
+-----------+
|         1 |
+-----------+

SELECT BINARY 'a' = 'A';
+------------------+
| BINARY 'a' = 'A' |
+------------------+
|                0 |
+------------------+

SELECT 'a' = 'a ';
+------------+
| 'a' = 'a ' |
+------------+
|          1 |
+------------+

SELECT BINARY 'a' = 'a ';
+-------------------+
| BINARY 'a' = 'a ' |
+-------------------+
|                 0 |
+-------------------+

BIT_LENGTH

Syntax

BIT_LENGTH(str)

Description

Returns the length of the given string argument in bits. If the argument is not a string, it will be converted to string. If the argument is NULL, it returns NULL.

Examples

SELECT BIT_LENGTH('text');
+--------------------+
| BIT_LENGTH('text') |
+--------------------+
|                 32 |
+--------------------+

SELECT BIT_LENGTH('');
+----------------+
| BIT_LENGTH('') |
+----------------+
|              0 |
+----------------+

Compatibility

PostgreSQL and Sybase support BIT_LENGTH().

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

CAST

Syntax

CAST(expr AS type)

Description

The CAST() function takes a value of one type and produces a value of another type, similar to the CONVERT() function.

The type can be one of the following values:

BINARY
CHAR
DATE
DATETIME
[DECIMAL(M[,D])]
DOUBLE
FLOAT
INTEGER
- Short for SIGNED INTEGER
SIGNED [INTEGER]
UNSIGNED [INTEGER]
TIME
VARCHAR (in Oracle mode)

The main difference between CAST and CONVERT() is that CONVERT(expr,type) is ODBC syntax while CAST(expr as type) and CONVERT(... USING ...) are SQL92 syntax.

You can use the CAST() function with the INTERVAL keyword.

This introduced an incompatibility with previous versions of MariaDB, and all versions of MySQL (see the example below).

Examples

Simple casts:

SELECT CAST("abc" AS BINARY);
SELECT CAST("1" AS UNSIGNED INTEGER);
SELECT CAST(123 AS CHAR CHARACTER SET utf8)

Note that when one casts to CHAR without specifying the character set, the collation_connection character set collation will be used. When used with CHAR CHARACTER SET, the default collation for that character set will be used.

SELECT COLLATION(CAST(123 AS CHAR));
+------------------------------+
| COLLATION(CAST(123 AS CHAR)) |
+------------------------------+
| latin1_swedish_ci            |
+------------------------------+

SELECT COLLATION(CAST(123 AS CHAR CHARACTER SET utf8));
+-------------------------------------------------+
| COLLATION(CAST(123 AS CHAR CHARACTER SET utf8)) |
+-------------------------------------------------+
| utf8_general_ci                                 |
+-------------------------------------------------+

If you also want to change the collation, you have to use the COLLATE operator:

SELECT COLLATION(CAST(123 AS CHAR CHARACTER SET utf8) 
  COLLATE utf8_unicode_ci);
+-------------------------------------------------------------------------+
| COLLATION(CAST(123 AS CHAR CHARACTER SET utf8) COLLATE utf8_unicode_ci) |
+-------------------------------------------------------------------------+
| utf8_unicode_ci                                                         |
+-------------------------------------------------------------------------+

Using CAST() to order an ENUM field as a CHAR rather than the internal numerical value:

CREATE TABLE enum_list (enum_field enum('c','a','b'));

INSERT INTO enum_list (enum_field) 
VALUES('c'),('a'),('c'),('b');

SELECT * FROM enum_list 
ORDER BY enum_field;
+------------+
| enum_field |
+------------+
| c          |
| c          |
| a          |
| b          |
+------------+

SELECT * FROM enum_list 
ORDER BY CAST(enum_field AS CHAR);
+------------+
| enum_field |
+------------+
| a          |
| b          |
| c          |
| c          |
+------------+

The following will trigger warnings, since x'aa' and 'X'aa' doesn't behave as a number. In all versions of MySQL, no warnings are triggered since they did erroneously behave as a number:

SELECT CAST(0xAA AS UNSIGNED), CAST(x'aa' AS UNSIGNED), CAST(X'aa' AS UNSIGNED);
+------------------------+-------------------------+-------------------------+
| CAST(0xAA AS UNSIGNED) | CAST(x'aa' AS UNSIGNED) | CAST(X'aa' AS UNSIGNED) |
+------------------------+-------------------------+-------------------------+
|                    170 |                       0 |                       0 |
+------------------------+-------------------------+-------------------------+
1 row in set, 2 warnings (0.00 sec)

Warning (Code 1292): Truncated incorrect INTEGER value: '\xAA'
Warning (Code 1292): Truncated incorrect INTEGER value: '\xAA'

Casting to intervals:

SELECT CAST(2019-01-04 AS INTERVAL DAY_SECOND(2)) AS "Cast";

+-------------+
| Cast        |
+-------------+
| 00:20:14.00 |
+-------------+

CHAR Function

Syntax

CHAR(N,... [USING charset_name])

Description

CHAR() interprets each argument as an INT and returns a string consisting of the characters given by the code values of those integers. NULL values are skipped. By default, CHAR() returns a binary string. To produce a string in a given character set, use the optional USING clause:

SELECT CHARSET(CHAR(0x65)), CHARSET(CHAR(0x65 USING utf8));
+---------------------+--------------------------------+
| CHARSET(CHAR(0x65)) | CHARSET(CHAR(0x65 USING utf8)) |
+---------------------+--------------------------------+
| binary              | utf8                           |
+---------------------+--------------------------------+

If USING is given and the result string is illegal for the given character set, a warning is issued. Also, if strict SQL mode is enabled, the result from CHAR() becomes NULL.

Examples

SELECT CHAR(77,97,114,'105',97,'68',66);
+----------------------------------+
| CHAR(77,97,114,'105',97,'68',66) |
+----------------------------------+
| MariaDB                          |
+----------------------------------+

SELECT CHAR(77,77.3,'77.3');
+----------------------+
| CHAR(77,77.3,'77.3') |
+----------------------+
| MMM                  |
+----------------------+
1 row in set, 1 warning (0.00 sec)

Warning (Code 1292): Truncated incorrect INTEGER value: '77.3'

CHAR_LENGTH

Syntax

CHAR_LENGTH(str)
CHARACTER_LENGTH(str)

Description

Returns the length of the given string argument, measured in characters. A multi-byte character counts as a single character. This means that for a string containing five two-byte characters, LENGTH() (or OCTET_LENGTH() in Oracle mode) returns 10, whereas CHAR_LENGTH() returns 5. If the argument is NULL, it returns NULL.

If the argument is not a string value, it is converted into a string.

It is synonymous with the CHARACTER_LENGTH() function.

Examples

SELECT CHAR_LENGTH('MariaDB');
+------------------------+
| CHAR_LENGTH('MariaDB') |
+------------------------+
|                      7 |
+------------------------+

When Oracle mode is not set:

SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π')  | LENGTH('π')  | LENGTHB('π')  | OCTET_LENGTH('π')  |
+-------------------+--------------+---------------+--------------------+
|                 1 |            2 |             2 |                  2 |
+-------------------+--------------+---------------+--------------------+

In Oracle mode:

SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π')  | LENGTH('π')  | LENGTHB('π')  | OCTET_LENGTH('π')  |
+-------------------+--------------+---------------+--------------------+
|                 1 |            1 |             2 |                  2 |
+-------------------+--------------+---------------+--------------------+

CHARACTER_LENGTH

Syntax

CHARACTER_LENGTH(str)

Description

CHARACTER_LENGTH() is a synonym for CHAR_LENGTH().

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

CHR

Syntax

CHR(N)

Description

CHR() interprets each argument N as an integer and returns a VARCHAR(1) string consisting of the character given by the code values of the integer. The character set and collation of the string are set according to the values of the character_set_database and collation_database system variables.

CHR() is similar to the CHAR() function, but only accepts a single argument.

CHR() is available in all sql_modes.

Examples

SELECT CHR(67);
+---------+
| CHR(67) |
+---------+
| C       |
+---------+

SELECT CHR('67');
+-----------+
| CHR('67') |
+-----------+
| C         |
+-----------+

SELECT CHR('C');
+----------+
| CHR('C') |
+----------+
|          |
+----------+
1 row in set, 1 warning (0.000 sec)

SHOW WARNINGS;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1292 | Truncated incorrect INTEGER value: 'C' |
+---------+------+----------------------------------------+

CONCAT

Syntax

CONCAT(str1,str2,...)

Description

Returns the string that results from concatenating the arguments. May have one or more arguments. If all arguments are non-binary strings, the result is a non-binary string. If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent binary string form; if you want to avoid that, you can use an explicit type cast, as in this example:

SELECT CONCAT(CAST(int_col AS CHAR), char_col);

CONCAT() returns NULL if any argument is NULL.

A NULL parameter hides all information contained in other parameters from the result. Sometimes this is not desirable; to avoid this, you can:

Use the CONCAT_WS() function with an empty separator, because that function is NULL-safe.
Use IFNULL() to turn NULLs into empty strings.

Oracle Mode

In Oracle mode, CONCAT ignores null.

Examples

SELECT CONCAT('Ma', 'ria', 'DB');
+---------------------------+
| CONCAT('Ma', 'ria', 'DB') |
+---------------------------+
| MariaDB                   |
+---------------------------+

SELECT CONCAT('Ma', 'ria', NULL, 'DB');
+---------------------------------+
| CONCAT('Ma', 'ria', NULL, 'DB') |
+---------------------------------+
| NULL                            |
+---------------------------------+

SELECT CONCAT(42.0);
+--------------+
| CONCAT(42.0) |
+--------------+
| 42.0         |
+--------------+

Using IFNULL() to handle NULL values:

SELECT CONCAT('The value of @v is: ', IFNULL(@v, ''));
+------------------------------------------------+
| CONCAT('The value of @v is: ', IFNULL(@v, '')) |
+------------------------------------------------+
| The value of @v is:                            |
+------------------------------------------------+

In Oracle mode:

SELECT CONCAT('Ma', 'ria', NULL, 'DB');
+---------------------------------+
| CONCAT('Ma', 'ria', NULL, 'DB') |
+---------------------------------+
| MariaDB                         |
+---------------------------------+

CONCAT_WS

Syntax

CONCAT_WS(separator,str1,str2,...)

Description

CONCAT_WS() stands for Concatenate With Separator and is a special form of CONCAT(). The first argument is the separator for the rest of the arguments. The separator is added between the strings to be concatenated. The separator can be a string, as can the rest of the arguments.

If the separator is NULL, the result is NULL; all other NULL values are skipped. This makes CONCAT_WS() suitable when you want to concatenate some values and avoid losing all information if one of them is NULL.

Examples

SELECT CONCAT_WS(',','First name','Second name','Last Name');
+-------------------------------------------------------+
| CONCAT_WS(',','First name','Second name','Last Name') |
+-------------------------------------------------------+
| First name,Second name,Last Name                      |
+-------------------------------------------------------+

SELECT CONCAT_WS('-','Floor',NULL,'Room');
+------------------------------------+
| CONCAT_WS('-','Floor',NULL,'Room') |
+------------------------------------+
| Floor-Room                         |
+------------------------------------+

In some cases, remember to include a space in the separator string:

SET @a = 'gnu', @b = 'penguin', @c = 'sea lion';
Query OK, 0 rows affected (0.00 sec)

SELECT CONCAT_WS(', ', @a, @b, @c);
+-----------------------------+
| CONCAT_WS(', ', @a, @b, @c) |
+-----------------------------+
| gnu, penguin, sea lion      |
+-----------------------------+

Using CONCAT_WS() to handle NULLs:

SET @a = 'a', @b = NULL, @c = 'c';

SELECT CONCAT_WS('', @a, @b, @c);
+---------------------------+
| CONCAT_WS('', @a, @b, @c) |
+---------------------------+
| ac                        |
+---------------------------+

CONVERT

Syntax

CONVERT(expr,type), CONVERT(expr USING transcoding_name)

Description

The CONVERT() and CAST() functions take a value of one type and produce a value of another type.

The type can be one of the following values:

BINARY
CHAR
DATE
DATETIME
[DECIMAL(M[,D])]
DOUBLE
FLOAT
INTEGER
- Short for SIGNED INTEGER
SIGNED [INTEGER]
UNSIGNED [INTEGER]
TIME
VARCHAR (in Oracle mode)

Note that in MariaDB, INT and INTEGER are the same thing.

BINARY produces a string with the BINARY data type. If the optional length is given, BINARY(N) causes the cast to use no more than N bytes of the argument. Values shorter than the given number in bytes are padded with 0x00 bytes to make them equal the length value.

CHAR(N) causes the cast to use no more than the number of characters given in the argument.

The main difference between the CAST() and CONVERT() is that CONVERT(expr,type) is ODBC syntax while CAST(expr as type) and CONVERT(... USING ...) are SQL92 syntax.

CONVERT() with USING is used to convert data between different character sets. In MariaDB, transcoding names are the same as the corresponding character set names. For example, this statement converts the string 'abc' in the default character set to the corresponding string in the utf8 character set:

SELECT CONVERT('abc' USING utf8);

Examples

SELECT enum_col FROM tbl_name 
ORDER BY CAST(enum_col AS CHAR);

Converting a BINARY to string to permit the LOWER function to work:

SET @x = 'AardVark';

SET @x = BINARY 'AardVark';

SELECT LOWER(@x), LOWER(CONVERT (@x USING latin1));
+-----------+----------------------------------+
| LOWER(@x) | LOWER(CONVERT (@x USING latin1)) |
+-----------+----------------------------------+
| AardVark  | aardvark                         |
+-----------+----------------------------------+

ELT

Syntax

ELT(N, str1[, str2, str3,...])

Description

Takes a numeric argument and a series of string arguments. Returns the string that corresponds to the given numeric position. For instance, it returns str1 if N is 1, str2 if N is 2, and so on. If the numeric argument is a FLOAT, MariaDB rounds it to the nearest INTEGER. If the numeric argument is less than 1, greater than the total number of arguments, or not a number, ELT() returns NULL. It must have at least two arguments.

It is complementary to the FIELD() function.

Examples

SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo');
+------------------------------------+
| ELT(1, 'ej', 'Heja', 'hej', 'foo') |
+------------------------------------+
| ej                                 |
+------------------------------------+

SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo');
+------------------------------------+
| ELT(4, 'ej', 'Heja', 'hej', 'foo') |
+------------------------------------+
| foo                                |
+------------------------------------+

EXPORT_SET

Syntax

EXPORT_SET(bits, on, off[, separator[, number_of_bits]])

Description

Takes a minimum of three arguments. Returns a string where each bit in the given bits argument is returned, with the string values given for on and off.

Bits are examined from right to left, (from low-order to high-order bits). Strings are added to the result from left to right, separated by a separator string (defaults as ','). You can optionally limit the number of bits the EXPORT_SET() function examines using the number_of_bits option.

If any of the arguments are set as NULL, the function returns NULL.

Examples

SELECT EXPORT_SET(5,'Y','N',',',4);
+-----------------------------+
| EXPORT_SET(5,'Y','N',',',4) |
+-----------------------------+
| Y,N,Y,N                     |
+-----------------------------+

SELECT EXPORT_SET(6,'1','0',',',10);
+------------------------------+
| EXPORT_SET(6,'1','0',',',10) |
+------------------------------+
| 0,1,1,0,0,0,0,0,0,0          |
+------------------------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

EXTRACTVALUE

Syntax

EXTRACTVALUE(xml_frag, xpath_expr)

Description

The EXTRACTVALUE() function takes two string arguments: a fragment of XML markup and an XPath expression, (also known as a locator). It returns the text (That is, CDDATA), of the first text node which is a child of the element or elements matching the XPath expression.

In cases where a valid XPath expression does not match any text nodes in a valid XML fragment, (including the implicit /text() expression), the EXTRACTVALUE() function returns an empty string.

Invalid Arguments

When either the XML fragment or the XPath expression is NULL, the EXTRACTVALUE() function returns NULL. When the XML fragment is invalid, it raises a warning Code 1525:

Warning (Code 1525): Incorrect XML value: 'parse error at line 1 pos 11: unexpected END-OF-INPUT'

When the XPath value is invalid, it generates an Error 1105:

ERROR 1105 (HY000): XPATH syntax error: ')'

Explicit text() Expressions

This function is the equivalent of performing a match using the XPath expression after appending /text(). In other words:

SELECT
   EXTRACTVALUE('<cases><case>example</case></cases>', '/cases/case') 
    AS 'Base Example',
   EXTRACTVALUE('<cases><case>example</case></cases>', '/cases/case/text()') 
    AS 'text() Example';
+--------------+----------------+
| Base Example | text() Example |
+--------------+----------------+
| example      | example        |
+--------------+----------------+

Count Matches

When EXTRACTVALUE() returns multiple matches, it returns the content of the first child text node of each matching element, in the matched order, as a single, space-delimited string.

By design, the EXTRACTVALUE() function makes no distinction between a match on an empty element and no match at all. If you need to determine whether no matching element was found in the XML fragment or if an element was found that contained no child text nodes, use the XPath count() function.

For instance, when looking for a value that exists, but contains no child text nodes, you would get a count of the number of matching instances:

SELECT
   EXTRACTVALUE('<cases><case/></cases>', '/cases/case') 
    AS 'Empty Example',
   EXTRACTVALUE('<cases><case/></cases>', 'count(/cases/case)') 
    AS 'count() Example';
+---------------+-----------------+
| Empty Example | count() Example |
+---------------+-----------------+
|               |               1 |
+---------------+-----------------+

Alternatively, when looking for a value that doesn't exist, count() returns 0.

SELECT
   EXTRACTVALUE('<cases><case/></cases>', '/cases/person') 
    AS 'No Match Example',
   EXTRACTVALUE('<cases><case/></cases>', 'count(/cases/person)') 
    AS 'count() Example';
+------------------+-----------------+
| No Match Example | count() Example |
+------------------+-----------------+
|                  |                0|
+------------------+-----------------+

Matches

Important: The EXTRACTVALUE() function only returns CDDATA. It does not return tags that the element might contain or the text that these child elements contain.

SELECT 
  EXTRACTVALUE('<cases><case>Person<email>x@example.com</email></case></cases>', '/cases')
   AS Case;
+--------+
| Case   |
+--------+
| Person |
+--------+

Note, in the above example, while the XPath expression matches to the parent <case> instance, it does not return the contained <email> tag or its content.

Examples

SELECT
    ExtractValue('<a>ccc<b>ddd</b></a>', '/a')            AS val1,
    ExtractValue('<a>ccc<b>ddd</b></a>', '/a/b')          AS val2,
    ExtractValue('<a>ccc<b>ddd</b></a>', '//b')           AS val3,
    ExtractValue('<a>ccc<b>ddd</b></a>', '/b')            AS val4,
    ExtractValue('<a>ccc<b>ddd</b><b>eee</b></a>', '//b') AS val5;
+------+------+------+------+---------+
| val1 | val2 | val3 | val4 | val5    |
+------+------+------+------+---------+
| ccc  | ddd  | ddd  |      | ddd eee |
+------+------+------+------+---------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

FIELD

Syntax

FIELD(pattern, str1[,str2,...])

Description

Returns the index position of the string or number matching the given pattern. Returns 0 in the event that none of the arguments match the pattern. Raises an Error 1582 if not given at least two arguments.

When all arguments given to the FIELD() function are strings, they are treated as case-insensitive. When all the arguments are numbers, they are treated as numbers. Otherwise, they are treated as doubles.

If the given pattern occurs more than once, the FIELD() function only returns the index of the first instance. If the given pattern is NULL, the function returns 0, as a NULL pattern always fails to match.

This function is complementary to the ELT() function.

Examples

SELECT FIELD('ej', 'Hej', 'ej', 'Heja', 'hej', 'foo') 
   AS 'Field Results';
+---------------+
| Field Results | 
+---------------+
|             2 |
+---------------+

SELECT FIELD('fo', 'Hej', 'ej', 'Heja', 'hej', 'foo')
   AS 'Field Results';
+---------------+
| Field Results | 
+---------------+
|             0 |
+---------------+

SELECT FIELD(1, 2, 3, 4, 5, 1) AS 'Field Results';
+---------------+
| Field Results |
+---------------+
|             5 |
+---------------+

SELECT FIELD(NULL, 2, 3) AS 'Field Results';
+---------------+
| Field Results |
+---------------+
|             0 |
+---------------+

SELECT FIELD('fail') AS 'Field Results';
Error 1582 (42000): Incorrect parameter count in call
to native function 'field'

FIND_IN_SET

Syntax

FIND_IN_SET(pattern, strlist)

Description

Returns the index position where the given pattern occurs in a string list. The first argument is the pattern you want to search for. The second argument is a string containing comma-separated variables. If the second argument is of the SET data-type, the function is optimized to use bit arithmetic.

If the pattern does not occur in the string list or if the string list is an empty string, the function returns 0. If either argument is NULL, the function returns NULL. The function does not return the correct result if the pattern contains a comma (",") character.

Examples

SELECT FIND_IN_SET('b','a,b,c,d') AS "Found Results";
+---------------+
| Found Results |
+---------------+
|             2 |
+---------------+

FORMAT

Syntax

FORMAT(num, decimal_position[, locale])

Description

Formats the given number for display as a string, adding separators to appropriate position and rounding the results to the given decimal position. For instance, it would format 15233.345 to 15,233.35.

If the given decimal position is 0, it rounds to return no decimal point or fractional part. You can optionally specify a locale value to format numbers to the pattern appropriate for the given region.

Examples

SELECT FORMAT(1234567890.09876543210, 4) AS 'Format';
+--------------------+
| Format             |
+--------------------+
| 1,234,567,890.0988 |
+--------------------+

SELECT FORMAT(1234567.89, 4) AS 'Format';
+----------------+
| Format         |
+----------------+
| 1,234,567.8900 |
+----------------+

SELECT FORMAT(1234567.89, 0) AS 'Format';
+-----------+
| Format    |
+-----------+
| 1,234,568 |
+-----------+

SELECT FORMAT(123456789,2,'rm_CH') AS 'Format';
+----------------+
| Format         |
+----------------+
| 123'456'789,00 |
+----------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

FROM_BASE64

Syntax

FROM_BASE64(str)

Description

Decodes the given base-64 encode string, returning the result as a binary string. Returns NULL if the given string is NULL or if it's invalid.

It is the reverse of the TO_BASE64 function.

There are numerous methods to base-64 encode a string. MariaDB uses the following:

It encodes alphabet value 64 as '+'.
It encodes alphabet value 63 as '/'.
It codes output in groups of four printable characters. Each three byte of data encoded uses four characters. If the final group is incomplete, it pads the difference with the '=' character.
It divides long output, adding a new line very 76 characters.
In decoding, it recognizes and ignores newlines, carriage returns, tabs and space whitespace characters.

SELECT TO_BASE64('Maria') AS 'Input';
+-----------+
| Input     |
+-----------+
| TWFyaWE=  |
+-----------+

SELECT FROM_BASE64('TWFyaWE=') AS 'Output';
+--------+
| Output |
+--------+
| Maria  |
+--------+

_{This page is licensed: CC BY-SA / Gnu FDL}

HEX

Syntax

HEX(N_or_S)

Description

If N_or_S is a number, returns a string representation of the hexadecimal value of N, where N is a longlong (BIGINT) number. This is equivalent to CONV(N,10,16).

If N_or_S is a string, returns a hexadecimal string representation ofN_or_S where each byte of each character in N_or_S is converted to two hexadecimal digits. If N_or_S is NULL, returns NULL. The inverse of this operation is performed by the UNHEX() function.

HEX() with an INET6 argument returns a hexadecimal representation of the underlying 16-byte binary string.

Examples

SELECT HEX(255);
+----------+
| HEX(255) |
+----------+
| FF       |
+----------+

SELECT 0x4D617269614442;
+------------------+
| 0x4D617269614442 |
+------------------+
| MariaDB          |
+------------------+

SELECT HEX('MariaDB');
+----------------+
| HEX('MariaDB') |
+----------------+
| 4D617269614442 |
+----------------+

SELECT HEX(CAST('2001:db8::ff00:42:8329' AS INET6));
+----------------------------------------------+
| HEX(CAST('2001:db8::ff00:42:8329' AS INET6)) |
+----------------------------------------------+
| 20010DB8000000000000FF0000428329             |
+----------------------------------------------+

INSERT Function

Syntax

INSERT(str,pos,len,newstr)

Description

Returns the string str, with the substring beginning at position pos and len characters long replaced by the string newstr. Returns the original string if pos is not within the length of the string. Replaces the rest of the string from position pos if len is not within the length of the rest of the string. Returns NULL if any argument is NULL.

Examples

SELECT INSERT('Quadratic', 3, 4, 'What');
+-----------------------------------+
| INSERT('Quadratic', 3, 4, 'What') |
+-----------------------------------+
| QuWhattic                         |
+-----------------------------------+

SELECT INSERT('Quadratic', -1, 4, 'What');
+------------------------------------+
| INSERT('Quadratic', -1, 4, 'What') |
+------------------------------------+
| Quadratic                          |
+------------------------------------+

SELECT INSERT('Quadratic', 3, 100, 'What');
+-------------------------------------+
| INSERT('Quadratic', 3, 100, 'What') |
+-------------------------------------+
| QuWhat                              |
+-------------------------------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

INSTR

Syntax

INSTR(str,substr)

Description

Returns the position of the first occurrence of substring substr in string str. This is the same as the two-argument form of LOCATE(), except that the order of the arguments is reversed.

INSTR() performs a case-insensitive search.

If any argument is NULL, returns NULL.

Examples

SELECT INSTR('foobarbar', 'bar');
+---------------------------+
| INSTR('foobarbar', 'bar') |
+---------------------------+
|                         4 |
+---------------------------+

SELECT INSTR('My', 'Maria');
+----------------------+
| INSTR('My', 'Maria') |
+----------------------+
|                    0 |
+----------------------+

LCASE

Syntax

LCASE(str)

Description

LCASE() is a synonym for LOWER().

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

LEFT

Syntax

LEFT(str,len)

Description

Returns the leftmost len characters from the string str, or NULL if any argument is NULL.

Examples

SELECT LEFT('MariaDB', 5);
+--------------------+
| LEFT('MariaDB', 5) |
+--------------------+
| Maria              |
+--------------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

LENGTH

Syntax

LENGTH(str)

Description

Returns the length of the string str.

In the default mode, when Oracle mode is not set, the length is measured in bytes. In this case, a multi-byte character counts as multiple bytes. This means that for a string containing five two-byte characters, LENGTH() returns 10, whereas CHAR_LENGTH() returns 5.

When running Oracle mode, the length is measured in characters, and LENGTH is a synonym for CHAR_LENGTH().

If str is not a string value, it is converted into a string. If str is NULL, the function returns NULL.

Examples

SELECT LENGTH('MariaDB');
+-------------------+
| LENGTH('MariaDB') |
+-------------------+
|                 7 |
+-------------------+

When Oracle mode is not set:

SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π')  | LENGTH('π')  | LENGTHB('π')  | OCTET_LENGTH('π')  |
+-------------------+--------------+---------------+--------------------+
|                 1 |            2 |             2 |                  2 |
+-------------------+--------------+---------------+--------------------+

In Oracle mode:

SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π')  | LENGTH('π')  | LENGTHB('π')  | OCTET_LENGTH('π')  |
+-------------------+--------------+---------------+--------------------+
|                 1 |            1 |             2 |                  2 |
+-------------------+--------------+---------------+--------------------+

LENGTHB

Syntax

LENGTHB(str)

Description

LENGTHB() returns the length of the given string, in bytes. When Oracle mode is not set, this is a synonym for LENGTH.

A multi-byte character counts as multiple bytes. This means that for a string containing five two-byte characters, LENGTHB() returns 10, whereas CHAR_LENGTH() returns 5.

If str is not a string value, it is converted into a string. If str is NULL, the function returns NULL.

Examples

When Oracle mode is not set:

SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π')  | LENGTH('π')  | LENGTHB('π')  | OCTET_LENGTH('π')  |
+-------------------+--------------+---------------+--------------------+
|                 1 |            2 |             2 |                  2 |
+-------------------+--------------+---------------+--------------------+

In Oracle mode:

SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π')  | LENGTH('π')  | LENGTHB('π')  | OCTET_LENGTH('π')  |
+-------------------+--------------+---------------+--------------------+
|                 1 |            1 |             2 |                  2 |
+-------------------+--------------+---------------+--------------------+

LIKE

Syntax

expr LIKE pat [ESCAPE 'escape_char']
expr NOT LIKE pat [ESCAPE 'escape_char']

Description

Tests whether expr matches the pattern pat. Returns either 1 (TRUE) or 0 (FALSE). Both expr and pat may be any valid expression and are evaluated to strings. Patterns may use the following wildcard characters:

% matches any number of characters, including zero.
_ matches any single character.

Use NOT LIKE to test if a string does not match a pattern. This is equivalent to using the NOT operator on the entire LIKE expression.

If either the expression or the pattern is NULL, the result is NULL.

LIKE performs case-insensitive substring matches if the collation for the expression and pattern is case-insensitive. For case-sensitive matches, declare either argument to use a binary collation using collate, or coerce either of them to a BINARY string using CAST. Use SHOW COLLATION to get a list of available collations. Collations ending in _bin are case-sensitive.

Numeric arguments are coerced to binary strings.

The _ wildcard matches a single character, not byte. It will only match a multi-byte character if it is valid in the expression's character set. For example, _ will match _utf8"€", but it will not match _latin1"€" because the Euro sign is not a valid latin1 character. If necessary, use CONVERT to use the expression in a different character set.

If you need to match the characters _ or %, you must escape them. By default, you can prefix the wildcard characters the backslash character \ to escape them. The backslash is used both to encode special characters like newlines when a string is parsed as well as to escape wildcards in a pattern after parsing. Thus, to match an actual backslash, you sometimes need to double-escape it as "\``\``\``\".

To avoid difficulties with the backslash character, you can change the wildcard escape character using ESCAPE in a LIKE expression. The argument to ESCAPE must be a single-character string.

Examples

Select the days that begin with "T":

CREATE TABLE t1 (d VARCHAR(16));
INSERT INTO t1 VALUES 
  ("Monday"), ("Tuesday"), ("Wednesday"), 
  ("Thursday"), ("Friday"), ("Saturday"), ("Sunday");
SELECT * FROM t1 WHERE d LIKE "T%";

SELECT * FROM t1 WHERE d LIKE "T%";
+----------+
| d        |
+----------+
| Tuesday  |
| Thursday |
+----------+

Select the days that contain the substring "es":

SELECT * FROM t1 WHERE d LIKE "%es%";

SELECT * FROM t1 WHERE d LIKE "%es%";
+-----------+
| d         |
+-----------+
| Tuesday   |
| Wednesday |
+-----------+

Select the six-character day names:

SELECT * FROM t1 WHERE d like "___day";

SELECT * FROM t1 WHERE d like "___day";
+---------+
| d       |
+---------+
| Monday  |
| Friday  |
| Sunday  |
+---------+

With the default collations, LIKE is case-insensitive:

SELECT * FROM t1 WHERE d like "t%";

SELECT * FROM t1 where d like "t%";
+----------+
| d        |
+----------+
| Tuesday  |
| Thursday |
+----------+

Use collate to specify a binary collation, forcing case-sensitive matches:

SELECT * FROM t1 WHERE d like "t%" COLLATE latin1_bin;

SELECT * FROM t1 WHERE d like "t%" COLLATE latin1_bin;
Empty SET (0.00 sec)

You can include functions and operators in the expression to match. Select dates based on their day name:

CREATE TABLE t2 (d DATETIME);
INSERT INTO t2 VALUES
    ("2007-01-30 21:31:07"),
    ("1983-10-15 06:42:51"),
    ("2011-04-21 12:34:56"),
    ("2011-10-30 06:31:41"),
    ("2011-01-30 14:03:25"),
    ("2004-10-07 11:19:34");
SELECT * FROM t2 WHERE DAYNAME(d) LIKE "T%";

SELECT * FROM t2 WHERE DAYNAME(d) LIKE "T%";
+------------------+
| d                |
+------------------+
| 2007-01-30 21:31 |
| 2011-04-21 12:34 |
| 2004-10-07 11:19 |
+------------------+
3 rows in set, 7 warnings (0.00 sec)

Optimizing LIKE

MariaDB can use indexes for LIKE on string columns in the case where the LIKE doesn't start with % or _.
You can set the optimizer_use_condition_selectivity variable to 5. If this is done, then the optimizer will read optimizer_selectivity_sampling_limit rows to calculate the selectivity of the LIKE expression before starting to calculate the query plan. This can help speed up some LIKE queries by providing the optimizer with more information about your data.

LOAD_FILE

Syntax

LOAD_FILE(file_name)

Description

Reads the file and returns the file contents as a string. To use this function, the file must be located on the server host, you must specify the full path name to the file, and you must have the FILE privilege. The file must be readable by all and it must be less than the size, in bytes, of the max_allowed_packet system variable. If the secure_file_priv system variable is set to a non-empty directory name, the file to be loaded must be located in that directory.

If the file does not exist or cannot be read because one of the preceding conditions is not satisfied, the function returns NULL.

The character_set_filesystem system variable has controlled interpretation of file names that are given as literal strings.

Statements using the LOAD_FILE() function are not safe for statement based replication. This is because the slave will execute the LOAD_FILE() command itself. If the file doesn't exist on the slave, the function will return NULL.

Examples

UPDATE t SET blob_col=LOAD_FILE('/tmp/picture') WHERE id=1;

LOCATE

Syntax

LOCATE(substr,str), LOCATE(substr,str,pos)

Description

The first syntax returns the position of the first occurrence of substring substr in string str. The second syntax returns the position of the first occurrence of substring substr in string str, starting at position pos. Returns 0 if substr is not in str.

LOCATE() performs a case-insensitive search.

If any argument is NULL, returns NULL.

INSTR() is the same as the two-argument form of LOCATE(), except that the order of the arguments is reversed.

Examples

SELECT LOCATE('bar', 'foobarbar');
+----------------------------+
| LOCATE('bar', 'foobarbar') |
+----------------------------+
|                          4 |
+----------------------------+

SELECT LOCATE('My', 'Maria');
+-----------------------+
| LOCATE('My', 'Maria') |
+-----------------------+
|                     0 |
+-----------------------+

SELECT LOCATE('bar', 'foobarbar', 5);
+-------------------------------+
| LOCATE('bar', 'foobarbar', 5) |
+-------------------------------+
|                             7 |
+-------------------------------+

LOWER

Syntax

LOWER(str)
LCASE(str)

Description

Returns the string str with all characters changed to lowercase according to the current character set mapping. The default is latin1 (cp1252 West European).

LCASE is a synonym for LOWER .

Examples

SELECT LOWER('QUADRATICALLY');
+------------------------+
| LOWER('QUADRATICALLY') |
+------------------------+
| quadratically          |
+------------------------+

LOWER() (and UPPER()) are ineffective when applied to binary strings (BINARY, VARBINARY, BLOB). To perform letter case conversion, CONVERT the string to a non-binary string:

SET @str = BINARY 'North Carolina';

SELECT LOWER(@str), LOWER(CONVERT(@str USING latin1));
+----------------+-----------------------------------+
| LOWER(@str)    | LOWER(CONVERT(@str USING latin1)) |
+----------------+-----------------------------------+
| North Carolina | north carolina                    |
+----------------+-----------------------------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

LPAD

Syntax

LPAD(str, len [,padstr])

Description

Returns the string str, left-padded with the string padstr to a length of len characters. If str is longer than len, the return value is shortened to len characters. If padstr is omitted, the LPAD function pads spaces.

Returns NULL if given a NULL argument. If the result is empty (zero length), returns either an empty string or with SQL_MODE=Oracle, NULL.

The Oracle mode version of the function can be accessed outside of Oracle mode by using LPAD_ORACLE as the function name.

Examples

SELECT LPAD('hello',10,'.');
+----------------------+
| LPAD('hello',10,'.') |
+----------------------+
| .....hello           |
+----------------------+

SELECT LPAD('hello',2,'.');
+---------------------+
| LPAD('hello',2,'.') |
+---------------------+
| he                  |
+---------------------+

With the pad string defaulting to space:

SELECT LPAD('hello',10);
+------------------+
| LPAD('hello',10) |
+------------------+
|      hello       |
+------------------+

Oracle mode:

SELECT LPAD('',0),LPAD_ORACLE('',0);
+------------+-------------------+
| LPAD('',0) | LPAD_ORACLE('',0) |
+------------+-------------------+
|            | NULL              |
+------------+-------------------+

LTRIM

Syntax

LTRIM(str)

Description

Returns the string str with leading space characters removed.

Returns NULL if given a NULL argument. If the result is empty, returns either an empty string, or with SQL_MODE=Oracle, NULL.

The Oracle mode version of the function can be accessed outside of Oracle mode by using LTRIM_ORACLE as the function name.

Examples

SELECT QUOTE(LTRIM('   MariaDB   '));
+-------------------------------+
| QUOTE(LTRIM('   MariaDB   ')) |
+-------------------------------+
| 'MariaDB   '                  |
+-------------------------------+

Oracle mode version:

SELECT LTRIM(''),LTRIM_ORACLE('');
+-----------+------------------+
| LTRIM('') | LTRIM_ORACLE('') |
+-----------+------------------+
|           | NULL             |
+-----------+------------------+

MAKE_SET

Syntax

MAKE_SET(bits,str1,str2,...)

Description

Returns a set value (a string containing substrings separated by "," characters) consisting of the strings that have the corresponding bit in bits set. str1 corresponds to bit 0, str2 to bit 1, and so on. NULL values in str1, str2, ... are not appended to the result.

Examples

SELECT MAKE_SET(1,'a','b','c');
+-------------------------+
| MAKE_SET(1,'a','b','c') |
+-------------------------+
| a                       |
+-------------------------+

SELECT MAKE_SET(1 | 4,'hello','nice','world');
+----------------------------------------+
| MAKE_SET(1 | 4,'hello','nice','world') |
+----------------------------------------+
| hello,world                            |
+----------------------------------------+

SELECT MAKE_SET(1 | 4,'hello','nice',NULL,'world');
+---------------------------------------------+
| MAKE_SET(1 | 4,'hello','nice',NULL,'world') |
+---------------------------------------------+
| hello                                       |
+---------------------------------------------+

SELECT QUOTE(MAKE_SET(0,'a','b','c'));
+--------------------------------+
| QUOTE(MAKE_SET(0,'a','b','c')) |
+--------------------------------+
| ''                             |
+--------------------------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

MATCH AGAINST

Syntax

MATCH (col1,col2,...) AGAINST (expr [search_modifier])

Description

A special construct used to perform a fulltext search on a fulltext index.

See Fulltext Index Overview for a full description, and Full-text Indexes for more articles on the topic.

Examples

CREATE TABLE ft_myisam(copy TEXT,FULLTEXT(copy)) ENGINE=MyISAM;

INSERT INTO ft_myisam(copy) VALUES ('Once upon a time'), ('There was a wicked witch'), 
 ('Who ate everybody up');

SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('wicked');
+--------------------------+
| copy                     |
+--------------------------+
| There was a wicked witch |
+--------------------------+

SELECT id, body, MATCH (title,body) AGAINST
    ('Security implications of running MySQL as root'
    IN NATURAL LANGUAGE MODE) AS score
    FROM articles WHERE MATCH (title,body) AGAINST
    ('Security implications of running MySQL as root'
    IN NATURAL LANGUAGE MODE);
+----+-------------------------------------+-----------------+
| id | body                                | score           |
+----+-------------------------------------+-----------------+
|  4 | 1. Never run mysqld as root. 2. ... | 1.5219271183014 |
|  6 | When configured properly, MySQL ... | 1.3114095926285 |
+----+-------------------------------------+-----------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

MID

Syntax

MID(str,pos,len)

Description

MID(str,pos,len) is a synonym for SUBSTRING(str,pos,len).

Examples

SELECT MID('abcd',4,1);
+-----------------+
| MID('abcd',4,1) |
+-----------------+
| d               |
+-----------------+

SELECT MID('abcd',2,2);
+-----------------+
| MID('abcd',2,2) |
+-----------------+
| bc              |
+-----------------+

A negative starting position:

SELECT MID('abcd',-2,4);
+------------------+
| MID('abcd',-2,4) |
+------------------+
| cd               |
+------------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

NATURAL_SORT_KEY

NATURAL_SORT_KEY is available from MariaDB 10.7.

Syntax

NATURAL_SORT_KEY(str)

Description

The NATURAL_SORT_KEY function is used for sorting that is closer to natural sorting. Strings are sorted in alphabetical order, while numbers are treated in a way such that, for example, 10 is greater than 2, whereas in other forms of sorting, 2 would be greater than 10, just like z is greater than ya.

There are multiple natural sort implementations, differing in the way they handle leading zeroes, fractions, i18n, negatives, decimals and so on.

MariaDB's implementation ignores leading zeroes when performing the sort.

You can use also use NATURAL_SORT_KEY with generated columns. The value is not stored permanently in the table. When using a generated column, the virtual column must be longer than the base column to cater for embedded numbers in the string and MDEV-24582.

Examples

Strings and Numbers

CREATE TABLE t1 (c TEXT);

INSERT INTO t1 VALUES ('b1'),('a2'),('a11'),('a1');

SELECT c FROM t1;
+------+
| c    |
+------+
| b1   |
| a2   |
| a11  |
| a1   |
+------+

SELECT c FROM t1 ORDER BY c;
+------+
| c    |
+------+
| a1   |
| a11  |
| a2   |
| b1   |
+------+

Unsorted, regular sort and natural sort:

TRUNCATE t1;

INSERT INTO t1 VALUES 
  ('5.5.31'),('10.7.0'),('10.2.1'),
  ('10.1.22'),('10.3.32'),('10.2.12');

SELECT c FROM t1;
+---------+
| c       |
+---------+
| 5.5.31  |
| 10.7.0  |
| 10.2.1  |
| 10.1.22 |
| 10.3.32 |
| 10.2.12 |
+---------+

SELECT c FROM t1 ORDER BY c;
+---------+
| c       |
+---------+
| 10.1.22 |
| 10.2.1  |
| 10.2.12 |
| 10.3.32 |
| 10.7.0  |
| 5.5.31  |
+---------+

SELECT c FROM t1 ORDER BY NATURAL_SORT_KEY(c);
+---------+
| c       |
+---------+
| 5.5.31  |
| 10.1.22 |
| 10.2.1  |
| 10.2.12 |
| 10.3.32 |
| 10.7.0  |
+---------+

IPs

Sorting IPs, unsorted, regular sort and natural sort::

TRUNCATE t1;

INSERT INTO t1 VALUES 
  ('192.167.3.1'),('192.167.1.12'),('100.200.300.400'),
  ('100.50.60.70'),('100.8.9.9'),('127.0.0.1'),('0.0.0.0');

SELECT c FROM t1;
+-----------------+
| c               |
+-----------------+
| 192.167.3.1     |
| 192.167.1.12    |
| 100.200.300.400 |
| 100.50.60.70    |
| 100.8.9.9       |
| 127.0.0.1       |
| 0.0.0.0         |
+-----------------+

SELECT c FROM t1 ORDER BY c;
+-----------------+
| c               |
+-----------------+
| 0.0.0.0         |
| 100.200.300.400 |
| 100.50.60.70    |
| 100.8.9.9       |
| 127.0.0.1       |
| 192.167.1.12    |
| 192.167.3.1     |
+-----------------+

SELECT c FROM t1 ORDER BY NATURAL_SORT_KEY(c);
+-----------------+
| c               |
+-----------------+
| 0.0.0.0         |
| 100.8.9.9       |
| 100.50.60.70    |
| 100.200.300.400 |
| 127.0.0.1       |
| 192.167.1.12    |
| 192.167.3.1     |
+-----------------+

Generated Columns

Using with a generated column:

CREATE TABLE t(c VARCHAR(3), k VARCHAR(4) AS (NATURAL_SORT_KEY(c)) INVISIBLE);

INSERT INTO t(c) VALUES ('b1'),('a2'),('a11'),('a10');

SELECT * FROM t ORDER by k;
+------+
| c    |
+------+
| a2   |
| a10  |
| a11  |
| b1   |
+------+

Note that if the virtual column is not longer, results may not be as expected:

CREATE TABLE t2(c VARCHAR(3), k VARCHAR(3) AS (NATURAL_SORT_KEY(c)) INVISIBLE);

INSERT INTO t2(c) VALUES ('b1'),('a2'),('a11'),('a10');

SELECT * FROM t2 ORDER by k;
+------+
| c    |
+------+
| a2   |
| a11  |
| a10  |
| b1   |
+------+

Leading Zeroes

Ignoring leading zeroes can lead to undesirable results in certain contexts. For example:

CREATE TABLE t3 (a VARCHAR(4));

INSERT INTO t3 VALUES 
  ('a1'), ('a001'), ('a10'), ('a001'), ('a10'), 
  ('a01'), ('a01'), ('a01b'), ('a01b'), ('a1');

SELECT a FROM t3 ORDER BY a;
+------+
| a    |
+------+
| a001 |
| a001 |
| a01  |
| a01  |
| a01b |
| a01b |
| a1   |
| a1   |
| a10  |
| a10  |
+------+
10 rows in set (0.000 sec)

SELECT a FROM t3 ORDER BY NATURAL_SORT_KEY(a);
+------+
| a    |
+------+
| a1   |
| a01  |
| a01  |
| a001 |
| a001 |
| a1   |
| a01b |
| a01b |
| a10  |
| a10  |
+------+

This may not be what we were hoping for in a 'natural' sort. A workaround is to sort by both NATURAL_SORT_KEY and regular sort.

SELECT a FROM t3 ORDER BY NATURAL_SORT_KEY(a), a;
+------+
| a    |
+------+
| a001 |
| a001 |
| a01  |
| a01  |
| a1   |
| a1   |
| a01b |
| a01b |
| a10  |
| a10  |
+------+

_{This page is licensed: CC BY-SA / Gnu FDL}

NOT LIKE

Syntax

expr NOT LIKE pat [ESCAPE 'escape_char']

Description

This is the same as NOT (expr LIKE pat [ESCAPE 'escape_char']).

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

NOT REGEXP

Syntax

expr NOT REGEXP pat, expr NOT RLIKE pat

Description

This is the same as NOT (expr REGEXP pat).

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

OCTET_LENGTH

Syntax

OCTET_LENGTH(str)

Description

OCTET_LENGTH() returns the length of the given string, in octets (bytes). This is a synonym for LENGTHB(), and, when Oracle mode is not set, a synonym for LENGTH().

A multi-byte character counts as multiple bytes. This means that for a string containing five two-byte characters, OCTET_LENGTH() returns 10, whereas CHAR_LENGTH() returns 5.

If str is not a string value, it is converted into a string. If str is NULL, the function returns NULL.

Examples

When Oracle mode is not set:

SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π')  | LENGTH('π')  | LENGTHB('π')  | OCTET_LENGTH('π')  |
+-------------------+--------------+---------------+--------------------+
|                 1 |            2 |             2 |                  2 |
+-------------------+--------------+---------------+--------------------+

In Oracle mode:

SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π')  | LENGTH('π')  | LENGTHB('π')  | OCTET_LENGTH('π')  |
+-------------------+--------------+---------------+--------------------+
|                 1 |            1 |             2 |                  2 |
+-------------------+--------------+---------------+--------------------+

ORD

Syntax

ORD(str)

Description

If the leftmost character of the string str is a multi-byte character, returns the code for that character, calculated from the numeric values of its constituent bytes using this formula:

(1st byte code)
+ (2nd byte code x 256)
+ (3rd byte code x 256 x 256) ...

If the leftmost character is not a multi-byte character, ORD() returns the same value as the ASCII() function.

Examples

SELECT ORD('2');
+----------+
| ORD('2') |
+----------+
|       50 |
+----------+

POSITION

Syntax

POSITION(substr IN str)

Description

POSITION(substr IN str) is a synonym for LOCATE(substr,str).

The function is part of ODBC 3.0.

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

QUOTE

Syntax

QUOTE(str)

Description

Quotes a string to produce a result that can be used as a properly escaped data value in an SQL statement. The string is returned enclosed by single quotes and with each instance of single quote ("'"), backslash ("\"),ASCII NUL, and Control-Z preceded by a backslash. If the argument is NULL, the return value is the word "NULL" without enclosing single quotes.

Examples

SELECT QUOTE("Don't!");
+-----------------+
| QUOTE("Don't!") |
+-----------------+
| 'Don\'t!'       |
+-----------------+

SELECT QUOTE(NULL); 
+-------------+
| QUOTE(NULL) |
+-------------+
| NULL        |
+-------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

REPEAT Function

Syntax

REPEAT(str,count)

Description

Returns a string consisting of the string str repeated count times. Ifcount is less than 1, returns an empty string. Returns NULL if str orcount are NULL.

Examples

SELECT QUOTE(REPEAT('MariaDB ',4));
+------------------------------------+
| QUOTE(REPEAT('MariaDB ',4))        |
+------------------------------------+
| 'MariaDB MariaDB MariaDB MariaDB ' |
+------------------------------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

REPLACE Function

Syntax

REPLACE(str,from_str,to_str)

Description

Returns the string str with all occurrences of the string from_str replaced by the string to_str. REPLACE() performs a case-sensitive match when searching for from_str.

Examples

SELECT REPLACE('www.mariadb.org', 'w', 'Ww');
+---------------------------------------+
| REPLACE('www.mariadb.org', 'w', 'Ww') |
+---------------------------------------+
| WwWwWw.mariadb.org                    |
+---------------------------------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

REVERSE

Syntax

REVERSE(str)

Description

Returns the string str with the order of the characters reversed.

Examples

SELECT REVERSE('desserts');
+---------------------+
| REVERSE('desserts') |
+---------------------+
| stressed            |
+---------------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

RIGHT

Syntax

RIGHT(str,len)

Description

Returns the rightmost len characters from the string str, or NULL if any argument is NULL.

Examples

SELECT RIGHT('MariaDB', 2);
+---------------------+
| RIGHT('MariaDB', 2) |
+---------------------+
| DB                  |
+---------------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

RPAD

Syntax

RPAD(str, len [, padstr])

Description

Returns the string str, right-padded with the string padstr to a length of len characters. If str is longer than len, the return value is shortened to len characters. If padstr is omitted, the RPAD function pads spaces.

Returns NULL if given a NULL argument. If the result is empty (a length of zero), returns either an empty string, or, with SQL_MODE=Oracle, NULL.

The Oracle mode version of the function can be accessed outside of Oracle mode by using RPAD_ORACLE as the function name.

Examples

SELECT RPAD('hello',10,'.');
+----------------------+
| RPAD('hello',10,'.') |
+----------------------+
| hello.....           |
+----------------------+

SELECT RPAD('hello',2,'.');
+---------------------+
| RPAD('hello',2,'.') |
+---------------------+
| he                  |
+---------------------+

With the pad string defaulting to space:

SELECT RPAD('hello',30);
+--------------------------------+
| RPAD('hello',30)               |
+--------------------------------+
| hello                          |
+--------------------------------+

Oracle mode:

SELECT RPAD('',0),RPAD_ORACLE('',0);
+------------+-------------------+
| RPAD('',0) | RPAD_ORACLE('',0) |
+------------+-------------------+
|            | NULL              |
+------------+-------------------+

RTRIM

Syntax

RTRIM(str)

Description

Returns the string str with trailing space characters removed.

Returns NULL if given a NULL argument. If the result is empty, returns either an empty string, or, with SQL_MODE=Oracle, NULL.

The Oracle mode version of the function can be accessed outside of Oracle mode by using RTRIM_ORACLE as the function name.

Examples

SELECT QUOTE(RTRIM('MariaDB    '));
+-----------------------------+
| QUOTE(RTRIM('MariaDB    ')) |
+-----------------------------+
| 'MariaDB'                   |
+-----------------------------+

Oracle mode:

SELECT RTRIM(''),RTRIM_ORACLE('');
+-----------+------------------+
| RTRIM('') | RTRIM_ORACLE('') |
+-----------+------------------+
|           | NULL             |
+-----------+------------------+

SFORMAT

SFORMAT is available from MariaDB 10.7.

Description

The SFORMAT function takes an input string and a formatting specification and returns the string formatted using the rules the user passed in the specification.

It uses the fmtlib library for Python-like (as well as Rust, C++20, etc) string formatting.

Only fmtlib 7.0.0+ is supported.

There is no native support for temporal and decimal values:

TIME_RESULT is handled as STRING_RESULT.
DECIMAL_RESULT is handled as REAL_RESULT.

Examples

SELECT SFORMAT("The answer is {}.", 42);
+----------------------------------+
| SFORMAT("The answer is {}.", 42) |
+----------------------------------+
| The answer is 42.                |
+----------------------------------+

CREATE TABLE test_sformat(mdb_release char(6), mdev int, feature char(20));

INSERT INTO test_sformat VALUES('10.7.0', 25015, 'Python style sformat'), 
  ('10.7.0', 4958, 'UUID');

SELECT * FROM test_sformat;
+-------------+-------+----------------------+
| mdb_release | mdev  | feature              |
+-------------+-------+----------------------+
| 10.7.0      | 25015 | Python style sformat |
| 10.7.0      |  4958 | UUID                 |
+-------------+-------+----------------------+

SELECT SFORMAT('MariaDB Server {} has a preview for MDEV-{} which is about {}', 
  mdb_release, mdev, feature) AS 'Preview Release Examples'
  FROM test_sformat;
+----------------------------------------------------------------------------------------+
| Preview Release Examples                                                               |
+----------------------------------------------------------------------------------------+
| MariaDB Server 10.7.0 has a preview for MDEV-25015 which is about Python style sformat |
| MariaDB Server 10.7.0 has a preview for MDEV-4958 which is about UUID                  |
+----------------------------------------------------------------------------------------+

SOUNDEX

Syntax

SOUNDEX(str)

Description

Returns a soundex string from str. Two strings that sound almost the same should have identical soundex strings. A standard soundex string is four characters long, but the SOUNDEX() function returns an arbitrarily long string. You can use SUBSTRING() on the result to get a standard soundex string. All non-alphabetic characters in str are ignored. All international alphabetic characters outside the A-Z range are treated as vowels.

Important: When using SOUNDEX(), you should be aware of the following details:

This function, as currently implemented, is intended to work well with strings that are in the English language only. Strings in other languages may not produce reasonable results.
This function implements the original Soundex algorithm, not the more popular enhanced version (also described by D. Knuth). The difference is that original version discards vowels first and duplicates second, whereas the enhanced version discards duplicates first and vowels second.

Examples

SOUNDEX('Hello');
+------------------+
| SOUNDEX('Hello') |
+------------------+
| H400             |
+------------------+

SELECT SOUNDEX('MariaDB');
+--------------------+
| SOUNDEX('MariaDB') |
+--------------------+
| M631               |
+--------------------+

SELECT SOUNDEX('Knowledgebase');
+--------------------------+
| SOUNDEX('Knowledgebase') |
+--------------------------+
| K543212                  |
+--------------------------+

SELECT givenname, surname FROM users WHERE SOUNDEX(givenname) = SOUNDEX("robert");
+-----------+---------+
| givenname | surname |
+-----------+---------+
| Roberto   | Castro  |
+-----------+---------+

SOUNDS LIKE

Syntax

expr1 SOUNDS LIKE expr2

Description

This is the same as SOUNDEX(expr1) = SOUNDEX(expr2).

Example

SELECT givenname, surname FROM users WHERE givenname SOUNDS LIKE "robert";
+-----------+---------+
| givenname | surname |
+-----------+---------+
| Roberto   | Castro  |
+-----------+---------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

SPACE

Syntax

SPACE(N)

Description

Returns a string consisting of N space characters. If N is NULL, returns NULL.

Examples

SELECT QUOTE(SPACE(6));
+-----------------+
| QUOTE(SPACE(6)) |
+-----------------+
| '      '        |
+-----------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

STRCMP

Syntax

STRCMP(expr1,expr2)

Description

STRCMP() returns 0 if the strings are the same, -1 if the first argument is smaller than the second according to the current sort order, and 1 if the strings are otherwise not the same. Returns NULL is either argument is NULL.

Examples

SELECT STRCMP('text', 'text2');
+-------------------------+
| STRCMP('text', 'text2') |
+-------------------------+
|                      -1 |
+-------------------------+

SELECT STRCMP('text2', 'text');
+-------------------------+
| STRCMP('text2', 'text') |
+-------------------------+
|                       1 |
+-------------------------+

SELECT STRCMP('text', 'text');
+------------------------+
| STRCMP('text', 'text') |
+------------------------+
|                      0 |
+------------------------+

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

SUBSTR

Description

SUBSTR() is a synonym for SUBSTRING().

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

SUBSTRING

Syntax

SUBSTRING(str,pos), 
SUBSTRING(str FROM pos), 
SUBSTRING(str,pos,len),
SUBSTRING(str FROM pos FOR len)

SUBSTR(str,pos), 
SUBSTR(str FROM pos), 
SUBSTR(str,pos,len),
SUBSTR(str FROM pos FOR len)

Description

The forms without a len argument return a substring from string str starting at position pos.

The forms with a len argument return a substring len characters long from string str, starting at position pos.

The forms that use FROM are standard SQL syntax.

It is also possible to use a negative value for pos. In this case, the beginning of the substring is pos characters from the end of the string, rather than the beginning. A negative value may be used for pos in any of the forms of this function.

By default, the position of the first character in the string from which the substring is to be extracted is reckoned as 1. For Oracle compatibility, when sql_mode is set to 'oracle', position zero is treated as position 1 (although the first character is still reckoned as 1).

If any argument is NULL, returns NULL.

The optimizer can take advantage of queries of the format SUBSTR(col, 1, n) = const_str.

The optimizer cannot take advantage of queries of the format SUBSTR(col, 1, n) = const_str.

Examples

SELECT SUBSTRING('Knowledgebase',5);
+------------------------------+
| SUBSTRING('Knowledgebase',5) |
+------------------------------+
| ledgebase                    |
+------------------------------+

SELECT SUBSTRING('MariaDB' FROM 6);
+-----------------------------+
| SUBSTRING('MariaDB' FROM 6) |
+-----------------------------+
| DB                          |
+-----------------------------+

SELECT SUBSTRING('Knowledgebase',3,7);
+--------------------------------+
| SUBSTRING('Knowledgebase',3,7) |
+--------------------------------+
| owledge                        |
+--------------------------------+

SELECT SUBSTRING('Knowledgebase', -4);
+--------------------------------+
| SUBSTRING('Knowledgebase', -4) |
+--------------------------------+
| base                           |
+--------------------------------+

SELECT SUBSTRING('Knowledgebase', -8, 4);
+-----------------------------------+
| SUBSTRING('Knowledgebase', -8, 4) |
+-----------------------------------+
| edge                              |
+-----------------------------------+

SELECT SUBSTRING('Knowledgebase' FROM -8 FOR 4);
+------------------------------------------+
| SUBSTRING('Knowledgebase' FROM -8 FOR 4) |
+------------------------------------------+
| edge                                     |
+------------------------------------------+

Oracle mode:

SELECT SUBSTR('abc',0,3);
+-------------------+
| SUBSTR('abc',0,3) |
+-------------------+
|                   |
+-------------------+

SELECT SUBSTR('abc',1,2);
+-------------------+
| SUBSTR('abc',1,2) |
+-------------------+
| ab                |
+-------------------+

SET sql_mode='oracle';

SELECT SUBSTR('abc',0,3);
+-------------------+
| SUBSTR('abc',0,3) |
+-------------------+
| abc               |
+-------------------+

SELECT SUBSTR('abc',1,2);
+-------------------+
| SUBSTR('abc',1,2) |
+-------------------+
| ab                |
+-------------------+

SUBSTRING_INDEX

Syntax

SUBSTRING_INDEX(str,delim,count)

Description

Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. SUBSTRING_INDEX() performs a case-sensitive match when searching for delim.

If any argument is NULL, returns NULL.

For example:

SUBSTRING_INDEX('www.mariadb.org', '.', 2)

It means "Return all of the characters up to the 2nd occurrence of ."

Examples

SELECT SUBSTRING_INDEX('www.mariadb.org', '.', 2);
+--------------------------------------------+
| SUBSTRING_INDEX('www.mariadb.org', '.', 2) |
+--------------------------------------------+
| www.mariadb                                |
+--------------------------------------------+

SELECT SUBSTRING_INDEX('www.mariadb.org', '.', -2);
+---------------------------------------------+
| SUBSTRING_INDEX('www.mariadb.org', '.', -2) |
+---------------------------------------------+
| mariadb.org                                 |
+---------------------------------------------+

TO_BASE64

Syntax

TO_BASE64(str)

Description

Converts the string argument str to its base-64 encoded form, returning the result as a character string in the connection character set and collation.

The argument str will be converted to string first if it is not a string. A NULL argument will return a NULL result.

The reverse function, FROM_BASE64(), decodes an encoded base-64 string.

There are a numerous different methods to base-64 encode a string. The following are used by MariaDB and MySQL:

Alphabet value 64 is encoded as '+'.
Alphabet value 63 is encoded as '/'.
Encoding output is made up of groups of four printable characters, with each three bytes of data encoded using four characters. If the final group is not complete, it is padded with '=' characters to make up a length of four.
To divide long output, a newline is added after every 76 characters.
Decoding will recognize and ignore newlines, carriage returns, tabs, and spaces.

Examples

SELECT TO_BASE64('Maria');
+--------------------+
| TO_BASE64('Maria') |
+--------------------+
| TWFyaWE=           |
+--------------------+

_{This page is licensed: CC BY-SA / Gnu FDL}

TO_CHAR

TO_CHAR is available from MariaDB 10.6.

Syntax

TO_CHAR(expr[, fmt])

Description

The TO_CHAR function converts an expr of type date, datetime, time or timestamp to a string. The optional fmt argument supports YYY/YYY/YY/RRRR/RR/MM/MON/MONTH/MI/DD/DY/HH/HH12/HH24/SS and special characters. The default value is YYYY-MM-DD HH24:MI:SS.

In Oracle, TO_CHAR can also be used to convert numbers to strings, but this is not supported in MariaDB and will give an error.

Examples

SELECT TO_CHAR('1980-01-11 04:50:39', 'YYYY-MM-DD');
+----------------------------------------------+
| TO_CHAR('1980-01-11 04:50:39', 'YYYY-MM-DD') |
+----------------------------------------------+
| 1980-01-11                                   |
+----------------------------------------------+

SELECT TO_CHAR('1980-01-11 04:50:39', 'HH24-MI-SS');
+----------------------------------------------+
| TO_CHAR('1980-01-11 04:50:39', 'HH24-MI-SS') |
+----------------------------------------------+
| 04-50-39                                     |
+----------------------------------------------+

SELECT TO_CHAR('00-01-01 00:00:00', 'YY-MM-DD HH24:MI:SS');
+-----------------------------------------------------+
| TO_CHAR('00-01-01 00:00:00', 'YY-MM-DD HH24:MI:SS') |
+-----------------------------------------------------+
| 00-01-01 00:00:00                                   |
+-----------------------------------------------------+

SELECT TO_CHAR('99-12-31 23:59:59', 'YY-MM-DD HH24:MI:SS');
+-----------------------------------------------------+
| TO_CHAR('99-12-31 23:59:59', 'YY-MM-DD HH24:MI:SS') |
+-----------------------------------------------------+
| 99-12-31 23:59:59                                   |
+-----------------------------------------------------+

SELECT TO_CHAR('9999-12-31 23:59:59', 'YY-MM-DD HH24:MI:SS');
+-------------------------------------------------------+
| TO_CHAR('9999-12-31 23:59:59', 'YY-MM-DD HH24:MI:SS') |
+-------------------------------------------------------+
| 99-12-31 23:59:59                                     |
+-------------------------------------------------------+

SELECT TO_CHAR('21-01-03 08:30:00', 'Y-MONTH-DY HH:MI:SS');
+-----------------------------------------------------+
| TO_CHAR('21-01-03 08:30:00', 'Y-MONTH-DY HH:MI:SS') |
+-----------------------------------------------------+
| 1-January  -Sun 08:30:00                            |
+-----------------------------------------------------+

TRIM

Syntax

TRIM_ORACLE([{BOTH | LEADING | TRAILING} [remstr] FROM] str), TRIM([remstr FROM] str)

Description

Returns the string str with all remstr prefixes or suffixes removed. If none of the specifiers BOTH, LEADING, or TRAILING is given, BOTH is assumed. remstr is optional and, if not specified, spaces are removed.

Returns NULL if given a NULL argument. If the result is empty, returns either an empty string, or, with SQL_MODE=Oracle, NULL. SQL_MODE=Oracle is not set by default.

The Oracle mode version of the function can be accessed in any mode by using TRIM_ORACLE as the function name.

Examples

SELECT TRIM('  bar   ')\G
*************************** 1. row ***************************
TRIM('  bar   '): bar

SELECT TRIM(LEADING 'x' FROM 'xxxbarxxx')\G
*************************** 1. row ***************************
TRIM(LEADING 'x' FROM 'xxxbarxxx'): barxxx

SELECT TRIM(BOTH 'x' FROM 'xxxbarxxx')\G
*************************** 1. row ***************************
TRIM(BOTH 'x' FROM 'xxxbarxxx'): bar

SELECT TRIM(TRAILING 'xyz' FROM 'barxxyz')\G
*************************** 1. row ***************************
TRIM(TRAILING 'xyz' FROM 'barxxyz'): barx

With SQL_MODE=Oracle not set:

SELECT TRIM(''),TRIM_ORACLE('');
+----------+-----------------+
| TRIM('') | TRIM_ORACLE('') |
+----------+-----------------+
|          | NULL            |
+----------+-----------------+

With SQL_MODE=Oracle set:

SELECT TRIM(''),TRIM_ORACLE('');
+----------+-----------------+
| TRIM('') | TRIM_ORACLE('') |
+----------+-----------------+
| NULL     | NULL            |
+----------+-----------------+

TRIM_ORACLE

TRIM_ORACLE is a synonym for the Oracle mode version of the TRIM function, and is available in all modes.

_{This page is licensed: CC BY-SA / Gnu FDL}

Type Conversion

Implicit type conversion takes place when MariaDB is using operands or different types, in order to make the operands compatible.

It is best practice not to rely upon implicit conversion; rather use CAST to explicitly convert types.

Rules for Conversion on Comparison

If either argument is NULL, the result of the comparison is NULL unless the NULL-safe <=> equality comparison operator is used.
If both arguments are integers, they are compared as integers.
If both arguments are strings, they are compared as strings.
If one argument is decimal and the other argument is decimal or integer, they are compared as decimals.
If one argument is decimal and the other argument is a floating point, they are compared as floating point values.
If one argument is string and the other argument is integer, they are compared as decimals.
If a hexadecimal argument is not compared to a number, it is treated as a binary string.
If a constant is compared to a TIMESTAMP or DATETIME, the constant is converted to a timestamp, unless used as an argument to the IN function.
In other cases, arguments are compared as floating point, or real, numbers.

Note that if a string column is being compared with a numeric value, MariaDB will not use the index on the column, as there are numerous alternatives that may evaluate as equal (see examples below).

Comparison Examples

Converting a string to a number:

SELECT 15+'15';
+---------+
| 15+'15' |
+---------+
|      30 |
+---------+

Converting a number to a string:

SELECT CONCAT(15,'15');
+-----------------+
| CONCAT(15,'15') |
+-----------------+
| 1515            |
+-----------------+

Floating point number errors:

SELECT '9746718491924563214' = 9746718491924563213;
+---------------------------------------------+
| '9746718491924563214' = 9746718491924563213 |
+---------------------------------------------+
|                                           1 |
+---------------------------------------------+

Numeric equivalence with strings:

SELECT '5' = 5;
+---------+
| '5' = 5 |
+---------+
|       1 |
+---------+

SELECT '   5' = 5;
+------------+
| '   5' = 5 |
+------------+
|          1 |
+------------+

SELECT '   5  ' = 5;
+--------------+
| '   5  ' = 5 |
+--------------+
|            1 |
+--------------+
1 row in set, 1 warning (0.000 sec)

SHOW WARNINGS;
+-------+------+--------------------------------------------+
| Level | Code | Message                                    |
+-------+------+--------------------------------------------+
| Note  | 1292 | Truncated incorrect DOUBLE value: '   5  ' |
+-------+------+--------------------------------------------+

As a result of the above, MariaDB cannot use the index when comparing a string with a numeric value in the example below:

CREATE TABLE t (a VARCHAR(10), b VARCHAR(10), INDEX idx_a (a));

INSERT INTO t VALUES 
  ('1', '1'), ('2', '2'), ('3', '3'), 
  ('4', '4'), ('5', '5'), ('1', '5');

EXPLAIN SELECT * FROM t WHERE a = '3' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
         type: ref
possible_keys: idx_a
          key: idx_a
      key_len: 13
          ref: const
         rows: 1
        Extra: Using index condition

EXPLAIN SELECT * FROM t WHERE a = 3 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
         type: ALL
possible_keys: idx_a
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 6
        Extra: Using where

Rules for Conversion on Dyadic Arithmetic Operations

Implicit type conversion also takes place on dyadic arithmetic operations (+,-,*,/). MariaDB chooses the minimum data type that is guaranteed to fit the result and converts both arguments to the result data type.

For addition (+), subtraction (-) and multiplication (*), the result data type is chosen as follows:

If either of the arguments is an approximate number (float, double), the result is double.
If either of the arguments is a string (char, varchar, text), the result is double.
If either of the arguments is a decimal number, the result is decimal.
If either of the arguments is of a temporal type with a non-zero fractional second precision (time(N), datetime(N), timestamp(N)), the result is decimal.
If either of the arguments is of a temporal type with a zero fractional second precision (time(0), date, datetime(0), timestamp(0)), the result may vary between int, int unsigned, bigint or bigint unsigned, depending on the exact data type combination.
If both arguments are integer numbers (tinyint, smallint, mediumint, bigint), the result may vary between int, int unsigned, bigint or bigint unsigned, depending of the exact data types and their signs.

For division (/), the result data type is chosen as follows:

If either of the arguments is an approximate number (float, double), the result is double.
If either of the arguments is a string (char, varchar, text), the result is double.
Otherwise, the result is decimal.

Arithmetic Examples

Note, the above rules mean that when an argument of a temporal data type appears in addition or subtraction, it's treated as a number by default.

SELECT TIME'10:20:30' + 1;
+--------------------+
| TIME'10:20:30' + 1 |
+--------------------+
|             102031 |
+--------------------+

In order to do temporal addition or subtraction instead, use the DATE_ADD() or DATE_SUB() functions, or an INTERVAL expression as the second argument:

SELECT TIME'10:20:30' + INTERVAL 1 SECOND;
+------------------------------------+
| TIME'10:20:30' + INTERVAL 1 SECOND |
+------------------------------------+
| 10:20:31                           |
+------------------------------------+

SELECT "2.2" + 3;
+-----------+
| "2.2" + 3 |
+-----------+
|       5.2 |
+-----------+

SELECT 2.2 + 3;
+---------+
| 2.2 + 3 |
+---------+
| 5.2     |
+---------+

SELECT 2.2 / 3;
+---------+
| 2.2 / 3 |
+---------+
| 0.73333 |
+---------+

SELECT "2.2" / 3;
+--------------------+
| "2.2" / 3          |
+--------------------+
| 0.7333333333333334 |
+--------------------+

_{This page is licensed: CC BY-SA / Gnu FDL}

UCASE

Syntax

UCASE(str)

Description

UCASE() is a synonym for UPPER().

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

UNHEX

Syntax

UNHEX(str)

Description

Performs the inverse operation of HEX(str). That is, it interprets each pair of hexadecimal digits in the argument as a number and converts it to the character represented by the number. The resulting characters are returned as a binary string.

If str is NULL, UNHEX() returns NULL.

Examples

SELECT HEX('MariaDB');
+----------------+
| HEX('MariaDB') |
+----------------+
| 4D617269614442 |
+----------------+

SELECT UNHEX('4D617269614442');
+-------------------------+
| UNHEX('4D617269614442') |
+-------------------------+
| MariaDB                 |
+-------------------------+

SELECT 0x4D617269614442;
+------------------+
| 0x4D617269614442 |
+------------------+
| MariaDB          |
+------------------+

SELECT UNHEX(HEX('string'));
+----------------------+
| UNHEX(HEX('string')) |
+----------------------+
| string               |
+----------------------+

SELECT HEX(UNHEX('1267'));
+--------------------+
| HEX(UNHEX('1267')) |
+--------------------+
| 1267               |
+--------------------+

UPDATEXML

Syntax

UpdateXML(xml_target, xpath_expr, new_xml)

Description

This function replaces a single portion of a given fragment of XML markupxml_target with a new XML fragment new_xml, and then returns the changed XML. The portion of xml_target that is replaced matches an XPath expression xpath_expr supplied by the user. If no expression matching xpath_expr is found, or if multiple matches are found, the function returns the original xml_target XML fragment. All three arguments should be strings.

Examples

SELECT
    UpdateXML('<a><b>ccc</b><d></d></a>', '/a', '<e>fff</e>') AS val1,
    UpdateXML('<a><b>ccc</b><d></d></a>', '/b', '<e>fff</e>') AS val2,
    UpdateXML('<a><b>ccc</b><d></d></a>', '//b', '<e>fff</e>') AS val3,
    UpdateXML('<a><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val4,
    UpdateXML('<a><d></d><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val5
    \G
*************************** 1. row ***************************
val1: <e>fff</e>
val2: <a><b>ccc</b><d></d></a>
val3: <a><e>fff</e><d></d></a>
val4: <a><b>ccc</b><e>fff</e></a>
val5: <a><d></d><b>ccc</b><d></d></a>
1 row in set (0.00 sec)

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

UPPER

Syntax

UPPER(str)
UCASE(str)

Description

Returns the string str with all characters changed to uppercase according to the current character set mapping. The default is latin1 (cp1252 West European).

UCASE is a synonym.

SELECT UPPER(surname), givenname FROM users ORDER BY surname;
+----------------+------------+
| UPPER(surname) | givenname  |
+----------------+------------+
| ABEL           | Jacinto    |
| CASTRO         | Robert     |
| COSTA          | Phestos    |
| MOSCHELLA      | Hippolytos |
+----------------+------------+

UPPER() is ineffective when applied to binary strings (BINARY, VARBINARY, BLOB). The description of LOWER() shows how to perform lettercase conversion of binary strings.

_{This page is licensed: GPLv2, originally from}_{fill_help_tables.sql}

WEIGHT_STRING

Syntax

WEIGHT_STRING(str [AS {CHAR|BINARY}(N)] [LEVEL levels] [flags])
  levels: N [ASC|DESC|REVERSE] [, N [ASC|DESC|REVERSE]] ...

Description

Returns a binary string representing the string's sorting and comparison value. A string with a lower result means that for sorting purposes the string appears before a string with a higher result.

WEIGHT_STRING() is particularly useful when adding new collations, for testing purposes.

If str is a non-binary string (CHAR, VARCHAR or TEXT), WEIGHT_STRING returns the string's collation weight. If str is a binary string (BINARY, VARBINARY or BLOB), the return value is simply the input value, since the weight for each byte in a binary string is the byte value.

WEIGHT_STRING() returns NULL if given a NULL input.

The optional AS clause permits casting the input string to a binary or non-binary string, as well as to a particular length.

AS BINARY(N) measures the length in bytes rather than characters, and right pads with 0x00 bytes to the desired length.

AS CHAR(N) measures the length in characters, and right pads with spaces to the desired length.

N has a minimum value of 1, and if it is less than the length of the input string, the string is truncated without warning.

The optional LEVEL clause specifies that the return value should contain weights for specific collation levels. The levels specifier can either be a single integer, a comma-separated list of integers, or a range of integers separated by a dash (whitespace is ignored). Integers can range from 1 to a maximum of 6, dependent on the collation, and need to be listed in ascending order.

If the LEVEL clause is no provided, a default of 1 to the maximum for the collation is assumed.

If the q is specified without using a range, an optional modifier is permitted.

ASC, the default, returns the weights without any modification.

DESC returns bitwise-inverted weights.

REVERSE returns the weights in reverse order.

Examples

The examples below use the HEX() function to represent non-printable results in hexadecimal format.

SELECT HEX(WEIGHT_STRING('x'));
+-------------------------+
| HEX(WEIGHT_STRING('x')) |
+-------------------------+
| 0058                    |
+-------------------------+

SELECT HEX(WEIGHT_STRING('x' AS BINARY(4)));
+--------------------------------------+
| HEX(WEIGHT_STRING('x' AS BINARY(4))) |
+--------------------------------------+
| 78000000                             |
+--------------------------------------+

SELECT HEX(WEIGHT_STRING('x' AS CHAR(4)));
+------------------------------------+
| HEX(WEIGHT_STRING('x' AS CHAR(4))) |
+------------------------------------+
| 0058002000200020                   |
+------------------------------------+

SELECT HEX(WEIGHT_STRING(0xaa22ee LEVEL 1));
+--------------------------------------+
| HEX(WEIGHT_STRING(0xaa22ee LEVEL 1)) |
+--------------------------------------+
| AA22EE                               |
+--------------------------------------+

SELECT HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 DESC));
+-------------------------------------------+
| HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 DESC)) |
+-------------------------------------------+
| 55DD11                                    |
+-------------------------------------------+

SELECT HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 REVERSE));
+----------------------------------------------+
| HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 REVERSE)) |
+----------------------------------------------+
| EE22AA                                       |
+----------------------------------------------+

_{This page is licensed: CC BY-SA / Gnu FDL}

Regular Expressions Functions

Learn about regular expression functions in MariaDB Server. This section details SQL functions for powerful pattern matching and manipulation of string data using regular expressions.

Regular Expressions Overview

Regular Expressions allow MariaDB to perform complex pattern matching on a string. In many cases, the simple pattern matching provided by LIKE is sufficient. LIKE performs two kinds of matches:

_ - the underscore, matching a single character
% - the percentage sign, matching any number of characters.

In other cases you may need more control over the returned matches, and will need to use regular expressions.

Regular expression matches are performed with the REGEXP function. RLIKE is a synonym for REGEXP.

Comparisons are performed on the byte value, so characters that are treated as equivalent by a collation, but do not have the same byte-value, such as accented characters, could evaluate as unequal.

Without any special characters, a regular expression match is true if the characters match. The match is case-insensitive, except in the case of BINARY strings.

SELECT 'Maria' REGEXP 'Maria';
+------------------------+
| 'Maria' REGEXP 'Maria' |
+------------------------+
|                      1 |
+------------------------+

SELECT 'Maria' REGEXP 'maria';
+------------------------+
| 'Maria' REGEXP 'maria' |
+------------------------+
|                      1 |
+------------------------+

SELECT BINARY 'Maria' REGEXP 'maria';
+-------------------------------+
| BINARY 'Maria' REGEXP 'maria' |
+-------------------------------+
|                             0 |
+-------------------------------+

Note that the word being matched must match the whole pattern:

SELECT 'Maria' REGEXP 'Mari';
+-----------------------+
| 'Maria' REGEXP 'Mari' |
+-----------------------+
|                     1 |
+-----------------------+

SELECT 'Mari' REGEXP 'Maria';
+-----------------------+
| 'Mari' REGEXP 'Maria' |
+-----------------------+
|                     0 |
+-----------------------+

The first returns true because the pattern "Mari" exists in the expression "Maria". When the order is reversed, the result is false, as the pattern "Maria" does not exist in the expression "Mari"

A match can be performed against more than one word with the | character. For example:

SELECT 'Maria' REGEXP 'Monty|Maria';
+------------------------------+
| 'Maria' REGEXP 'Monty|Maria' |
+------------------------------+
|                            1 |
+------------------------------+

Special Characters

The above examples introduce the syntax, but are not very useful on their own. It's the special characters that give regular expressions their power.

^

^ matches the beginning of a string (inside square brackets it can also mean NOT - see below):

SELECT 'Maria' REGEXP '^Ma';
+----------------------+
| 'Maria' REGEXP '^Ma' |
+----------------------+
|                    1 |
+----------------------+

$

$ matches the end of a string:

SELECT 'Maria' REGEXP 'ia$';
+----------------------+
| 'Maria' REGEXP 'ia$' |
+----------------------+
|                    1 |
+----------------------+

.

. matches any single character:

SELECT 'Maria' REGEXP 'Ma.ia';
+------------------------+
| 'Maria' REGEXP 'Ma.ia' |
+------------------------+
|                      1 |
+------------------------+

SELECT 'Maria' REGEXP 'Ma..ia';
+-------------------------+
| 'Maria' REGEXP 'Ma..ia' |
+-------------------------+
|                       0 |
+-------------------------+

*

x* matches zero or more of a character x. In the examples below, it's the r character.

SELECT 'Maria' REGEXP 'Mar*ia';
+-------------------------+
| 'Maria' REGEXP 'Mar*ia' |
+-------------------------+
|                       1 |
+-------------------------+

SELECT 'Maia' REGEXP 'Mar*ia';
+------------------------+
| 'Maia' REGEXP 'Mar*ia' |
+------------------------+
|                      1 |
+------------------------+

SELECT 'Marrria' REGEXP 'Mar*ia';
+---------------------------+
| 'Marrria' REGEXP 'Mar*ia' |
+---------------------------+
|                         1 |
+---------------------------+

+

x+ matches one or more of a character x. In the examples below, it's the r character.

SELECT 'Maria' REGEXP 'Mar+ia';
+-------------------------+
| 'Maria' REGEXP 'Mar+ia' |
+-------------------------+
|                       1 |
+-------------------------+

SELECT 'Maia' REGEXP 'Mar+ia';
+------------------------+
| 'Maia' REGEXP 'Mar+ia' |
+------------------------+
|                      0 |
+------------------------+

SELECT 'Marrria' REGEXP 'Mar+ia';
+---------------------------+
| 'Marrria' REGEXP 'Mar+ia' |
+---------------------------+
|                         1 |
+---------------------------+

?

x? matches zero or one of a character x. In the examples below, it's the r character.

SELECT 'Maria' REGEXP 'Mar?ia';
+-------------------------+
| 'Maria' REGEXP 'Mar?ia' |
+-------------------------+
|                       1 |
+-------------------------+

SELECT 'Maia' REGEXP 'Mar?ia';
+------------------------+
| 'Maia' REGEXP 'Mar?ia' |
+------------------------+
|                      1 |
+------------------------+

SELECT 'Marrria' REGEXP 'Mar?ia';
+---------------------------+
| 'Marrria' REGEXP 'Mar?ia' |
+---------------------------+
|                         0 |
+---------------------------+

()

(xyz) - combine a sequence, for example (xyz)+ or (xyz)*

SELECT 'Maria' REGEXP '(ari)+';
+-------------------------+
| 'Maria' REGEXP '(ari)+' |
+-------------------------+
|                       1 |
+-------------------------+

{}

x{n} and x{m,n} This notation is used to match many instances of the x. In the case of x{n} the match must be exactly that many times. In the case of x{m,n}, the match can occur from m to n times. For example, to match zero or one instance of the string ari (which is identical to (ari)?), the following can be used:

SELECT 'Maria' REGEXP '(ari){0,1}';
+-----------------------------+
| 'Maria' REGEXP '(ari){0,1}' |
+-----------------------------+
|                           1 |
+-----------------------------+

[]

[xy] groups characters for matching purposes. For example, to match either the p or the r character:

SELECT 'Maria' REGEXP 'Ma[pr]ia';
+---------------------------+
| 'Maria' REGEXP 'Ma[pr]ia' |
+---------------------------+
|                         1 |
+---------------------------+

The square brackets also permit a range match, for example, to match any character from a-z, [a-z] is used. Numeric ranges are also permitted.

SELECT 'Maria' REGEXP 'Ma[a-z]ia';
+----------------------------+
| 'Maria' REGEXP 'Ma[a-z]ia' |
+----------------------------+
|                          1 |
+----------------------------+

The following does not match, as r falls outside of the range a-p.

SELECT 'Maria' REGEXP 'Ma[a-p]ia';
+----------------------------+
| 'Maria' REGEXP 'Ma[a-p]ia' |
+----------------------------+
|                          0 |
+----------------------------+

The ^ character means does NOT match, for example:

SELECT 'Maria' REGEXP 'Ma[^p]ia';
+---------------------------+
| 'Maria' REGEXP 'Ma[^p]ia' |
+---------------------------+
|                         1 |
+---------------------------+

SELECT 'Maria' REGEXP 'Ma[^r]ia';
+---------------------------+
| 'Maria' REGEXP 'Ma[^r]ia' |
+---------------------------+
|                         0 |
+---------------------------+

The [ and ] characters on their own can be literally matched inside a [] block, without escaping, as long as they immediately match the opening bracket:

SELECT '[Maria' REGEXP '[[]';
+-----------------------+
| '[Maria' REGEXP '[[]' |
+-----------------------+
|                     1 |
+-----------------------+

SELECT '[Maria' REGEXP '[]]';
+-----------------------+
| '[Maria' REGEXP '[]]' |
+-----------------------+
|                     0 |
+-----------------------+

SELECT ']Maria' REGEXP '[]]';
+-----------------------+
| ']Maria' REGEXP '[]]' |
+-----------------------+
|                     1 |
+-----------------------+

SELECT ']Maria' REGEXP '[]a]';
+------------------------+
| ']Maria' REGEXP '[]a]' |
+------------------------+
|                      1 |
+------------------------+

Incorrect order, so no match:

SELECT ']Maria' REGEXP '[a]]';
+------------------------+
| ']Maria' REGEXP '[a]]' |
+------------------------+
|                      0 |
+------------------------+

The - character can also be matched in the same way:

SELECT '-Maria' REGEXP '[1-10]';
+--------------------------+
| '-Maria' REGEXP '[1-10]' |
+--------------------------+
|                        0 |
+--------------------------+

SELECT '-Maria' REGEXP '[-1-10]';
+---------------------------+
| '-Maria' REGEXP '[-1-10]' |
+---------------------------+
|                         1 |
+---------------------------+

Word boundaries

The :<: and :>: patterns match the beginning and the end of a word respectively. For example:

SELECT 'How do I upgrade MariaDB?' REGEXP '[[:<:]]MariaDB[[:>:]]';
+------------------------------------------------------------+
| 'How do I upgrade MariaDB?' REGEXP '[[:<:]]MariaDB[[:>:]]' |
+------------------------------------------------------------+
|                                                          1 |
+------------------------------------------------------------+

SELECT 'How do I upgrade MariaDB?' REGEXP '[[:<:]]Maria[[:>:]]';
+----------------------------------------------------------+
| 'How do I upgrade MariaDB?' REGEXP '[[:<:]]Maria[[:>:]]' |
+----------------------------------------------------------+
|                                                        0 |
+----------------------------------------------------------+

Character Classes

There are a number of shortcuts to match particular preset character classes. These are matched with the [:character_class:] pattern (inside a [] set). The following character classes exist:

Character Class

Description

alnum

Alphanumeric

alpha

Alphabetic

blank

Whitespace

cntrl

Control characters

digit

Digits

graph

Graphic characters

lower

Lowercase alphabetic

Graphic or space characters

punct

Punctuation

space

Space, tab, newline, and carriage return

upper

Uppercase alphabetic

xdigit

Hexadecimal digit

For example:

SELECT 'Maria' REGEXP 'Mar[[:alnum:]]*';
+--------------------------------+
| 'Maria' REGEXP 'Mar[:alnum:]*' |
+--------------------------------+
|                              1 |
+--------------------------------+

Remember that matches are by default case-insensitive, unless a binary string is used, so the following example, specifically looking for an uppercase, counter-intuitively matches a lowercase character:

SELECT 'Mari' REGEXP 'Mar[[:upper:]]+';
+---------------------------------+
| 'Mari' REGEXP 'Mar[[:upper:]]+' |
+---------------------------------+
|                               1 |
+---------------------------------+

SELECT BINARY 'Mari' REGEXP 'Mar[[:upper:]]+';
+----------------------------------------+
| BINARY 'Mari' REGEXP 'Mar[[:upper:]]+' |
+----------------------------------------+
|                                      0 |
+----------------------------------------+

Character Names

There are also number of shortcuts to match particular preset character names. These are matched with the [.character.] pattern (inside a [] set). The following character classes exist:

Name

Character

NUL

SOH

001

STX

002

ETX

003

EOT

004

ENQ

005

ACK

006

BEL

007

alert

007

010

backspace

'\b'

011

tab

'\t'

012

newline

'\n'

013

vertical-tab

'\v'

014

form-feed

'\f'

015

carriage-return

'\r'

016

017

DLE

020

DC1

021

DC2

022

DC3

023

DC4

024

NAK

025

SYN

026

ETB

027

CAN

030

031

SUB

032

ESC

033

IS4

034

IS3

035

IS2

036

IS1

037

space

' '

exclamation-mark

'!'

quotation-mark

'"'

number-sign

'#'

dollar-sign

'$'

percent-sign

'%'

ampersand

'&'

apostrophe

'''

left-parenthesis

'('

right-parenthesis

')'

asterisk

'*'

plus-sign

'+'

comma

','

hyphen

'-'

hyphen-minus

'-'

period

'.'

full-stop

'.'

slash

'/'

solidus

'/'

zero

'0'

one

'1'

two

'2'

three

'3'

four

'4'

five

'5'

six

'6'

seven

'7'

eight

'8'

nine

'9'

colon

':'

semicolon

';'

less-than-sign

'<'

equals-sign

'='

greater-than-sign

'>'

question-mark

'?'

commercial-at

'@'

left-square-bracket

'['

backslash

reverse-solidus

right-square-bracket

']'

circumflex

'^'

circumflex-accent

'^'

underscore

'_'

low-line

'_'

grave-accent

'`'

left-brace

'{'

left-curly-bracket

'{'

vertical-line

right-brace

'}'

right-curly-bracket

'}'

tilde

DEL

177

For example:

SELECT '|' REGEXP '[[.vertical-line.]]';
+----------------------------------+
| '|' REGEXP '[[.vertical-line.]]' |
+----------------------------------+
|                                1 |
+----------------------------------+

Combining

The true power of regular expressions is unleashed when the above is combined, to form more complex examples. Regular expression's reputation for complexity stems from the seeming complexity of multiple combined regular expressions, when in reality, it's simply a matter of understanding the characters and how they apply:

The first example fails to match, as while the Ma matches, either i or r only matches once before the ia characters at the end.

SELECT 'Maria' REGEXP 'Ma[ir]{2}ia';
+------------------------------+
| 'Maria' REGEXP 'Ma[ir]{2}ia' |
+------------------------------+
|                            0 |
+------------------------------+

This example matches, as either i or r match exactly twice after the Ma, in this case one r and one i.

SELECT 'Maria' REGEXP 'Ma[ir]{2}';
+----------------------------+
| 'Maria' REGEXP 'Ma[ir]{2}' |
+----------------------------+
|                          1 |
+----------------------------+

Escaping

With the large number of special characters, care needs to be taken to properly escape characters. Two backslash characters, `` (one for the MariaDB parser, one for the regex library), are required to properly escape a character. For example:

To match the literal (Ma:

SELECT '(Maria)' REGEXP '(Ma';
ERROR 1139 (42000): Got error 'parentheses not balanced' from regexp

SELECT '(Maria)' REGEXP '\(Ma';
ERROR 1139 (42000): Got error 'parentheses not balanced' from regexp

SELECT '(Maria)' REGEXP '\\(Ma';
+--------------------------+
| '(Maria)' REGEXP '\\(Ma' |
+--------------------------+
|                        1 |
+--------------------------+

To match r+: The first two examples are incorrect, as they match r one or more times, not r+:

SELECT 'Mar+ia' REGEXP 'r+';
+----------------------+
| 'Mar+ia' REGEXP 'r+' |
+----------------------+
|                    1 |
+----------------------+

SELECT 'Maria' REGEXP 'r+';
+---------------------+
| 'Maria' REGEXP 'r+' |
+---------------------+
|                   1 |
+---------------------+

SELECT 'Maria' REGEXP 'r\\+';
+-----------------------+
| 'Maria' REGEXP 'r\\+' |
+-----------------------+
|                     0 |
+-----------------------+

SELECT 'Maria' REGEXP 'r+';
+---------------------+
| 'Maria' REGEXP 'r+' |
+---------------------+
|                   1 |
+---------------------+

_{This page is licensed: CC BY-SA / Gnu FDL}

PCRE - Perl Compatible Regular Expressions

PCRE Versions

PCRE Version

Introduced

Maturity

PCRE2 10.34

MariaDB 10.5.1

Stable

PCRE 8.43

MariaDB 10.1.39

Stable

PCRE 8.42

MariaDB 10.2.15, MariaDB 10.1.33, MariaDB 10.0.35

Stable

PCRE 8.41

MariaDB 10.2.8, MariaDB 10.1.26, MariaDB 10.0.32

Stable

PCRE 8.40

MariaDB 10.2.5, MariaDB 10.1.22, MariaDB 10.0.30

Stable

PCRE 8.39

MariaDB 10.1.15, MariaDB 10.0.26

Stable

PCRE 8.38

MariaDB 10.1.10, MariaDB 10.0.23

Stable

PCRE 8.37

MariaDB 10.1.5, MariaDB 10.0.18

Stable

PCRE 8.36

MariaDB 10.1.2, MariaDB 10.0.15

Stable

PCRE 8.35

MariaDB 10.1.0, MariaDB 10.0.12

Stable

PCRE 8.34

MariaDB 10.0.8

Stable

PCRE Enhancements

MariaDB uses the PCRE library, which significantly improves the power of the REGEXP/RLIKE operator.

The switch to PCRE added a number of features, including recursive patterns, named capture, look-ahead and look-behind assertions, non-capturing groups, non-greedy quantifiers, Unicode character properties, extended syntax for characters and character classes, multi-line matching, and many other.

These functions work with regular expressions: REGEXP_REPLACE(), REGEXP_INSTR(), and REGEXP_SUBSTR().

Also, REGEXP/RLIKE, and the new functions, work correctly with all multi-byte character sets supported by MariaDB, including East-Asian character sets (big5, gb2313, gbk, eucjp, eucjpms, cp932, ujis, euckr), and Unicode character sets (utf8, utf8mb4, ucs2, utf16, utf16le, utf32).

New Regular Expression Functions

REGEXP_REPLACE(subject, pattern, replace) - Replaces all occurrences of a pattern.
REGEXP_INSTR(subject, pattern) - Position of the first appearance of a regex .
REGEXP_SUBSTR(subject,pattern) - Returns the matching part of a string.

See the individual articles for more details and examples.

PCRE Syntax

In most cases PCRE is backward compatible with the old POSIX 1003.2 compliant regexp library (see Regular Expressions Overview), so you won't need to change your applications that use SQL queries with the REGEXP/RLIKE predicate.

This section briefly describes the most important extended PCRE features. For more details please refer to the documentation on the PCRE site, or to the documentation which is bundled in the /pcre/doc/html/ directory of a MariaDB sources distribution. The pages pcresyntax.html and pcrepattern.html should be a good start. Regular-Expressions.Info is another good resource to learn about PCRE and regular expressions generally.

Special Characters

PCRE supports the following escape sequences to match special characters:

Sequence

Description

0x07 (BEL)

\cx

"control-x", where x is any ASCII character

0x1B (escape)

0x0C (form feed)

0x0A (newline)

0x0D (carriage return)

0x09 (TAB)

\ddd

character with octal code ddd

\xhh

character with hex code hh

\x{hhh..}

character with hex code hhh..

Note, the backslash characters (here, and in all examples in the sections below) must be escaped with another backslash, unless you're using the SQL_MODE NO_BACKSLASH_ESCAPES.

This example tests if a character has hex code 0x61:

SELECT 'a' RLIKE '\\x{61}';
-> 1

Character Classes

PCRE supports the standard POSIX character classes such as alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit, with the following additional classes:

Class

Description

ascii

any ASCII character (0x00..0x7F)

word

any "word" character (a letter, a digit, or an underscore)

This example checks if the string consists of ASCII characters only:

SELECT 'abc' RLIKE '^[[:ascii:]]+$';
-> 1

Generic Character Types

Generic character types complement the POSIX character classes and serve to simplify writing patterns:

Class

Description

a decimal digit (same as [:digit:])

a character that is not a decimal digit

a horizontal white space character

a character that is not a horizontal white space character

a character that is not a new line

a newline sequence

a white space character

a character that is not a white space character

a vertical white space character

a character that is not a vertical white space character

a "word" character (same as [:word:])

a "non-word" character

This example checks if the string consists of "word" characters only:

SELECT 'abc' RLIKE '^\\w+$';
-> 1

Unicode Character Properties

\p{xx} is a character with the xx property, and \P{xx} is a character without the xx property.

The property names represented by xx above are limited to the Unicode script names, the general category properties, and "Any", which matches any character (including newline). Those that are not part of an identified script are lumped together as "Common".

General Category Properties For \p and \P

Property

Description

Other

Control

Format

Unassigned

Private use

Surrogate

Letter

Lower case letter

Modifier letter

Other letter

Title case letter

Upper case letter

Ll, Lu, or Lt

Mark

Spacing mark

Enclosing mark

Non-spacing mark

Number

Decimal number

Letter number

Other number

Punctuation

Connector punctuation

Dash punctuation

Close punctuation

Final punctuation

Initial punctuation

Other punctuation

Open punctuation

Symbol

Currency symbol

Modifier symbol

Mathematical symbol

Other symbol

Separator

Line separator

Paragraph separator

Space separator

This example checks if the string consists only of characters with property N (number):

SELECT '1¼①' RLIKE '^\\p{N}+$';
-> 1

Special Category Properties For \p and \P

Property

Description

Xan

Alphanumeric: union of properties L and N

Xps

POSIX space: property Z or tab, NL, VT, FF, CR

Xsp

Perl space: property Z or tab, NL, FF, CR

Xuc

A character than can be represented by a Universal Character Name

Xwd

Perl word: property Xan or underscore

The property Xuc matches any character that can be represented by a Universal Character Name (in C++ and other programming languages). These include $, @, ```, and all characters with Unicode code points greater than U+00A0, excluding the surrogates U+D800..U+DFFF.

Script Names For \p and \P

Arabic, Armenian, Avestan, Balinese, Bamum, Batak, Bengali, Bopomofo, Brahmi, Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Chakma, Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic, Deseret, Devanagari, Egyptian_Hieroglyphs, Ethiopic, Georgian, Glagolitic, Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew, Hiragana, Imperial_Aramaic, Inherited, Inscriptional_Pahlavi, Inscriptional_Parthian, Javanese, Kaithi, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer, Lao, Latin, Lepcha, Limbu, Linear_B, Lisu, Lycian, Lydian, Malayalam, Mandaic, Meetei_Mayek, Meroitic_Cursive, Meroitic_Hieroglyphs, Miao, Mongolian, Myanmar, New_Tai_Lue, Nko, Ogham, Old_Italic, Old_Persian, Old_South_Arabian, Old_Turkic, Ol_Chiki, Oriya, Osmanya, Phags_Pa, Phoenician, Rejang, Runic, Samaritan, Saurashtra, Sharada, Shavian, Sinhala, Sora_Sompeng, Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa, Tai_Le, Tai_Tham, Tai_Viet, Takri, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh, Ugaritic, Vai, Yi.

This example checks if the string consists only of Greek characters:

SELECT 'ΣΦΩ' RLIKE '^\\p{Greek}+$';
-> 1

Extended Unicode Grapheme Sequence

The \X escape sequence matches a character sequence that makes an "extended grapheme cluster", i.e. a composite character that consists of multiple Unicode code points.

One of the examples of a composite character can be a letter followed by non-spacing accent marks. This example demonstrates that U+0045 LATIN CAPITAL LETTER E followed by U+0302 COMBINING CIRCUMFLEX ACCENT followed by U+0323 COMBINING DOT BELOW together form an extended grapheme cluster:

SELECT _ucs2 0x004503020323 RLIKE '^\\X$';
-> 1

See the PCRE documentation for the other types of extended grapheme clusters.

Simple Assertions

An assertion specifies a certain condition that must match at a particular point, but without consuming characters from the subject string. In addition to the standard POSIX simple assertions ^ (that matches at the beginning of a line) and $ (that matches at the end of a line), PCRE supports a number of other assertions:

Assertion

Description

matches at a word boundary

matches when not at a word boundary

matches at the start of the subject

matches at the end of the subject, also matches before a newline at the end of the subject

matches only at the end of the subject

matches at the first matching position in the subject

This example cuts a word that consists only of 3 characters from a string:

SELECT REGEXP_SUBSTR('---abcd---xyz---', '\\b\\w{3}\\b');
-> xyz

Notice that the two \b assertions checked the word boundaries but did not get into the matching pattern.

The \b assertions work well in the beginning and the end of the subject string:

SELECT REGEXP_SUBSTR('xyz', '\\b\\w{3}\\b');
-> xyz

By default, the ^ and $ assertions have the same meaning with \A, \Z, and \z. However, the meanings of ^ and $ can change in multiline mode (see below). By contrast, the meanings of \A, \Z, and \z are always the same; they are independent of the multiline mode.

Option Setting

A number of options that control the default match behavior can be changed within the pattern by a sequence of option letters enclosed between (? and ).

Option

Description

(?i)

case insensitive match

(?m)

multiline mode

(?s)

dotall mode (dot matches newline characters)

(?x)

extended (ignore white space)

(?U)

ungreedy (lazy) match

(?J)

allow named subpatterns with duplicate names

(?X)

extra PCRE functionality (e.g. force error on unknown escaped character)

(?-...)

unset option(s)

For example, (?im) sets case insensitive multiline matching.

A hyphen followed by the option letters unset the options. For example, (?-im) means case sensitive single line match.

A combined setting and unsetting is also possible, e.g. (?im-sx).

If an option is set outside of subpattern parentheses, the option applies to the remainder of the pattern that follows the option. If an option is set inside a subpattern, it applies to the part of this subpattern that follows the option.

In this example the pattern (?i)m((?-i)aria)db matches the words MariaDB, Mariadb, mariadb, but not MARIADB:

SELECT 'MariaDB' RLIKE '(?i)m((?-i)aria)db';
-> 1

SELECT 'mariaDB' RLIKE '(?i)m((?-i)aria)db';
-> 1

SELECT 'Mariadb' RLIKE '(?i)m((?-i)aria)db';
-> 1

SELECT 'MARIADB' RLIKE '(?i)m((?-i)aria)db';
-> 0

This example demonstrates that the (?x) option makes the regexp engine ignore all white spaces in the pattern (other than in a class).

SELECT 'ab' RLIKE '(?x)a b';
-> 1

Note, putting spaces into a pattern in combination with the (?x) option can be useful to split different logical parts of a complex pattern, to make it more readable.

Multiline Matching

Multiline matching changes the meaning of ^ and $ from "the beginning of the subject string" and "the end of the subject string" to "the beginning of any line in the subject string" and "the end of any line in the subject string" respectively.

This example checks if the subject string contains two consequent lines that fully consist of digits:

SELECT 'abc\n123\n456\nxyz\n' RLIKE '(?m)^\\d+\\R\\d+$';
-> 1

Notice the (?m) option in the beginning of the pattern, which switches to the multiline matching mode.

Newline Conventions

PCRE supports five line break conventions:

CR (\r) - a single carriage return character
LF (\n) - a single linefeed character
CRLF (\r\n) - a carriage return followed by a linefeed
any of the previous three
any Unicode newline sequence

By default, the newline convention is set to any Unicode newline sequence, which includes:

Sequence

Description

(U+000A, carriage return)

(U+000D, carriage return)

CRLF

(a carriage return followed by a linefeed)

(U+000B, vertical tab)

(U+000C, form feed)

NEL

(U+0085, next line)

(U+2028, line separator)

(U+2029, paragraph separator)

The newline convention can be set by starting a pattern with one of the following sequences:

Sequence

Description

(*CR)

carriage return

(*LF)

linefeed

(*CRLF)

carriage return followed by linefeed

(*ANYCRLF)

any of the previous three

(*ANY)

all Unicode newline sequences

The newline conversion affects the ^ and $ assertions, the interpretation of the dot metacharacter, and the behavior of .

Note, the new line convention does not affect the meaning of .

This example demonstrates that the dot metacharacter matches , because it is not a newline sequence anymore:

SELECT 'a\nb' RLIKE '(*CR)a.b';
-> 1

Newline Sequences

By default, the escape sequence matches any Unicode newline sequences.

The meaning of can be set by starting a pattern with one of the following sequences:

Sequence

Description

(*BSR_ANYCRLF)

any of CR, LF or CRLF

(*BSR_UNICODE)

any Unicode newline sequence

Comments

It's possible to include comments inside a pattern. Comments do not participate in the pattern matching. Comments start at the (?

sequence and continue up to the next closing parenthesis:

SELECT 'ab12' RLIKE 'ab(?#expect digits)12';
-> 1

Quoting

POSIX uses the backslash to remove a special meaning from a character. PCRE introduces a syntax to remove special meaning from a sequence of characters. The characters inside \Q ... \E are treated literally, without their special meaning.

This example checks if the string matches a dollar sign followed by a parenthesized name (a variable reference in some languages):

SELECT '$(abc)' RLIKE '^\\Q$(\\E\\w+\\Q)\\E$';
-> 1

Note that the leftmost dollar sign and the parentheses are used literally, while the rightmost dollar sign is still used to match the end of the string.

Resetting the Match Start

The escape sequence \K causes any previously matched characters to be excluded from the final matched sequence. For example, the pattern: (foo)\Kbar matches foobar, but reports that it has matched bar. This feature is similar to a look-behind assertion. However, in this case, the part of the subject before the real match does not have to be of fixed length:

SELECT REGEXP_SUBSTR('aaa123', '[a-z]*\\K[0-9]*');
-> 123

Non-Capturing Groups

The question mark and the colon after the opening parenthesis create a non-capturing group: (?:...).

This example removes an optional article from a word, for example for better sorting of the results.

SELECT REGEXP_REPLACE('The King','(?:the|an|a)[^a-z]([a-z]+)','\\1');
-> King

Note that the articles are listed inside the left parentheses using the alternation operator | but they do not produce a captured subpattern, so the word followed by the article is referenced by '1' in the third argument to the function. Using non-capturing groups can be useful to save numbers on the sup-patterns that won't be used in the third argument of REGEXP_REPLACE(), as well as for performance purposes.

Non-Greedy Quantifiers

By default, the repetition quantifiers ?, *, + and {n,m} are "greedy", that is, they try to match as much as possible. Adding a question mark after a repetition quantifier makes it "non-greedy", so the pattern matches the minimum number of times possible.

This example cuts C comments from a line:

SELECT REGEXP_REPLACE('/* Comment1 */ i+= 1; /* Comment2 */', '/[*].*?[*]/','');
->  i+= 1;

The pattern without the non-greedy flag to the quantifier /[*].*[*]/ would match the entire string between the leftmost /* and the rightmost */.

Atomic Groups

A sequence inside (?>...) makes an atomic group. Backtracking inside an atomic group is prevented once it has matched; however, backtracking past to the previous items works normally.

Consider the pattern \d+foo applied to the subject string 123bar. Once the engine scans 123 and fails on the letter b, it would normally backtrack to 2 and try to match again, then fail and backtrack to 1 and try to match and fail again, and finally fail the entire pattern. In case of an atomic group (?>\d+)foo with the same subject string 123bar, the engine gives up immediately after the first failure to match foo. An atomic group with a quantifier can match all or nothing.

Atomic groups produce faster false results (i.e. in case when a long subject string does not match the pattern), because the regexp engine saves performance on backtracking. However, don't hurry to put everything into atomic groups. This example demonstrates the difference between atomic and non-atomic match:

SELECT 'abcc' RLIKE 'a(?>bc|b)c' AS atomic1;
-> 1

SELECT 'abc' RLIKE 'a(?>bc|b)c' AS atomic2;
-> 0
 
SELECT 'abcc' RLIKE 'a(bc|b)c' AS non_atomic1;
-> 1

SELECT 'abc' RLIKE 'a(bc|b)c' AS non_atomic2;
-> 1

The non-atomic pattern matches both abbc and abc, while the atomic pattern matches abbc only.

The atomic group (?>bc|b) in the above example can be "translated" as "if there is bc, then don't try to match as b". So b can match only if bc is not found.

Atomic groups are not capturing. To make an atomic group capturing, put it into parentheses:

SELECT REGEXP_REPLACE('abcc','a((?>bc|b))c','\\1');
-> bc

Possessive quantifiers

An atomic group which ends with a quantifier can be rewritten using a so called "possessive quantifier" syntax by putting an additional + sign following the quantifier.

The pattern (?>\d+)foo from the previous section's example can be rewritten as \d++foo.

Absolute and Relative Numeric Backreferences

Backreferences match the same text as previously matched by a capturing group. Backreferences can be written using:

a backslash followed by a digit
the \g escape sequence followed by a positive or negative number
the \g escape sequence followed by a positive or negative number enclosed in braces

The following backreferences are identical and refer to the first capturing group:

\1
\g1
\g{1}

This example demonstrates a pattern that matches "sense and sensibility" and "response and responsibility", but not "sense and responsibility":

SELECT 'sense and sensibility' RLIKE '(sens|respons)e and \\1ibility';
-> 1

This example removes doubled words that can unintentionally creep in when you edit a text in a text editor:

SELECT REGEXP_REPLACE('using using the the regexp regexp',
 '\\b(\\w+)\\s+\\1\\b','\\1');
-> USING the regexp

Note that all double words were removed, in the beginning, in the middle and in the end of the subject string.

A negative number in a \g sequence means a relative reference. Relative references can be helpful in long patterns, and also in patterns that are created by joining fragments together that contain references within themselves. The sequence \g{-1} is a reference to the most recently started capturing subpattern before \g.

In this example \g{-1} is equivalent to \2:

SELECT 'abc123def123' RLIKE '(abc(123)def)\\g{-1}';     
-> 1

SELECT 'abc123def123' RLIKE '(abc(123)def)\\2';
-> 1

Named Subpatterns and Backreferences

Using numeric backreferences for capturing groups can be hard to track in a complicated regular expression. Also, the numbers can change if an expression is modified. To overcome these difficulties, PCRE supports named subpatterns.

A subpattern can be named in one of three ways: (?<name>...) or (?'name'...) as in Perl, or (?P<name>...) as in Python. References to capturing subpatterns from other parts of the pattern, can be made by name as well as by number.

Backreferences to a named subpattern can be written using the .NET syntax \k{name}, the Perl syntax \k<name> or \k'name' or \g{name}, or the Python syntax (?P=name).

This example tests if the string is a correct HTML tag:

SELECT '<a href="../">Up</a>' RLIKE '<(?<tag>[a-z][a-z0-9]*)[^>]*>[^<]*</(?P=tag)>';
-> 1

Positive and Negative Look-Ahead and Look-Behind Assertions

Look-ahead and look-behind assertions serve to specify the context for the searched regular expression pattern. Note that the assertions only check the context, they do not capture anything themselves!

This example finds the letter which is not followed by another letter (negative look-ahead):

SELECT REGEXP_SUBSTR('ab1','[a-z](?![a-z])');
-> b

This example finds the letter which is followed by a digit (positive look-ahead):

SELECT REGEXP_SUBSTR('ab1','[a-z](?=[0-9])');
-> b

This example finds the letter which does not follow a digit character (negative look-behind):

SELECT REGEXP_SUBSTR('1ab','(?<![0-9])[a-z]');
-> b

This example finds the letter which follows another letter character (positive look-behind):

SELECT REGEXP_SUBSTR('1ab','(?<=[a-z])[a-z]');
-> b

Note that look-behind assertions can only be of fixed length; you cannot have repetition operators or alternations with different lengths:

SELECT 'aaa' RLIKE '(?<=(a|bc))a';
ERROR 1139 (42000): Got error 'lookbehind assertion is not fixed length at offset 10' from regexp

Subroutine Reference and Recursive Patterns

PCRE supports a special syntax to recourse the entire pattern or its individual subpatterns:

Syntax

Description

(?R)

Recourse the entire pattern

(?n)

call subpattern by absolute number

(?+n)

call subpattern by relative number

(?-n)

call subpattern by relative number

(?&name)

call subpattern by name (Perl)

(?P>name)

call subpattern by name (Python)

call subpattern by name (Oniguruma)

\g'name'

call subpattern by name (Oniguruma)

call subpattern by absolute number (Oniguruma)

\g'n'

call subpattern by absolute number (Oniguruma)

\g<+n>

call subpattern by relative number

\g<-n>

call subpattern by relative number

\g'+n'

call subpattern by relative number

\g'-n'

call subpattern by relative number

This example checks for a correct additive arithmetic expression consisting of numbers, unary plus and minus, binary plus and minus, and parentheses:

SELECT '1+2-3+(+(4-1)+(-2)+(+1))' RLIKE  '^(([+-]?(\\d+|[(](?1)[)]))(([+-](?1))*))$';
-> 1

The recursion is done using (?1) to call for the first parenthesized subpattern, which includes everything except the leading ^ and the trailing $.

The regular expression in the above example implements the following BNF grammar:

<expression> ::= <term> [(<sign> <term>)...]
<term> ::= [ <sign> ] <primary>
<primary> ::= <number> | <left paren> <expression> <right paren>
<sign> ::= <plus sign> | <minus sign>

Defining Subpatterns For Use By Reference

Use the (?(DEFINE)...) syntax to define subpatterns that can be referenced from elsewhere.

This example defines a subpattern with the name letters that matches one or more letters, which is further reused two times:

SELECT 'abc123xyz' RLIKE '^(?(DEFINE)(?<letters>[a-z]+))(?&letters)[0-9]+(?&letters)$';
-> 1

The above example can also be rewritten to define the digit part as a subpattern as well:

SELECT 'abc123xyz' RLIKE
 '^(?(DEFINE)(?<letters>[a-z]+)(?<digits>[0-9]+))(?&letters)(?&digits)(?&letters)$';
-> 1

Conditional Subpatterns

There are two forms of conditional subpatterns:

(?(condition)yes-pattern)
(?(condition)yes-pattern|no-pattern)

The yes-pattern is used if the condition is satisfied, otherwise the no-pattern (if any) is used.

Conditions With Subpattern References

If a condition consists of a number, it makes a condition with a subpattern reference. Such a condition is true if a capturing subpattern corresponding to the number has previously matched.

This example finds an optionally parenthesized number in a string:

SELECT REGEXP_SUBSTR('a(123)b', '([(])?[0-9]+(?(1)[)])');
-> (123)

The ([(])? part makes a capturing subpattern that matches an optional opening parenthesis; the [0-9]+ part matches a number, and the (?(1)[)]) part matches a closing parenthesis, but only if the opening parenthesis has been previously found.

Other Kinds of Conditions

The other possible condition kinds are: recursion references and assertions. See the PCRE documentation for details.

Matching Zero Bytes (0x00)

PCRE correctly works with zero bytes in the subject strings:

SELECT 'a\0b' RLIKE '^a.b$';
-> 1

Zero bytes, however, are not supported literally in the pattern strings and should be escaped using the \xhh or \x{hh} syntax:

SELECT 'a\0b' RLIKE '^a\\x{00}b$';
-> 1

Other PCRE Features

PCRE provides other extended features that were not covered in this document, such as duplicate subpattern numbers, backtracking control, breaking utf-8 sequences into individual bytes, setting the match limit, setting the recursion limit, optimization control, recursion conditions, assertion conditions and more types of extended grapheme clusters. Please refer to thePCRE documentation for details.

Enhanced regex was implemented as a GSoC 2013 project by Sudheera Palihakkara.

default_regex_flags Examples

The default_regex_flags variable was introduced to address the remaining incompatibilities between PCRE and the old regex library. Here are some examples of its usage:

The default behaviour (multiline match is off)

SELECT 'a\nb\nc' RLIKE '^b$';
+---------------------------+
| '(?m)a\nb\nc' RLIKE '^b$' |
+---------------------------+
|                         0 |
+---------------------------+

Enabling the multiline option using the PCRE option syntax:

SELECT 'a\nb\nc' RLIKE '(?m)^b$';
+---------------------------+
| 'a\nb\nc' RLIKE '(?m)^b$' |
+---------------------------+
|                         1 |
+---------------------------+

Enabling the miltiline option using default_regex_flags

SET default_regex_flags='MULTILINE';
SELECT 'a\nb\nc' RLIKE '^b$';
+-----------------------+
| 'a\nb\nc' RLIKE '^b$' |
+-----------------------+
|                     1 |
+-----------------------+

REGEXP

Syntax

expr REGEXP pat, expr RLIKE pat

Description

Performs a pattern match of a string expression expr against a patternpat. The pattern can be an extended regular expression. See Regular Expressions Overview for details on the syntax for regular expressions (see also PCRE Regular Expressions).

Returns 1 if expr matches pat or 0 if it doesn't match. If either expr or pat are NULL, the result is NULL.

The negative form NOT REGEXP also exists, as an alias for NOT (string REGEXP pattern). RLIKE and NOT RLIKE are synonyms for REGEXP and NOT REGEXP, originally provided for mSQL compatibility.

The pattern need not be a literal string. For example, it can be specified as a string expression or table column.

Note: Because MariaDB uses the C escape syntax in strings (for example, "\n" to represent the newline character), you must double any "" that you use in your REGEXP strings.

REGEXP is not case sensitive, except when used with binary strings.

The default_regex_flags variable addresses the remaining compatibilities between PCRE and the old regex library.

Examples

SELECT 'Monty!' REGEXP 'm%y%%';
+-------------------------+
| 'Monty!' REGEXP 'm%y%%' |
+-------------------------+
|                       0 |
+-------------------------+

SELECT 'Monty!' REGEXP '.*';
+----------------------+
| 'Monty!' REGEXP '.*' |
+----------------------+
|                    1 |
+----------------------+

SELECT 'new*\n*line' REGEXP 'new\\*.\\*line';
+---------------------------------------+
| 'new*\n*line' REGEXP 'new\\*.\\*line' |
+---------------------------------------+
|                                     1 |
+---------------------------------------+

SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A';
+----------------+-----------------------+
| 'a' REGEXP 'A' | 'a' REGEXP BINARY 'A' |
+----------------+-----------------------+
|              1 |                     0 |
+----------------+-----------------------+

SELECT 'a' REGEXP '^[a-d]';
+---------------------+
| 'a' REGEXP '^[a-d]' |
+---------------------+
|                   1 |
+---------------------+

default_regex_flags examples

MariaDB uses the default_regex_flags variable to address the remaining compatibilities between PCRE and the old regex library.

The default behavior (multiline match is off)

SELECT 'a\nb\nc' RLIKE '^b$';
+---------------------------+
| '(?m)a\nb\nc' RLIKE '^b$' |
+---------------------------+
|                         0 |
+---------------------------+

Enabling the multiline option using the PCRE option syntax:

SELECT 'a\nb\nc' RLIKE '(?m)^b$';
+---------------------------+
| 'a\nb\nc' RLIKE '(?m)^b$' |
+---------------------------+
|                         1 |
+---------------------------+

Enabling the multiline option using default_regex_flags

SET default_regex_flags='MULTILINE';
SELECT 'a\nb\nc' RLIKE '^b$';
+-----------------------+
| 'a\nb\nc' RLIKE '^b$' |
+-----------------------+
|                     1 |
+-----------------------+

REGEXP_INSTR

Syntax

REGEXP_INSTR(subject, pattern)

Returns the position of the first occurrence of the regular expression pattern in the string subject, or 0 if pattern was not found.

The positions start with 1 and are measured in characters (i.e. not in bytes), which is important for multi-byte character sets. You can cast a multi-byte character set to BINARY to get offsets in bytes.

The function follows the case sensitivity rules of the effective collation. Matching is performed case insensitively for case insensitive collations, and case sensitively for case sensitive collations and for binary data.

The collation case sensitivity can be overwritten using the (?i) and (?-i) PCRE flags.

MariaDB uses the PCRE regular expression library for enhanced regular expression performance, and REGEXP_INSTR was introduced as part of this enhancement.

Examples

SELECT REGEXP_INSTR('abc','b');
-> 2

SELECT REGEXP_INSTR('abc','x');
-> 0

SELECT REGEXP_INSTR('BJÖRN','N');
-> 5

Casting a multi-byte character set as BINARY to get offsets in bytes:

SELECT REGEXP_INSTR(BINARY 'BJÖRN','N') AS cast_utf8_to_binary;
-> 6

Case sensitivity:

SELECT REGEXP_INSTR('ABC','b');
-> 2

SELECT REGEXP_INSTR('ABC' COLLATE utf8_bin,'b');
-> 0

SELECT REGEXP_INSTR(BINARY'ABC','b');
-> 0

SELECT REGEXP_INSTR('ABC','(?-i)b');
-> 0

SELECT REGEXP_INSTR('ABC' COLLATE utf8_bin,'(?i)b');
-> 2

_{This page is licensed: CC BY-SA / Gnu FDL}

REGEXP_REPLACE

Syntax

REGEXP_REPLACE(subject, pattern, replace)

Description

REGEXP_REPLACE returns the string subject with all occurrences of the regular expression pattern replaced by the string replace. If no occurrences are found, then subject is returned as is.

The replace string can have backreferences to the subexpressions in the form \N, where N is a number from 1 to 9.

The collation case sensitivity can be overwritten using the (?i) and (?-i) PCRE flags.

MariaDB uses the PCRE regular expression library for enhanced regular expression performance, and REGEXP_REPLACE was introduced as part of this enhancement.

The default_regex_flags variable addresses the remaining compatibilities between PCRE and the old regex library.

Examples

SELECT REGEXP_REPLACE('ab12cd','[0-9]','') AS remove_digits;
-> abcd

SELECT REGEXP_REPLACE('<html><head><title>title</title><body>body</body></htm>', '<.+?>',' ')
AS strip_html;
-> title  body

Backreferences to the subexpressions in the form , where N is a number from 1 to 9:

SELECT REGEXP_REPLACE('James Bond','^(.*) (.*)$','\\2, \\1') AS reorder_name;
-> Bond, James

Case insensitive and case sensitive matches:

SELECT REGEXP_REPLACE('ABC','b','-') AS case_insensitive;
-> A-C

SELECT REGEXP_REPLACE('ABC' COLLATE utf8_bin,'b','-') AS case_sensitive;
-> ABC

SELECT REGEXP_REPLACE(BINARY 'ABC','b','-') AS binary_data;
-> ABC

Overwriting the collation case sensitivity using the (?i) and (?-i) PCRE flags.

SELECT REGEXP_REPLACE('ABC','(?-i)b','-') AS force_case_sensitive;
-> ABC

SELECT REGEXP_REPLACE(BINARY 'ABC','(?i)b','-') AS force_case_insensitive;
-> A-C

_{This page is licensed: CC BY-SA / Gnu FDL}

REGEXP_SUBSTR

Syntax

REGEXP_SUBSTR(subject,pattern)

Description

Returns the part of the string subject that matches the regular expression pattern, or an empty string if pattern was not found.

The collation case sensitivity can be overwritten using the (?i) and (?-i) PCRE flags.

MariaDB uses the PCRE regular expression library for enhanced regular expression performance, and REGEXP_SUBSTR was introduced as part of this enhancement.

The default_regex_flags variable addresses the remaining compatibilities between PCRE and the old regex library.

Examples

SELECT REGEXP_SUBSTR('ab12cd','[0-9]+');
-> 12

SELECT REGEXP_SUBSTR(
  'See https://mariadb.org/en/foundation/ for details',
  'https?://[^/]*');
-> https://mariadb.org

SELECT REGEXP_SUBSTR('ABC','b');
-> B

SELECT REGEXP_SUBSTR('ABC' COLLATE utf8_bin,'b');
->

SELECT REGEXP_SUBSTR(BINARY'ABC','b');
->

SELECT REGEXP_SUBSTR('ABC','(?i)b');
-> B

SELECT REGEXP_SUBSTR('ABC' COLLATE utf8_bin,'(?+i)b');
-> B

_{This page is licensed: CC BY-SA / Gnu FDL}

RLIKE

Syntax

expr REGEXP pat, expr RLIKE pat

Description

RLIKE is a synonym for REGEXP.

_{This page is licensed: CC BY-SA / Gnu FDL}

String Functions

ASCII

Syntax

Description

Examples

BIN

Syntax

Description

Examples

See Also

BINARY Operator

Syntax

Description

Examples

See Also

BIT_LENGTH

Syntax

Description

Examples

Compatibility

CAST

Syntax

Description

Examples

See Also

CHAR Function

Syntax

Description

Examples

See Also

CHAR_LENGTH

Syntax

Description

Examples

See Also

CHARACTER_LENGTH

Syntax

Description

CHR

Syntax

Description

Examples

See Also

CONCAT

Syntax

Description

Oracle Mode

Examples

See Also

CONCAT_WS

Syntax

Description

Examples

See Also

CONVERT

Syntax

Description

Examples

See Also

ELT

Syntax

Description

Examples

See also

EXPORT_SET

Syntax

Description

Examples

EXTRACTVALUE

Syntax

Description

Invalid Arguments

Explicit text() Expressions

Count Matches

Matches

Examples

FIELD

Syntax

Description

Examples