Routers are the core of a MaxScale service, intelligently directing database traffic. This version adds the KafkaCDC router and enhances others with features like causal reads.
The avrorouter is a MariaDB 10.0 binary log to Avro file converter. It consumes binary logs from a local directory and transforms them into a set of Avro files. These files can then be queried by clients for various purposes.
This router is intended to be used in tandem with the Binlog Server. The Binlog Server can connect to a master server and request binlog records. These records can then be consumed by the avrorouter directly from the binlog cache of the Binlog Server. This allows MariaDB MaxScale to automatically transform binlog events on the master to local Avro format files.
The avrorouter can also consume binary logs straight from the master. This will remove the need to configure the Binlog Server but it will increase the disk space requirement on the master server by at least a factor of two.
The converted Avro files can be requested with the CDC protocol, which is currently the only protocol the avrorouter supports. Clients can request either Avro or JSON format data streams from a database table.
MaxScale 2.4.0 added a direct replication mode that connects the avrorouter directly to a MariaDB server. This is the recommended way of using the avrorouter: compared to the binlogrouter-based replication, the conversion process is faster, more space-efficient, and less prone to errors caused by missing DDL events.
To enable the direct replication mode, add either the servers or the cluster parameter to the avrorouter service. The avrorouter will then use one of the servers as the replication source.
Here is a minimal avrorouter direct replication configuration:
[maxscale]
threads=auto
[server1]
type=server
address=127.0.0.1
port=3306
[cdc-service]
type=service
router=avrorouter
servers=server1
user=maxuser
password=maxpwd
[cdc-listener]
type=listener
service=cdc-service
protocol=CDC
port=4001
In direct replication mode, the avrorouter stores the latest replicated GTID in the last_gtid.txt file located in the avrodir (defaults to /var/lib/maxscale). To reset the replication process, stop MaxScale and remove the file.
Additionally, the avrorouter will attempt to automatically create any missing schema files for tables that have data events for them but the DDL for those tables is not contained in the binlogs.
For information about common service parameters, refer to the Configuration Guide.
gtid_start_pos
Type: string
Mandatory: No
Dynamic: No
Default: ""
The GTID from which the avrorouter starts replicating in direct replication mode. The parameter value must be in the MariaDB GTID format, e.g. 0-1-123, where the first number is the replication domain, the second the server_id value of the server and the last the GTID sequence number.
This parameter has no effect in the traditional mode. If this parameter is not defined, the replication will start from the first GTID that the master serves.
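The three components of a MariaDB GTID can be split out as shown below. The parse_gtid helper is a hypothetical illustration of the format, not part of MaxScale:

```python
def parse_gtid(gtid: str) -> tuple[int, int, int]:
    """Split a MariaDB GTID such as '0-1-123' into its three parts:
    replication domain, server_id of the originating server, and the
    GTID sequence number."""
    domain, server_id, sequence = (int(part) for part in gtid.split("-"))
    return domain, server_id, sequence

print(parse_gtid("0-1-123"))  # (0, 1, 123)
```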
server_id
Type: number
Mandatory: No
Dynamic: No
Default: 1234
The server_id used when replicating from the master in direct replication mode.
codec
Type: enum
Mandatory: No
Dynamic: No
Values: null, deflate
Default: null
The compression codec to use. By default, the avrorouter does not use compression.
This parameter takes one of two values: null or deflate. These are the compression codecs required by the Avro specification. For more information about the compression types, refer to the Avro specification.
match and exclude
Type: regex
Mandatory: No
Dynamic: No
Default: ""
These regular expression settings filter events for processing based on table names. The avrorouter does not support the options parameter for regular expressions.
To prevent excessive matching of similarly named tables, surround each table name with the ^ and $ tokens. For example, to match the test.clients table but not the test.clients_old table, use match=^test[.]clients$. For multiple tables, surround each table in parentheses and add a pipe character between them: match=(^test[.]t1$)|(^test[.]t2$).
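The effect of the anchors and the alternation pattern can be checked with an ordinary regex engine. Python's re module is used here only to approximate the behavior of the router's pattern matching:

```python
import re

# Anchors keep similarly named tables from matching.
anchored = re.compile(r"^test[.]clients$")
assert anchored.search("test.clients")
assert not anchored.search("test.clients_old")

# Multiple tables combined with parentheses and a pipe character.
multi = re.compile(r"(^test[.]t1$)|(^test[.]t2$)")
tables = ["test.t1", "test.t2", "test.t1_old", "prod.t1"]
print([t for t in tables if multi.search(t)])  # ['test.t1', 'test.t2']
```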
binlogdir
Type: path
Mandatory: No
Dynamic: No
Default: /var/lib/maxscale/
The location of the binary log files. This parameter defines where the module will read binlog files from. Read access to this directory is required.
avrodir
Type: path
Mandatory: No
Dynamic: No
Default: /var/lib/maxscale/
The location where the Avro files are stored. This directory will be used to store the Avro files, plain-text Avro schemas and other files needed by the avrorouter. The user running MariaDB MaxScale will need both read and write access to this directory.
The avrorouter also uses the avrodir to store internal files named avro.index and avro-conversion.ini. By default, the data directory /var/lib/maxscale/ is used. Before version 2.1 of MaxScale, the value of binlogdir was used as the default value for avrodir.
filestem
Type: string
Mandatory: No
Dynamic: No
Default: mysql-bin
The base name of the binlog files. The binlog files are assumed to follow the naming schema <filestem>.<index> where <index> is the zero-padded binlog number and <filestem> is the value of this router option.
For example, with the following parameters:
filestem=mybin
binlogdir=/var/lib/mysql/binlogs/
The first binlog file the avrorouter would look for is /var/lib/mysql/binlogs/mybin.000001.
start_index
Type: number
Mandatory: No
Dynamic: No
Default: 1
The starting index number of the binlog file. The default value is 1. For the binlog mysql-bin.000001 the index would be 1, for mysql-bin.000005 the index would be 5.
If you need to start from a binlog file other than 1, you need to set the value of this option to the correct index. The avrorouter will always start from the beginning of the binary log file.
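The way filestem, binlogdir and start_index combine into the first file the router opens can be sketched as follows; binlog_path is a hypothetical helper, not a MaxScale function:

```python
import os

def binlog_path(binlogdir: str, filestem: str, index: int) -> str:
    """Compose the binlog file name <filestem>.<index> using the
    six-digit zero-padded index used by MariaDB binlog files."""
    return os.path.join(binlogdir, f"{filestem}.{index:06d}")

# With filestem=mybin, binlogdir=/var/lib/mysql/binlogs/ and the
# default start_index of 1:
print(binlog_path("/var/lib/mysql/binlogs", "mybin", 1))
# /var/lib/mysql/binlogs/mybin.000001
```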
cooperative_replication
Type: boolean
Mandatory: No
Dynamic: No
Default: false
Controls whether multiple instances cooperatively replicate from the same cluster. This is a boolean parameter and is disabled by default. It was added in MaxScale 6.0.
When this parameter is enabled and the monitor pointed to by the cluster parameter supports cooperative monitoring (currently only mariadbmon), the replication is only active if the monitor owns the cluster it is monitoring.
With this feature, multiple MaxScale instances can replicate from the same set of servers and only one of them actively processes the replication stream. This allows the avrorouter instances to be made highly-available without having to have them all process the events at the same time.
Whenever an instance that does not own the cluster gains ownership of the cluster, the replication will continue from the latest GTID processed by that instance. This means that if the instance hasn't replicated events that have been purged from the binary logs, the replication cannot continue.
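The ownership hand-off described above can be modeled as a small sketch. AvroInstance is a hypothetical illustration of the behavior, not MaxScale code; each instance persists its own last GTID (as in last_gtid.txt) and only processes the stream while it owns the cluster:

```python
class AvroInstance:
    """Toy model of one cooperating avrorouter instance."""

    def __init__(self, name: str):
        self.name = name
        self.last_gtid = "0-1-0"  # persisted between restarts
        self.owner = False        # set by the cooperative monitor

    def on_event(self, gtid: str) -> bool:
        """Process a replication event only while this instance owns
        the cluster; returns True if the event was processed."""
        if not self.owner:
            return False
        self.last_gtid = gtid
        return True

a, b = AvroInstance("mxs-a"), AvroInstance("mxs-b")
a.owner = True
a.on_event("0-1-1")
# mxs-a loses ownership; mxs-b takes over and resumes from its own
# stored GTID, which only works if those events are still in the binlogs.
a.owner, b.owner = False, True
b.on_event("0-1-2")
print(a.last_gtid, b.last_gtid)  # 0-1-1 0-1-2
```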
Avro File Related Parameters
These options control how large the Avro file data blocks can get. Increasing or lowering the block size could have a positive effect depending on your use case. For more information about the Avro file format and how it organizes data, refer to the Avro documentation.
The avrorouter will flush a block and start a new one when either group_trx transactions or group_rows row events have been processed. Lowering these values allows more frequent updates to stored data, but causes a small increase in file size and search times.
It is highly recommended to keep the block sizes relatively large to allow larger chunks of memory to be flushed to disk at one time. This will make the conversion process noticeably faster.
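The flush decision can be sketched as two counters checked against the group_trx and group_rows limits. This is a hypothetical illustration of the rule described above, not the actual MaxScale implementation:

```python
GROUP_TRX = 1      # default group_trx
GROUP_ROWS = 1000  # default group_rows

trx_count = 0
row_count = 0
blocks_flushed = 0

def on_event(is_commit: bool, rows: int) -> None:
    """Count transactions and row events; flush the current Avro
    block once either limit is reached."""
    global trx_count, row_count, blocks_flushed
    if is_commit:
        trx_count += 1
    row_count += rows
    if trx_count >= GROUP_TRX or row_count >= GROUP_ROWS:
        blocks_flushed += 1        # write out the current block
        trx_count = row_count = 0  # start a new block

on_event(is_commit=False, rows=10)
on_event(is_commit=True, rows=5)
print(blocks_flushed)  # 1
```

With the defaults, every committed transaction closes a block; raising group_trx lets more transactions accumulate before a flush.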
group_trx
Type: number
Mandatory: No
Dynamic: No
Default: 1
Controls the number of transactions that are grouped into a single Avro data block.
group_rows
Type: number
Mandatory: No
Dynamic: No
Default: 1000
Controls the number of row events that are grouped into a single Avro data block.
block_size
Type: size
Mandatory: No
Dynamic: Yes
Default: 16KiB
The Avro data block size in bytes. The default is 16 kilobytes. Increase this value if individual events in the binary logs are very large. The value is a size type parameter which means that it can also be defined with an SI suffix. Refer to the Configuration Guide for more details about size type parameters and how to use them.
Example configuration
[replication-router]
type=service
router=binlogrouter
router_options=server-id=4000,binlogdir=/var/lib/mysql,filestem=binlog
user=maxuser
password=maxpwd
[avro-router]
type=service
router=avrorouter
binlogdir=/var/lib/mysql
filestem=binlog
avrodir=/var/lib/maxscale
Read Module Commands documentation for details about module commands.
The avrorouter supports the following module commands.
avrorouter::convert SERVICE {start | stop}
Start or stop the binary log to Avro conversion. The first parameter is the name of the service and the second parameter tells whether to start the conversion process or to stop it.
avrorouter::purge SERVICE
This command will delete all files created by the avrorouter. This includes all .avsc schema files and .avro data files as well as the internal state tracking files. Use this to completely reset the conversion process.
Note: Once the command has completed, MaxScale must be restarted to restart the conversion process. Issuing a convert start command will not work.
WARNING: You will lose any and all converted data when this command is executed.
The avrorouter creates two files in the location pointed to by avrodir: avro.index and avro-conversion.ini. The avro.index file is used to store the locations of the GTIDs in the .avro files. The avro-conversion.ini file contains the last converted position and GTID in the binlogs. If you need to reset the conversion process, delete these two files and restart MaxScale.
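The reset described above can be sketched as a small helper. reset_conversion_state is a hypothetical illustration, not a MaxScale tool, and MaxScale must be stopped before the files are removed:

```python
import os

def reset_conversion_state(avrodir: str) -> list[str]:
    """Remove the avrorouter state-tracking files so that the binlog
    to Avro conversion restarts from the beginning. Run this only
    while MaxScale is stopped."""
    removed = []
    for name in ("avro.index", "avro-conversion.ini"):
        path = os.path.join(avrodir, name)
        if os.path.exists(path):
            os.remove(path)
            removed.append(name)
    return removed
```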
To reset the binlog conversion process, issue the purge module command via MaxCtrl and stop MaxScale. If manually created schema files were used, they need to be recreated once MaxScale is stopped. After stopping MaxScale and optionally creating the schema files, the conversion process can be started by starting MaxScale.
The safest way to stop the avrorouter when used with the binlogrouter is to follow these steps:
Issue STOP SLAVE on the binlogrouter
Wait for the avrorouter to process all files
Stop MaxScale with systemctl stop maxscale
This guarantees that the conversion process halts at a known good position in the latest binlog file.
The avrorouter comes with an example client program, cdc.py, written in Python 3. This client can connect to a MaxScale configured with the CDC protocol and the avrorouter.
Before using this client, you will need to install the Python 3 interpreter and add users to the service with the cdc_users.py script. For more details about the user creation, please refer to the CDC Protocol and CDC Users documentation.
Read the output of cdc.py --help for a full list of supported options and a short usage description of the client program.
The avrorouter needs to have access to the CREATE TABLE statement for all tables for which there are data events in the binary logs. If the CREATE TABLE statements for the tables aren't present in the current binary logs, the schema files must be created.
In the direct replication mode, avrorouter will automatically create the missing
schema files by connecting to the database and executing a SHOW CREATE TABLE
statement. If a connection cannot be made or the service user lacks the
permission, an error will be logged and the data events for that table will not
be processed.
For the legacy binlog mode, the files must be generated with a schema file generator. There are currently two methods to generate the .avsc schema files.
The cdc_one_schema.py script generates a schema file for a single table by reading a tab-separated list of field and type names from the standard input. This is the recommended schema generation tool as it does not directly communicate with the database, which makes it more flexible.
The only requirement to run the script is that a Python interpreter is installed.
To use this script, pipe the output of the mysql command line into the cdc_one_schema.py script:
mysql -ss -u <user> -p -h <host> -P <port> -e 'DESCRIBE `<database>`.`<table>`'|./cdc_one_schema.py <database> <table>
Replace the <user>, <host>, <port>, <database> and <table> with appropriate values and run the command. Note that the -ss parameter is mandatory as it generates the tab-separated output instead of the default pretty-printed output.
An .avsc file named after the database and table name will be generated in the current working directory. Copy this file to the location pointed to by the avrodir parameter of the avrorouter.
Alternatively, you can also copy the output of the mysql command to a file and feed it into the script if you cannot execute the SQL command directly:
# On the database server
mysql -ss -u <user> -p -h <host> -P <port> -e 'DESCRIBE `<database>`.`<table>`' > schema.tsv
# On the MaxScale server
./cdc_one_schema.py <database> <table> < schema.tsv
If you want to use a specific Python interpreter instead of the one found in the search path, you can modify the first line of the script from #!/usr/bin/env python to #!/path/to/python, where /path/to/python is the absolute path to the Python interpreter (both Python 2 and Python 3 can be used).
usage: cdc_schema.py [--help] [-h HOST] [-P PORT] [-u USER] [-p PASSWORD] DATABASE
The cdc_schema.py executable is installed as a part of MaxScale. This is a Python 3 script that generates Avro schema files from an existing database.
The script will generate the .avsc schema files into the current directory. Run the script for all required databases and copy the generated .avsc files to the directory where the avrorouter stores the .avro files (the value of avrodir).
The cdc_schema.go example Go program is provided with MaxScale. This file can be used to create Avro schemas for the avrorouter by connecting to a database and reading the table definitions. You can find the file in MaxScale's share directory in /usr/share/maxscale/.
You'll need to install the Go compiler and run go get to resolve Go dependencies before you can use the cdc_schema program. After resolving the dependencies, you can run the program with go run cdc_schema.go. The program will create .avsc files in the current directory. These files should be moved to the location pointed to by the avrodir option of the avrorouter if they are to be used by the router itself.
Read the output of go run cdc_schema.go -help for more information on how to run the program.
The Avrorouter Tutorial shows you how the avrorouter works with the Binlog Server to convert binlogs from a master server into easy-to-process Avro data.
Here is a simple configuration example which reads binary logs locally from /var/lib/mysql/ and stores them as Avro files in /var/lib/maxscale/avro/.
The service has one listener listening on port 4001 for CDC protocol clients.
[avro-converter]
type=service
router=avrorouter
user=myuser
password=mypasswd
router_options=binlogdir=/var/lib/mysql/,
filestem=binlog,
avrodir=/var/lib/maxscale/avro/
[avro-listener]
type=listener
service=avro-converter
protocol=CDC
port=4001
Here is an example of how you can query data in JSON format using the cdc.py Python script. It queries the table test.mytable for all change records.
cdc.py --user=myuser --password=mypasswd --host=127.0.0.1 --port=4001 test.mytable
You can then combine it with the cdc_kafka_producer.py to publish these change records to a Kafka broker.
cdc.py --user=myuser --password=mypasswd --host=127.0.0.1 --port=4001 test.mytable |
cdc_kafka_producer.py --kafka-broker 127.0.0.1:9092 --kafka-topic test.mytable
For more information on how to use these scripts, see the output of cdc.py -h and cdc_kafka_producer.py -h.
To build the avrorouter from source, you will need the Avro C library, liblzma, the Jansson library and the sqlite3 development headers. When configuring MaxScale with CMake, you will need to add -DBUILD_CDC=Y to build the CDC module set.
The Avro C library needs to be built with position independent code enabled. You can do this by adding the following flags to the CMake invocation when configuring the Avro C library.
-DCMAKE_C_FLAGS=-fPIC -DCMAKE_CXX_FLAGS=-fPIC
For more details about building MaxScale from source, please refer to the Building MaxScale from Source Code document.
The router_diagnostics output for an avrorouter service contains the following fields.
infofile: File where the avrorouter stores the conversion process state.
avrodir: Directory where Avro files are stored.
binlogdir: Directory where binlog files are read from.
binlog_name: Current binlog name.
binlog_pos: Current binlog position.
gtid: Current GTID.
gtid_timestamp: Current GTID timestamp.
gtid_event_number: Current GTID event number.
The avrorouter does not support the following data types, conversions or SQL statements:
BIT
Fields CAST from integer types to string types
The avrorouter does not do any crash recovery. This means that the avro files need to be removed or truncated to valid block lengths before starting the avrorouter.
This page is licensed: CC BY-SA / Gnu FDL
NOTE This is the documentation for the binlogrouter in MaxScale 2.4 and is only provided for reference. The documentation for the binlogrouter in MaxScale 2.5 is provided here.
The binlogrouter is a replication protocol proxy module for MariaDB MaxScale. This module allows MariaDB MaxScale to connect to a master server and retrieve binary logs while slave servers can connect to MariaDB MaxScale like they would connect to a normal master server. If the master server goes down, the slave servers can still connect to MariaDB MaxScale and read binary logs. You can switch to a new master server without the slaves noticing that the actual master server has changed. This allows for a more highly available replication setup.
The binlogrouter requires the user and password parameters. These should be configured according to the Configuration Guide. In addition to these two parameters, the server_id and binlogdir parameters need to be defined.
The binlogrouter accepts the following parameters.
Note: Earlier versions of MaxScale supported the configuration of the binlogrouter only via router_options (a comma-separated list of key-value pairs). As of MaxScale 2.1, all of the router options should be defined as parameters. The values defined in router_options will have priority over the parameters to support legacy configurations. The use of router_options is deprecated.
binlogdir
This parameter controls the location where MariaDB MaxScale stores the binary log files. This is a mandatory parameter.
The binlogdir also contains the cache subdirectory which stores data
retrieved from the master during the slave registration phase. The
master.ini file also resides in the binlogdir. This file keeps track of
the current master configuration and it is updated when a CHANGE MASTER TO
query is executed.
From 2.1 onwards, the cache directory is stored in the same location as other user credential caches. This means that with the default options, the user credential cache is stored in /var/cache/maxscale/<Service Name>/<Listener Name>/cache/.
Read the MySQL Authenticator documentation for instructions on how to define a custom location for the user cache.
server_id
MariaDB MaxScale must have a unique server_id. This parameter configures the value of the server_id that MariaDB MaxScale will use when connecting to the master. This is a mandatory parameter.
Older versions of MaxScale allowed the ID to be specified using server-id. This has been deprecated and will be removed in a future release of MariaDB MaxScale.
master_id
The server_id value that MariaDB MaxScale should use to report to the slaves that connect to MariaDB MaxScale.
This may either be the same as the server id of the real master or can be chosen to be different if the slaves need to be aware of the proxy layer. The real master server ID will be used if the option is not set.
Older versions of MaxScale allowed the ID to be specified using master-id. This has been deprecated and will be removed in a future release of MariaDB MaxScale.
uuid
This is used to set the unique UUID that the binlog router uses when it connects to the master server. By default the UUID will be generated.
master_uuid
It is a requirement of replication that each server has a unique UUID value. If this option is not set, binlogrouter will identify itself to the slaves using the UUID of the real master.
master_version
By default, the router will identify itself to the slaves using the server version of the real master. This option allows the router to use a custom version string.
master_hostname
By default, the router will identify itself to the slaves using the hostname of the real master. This option allows the router to use a custom hostname.
slave_hostname
Since MaxScale 2.1.6 the router can optionally identify itself to the master using a custom hostname. The specified hostname can be seen in the master via the SHOW SLAVE HOSTS command. The default is not to send any hostname string during registration.
user
Note: This option can only be given in the router_options parameter. Use the user parameter of the service instead.
This is the user name that MariaDB MaxScale uses when it connects to the master. This user name must have the rights required for replication as with any other user that a slave uses for replication purposes. If the user parameter is not given in the router options then the same user as is used to retrieve the credential information will be used for the replication connection, i.e. the user in the service entry.
This user is the only one available for MySQL connection to MaxScale Binlog Server for administration when master connection is not done yet.
In MaxScale 2.1, the service user injection is done by the MySQLAuth authenticator module. Read theMySQL Authenticator documentation for more details.
The user that is used for replication must be granted replication privileges on the database server.
CREATE USER 'repl'@'maxscalehost' IDENTIFIED by 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'maxscalehost';
password
Note: This option can only be given in the router_options parameter. Use the password parameter of the service instead.
The password for the user. If the password is not explicitly given then the password in the service entry will be used. For compatibility with other username and password definitions within the MariaDB MaxScale configuration file, it is also possible to use the parameter passwd.
heartbeat
This defines the value of the heartbeat interval for the connection to the master. The duration can be specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the parameter is seconds, a value specified in milliseconds will be rejected, even if the duration is longer than a second. The default value for the heartbeat period is every 5 minutes.
MariaDB MaxScale requests the master to ensure that a binlog event is sent at least every heartbeat period. If there are no real binlog events to send, the master will send a special heartbeat event. The current interval value is reported in the diagnostic output.
burstsize
This parameter is used to define the maximum amount of data that will be sent to a slave by MariaDB MaxScale when that slave is lagging behind the master. The default value is 1M.
The burst size can be provided as specified here, except that IEC binary prefixes can be used as suffixes only from MaxScale 2.1 onwards. MaxScale 2.0 and earlier only support burstsize defined in bytes.
In this situation the slave is said to be in "catchup mode". This parameter is designed both to prevent flooding of that slave and to prevent threads within MariaDB MaxScale from spending disproportionate amounts of time on slaves that are lagging behind the master.
mariadb10-compatibility
This parameter allows the binlogrouter to replicate from a MariaDB 10.0 master server. It has been enabled by default since MaxScale 2.2.0; in earlier versions it was disabled by default.
# Example
mariadb10-compatibility=1
Additionally, since MaxScale 2.2.1, MariaDB 10.x slave servers can connect to the binlog server using a GTID value instead of binlog name and position.
Example of a MariaDB 10.x slave connection to MaxScale
MariaDB> SET @@global.gtid_slave_pos='0-10122-230';
MariaDB> CHANGE MASTER TO
MASTER_HOST='192.168.10.8',
MASTER_PORT=5306,
MASTER_USE_GTID=Slave_pos;
MariaDB> START SLAVE;
Note:
Slave servers can connect either with file and pos or with GTID.
MaxScale saves all the incoming MariaDB GTIDs (DDLs and DMLs) in a sqlite3 database located in binlogdir (gtid_maps.db).
When a slave server connects with a GTID request, a lookup is made for the matching value and the following binlog events will be sent.
transaction_safety
This parameter is used to enable or disable incomplete transaction detection in the binlog router. The default value is off.
When MariaDB MaxScale starts, an error message may appear if the current binlog file is corrupted or an incomplete transaction is found. During normal operations binlog events are not distributed to the slaves until a COMMIT is seen. Set transaction_safety=on to enable detection of incomplete transactions.
send_slave_heartbeat
This defines whether MariaDB MaxScale sends the heartbeat packet to the slave when there are no real binlog events to send. This parameter takes a boolean value and the default value is false. This means that no heartbeat events are sent to slave servers.
If the value is set to true, the interval value (requested by the slave during registration) is reported in the diagnostic output and the packet is sent after the time interval if there is no event to send.
semisync
This parameter controls whether binlog server could ask Master server to start the Semi-Synchronous replication. This parameter takes a boolean value and the default value is false.
In order to get semi-sync working, the master server must have the rpl_semi_sync_master plugin installed. The availability of the plugin and the value of the global variable rpl_semi_sync_master_enabled are checked in the master registration phase: if the plugin is installed in the master database, the binlog server subsequently requests the semi-sync option.
Note:
the network replication stream from Master has two additional bytes before each binlog event.
the Semi-Sync protocol requires an acknowledge packet to be sent back to Master only when requested: the semi-sync flag will have value of 1. This flag is set only if rpl_semi_sync_master_enabled=1 is set in the Master, otherwise it will always have value of 0 and no ack packet is sent back.
Please note that semi-sync replication is only related to binlog server to Master communication.
ssl_cert_verification_depth
This parameter sets the maximum length of the certificate authority chain that will be accepted. Legal values are positive integers. This applies to the SSL connection to the master server, which can be activated either by writing options in master.ini or later via a CHANGE MASTER TO command. This parameter cannot be modified at runtime. The default verification depth is 9.
encrypt_binlog
Whether to encrypt binlog files: the default is off.
When set to on the binlog files will be encrypted using specified AES algorithm and the KEY in the specified key file.
Note: binlog encryption must be used while replicating from a MariaDB 10.1 server and serving data to MariaDB 10.x slaves. In order to use binlog encryption, the MariaDB 10.1 master server must have encryption active (encrypt-binlog=1 in my.cnf). This is required because both the master and MaxScale must store encrypted data for secure data-at-rest to work. Additionally, as long as the master server doesn't send the StartEncryption event (which contains the encryption setup information for the binlog file), there is a position gap between the end of the FormatDescription event and the start of the next event. The StartEncryption event size is 36 or 40 bytes (depending on whether CRC32 is used), so the gap has that size.
The MaxScale binlog server adds its own StartEncryption event to the binlog files; consequently the binlog event positions in the binlog file are the same as in the master binlog file and there is no position mismatch.
encryption_algorithm
The encryption algorithm, either 'aes_ctr' or 'aes_cbc'. The default is 'aes_cbc'
encryption_key_file
The specified key file must contain lines with the following format:
id;HEX(KEY)
Id is the scheme identifier, which must have the value 1 for binlog encryption, the ';' is a separator and HEX(KEY) contains the hex representation of the KEY. The KEY must be exactly 16, 24 or 32 bytes in size and the selected algorithm (aes_ctr or aes_cbc) will be used with a 128, 192 or 256 bit cipher accordingly.
Note: the key file has the same format as the MariaDB 10.1 server, so it's possible to use an existing (unencrypted) key file which could contain several scheme;key values: only the key with id 1 will be parsed, and if it is not found an error will be reported.
Example key file with multiple keys:
#
# This is the Encryption Key File
# key id 1 is for binlog files encryption: it's mandatory
# The keys come from a 32bytes value, 64 bytes with HEX format
#
2;abcdef1234567890abcdef12345678901234567890abcdefabcdef1234567890
1;5132bbabcde33ffffff12345ffffaaabbbbbbaacccddeee11299000111992aaa
3;bbbbbbbbbaaaaaaabbbbbccccceeeddddd3333333ddddaaaaffffffeeeeecccd
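Parsing such a key file can be sketched as follows. read_binlog_key is a hypothetical illustration of the format rules above (scheme id 1, hex key of 16, 24 or 32 bytes), not a MaxScale utility:

```python
def read_binlog_key(path: str) -> bytes:
    """Parse a MariaDB-format encryption key file and return the key
    with scheme id 1, the one used for binlog encryption."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            scheme, hex_key = line.split(";", 1)
            if scheme.strip() != "1":
                continue  # only key id 1 is relevant
            key = bytes.fromhex(hex_key.strip())
            if len(key) not in (16, 24, 32):
                raise ValueError("key must be 16, 24 or 32 bytes")
            return key
    raise ValueError("no key with id 1 found in key file")
```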
mariadb10_master_gtid
This option allows the MaxScale binlog router to register with a MariaDB 10.X master using GTID instead of binlog file name and position in the CHANGE MASTER TO admin command. This feature is disabled by default.
The user can set a known GTID or an empty value (in this case the master server will send events from its first available binlog file).
Example of MaxScale connection to a MariaDB 10.X Master
# mysql -h $MAXSCALE_HOST -P $MAXCALE_PORT
MariaDB> SET @@global.gtid_slave_pos='0-198-123';
MariaDB> CHANGE MASTER TO
MASTER_HOST='192.168.10.5',
MASTER_PORT=3306,
MASTER_USE_GTID=Slave_pos;
MariaDB> START SLAVE;
If using a GTID request then it's no longer possible to use MASTER_LOG_FILE and MASTER_LOG_POS in the CHANGE MASTER TO command: an error will be reported.
If this feature is enabled, the transaction_safety option will be automatically enabled. The binlog files will also be stored in a hierarchical directory tree instead of a single directory.
Note:
When the option is On, connecting slaves can only use GTID requests: specifying file and pos will result in an error sent by MaxScale and replication cannot start.
A GTID request can cause events to be written at any position of the binlog file, whose name the master server sends before any event. In order to avoid holes in the binlog files, MaxScale fills all gaps in the binlog files with ignorable events.
It's not possible to specify the GTID domain_id: the master's domain is used for all operations. All slave servers must use the same replication domain as the master server.
master_retry_count
This option sets the maximum number of connection retries when the master server is disconnected or not reachable. Default value is 1000.
connect_retry
The option sets the time interval for a new connection retry to the master server. The duration can be specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the parameter is seconds, a value specified in milliseconds will be rejected, even if the duration is longer than a second. The default value is 60 seconds.
A complete example of a service entry for a binlog router service would be as follows.
[Replication]
type=service
router=binlogrouter
user=maxscale
password=maxpwd
server_id=3
binlogdir=/var/lib/maxscale/
mariadb10-compatibility=1
encrypt_binlog=1
encryption_algorithm=aes_ctr
encryption_key_file=/var/binlogs/enc_key.txt
From MaxScale 2.3 onwards it is possible to specify secondary masters that the binlog router can use in case the connection to the default master fails.
Note: This is only supported in a Galera Cluster environment in which:
Wsrep GTID mode is enabled in the cluster.
All of the requirements for wsrep GTID mode are met by the cluster.
Wsrep GTID mode is also imperfect, so this secondary master functionality is only guaranteed to work if GTIDs have not become inconsistent within the cluster.
See Wsrep GTID Mode for more information.
The initial setup is performed exactly as when there is only one default master.
# mysql -h $MAXSCALE_HOST -P $MAXSCALE_PORT
MariaDB> SET @@global.gtid_slave_pos='0-198-123';
MariaDB> CHANGE MASTER TO
MASTER_HOST='192.168.10.5',
MASTER_PORT=3306,
MASTER_USER='repl',
MASTER_PASSWORD='repl',
MASTER_USE_GTID=Slave_pos;
After the setup of the default master, secondary masters can be configured as follows:
MariaDB> CHANGE MASTER ':2' TO
MASTER_HOST='192.168.10.6',
MASTER_PORT=3306,
MASTER_USER='repl',
MASTER_PASSWORD='repl',
MASTER_USE_GTID=Slave_pos;
That is, a connection name must be provided and the name must be of the format :N, where N is a positive integer. If several secondary masters are specified, they must be numbered consecutively, starting from 2.
All settings that are not explicitly specified are copied from the default master. That is, the following is equivalent to the command above:
MariaDB> CHANGE MASTER ':2' TO MASTER_HOST='192.168.10.6';
If a particular master configuration already exists, then any specified definitions will be changed and unspecified ones will remain unchanged. For instance, the following command would only change the password of :2.
MariaDB> CHANGE MASTER ':2' TO MASTER_PASSWORD='repl2';
It is not possible to delete a particular secondary master, but ifMASTER_HOST
is set on the default master, even if it is set to the same
value, then all secondary master configurations are deleted.
When START SLAVE is issued, MaxScale will first attempt to connect to the default master and, if that fails, try the secondary masters in order until a connection can be created. Only if all connection attempts fail will MaxScale wait for the time specified with connect_retry before starting the cycle over again.
Once the binlog router has successfully connected to a server, it will stay connected to that server until the connection breaks or STOP SLAVE is issued.
The configurations of the secondary masters are also stored in the master.ini file, in sections whose names include the connection name.
[binlog_configuration]
master_host=192.168.121.150
...
[binlog_configuration:2]
master_host=192.168.121.148
...
[binlog_configuration:3]
master_host=192.168.121.76
...
This page is licensed: CC BY-SA / Gnu FDL
NOTE: The binlog router delivered with 2.5 is completely new and is not 100% backward compatible with the binlog router delivered with earlier versions of MaxScale.
The binlogrouter is a router that acts as a replication proxy for MariaDB master-slave replication. The router connects to a master, retrieves the binary logs and stores them locally. Slave servers can connect to MaxScale like they would connect to a normal master server. If the master server goes down, replication between MaxScale and the slaves can still continue up to the latest point to which the binlogrouter replicated. The master can be changed without disconnecting the slaves and without them noticing that the master server has changed. This allows for a more highly available replication setup.
In addition to the high availability benefits, the binlogrouter creates only one connection to the master whereas with normal replication each individual slave will create a separate connection. This reduces the amount of work the master database has to do which can be significant if there are a large number of replicating slaves.
The binlogrouter in MaxScale 2.5.0 is a new and improved version of the original binlogrouter found in older MaxScale versions. The new implementation contains most of the features that were in the original binlogrouter but some of them were removed as they were either redundant or not useful.
The major differences between the new and old binlog router are:
The list of servers from which the database users for authentication are loaded must be explicitly configured with the cluster, servers or targets parameter. Alternatively, the users can be read from a file. See user_accounts_file for more information.
The old binlog router had both server_id and master_id; the new one has only server_id.
No need to configure heartbeat and burst interval anymore as they are now automatically configured.
Traditional replication that uses the binary log name and file offset to start the replication process is not supported.
Semi-sync support is not implemented.
Binlog encryption is not implemented.
Secondary masters are not supported, but the functionality provided by select_master is roughly equivalent.
The new binlogrouter will write its own binlog files to prevent problems that could happen when the master changes. This causes the binlog names to be different in the binlogrouter when compared to the ones on the master.
The documentation for the binlogrouter in MaxScale 2.4 is provided for reference here.
The binlogrouter supports a subset of the SQL constructs that the MariaDB server supports. The following commands are supported:
CHANGE MASTER TO
The binlogrouter supports the same syntax as the MariaDB server but only the following values are allowed:
MASTER_HOST
MASTER_PORT
MASTER_USER
MASTER_PASSWORD
MASTER_USE_GTID
MASTER_SSL
MASTER_SSL_CA
MASTER_SSL_CAPATH
MASTER_SSL_CERT
MASTER_SSL_CRL
MASTER_SSL_CRLPATH
MASTER_SSL_KEY
MASTER_SSL_CIPHER
MASTER_SSL_VERIFY_SERVER_CERT
NOTE: MASTER_LOG_FILE and MASTER_LOG_POS are not supported as the binlogrouter only supports GTID based replication.
STOP SLAVE
Stops replication, same as MariaDB.
START SLAVE
Starts replication, same as MariaDB.
RESET SLAVE
Resets replication. Note that the RESET SLAVE ALL form that is supported by MariaDB isn't supported by the binlogrouter.
SHOW BINARY LOGS
Lists the current files and their sizes. These will be different from the ones listed by the original master where the binlogrouter is replicating from.
PURGE { BINARY | MASTER } LOGS TO <filename>
Purges binary logs up to but not including the given file. The file name must be one of the names shown in SHOW BINARY LOGS. The version of this command which accepts a timestamp is not currently supported.
Automatic purging is supported using the configuration parameter expire_log_duration. The files are purged in the order they were created. If a file to be purged is detected to be in use, the purge stops. This means that the purge will stop at the oldest file that a slave is still reading.
NOTE: You should still take precautions not to purge files that a potential slave will need in the future. MaxScale can only detect that a file is in active use when a slave is connected and requesting events from it.
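As a sketch, a manual purge session could look like the following (the file name is illustrative; use a name returned by SHOW BINARY LOGS):

```sql
-- List the files currently known to the binlogrouter
SHOW BINARY LOGS;
-- Remove all files older than the named file;
-- the named file itself is kept
PURGE BINARY LOGS TO 'binlog.000042';
```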
SHOW MASTER STATUS
Shows the name and position of the file to which the binlogrouter will write the next replicated data. The name and position do not correspond to the name and position in the master.
SHOW SLAVE STATUS
Shows the slave status information similar to what a normal MariaDB slave server shows. Some of the values are replaced with constant values that never change. The following values are not constant:
Slave_IO_State: Set to Waiting for master to send event when replication is ongoing.
Master_Host: Address of the current master.
Master_User: The user used to replicate.
Master_Port: The port the master is listening on.
Master_Log_File: The name of the latest file that the binlogrouter is writing to.
Read_Master_Log_Pos: The current position where the last event was written in the latest binlog.
Slave_IO_Running: Set to Yes if replication is running and No if it is not.
Slave_SQL_Running: Set to Yes if replication is running and No if it is not.
Exec_Master_Log_Pos: Same as Read_Master_Log_Pos.
Gtid_IO_Pos: The latest replicated GTID.
SELECT { Field } ...
The binlogrouter implements a small subset of the MariaDB SELECT syntax, as it is mainly used by the replicating slaves to query various parameters. If a field queried by a client is not known to the binlogrouter, the value will be returned back as-is. The following functions and variables are understood by the binlogrouter and are replaced with actual values:
@@gtid_slave_pos, @@gtid_current_pos or @@gtid_binlog_pos: All of these return the latest GTID replicated from the master.
version() or @@version: The version string returned by MaxScale when a client connects to it.
UNIX_TIMESTAMP(): The current timestamp.
@@version_comment: Always pinloki.
@@global.gtid_domain_id: Always 0.
@master_binlog_checksum: Always CRC32.
@@session.auto_increment_increment: Always 1
@@character_set_client: Always utf8
@@character_set_connection: Always utf8
@@character_set_results: Always utf8
@@character_set_server: Always utf8mb4
@@collation_server: Always utf8mb4_general_ci
@@collation_connection: Always utf8_general_ci
@@init_connect: Always an empty string
@@interactive_timeout: Always 28800
@@license: Always BSL
@@lower_case_table_names: Always 0
@@max_allowed_packet: Always 16777216
@@net_write_timeout: Always 60
@@performance_schema: Always 0
@@query_cache_size: Always 1048576
@@query_cache_type: Always OFF
@@sql_mode: Always an empty string
@@system_time_zone: Always UTC
@@time_zone: Always SYSTEM
@@tx_isolation: Always REPEATABLE-READ
@@wait_timeout: Always 28800
SET
@@global.gtid_slave_pos: Set the position from which the binlogrouter should start replicating. E.g. SET @@global.gtid_slave_pos="0-1000-1234,1-1001-5678"
SHOW VARIABLES LIKE '...'
Shows variables matching a string. The LIKE operator in SHOW VARIABLES is mandatory for the binlogrouter. This means that a plain SHOW VARIABLES is not currently supported. In addition, the LIKE operator in the binlogrouter only supports exact matches.
Currently the only variables that are returned are gtid_slave_pos, gtid_current_pos and gtid_binlog_pos, which return the current GTID coordinates of the binlogrouter. In addition to these, the server_id variable will return the configured server ID of the binlogrouter.
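For example, the following queries illustrate the exact-match behaviour (a sketch; the returned GTID values depend on the replication state):

```sql
-- Exact match: returns the current GTID coordinates
SHOW VARIABLES LIKE 'gtid_binlog_pos';
-- Wildcard patterns are not supported, so this returns an empty result
SHOW VARIABLES LIKE 'gtid%';
```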
The binlogrouter is configured similarly to normal routers in MaxScale. It requires at least one listener to which clients can connect and one server from which the database user information can be retrieved. An example configuration can be found in the example section of this document.
datadir
Type: path
Mandatory: No
Dynamic: No
Default: /var/lib/maxscale/binlogs
Directory where binary log files are stored.
server_id
Type: number
Mandatory: No
Dynamic: No
Default: 1234
The server ID that MaxScale uses when connecting to the master and when serving binary logs to the slaves.
net_timeout
Type: duration
Mandatory: No
Dynamic: No
Default: 10s
Network connection and read timeout in seconds for the connection to the master.
select_master
Type: boolean
Mandatory: No
Dynamic: No
Default: false
Automatically select the master server to replicate from.
When this feature is enabled, the master that the binlogrouter replicates from is selected from the servers defined by a monitor, e.g. cluster=TheMonitor. Alternatively, the servers can be listed in servers. The servers should be monitored by a monitor. Only servers with the Master status are used. If multiple master servers are available, the first available master server will be used.
If a CHANGE MASTER TO command is received while select_master is on, the command will be honored and select_master turned off until the next reboot. This allows the monitor to perform failover and, more importantly, switchover. It also allows the user to manually redirect the binlogrouter. The current master is "sticky", meaning that the same master will be chosen on reboot.
NOTE: Do not use the mariadbmon parameter auto_rejoin if the monitor is monitoring a binlogrouter. The binlogrouter does not support all the SQL commands that the monitor will send, so the rejoin will fail. This restriction will be lifted in a future version.
The GTID that the replication will start from is based on the latest replicated GTID. If no GTID has been replicated, the router will start replication from the start. Manual configuration of the GTID can be done by first configuring the replication manually with CHANGE MASTER TO.
expire_log_duration
Type: duration
Mandatory: No
Dynamic: No
Default: 0s
Duration after which a binary log file can be automatically removed. The duration is measured from the last modification of the log file. Files are purged in the order they were created. The automatic purge works in a similar manner to PURGE BINARY LOGS TO <filename> in that it will stop the purge if an eligible file is in active use, i.e. being read by a slave. The duration can be specified as explained here.
expire_log_minimum_files
Type: number
Mandatory: No
Dynamic: No
Default: 2
The minimum number of log files the automatic purge keeps. At least one file is always kept.
Configure and start MaxScale.
If you have not configured select_master=true (automatic master selection), issue a CHANGE MASTER TO command to the binlogrouter.
mysql -u USER -pPASSWORD -h maxscale-IP -P binlog-PORT
CHANGE MASTER TO master_host="master-IP", master_port=master-PORT, master_user=USER, master_password="PASSWORD", master_use_gtid=slave_pos;
START SLAVE;
Redirect each slave to replicate from Binlogrouter
mysql -u USER -pPASSWORD -h slave-IP -P slave-PORT
STOP SLAVE;
CHANGE MASTER TO master_host="maxscale-IP", master_port=binlog-PORT,
master_user="USER", master_password="PASSWORD", master_use_gtid=slave_pos;
START SLAVE;
SHOW SLAVE STATUS \G
Binlogrouter does not read any of the data that a version prior to 2.5 has saved. By default binlogrouter will request the replication stream from the blank state (from the start of time), which is basically meant for new systems. If a system is live, the entire replication data probably does not exist, and if it does, it is not necessary for binlogrouter to read and store all the data.
Note that binlogrouter only supports GTID based replication.
Make sure that the configured data directory for the new binlogrouter is different from the old one, or move old data away. See datadir.
If the master contains binlogs from the blank state, and there is a large amount of data, consider purging old binlogs. See Using and Maintaining the Binary Log.
The method described here incurs the least downtime. Assuming you have configured version 2.5 and it is ready to go:
Redirect each slave that replicates from Binlogrouter to replicate from the master.
mysql -u USER -pPASSWORD -h slave-IP -P slave-PORT
STOP SLAVE;
CHANGE MASTER TO master_host="master-IP", master_port=master-PORT,
master_user="USER", master_password="PASSWORD", master_use_gtid=slave_pos;
START SLAVE;
SHOW SLAVE STATUS \G
Stop the old version of MaxScale, and start the new one. Verify routing functionality.
Issue a CHANGE MASTER TO command, or use select_master.
mysql -u USER -pPASSWORD -h maxscale-IP -P binlog-PORT
CHANGE MASTER TO master_host="master-IP", master_port=master-PORT,
master_user=USER,master_password="PASSWORD", master_use_gtid=slave_pos;
Run maxctrl list servers and make sure all your servers are accounted for. Pick the lowest GTID state (e.g. 0-1000-1234,1-1001-5678) on display and issue this command to the binlogrouter:
STOP SLAVE;
SET @@global.gtid_slave_pos = "0-1000-1234,1-1001-5678";
START SLAVE;
NOTE: Even with select_master=true you have to set @@global.gtid_slave_pos if any binlog files have been purged on the master. The server will only stream from the start of time if the first binlog file is present.
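Picking the lowest GTID state across servers can be sketched as follows. This is an illustrative helper, not part of MaxScale, and it assumes GTID positions in the domain-server_id-sequence format shown above:

```python
def lowest_gtid(gtid_lists):
    """Given GTID position strings from several servers, return a
    combined position using the lowest sequence number per domain."""
    lowest = {}  # domain -> (sequence, full "domain-server_id-sequence" triplet)
    for gtid_list in gtid_lists:
        for triplet in gtid_list.split(","):
            domain, server_id, sequence = triplet.strip().split("-")
            seq = int(sequence)
            if domain not in lowest or seq < lowest[domain][0]:
                lowest[domain] = (seq, f"{domain}-{server_id}-{seq}")
    return ",".join(pos for _, pos in (lowest[d] for d in sorted(lowest)))

# Example: two servers reporting slightly different positions
print(lowest_gtid(["0-1000-1234,1-1001-5678", "0-1000-1230,1-1001-5680"]))
# prints 0-1000-1230,1-1001-5678
```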
See select_master.
Redirect each slave to replicate from Binlogrouter.
mysql -u USER -pPASSWORD -h slave-IP -P slave-PORT
STOP SLAVE;
CHANGE MASTER TO master_host="maxscale-IP", master_port=binlog-PORT,
master_user="USER", master_password="PASSWORD",
master_use_gtid=slave_pos;
START SLAVE;
SHOW SLAVE STATUS \G
When replicating from a Galera cluster, select_master must be set to true and the servers must be monitored by the Galera Monitor. Configuring the binlogrouter is otherwise the same as described above.
The Galera cluster must be configured to use Wsrep GTID Mode.
The MariaDB version must be 10.5.1 or higher. The required GTID related server settings for MariaDB/Galera to work with Binlogrouter are listed here:
[mariadb]
log_slave_updates = ON
log_bin = pinloki # binlog file base name. Must be the same on all servers
gtid_domain_id = 1001 # Must be different for each galera server
binlog_format = ROW
[galera]
wsrep_on = ON
wsrep_gtid_mode = ON
wsrep_gtid_domain_id = 42 # Must be the same for all servers
The following is a small configuration file with automatic master selection. With it, the service will accept connections on port 3306.
[server1]
type=server
address=192.168.0.1
port=3306
[server2]
type=server
address=192.168.0.2
port=3306
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=server1, server2
user=maxuser
password=maxpwd
monitor_interval=10s
[Replication-Proxy]
type=service
router=binlogrouter
cluster=MariaDB-Monitor
select_master=true
expire_log_duration=5h
expire_log_minimum_files=3
user=maxuser
password=maxpwd
[Replication-Listener]
type=listener
service=Replication-Proxy
port=3306
Old-style replication with binlog name and file offset is not supported and the replication must be started by setting up the GTID to replicate from.
Only replication from MariaDB servers (including Galera) is supported.
The MariaDB server that the replication is done from must be configured with binlog_checksum=CRC32.
This page is licensed: CC BY-SA / Gnu FDL
The cat router is a special router that concatenates result sets.
Note: This module is experimental and must be built from source. The module is deprecated in MaxScale 23.08 and might be removed in a future release.
The router has no special parameters. To use it, define a service with router=cat and add the servers you want to use.
The order in which the servers are defined is the order in which they are queried. This means that the results are ordered based on the servers parameter of the service. The result is only complete once all servers have executed the query.
All commands executed via this router will be executed on all servers. This means that an INSERT through the cat router will be sent to all servers. For commands that do not return result sets, the response of the last server is sent to the client. This means that if one of the earlier servers returns a different result, the client will not see it.
As the intended use-case of the router is mainly to reduce multiple result sets into one, it has no mechanisms to prevent writes from being executed on slave servers (which would cause data corruption or replication failure). Take great care when performing administrative operations through this router.
If a connection to one of the servers is lost, the client connection will also be closed.
Here is a simple example service definition that uses the servers from the Configuring Servers tutorial and the credentials from the MaxScale Tutorial.
[concat-service]
type=service
router=cat
servers=dbserv1,dbserv2,dbserv3
user=maxscale
password=maxscale_pw
This page is licensed: CC BY-SA / Gnu FDL
HintRouter was introduced in 2.2 and is still beta.
The HintRouter module is a simple router intended to operate in conjunction with the NamedServerFilter. The router looks at the hints embedded in a packet buffer and attempts to route the packet according to the hint. The user can also set a default action to be taken when a query has no hints or when the hints could not be applied.
If a packet has multiple hints attached, the router will read them in order and attempt routing. Any successful routing ends the process and any further hints are ignored for the packet.
The HintRouter is a rather simple router and only accepts a few configuration settings.
default_action=<master|slave|named|all>
This setting defines what happens when a query has no routing hint or when applying the routing hint(s) fails. If the default action also fails, the routing ends in error and the session closes. The different values are:
master: Route to the master server.
slave: Route to any single slave server.
named: Route to a named server. The name is given in the default_server setting.
all: Default value. Route to all connected servers.
Note that setting the default action to anything other than all means that session variable write commands are by default not routed to all backends.
default_server=<server-name>
Defines the default backend name when default_action=named. <server-name> must be a valid backend name.
max_slaves=<limit>
<limit> should be an integer, -1 by default. Defines how many backend slave servers a session should attempt to connect to. Having fewer slaves defined in the service and/or fewer successful connections during session creation is not an error. The router will attempt to distribute slaves evenly between sessions by assigning them in round-robin fashion. The session will always try to connect to a master regardless of this setting, although not finding one is not an error.
Negative values activate default mode, in which case this value is set to the number of backends in the service - 1, so that the sessions are connected to all slaves.
If the hints or the default_action point to a named server, this setting is probably best left at its default to ensure that the specific server is connected to at session creation. The router will not attempt to connect to additional servers after session creation.
A minimal configuration doesn't require any parameters as all settings have reasonable defaults.
[Routing-Service]
type=service
router=hintrouter
servers=slave1,slave2,slave3
If packets should be routed to the master server by default and only a few connections are required, the configuration might be as follows.
[Routing-Service]
type=service
router=hintrouter
servers=MyMaster, slave1,slave2,slave3,slave4,slave5,slave6,slave7
default_action=master
max_slaves=2
This page is licensed: CC BY-SA / Gnu FDL
The KafkaCDC module reads data changes in MariaDB via replication and converts them into JSON objects that are then streamed to a Kafka broker.
DDL events (CREATE TABLE, ALTER TABLE) are streamed as JSON objects in the following format (example created by CREATE TABLE test.t1(id INT)):
{
"namespace": "MaxScaleChangeDataSchema.avro",
"type": "record",
"name": "ChangeRecord",
"table": "t1", // name of the table
"database": "test", // the database the table is in
"version": 1, // schema version, incremented when the table format changes
"gtid": "0-3000-14", // GTID that created the current version of the table
"fields": [
{
"name": "domain", // First part of the GTID
"type": "int"
},
{
"name": "server_id", // Second part of the GTID
"type": "int"
},
{
"name": "sequence", // Third part of the GTID
"type": "int"
},
{
"name": "event_number", // Sequence number of the event inside the GTID
"type": "int"
},
{
"name": "timestamp", // UNIX timestamp when the event was created
"type": "int"
},
{
"name": "event_type", // Event type
"type": {
"type": "enum",
"name": "EVENT_TYPES",
"symbols": [
"insert", // The row that was inserted
"update_before", // The row before it was updated
"update_after", // The row after it was updated
"delete" // The row that was deleted
]
}
},
{
"name": "id", // Field name
"type": [
"null",
"long"
],
"real_type": "int", // Field type
"length": -1, // Field length, if found
"unsigned": false // Whether the field is unsigned
}
]
}
The domain, server_id and sequence fields contain the GTID that this event belongs to. The event_number field is the sequence number of events inside the transaction, starting from 1. The timestamp field is the UNIX timestamp when the event occurred. The event_type field contains the type of the event, one of:
insert: the event is the data that was added to MariaDB
delete: the event is the data that was removed from MariaDB
update_before: the event contains the data before an update statement modified it
update_after: the event contains the data after an update statement modified it
All remaining fields contain data from the table. In the example event this would be the field id.
DML events (INSERT, UPDATE, DELETE) are streamed as JSON objects that follow the format specified in the DDL event. The objects are in the following format (example created by INSERT INTO test.t1 VALUES (1)):
{
"domain": 0,
"server_id": 3000,
"sequence": 20,
"event_number": 1,
"timestamp": 1580485945,
"event_type": "insert",
"id": 1,
"table_name": "t1",
"table_schema": "test"
}
The table_name and table_schema fields were added in MaxScale 2.5.3. These contain the table name and schema the event targets.
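As an illustration, a Kafka consumer could split such a DML event into its GTID metadata and the row columns. This is a hypothetical sketch, not part of MaxScale; the parse_event helper and its metadata list are assumptions of this example:

```python
import json

# Metadata keys present in every DML event generated by KafkaCDC
METADATA = {"domain", "server_id", "sequence", "event_number",
            "timestamp", "event_type", "table_name", "table_schema"}

def parse_event(message):
    """Split a KafkaCDC DML event into (gtid, event_type, row columns)."""
    event = json.loads(message)
    gtid = f"{event['domain']}-{event['server_id']}-{event['sequence']}"
    row = {k: v for k, v in event.items() if k not in METADATA}
    return gtid, event["event_type"], row

gtid, etype, row = parse_event(
    '{"domain": 0, "server_id": 3000, "sequence": 20, "event_number": 1,'
    ' "timestamp": 1580485945, "event_type": "insert", "id": 1,'
    ' "table_name": "t1", "table_schema": "test"}')
print(gtid, etype, row)  # 0-3000-20 insert {'id': 1}
```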
The router stores table metadata in the MaxScale data directory. The default location is /var/lib/maxscale/<service name>. If data for a table is replicated before a DDL event for it is replicated, the CREATE TABLE statement will be queried from the master server.
During shutdown, the Kafka event queue is flushed. This can take up to 60 seconds if the network is slow or there are network problems.
In order for kafkacdc to work, binary logging on the source server must be configured to use row-based replication with a full row image by setting binlog_format=ROW and binlog_row_image=FULL.
The servers parameter defines the set of servers from which the data is replicated. The replication will be done from the first master server that is found.
The user and password of the service will be used to connect to the master. This user requires the REPLICATION SLAVE grant.
The KafkaCDC service must not be configured to use listeners. If a listener is configured, all attempts to start a session will fail.
bootstrap_servers
Type: string
Mandatory: Yes
Dynamic: No
The list of Kafka brokers to use in host:port format. Multiple values can be separated with commas. This is a mandatory parameter.
topic
Type: string
Mandatory: Yes
Dynamic: No
The Kafka topic where the replicated events will be sent. This is a mandatory parameter.
enable_idempotence
Type: boolean
Mandatory: No
Dynamic: No
Default: false
Enable idempotent producer mode. This feature requires Kafka version 0.11 or newer to work and is disabled by default.
When enabled, the Kafka producer enters a strict mode which avoids event duplication due to broker outages or other network errors. In HA scenarios where there are more than two MaxScale instances, event duplication can still happen as there is no synchronization between the MaxScale instances.
The Kafka C library, librdkafka, describes the parameter as follows:
When set to true, the producer will ensure that messages are successfully produced exactly once and in the original produce order. The following configuration properties are adjusted automatically (if not modified by the user) when idempotence is enabled: max.in.flight.requests.per.connection=5 (must be less than or equal to 5), retries=INT32_MAX (must be greater than 0), acks=all, queuing.strategy=fifo.
timeout
Type: duration
Mandatory: No
Dynamic: Yes
Default: 10s
The connection and read timeout for the replication stream.
gtid
Type: string
Mandatory: No
Dynamic: No
Default: ""
The initial GTID position from where the replication is started. By default the replication is started from the beginning. The value of this parameter is only used if no previously replicated events with GTID positions can be retrieved from Kafka.
Once the replication has started and a GTID position has been recorded, this parameter will be ignored. To reset the recorded GTID position, delete the current_gtid.txt file located in /var/lib/maxscale/<SERVICE>/ where <SERVICE> is the name of the KafkaCDC service.
server_id
Type: number
Mandatory: No
Dynamic: No
Default: 1234
The server_id used when replicating from the master in direct replication mode. The default value is 1234. This parameter was added in MaxScale 2.5.7.
match
Type: regex
Mandatory: No
Dynamic: Yes
Default: ""
Only include data from tables that match this pattern.
For example, if configured with match=accounts[.].*, only data from the accounts database is sent to Kafka.
The pattern is matched against the combined database and table name separated by a period. This means that the event for the table t1 in the test database would appear as test.t1. The behavior is the same even if the database or the table name contains a period. This means that an event for the test.table table in the my.data database would appear as my.data.test.table.
Here is an example configuration that only sends events for tables from the db1 database. The accounts and users tables in the db1 database are filtered out using the exclude parameter.
[Kafka-CDC]
type=service
router=kafkacdc
servers=server1
user=maxuser
password=maxpwd
bootstrap_servers=127.0.0.1:9092
topic=my-cdc-topic
match=db1[.]
exclude=db1[.](accounts|users)
exclude
Type: regex
Mandatory: No
Dynamic: Yes
Default: ""
Exclude data from tables that match this pattern.
For example, if configured with exclude=mydb[.].*, no data from the tables in the mydb database is sent to Kafka.
The pattern matching works the same way for both the exclude and match parameters. See match for an explanation of how the patterns are matched against the database and table names.
cooperative_replication
Type: boolean
Mandatory: No
Dynamic: No
Default: false
Controls whether multiple instances cooperatively replicate from the same cluster. This is a boolean parameter and is disabled by default. It was added in MaxScale 6.0.
When this parameter is enabled and the monitor pointed to by the cluster parameter supports cooperative monitoring (currently only mariadbmon), the replication is only active if the monitor owns the cluster it is monitoring.
Whenever an instance that does not own the cluster gains ownership of the cluster, the replication will continue from the latest GTID that was delivered to Kafka.
This means that multiple MaxScale instances can replicate from the same set of servers and each event is only processed once. This feature does not provide exactly-once semantics for Kafka event delivery. However, it does provide high availability for the kafkacdc instances, which allows automated failover between multiple MaxScale instances.
read_gtid_from_kafka
Type: boolean
Mandatory: No
Dynamic: No
Default: true
On startup, the latest GTID is by default read from the Kafka cluster. This makes it possible to recover the replication position stored by another MaxScale. Sometimes this is not desirable and the GTID should only be read from the local file or started anew. Examples of these are when the GTIDs are reset or the replication topology has changed.
kafka_ssl
Type: boolean
Mandatory: No
Dynamic: No
Default: false
Enable SSL for Kafka connections. This is a boolean parameter and is disabled by default.
kafka_ssl_ca
Type: path
Mandatory: No
Dynamic: No
Default: ""
Path to the certificate authority file in PEM format. If this is not provided, the default system certificates will be used.
kafka_ssl_cert
Type: path
Mandatory: No
Dynamic: No
Default: ""
Path to the public certificate in PEM format.
The client must provide a certificate if the Kafka server performs authentication of the client certificates. This feature is enabled by default in Kafka and is controlled by ssl.endpoint.identification.algorithm.
If kafka_ssl_cert is provided, kafka_ssl_key must also be provided.
kafka_ssl_key
Type: path
Mandatory: No
Dynamic: No
Default: ""
Path to the private key in PEM format.
If kafka_ssl_key is provided, kafka_ssl_cert must also be provided.
kafka_sasl_user
Type: string
Mandatory: No
Dynamic: No
Default: ""
Username for SASL authentication.
If kafka_sasl_user is provided, kafka_sasl_password must also be provided.
kafka_sasl_password
Type: string
Mandatory: No
Dynamic: No
Default: ""
Password for SASL authentication.
If kafka_sasl_password is provided, kafka_sasl_user must also be provided.
kafka_sasl_mechanism
Type: enum
Mandatory: No
Dynamic: No
Values: PLAIN, SCRAM-SHA-256, SCRAM-SHA-512
Default: PLAIN
The SASL mechanism used. The default value is PLAIN, which uses plaintext authentication. It is recommended to enable SSL whenever plaintext authentication is used.
Allowed values are:
PLAIN
SCRAM-SHA-256
SCRAM-SHA-512
The value that should be used depends on the SASL mechanism used by the Kafka broker.
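For example, SCRAM authentication over SSL could be layered onto the kafkacdc service like this (a sketch; credentials, paths and the service name are placeholders):

```ini
[Kafka-CDC]
type=service
router=kafkacdc
servers=server1
user=maxuser
password=maxpwd
bootstrap_servers=127.0.0.1:9092
topic=my-cdc-topic
kafka_ssl=true
kafka_ssl_ca=/etc/ssl/certs/kafka-ca.pem
kafka_sasl_mechanism=SCRAM-SHA-256
kafka_sasl_user=cdc-user
kafka_sasl_password=cdc-secret
```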
The following configuration defines the minimal setup for streaming replication events from MariaDB into Kafka as JSON:
# The server we're replicating from
[server1]
type=server
address=127.0.0.1
port=3306
# The monitor for the server
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=server1
user=maxuser
password=maxpwd
monitor_interval=5000
# The MariaDB-to-Kafka CDC service
[Kafka-CDC]
type=service
router=kafkacdc
servers=server1
user=maxuser
password=maxpwd
bootstrap_servers=127.0.0.1:9092
topic=my-cdc-topic
The KafkaCDC module provides at-least-once semantics for the generated events. This means that each replication event is delivered to Kafka at least once, but there can be duplicate events in case of failures.
This page is licensed: CC BY-SA / Gnu FDL
The KafkaImporter module reads messages from Kafka and streams them into a MariaDB server. The messages are inserted into a table designated by either the topic name or the message key (see table_name_in for details). The table will be automatically created with the following SQL:
CREATE TABLE IF NOT EXISTS my_table (
data LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
id VARCHAR(1024) AS (JSON_EXTRACT(data, '$._id')) UNIQUE KEY,
CONSTRAINT data_is_json CHECK(JSON_VALID(data))
);
The payload of the message is inserted into the data field, from which the id field is calculated. The payload must be a valid JSON object, and its _id field must either hold a unique value, not exist at all, or be a JSON null. This is similar to the MongoDB document format, where the _id field is the primary key of the document collection.
If a message is read from Kafka and the insertion into the table fails due to a violation of one of the constraints, the message is ignored. Similarly, messages with a duplicate _id value are also ignored: this is done to avoid inserting the same document multiple times whenever the connection to either Kafka or MariaDB is lost.
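The acceptance rules can be illustrated with a small sketch that mimics the table constraints in plain Python (this is illustrative only, not MaxScale code):

```python
import json

def accepted(payload: str, seen_ids: set) -> bool:
    """Mimic the KafkaImporter table constraints: the payload must be a
    JSON object, and a non-null _id must be unique."""
    try:
        doc = json.loads(payload)
    except ValueError:
        return False          # CHECK(JSON_VALID(data)) would fail
    if not isinstance(doc, dict):
        return False          # payload must be a JSON object
    doc_id = doc.get("_id")   # may be absent or JSON null
    if doc_id is None:
        return True           # missing or null _id is allowed (recent versions)
    if doc_id in seen_ids:
        return False          # duplicate _id: message is ignored
    seen_ids.add(doc_id)
    return True

seen = set()
print(accepted('{"_id": 1, "name": "a"}', seen))  # True
print(accepted('{"_id": 1, "name": "b"}', seen))  # False: duplicate _id
print(accepted('{"name": "c"}', seen))            # True: _id may be absent
print(accepted('not json', seen))                 # False: not valid JSON
```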
The limitations on the data can be removed by either creating the table before the KafkaImporter is started, in which case the CREATE TABLE IF NOT EXISTS does nothing, or by altering the structure of the existing table. The minimum requirement is that the table contains a data field into which string values can be inserted.
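For example, a table with the constraints relaxed could be created in advance; the hypothetical schema below keeps only the mandatory data field:

```sql
-- Pre-created before starting the KafkaImporter; the importer's later
-- CREATE TABLE IF NOT EXISTS then does nothing.
CREATE TABLE IF NOT EXISTS my_table (
    data LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL
);
```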
The database server where the data is inserted is chosen from the set of servers available to the service. The first server labeled as the Master with the best rank will be chosen. This means that a monitor must be configured for the MariaDB server where the data is to be inserted.
In MaxScale versions 21.06.18, 22.08.15, 23.02.12, 23.08.8, 24.02.4 and 24.08.2 the _id field is not required to be present. Older versions of MaxScale used the following SQL where the _id field was mandatory:
CREATE TABLE IF NOT EXISTS my_table (
data LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
id VARCHAR(1024) AS (JSON_EXTRACT(data, '$._id')) UNIQUE KEY,
CONSTRAINT data_is_json CHECK(JSON_VALID(data)),
CONSTRAINT id_is_not_null CHECK(JSON_EXTRACT(data, '$._id') IS NOT NULL)
);
The user defined by the user parameter of the service must have INSERT and CREATE privileges on all tables that are created.
bootstrap_servers
Type: string
Mandatory: Yes
Dynamic: Yes
The list of Kafka brokers as a CSV list in host:port format.
topics
Type: stringlist
Mandatory: Yes
Dynamic: Yes
The comma separated list of topics to subscribe to.
batch_size
Type: count
Mandatory: No
Dynamic: Yes
Default: 100
Maximum number of uncommitted records. The KafkaImporter will buffer records into batches and commit them once either enough records are gathered (controlled by this parameter) or when the KafkaImporter goes idle. Any uncommitted records will be read again if a reconnection to either Kafka or MariaDB occurs.
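The commit logic can be sketched as follows (a simplified plain-Python model of the described behavior, not MaxScale code):

```python
class BatchCommitter:
    """Commit records when the batch is full or when going idle."""

    def __init__(self, batch_size=100):
        self.batch_size = batch_size
        self.pending = []    # uncommitted records, re-read on reconnect
        self.committed = []

    def on_record(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.batch_size:
            self.commit()    # enough records gathered

    def on_idle(self):
        if self.pending:
            self.commit()    # idle flush commits a partial batch

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()

c = BatchCommitter(batch_size=3)
for r in ["a", "b", "c", "d"]:
    c.on_record(r)
print(c.committed)  # ['a', 'b', 'c'] - first full batch committed
c.on_idle()
print(c.committed)  # ['a', 'b', 'c', 'd'] - idle flush commits the rest
```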
kafka_sasl_mechanism
Type: enum
Mandatory: No
Dynamic: Yes
Values: PLAIN
, SCRAM-SHA-256
, SCRAM-SHA-512
Default: PLAIN
SASL mechanism to use. The Kafka broker must be configured with the same authentication scheme.
kafka_sasl_user
Type: string
Mandatory: No
Dynamic: Yes
Default: ""
SASL username used for authentication. If this parameter is defined, kafka_sasl_password must also be provided.
kafka_sasl_password
Type: string
Mandatory: No
Dynamic: Yes
Default: ""
SASL password for the user. If this parameter is defined, kafka_sasl_user must also be provided.
kafka_ssl
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Enable SSL for Kafka connections.
kafka_ssl_ca
Type: path
Mandatory: No
Dynamic: Yes
Default: ""
SSL Certificate Authority file in PEM format. If this parameter is not defined, the system default CA certificate is used.
kafka_ssl_cert
Type: path
Mandatory: No
Dynamic: Yes
Default: ""
SSL public certificate file in PEM format. If this parameter is defined, kafka_ssl_key must also be provided.
kafka_ssl_key
Type: path
Mandatory: No
Dynamic: Yes
Default: ""
SSL private key file in PEM format. If this parameter is defined, kafka_ssl_cert must also be provided.
table_name_in
Type: enum
Mandatory: No
Dynamic: Yes
Values: topic, key
Default: topic
The Kafka message part that is used to locate the table to insert the data into.
Enumeration Values:
topic: The topic name is used as the fully qualified table name.
key: The message key is used as the fully qualified table name. If the Kafka message does not have a key, the message is ignored.
For example, all messages with a fully qualified table name of my_db.my_table will be inserted into the table my_table located in the my_db database. If the table or database names have special characters that must be escaped to make them valid identifiers, the name must also contain those escape characters. For example, to insert into a table named my table in the database my database, the name would be:
`my database`.`my table`
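Building such a name can be sketched in a few lines (illustrative only, not MaxScale code; it follows MariaDB identifier quoting, where a literal backtick inside an identifier is doubled):

```python
def quote_identifier(name: str) -> str:
    """Quote a MariaDB identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"

def fq_table_name(db: str, table: str) -> str:
    """Build a fully qualified table name suitable for use as a Kafka
    message key (or topic name) with table_name_in."""
    return f"{quote_identifier(db)}.{quote_identifier(table)}"

print(fq_table_name("my database", "my table"))
# `my database`.`my table`
```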
timeout
Type: duration
Mandatory: No
Dynamic: Yes
Default: 5000ms
Timeout for both Kafka and MariaDB network communication.
The backend servers used by this service must be MariaDB version 10.2 or newer.
This page is licensed: CC BY-SA / Gnu FDL
The mirror router is designed for data consistency and database behavior verification during system upgrades. It allows statement duplication to multiple servers, in a manner similar to that of the Tee filter, with export of collected query metrics.
For each executed query the router exports a JSON object that describes the query results and has the following fields:
query
The executed SQL if an SQL statement was executed
command
The SQL command
session
The connection ID of the session that executed the query
query_id
Query sequence number, starts from 1
results
Array of query result objects
The objects in the results array describe an individual query result and have the following fields:
target
The target where the query was executed
checksum
The CRC32 checksum of the result
rows
Number of returned rows
warnings
Number of returned warnings
duration
Query duration in milliseconds
type
Result type, one of ok, error or resultset
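Based on the fields above, a single exported object might look like this (all values are illustrative; the exact checksum and duration formats may differ):

```json
{
  "query": "SELECT 1",
  "command": "COM_QUERY",
  "session": 4,
  "query_id": 1,
  "results": [
    {
      "target": "server1",
      "checksum": "2a1f83bc",
      "rows": 1,
      "warnings": 0,
      "duration": 1,
      "type": "resultset"
    },
    {
      "target": "server2",
      "checksum": "2a1f83bc",
      "rows": 1,
      "warnings": 0,
      "duration": 2,
      "type": "resultset"
    }
  ]
}
```

Matching checksums across the results array indicate that both targets returned the same result for the query.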
main
Type: target
Mandatory: Yes
Dynamic: Yes
The main target from which results are returned to the client. This is a mandatory parameter and must define one of the targets configured in the targets parameter of the service.
If the connection to the main target cannot be created or is lost mid-session, the client connection will be closed. Connection failures to other targets are not fatal errors and any open connections to them will be closed. The router does not create new connections after the initial connections are created.
exporter
Type: enum
Mandatory: Yes
Dynamic: Yes
Values: log
, file
, kafka
The exporter where the data is exported. This is a mandatory parameter. Possible values are:
log
Exports metrics to MaxScale log on INFO level. No configuration parameters.
file
Exports metrics to a file. Configured with the file parameter.
kafka
Exports metrics to a Kafka broker. Configured with the kafka_broker and kafka_topic parameters.
file
Type: string
Default: No default value
Mandatory: No
Dynamic: Yes
The output file where the metrics will be written. The file must be writable by the user that is running MaxScale, usually the maxscale user.
When the file parameter is altered at runtime, the old file is closed before the new file is opened. This makes it a convenient way of rotating the file where the metrics are exported. Note that the file name must actually change for the alteration to take effect.
This is a mandatory parameter when configured with exporter=file.
kafka_broker
Type: string
Default: No default value
Mandatory: No
Dynamic: Yes
The Kafka broker list. Must be given as a comma-separated list of broker hosts with optional ports in host:port format.
This is a mandatory parameter when configured with exporter=kafka.
kafka_topic
Type: string
Default: No default value
Mandatory: No
Dynamic: Yes
The Kafka topic where the metrics are sent.
This is a mandatory parameter when configured with exporter=kafka.
on_error
Type: enum
Default: ignore
Mandatory: No
Dynamic: Yes
Values: ignore, close
What to do when a backend network connection fails. Accepted values are:
ignore
Ignore the failing backend if it's not the backend that the main parameter points to.
close
Close the client connection when the first backend fails.
This parameter was added in MaxScale 6.0. Older versions always ignored failing backends.
report
Type: enum
Default: always
Mandatory: No
Dynamic: Yes
Values: always, on_conflict
When to report the result of the queries. Accepted values are:
always
Always report the result for all queries.
on_conflict
Only report when one or more backends return a conflicting result.
This parameter was added in MaxScale 6.0. Older versions always reported the result.
[server1]
type=server
address=127.0.0.1
port=3000
[server2]
type=server
address=127.0.0.1
port=3001
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=server1,server2
user=maxuser
password=maxpwd
monitor_interval=2000
[Mirror-Router]
type=service
router=mirror
user=maxuser
password=maxpwd
targets=server1,server2
main=server1
exporter=file
file=/tmp/Mirror-Router.log
[Mirror-Listener]
type=listener
service=Mirror-Router
port=3306
Broken network connections are not recreated.
Prepared statements are not supported.
Contents of non-SQL statements are not added to the exported metrics.
Data synchronization in dynamic environments (e.g. when replication is in use) is not guaranteed. This means that result mismatches can be reported when the data is only eventually consistent.
This page is licensed: CC BY-SA / Gnu FDL
This document provides an overview of the readconnroute router module and its intended use case scenarios. It also displays all router configuration parameters with their descriptions.
The readconnroute router provides simple and lightweight load balancing across a set of servers. The router can also be configured to balance connections based on a weighting parameter defined in the server's section.
Note that readconnroute balances connections and not statements. When a client connects, the router selects a server based upon the router configuration and current server load, but the single created connection is fixed and will not be changed for the duration of the session. If the connection between MaxScale and the server breaks, the connection cannot be re-established and the session will be closed. The fact that the server is fixed when the client connects also means that routing hints are ignored.
Warning: readconnroute will not prevent writes from being done even if you define router_options=slave. The client application is responsible for making sure that it only performs read-only queries in such cases. readconnroute is simple by design: it selects a server for each client connection and routes all queries there. If something more complex is required, the readwritesplit router is usually the right choice.
For more details about the standard service parameters, refer to the Configuration Guide.
router_options
router_options can contain a comma separated list of valid server roles. These roles are used as the valid types of servers the router will form connections to when new sessions are created.
Examples:
router_options=slave
router_options=master,slave
Here is a list of all possible values for router_options.
master
A server assigned as a master by one of MariaDB MaxScale monitors. Depending on the monitor implementation, this could be a master server of a Master-Slave replication cluster or a Write-Master of a Galera cluster.
slave
A server assigned as a slave of a master. If all slaves are down, but the master is still available, then the router will use the master.
synced
A Galera cluster node which is in a synced state with the cluster.
running
A server that is up and running. All servers that MariaDB MaxScale can connect to are labeled as running.
If no router_options parameter is configured in the service definition, the router will use the default value of running. This means that it will load balance connections across all running servers defined in the servers parameter of the service.
When a connection is being created and the candidate server is being chosen, the list of servers is processed from first entry to last. This means that if two servers with equal weight and status are found, the one that's listed first in the servers parameter for the service is chosen.
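The tie-breaking order can be sketched as follows (a simplified model using connection count as the load metric; this is illustrative, not the actual MaxScale implementation):

```python
def choose_server(servers):
    """Pick the server with the lowest load among running servers;
    configuration order breaks ties, so the server listed first in
    the service's servers parameter wins."""
    candidates = [s for s in servers if s["running"]]
    if not candidates:
        return None
    best = min(s["connections"] for s in candidates)
    # Scanning in configured order makes the tie-break explicit.
    for s in candidates:
        if s["connections"] == best:
            return s["name"]

servers = [
    {"name": "slave1", "running": True, "connections": 2},
    {"name": "slave2", "running": True, "connections": 2},
    {"name": "slave3", "running": False, "connections": 0},
]
print(choose_server(servers))  # slave1: equal load, listed first
```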
master_accept_reads
This option can be used to prevent queries from being sent to the current master.
If router_options does not contain "master", the readconnroute instance is usually meant for reading. Setting master_accept_reads=false excludes the master from server selection (and thus from receiving reads).
If router_options contains "master", the setting of master_accept_reads has no effect.
By default master_accept_reads=true.
max_replication_lag
The maximum acceptable replication lag. The value is in seconds and is specified as documented here. The default value is 0s, which means that the lag is ignored.
The replication lag of a server must be less than the configured value in order for it to be used for routing. To configure the router to not allow any lag, use the smallest duration larger than 0, that is, max_replication_lag=1s.
The most common use for the readconnroute is to provide either a read or write port for an application. This provides a more lightweight routing solution than the more complex readwritesplit router but requires the application to be able to use distinct write and read ports.
To configure a read-only service that tolerates master failures, we first need to add a new section into the configuration file.
[Read-Service]
type=service
router=readconnroute
servers=slave1,slave2,slave3
router_options=slave
Here the router_options designates slaves as the only valid server type. With this configuration, the queries are load balanced across the slave servers.
For more complex examples of the readconnroute router, take a look at the examples in the Tutorials folder.
The router_diagnostics output for readconnroute has the following fields.
queries
: Number of queries executed through this service.
Sending of binary data with LOAD DATA LOCAL INFILE is not supported.
The router will never reconnect to the server it initially connected to.
This page is licensed: CC BY-SA / Gnu FDL
This document provides a short overview of the readwritesplit router module and its intended use case scenarios. It also displays all router configuration parameters with their descriptions. A list of current limitations of the module is included and use examples are provided.
The readwritesplit router is designed to increase the read-only processing capability of a cluster while maintaining consistency. This is achieved by splitting the query load into read and write queries. Read queries, which do not modify data, are spread across multiple nodes while all write queries will be sent to a single node.
The router is designed to be used with a traditional Master-Slave replication cluster. It automatically detects changes in the master server and will use the current master server of the cluster. With a Galera cluster, one can achieve a resilient setup and easy master failover by using one of the Galera nodes as a Write-Master node, where all write queries are routed, and spreading the read load over all the nodes.
Maintenance and Draining state
When a server that readwritesplit uses is put into maintenance mode, any ongoing requests are allowed to finish before the connection is closed. If the server that is put into maintenance mode is a master, open transactions are allowed to complete before the connection is closed. Note that this means neither idle sessions nor long-running transactions will be closed by readwritesplit. To forcefully close the connections, use the following command:
maxctrl set server <server> maintenance --force
If a server is put into the Draining state while a connection is open, the connection will be used normally. Whenever a new connection needs to be created, whether due to a network error or a new session being opened, only servers that are neither Draining nor Drained will be used.
Readwritesplit router-specific settings are specified in the configuration file of MariaDB MaxScale in its specific section. The section can be freely named but the name is used later as a reference in a listener section.
For more details about the standard service parameters, refer to the Configuration Guide.
Starting with 2.3, all router parameters can be configured at runtime. Use maxctrl alter service to modify them. The changed configuration will only be taken into use by new sessions.
max_slave_connections
Type: integer
Mandatory: No
Dynamic: Yes
Default: 255
max_slave_connections sets the maximum number of slaves a router session uses at any moment. The default is to use at most 255 slave connections per client connection. In older versions the default was to use all available slaves with no limit.
For MaxScale 2.5.12 and newer, the minimum value is 0.
For MaxScale versions 2.5.11 and older, the minimum value is 1. These versions suffer from a bug (MXS-3536) that causes the parameter to accept any value but only function when a value greater than one was given.
Starting with MaxScale 2.5.0, the use of percentage values in max_slave_connections is deprecated. The support for percentages will be removed in a future release.
For example, if you have configured MaxScale with one master and three slaves and set max_slave_connections=2, for each client connection a connection to the master and two slave connections would be opened. The read query load balancing is then done between these two slaves and writes are sent to the master.
By tuning this parameter, you can control how dynamic the load balancing is at the cost of extra created connections. With a lower value of max_slave_connections, fewer connections per session are created and the set of possible slave servers is smaller. With a higher value of max_slave_connections, more connections are created, which requires more resources, but load balancing will almost always give the best single query response time and performance. Longer sessions are less affected by a high max_slave_connections as the relative cost of opening a connection is lower.
Behavior of max_slave_connections=0
When readwritesplit is configured with max_slave_connections=0, readwritesplit will behave slightly differently in that it will route all reads to the current master server. This is a convenient way to force all of the traffic to go to a single node while still being able to leverage the replay and reconnection features of readwritesplit.
In this mode, the behavior of master_failure_mode=fail_on_write also changes slightly. If the current Master server fails and a read is done when there's no other Master server available, the connection will be closed. This is done to prevent an extra slave connection from being opened that would not be closed if a new Master server were to arrive.
slave_connections
Type: integer
Mandatory: No
Dynamic: Yes
Default: 255
This parameter controls how many slave connections each new session starts with. The default value is 255, which is the same as the default value of max_slave_connections.
In contrast to max_slave_connections, slave_connections serves as a soft limit on how many slave connections are created. The number of slave connections can exceed slave_connections if the load balancing algorithm finds an unconnected slave server better than all other slaves.
Setting this parameter to 1 allows faster connection creation and improved resource usage due to the smaller number of initial backend connections. It is recommended to use slave_connections=1 when the lifetime of the client connections is short.
max_slave_replication_lag
Type: duration
Mandatory: No
Dynamic: Yes
Default: 0s
Specify how many seconds a slave is allowed to be behind the master. The lag of a slave must be less than the configured value in order for it to be used for routing. If set to 0 (the default value), the feature is disabled.
In MaxScale 2.5.0, the slave lag must be less than max_slave_replication_lag whereas in older versions the slave lag had to be less than or equal to max_slave_replication_lag. This means that in MaxScale 2.5.0 it is possible to define, with max_slave_replication_lag=1, that all slaves must be up to date in order for them to be used for routing.
Note that this feature does not guarantee that writes done on the master are visible for reads done on the slave. This is mainly due to the method of replication lag measurement. For a feature that guarantees this, refer to causal_reads.
The lag is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the lag is seconds, a lag specified in milliseconds will be rejected, even if the duration is longer than a second.
The Readwritesplit-router does not detect the replication lag itself. A monitor such as the MariaDB-monitor for a Master/Slave-cluster is required. This option only affects Master-Slave clusters. Galera clusters do not have a concept of slave lag even if the application of write sets might have lag. When a server is disqualified from routing because of replication lag, a warning is logged. Similarly, when the server has caught up enough to be a valid routing target, another warning is logged. These messages are only logged when a query is being routed and the replication state changes.
use_sql_variables_in
Type: enum
Mandatory: No
Dynamic: Yes
Values: master, all
Default: all
use_sql_variables_in specifies where queries that read session variables should be routed. The syntax for use_sql_variables_in is:
use_sql_variables_in=[master|all]
The default is to use SQL variables in all servers.
When the value all is used, queries reading session variables can be routed to any available slave (depending on selection criteria). Queries modifying session variables are routed to all backend servers by default, excluding write queries with embedded session variable modifications, such as:
INSERT INTO test.t1 VALUES (@myid:=@myid+1)
In the above-mentioned case the user-defined variable would only be updated in the master, where the query would be routed due to the INSERT statement.
[Splitter-Service]
type=service
router=readwritesplit
servers=dbserv1, dbserv2, dbserv3
user=maxscale
password=96F99AA1315BDC3604B006F427DD9484
disable_sescmd_history=true
master_failure_mode=fail_on_write
connection_keepalive
Note: This parameter has been moved into the MaxScale core. For the current documentation, read the connection_keepalive section in the configuration guide.
Send keepalive pings to backend servers. This feature was introduced in MaxScale 2.2.0. The default value is 300 seconds starting with 2.3.2 and for older versions the feature was disabled by default. This parameter was converted into a service parameter in MaxScale 2.5.0.
master_reconnection
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Allow the master server to change mid-session. This feature was introduced in MaxScale 2.3.0 and is disabled by default. This feature requires that disable_sescmd_history is not used.
When a readwritesplit session starts, it will pick a master server as the current master server of that session. By default, when this master server is lost or changes to another server, the connection will be closed.
When master_reconnection is enabled, readwritesplit can sometimes recover a lost connection to the master server. This largely depends on the value of master_failure_mode.
With master_failure_mode=fail_instantly, the master server is only allowed to change to another server. This change must happen without a loss of the master server.
With master_failure_mode=fail_on_write, the loss of the master server is no longer a fatal error: if a replacement master server appears before any write queries are received, readwritesplit will transparently reconnect to the new master server.
In both cases the change in the master server can only take place if prune_sescmd_history is enabled or max_sescmd_history has not yet been exceeded, and the session does not have an open transaction.
The recommended configuration is to use master_reconnection=true and master_failure_mode=fail_on_write. This provides improved fault tolerance without any risk to the consistency of the database.
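Applied to a service section, the recommended settings look like this (service name, server names and credentials are illustrative):

```ini
[Splitter-Service]
type=service
router=readwritesplit
servers=dbserv1, dbserv2, dbserv3
user=maxscale
password=maxpwd
master_reconnection=true
master_failure_mode=fail_on_write
```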
slave_selection_criteria
Type: enum
Mandatory: No
Dynamic: Yes
Values: LEAST_CURRENT_OPERATIONS, ADAPTIVE_ROUTING, LEAST_BEHIND_MASTER, LEAST_ROUTER_CONNECTIONS, LEAST_GLOBAL_CONNECTIONS
Default: LEAST_CURRENT_OPERATIONS
This option controls how the readwritesplit router chooses the slaves it connects to and how the load balancing is done. The default behavior is to route read queries to the slave server with the lowest number of ongoing queries, i.e. LEAST_CURRENT_OPERATIONS.
The option syntax:
slave_selection_criteria=<criteria>
Where <criteria> is one of the following values.
LEAST_CURRENT_OPERATIONS (default), the slave with the least active operations
ADAPTIVE_ROUTING, based on server average response times
LEAST_BEHIND_MASTER, the slave with the smallest replication lag
LEAST_GLOBAL_CONNECTIONS, the slave with the least connections from MariaDB MaxScale
LEAST_ROUTER_CONNECTIONS, the slave with the least connections from this service
LEAST_CURRENT_OPERATIONS uses the current number of active operations (i.e. SQL queries) as the load balancing metric and optimizes for maximal query throughput. Each query gets routed to the server with the least active operations, which results in faster servers processing more traffic.
ADAPTIVE_ROUTING uses the server response time and the current estimated server load as the load balancing metric. The server that is estimated to finish an additional query first is chosen. A modified average response time for each server is continuously updated to allow slow servers at least some traffic and to react quickly to changes in server load conditions. This selection criterion is designed for heterogeneous clusters: servers of differing hardware, differing network distances, or when other loads are running on the servers (including a backup). If the servers are queried by clients other than MaxScale, the load caused by them is indirectly taken into account.
LEAST_BEHIND_MASTER uses the measured replication lag as the load balancing metric. This means that servers that are more up-to-date are favored, which increases the likelihood of the data being read being up-to-date. However, this is not as effective as causal_reads would be, as there's no guarantee that writes done by the same connection will be routed to a server that has replicated those changes. The recommended approach is to use LEAST_CURRENT_OPERATIONS or ADAPTIVE_ROUTING in combination with causal_reads.
NOTE: LEAST_GLOBAL_CONNECTIONS and LEAST_ROUTER_CONNECTIONS should not be used; they are legacy options that exist only for backwards compatibility. Using them will result in skewed load balancing as the algorithm uses a metric that's too coarse (number of connections) to load balance something that's finer (individual SQL queries).
The LEAST_GLOBAL_CONNECTIONS and LEAST_ROUTER_CONNECTIONS criteria use the connections from MariaDB MaxScale to the server, not the number of connections reported by the server itself.
Starting with MaxScale versions 2.5.29, 6.4.11, 22.08.9, 23.02.5 and 23.08.1, lowercase versions of the values are also accepted. For example, slave_selection_criteria=LEAST_CURRENT_OPERATIONS and slave_selection_criteria=least_current_operations are both accepted as valid values.
max_sescmd_history
This parameter has been moved to the MaxScale core in MaxScale 6.0.
disable_sescmd_history
This parameter has been moved to the MaxScale core in MaxScale 6.0.
prune_sescmd_history
This parameter has been moved to the MaxScale core in MaxScale 6.0.
master_accept_reads
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
master_accept_reads allows the master server to be used for reads. This is a useful option to enable if you are using a small number of servers and wish to use the master for reads as well.
By default, no reads are sent to the master as long as there is a valid slave server available. If no slaves are available, reads are sent to the master regardless of the value of master_accept_reads.
# Use the master for reads
master_accept_reads=true
strict_multi_stmt
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
This option is disabled by default since MaxScale 2.2.1. In older versions, this option was enabled by default.
When a client executes a multi-statement query, it will be treated as if it were a DML statement and routed to the master. If the option is enabled, all queries after a multi-statement query will be routed to the master to guarantee a consistent session state.
If the feature is disabled, queries are routed normally after a multi-statement query.
Warning: Enable the strict mode only if you know that the clients will send statements that cause inconsistencies in the session state.
# Enable strict multi-statement mode
strict_multi_stmt=true
strict_sp_calls
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Similar to strict_multi_stmt, this option allows all queries after a CALL operation on a stored procedure to be routed to the master. This option is disabled by default and was added in MaxScale 2.1.9.
All warnings and restrictions that apply to strict_multi_stmt also apply to strict_sp_calls.
master_failure_mode
Type: enum
Mandatory: No
Dynamic: Yes
Values: fail_instantly, fail_on_write, error_on_write
Default: fail_instantly
This option controls how the failure of a master server is handled. By default, the router will close the client connection as soon as the master is lost.
The following table describes the values for this option and how they treat the loss of a master server.
fail_instantly: When the failure of the master server is detected, the connection will be closed immediately.
fail_on_write: The client connection is closed if a write query is received when no master is available.
error_on_write: If no master is available and a write query is received, an error is returned stating that the connection is in read-only mode.
These modes also apply to new sessions created after the master has failed. This means
that in fail_on_write or error_on_write mode, connections are accepted as
long as slave servers are available.
When configured with fail_on_write or error_on_write, sessions that are idle
will not be closed even if all backend connections for that session have
failed. This is done in the hopes that before the next query from the idle
session arrives, a reconnection to one of the slaves is made. However, this can
leave idle connections around unless the client application actively closes
them. To prevent this, use the connection_timeout parameter.
Note: If master_failure_mode is set to error_on_write and the connection
to the master is lost, by default, clients will not be able to execute write
queries without reconnecting to MariaDB MaxScale once a new master is
available. If master_reconnection is enabled, the
session can recover if one of the slaves is promoted as the master.
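For example, the following keeps the service usable for reads while no master is available:
# Return an error for writes instead of closing the connection
master_failure_mode=error_on_write
master_reconnection=true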
retry_failed_reads
Type: boolean
Mandatory: No
Dynamic: Yes
Default: true
This option controls whether autocommit selects are retried in case of failure. This option is enabled by default.
When a simple autocommit select is being executed outside of a transaction and the slave server where the query is being executed fails, readwritesplit can retry the read on a replacement server. This makes the failure of a slave transparent to the client.
delayed_retry
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Retry queries over a period of time. This parameter takes a boolean value, was added in MaxScale 2.3.0 and is disabled by default.
When this feature is enabled, a failure to route a query due to a connection problem will not immediately result in an error. The routing of the query is delayed until either a valid candidate server is available or the retry timeout is reached. If a candidate server becomes available before the timeout is reached, the query is routed normally and no connection error is returned. If no candidates are found and the timeout is exceeded, the router returns to normal behavior and returns an error.
When combined with the master_reconnection
parameter, failures of writes done
outside of transactions can be hidden from the client connection. This allows a
master to be replaced while writes are being sent.
Starting with MaxScale 21.06.18, 22.08.15, 23.02.12, 23.08.8, 24.02.4 and
24.08.1, delayed_retry will no longer attempt to retry a query if it was
already sent to the database. If a query is received while a valid target server
is not available, the execution of the query is delayed until a valid target is
found or the delayed retry timeout is hit. If a query was already sent, it will
not be replayed to prevent duplicate execution of statements.
In older versions of MaxScale, duplicate execution of a statement can occur if
the connection to the server is lost or the server crashes but the server comes
back up before the timeout for the retrying is exceeded. At this point, if the
server managed to read the client's statement, it will be executed. For this
reason, it is recommended to only enable delayed_retry in older versions of
MaxScale when the possibility of duplicate statement execution is an acceptable
risk.
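For example (the timeout value is illustrative):
# Hide transient connection failures for up to 30 seconds
delayed_retry=true
delayed_retry_timeout=30s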
delayed_retry_timeout
Type: duration
Mandatory: No
Dynamic: Yes
Default: 10s
The duration to wait until an error is returned to the client when delayed_retry
is enabled. The default value is 10 seconds.
The timeout is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the timeout is seconds, a timeout specified in milliseconds will be rejected, even if the duration is longer than a second.
transaction_replay
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Replay interrupted transactions. This parameter was added in MaxScale 2.3.0 and
is disabled by default. Enabling this parameter enables both delayed_retry
and master_reconnection and sets master_failure_mode to fail_on_write, thereby
overriding any configured values for these parameters.
When the server where the transaction is in progress fails, readwritesplit can migrate the transaction to a replacement server. This can completely hide the failure of a master node without any visible effects to the client.
If no replacement node becomes available, the client connection is closed.
To control how long a transaction replay can take, use transaction_replay_timeout.
Please refer to the Transaction Replay Limitations section for a more detailed explanation of what should and should not be done with transaction replay.
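For example:
# Migrate interrupted transactions to a replacement server
transaction_replay=true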
transaction_replay_max_size
Type: size
Mandatory: No
Dynamic: Yes
Default: 1 MiB
The limit on transaction size for transaction replay in bytes. Any transaction
that exceeds this limit will not be replayed. The default value is 1 MiB. This
limit applies at a session level which means that the total peak memory
consumption can be transaction_replay_max_size times the number of client
connections.
The amount of memory needed to store a particular transaction will be slightly larger than the length in bytes of the SQL used in the transaction. If the limit is ever exceeded, a message will be logged at the info level.
Starting with MaxScale 6.4.10, the number of times that this limit has been
exceeded is shown in maxctrl show service as trx_max_size_exceeded.
Read the configuration guide for more details on size type parameters in MaxScale.
transaction_replay_attempts
Type: integer
Mandatory: No
Dynamic: Yes
Default: 5
The upper limit on how many times a transaction replay is attempted before giving up. The default value is 5.
A transaction replay failure can happen if the server where the transaction is being replayed fails while the replay is in progress. In practice this parameter controls how many server and network failures a single transaction replay tolerates. If a transaction is replayed successfully, the counter for failed attempts is reset.
transaction_replay_timeout
Type: duration
Mandatory: No
Dynamic: Yes
Default: 0s
Controls how long transaction replay is attempted before giving up. This feature is disabled by default and was added in MaxScale 6.2.1. To explicitly disable this feature, set the value to 0 seconds.
The timeout is a duration type and the value must include a unit for the duration.
When transaction_replay_timeout is enabled, the time a transaction replay can
take is controlled solely by this parameter. This is a more convenient and
predictable method of controlling how long a transaction replay can be attempted
before the connection is closed.
If delayed_retry_timeout is less than transaction_replay_timeout, it is set
to the same value.
By default, the length of time a transaction can be retried is controlled by
delayed_retry_timeout and transaction_replay_attempts. This can result in a
maximum replay time limit of delayed_retry_timeout multiplied by
transaction_replay_attempts; by default this is 50 seconds. The minimum replay
time limit can be as low as transaction_replay_attempts seconds (5 seconds by
default) in cases where the connection fails after it was created. Usually this
happens due to problems like the max_connections limit being hit on the database
server.
With the introduction of transaction_replay_timeout, these problems are
avoided. Starting with MaxScale 6.2.1, this is the recommended method of
controlling the timeouts for transaction replay.
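For example (the timeout value is illustrative):
# Give up on transaction replay after 30 seconds
transaction_replay=true
transaction_replay_timeout=30s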
transaction_replay_retry_on_deadlock
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Enable automatic retrying of transactions that end up in a deadlock. This parameter was added in MaxScale 2.4.6 and the feature is disabled by default. MaxScale versions from 2.4.0 to 2.4.5 always tried to replay deadlocked transactions.
If this feature is enabled and a transaction returns a deadlock error
(e.g. SQLSTATE 40001: Deadlock found when trying to get lock; try restarting transaction),
the transaction is automatically retried. If the retrying of the transaction
results in another deadlock error, it is retried until it either succeeds or a
transaction checksum error is encountered.
transaction_replay_retry_on_mismatch
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Retry transactions that end in a checksum mismatch. This parameter was added in MaxScale 6.2.1 and is disabled by default.
When enabled, any replayed transactions that end with a checksum mismatch are
retried until they either succeed or one of the transaction replay limits is
reached (delayed_retry_timeout, transaction_replay_timeout or
transaction_replay_attempts).
transaction_replay_checksum
Type: enum
Mandatory: No
Dynamic: Yes
Values: full, result_only, no_insert_id
Default: full
Selects which transaction checksum method is used to verify the result of the replayed transaction.
Note that only transaction_replay_checksum=full is guaranteed to retain the
consistency of the replayed transaction.
Possible values are:
full (default)
All responses from the server are included in the checksum. This retains the full consistency guarantee of the replayed transaction as it must match exactly the one that was already returned to the client.
result_only
Only resultsets and errors are included in the checksum. OK packets
(i.e. successful queries that do not return results) are ignored. This mode
is intended to be used in cases where the extra information (auto-generated
ID, warnings etc.) returned in the OK packet is not used by the
application.
This mode is safe to use only if the auto-generated ID is not actually used
by any following queries. An example of such behavior would be a transaction
that ends with an INSERT into a table with an AUTO_INCREMENT field.
no_insert_id
The same as result_only but results from queries that use LAST_INSERT_ID()
are also ignored. This mode is safe to use only if the
result of the query is not used by any subsequent statement in the
transaction.
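As an illustration, result_only would be safe for the following transaction because the auto-generated ID is never read afterwards; if a later statement used LAST_INSERT_ID(), only full would retain the consistency guarantee (table and column names are hypothetical):
BEGIN;
UPDATE account SET balance = balance - 100 WHERE id = 1;
-- audit_log has an AUTO_INCREMENT primary key that the transaction never reads
INSERT INTO audit_log (event) VALUES ('withdrawal');
COMMIT;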
optimistic_trx
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Enable optimistic transaction execution. This parameter controls whether normal
transactions (i.e. START TRANSACTION or BEGIN) are load balanced across
slaves. This feature is disabled by default and enabling it implicitly enables
the transaction_replay, delayed_retry and master_reconnection parameters.
When this mode is enabled, all transactions are first attempted on slave servers. If the transaction contains no statements that modify data, it is completed on the slave. If the transaction contains statements that modify data, it is rolled back on the slave server and restarted on the master. The rollback is initiated the moment a data modifying statement is intercepted by readwritesplit so only read-only statements are executed on slave servers.
As with transaction_replay
and transactions that are replayed, if the results
returned by the master server are not identical to the ones returned by the
slave up to the point where the first data modifying statement was executed, the
connection is closed. If the execution of ROLLBACK statement on the slave fails,
the connection to that slave is closed.
All limitations that apply to transaction_replay also apply to optimistic_trx.
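For example:
# Start transactions on slaves and migrate them to the master on the first write
optimistic_trx=true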
causal_reads
Type: enum
Mandatory: No
Dynamic: Yes
Values: none, local, global, fast
Default: none
Enable causal reads. This parameter is disabled by default and was introduced in MaxScale 2.3.0.
If a client connection modifies the database and causal_reads is enabled, any
subsequent reads performed on slave servers will be done in a manner that
prevents replication lag from affecting the results. This only applies to the
modifications done by the client itself.
Note: This feature requires MariaDB 10.2.16 or newer to function. In
addition to this, the session_track_system_variables parameter must include
last_gtid in its list of tracked system variables.
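On the MariaDB servers this can be enabled, for example, in the server configuration file (a minimal sketch):
# MariaDB server configuration
[mariadb]
session_track_system_variables=last_gtid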
Note: This feature also enables multi-statement execution of SQL in the
protocol. This is equivalent to using allowMultiQueries=true in
Connector/J or using CLIENT_MULTI_STATEMENTS and CLIENT_MULTI_RESULTS in the
Connector/C. The Implementation of causal_reads section explains why this is
necessary.
The possible values for this parameter are:
none (default)
Read causality is disabled.
local
Writes are locally visible. Writes are guaranteed to be visible only to the connection that does it. Unrelated modifications done by other connections are not visible. This mode improves read scalability at the cost of latency and reduces the overall load placed on the master server without breaking causality guarantees.
global
Writes are globally visible. If one connection writes a value, all
connections to the same service will see it. In general this mode is slower
than the local mode due to the extra synchronization it has to do. This
guarantees global happens-before ordering of reads when all transactions are
inside a single GTID domain. This mode gives similar benefits as the local
mode in that it improves read scalability at the cost of latency.
With MaxScale versions 2.5.14 and older, multi-domain use of causal_reads
could cause non-causal reads to occur. Starting with MaxScale 2.5.15, this
was fixed and all the GTID coordinates are passed alongside all requests
which makes multi-domain GTIDs safe to use. However, this does mean that the
GTID coordinates will never be reset: if replication is reset and GTID
coordinates go "backwards", readwritesplit will not consider these as being
newer than the ones already stored. To reset the stored GTID coordinates in
readwritesplit, MaxScale must be restarted.
MaxScale 6.4.11 added the new reset-gtid module command to
readwritesplit. This allows the global GTID state used by causal_reads=global
to be reset without having to restart MaxScale.
fast
This mode is similar to the local mode in that it only affects the
connection that does the write, but where the local mode waits for a slave
server to catch up, the fast mode will only use servers that are known to
have replicated the write. This means that if no slave has replicated the
write, the master where the write was done will be used. The value of
causal_reads_timeout is ignored in this mode. Currently the replication
state is only updated by the mariadbmon monitor whenever the servers are
monitored. This means that a smaller monitor_interval provides faster
replication state updates and possibly better overall usage of servers.
This mode is the inverse of the local mode in the sense that it improves
read latency at the cost of read scalability while still retaining the
causality guarantees for reads. This functionality can also be considered an
improved version of the functionality that the CCRFilter module provides.
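For example:
# Make this session's own writes visible in its subsequent reads
causal_reads=local
causal_reads_timeout=10s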
Before MaxScale 2.5.0, the causal_reads parameter was a boolean
parameter. False values translated to none and true values translated to
local. The use of boolean parameters is deprecated but still accepted in
MaxScale 2.5.0.
Implementation of causal_reads
This feature is based on the MASTER_GTID_WAIT function and the tracking of
server-side status variables. By tracking the latest GTID that each statement
generates, readwritesplit can then perform a synchronization operation with the
help of the MASTER_GTID_WAIT function.
If the slave has not caught up to the master within the time configured by causal_reads_timeout, the read is retried on the master. In MaxScale 2.3.0 an error was returned to the client when the slave timed out.
Normal SQL
A practical example can be given by the following set of SQL commands executed
with autocommit=1
.
INSERT INTO test.t1 (id) VALUES (1);
SELECT * FROM test.t1 WHERE id = 1;
As the statements are not executed inside a transaction, from the load balancer's point of view, the latter statement can be routed to a slave server. The problem with this is that if the value that was inserted on the master has not yet replicated to the server where the SELECT statement is being performed, it can appear as if the value we just inserted is not there.
By prefixing these types of SELECT statements with a command that guarantees consistent results for the reads, read scalability can be improved without sacrificing consistency.
The set of example SQL above will be translated by MaxScale into the following statements.
INSERT INTO test.t1 (id) VALUES (1);
-- These are executed as one multi-query
SET @maxscale_secret_variable=(
SELECT CASE
WHEN MASTER_GTID_WAIT('0-3000-8', 10) = 0 THEN 1
ELSE (SELECT 1 FROM INFORMATION_SCHEMA.ENGINES)
END); SELECT * FROM test.t1 WHERE id = 1;
The SET command will synchronize the slave to a certain logical point in the
replication stream (see MASTER_GTID_WAIT for more
details). If the synchronization fails, the query will not run and it will be
retried on the server where the transaction was originally done.
Prepared Statements
Binary protocol prepared statements are handled in a different manner. Instead of adding the synchronization SQL into the original SQL query, it is sent as a separate packet before the prepared statement is executed.
We'll use the same example SQL but use a binary protocol prepared statement for the SELECT:
COM_QUERY: INSERT INTO test.t1 (id) VALUES (1);
COM_STMT_PREPARE: SELECT * FROM test.t1 WHERE id = ?;
COM_STMT_EXECUTE: ? = 123
The SQL that MaxScale executes will be the following:
COM_QUERY: INSERT INTO test.t1 (id) VALUES (1);
COM_STMT_PREPARE: SELECT * FROM test.t1 WHERE id = ?;
COM_QUERY: IF (MASTER_GTID_WAIT('0-3000-8', 10) <> 0) THEN KILL (SELECT CONNECTION_ID()); END IF
COM_STMT_EXECUTE: ? = 123
Both the synchronization query and the execution of the prepared statement are sent at the same time. This is done to remove the need to wait for the result of the synchronization query before routing the execution of the prepared statement. This keeps the performance of causal_reads for prepared statements the same as it is for normal SQL queries.
As a result of this, each time the synchronization query times out, the
connection will be killed by the KILL
statement and readwritesplit will retry
the query on the master. This is done to prevent the execution of the prepared
statement that follows the synchronization query from being processed by the
MariaDB server.
It is recommended that the session command history is enabled whenever prepared
statements are used with causal_reads. This allows new connections to be
created whenever a causal read times out.
Starting with MaxScale 2.5.17, a failed causal read inside of a read-only
transaction started with START TRANSACTION READ ONLY will return the following
error:
Error: 1792
SQLSTATE: 25006
Message: Causal read timed out while in a read-only transaction, cannot retry command.
Older versions of MaxScale attempted to retry the command on the current master server which would cause the connection to be closed and a warning to be logged.
Limitations of Causal Reads
This feature does not work with Galera or any other non-standard
replication mechanisms. As Galera does not update the gtid_slave_pos
variable when events are replicated via the Galera library, the
MASTER_GTID_WAIT function used by MaxScale to synchronize reads will wait until the
timeout. With Galera this is not a serious issue as it, by nature, is a
mostly-synchronous replication mechanism.
If the combination of the original SQL statement and the modifications added to it by readwritesplit exceed the maximum packet size (16777213 bytes), the causal read will not be attempted and a non-causal read is done instead.
SQL like INSERT ... RETURNING that commits a transaction and returns a
resultset will only work with causal reads if the connector supports the
DEPRECATE_EOF protocol feature. The following table contains a list of MariaDB
connectors and whether they support the protocol feature.
Connector/J: Yes (3.5.2)
Connector/Node.js: Yes (3.4.0)
Connector/R2DBC: Yes (1.3.0)
Connector/C: No (3.4.4)
Connector/C++: No (1.1.5)
Connector/ODBC: No (3.2.5)
causal_reads_timeout
Type: duration
Mandatory: No
Dynamic: Yes
Default: 10s
The timeout for the slave synchronization done by causal_reads. The
default value is 10 seconds.
The timeout is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the timeout is seconds, a timeout specified in milliseconds will be rejected, even if the duration is longer than a second.
lazy_connect
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Lazy connection creation causes connections to backend servers to be opened only when they are needed. This reduces the load that is placed on the backend servers when the client connections are short. This parameter is a boolean type and is disabled by default.
By default readwritesplit opens as many connections as it can when the session
is first opened. This makes the execution of the first query faster when all
available connections are already created. When lazy_connect is enabled, this
initial connection creation is skipped. If the client executes only read
queries, no connection to the master is made. If only write queries are made,
only the master connection is used.
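For example:
# Create backend connections only when the first query needs them
lazy_connect=true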
reuse_prepared_statements
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Reuse identical prepared statements inside the same client connection. This is a boolean parameter and is disabled by default. This feature only applies to binary protocol prepared statements.
When this parameter is enabled and the connection prepares an identical prepared statement multiple times, instead of preparing it on the server the existing prepared statement handle is reused. This also means that whenever prepared statements are closed by the client, they will be left open by readwritesplit.
Enabling this feature will increase memory usage of a session. The amount of memory stored per prepared statement is proportional to the length of the prepared SQL statement and the number of parameters the statement has.
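For example:
# Reuse identical binary protocol prepared statements within a session
reuse_prepared_statements=true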
The router_diagnostics output for a readwritesplit service contains the
following fields.
queries: Number of queries executed through this service.
route_master: Number of writes routed to master.
route_slave: Number of reads routed to slaves.
route_all: Number of session commands routed to all servers.
rw_transactions: Number of explicit read-write transactions.
ro_transactions: Number of explicit read-only transactions.
replayed_transactions: Number of replayed transactions.
server_query_statistics: Statistics for each configured and used server consisting of the following fields.
id: Name of the server
total: Total number of queries.
read: Total number of reads.
write: Total number of writes.
avg_sess_duration: Average duration of a client session to this server.
avg_sess_active_pct: Average percentage of time client sessions were active. 0% means connections were opened but never used.
avg_selects_per_session: Average number of selects per session.
The general rule with server ranks is that primary servers will be used before secondary servers. Readwritesplit is an exception to this rule. The following rules govern how readwritesplit behaves with servers that have different ranks.
Sessions will use the current master server as long as possible. This means that sessions with a secondary master will not use the primary master as long as the secondary master is available.
All slave connections will use the same rank as the master connection. Any stale connections with a different rank than the master will be discarded.
If no master connection is available and master_reconnection is enabled, a
connection to the best master is created. If the new master has a different
priority than existing connections have, the connections with a different rank
will be discarded.
If open connections exist, these will be used for routing. This means that if the master is lost but the session still has slave servers with the same rank, they will remain in use.
If no open connections exist, the servers with the best rank will be used.
The readwritesplit router supports routing hints. For a detailed guide on hint syntax and functionality, please read this document.
Note: Routing hints will always have the highest priority when a routing decision is made. This means that it is possible to cause inconsistencies in the session state and the actual data in the database by adding routing hints to DDL/DML statements which are then directed to slave servers. Only use routing hints when you are sure that they can cause no harm.
An exception to this rule is transaction_replay: when it is enabled, all
routing hints inside a transaction are ignored. This is done to prevent changes
done inside a replayable transaction from affecting servers outside of the
transaction. This behavior was added in MaxScale 6.1.4. Older versions allowed
routing hints to override the transaction logic.
If a SELECT statement with a maxscale route to slave hint is received
while autocommit is disabled, the query will be routed to a slave server. This
causes some metadata locks to be acquired on the database in question which
will block DDL statements on the server until either the connection is closed
or autocommit is enabled again.
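A routing hint is written as a comment in the statement itself. For example (the table name is illustrative; see the hint syntax document for the full syntax):
SELECT user_count FROM stats; -- maxscale route to slave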
The readwritesplit router implements the following module commands.
reset-gtid
The command resets the global GTID state in the router. It can be used with
causal_reads=global to reset the state. This can be useful when the cluster is
reverted to an earlier state and the GTIDs recorded in MaxScale are no longer
valid.
The first and only argument to the command is the router name. For example, to
reset the GTID state of a readwritesplit named My-RW-Router, the following
MaxCtrl command should be used:
maxctrl call command readwritesplit reset-gtid My-RW-Router
Examples of the readwritesplit router in use can be found in the Tutorials folder.
Here is a small explanation which shows what kinds of queries are routed to which type of server.
Routing to the master is important for data consistency and because the majority of writes are written to the binlog and thus become replicated to slaves.
The following operations are routed to master:
DML statements (INSERT, UPDATE, DELETE etc.)
DDL statements (DROP, CREATE, ALTER etc.)
All statements within an open read-write transaction
Stored procedure calls
User-defined function calls
Queries that use sequences (NEXT VALUE FOR seq, NEXTVAL(seq) or seq.nextval)
Statements that use any of the following functions:
LAST_INSERT_ID()
GET_LOCK()
RELEASE_LOCK()
IS_USED_LOCK()
IS_FREE_LOCK()
Statements that use any of the following variables:
@@last_insert_id
@@identity
In addition to these, if the readwritesplit service is configured with the
max_slave_replication_lag parameter, and if all slaves suffer from too much
replication lag, then statements will be routed to the master. (There might be
other similar configuration parameters in the future which limit the number of
statements that will be routed to slaves.)
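For example, even though the second statement is a read, it is routed to the master because it uses LAST_INSERT_ID() (the table name is illustrative):
INSERT INTO orders (item) VALUES ('book');
SELECT LAST_INSERT_ID(); -- routed to the master, not a slave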
Transaction Isolation Level Tracking
Use of the SERIALIZABLE transaction isolation level with readwritesplit is not recommended as it somewhat goes against the goals of load balancing.
If either session_track_transaction_info=CHARACTERISTICS or
session_track_system_variables=tx_isolation is configured for the MariaDB
server, readwritesplit will track the transaction isolation level and lock the
session to the master when the isolation level is set to serializable. This
retains the correctness of the isolation level which can otherwise cause
problems. Once a session is locked to the master, it will not be unlocked. To
reinstate the normal routing behavior, a new connection must be created.
For example, if transaction isolation level tracking cannot be done and an autocommit SELECT is routed to a slave, it no longer behaves in a serializable manner. This can also have an effect on the replication in the slave server.
The ability to route some statements to slaves is important because it decreases the load placed on the master. Moreover, multiple slaves can share the load, in contrast to a single master.
Queries which can be routed to slaves must be auto committed and belong to one of the following groups:
Read-only statements (i.e. SELECT) that only use read-only built-in functions
All statements within an explicit read-only transaction (START TRANSACTION READ ONLY)
SHOW statements except SHOW MASTER STATUS
The list of supported built-in functions can be found here.
A third class of statements includes those which modify session data, such as session system variables, user-defined variables, the default database, etc. We call them session commands, and they must be replicated as they affect the future results of read and write operations. They must be executed on all servers that could execute statements on behalf of this client.
Session commands include for example:
Commands that modify the session state (SET, USE, CHANGE USER)
Text protocol PREPARE statements
Binary protocol prepared statements
Other miscellaneous commands (COM_QUIT, COM_PING etc.)
NOTE: if variable assignment is embedded in a write statement it is routed
to the master only. For example, INSERT INTO t1 values(@myvar:=5, 7) would be
routed to the master only.
The router stores all of the executed session commands so that in case of a
slave failure, a replacement slave can be chosen and the session command history
can be repeated on that new slave. This means that the router stores each
executed session command for the duration of the session. Applications that use
long-running sessions might cause MariaDB MaxScale to consume a growing amount
of memory unless the sessions are closed. This can be solved by adjusting the
value of max_sescmd_history.
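For example (the value is illustrative):
# Limit the stored session command history
max_sescmd_history=50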
In the following cases, a query is routed to the same server where the previous query was executed. If no previous target is found, the query is routed to the current master.
If a query uses the FOUND_ROWS() function, it will be routed to the server
where the last query was executed. This is done with the assumption that a
query with SQL_CALC_FOUND_ROWS was previously executed.
COM_STMT_FETCH_ROWS will always be routed to the same server where the COM_STMT_EXECUTE was routed.
Read queries are routed to the master server in the following situations:
Query is executed inside an open read-write transaction
Statement includes a stored procedure or a UDF call
If there are multiple statements inside one query e.g. INSERT INTO ... ; SELECT LAST_INSERT_ID();
If a prepared statement targets a temporary table on the master, the slave servers will fail to execute it. This will cause all slave connections to be closed (MXS-1816).
If the results from the replacement server are not identical when the
transaction is replayed, the client connection is closed. This means that any
transaction with a server specific result (e.g. NOW(), @@server_id) cannot
be replayed successfully but it will still be attempted.
If a transaction reads data before updating it, the rows should be locked by
using SELECT ... FOR UPDATE. This will prevent overlapping transactions when
multiple transactions are being replayed that modify the same set of rows.
If the connection to the server where the transaction is being executed is
lost when the final COMMIT is being executed, it is impossible to know
whether the transaction was successfully committed. This means that there
is a possibility for duplicate transaction execution which can result in
data duplication in certain cases. Data duplication can happen if the
transaction consists of the following statement types:
INSERT of rows into a table that does not have an auto-increment primary key
A "blind update" of one or more rows e.g. UPDATE t SET c = c + 1 WHERE id = 123
A "blind delete" e.g. DELETE FROM t LIMIT 100
This is not an exhaustive list and any operations that do not check the row contents before performing the operation on them might face this problem.
In all cases the problem of duplicate transaction execution can be avoided by including a SELECT ... FOR UPDATE in the statement. This guarantees that if the transaction fails while it is being committed, the row is only modified if it matches the expected contents.
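As a sketch, the "blind update" from above could be rewritten with row locking like this (table and column names are illustrative):

```sql
BEGIN;
-- Lock and read the row first. If the transaction was already committed and
-- is then replayed, the SELECT returns different contents, the result
-- checksum differs and the replay is aborted instead of committing twice.
SELECT c FROM t WHERE id = 123 FOR UPDATE;
UPDATE t SET c = c + 1 WHERE id = 123;
COMMIT;
```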
Similarly, a connection loss during COMMIT
can also result in transaction
replay failure. This happens due to the same reason as duplicate transaction
execution but the retried transaction will not be committed. This can be
considered a success case as the transaction replay detected that the results of
the two transactions are different. In these cases readwritesplit will abort the
transaction and close the client connection.
Statements that result in an implicit commit do not reset the transaction when
transaction_replay is enabled. This means that if the transaction is replayed,
the transaction will be committed twice due to the implicit commit being
present. The exceptions to this are transaction management statements such as BEGIN and START TRANSACTION: they are detected and will cause the transaction to be correctly reset.
Any changes to the session state (e.g. autocommit state, SQL mode) done inside a transaction will remain in effect even if the connection to the server where the transaction is being executed fails. When readwritesplit creates a new connection to a server to replay the transaction, it will first restore the session state by executing all session commands that were executed. This means that if the session state is changed mid-transaction in a way that affects the results, transaction replay will fail.
The following partial transaction demonstrates the problem by using SQL_MODE inside a transaction.
SET SQL_MODE=''; -- A session command
BEGIN;
SELECT "hello world"; -- Returns the string "hello world"
SET SQL_MODE='ANSI_QUOTES'; -- A session command
SELECT 'hello world'; -- Returns the string "hello world"
If this transaction has to be replayed the actual SQL that gets executed is the following.
SET SQL_MODE=''; -- Replayed session command
SET SQL_MODE='ANSI_QUOTES'; -- Replayed session command
BEGIN;
SELECT "hello world"; -- Returns an error
SELECT 'hello world'; -- Returns the string "hello world"
First the session state is restored by executing all commands that changed the state after which the actual transaction is replayed. Due to the fact that the SQL_MODE was changed mid-transaction, one of the queries will now return an error instead of the result we expected leading to a transaction replay failure.
In a service-to-service configuration (i.e. a service using another service in its targets list), if the topmost service starts a transaction, all
lower-level readwritesplit services will also behave as if a transaction is
open. If a connection to a backend database fails during this, it can result in
unnecessary transaction replays which in turn can end up with checksum
conflicts. The recommended approach is to not use any commands inside a
transaction that would be routed to more than one node.
If the connection to the server where a transaction is being executed is lost
while a ROLLBACK
is being executed, readwritesplit will still attempt to
replay the transaction in the hopes that the real response can be delivered to
the client. However, this does mean that it is possible that a rolled back
transaction which gets replayed ends up with a conflict and is reported as a
replay failure when in reality a rolled back transaction could be safely
ignored.
In older versions of MaxScale, routers were configured via the router_options parameter. This functionality was deprecated in 2.2 and was removed in 2.3.
Readwritesplit does not support pipelining of JDBC batched statements. This is caused by the fact that readwritesplit executes the statements one at a time to track the state of the response.
Limitations in multi-statement handling
When a multi-statement query is executed through the readwritesplit router, it will always be routed to the master. See strict_multi_stmt for more details.
If the multi-statement query creates a temporary table, it will not be detected and reads to this table can be routed to slave servers. To prevent this, always execute the temporary table creation as an individual statement.
Limitations in client session handling
Some of the queries that a client sends are routed to all backends instead of
just to one. These queries include USE <db name>
and SET autocommit=0
, among
many others. Readwritesplit sends a copy of these queries to each backend server
and forwards the master's reply to the client. Below is a list of MySQL commands
which are classified as session commands.
COM_INIT_DB (USE <db name> creates this)
COM_CHANGE_USER
COM_STMT_CLOSE
COM_STMT_SEND_LONG_DATA
COM_STMT_RESET
COM_STMT_PREPARE
COM_QUIT (no response, session is closed)
COM_REFRESH
COM_DEBUG
COM_PING
SQLCOM_CHANGE_DB (USE ... statements)
SQLCOM_DEALLOCATE_PREPARE
SQLCOM_PREPARE
SQLCOM_SET_OPTION
SELECT ... INTO variable|OUTFILE|DUMPFILE
SET autocommit=1|0
Prior to MaxScale 2.3.0, session commands that were 2²⁴ - 1 bytes or longer were not supported and caused the session to be closed.
There is a possibility for misbehavior. If USE mytable
is executed in one of
the slaves and fails, it may be due to replication lag rather than the database
not existing. Thus, the same command may produce different results in different
backend servers. The slaves which fail to execute a session command will be
dropped from the active list of slaves for this session to guarantee a
consistent session state across all the servers used by the session. In
addition, the server will not be used again for routing for the duration of the
session.
The above-mentioned behavior for user variables can be partially controlled with the configuration parameter use_sql_variables_in:
use_sql_variables_in=[master|all] (default: all)
WARNING
If a SELECT query modifies a user variable when the use_sql_variables_in parameter is set to all, it will not be routed and the client will receive an error. A message further explaining the reason for the error is written to the log. Here is an example of a SELECT query which modifies a user variable and how MariaDB MaxScale responds to it.
MySQL [(none)]> set @id=1;
Query OK, 0 rows affected (0.00 sec)
MySQL [(none)]> SELECT @id := @id + 1 FROM test.t1;
ERROR 1064 (42000): Routing query to backend failed. See the error log for further details.
Allow user variable modification in SELECT queries by setting use_sql_variables_in=master. This will route all queries that use user variables to the master.
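A minimal sketch of such a service (the service and server names are illustrative):

```ini
[RW-Split-Service]
type=service
router=readwritesplit
servers=server1,server2
user=myuser
password=mypwd
use_sql_variables_in=master
```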
This page is licensed: CC BY-SA / Gnu FDL
The SchemaRouter provides an easy and manageable sharding solution by building a single logical database server from multiple separate ones. Each database is shown to the client and queries targeting unique databases are routed to their respective servers. In addition to providing simple database-based sharding, the schemarouter also enables cross-node session variable usage by routing all queries that modify the session to all nodes.
By default the SchemaRouter assumes that each database and table is only located on one server. If it finds the same database or table on multiple servers, it will close the session with the following error:
ERROR 5000 (DUPDB): Error: duplicate tables found on two different shards.
The exceptions to this rule are the system tables mysql, information_schema, performance_schema and sys, which are never treated as duplicates.
If duplicate tables are expected, use the ignore_tables_regex parameter to control which duplicate tables are allowed. To disable the duplicate table detection, use ignore_tables_regex=.*.
Schemarouter compares table and database names case-insensitively. This means
that the tables test.t1
and test.T1
are assumed to refer to the same table.
The main limitation of SchemaRouter is that aside from session variable writes and some specific queries, a query can only target one server. This means that queries which depend on results from multiple servers give incorrect results. See Limitations for more information.
From 2.3.0 onwards, SchemaRouter is capable of limited table family sharding.
The auth_all_servers
parameter is no longer automatically enabled by the
schemarouter. To retain the old behavior that was present in 2.5, explicitly
define auth_all_servers=true
for all schemarouter services.
If a command modifies the session state by modifying any session or user
variables, the query is routed to all nodes. These statements include SET
statements as well as any other statements that modify the behavior of the
client.
If a client changes the default database after connecting, either with a USE <db>
query or a COM_INIT_DB
command, the query is routed to all servers
that contain the database. This same logic applies when a client connects with
a default database: the default database is set only on servers that actually
contain it.
If a query targets one or more tables that the schemarouter has discovered during the database mapping phase, the query is only routed if a server is found that contains all of the tables that the query uses. If no such server is found, the query is routed to the server that was previously used or to the first available backend if none have been used. If a query uses a table but doesn't define the database it is in, it is assumed to be located on the default database of the connection.
If a query targets a table or a database that is present on all nodes
(e.g. information_schema
) and the connection is using a default database,
the query is routed based on the default database. This makes it possible to
control where queries that do match a specific node are routed. If the
connection is not using a default database, the query is routed based solely
on the tables it contains.
If a query uses a table that is unknown to the schemarouter or executes a command that doesn't target a table, the query is routed to a server that contains the current active default database. If the connection does not have a default database, the query is routed to the backend that was last used or to the first available backend if none have been used. If the query contains a routing hint that directs it to a server, the query is routed there.
This means that all administrative commands, replication-related commands as well as certain transaction control statements (XA transactions) are routed to the first available server in certain cases. To avoid problems, use routing hints to direct where these statements should go.
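For example, a routing hint can pin such a statement to a specific node. This assumes the service is equipped with the hint filter; the server name is illustrative:

```sql
-- The hint comment directs the statement to a named server
SET GLOBAL max_connections=500; -- maxscale route to server shard-server1
```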
Starting with MaxScale 6.4.5, transaction control commands (BEGIN
, COMMIT
and ROLLBACK
) are routed to all nodes. Older versions of MaxScale routed the
queries to the first available backend. This means that cross-shard
transactions are technically possible but, without external synchronization,
the transactions are not guaranteed to be globally consistent.
LOAD DATA LOCAL INFILE
commands are routed to the first available server
that contains the tables listed in the query.
To check how databases and tables map to servers, execute the special query SHOW SHARDS. The query does not support any modifiers such as LIKE.
show shards;
Database |Server |
---------|-------------|
db1.t1 |MyServer1 |
db1.t2 |MyServer1 |
db2.t1 |MyServer2 |
The schemarouter will also intercept the SHOW DATABASES
command and generate
it based on its internal data. This means that newly created databases will not
show up immediately and will only be visible when the cached data has been
updated.
The schemarouter queries each of the servers to map where each database and table is located. As each user has access to a different set of tables and databases, the result is unique to the username and the set of servers that the service uses. These results are cached by the schemarouter. The lifetime of the cached result is controlled by the refresh_interval parameter.
When a server needs to be mapped, the schemarouter routes a query to each of the servers using the client's credentials. While this query is being executed, all other sessions that would otherwise share the cached result wait for the update to complete. This waiting functionality was added in MaxScale 2.4.19; older versions did not wait for existing updates to finish and would perform parallel database mapping queries.
Here is an example configuration of the schemarouter:
[Shard-Router]
type=service
router=schemarouter
servers=server1,server2
user=myuser
password=mypwd
The module generates the list of databases based on the servers parameter using the connecting client's credentials. The user and password parameters define the credentials that are used to fetch the authentication data from the database servers. The credentials used only require the same grants as mentioned in the configuration documentation.
The list of databases is built by sending a SHOW DATABASES query to all the servers. This requires the user to have at least USAGE and SELECT grants on the databases that need to be sharded.
If you are connecting directly to a database or have different users on some
of the servers, you need to get the authentication data from all the
servers. You can control this with the auth_all_servers
parameter. With
this parameter, MariaDB MaxScale forms a union of all the users and their
grants from all the servers. By default, the schemarouter will fetch the
authentication data from all servers.
For example, if two servers have the database shard and the following rights are granted only on one server, all queries targeting the database shard would be routed to the server where the grants were given.
# Execute this on both servers
CREATE USER 'john'@'%' IDENTIFIED BY 'password';
# Execute this only on the server where you want the queries to go
GRANT SELECT,USAGE ON shard.* TO 'john'@'%';
This would in effect allow the user 'john' to only see the database 'shard' on this server. Take notice that these grants are matched against MariaDB MaxScale's hostname instead of the client's hostname. Only user authentication uses the client's hostname and all other grants use MariaDB MaxScale's hostname.
ignore_tables
Type: stringlist
Mandatory: No
Dynamic: Yes
Default: ""
List of full table names (e.g. db1.t1) to ignore when checking for duplicate tables. By default no tables are ignored.
This parameter was once called ignore_databases.
ignore_tables_regex
Type: regex
Mandatory: No
Dynamic: No
Default: ""
A PCRE2 regular expression that is matched against table names when checking for duplicate tables. By default no tables are ignored.
The following configuration ignores duplicate tables in the databases db1 and db2, and all tables starting with "t" in db3.
[Shard-Router]
type=service
router=schemarouter
servers=server1,server2
user=myuser
password=mypwd
ignore_tables_regex=^db1|^db2|^db3\.t
This parameter was once called ignore_databases_regex.
max_sescmd_history
This parameter has been moved to the MaxScale core in MaxScale 6.0.
disable_sescmd_history
This parameter has been moved to the MaxScale core in MaxScale 6.0.
refresh_databases
Type: boolean
Mandatory: No
Dynamic: No
Default: false
Enable database map refreshing mid-session. A refresh is triggered by a failure to change the database, i.e. a failed USE ... query. This feature is disabled by default.
Before MaxScale 6.2.0, this parameter did nothing. Starting with the 6.2.0 release of MaxScale this parameter now works again but it is disabled by default to retain the same behavior as in older releases.
refresh_interval
Type: duration
Mandatory: No
Dynamic: Yes
Default: 300s
The minimum interval between database map refreshes in seconds. The default value is 300 seconds.
The interval is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the interval is seconds, an interval specified in milliseconds will be rejected, even if the duration is longer than a second.
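As a sketch, lowering the refresh interval for a schemarouter service might look like this (the service mirrors the earlier example; the 60s value is illustrative):

```ini
[Shard-Router]
type=service
router=schemarouter
servers=server1,server2
user=myuser
password=mypwd
# Refresh the cached database map at most once per minute
refresh_interval=60s
```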
This functionality was introduced in 2.3.0.
If the same database exists on multiple servers, but the database contains different tables in each server, SchemaRouter is capable of routing queries to the right server, depending on which table is being addressed.
As an example, suppose the database db exists on servers server1 and server2, but the database on server1 contains the table tbl1 and on server2 the table tbl2. The query SELECT * FROM db.tbl1 will be routed to server1 and the query SELECT * FROM db.tbl2 will be routed to server2. As in the example queries, the table names must be qualified with the database names for table-level sharding to work.
Specifically, the query series below is not supported.
USE db;
SELECT * FROM tbl1; -- May be routed to an incorrect backend if using table sharding.
The router_diagnostics
output for a schemarouter service contains the
following fields.
queries
: Number of queries executed through this service.
sescmd_percentage
: The percentage of queries that were session commands.
longest_sescmd_chain
: The largest amount of session commands executed by one client session.
times_sescmd_limit_exceeded
: Number of times the session command history limit was exceeded.
longest_session
: The longest client session in seconds.
shortest_session
: The shortest client session in seconds.
average_session
: The average client session duration in seconds.
shard_map_hits
: Cache hits for the shard map cache.
shard_map_misses
: Cache misses for the shard map cache.
Cross-database queries (e.g. SELECT column FROM database1.table UNION select column FROM database2.table
) are not properly supported. Such queries are routed either to the
first explicit database in the query, the current database in use or to the first
available database, depending on which succeeds.
Without a default database, queries that do not use fully qualified table
names and which do not modify the session state (e.g. SELECT * FROM t1
) will
be routed to the first available server. This includes queries such as explicit
transaction commands (BEGIN
, COMMIT
, ROLLBACK
), all non-table CREATE
commands (CREATE DATABASE
, CREATE SEQUENCE
) as well as any SELECT
statements that do not directly refer to a table. CREATE
commands should be
done directly on the node or the router should be equipped with the hint filter
and a routing hint should be used. Queries that modify the session state
(e.g. SET autocommit=1
) will be routed to all servers regardless of the
default database. For explicit transactions, the recommended way is to use SET autocommit=0
to start a transaction and SET autocommit=1
to commit it,
otherwise routing hints are required to correctly route the transaction control
commands. MXS-4467 changed the
routing of transaction control commands to route them to all servers used by the
schemarouter.
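The autocommit-based pattern recommended above can be sketched as follows (database and table names are illustrative):

```sql
SET autocommit=0;  -- session command, routed to all servers; starts a transaction
INSERT INTO db1.t1 VALUES (1);  -- routed to the shard that contains db1.t1
SET autocommit=1;  -- routed to all servers; commits the pending transaction
```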
SELECT queries that modify session variables are not supported because uniform results can not be guaranteed. If such a query is executed, the behavior of the router is undefined. To work around this limitation, the query must be executed in separate parts.
If a query targets a database the SchemaRouter has not mapped to a server, the query will be routed to the first available server. This possibly returns an error about database rights instead of a missing database.
Prepared statement support is limited. PREPARE, EXECUTE and DEALLOCATE are routed to the correct backend if the statement is known and only requires one backend server. EXECUTE IMMEDIATE is not supported; it is routed to the first available backend and may give wrong results. Similarly, preparing a statement from a variable (e.g. PREPARE stmt FROM @a) is not supported and may be routed wrong.
SHOW DATABASES
is handled by the router instead of routed to a server. The router only
answers correctly to the basic version of the query. Any modifiers such as LIKE
are
ignored.
SHOW TABLES
is routed to the server with the current database. If using table-level
sharding, the results will be incomplete. Similarly, SHOW TABLES FROM db1
is routed to
the server with database db1
, ignoring table sharding. Use SHOW SHARDS
to get results
from the router itself.
USE db1
is routed to the server with db1
. If the database is divided to multiple
servers, only one server will get the command.
Here is a small tutorial on how to set up a sharded database.
This page is licensed: CC BY-SA / Gnu FDL
SmartRouter is the query router of the SmartQuery framework. Based on the type of the query, each query is routed to the server or cluster that can best handle it.
For workloads where both transactional and analytical queries are needed, SmartRouter unites the Transactional (OLTP) and Analytical (OLAP) workloads into a single entry point in MaxScale. This allows a MaxScale client to freely mix transactional and analytical queries using the same connection. This is known as Hybrid Transactional and Analytical Processing, HTAP.
SmartRouter is configured as a service that either routes to other MaxScale routers or plain servers. Although one can configure SmartRouter to use a plain server directly, we refer to the configured "servers" as clusters.
For details about the standard service parameters, refer to theConfiguration Guide.
master
Type: target
Mandatory: Yes
Dynamic: No
One of the clusters must be designated as the master. All writes go to the master cluster, which for all practical purposes should be a master-slave ReadWriteSplit. This document does not go into details about setting up master-slave clusters, but suffice it to say that when setting up the ColumnStore servers, they should be configured as slaves of a MariaDB server running the InnoDB engine.
The ReadWriteSplit documentation has more on master-slave setup.
Example
Suppose we have a Transactional service like
[RWS-Row]
type=service
router=readwritesplit
servers = row_server_1, row_server_2, ...
for which we have defined the listener
[RWS-Row-Listener]
type=listener
service=RWS-Row
socket=/tmp/rws-row.sock
That is, that service can be accessed using the socket /tmp/rws-row.sock.
The Analytical service could look like this
[RWS-Column]
type = service
router = readwritesplit
servers = column_server_1, column_server_2, ...
[RWS-Column-Listener]
type = listener
service = RWS-Column
socket = /tmp/rws-col.sock
Then we can define the SmartQuery service as follows
[SmartQuery]
type = service
router = smartrouter
targets = RWS-Row, RWS-Column
master = RWS-Row
[SmartQuery-Listener]
type = listener
service = SmartQuery
port = <port>
Note that the SmartQuery listener listens on a port, while the Row and Column service listeners listen on Unix domain sockets. The reason is that there is a significant performance benefit when SmartRouter accesses the services over a Unix domain socket compared to accessing them over a TCP/IP socket.
A complete configuration example can be found at the end of this document.
SmartRouter keeps track of the performance, or the execution time, of queries to the clusters. Measurements are stored with the canonical of a query as the key. The canonical of a query is the SQL with all user-defined constants replaced with question marks. When SmartRouter sees a read query whose canonical has not been seen before, it sends the query to all clusters. The first cluster to respond is designated as the best one for that canonical. When the first response is received, the queries to the other clusters are cancelled. The response is sent to the client once all clusters have responded to the query or the cancel.
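For example, these two queries differ only in their constants and therefore share the same canonical, so they are tracked under one performance-map key (table and column names are illustrative):

```sql
SELECT * FROM invoices WHERE customer_id = 1 AND invoice_total > 100;
SELECT * FROM invoices WHERE customer_id = 7 AND invoice_total > 250;
-- Shared canonical form:
-- SELECT * FROM invoices WHERE customer_id = ? AND invoice_total > ?
```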
There is obviously overhead when a new canonical is seen. This means that queries after a MaxScale start will be slightly slower than normal. The execution time of a query depends on the database engine, and on the contents of the tables being queried. As a result, MaxScale will periodically re-measure queries.
The performance behavior of queries under dynamic conditions, and their effect on different storage engines is being studied at MariaDB. As we learn more, we will be able to better categorize queries and move that knowledge into SmartRouter.
LOAD DATA LOCAL INFILE
is not supported.
The performance data is not persisted. The measurements will be performed anew after each startup.
[maxscale]
[row_server_1]
type = server
address = <ip>
port = <port>
[row_server_2]
type = server
address = <ip>
port = <port>
[Row-Monitor]
type = monitor
module = mariadbmon
servers = row_server_1, row_server_2
user = <user>
password = <password>
monitor_interval = 2000ms
[column_server_1]
type = server
address = <ip>
port = <port>
[Column-Monitor]
type = monitor
module = csmon
servers = column_server_1
user = <user>
password = <password>
monitor_interval = 2000ms
# Row Read write split
[RWS-Row]
type = service
router = readwritesplit
servers = row_server_1, row_server_2
user = <user>
password = <password>
[RWS-Row-Listener]
type = listener
service = RWS-Row
socket = /tmp/rws-row.sock
# Columnstore Read write split
[RWS-Column]
type = service
router = readwritesplit
servers = column_server_1
user = <user>
password = <password>
[RWS-Column-Listener]
type = listener
service = RWS-Column
socket = /tmp/rws-col.sock
# Smart Query router
[SmartQuery]
type = service
router = smartrouter
targets = RWS-Row, RWS-Column
master = RWS-Row
user = <user>
password = <password>
[SmartQuery-Listener]
type = listener
service = SmartQuery
port = <port>
This page is licensed: CC BY-SA / Gnu FDL
MariaDB MaxScale is a database proxy that extends the high availability, scalability, and security of MariaDB Server while at the same time simplifying application development by decoupling it from underlying database infrastructure. It includes multiple routers that each support different use cases.
• Performs query-based load balancing.
• Each client connected to MaxScale is mapped to multiple connections to different back-end database servers.
• Routes write queries to the connection to the primary server.
• Routes read queries to a connection to a replica server.
Feature                                                                              |Supported|
-------------------------------------------------------------------------------------|---------|
Supports MariaDB Replication deployments                                             |Yes      |
Supports Galera Cluster deployments                                                  |Yes      |
Supports Multi-Node Enterprise ColumnStore deployments                               |Yes      |
Performs query-based load balancing                                                  |Yes      |
Routes client connections to multiple servers simultaneously                         |Yes      |
Routes write queries to primary and read queries to replicas                         |Yes      |
Automatically reconnects clients to the new primary after failover or switchover     |Yes      |
Automatically replays transactions on the new primary after failover or switchover   |Yes      |
Automatically retries failed queries                                                 |Yes      |
Enforces causal reads to avoid reading stale data caused by slave lag                |Yes      |
Performs connection-based load balancing                                             |No       |
Routes client connections to configured server type                                  |No       |
Mitigates the effect of primary failures by acting as a Binlog Server                |No       |
Reduces bandwidth requirements of primary server in environments with many replicas  |No       |
Replicates from MariaDB to a Kafka broker                                            |No       |
This page is: Copyright © 2025 MariaDB. All rights reserved.
This router sends write queries to a single primary server while load-balancing read queries across replica servers. Learn how to configure this for scaling out read workloads.
The Read/Write Split Router (readwritesplit) routes write queries to the primary server and load balances read-only queries between one or more replica servers. If a server fails, then the router may need to retry failed queries on a different server. The retry may need to be delayed in some cases, such as when automatic failover is in progress.
Configure delayed retries for failed queries by configuring several parameters for the Read/Write Split Router in maxscale.cnf.
delayed_retry
• When this parameter is enabled, failed queries will not immediately return an error to the client. Instead, the router will retry the query if a different server becomes available before the timeout is reached.
• This parameter is disabled by default.
delayed_retry_timeout
• The maximum amount of time to wait until returning an error if a query fails.
• The value can be followed by any of the following units: h, m, s, and ms, for specifying durations in hours, minutes, seconds, and milliseconds.
• The default value is 10 seconds.
For example:
[split-router]
type = service
router = readwritesplit
...
delayed_retry = true
delayed_retry_timeout = 30s
Restart the MaxScale instance.
$ sudo systemctl restart maxscale
This page is: Copyright © 2025 MariaDB. All rights reserved.
MaxScale's Read/Write Split Router (readwritesplit) performs query-based load balancing. For each client connected to MaxScale, it opens up connections to multiple back-end database servers. When the client sends a write query to MaxScale, it routes the query to the connection opened with the primary server. When the client sends a read query to MaxScale, it routes the query to a connection opened with one of the replicas.
This page contains topics that need to be considered when designing applications that use the Read/Write Split Router.
How does the Read/Write Split Router route queries?
How does the Read/Write Split Router retry failed reads?
Additional information is available here.
This page is: Copyright © 2025 MariaDB. All rights reserved.
The Read/Write Split Router (readwritesplit) load balances read-only queries between one or more replica servers. If the replica servers are using asynchronous MariaDB replication, then the data on the replica servers can sometimes lag behind the primary server. When this occurs, read-only queries that are executed on the replica servers can return stale results if they are not executed in a causally consistent manner. Causal consistency is the act of ensuring that interdependent operations maintain consistency by performing them in the same order on all servers.
To prevent this, the Read/Write Split Router can be configured to enable "causal reads", which ensures causal consistency for read-only queries. When causal reads is enabled, the Read/Write Split Router ensures that load balanced read-only queries are only executed on the replica server after all write statements previously executed on the primary server are fully replicated and applied on that specific replica server.
Starting with MaxScale 22.08, the Read/Write Split Router's causal reads functionality can be used with multiple MaxScale nodes.
Example of a Causal Read
Let's say that a client does the following:
The client executes an INSERT statement:
INSERT INTO hq_sales.invoices
(customer_id, invoice_date, invoice_total, payment_method)
VALUES
(1, '2020-05-10 12:35:10', 1087.23, 'CREDIT_CARD');
The router will route this statement to the primary server.
The client executes a SELECT statement that reads the inserted row:
SELECT * FROM hq_sales.invoices
WHERE customer_id = 1
AND invoice_date = '2020-05-10 12:35:10';
The router will route this statement to a replica server.
In the above example, the replica server may not replicate and apply the INSERT statement immediately. If the SELECT statement is executed before this happens, then the results of the query will not be causally consistent.
However, if causal reads is enabled, then the Read/Write Split Router will only execute the SELECT statement after the INSERT statement has been fully replicated and applied on the replica server.
Causal reads requires configuration changes on both the back-end MariaDB Servers and on the MaxScale instance.
Perform the following procedure on all MariaDB Servers used by MaxScale:
Choose a configuration file in which to configure your system variables and options. It is not recommended to make custom changes to one of the bundled configuration files. Instead, it is recommended to create a custom configuration file in one of the included directories. Configuration files in included directories are read in alphabetical order. If you want your custom configuration file to override the bundled configuration files, then it is a good idea to prefix the custom configuration file's name with a string that will be sorted last, such as z-.
On RHEL, CentOS, Rocky Linux, and SLES, a good custom configuration file would be: /etc/my.cnf.d/z-custom-my.cnf
On Debian and Ubuntu, a good custom configuration file would be: /etc/mysql/mariadb.conf.d/z-custom-my.cnf
Set the session_track_system_variables system variable to last_gtid, so that the server will track session-level changes to the value of the last_gtid system variable.
It needs to be set in the configuration file in a group that will be read by mariadbd, such as [mariadb] or [server].
For example:
[mariadb]
...
session_track_system_variables=last_gtid
Restart the server.
$ sudo systemctl restart mariadb
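After the restart, you can verify the setting from any client session:

```sql
-- Check that session-level tracking of last_gtid is enabled;
-- the value should include last_gtid:
SHOW GLOBAL VARIABLES LIKE 'session_track_system_variables';

-- last_gtid itself is readable per session:
SELECT @@last_gtid;
```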
Set the causal_reads and causal_reads_timeout parameters for the Read/Write Split Router in maxscale.cnf.
The causal_reads parameter can be set to the following values:
none
• Causal reads are disabled.
• This is the default value.
local
• Writes are locally visible.
• Writes are guaranteed to be visible only to the connection that performs them. Unrelated modifications made by other connections are not visible.
• This mode improves read scalability at the cost of latency and reduces the overall load placed on the primary server without breaking causality guarantees.
global
• Writes are globally visible.
• If one connection writes a value, all connections to the same service will see it.
• In general, this mode is slower than the local mode due to the extra synchronization it has to perform. It guarantees global happens-before ordering of reads when all transactions are inside a single GTID domain.
• This mode gives similar benefits to the local mode in that it improves read scalability at the cost of latency.
fast
• This mode is similar to the local mode in that it only affects the connection that performs the write.
• Whereas the local mode waits for a replica server to catch up, this mode only uses servers that are known to have already replicated the write.
• This means that if no replica server has replicated the write, the primary server where the write was done is used.
• The value of causal_reads_timeout is ignored in this mode.
• Currently, the replication state is only updated by the MariaDB Monitor (mariadbmon) whenever the servers are monitored. This means that a smaller monitor_interval provides faster replication state updates and possibly better overall usage of servers.
• This mode is the inverse of the local mode in the sense that it improves read latency at the cost of read scalability, while still retaining the causality guarantees for reads.
For example:
[split-router]
type = service
router = readwritesplit
...
causal_reads = local
causal_reads_timeout = 15
The unit for the causal_reads_timeout parameter is seconds, and the default value is 10.
Restart the MaxScale instance.
$ sudo systemctl restart maxscale
This page is: Copyright © 2025 MariaDB. All rights reserved.
The Read/Write Split Router (readwritesplit) load balances read-only queries between one or more replica servers. If a replica server fails, then the router may need to create new connections to a different replica server for any existing client connections. The router takes certain steps to ensure that those new replica server connections have the same state as the old replica server connections.
The Read/Write Split Router (readwritesplit) maintains connection state on replica servers by keeping a session command history. If the router has to create a new connection, then it replays these session commands from the previous connection on the new connection.
The session command history can require a lot of memory if connections are long-lived. In these cases, there are two options to limit memory usage:
Configure a maximum size for the session command history
Disable the session command history. This option is not recommended, because you would lose out on the benefits of the session command history.
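Session commands are statements that change connection state rather than read or write data, which is why they must be repeated on every new server connection. Typical examples (the database and table names here are illustrative):

```sql
SET NAMES utf8mb4;
SET SESSION sql_mode = 'STRICT_TRANS_TABLES';
USE hq_sales;
PREPARE stmt FROM 'SELECT * FROM invoices WHERE customer_id = ?';
```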
Set the maximum size of the session command history by configuring some parameters for the Read/Write Split Router in maxscale.cnf.
max_sescmd_history
• The maximum number of distinct session commands that will be stored for each connection.
• The default value is 50.
prune_sescmd_history
• When this parameter is enabled, the session command history for a connection is pruned when the number of distinct session commands reaches max_sescmd_history.
• When this parameter is disabled, the session command history for a connection is disabled when the number of distinct session commands reaches max_sescmd_history.
• This parameter is disabled by default.
For example:
[split-router]
type = service
router = readwritesplit
...
max_sescmd_history = 1500
prune_sescmd_history = true
Restart the MaxScale instance.
$ sudo systemctl restart maxscale
Disable the session command history by configuring the disable_sescmd_history parameter for the Read/Write Split Router in maxscale.cnf.
For example:
[split-router]
type = service
router = readwritesplit
...
disable_sescmd_history = true
Restart the MaxScale instance.
$ sudo systemctl restart maxscale
The Read/Write Split Router (readwritesplit) routes write queries to the primary server and load balances read-only queries between one or more replica servers. If the primary server fails, then the router can automatically reconnect existing client connections to the new primary server.
Configure automatic primary server re-connection by configuring several parameters for the Read/Write Split Router in maxscale.cnf.
master_reconnection
• When this parameter is enabled, if the primary server fails and master_failure_mode is not set to fail_instantly, existing client connections are automatically reconnected to the new primary server.
• This parameter is disabled by default.
master_failure_mode
• This parameter defines how client connections are handled when the primary server fails.
• This parameter must be set to either fail_on_write or error_on_write to allow automatic primary server reconnection.
• When this parameter is set to fail_on_write, the client connection is closed if a write query is received when no primary server is available.
• When this parameter is set to error_on_write, if no primary server is available and a write query is received, an error is returned stating that the connection is in read-only mode.
For example:
[split-router]
type = service
router = readwritesplit
...
master_reconnection = true
master_failure_mode = fail_on_write
Restart the MaxScale instance.
$ sudo systemctl restart maxscale
The Read/Write Split Router (readwritesplit) routes write queries to the primary server and load balances read-only queries between one or more replica servers. If a server fails, then the router may need to replay in-progress transactions on a different server.
Configure transaction replay by configuring several parameters for the Read/Write Split Router in maxscale.cnf.
transaction_replay
• When this parameter is enabled, transactions are replayed if they are interrupted. Enabling it also implicitly enables the delayed_retry and master_reconnection parameters.
• When this parameter is disabled, interrupted transactions cause the client connection to be closed.
• This parameter is disabled by default.
transaction_replay_max_size
• The maximum size of the transaction cache for each client connection. The unit is bytes, but IEC binary prefixes (Ki, Mi, Gi, and Ti) and SI prefixes (k, M, G, and T) can also be specified.
• The default value is 1 MiB.
transaction_replay_attempts
• The maximum number of attempts to make when replaying a transaction.
• The default value is 5.
transaction_replay_retry_on_deadlock
• When this parameter is enabled, transactions are replayed if a deadlock occurs.
• When this parameter is disabled, the client receives an error if a deadlock occurs.
• This parameter is disabled by default.
For example:
[split-router]
type = service
router = readwritesplit
...
transaction_replay = true
transaction_replay_max_size = 10Mi
transaction_replay_attempts = 10
transaction_replay_retry_on_deadlock = true
Restart the MaxScale instance.
$ sudo systemctl restart maxscale
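As an illustration of what transaction replay protects, consider a transaction like the following, reusing the invoices table from the earlier examples. If the primary server fails before the COMMIT, the router reconnects and re-executes the buffered statements on the new primary, up to transaction_replay_attempts times:

```sql
START TRANSACTION;
UPDATE hq_sales.invoices
   SET invoice_total = invoice_total + 10.00
 WHERE customer_id = 1;
-- A failover at this point is transparent to the client
-- when transaction_replay is enabled.
COMMIT;
```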
The Read/Write Split Router (readwritesplit) routes write queries to the primary server and load balances read-only queries between one or more replica servers. If a read-only query fails, then the router can retry the query on a different server.
Configure retries for failed reads by configuring the retry_failed_reads parameter for the Read/Write Split Router in maxscale.cnf.
For example:
[split-router]
type = service
router = readwritesplit
...
retry_failed_reads = true
Restart the MaxScale instance.
$ sudo systemctl restart maxscale
The Read/Write Split Router (readwritesplit) uses well-defined rules to determine whether a statement can be routed to a replica server, or whether it needs to be routed to the primary server. Application designers must understand these rules to ensure that the router can properly load balance queries.
The following statements are routed to the primary server:
Queries that write to the database. For example, this includes, but is not limited to, statements such as INSERT, UPDATE, DELETE, and REPLACE.
Queries that modify the database schema (DDL). For example, this includes, but is not limited to, statements such as CREATE, ALTER, DROP, and TRUNCATE.
Queries within open transactions. If the application uses explicit transactions, then all queries within the transaction will be routed to the primary server. Explicit transactions are used in the following cases:
When autocommit is set to OFF.
When BEGIN is executed.
When START TRANSACTION is executed.
For example, all queries will be routed to the primary server in this case:
SET SESSION autocommit=OFF;
SELECT * FROM hq_sales.invoices WHERE branch_id=1;
INSERT INTO hq_sales.invoices
(customer_id, invoice_date, invoice_total, payment_method)
VALUES
(1, '2020-05-10 12:35:10', 1087.23, 'CREDIT_CARD');
COMMIT;
And all queries will also be routed to the primary server in this case:
BEGIN;
SELECT * FROM hq_sales.invoices WHERE branch_id=1;
INSERT INTO hq_sales.invoices
(customer_id, invoice_date, invoice_total, payment_method)
VALUES
(1, '2020-05-10 12:35:10', 1087.23, 'CREDIT_CARD');
COMMIT;
Queries using stored procedures
Queries using stored functions
Queries using user-defined functions (UDF)
Queries that use temporary tables
EXECUTE statements that execute prepared statements
The following statements are routed to a replica server:
Queries that are read-only. For example, this includes SELECT statements that do not use locking reads such as SELECT ... FOR UPDATE.
Queries that read system or user-defined variables. For example, the following queries would be routed to a replica:
SELECT @@global.alter_algorithm;
select @@my_user_var;
SHOW statements
Queries using built-in functions
The following statements are routed to all servers:
SET statements, including those embedded in read-only statements
USE statements
PREPARE statements that create prepared statements
Internal client commands, such as QUIT, PING, STMT RESET, and CHANGE USER.
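When finer control is needed, MaxScale's routing hints can override these rules on a per-statement basis, provided the hintfilter is configured for the service. For example:

```sql
-- Force an otherwise read-only statement to the primary server:
SELECT * FROM hq_sales.invoices WHERE branch_id = 1; -- maxscale route to master

-- Force a statement to a replica server:
SELECT @@server_id; -- maxscale route to slave
```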
The Read/Write Split Router (readwritesplit) load balances read-only queries between one or more replica servers. It selects a replica server to execute a query using criteria configured by the slave_selection_criteria parameter.
ADAPTIVE_ROUTING: Selects based on average response times.
LEAST_BEHIND_MASTER: Selects based on replication lag.
LEAST_CURRENT_OPERATIONS: Selects based on the number of active operations (the default).
LEAST_GLOBAL_CONNECTIONS: Selects based on the number of connections from MariaDB MaxScale.
LEAST_ROUTER_CONNECTIONS: Selects based on the number of connections from the service.
The Read/Write Split Router (readwritesplit) uses adaptive routing when the slave_selection_criteria parameter is set to ADAPTIVE_ROUTING.
In this mode, the router measures average server response times. When the router routes queries, it compares the response times of the different replica servers. It favors the faster servers for most queries, while still guaranteeing some traffic on the slowest servers.
The Read/Write Split Router (readwritesplit) uses the replica server that is least behind the primary server when the slave_selection_criteria parameter is set to LEAST_BEHIND_MASTER. This mode is only compatible with MariaDB replication.
In this mode, the router measures replica lag using the Seconds_Behind_Master column from SHOW REPLICA STATUS. The replica server that has the lowest value is considered to be the least behind the primary server.
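You can inspect the value the router compares by running the following on a replica:

```sql
-- The Seconds_Behind_Master column reports the replication lag in seconds:
SHOW REPLICA STATUS;
```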
Set the replica selection criteria by configuring the slave_selection_criteria parameter for the Read/Write Split Router in maxscale.cnf:
[split-router]
type = service
router = readwritesplit
...
slave_selection_criteria = LEAST_GLOBAL_CONNECTIONS
Restart the MaxScale instance.
$ sudo systemctl restart maxscale
MaxScale's Read/Write Split Router (readwritesplit) performs query-based load balancing. For each client connected to MaxScale, it opens up connections to multiple back-end database servers. When the client sends a write query to MaxScale, it routes the query to the connection opened with the primary server. When the client sends a read query to MaxScale, it routes the query to a connection opened with one of the replicas.
The Read/Write Split Router (readwritesplit) supports:
MariaDB replication deployments
Galera Cluster deployments
Multi-Node Enterprise ColumnStore deployments
The Read/Write Split Router (readwritesplit) allows you to:
Perform query-based load balancing.
Route client connections to multiple servers simultaneously.
Route write queries to primary and read queries to replicas.
Automatically reconnect clients to the new primary after failover or switchover.
Automatically replay transactions on the new primary after failover or switchover.
Automatically retry failed queries.
Enforce causal reads to avoid reading stale data caused by replica lag.
Deploy Xpand Topology
Deploy MaxScale with MariaDB Monitor and Read/Write Split Router