State Snapshot Transfers (SSTs) in Galera Cluster

State Snapshot Transfers (SSTs) in MariaDB Galera Cluster copy the full dataset from a donor node to a new or recovering joiner node, ensuring data consistency before the joiner joins replication.

Introduction to State Snapshot Transfers (SSTs)

In a State Snapshot Transfer (SST), the cluster provisions nodes by transferring a full data copy from one node to another. When a new node joins the cluster, the new node initiates a State Snapshot Transfer to synchronize its data with a node that is already part of the cluster.

Types of SSTs

There are two conceptually different ways to transfer a state from one MariaDB server to another:

  1. Logical: The only SST method of this type is the mysqldump SST method, which uses the mysqldump utility to get a logical dump of the donor. This SST method requires the joiner node to be fully initialized and ready to accept connections before the transfer. This method is, by definition, blocking, in that it blocks the donor node from modifying its state for the duration of the transfer. It is also the slowest method, which can be an issue in a cluster under heavy load.

  2. Physical: SST methods of this type physically copy the data files from the donor node to the joiner node. This requires that the joiner node be initialized after the transfer. The mariadb-backup SST method and a few other SST methods fall into this category. These SST methods are much faster than the mysqldump SST method, but they have certain limitations. For example, they can be used only on server startup, and the joiner node must be configured very similarly to the donor node (e.g., innodb_file_per_table should be the same, and so on). Some of the SST methods in this category are non-blocking on the donor node, meaning that the donor node is still able to process queries while donating the SST (e.g. the mariadb-backup SST method is non-blocking).

SST Methods

SST methods are supported via a scriptable interface. New SST methods could potentially be developed by creating new SST scripts. The scripts usually have names of the form wsrep_sst_<method> where <method> is one of the SST methods listed below.

You can choose your SST method by setting the wsrep_sst_method system variable. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor. For example:

SET GLOBAL wsrep_sst_method='mariabackup';

It can also be set in a server option group in an option file prior to starting up a node:

[mariadb]
...
wsrep_sst_method = mariabackup

For an SST to work properly, the donor and joiner node must use the same SST method. Therefore, it is recommended to set wsrep_sst_method to the same value on all nodes, since any node will usually be a donor or joiner node at some point.

MariaDB Galera Cluster comes with the following built-in SST methods:

mariadb-backup

This SST method uses the mariadb-backup utility for performing SSTs. It is one of the two non-locking methods. This is the recommended SST method if you require the ability to run queries on the donor node during the SST. Note that if you use the mariadb-backup SST method, then you also need to have socat installed on the server. This is needed to stream the backup from the donor to the joiner. This is a limitation inherited from the xtrabackup-v2 SST method.

  • This SST method supports GTID.

  • This SST method supports Data at Rest Encryption.

  • This SST method is available from MariaDB 10.1.26 and MariaDB 10.2.10.

With this SST method, it is impossible to upgrade the cluster between some major versions; see MDEV-27437.

See mariadb-backup SST method for more information.

rsync / rsync_wan

rsync is the default method. This method uses the rsync utility to create a snapshot of the donor node. rsync should be available by default on all modern Linux distributions. The donor node is blocked with a read lock during the SST. This is the fastest SST method, especially for large datasets since it copies binary data. Because of that, this is the recommended SST method if you do not need to allow the donor node to execute queries during the SST.

The rsync method runs rsync in --whole-file mode, assuming that nodes are connected by fast local network links, so that the default delta transfer mode would consume more processing time than it would save in data transfer bandwidth. For a distributed cluster with slow links between nodes, the rsync_wan method runs rsync in the default delta transfer mode, which may reduce data transfer time substantially when an older datadir state is already present on the joiner node. Both methods are implemented by the same script: wsrep_sst_rsync_wan is just a symlink to the wsrep_sst_rsync script, and the rsync mode to use is determined by the name under which the script was called.
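The name-based dispatch described above can be illustrated with a small shell sketch. This is not the code of the real wsrep_sst_rsync script; rsync_mode_for is a hypothetical helper that only shows how one script can pick the rsync mode from the name it was invoked under.

```shell
# Hypothetical helper (not taken from wsrep_sst_rsync): map the name a script
# was called by to the rsync flag described in the text.
# Usage: rsync_mode_for <path-or-name-of-invoked-script>
rsync_mode_for() {
    case "${1##*/}" in                       # strip any leading directories
        wsrep_sst_rsync)     echo "--whole-file" ;;  # fast LAN: skip delta transfer
        wsrep_sst_rsync_wan) echo "" ;;              # WAN: keep rsync's default delta mode
        *)                   return 1 ;;             # unknown name
    esac
}
```

Inside a script, calling rsync_mode_for "$0" would return the flag for whichever name (the script itself or the symlink) was used to start it.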

  • This SST method supports GTID.

  • This SST method supports Data at Rest Encryption.

The rsync SST method does not support tables created with the DATA DIRECTORY or INDEX DIRECTORY clause. Use the mariadb-backup SST method as an alternative to support this feature.

Use of this SST method could result in data corruption when using innodb_use_native_aio (the default). wsrep_sst_method=rsync is a reliable way to upgrade the cluster to a newer major version.

stunnel can be used to encrypt data over the wire. Be sure to have stunnel installed. You will also need to generate certificates and keys. See the stunnel documentation for information on how to do that. Once you have the keys, you will need to add the tkey and tcert options to the [sst] option group in your MariaDB configuration file, such as:

[sst]
tkey = /etc/my.cnf.d/certificates/client-key.pem
tcert = /etc/my.cnf.d/certificates/client-cert.pem

You also need to run the certificate directory through openssl rehash.

In older MariaDB versions that do not support stunnel for this SST method, stunnel cannot be used to encrypt data over the wire.

mysqldump

This SST method runs mysqldump on the donor node and pipes the output to the mariadb client connected to the joiner node. The mysqldump SST method needs a username/password pair set in the wsrep_sst_auth variable in order to get the dump. The donor node is blocked with a read lock during the SST. This is the slowest SST method.

  • This SST method supports GTID.

  • This SST method supports Data at Rest Encryption.

xtrabackup-v2

Percona XtraBackup is not supported in MariaDB. mariadb-backup is the recommended backup method to use instead of Percona XtraBackup. See Percona XtraBackup Overview: Compatibility with MariaDB for more information.

This SST method uses the Percona XtraBackup utility for performing SSTs. It is one of the two non-blocking methods. Note that if you use the xtrabackup-v2 SST method, you also need to have socat installed on the server. Since Percona XtraBackup is a third-party product, this SST method requires an additional installation and some additional configuration. Please refer to Percona's xtrabackup SST documentation for information from the vendor.

  • This SST method does not support GTID.

  • This SST method does not support Data at Rest Encryption.

This SST method is available from MariaDB Galera Cluster 5.5.37 and MariaDB Galera Cluster 10.0.10.

See xtrabackup-v2 SST method for more information.

xtrabackup

Percona XtraBackup is not supported in MariaDB. mariadb-backup is the recommended backup method to use instead of Percona XtraBackup. See Percona XtraBackup Overview: Compatibility with MariaDB for more information.

This SST method is an older SST method that uses the Percona XtraBackup utility for performing SSTs. The xtrabackup-v2 SST method should be used instead of the xtrabackup SST method starting from MariaDB 5.5.33.

  • This SST method does not support GTID.

  • This SST method does not support Data at Rest Encryption.

Authentication

All SST methods except rsync require authentication via username and password. You can tell the client what username and password to use by setting the wsrep_sst_auth system variable. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor. For example:

SET GLOBAL wsrep_sst_auth = 'mariadb-backup:password';

It can also be set in a server option group in an option file prior to starting up a node:

[mariadb]
...
wsrep_sst_auth = mariadb-backup:password

Some authentication plugins do not require a password. For example, the unix_socket and gssapi authentication plugins do not require a password. If you are using a user account that does not require a password in order to log in, then you can just leave the password component of wsrep_sst_auth empty. For example:

[mariadb]
...
wsrep_sst_auth = mariadb-backup:

See the relevant description or page for each SST method to find out what privileges need to be granted to the user and whether the privileges are needed on the donor node or joiner node for that method.
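The user:password layout of wsrep_sst_auth, including the empty password case, can be illustrated with a short shell sketch. The helper names sst_auth_user and sst_auth_password are hypothetical, and the real SST scripts may split the value differently; the sketch assumes the first colon separates the two parts.

```shell
# Hypothetical helpers: split a wsrep_sst_auth value at the first colon.
# Assumption: the user name itself contains no colon.
sst_auth_user() {
    printf '%s\n' "${1%%:*}"                  # everything before the first colon
}

sst_auth_password() {
    case "$1" in
        *:*) printf '%s\n' "${1#*:}" ;;       # everything after the first colon
        *)   printf '\n' ;;                   # no colon: treat the password as empty
    esac
}
```

With the values used above, sst_auth_user "mariadb-backup:password" yields the user name, and sst_auth_password "mariadb-backup:" yields an empty password, which matches the passwordless configuration.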

SSTs and Systemd

MariaDB's systemd unit file has a default startup timeout of about 90 seconds on most systems. If an SST takes longer than this default startup timeout on a joiner node, then systemd will assume that mysqld has failed to start up, which causes systemd to kill the mysqld process on the joiner node. To work around this, you can reconfigure the MariaDB systemd unit to have an infinite timeout, such as by executing one of the following commands:

If you are using systemd 228 or older, then you can execute the following to set an infinite timeout:

sudo mkdir -p /etc/systemd/system/mariadb.service.d
sudo tee /etc/systemd/system/mariadb.service.d/timeoutstartsec.conf <<EOF
[Service]

TimeoutStartSec=0
EOF
sudo systemctl daemon-reload

Systemd 229 added the infinity option, so if you are using systemd 229 or later, then you can execute the following to set an infinite timeout:

sudo mkdir -p /etc/systemd/system/mariadb.service.d
sudo tee /etc/systemd/system/mariadb.service.d/timeoutstartsec.conf <<EOF
[Service]

TimeoutStartSec=infinity
EOF
sudo systemctl daemon-reload

See Configuring the Systemd Service Timeout for more details.

Note that systemd 236 added the EXTEND_TIMEOUT_USEC environment variable that allows services to extend the startup timeout during long-running processes. Starting with MariaDB 10.1.35, MariaDB 10.2.17, and MariaDB 10.3.8, on systems with systemd versions that support it, MariaDB uses this feature to extend the startup timeout during long SSTs. Therefore, if you are using systemd 236 or later, then you should not need to manually override TimeoutStartSec, even if your SSTs run for longer than the configured value. See MDEV-15607 for more information.
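The choice between the two overrides can be sketched as a small shell helper keyed on the systemd version. The function names start_timeout_value and timeout_dropin are hypothetical, and the sketch only encodes the 228/229 boundary stated above; it does not account for the systemd 236 behavior, where an override may not be needed at all.

```shell
# Hypothetical helper: choose the TimeoutStartSec value for a systemd version.
# Usage: start_timeout_value <systemd-version-number>
start_timeout_value() {
    if [ "$1" -ge 229 ]; then
        echo "infinity"   # systemd 229 and later accept the infinity keyword
    else
        echo "0"          # systemd 228 and older use 0 for an infinite timeout
    fi
}

# Print the drop-in file contents for the given systemd version.
timeout_dropin() {
    printf '[Service]\nTimeoutStartSec=%s\n' "$(start_timeout_value "$1")"
}
```

On a live system, the version number is usually the second field on the first line of systemctl --version, so timeout_dropin "$(systemctl --version | awk 'NR==1 {print $2}')" would print the matching drop-in contents.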

SST Failure

An SST failure generally renders the joiner node unusable. Therefore, when an SST failure is detected, the joiner node will abort.

Restarting a node after a mysqldump SST failure may require manual restoration of the administrative tables.

SSTs and Data at Rest Encryption

Look at the description of each SST method to determine which methods support Data at Rest Encryption.

For logical SST methods like mysqldump, each node should be able to have different encryption keys. For physical SST methods, all nodes need to have the same encryption keys, since the donor node will copy encrypted data files to the joiner node, and the joiner node will need to be able to decrypt them.

Minimal Cluster Size

In order to avoid a split-brain condition, the minimum recommended number of nodes in a cluster is 3.

When using an SST method that blocks the donor, there is yet another reason to require a minimum of 3 nodes. In a 3-node cluster, if one node is acting as an SST joiner and one other node is acting as an SST donor, then there is still one more node to continue executing queries.
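The arithmetic behind the three-node recommendation can be sketched in shell. This assumes a simple majority rule with equal node weights, which is a simplification of how Galera computes quorum; has_quorum and serving_during_sst are hypothetical helpers for illustration only.

```shell
# Hypothetical helper: succeed if the reachable nodes form a strict majority.
# Assumption: all nodes carry equal weight.
# Usage: has_quorum <reachable-nodes> <total-nodes>
has_quorum() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

# Nodes left to execute queries while one node is a joiner and one is a
# blocked donor. Usage: serving_during_sst <total-nodes>
serving_during_sst() {
    echo $(( $1 - 2 ))
}
```

Under these assumptions, a 2-node cluster that loses one node fails has_quorum 1 2, while a 3-node cluster that loses one node passes has_quorum 2 3, and serving_during_sst 3 shows that one node remains available during a blocking SST.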

Manual SSTs

In some cases, if Galera Cluster's automatic SSTs repeatedly fail, then it can be helpful to perform a "manual SST". See the following pages on how to do that:

  • Manual SST of Galera Cluster node with mariadb-backup

  • Manual SST of Galera Cluster node with Percona XtraBackup

Known Issues

mysqld_multi

SST scripts cannot currently read the mysqld<#> option groups in an option file that are read by instances managed by mysqld_multi.

See MDEV-18863 for more information.

See Also

  • Galera Cluster documentation: STATE SNAPSHOT TRANSFERS

This page is licensed: CC BY-SA / Gnu FDL

Manual SST of Galera Cluster Node With mariadb-backup

Sometimes it can be helpful to perform a "manual SST" when Galera's normal SSTs fail. This can be especially useful when the cluster's datadir is very large, since a normal SST can take a long time to fail in that case.

A manual SST essentially consists of taking a backup of the donor, loading the backup on the joiner, and then manually editing the cluster state on the joiner node. This page will show how to perform this process with mariadb-backup.

Process

  • Check that the donor and joiner nodes have the same mariadb-backup version.

mariadb-backup --version

  • Create a backup directory on the donor node.

MYSQL_BACKUP_DIR=/mysql_backup
mkdir $MYSQL_BACKUP_DIR

  • Take a full backup of the donor node with mariadb-backup. The --galera-info option should also be provided, so that the node's cluster state is also backed up.

DB_USER=sstuser
DB_USER_PASS=password
mariadb-backup --backup  --galera-info \
   --target-dir=$MYSQL_BACKUP_DIR \
   --user=$DB_USER \
   --password=$DB_USER_PASS

  • Verify that the MariaDB Server process is stopped on the joiner node. This will depend on your service manager.

For example, on systemd systems, you can execute:

systemctl status mariadb

  • Create the backup directory on the joiner node.

MYSQL_BACKUP_DIR=/mysql_backup
mkdir $MYSQL_BACKUP_DIR

  • Copy the backup from the donor node to the joiner node.

OS_USER=dba
JOINER_HOST=dbserver2.mariadb.com
rsync -av $MYSQL_BACKUP_DIR/* ${OS_USER}@${JOINER_HOST}:${MYSQL_BACKUP_DIR}

  • Prepare the backup on the joiner node.

mariadb-backup --prepare \
   --target-dir=$MYSQL_BACKUP_DIR

  • Get the Galera Cluster version ID from the donor node's grastate.dat file.

MYSQL_DATADIR=/var/lib/mysql
grep version $MYSQL_DATADIR/grastate.dat

For example, a very common version number is "2.1".

  • Get the node's cluster state from the xtrabackup_galera_info file in the backup that was copied to the joiner node.

cat $MYSQL_BACKUP_DIR/xtrabackup_galera_info

The file contains the values of the wsrep_local_state_uuid and wsrep_last_committed status variables.

The values are written in the following format:

wsrep_local_state_uuid:wsrep_last_committed

For example:

d38587ce-246c-11e5-bcce-6bbd0831cc0f:1352215

  • Create a grastate.dat file in the backup directory of the joiner node. The Galera Cluster version ID, the cluster uuid, and the seqno from the previous steps are used to fill in the relevant fields.

For example, with the example values from the last two steps, we could do:

sudo tee $MYSQL_BACKUP_DIR/grastate.dat <<EOF
# GALERA saved state
version: 2.1
uuid:    d38587ce-246c-11e5-bcce-6bbd0831cc0f
seqno:   1352215
safe_to_bootstrap: 0
EOF

  • Remove the existing contents of the datadir on the joiner node.

MYSQL_DATADIR=/var/lib/mysql
rm -Rf $MYSQL_DATADIR/*

  • Copy the contents of the backup directory to the datadir on the joiner node.

mariadb-backup --copy-back \
   --target-dir=$MYSQL_BACKUP_DIR

  • Make sure the permissions of the datadir are correct on the joiner node.

chown -R mysql:mysql $MYSQL_DATADIR/

  • Start the MariaDB Server process on the joiner node. This will depend on your service manager.

For example, on systemd systems, you can execute:

systemctl start mariadb

  • Watch the MariaDB error log on the joiner node and verify that the node does not need to perform a normal SST after the manual SST.

tail -f /var/log/mysql/mysqld.log
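The grastate.dat step of the process above can be sketched as a small shell helper that turns the uuid:seqno pair and a version ID into the file body. The function name make_grastate is hypothetical, and the sketch assumes the input is exactly one uuid:seqno pair in the format shown above, with no extra fields.

```shell
# Hypothetical helper: print a grastate.dat body.
# Assumption: the first argument is a single "uuid:seqno" pair.
# Usage: make_grastate <uuid:seqno> <version-id>
make_grastate() {
    galera_info=$1
    version=$2
    uuid=${galera_info%%:*}      # text before the first colon
    seqno=${galera_info##*:}     # text after the last colon
    cat <<EOF
# GALERA saved state
version: $version
uuid:    $uuid
seqno:   $seqno
safe_to_bootstrap: 0
EOF
}
```

For example, make_grastate "$(cat $MYSQL_BACKUP_DIR/xtrabackup_galera_info)" 2.1 prints the file body to standard output, which can be reviewed before it is written to $MYSQL_BACKUP_DIR/grastate.dat.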

mariadb-backup SST Method

The mariabackup SST method uses the mariadb-backup utility for performing SSTs. It is one of the methods that do not block the donor node. mariadb-backup was originally forked from Percona XtraBackup, and similarly, the mariadb-backup SST method was originally forked from the xtrabackup-v2 SST method.

Note that if you use the mariadb-backup SST method, then you also need to have socat installed on the server. This is needed to stream the backup from the donor node to the joiner node. This is a limitation that was inherited from the xtrabackup-v2 SST method.

Choosing mariadb-backup for SSTs

To use the mariadb-backup SST method, you must set wsrep_sst_method=mariabackup on both the donor and joiner node. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor. For example:

SET GLOBAL wsrep_sst_method='mariabackup';

It can also be set in a server option group in an option file prior to starting up a node:

[mariadb]
...
wsrep_sst_method = mariabackup

For an SST to work properly, the donor and joiner node must use the same SST method. Therefore, it is recommended to set wsrep_sst_method to the same value on all nodes, since any node will usually be a donor or joiner node at some point.

Major Version Upgrades

The InnoDB redo log format changed in MariaDB 10.5 and MariaDB 10.8 in a way that does not allow crash recovery or the preparation of a backup from an older major version. Because of this, the mariabackup SST method cannot be used for some major-version upgrades, unless you temporarily edit the wsrep_sst_mariabackup script so that the --prepare step on the newer-major-version joiner is executed using the older-major-version mariadb-backup tool.

The default method wsrep_sst_method=rsync works for major-version upgrades; see MDEV-27437.

Authentication and Privileges

To use the mariadb-backup SST method, mariadb-backup needs to be able to authenticate locally on the donor node, so that it can create a backup to stream to the joiner. You can tell the donor node what username and password to use by setting the wsrep_sst_auth system variable. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor:

SET GLOBAL wsrep_sst_auth = 'mariadbbackup:mypassword';

It can also be set in a server option group in an option file prior to starting up a node:

[mariadb]
...
wsrep_sst_auth = mariadbbackup:mypassword

Some authentication plugins do not require a password. For example, the unix_socket and gssapi authentication plugins do not require a password. If you are using a user account that does not require a password in order to log in, then you can just leave the password component of wsrep_sst_auth empty. For example:

[mariadb]
...
wsrep_sst_auth = mariadbbackup:

The user account that performs the backup for the SST needs to have the same privileges as mariadb-backup, which are the RELOAD, PROCESS, LOCK TABLES, BINLOG MONITOR, and REPLICA MONITOR global privileges. To be safe, ensure that these privileges are set on each node in your cluster. mariadb-backup connects locally on the donor node to perform the backup, so the following user should be sufficient:

CREATE USER 'mariadbbackup'@'localhost' IDENTIFIED BY 'mypassword';
GRANT RELOAD, PROCESS, LOCK TABLES, 
BINLOG MONITOR ON *.* TO 'mariadbbackup'@'localhost';

Passwordless Authentication - Unix Socket

It is possible to use the unix_socket authentication plugin for the user account that performs SSTs. This would provide the benefit of not needing to configure a plain-text password in wsrep_sst_auth.

The user account would have to have the same name as the operating system user account that is running the mysqld process. On many systems, this is the user account configured as the user option, and it tends to default to mysql.

For example, if the unix_socket authentication plugin is already installed, then you could execute the following to create the user account:

CREATE USER 'mysql'@'localhost' IDENTIFIED VIA unix_socket;
GRANT RELOAD, PROCESS, LOCK TABLES, 
REPLICATION CLIENT ON *.* TO 'mysql'@'localhost';

To configure wsrep_sst_auth, set the following in a server option group in an option file prior to starting up a node:

[mariadb]
...
wsrep_sst_auth = mysql:

Passwordless Authentication - GSSAPI

It is possible to use the gssapi authentication plugin for the user account that performs SSTs. This would provide the benefit of not needing to configure a plain-text password in wsrep_sst_auth.

The following steps would need to be done beforehand:

  • You need a KDC running MIT Kerberos or Microsoft Active Directory.

  • You will need to create a keytab file for the MariaDB server.

  • You will need to install the package containing the gssapi authentication plugin.

  • You will need to install the plugin in MariaDB, so that the gssapi authentication plugin is available to use.

  • You will need to configure the plugin.

  • You will need to create a user account that authenticates via the gssapi authentication plugin.

For example, you could execute the following to create the user account in MariaDB:

CREATE USER 'mariadbbackup'@'localhost' IDENTIFIED VIA gssapi;
GRANT RELOAD, PROCESS, LOCK TABLES, 
BINLOG MONITOR ON *.* TO 'mariadbbackup'@'localhost';

To configure wsrep_sst_auth, set the following in a server option group in an option file prior to starting up a node:

[mariadb]
...
wsrep_sst_auth = mariadbbackup:

Choosing a Donor Node

When mariadb-backup is used to create the backup for the SST on the donor node, mariadb-backup briefly requires a system-wide lock at the end of the backup. This is done with BACKUP STAGE BLOCK_COMMIT.

If a specific node in your cluster is acting as the primary node by receiving all of the application's write traffic, then this node should not usually be used as the donor node, because the system-wide lock could interfere with the application. In this case, you can define one or more preferred donor nodes by setting the wsrep_sst_donor system variable.

For example, let's say that we have a 5-node cluster with the nodes node1, node2, node3, node4, and node5, and let's say that node1 is acting as the primary node. The preferred donor nodes for node2 could be configured by setting the following in a server option group in an option file prior to starting up a node:

[mariadb]
...
wsrep_sst_donor=node3,node4,node5,

The trailing comma tells the server to allow any other node as donor when the preferred donors are not available. Therefore, if node1 is the only node left in the cluster, the trailing comma allows it to be used as the donor node.
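The effect of the trailing comma can be illustrated with a shell sketch that mimics the selection rule described above. This is not Galera's actual donor selection code, which takes more into account than name matching; pick_donor is a hypothetical helper, and it uses plain global variables because POSIX sh has no local keyword.

```shell
# Hypothetical helper: pick a donor from a wsrep_sst_donor-style list.
# Usage: pick_donor "<comma-separated preferred donors>" "<space-separated available nodes>"
pick_donor() {
    rest=$1
    available=" $2 "
    # Try the preferred donors in the order they are listed.
    while [ -n "$rest" ]; do
        name=${rest%%,*}
        case "$rest" in
            *,*) rest=${rest#*,} ;;
            *)   rest="" ;;
        esac
        [ -z "$name" ] && continue
        case "$available" in
            *" $name "*) echo "$name"; return 0 ;;
        esac
    done
    # A trailing comma allows falling back to any other available node.
    case "$1" in
        *,)
            for name in $2; do
                echo "$name"
                return 0
            done
            ;;
    esac
    return 1
}
```

With the configuration above, pick_donor "node3,node4,node5," "node1" falls back to node1, whereas the same list without the trailing comma finds no donor when only node1 is available.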

Socat Dependency

During the SST process, the donor node uses socat to stream the backup to the joiner node. Then the joiner node prepares the backup before restoring it. The socat utility must be installed on both the donor node and the joiner node in order for this to work. Otherwise, the MariaDB error log will contain an error like:

WSREP_SST: [ERROR] socat not found in path: /usr/sbin:/sbin:/usr//bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin (20180122 14:55:32.993)

Installing Socat on RHEL/CentOS

On RHEL/CentOS, socat can be installed from the Extra Packages for Enterprise Linux (EPEL) repository.
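Before relying on this SST method, it can help to confirm on each node that socat is actually on the PATH, rather than discovering the error above in the log during a failed SST. The following is a minimal sketch; missing_tools is a hypothetical helper, and it only checks the PATH of the calling shell, which may differ from the PATH the MariaDB service runs with.

```shell
# Hypothetical helper: print the name of every listed tool that is not on the PATH.
# Usage: missing_tools <tool> [<tool> ...]
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the tool
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Running missing_tools socat mariadb-backup on the donor and on the joiner prints nothing when both utilities are found.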

TLS

This SST method supports two different TLS methods. The specific method can be selected by setting the encrypt option in the [sst] section of the MariaDB configuration file. The options are:

  • TLS using OpenSSL encryption built into socat (encrypt=2)

  • TLS using OpenSSL encryption with Galera-compatible certificates and keys (encrypt=3)

Note that encrypt=1 refers to a TLS encryption method that has been deprecated and removed. encrypt=4 refers to a TLS encryption method in xtrabackup-v2 that has not yet been ported to mariadb-backup. See MDEV-18050 about that.

TLS Using OpenSSL Encryption Built into Socat

To generate keys compatible with this encryption method, follow these directions.

First, generate the keys and certificates:

FILENAME=sst
openssl genrsa -out $FILENAME.key 2048
openssl req -new -key $FILENAME.key -x509 -days 3653 -out $FILENAME.crt
cat $FILENAME.key $FILENAME.crt >$FILENAME.pem
chmod 600 $FILENAME.key $FILENAME.pem

On some systems, you may also have to add dhparams to the certificate:

openssl dhparam -out dhparams.pem 2048
cat dhparams.pem >> sst.pem

Next, copy the certificate and keys to all nodes in the cluster.

When done, configure the following on all nodes in the cluster:

[sst]
encrypt=2
tca=/etc/my.cnf.d/certificates/sst.crt
tcert=/etc/my.cnf.d/certificates/sst.pem

Make sure to replace the paths with whatever is relevant on your system. This should allow your SSTs to be encrypted.

TLS Using OpenSSL Encryption With Galera-Compatible Certificates and Keys

To generate keys compatible with this encryption method, follow these directions.

First, generate the keys and certificates:

# CA
openssl genrsa 2048 > ca-key.pem
openssl req -new -x509 -nodes -days 365000 \
-key ca-key.pem -out ca-cert.pem
 
# server1
openssl req -newkey rsa:2048 -days 365000 \
-nodes -keyout server1-key.pem -out server1-req.pem
openssl rsa -in server1-key.pem -out server1-key.pem
openssl x509 -req -in server1-req.pem -days 365000 \
-CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 \
-out server1-cert.pem

Next, copy the certificate and keys to all nodes in the cluster.

When done, configure the following on all nodes in the cluster:

[sst]
encrypt=3
tkey=/etc/my.cnf.d/certificates/server1-key.pem
tcert=/etc/my.cnf.d/certificates/server1-cert.pem

Make sure to replace the paths with whatever is relevant on your system. This should allow your SSTs to be encrypted.

Logs

The mariadb-backup SST method has its own logging outside of the MariaDB Server logging.

Logging to SST Logs

Logging for mariadb-backup SSTs works the following way.

By default, on the donor node, it logs to mariadb-backup.backup.log. This log file is located in the datadir.

By default, on the joiner node, it logs to mariadb-backup.prepare.log and mariadb-backup.move.log. These log files are also located in the datadir.

By default, before a new SST is started, existing mariadb-backup SST log files are compressed and moved to /tmp/sst_log_archive. This behavior can be disabled by setting sst-log-archive=0 in the [sst] option group in an option file. Similarly, the archive directory can be changed by setting sst-log-archive-dir:

[sst]
sst-log-archive=1
sst-log-archive-dir=/var/log/mysql/sst/

See MDEV-17973 for more information.

Logging to Syslog

You can redirect the SST logs to the syslog instead by setting the following in the [sst] option group in an option file:

[sst]
sst-syslog=1

You can also redirect the SST logs to the syslog by setting the following in the [mysqld_safe] option group in an option file:

[mysqld_safe]
syslog

Performing SSTs With IPv6 Addresses

If you are performing mariadb-backup SSTs with IPv6 addresses, then the socat utility needs to be passed the pf=ip6 option. This can be done by setting the sockopt option in the [sst] option group in an option file:

[sst]
sockopt=",pf=ip6"

See MDEV-18797 for more information.

Manual SST With mariadb-backup

If Galera Cluster's automatic SSTs repeatedly fail, it can be helpful to perform a "manual SST"; see: Manual SST of Galera Cluster node with mariadb-backup

See Also

  • Percona XtraBackup SST Configuration

  • Encrypting PXC Traffic: ENCRYPTING SST TRAFFIC

  • XTRABACKUP PARAMETERS

  • SSL FOR STATE SNAPSHOT TRANSFERS: ENABLING SSL FOR XTRABACKUP
