
MariaDB MaxScale 22.08 Monitors

MaxScale 22.08 Aurora Monitor

Aurora Monitor

Note The Aurora Monitor is deprecated in 22.08.2 and will be removed in 23.02.0.

This module monitors the status of Aurora cluster replicas. These replicas do not use the standard MySQL protocol replication but rely on a mechanism provided by AWS to replicate changes.

How Aurora Is Monitored

Each node in an Aurora cluster has the variable @@aurora_server_id, which is the unique identifier for that node. An Aurora replica stores information relevant to replication in the information_schema.replica_host_status table. The table contains information about the status of all replicas in the cluster. The server_id column in this table holds the values of the @@aurora_server_id variables from all nodes. The session_id column contains a unique string for each read-only replica. For the master node, this value will be MASTER_SESSION_ID. By executing the following query, we can retrieve the @@aurora_server_id of the master node along with the @@aurora_server_id of the current node.

SELECT @@aurora_server_id, server_id FROM information_schema.replica_host_status WHERE session_id = 'MASTER_SESSION_ID';

The node which returns a row with two identical fields is the master. All other nodes are read-only replicas and will be labeled as slave servers.

In addition to replica status information, the information_schema.replica_host_status table contains information about replication lag between the master and the read-only nodes. This value is stored in the replica_lag_in_milliseconds column. This can be used to detect read replicas that are lagging behind the master node. This information can then be used by the routing modules to route reads to up-to-date nodes.
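For example, the lag of every replica can be inspected with a query like the following, using the columns described above:

```sql
SELECT server_id, session_id, replica_lag_in_milliseconds
FROM information_schema.replica_host_status;
```

Rows whose session_id is not MASTER_SESSION_ID are read-only replicas; a routing module can then prefer the replicas with the smallest lag.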

Configuring the Aurora Monitor

The Aurora monitor should connect directly to the unique endpoints of the Aurora replicas. The cluster endpoint should not be included in the set of monitored servers. Read the Amazon RDS User Guide for more information about how to retrieve the unique endpoints of your cluster.

The Aurora monitor requires no parameters apart from the standard monitor parameters. It supports the monitor script functionality described in the Monitor Common documentation.

Here is an example Aurora monitor configuration.

[Aurora-Monitor]
type=monitor
module=auroramon
servers=cluster-1,cluster-2,cluster-3
user=aurora
password=borealis
monitor_interval=2500ms

The servers cluster-1, cluster-2 and cluster-3 are the unique Aurora endpoints configured as MaxScale servers. The monitor will use the aurora:borealis credentials to connect to each of the endpoints. The status of the nodes is inspected every 2500 milliseconds.

CC BY-SA / Gnu FDL

MaxScale 22.08 ColumnStore Monitor

ColumnStore Monitor

Note The ColumnStore monitor is deprecated in 22.08.2 and will be removed in 23.02.0. It is superseded by MariaDB Monitor.

The ColumnStore monitor, csmon, is a monitor module for MariaDB ColumnStore servers. The monitor supports ColumnStore version 1.5.

Required Grants

The credentials defined with the user and password parameters must have all grants on the infinidb_vtable database.

For example, to create a user for this monitor with the required grants, execute the following SQL.

CREATE USER 'maxscale'@'maxscalehost' IDENTIFIED BY 'maxscale-password';
GRANT ALL ON infinidb_vtable.* TO 'maxscale'@'maxscalehost';

Configuration

Read the Monitor Common document for a list of supported common monitor parameters.

version

This deprecated optional parameter specifies the ColumnStore version in use. The only allowed value is 1.5.

admin_port

This optional parameter specifies the port of the ColumnStore administrative daemon. The default value is 8640. Note that the daemons of all nodes must be listening on the same port.

admin_base_path

This optional parameter specifies the base path of the ColumnStore administrative daemon. The default value is /cmapi/0.4.0.

api_key

This optional parameter specifies the API key to be used in the communication with the ColumnStore administrative daemon. If no key is specified, a key will be generated and stored in the file api_key.txt in a directory with the same name as the monitor, under the MaxScale data directory. Typically that will be /var/lib/maxscale/<monitor-section>/api_key.txt.

Note that ColumnStore will store the first key provided and thereafter require it, so changing the key requires resetting the key on the ColumnStore nodes as well.

local_address

This parameter specifies the IP address that MaxScale should tell the ColumnStore nodes it resides at. Either it or local_address at the global level in the MaxScale configuration file must be specified. If both have been specified, the one specified for the monitor takes precedence.

dynamic_node_detection

This optional boolean parameter specifies whether the monitor should autonomously figure out the ColumnStore cluster configuration or whether it should solely rely upon the monitor configuration in the configuration file. Please see Dynamic Node Detection for a thorough discussion on the meaning of the parameter. The default value is false.

cluster_monitor_interval

This optional parameter, meaningful only if dynamic_node_detection is true, specifies how often the monitor should probe the ColumnStore cluster and adapt to any changes that have occurred in the number of nodes of the cluster. The default value is 10s, that is, the cluster configuration is probed every 10 seconds.

Note that as the probing is performed at the regular monitor round, the value should be some multiple of monitor_interval.
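For instance, the following illustrative monitor section keeps cluster_monitor_interval at an exact multiple of monitor_interval (the values themselves are examples, not recommendations):

```ini
[CsMonitor]
type=monitor
module=csmon
servers=CsBootstrap1,CsBootstrap2
dynamic_node_detection=true
monitor_interval=5s
# probed every second monitor round: 2 x 5s = 10s
cluster_monitor_interval=10s
```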

Dynamic Node Detection

NOTE If dynamic node detection is used, the network setup must be such that the hostname/IP-address of a ColumnStore node is the same when viewed both from MaxScale and from another node.

By default, the ColumnStore monitor behaves like the regular MariaDB monitor. That is, it only monitors the servers it has been configured with.

If dynamic_node_detection has been enabled, the behaviour of the monitor changes significantly. Instead of being explicitly told which servers it should monitor, the monitor is only told how to get into contact with the cluster whereafter it autonomously figures out the cluster configuration and creates dynamic server entries accordingly.

When dynamic node detection is enabled, the servers the monitor has been configured with are only used for "bootstrapping" the monitor, because at the initial startup the monitor does not otherwise know how to get into contact with the cluster.

The following shows a configuration using dynamic node detection.

[CsBootstrap1]
type=server
address=mcs1
port=3306

[CsBootstrap2]
type=server
address=mcs2
port=3306

[CsMonitor]
type=monitor
module=csmon
servers=CsBootstrap1, CsBootstrap2
dynamic_node_detection=true
...

As can be seen, the server entries look just like any other server entries, but to make them stand out and to indicate what they are used for, they have the word bootstrap in their name.

In principle, a single entry is sufficient, but to cater for the case that a node happens to be down, it is advisable to have more than one. Once the monitor has been able to connect to a node, it will fetch the configuration and store information about the nodes locally. On subsequent startups, the monitor will use the bootstrap information only if it cannot connect using the persisted information. Also, if there has been any change in the bootstrap servers, the persisted information is not used.

Based on the information obtained from the cluster itself, the monitor will create dynamic server instances that are named as @@ followed by the monitor name, followed by a :, followed by the hostname.

If the cluster in fact consists of three nodes, then the output of maxctrl list servers may look like:

┌──────────────────┬─────────┬──────┬─────────────┬─────────────────┬──────┐
│ Server           │ Address │ Port │ Connections │ State           │ GTID │
├──────────────────┼─────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@CSMonitor:mcs2 │ mcs2    │ 3306 │ 0           │ Slave, Running  │      │
├──────────────────┼─────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@CSMonitor:mcs3 │ mcs3    │ 3306 │ 0           │ Master, Running │      │
├──────────────────┼─────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@CSMonitor:mcs1 │ mcs1    │ 3306 │ 0           │ Slave, Running  │      │
├──────────────────┼─────────┼──────┼─────────────┼─────────────────┼──────┤
│ CsBootstrap1     │ mcs1    │ 3306 │ 0           │ Slave, Running  │      │
├──────────────────┼─────────┼──────┼─────────────┼─────────────────┼──────┤
│ CsBootstrap2     │ mcs2    │ 3306 │ 0           │ Slave, Running  │      │
└──────────────────┴─────────┴──────┴─────────────┴─────────────────┴──────┘

Note that there will be dynamic server entries also for the nodes for which there is a bootstrap entry.

When the service is defined, it is imperative that it does not explicitly refer to either the bootstrap or the dynamic entries. Instead, it should refer to the monitor using the cluster parameter.

[RWS]
type=service
router=readwritesplit
cluster=CsMonitor
...

With this configuration the RWS service will automatically adapt to any changes made to the ColumnStore cluster.

Commands

The ColumnStore monitor provides module commands with which the ColumnStore cluster can be managed. The commands can be invoked via the REST API with a client such as curl, or using maxctrl.

All commands require the monitor instance name as the first parameter. Additional parameters must be provided depending on the command.

Note that as maxctrl itself has a timeout of 10 seconds, if a timeout larger than that is provided to any command, the timeout of maxctrl must also be increased. For instance:

maxctrl --timeout 30s call command csmon shutdown CsMonitor 20s

Here a 30-second timeout is specified for maxctrl to ensure that it does not expire before the 20-second timeout provided for the shutdown command does.

The output is always a JSON object.

In the following, assume a configuration like this:

[CsNode1]
type=server
...

[CsNode2]
type=server
...

[CsMonitor]
type=monitor
module=csmon
servers=CsNode1,CsNode2
...

start

Starts the ColumnStore cluster.

call command csmon start <monitor-name> <timeout>

Example

call command csmon start CsMonitor 20s

shutdown

Shuts down the ColumnStore cluster.

call command csmon shutdown <monitor-name> <timeout>

Example

call command csmon shutdown CsMonitor 20s

status

Get the status of the ColumnStore cluster.

call command csmon status <monitor-name> [<server>]

Returns the status of the cluster or the status of a specific server.

Example

call command csmon status CsMonitor
call command csmon status CsMonitor CsNode1

mode-set

Sets the mode of the cluster.

call command csmon mode-set <monitor-name> (readonly|readwrite) <timeout>

Example

call command csmon mode-set CsMonitor readonly 20s

config-get

Returns the cluster configuration.

call command csmon config-get <monitor-name> [<server-name>]

If no server is specified, the configuration is fetched from the first server in the monitor configuration, otherwise from the specified server.

Note that if everything is in order, the returned configuration should be identical regardless of the server it is fetched from.

Example

call command csmon config-get CsMonitor CsNode2

add-node

Adds a new node, located on the server at the hostname or IP address host, to the ColumnStore cluster.

call command csmon add-node <monitor-name> <host> <timeout>

Example

call command csmon add-node CsMonitor mcs2 20s

For a more complete example, please refer to adding a node.

remove-node

Removes the node located on the server at the hostname or IP address host from the ColumnStore cluster.

call command csmon remove-node <monitor-name> <host> <timeout>

Example

call command csmon remove-node CsMonitor mcs2 20s

For a more complete example, please refer to removing a node.

Example

The following is an example of a csmon configuration.

[CSMonitor]
type=monitor
module=csmon
version=1.5
servers=CsNode1,CsNode2
user=myuser
password=mypwd
monitor_interval=5s
api_key=somekey1234

Adding a Node

Note that in the following dynamic_node_detection is not used, but the monitor is configured in the traditional way. The impact of dynamic_node_detection is described here.

Adding a new node to a ColumnStore cluster can be performed dynamically at runtime, but it must be done in two steps. First, the node is added to ColumnStore and then, the corresponding server object (that possibly has to be created) in the MaxScale configuration is added to the ColumnStore monitor.

In the following, assume a two-node ColumnStore cluster and an initial MaxScale configuration like this:

[CsNode1]
type=server
...

[CsNode2]
type=server
...

[CsMonitor]
type=monitor
module=csmon
servers=CsNode1,CsNode2
...

Invoking maxctrl list servers will now show:

┌─────────┬─────────────┬──────┬─────────────┬─────────────────┬──────┐
│ Server  │ Address     │ Port │ Connections │ State           │ GTID │
├─────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ CsNode1 │ 10.10.10.10 │ 3306 │ 0           │ Master, Running │      │
├─────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ CsNode2 │ 10.10.10.11 │ 3306 │ 0           │ Slave, Running  │      │
└─────────┴─────────────┴──────┴─────────────┴─────────────────┴──────┘

If we now want to add a new ColumnStore node, located at mcs3/10.10.10.12 to the cluster, the steps are as follows.

First, the node is added:

maxctrl --timeout 30s call command csmon add-node CsMonitor mcs3 20s

After a while the following is output:

{
    "links": {
        "self": "http://localhost:8989/v1/maxscale/modules/csmon/add-node"
    },
    "meta": {
        "message": "Node mcs3 successfully added to cluster.",
        "result": {
            "node_id": "mcs3",
            "timestamp": "2020-08-07 10:03:49.474539"
        },
        "success": true
    }
}

At this point, the ColumnStore cluster consists of three nodes. However, the ColumnStore monitor is not yet aware of the new node.

First we need to create the corresponding server object.

maxctrl create server CsNode3 10.10.10.12

Invoking maxctrl list servers will now show:

┌─────────┬─────────────┬──────┬─────────────┬─────────────────┬──────┐
│ Server  │ Address     │ Port │ Connections │ State           │ GTID │
├─────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ CsNode3 │ 10.10.10.12 │ 3306 │ 0           │ Down            │      │
├─────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ CsNode1 │ 10.10.10.10 │ 3306 │ 0           │ Master, Running │      │
├─────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ CsNode2 │ 10.10.10.11 │ 3306 │ 0           │ Slave, Running  │      │
└─────────┴─────────────┴──────┴─────────────┴─────────────────┴──────┘

The server CsNode3 has been created, but its state is Down since it is not yet being monitored. This can also be seen in the output of maxctrl list monitors:

┌───────────┬─────────┬──────────────────┐
│ Monitor   │ State   │ Servers          │
├───────────┼─────────┼──────────────────┤
│ CsMonitor │ Running │ CsNode1, CsNode2 │
└───────────┴─────────┴──────────────────┘

It must now be added to the monitor.

maxctrl link monitor CsMonitor CsNode3

Now the server is monitored and maxctrl list monitors shows:

┌───────────┬─────────┬───────────────────────────┐
│ Monitor   │ State   │ Servers                   │
├───────────┼─────────┼───────────────────────────┤
│ CsMonitor │ Running │ CsNode1, CsNode2, CsNode3 │
└───────────┴─────────┴───────────────────────────┘

The state of the new node is now also set correctly, as shown by maxctrl list servers.

┌─────────┬─────────────┬──────┬─────────────┬─────────────────┬──────┐
│ Server  │ Address     │ Port │ Connections │ State           │ GTID │
├─────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ CsNode3 │ 10.10.10.12 │ 3306 │ 0           │ Slave, Running  │      │
├─────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ CsNode1 │ 10.10.10.10 │ 3306 │ 0           │ Master, Running │      │
├─────────┼─────────────┼──────┼─────────────┼─────────────────┼──────┤
│ CsNode2 │ 10.10.10.11 │ 3306 │ 0           │ Slave, Running  │      │
└─────────┴─────────────┴──────┴─────────────┴─────────────────┴──────┘

Note that the MaxScale server object can be created at any point, but it must not be added to the monitor before the node has been added to the ColumnStore cluster using call command csmon add-node.

Impact of dynamic_node_detection

If dynamic_node_detection is enabled, there is no need to create any explicit server entries. All that needs to be done is to add the node, and the monitor will adapt automatically. Note that it does not matter whether the node is added indirectly via MaxScale or directly using the REST API of ColumnStore. The only difference is that in the former case MaxScale may detect the new situation slightly faster.

Removing a Node

Note that in the following dynamic_node_detection is not used, but the monitor is configured in the traditional way. The impact of dynamic_node_detection is described here.

Removing a node should be performed in the reverse order of how a node was added. First, the MaxScale server should be removed from the monitor. Then, the node should be removed from the ColumnStore cluster.

Suppose we want to remove the ColumnStore node at mcs3/10.10.10.12 and the current situation is as follows:

┌───────────┬─────────┬───────────────────────────┐
│ Monitor   │ State   │ Servers                   │
├───────────┼─────────┼───────────────────────────┤
│ CsMonitor │ Running │ CsNode1, CsNode2, CsNode3 │
└───────────┴─────────┴───────────────────────────┘

First, the server is removed from the monitor.

maxctrl unlink monitor CsMonitor CsNode3

Checking with maxctrl list monitors we see that the server has indeed been removed.

┌───────────┬─────────┬──────────────────┐
│ Monitor   │ State   │ Servers          │
├───────────┼─────────┼──────────────────┤
│ CsMonitor │ Running │ CsNode1, CsNode2 │
└───────────┴─────────┴──────────────────┘

Now the node can be removed from the cluster itself.

maxctrl --timeout 30s call command csmon remove-node CsMonitor mcs3 20s
{
    "links": {
        "self": "http://localhost:8989/v1/maxscale/modules/csmon/remove-node"
    },
    "meta": {
        "message": "Node mcs3 removed from the cluster.",
        "result": {
            "node_id": "mcs3",
            "timestamp": "2020-08-07 11:41:36.573425"
        },
        "success": true
    }
}

Impact of dynamic_node_detection

If dynamic_node_detection is enabled, there is in general no need to explicitly remove a static server entry (as there never was one in the first place). The only exception is if the removed node happened to be a bootstrap server. In that case, the server entry should be removed from the monitor's list of servers (used as bootstrap nodes). If that is not done, then the monitor will log a warning at each startup.


MaxScale 22.08 Common Monitor Parameters

Common Monitor Parameters

This document describes the settings supported by all monitors. These should be defined in the monitor section of the configuration file.

  • Common Monitor Parameters

    • Parameters

      • module

      • user

      • password

      • servers

      • monitor_interval

      • backend_connect_timeout

      • backend_write_timeout

      • backend_read_timeout

      • backend_connect_attempts

      • disk_space_threshold

      • disk_space_check_interval

      • script

      • script_timeout

      • events

      • journal_max_age

    • Monitor Crash Safety

    • Script example

Parameters

module

  • Type: string

  • Mandatory: Yes

  • Dynamic: No

The monitor module this monitor should use. Typically mariadbmon or galeramon.

user

  • Type: string

  • Mandatory: Yes

  • Dynamic: Yes

Username used by the monitor to connect to the backend servers. If a server defines the monitoruser parameter, that value will be used instead.

password

  • Type: string

  • Mandatory: Yes

  • Dynamic: Yes

Password for the user defined with the user parameter. If a server defines the monitorpw parameter, that value will be used instead.

Note: In older versions of MaxScale this parameter was called passwd. The use of passwd was deprecated in MaxScale 2.3.0.

servers

  • Type: string

  • Mandatory: Yes

  • Dynamic: Yes

A comma-separated list of servers the monitor should monitor.

servers=MyServer1,MyServer2

monitor_interval

  • Type: duration

  • Mandatory: No

  • Dynamic: Yes

  • Default: 2s

Defines how often the monitor updates the status of the servers. Choose a lower value if servers should be queried more often. The smallest possible value is 100 milliseconds. If querying the servers takes longer than monitor_interval, the effective update rate is reduced.

monitor_interval=2s

The interval is specified as documented here. If no explicit unit is provided, the value is interpreted as milliseconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected.

backend_connect_timeout

  • Type: duration

  • Mandatory: No

  • Dynamic: Yes

  • Default: 3s

This parameter controls the timeout for connecting to a monitored server. The interval is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the timeout is seconds, a timeout specified in milliseconds will be rejected, even if the duration is longer than a second. The minimum value is 1 second.

backend_connect_timeout=3s

backend_write_timeout

  • Type: duration

  • Mandatory: No

  • Dynamic: Yes

  • Default: 3s

This parameter controls the timeout for writing to a monitored server. The timeout is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the timeout is seconds, a timeout specified in milliseconds will be rejected, even if the duration is longer than a second. The minimum value is 1 second.

backend_write_timeout=3s

backend_read_timeout

  • Type: duration

  • Mandatory: No

  • Dynamic: Yes

  • Default: 3s

This parameter controls the timeout for reading from a monitored server. The timeout is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the timeout is seconds, a timeout specified in milliseconds will be rejected, even if the duration is longer than a second. The minimum value is 1 second.

backend_read_timeout=3s

backend_connect_attempts

  • Type: number

  • Mandatory: No

  • Dynamic: Yes

  • Default: 1

This parameter defines the maximum number of times a backend connection is attempted during each monitoring loop. Each attempt may take up to backend_connect_timeout seconds to perform. If none of the attempts are successful, the backend is considered unreachable and down.

backend_connect_attempts=1
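As a worked example (with a hypothetical monitor section name), three attempts combined with the default 3-second connect timeout give a worst case of 3 × 3s = 9s before a node is declared down:

```ini
[MyMonitor]
type=monitor
module=mariadbmon
servers=MyServer1,MyServer2
backend_connect_attempts=3
# each attempt may take up to backend_connect_timeout (default 3s),
# so a node may take up to 9 seconds to be marked as down
backend_connect_timeout=3s
```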

disk_space_threshold

  • Type: string

  • Mandatory: No

  • Dynamic: Yes

  • Default: None

This parameter duplicates the disk_space_threshold server parameter. If the parameter has not been specified for a server, then the one specified for the monitor is applied.

NOTE: Since MariaDB 10.4.7, MariaDB 10.3.17 and MariaDB 10.2.26, the information will be available only if the monitor user has the FILE privilege.

That is, if the disk configuration is the same on all servers monitored by the monitor, it is sufficient (and more convenient) to specify the disk space threshold in the monitor section, but if the disk configuration is different on all or some servers, then the disk space threshold can be specified individually for each server.

For example, suppose server1, server2 and server3 are identical in all respects. In that case we can specify disk_space_threshold in the monitor.

[server1]
type=server
...

[server2]
type=server
...

[server3]
type=server
...

[monitor]
type=monitor
servers=server1,server2,server3
disk_space_threshold=/data:80
...

However, if the servers are heterogeneous with the disk used for the data directory mounted on different paths, then the disk space threshold must be specified separately for each server.

[server1]
type=server
disk_space_threshold=/data:80
...

[server2]
type=server
disk_space_threshold=/Data:80
...

[server3]
type=server
disk_space_threshold=/DBData:80
...

[monitor]
type=monitor
servers=server1,server2,server3
...

If most of the servers have the data directory disk mounted on the same path, then the disk space threshold can be specified on the monitor and separately on the server with a different setup.

[server1]
type=server
disk_space_threshold=/DbData:80
...

[server2]
type=server
...

[server3]
type=server
...

[monitor]
type=monitor
servers=server1,server2,server3
disk_space_threshold=/data:80
...

Above, server1 has the disk used for the data directory mounted at /DbData while both server2 and server3 have it mounted on /data, and thus the setting in the monitor covers them both.

disk_space_check_interval

  • Type: duration

  • Mandatory: No

  • Dynamic: Yes

  • Default: 0s

This parameter specifies the minimum amount of time between disk space checks. The interval is specified as documented here. If no explicit unit is provided, the value is interpreted as milliseconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. The default value is 0, which means that by default the disk space will not be checked.

Note that as the checking is made as part of the regular monitor interval cycle, the disk space check interval is affected by the value of monitor_interval. In particular, even if the value of disk_space_check_interval is smaller than that of monitor_interval, the checking will still take place at monitor_interval intervals.

script

  • Type: string

  • Mandatory: No

  • Dynamic: Yes

  • Default: None

This command will be executed on a server state change. The parameter should be an absolute path to a command or the command should be in the executable path. The user running MaxScale should have execution rights to the file itself and the directory it resides in. The script may have placeholders which MaxScale will substitute with useful information when launching the script.

The placeholders and their substitution results are:

  • $INITIATOR -> IP and port of the server which initiated the event

  • $EVENT -> event description, e.g. "server_up"

  • $LIST -> list of IPs and ports of all servers

  • $NODELIST -> list of IPs and ports of all running servers

  • $SLAVELIST -> list of IPs and ports of all slave servers

  • $MASTERLIST -> list of IPs and ports of all master servers

  • $SYNCEDLIST -> list of IPs and ports of all synced Galera nodes

  • $PARENT -> IP and port of the parent of the server which initiated the event. For master-slave setups, this will be the master if the initiating server is a slave.

  • $CHILDREN -> list of IPs and ports of the child nodes of the server who initiated the event. For master-slave setups, this will be a list of slave servers if the initiating server is a master.

The expanded variable value can be an empty string if no servers match the variable's requirements. For example, if no masters are available $MASTERLIST will expand into an empty string. The list-type substitutions will only contain servers monitored by the current monitor.

script=/home/user/myscript.sh initiator=$INITIATOR event=$EVENT live_nodes=$NODELIST

The above script could be executed as:

/home/user/myscript.sh initiator=[192.168.0.10]:3306 event=master_down live_nodes=[192.168.0.201]:3306,[192.168.0.121]:3306

See section Script example below for an example script.

Any output by the executed script will be logged into the MaxScale log. Each outputted line will be logged as a separate log message.

The log level on which the messages are logged depends on the format of the messages. If the first word in the output line is one of alert:, error:, warning:, notice:, info: or debug:, the message will be logged on the corresponding level. If the message is not prefixed with one of the keywords, the message will be logged on the notice level. Whitespace before, after or between the keyword and the colon is ignored and the matching is case-insensitive.
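As a sketch of how these prefixes can be used (a hypothetical script, not part of MaxScale), a monitor script can choose the MaxScale log level simply by prefixing its output lines:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a line that MaxScale would log at the
# level named by the prefix; unprefixed lines are logged as notice.
log_event() {
    case "$1" in
        master_down) echo "error: master went down" ;;
        slave_down)  echo "warning: a slave went down" ;;
        *)           echo "notice: event $1 received" ;;
    esac
}

log_event "$1"
```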

Currently, the script must not execute any of the following MaxCtrl calls as they cause a deadlock:

  • alter monitor to the monitor executing the script

  • stop monitor to the monitor executing the script

  • call command to a MariaDB-Monitor that is executing the script

script_timeout

  • Type: duration

  • Mandatory: No

  • Dynamic: Yes

  • Default: 90s

The timeout for the executed script. The interval is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the timeout is seconds, a timeout specified in milliseconds will be rejected, even if the duration is longer than a second.

If the script execution exceeds the configured timeout, it is stopped by sending a SIGTERM signal to it. If the process does not stop, a SIGKILL signal will be sent to it once the execution time is greater than twice the configured timeout.
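This term-then-kill pattern is the same one implemented by the coreutils timeout utility, which can be used to try the behavior out (durations scaled down here; assumes GNU coreutils):

```shell
# Send SIGTERM after 1 second; if the process is still alive 1 second
# later, send SIGKILL. Exit status 124 means the command timed out.
timeout --signal=TERM --kill-after=1s 1s sleep 5
echo "exit status: $?"
```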

events

  • Type: enum

  • Mandatory: No

  • Dynamic: Yes

  • Values: master_down, master_up, slave_down, slave_up, server_down, server_up, lost_master, lost_slave, new_master, new_slave

  • Default: All events

A list of event names which cause the script to be executed. If this option is not defined, all events cause the script to be executed. The list must contain a comma separated list of event names.

events=master_down,slave_down

The following are all the possible event types and their descriptions.

  • master_down: A Master server has gone down

  • master_up: A Master server has come up

  • slave_down: A Slave server has gone down

  • slave_up: A Slave server has come up

  • server_down: A server with no assigned role has gone down

  • server_up: A server with no assigned role has come up

  • lost_master: A server lost Master status

  • lost_slave: A server lost Slave status

  • new_master: A new Master was detected

  • new_slave: A new Slave was detected

journal_max_age

  • Type: duration

  • Mandatory: No

  • Dynamic: Yes

  • Default: 28800s

The maximum journal file age. The interval is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the max age is seconds, a max age specified in milliseconds will be rejected, even if the duration is longer than a second.

When the monitor starts, it reads any stored journal files. If the journal file is older than the value of journal_max_age, it will be removed and the monitor starts with no prior knowledge of the servers.

Monitor Crash Safety

Starting with MaxScale 2.2.0, the monitor modules keep an on-disk journal of the latest server states. This change makes the monitors crash-safe when options that introduce states are used. It also allows the monitors to retain stateful information when MaxScale is restarted.

For the MySQL monitor, the options that introduce states into the monitoring process are detect_stale_master and detect_stale_slave, both of which are enabled by default. Galeramon has the disable_master_failback parameter which introduces a state.

The default location for the server state journal is /var/lib/maxscale/<monitor name>/monitor.dat where <monitor name> is the name of the monitor section in the configuration file. If MaxScale crashes or is shut down in an uncontrolled fashion, the journal will be read when MaxScale is started. To skip the recovery process, manually delete the journal file before starting MaxScale.

Script example

Below is an example monitor configuration which launches a script with all supported substitutions. The example script reads the arguments, writes the resulting message to a file and sends it as email.

[MyMonitor]
type=monitor
module=mariadbmon
servers=C1N1,C1N2,C1N3
user=maxscale
password=password
monitor_interval=10s
script=/path/to/maxscale_monitor_alert_script.sh --initiator=$INITIATOR --parent=$PARENT --children=$CHILDREN --event=$EVENT --node_list=$NODELIST --list=$LIST --master_list=$MASTERLIST --slave_list=$SLAVELIST --synced_list=$SYNCEDLIST

File "maxscale_monitor_alert_script.sh":

#!/usr/bin/env bash

initiator=""
parent=""
children=""
event=""
node_list=""
list=""
master_list=""
slave_list=""
synced_list=""

process_arguments()
{
    while [ "$1" != "" ]; do
        if [[ "$1" =~ ^--initiator=.* ]]; then
            initiator=${1#'--initiator='}
        elif [[ "$1" =~ ^--parent.* ]]; then
            parent=${1#'--parent='}
        elif [[ "$1" =~ ^--children.* ]]; then
            children=${1#'--children='}
        elif [[ "$1" =~ ^--event.* ]]; then
            event=${1#'--event='}
        elif [[ "$1" =~ ^--node_list.* ]]; then
            node_list=${1#'--node_list='}
        elif [[ "$1" =~ ^--list.* ]]; then
            list=${1#'--list='}
        elif [[ "$1" =~ ^--master_list.* ]]; then
            master_list=${1#'--master_list='}
        elif [[ "$1" =~ ^--slave_list.* ]]; then
            slave_list=${1#'--slave_list='}
        elif [[ "$1" =~ ^--synced_list.* ]]; then
            synced_list=${1#'--synced_list='}
        fi
        shift
    done
}

process_arguments $@

read -r -d '' MESSAGE << EOM
A server has changed state. The following information was provided:

Initiator: $initiator
Parent: $parent
Children: $children
Event: $event
Node list: $node_list
List: $list
Master list: $master_list
Slave list: $slave_list
Synced list: $synced_list
EOM

# print message to file
echo "$MESSAGE" > /path/to/script_output.txt

# email the message
echo "$MESSAGE" | mail -s "MaxScale received $event event for initiator $initiator." mariadb_admin@domain.com

CC BY-SA / Gnu FDL

MaxScale 22.08 Galera Monitor

Galera Monitor

  • Galera Monitor

    • Overview

      • WSREP Variables and Their Effects

      • Galera clusters and slaves replicating from it

    • Required Grants

    • Configuration

    • Common Monitor Parameters

    • Galera Monitor optional parameters

      • disable_master_failback

      • available_when_donor

      • disable_master_role_setting

      • use_priority

      • root_node_as_master

      • set_donor_nodes

    • Interaction with Server Priorities

      • Switchover

Overview

The Galera Monitor is a monitoring module for MaxScale that monitors a Galera cluster. It detects whether nodes are a part of the cluster and if they are in sync with the rest of the cluster. It can also assign master and slave roles inside MaxScale, allowing Galera clusters to be used with modules designed for traditional master-slave clusters.

By default, the Galera Monitor will choose the node with the lowest wsrep_local_index value as the master. This will mean that two MaxScales running on different servers will choose the same server as the master.

WSREP Variables and Their Effects

The following WSREP variables are inspected by galeramon to see whether a node is usable. If the node is not usable, it loses the Master and Slave labels and will be in the Running state.

  • If wsrep_ready=0, the WSREP system is not yet ready and the Galera node cannot accept queries.

  • If wsrep_desync=1 is set, the node is desynced and is not participating in the Galera replication.

  • If wsrep_reject_queries=[ALL|ALL_KILL] is set, queries are refused and the node is unusable.

  • With wsrep_sst_donor_rejects_queries=1, donor nodes reject queries. Galeramon treats this the same as if wsrep_reject_queries=ALL was set.

  • If wsrep_local_state is not 4 (or 2 with available_when_donor=true), the node is not in the correct state and is not used.
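A minimal sketch of these usability rules as a Python predicate may help; this is a hypothetical helper written for illustration, not MaxScale source code:

```python
# Sketch of galeramon's node-usability checks described above.
# A node failing any check loses its Master/Slave labels and stays Running.

def galera_node_usable(wsrep_ready, wsrep_desync, wsrep_reject_queries,
                       wsrep_sst_donor_rejects_queries, wsrep_local_state,
                       available_when_donor=False):
    """Return True if the node keeps its Master/Slave label."""
    if wsrep_ready == 0:                        # WSREP not ready for queries
        return False
    if wsrep_desync == 1:                       # node desynced from the cluster
        return False
    if wsrep_reject_queries in ("ALL", "ALL_KILL"):
        return False                            # queries explicitly refused
    if wsrep_sst_donor_rejects_queries == 1 and wsrep_local_state == 2:
        return False                            # donor rejecting queries
    # state 4 = Synced; state 2 = Donor, allowed only with available_when_donor
    ok_states = {4} | ({2} if available_when_donor else set())
    return wsrep_local_state in ok_states

print(galera_node_usable(1, 0, "NONE", 0, 4))        # synced node -> True
print(galera_node_usable(1, 0, "NONE", 0, 2))        # donor -> False by default
print(galera_node_usable(1, 0, "NONE", 0, 2, True))  # donor allowed -> True
```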

Galera clusters and slaves replicating from it

MaxScale 2.4.0 added support for slaves replicating off of Galera nodes. If a non-Galera server monitored by galeramon is replicating from a Galera node also monitored by galeramon, it will be assigned the Slave, Running status as long as the replication works. This allows read-scaleout with Galera servers without increasing the size of the Galera cluster.

Required Grants

The Galera Monitor requires the REPLICA MONITOR grant to work:

CREATE USER 'maxscale'@'maxscalehost' IDENTIFIED BY 'maxscale-password';
GRANT REPLICA MONITOR ON *.* TO 'maxscale'@'maxscalehost';

With MariaDB Server 10.4 and earlier, REPLICATION CLIENT is required instead.

GRANT REPLICATION CLIENT ON *.* TO 'maxscale'@'maxscalehost';

If set_donor_nodes is configured, the SUPER grant is required:

GRANT SUPER ON *.* TO 'maxscale'@'maxscalehost';

Configuration

A minimal configuration for a monitor requires a set of servers for monitoring and a username and a password to connect to these servers. The user requires the REPLICATION CLIENT privilege to successfully monitor the state of the servers.

[Galera-Monitor]
type=monitor
module=galeramon
servers=server1,server2,server3
user=myuser
password=mypwd

Common Monitor Parameters

For a list of optional parameters that all monitors support, read the Monitor Common document.

Galera Monitor optional parameters

These are optional parameters specific to the Galera Monitor.

disable_master_failback

  • Type: boolean

  • Default: false

  • Dynamic: Yes

If a node marked as master inside MaxScale fails, the master status is assigned to another node, and normally MaxScale returns the master status to the original node once it comes back up. With this option enabled, if the master status is assigned to a new node, it will not be reassigned to the original node for as long as the new master node is running. In this case the Master Stickiness status bit is set, which is visible in the maxctrl list servers output.

available_when_donor

  • Type: boolean

  • Default: false

  • Dynamic: Yes

This option allows Galera nodes to be used normally when they are donors in an SST operation when the SST method is non-blocking (e.g. wsrep_sst_method=mariadb-backup).

Normally when an SST is performed, both participating nodes lose their Synced, Master or Slave statuses. When this option is enabled, the donor is treated as if it was a normal member of the cluster (i.e. wsrep_local_state = 4). This is especially useful if the cluster drops down to one node and an SST is required to increase the cluster size.

The current list of non-blocking SST methods are xtrabackup, xtrabackup-v2 and mariadb-backup. Read the wsrep_sst_method documentation for more details.

disable_master_role_setting

  • Type: boolean

  • Default: false

  • Dynamic: Yes

This disables the assignment of master and slave roles to the Galera cluster nodes. If this option is enabled, Synced is the only status assigned by this monitor.

use_priority

  • Type: boolean

  • Default: false

  • Dynamic: Yes

Enable interaction with server priorities. This will allow the monitor to deterministically pick the write node for the monitored Galera cluster and will allow for controlled node replacement.

root_node_as_master

  • Type: boolean

  • Default: false

  • Dynamic: Yes

This option controls whether the write master Galera node requires a wsrep_local_index value of 0. This option was introduced in MaxScale 2.1.0 and it is disabled by default in versions 2.1.5 and newer. In versions 2.1.4 and older, the option was enabled by default.

A Galera cluster will always have a node which has a wsrep_local_index value of 0. Based on this information, multiple MaxScale instances can always pick the same node for writes.

If the root_node_as_master option is disabled for galeramon, the node with the lowest index will always be chosen as the master. If it is enabled, only the node with a wsrep_local_index value of 0 can be chosen as the master.

This parameter can work with disable_master_failback but using them together is not advisable. The intention of root_node_as_master is to ensure that all MaxScale instances configured to use the same Galera cluster send writes to the same node. If disable_master_failback is enabled, this no longer holds: if the Galera cluster reorganizes itself so that a different node gets node index 0, writes still go to the old node that previously had index 0. A restart of one of the MaxScales, or a new MaxScale joining the cluster, will then cause writes to be sent to the wrong node, resulting in an increased rate of deadlock errors and sub-optimal performance.

set_donor_nodes

  • Type: boolean

  • Default: false

  • Dynamic: Yes

This option controls whether the global variable wsrep_sst_donor should be set in each cluster node with 'Slave' status. The variable contains a list of slave servers, automatically sorted, with possible master candidates at its end.

The sorting is based either on wsrep_local_index or node server priority, depending on the value of the use_priority option. If no server has a priority defined, the sorting falls back to wsrep_local_index. Node names are collected by fetching the value of the variable wsrep_node_name.

Example of variable being set in all slave nodes, assuming three nodes:

SET GLOBAL wsrep_sst_donor = "galera001,galera000"

Note: in order to set the global variable wsrep_sst_donor, proper privileges are required for the monitor user that connects to cluster nodes. This option is disabled by default and was introduced in MaxScale 2.1.0.

Interaction with Server Priorities

If the use_priority option is set and a server is configured with the priority=<int> parameter, galeramon will use that as the basis on which the master node is chosen. This requires the disable_master_role_setting to be undefined or disabled. The server with the lowest positive value of priority will be chosen as the master node when a replacement Galera node is promoted to a master server inside MaxScale. If all candidate servers have the same priority, the order of the servers in the servers parameter dictates which is chosen as the master.

Nodes with a negative value (priority < 0) will never be chosen as the master. This allows you to mark some servers as permanent slaves by assigning a negative value to priority. Nodes with the default priority of 0 are only selected if no nodes with a positive priority are present, and the normal node selection rules apply to them (i.e. selection is based on wsrep_local_index).

Here is an example.

[node-1]
type=server
address=192.168.122.101
port=3306
priority=1

[node-2]
type=server
address=192.168.122.102
port=3306
priority=3

[node-3]
type=server
address=192.168.122.103
port=3306
priority=2

[node-4]
type=server
address=192.168.122.104
port=3306
priority=-1

In this example node-1 is always used as the master if available. If node-1 is not available, the node with the next-lowest positive priority value is used: in this case it would be node-3. If both node-1 and node-3 were down, then node-2 would be used. Because node-4 has a priority of -1, it will never be the master. Nodes without the priority parameter are treated as having a priority of 0 and are used only if all nodes with a positive priority value are unavailable.

With priority ranks you can control the order in which MaxScale chooses the master node. This will allow for a controlled failure and replacement of nodes.
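The selection rules above can be sketched in Python; this is a hypothetical illustration of the documented behavior, not galeramon's actual implementation:

```python
# Pick the master according to galeramon's documented priority rules:
# lowest positive priority wins, ties broken by servers= order, priority 0
# falls back to wsrep_local_index, negative priority is never eligible.

def choose_master(nodes):
    """nodes: list of (name, priority, wsrep_local_index), in servers= order."""
    positive = [n for n in nodes if n[1] > 0]
    if positive:
        best = min(p for _, p, _ in positive)
        # ties resolved by position in the servers= parameter
        return next(name for name, p, _ in positive if p == best)
    # only default-priority (0) nodes left: lowest wsrep_local_index wins
    zero = [n for n in nodes if n[1] == 0]
    if zero:
        return min(zero, key=lambda n: n[2])[0]
    return None  # only negative-priority nodes remain

cluster = [("node-1", 1, 0), ("node-2", 3, 1), ("node-3", 2, 2), ("node-4", -1, 3)]
print(choose_master(cluster))       # node-1
print(choose_master(cluster[1:]))   # node-3 (next-lowest positive value)
```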

Switchover

Priorities can be used to force a runtime change of the primary server in a Galera Cluster. For example, if server1 has a priority of 1 and server2 a priority of 2 (with server1 being primary), the roles can be reversed with MaxCtrl:

maxctrl alter server server1 priority=2
maxctrl alter server server2 priority=1

This does not affect the Galera Cluster itself, just the roles MaxScale assigns to the servers. If multiple MaxScales monitor the same Galera Cluster without configuration synchronization, the commands should be run on all MaxScales.


MaxScale 22.08 MariaDB Monitor

MariaDB Monitor

  • MariaDB Monitor

    • Overview

    • Required Grants

      • Cluster Manipulation Grants

    • Master selection

    • Configuration

    • Common Monitor Parameters

    • MariaDB Monitor optional parameters

      • assume_unique_hostnames

      • master_conditions

      • slave_conditions

      • failcount

      • enforce_writable_master

      • enforce_read_only_slaves

      • enforce_read_only_servers

      • maintenance_on_low_disk_space

      • cooperative_monitoring_locks

      • script_max_replication_lag

    • Cluster manipulation operations

      • Operation details

      • Manual activation

        • Queued switchover

      • Automatic activation

      • Limitations and requirements

      • External master support

      • Configuration parameters

        • auto_failover

        • auto_rejoin

        • switchover_on_low_disk_space

        • enforce_simple_topology

        • replication_user and replication_password

        • replication_master_ssl

        • failover_timeout and switchover_timeout

        • verify_master_failure

        • master_failure_timeout

        • servers_no_promotion

        • promotion_sql_file and demotion_sql_file

        • handle_events

    • Cooperative monitoring

      • Releasing locks

    • Rebuild server

      • Settings

        • ssh_user

        • ssh_keyfile

        • ssh_check_host_key

        • ssh_timeout

        • ssh_port

        • rebuild_port

      • sudoers.d configuration

    • ColumnStore commands

      • Get status

      • Add or remove node

      • Start and stop cluster

      • Set read-only or readwrite

      • Settings

        • cs_admin_port

        • cs_admin_api_key

        • cs_admin_base_path

    • Other commands

      • fetch-cmd-result

      • cancel-cmd

    • Troubleshooting

      • Failover/switchover fails

      • Slave detection shows external masters

    • Using the MariaDB Monitor With Binlogrouter

Overview

MariaDB Monitor monitors a Master-Slave replication cluster. It probes the state of the backends and assigns server roles such as master and slave, which are used by the routers when deciding where to route a query. It can also modify the replication cluster by performing failover, switchover and rejoin. Backend server versions older than MariaDB/MySQL 5.5 are not supported. Failover and other similar operations require MariaDB 10.0.2 or later.

Up until MariaDB MaxScale 2.2.0, this monitor was called MySQL Monitor.

Required Grants

The monitor user requires the following grant:

CREATE USER 'maxscale'@'maxscalehost' IDENTIFIED BY 'maxscale-password';
GRANT REPLICATION CLIENT ON *.* TO 'maxscale'@'maxscalehost';

In MariaDB Server versions 10.5.0 to 10.5.8, the monitor user instead requires REPLICATION SLAVE ADMIN:

GRANT REPLICATION SLAVE ADMIN ON *.* TO 'maxscale'@'maxscalehost';

In MariaDB Server 10.5.9 and later, REPLICA MONITOR is required:

GRANT REPLICA MONITOR ON *.* TO 'maxscale'@'maxscalehost';

If the monitor needs to query server disk space (i.e. disk_space_threshold is set), then the FILE-grant is required with MariaDB Server versions 10.4.7, 10.3.17, 10.2.26 and 10.1.41 and later.

GRANT FILE ON *.* TO 'maxscale'@'maxscalehost';

MariaDB Server 10.5.2 introduces CONNECTION ADMIN. Granting it is recommended, since it allows the monitor to log in even if the server connection limit has been reached.

GRANT CONNECTION ADMIN ON *.* TO 'maxscale'@'maxscalehost';

Cluster Manipulation Grants

If cluster manipulation operations are used, the following additional grants are required:

GRANT SUPER, RELOAD, PROCESS, SHOW DATABASES, EVENT ON *.* TO 'maxscale'@'maxscalehost';
GRANT SELECT ON mysql.user TO 'maxscale'@'maxscalehost';

As of MariaDB Server 11.0.1, the SUPER-privilege no longer contains several of its former sub-privileges. These must be given separately.

GRANT RELOAD, PROCESS, SHOW DATABASES, EVENT, SET USER, READ_ONLY ADMIN ON *.* TO 'maxscale'@'maxscalehost';
GRANT REPLICATION SLAVE ADMIN, BINLOG ADMIN, CONNECTION ADMIN ON *.* TO 'maxscale'@'maxscalehost';
GRANT SELECT ON mysql.user TO 'maxscale'@'maxscalehost';

If a separate replication user is defined (with replication_user and replication_password), it requires the following grant:

CREATE USER 'replication'@'replicationhost' IDENTIFIED BY 'replication-password';
GRANT REPLICATION SLAVE ON *.* TO 'replication'@'replicationhost';

Master selection

Only one backend can be master at any given time. A master must be running (successfully connected to by the monitor) and its read_only-setting must be off. A master may not be replicating from another server in the monitored cluster unless the master is part of a multimaster group. Master selection prefers to select the server with the most slaves, possibly in multiple replication layers. Only slaves reachable by a chain of running relays or directly connected to the master count. When multiple servers are tied for master status, the server which appears earlier in the servers-setting of the monitor is selected.

Servers in a cyclical replication topology (multimaster group) are interpreted as having all the servers in the group as slaves. Even from a multimaster group only one server is selected as the overall master.

After a master has been selected, the monitor prefers to stick with the choice even if other potential masters with more slave servers are available. Only if the current master is clearly unsuitable does the monitor try to select another master. An existing master turns invalid if:

  1. It is unwritable (read_only is on).

  2. It has been down for more than failcount monitor passes and has no running slaves. Running slaves behind a downed relay count. A slave in this context is any server with at least a partially running replication connection (either io or sql thread is running). The slave servers must also be down for more than failcount monitor passes to allow new master selection.

  3. It did not previously replicate from another server in the cluster but it is now replicating.

  4. It was previously part of a multimaster group but is no longer, or the multimaster group is replicating from a server not in the group.

Cases 1 and 2 cover the situations in which the DBA, an external script or even another MaxScale has modified the cluster such that the old master can no longer act as master. Cases 3 and 4 are less severe. In these cases the topology has changed significantly and the master should be re-selected, although the old master may still be the best choice.

The master change described above is different from failover and switchover described in the section Failover, switchover and auto-rejoin. A master change only modifies the server roles inside MaxScale but does not modify the cluster other than changing the targets of read and write queries. Failover and switchover perform a master change on their own.

As a general rule, it's best to avoid situations where the cluster has multiple standalone servers, separate master-slave pairs or separate multimaster groups. Due to master invalidation rule 2, a standalone master can easily lose the master status to another valid master if it goes down. The new master probably does not have the same data as the previous one. Non-standalone masters are less vulnerable, as a single running slave or multimaster group member will keep the master valid even when down.
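The preference for the server with the most slaves, with ties broken by configuration order, can be sketched as follows; this is a hypothetical helper for illustration, not MaxScale code:

```python
# Break master-selection ties as described above: the server with the most
# slaves wins, and among equals the one listed earlier in the monitor's
# servers= parameter is kept.

def select_master(candidates):
    """candidates: list of (name, slave_count) in servers= order."""
    best = None
    for name, slaves in candidates:
        if best is None or slaves > best[1]:  # strictly more slaves wins
            best = (name, slaves)             # ties keep the earlier server
    return best[0] if best else None

print(select_master([("server1", 2), ("server2", 3), ("server3", 3)]))  # server2
```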

Configuration

A minimal configuration for a monitor requires a set of servers for monitoring and a username and a password to connect to these servers.

[MyMonitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3
user=myuser
password=mypwd

From MaxScale 2.2.1 onwards, the module name is mariadbmon instead of mysqlmon. The old name can still be used.

The grants required by user depend on which monitor features are used. A full list of the grants can be found in the Required Grants section.

Common Monitor Parameters

For a list of optional parameters that all monitors support, read the Monitor Common document.

MariaDB Monitor optional parameters

These are optional parameters specific to the MariaDB Monitor. Failover, switchover and rejoin-specific parameters are listed in their own section. Rebuild-related parameters are described in the Rebuild server-section. ColumnStore parameters are described in the ColumnStore commands-section.

assume_unique_hostnames

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: true

When active, the monitor assumes that server hostnames and ports are consistent between the server definitions in the MaxScale configuration file and the "SHOW ALL SLAVES STATUS" outputs of the servers themselves. Specifically, the monitor assumes that if server A is replicating from server B, then A must have a slave connection with Master_Host and Master_Port equal to B's address and port in the configuration file. If this is not the case, e.g. an IP is used in the server while a hostname is given in the file, the monitor may misinterpret the topology. In MaxScale 2.4.1, the monitor attempts name resolution on the addresses if a simple string comparison does not find a match. Using exact matching addresses is, however, more reliable.

This setting must be ON to use any cluster operation features such as failover or switchover, because MaxScale uses the addresses and ports in the configuration file when issuing "CHANGE MASTER TO"-commands.

If the network configuration is such that the addresses MaxScale uses to connect to backends are different from the ones the servers use to connect to each other, assume_unique_hostnames should be set to OFF. In this mode, MaxScale uses server id:s it queries from the servers and the Master_Server_Id fields of the slave connections to deduce which server is replicating from which. This is not perfect though, since MaxScale doesn't know the id:s of servers it has never connected to (e.g. server has been down since MaxScale was started). Also, the Master_Server_Id-field may have an incorrect value if the slave connection has not been established. MaxScale will only trust the value if the monitor has seen the slave connection IO thread connected at least once. If this is not the case, the slave connection is ignored.

master_conditions

  • Type: enum_mask

  • Mandatory: No

  • Dynamic: Yes

  • Values: none, connecting_slave, connected_slave, running_slave, primary_monitor_master

  • Default: primary_monitor_master

Designates additional conditions for Master-status, i.e. qualified for read and write queries.

Normally, if a suitable master candidate server is found as described in Master selection, MaxScale designates it Master. master_conditions sets additional conditions for a master server. This setting is an enum_mask, allowing multiple conditions to be set simultaneously. Conditions 2, 3 and 4 refer to slave servers. If combined, a single slave must fulfill all of the given conditions for the master to be viable.

If the master candidate fails master_conditions but fulfills slave_conditions, it may be designated Slave instead.

The available conditions are:

  1. none : No additional conditions

  2. connecting_slave : At least one immediate slave (not behind relay) is attempting to replicate or is replicating from the master (Slave_IO_Running is 'Yes' or 'Connecting', Slave_SQL_Running is 'Yes'). A slave with incorrect replication credentials does not count. If the slave is currently down, results from the last successful monitor tick are used.

  3. connected_slave : Same as above, with the difference that the replication connection must be up (Slave_IO_Running is 'Yes'). If the slave is currently down, results from the last successful monitor tick are used.

  4. running_slave : Same as connecting_slave, with the addition that the slave must also be Running.

  5. primary_monitor_master : If this MaxScale is cooperating with another MaxScale and this is the secondary MaxScale, require that the candidate master is also selected by the primary MaxScale.

The default value of this setting is master_conditions=primary_monitor_master to ensure that both monitors use the same master server when cooperating.

For example, to require that the master must have a slave which is both connected and running, set

master_conditions=connected_slave,running_slave

slave_conditions

  • Type: enum_mask

  • Mandatory: No

  • Dynamic: Yes

  • Values: none, linked_master, running_master, writable_master, primary_monitor_master

  • Default: none

Designates additional conditions for Slave-status, i.e. qualified for read queries.

Normally, a server is Slave if it is at least attempting to replicate from the master candidate or a relay (Slave_IO_Running is 'Yes' or 'Connecting', Slave_SQL_Running is 'Yes', valid replication credentials). The master candidate does not necessarily need to be writable, e.g. if it fails its master_conditions. slave_conditions sets additional conditions for a slave server. This setting is an enum_mask, allowing multiple conditions to be set simultaneously.

The available conditions are:

  1. none : No additional conditions. This is the default value.

  2. linked_master : The slave must be connected to the master (Slave_IO_Running and Slave_SQL_Running are 'Yes') and the master must be Running. The same applies to any relays between the slave and the master.

  3. running_master : The master must be running. Relays may be down.

  4. writable_master : The master must be writable, i.e. labeled Master.

  5. primary_monitor_master : If this MaxScale is cooperating with another MaxScale and this is the secondary MaxScale, require that the candidate master is also selected by the primary MaxScale.

For example, to require that the master server of the cluster must be running and writable for any servers to have Slave-status, set

slave_conditions=running_master,writable_master

failcount

  • Type: number

  • Mandatory: No

  • Dynamic: Yes

  • Default: 5

Number of consecutive monitor passes a master server must be down before it is considered failed. If automatic failover is enabled (auto_failover=true), it may be performed at this time. A value of 0 or 1 enables immediate failover.

If automatic failover is not possible, the monitor will try to search for another server to fulfill the master role. See section Master selection for more details. Changing the master may break replication as queries could be routed to a server without previous events. To prevent this, avoid having multiple valid master servers in the cluster.

The worst-case delay between the master failure and the start of the failover can be estimated by summing up the timeout values and monitor_interval and multiplying that by failcount:

(monitor_interval + backend_connect_timeout) * failcount
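As a quick worked example of this estimate (the monitor_interval matches the earlier example configuration, while the backend_connect_timeout value is an assumed figure, not a documented default):

```python
# Worked example of the worst-case failover delay estimate above.
monitor_interval = 10          # seconds, as in the earlier example config
backend_connect_timeout = 3    # seconds (assumed illustrative value)
failcount = 5                  # documented default

worst_case_delay = (monitor_interval + backend_connect_timeout) * failcount
print(f"worst-case failover delay: {worst_case_delay}s")  # 65s
```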

enforce_writable_master

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: false

If set to ON, the monitor attempts to disable the read_only-flag on the master when seen. The flag is checked every monitor tick. The monitor user requires the SUPER-privilege for this feature to work.

Typically, the master server should never be in read-only-mode. Such a situation may arise due to misconfiguration or accident, or perhaps if MaxScale crashed during switchover.

When this feature is enabled, setting the master manually to read_only will no longer cause the monitor to search for another master. The master will instead momentarily lose its [Master]-status (no writes) until the monitor re-enables writes on it. When starting from scratch, the monitor still prefers to select a writable server as master if possible.

enforce_read_only_slaves

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: false

If set to ON, the monitor attempts to enable the read_only-flag on any writable slave server. The flag is checked every monitor tick. The monitor user requires the SUPER-privilege (or READ_ONLY ADMIN) for this feature to work. While the read_only-flag is ON, only users with the SUPER-privilege (or READ_ONLY ADMIN) can write to the backend server. If temporary write access is required, this feature should be disabled before attempting to disable read_only manually. Otherwise, the monitor will quickly re-enable it.

read_only won't be enabled on the master server, even if it has lost [Master]-status due to master_conditions and is marked [Slave].

enforce_read_only_servers

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: false

Works similarly to enforce_read_only_slaves, except that it will set read_only on any writable server that is not the primary and not in maintenance (a superset of the servers altered by enforce_read_only_slaves).

The monitor user requires the SUPER-privilege (or READ_ONLY ADMIN) for this feature to work. If the cluster has no valid primary or primary candidate, read_only is not set on any server as it is unclear which servers should be altered.

maintenance_on_low_disk_space

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: true

If a running server that is not the master or a relay master is out of disk space, the server is set to maintenance mode. Such servers are not used for router sessions and are ignored when performing a failover or other cluster modification operation. See the general monitor parameters disk_space_threshold and disk_space_check_interval on how to enable disk space monitoring.

Once a server has been put to maintenance mode, the disk space situation of that server is no longer updated. The server will not be taken out of maintenance mode even if more disk space becomes available. The maintenance flag must be removed manually:

maxctrl clear server server2 Maint

cooperative_monitoring_locks

  • Type: enum

  • Mandatory: No

  • Dynamic: Yes

  • Values: none, majority_of_all, majority_of_running

  • Default: none

Using this setting is recommended when multiple MaxScales are monitoring the same backend cluster. When enabled, the monitor attempts to acquire exclusive locks on the backend servers. The monitor considers itself the primary monitor if it has a majority of locks. The majority can be either over all configured servers or just over running servers. See Cooperative monitoring for more details on how this feature works and which value to use.

Allowed values:

  1. none Default value, no locking.

  2. majority_of_all Primary monitor requires a majority of locks, even counting servers which are [Down].

  3. majority_of_running Primary monitor requires a majority of locks over [Running] servers.

This setting is separate from the global MaxScale setting passive. If passive is set to true, cluster operations are disabled even if the monitor has acquired the locks. Generally, it's best not to mix cooperative monitoring with passive. Either set passive=false or do not set it at all.

script_max_replication_lag

  • Type: number

  • Mandatory: No

  • Dynamic: Yes

  • Default: -1

Defines a replication lag limit in seconds for launching the monitor script configured in the script-parameter. If the replication lag of a server rises above this limit, the script is run with the $EVENT-placeholder replaced by "rlag_above". If the lag drops back below the limit, the script is run again with the replacement "rlag_below".

Negative values disable this feature. For more information on monitor scripts, see general monitor documentation.
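The crossing behaviour described above can be sketched as follows. This is illustrative Python, not MaxScale source code; note that the script only fires on a transition across the limit, not on every lagging iteration.

```python
# Sketch of the rlag_above/rlag_below triggering described above.
# Illustrative only; not MaxScale source code.
def lag_events(lag_samples, limit):
    """Yield the $EVENT replacement each time the lag crosses the limit."""
    above = False
    for lag in lag_samples:
        if not above and lag > limit:
            above = True
            yield "rlag_above"
        elif above and lag <= limit:
            above = False
            yield "rlag_below"

# A server whose lag rises past 30 s and later recovers fires both events.
print(list(lag_events([5, 10, 40, 50, 20], limit=30)))
# → ['rlag_above', 'rlag_below']
```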

Cluster manipulation operations

Starting with MaxScale 2.2.1, MariaDB Monitor supports replication cluster modification. The operations implemented are:

  • failover, which replaces a failed master with a slave

  • switchover, which swaps a running master with a slave

  • async-switchover, which schedules a switchover and returns

  • rejoin, which directs servers to replicate from the master

  • reset-replication (added in MaxScale 2.3.0), which deletes binary logs and resets GTIDs

See operation details for more information on the implementation of the commands.

The cluster operations require that the monitor user (user) has the following privileges:

  • SUPER, to modify slave connections, set globals such as read_only and kill connections from other super-users

  • REPLICATION CLIENT (REPLICATION SLAVE ADMIN in MariaDB Server 10.5), to list slave connections

  • RELOAD, to flush binary logs

  • PROCESS, to check if the event_scheduler process is running

  • SHOW DATABASES and EVENT, to list and modify server events

  • SELECT on mysql.user, to see which users have SUPER

A list of the grants can be found in the Required Grants section.

The privilege system was changed in MariaDB Server 10.5. The effects of this on the MaxScale monitor user are minor, as the SUPER-privilege contains many of the required privileges and is still required to kill connections from other super-users.

In MariaDB Server 11.0.1 and later, SUPER no longer contains all the required grants. The monitor requires:

  • READ_ONLY ADMIN, to set read_only

  • REPLICA MONITOR and REPLICATION SLAVE ADMIN, to view and manage replication connections

  • RELOAD, to flush binary logs

  • PROCESS, to check if the event_scheduler process is running

  • SHOW DATABASES, EVENT and SET USER, to list and modify server events

  • BINLOG ADMIN, to delete binary logs (during reset-replication)

  • CONNECTION ADMIN, to kill connections

  • SELECT on mysql.user, to see which users have SUPER

In addition, the monitor needs to know which username and password a slave should use when starting replication. These are given in replication_user and replication_password.

The user can define files with SQL statements which are executed on any server being demoted or promoted by cluster manipulation commands. See the sections on promotion_sql_file and demotion_sql_file for more information.

The monitor can manipulate scheduled server events when promoting or demoting a server. See the section on handle_events for more information.

All cluster operations can be activated manually through MaxCtrl. See section Manual activation for more details.

See Limitations and requirements for information on possible issues with failover and switchover.

Operation details

Failover replaces a failed master with a running slave. It does the following:

  1. Select the most up-to-date slave of the old master to be the new master. The selection criteria are, in descending priority:

     a. gtid_IO_pos (latest event in relay log)

     b. gtid_current_pos (most processed events)

     c. log_slave_updates is on

     d. disk space is not low

  2. If the new master has unprocessed relay log items, cancel and try again later.

  3. Prepare the new master:

     a. Remove the slave connection the new master used to replicate from the old master.

     b. Disable the read_only-flag.

     c. Enable scheduled server events (if event handling is on). Only events that were enabled on the old master are enabled.

     d. Run the commands in promotion_sql_file.

     e. Start replication from an external master if one existed.

  4. Redirect all other slaves to replicate from the new master:

     a. STOP SLAVE

     b. CHANGE MASTER TO

     c. START SLAVE

  5. Check that all slaves are replicating.

Failover is considered successful if steps 1 to 3 succeed, as the cluster then has at least a valid master server.
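The selection order in step 1 amounts to a lexicographic comparison over the candidates. The sketch below illustrates this in Python; real GTID positions are domain-server-sequence triplets rather than plain integers, and the field names are illustrative, not MaxScale source.

```python
# Illustrative sketch of new-master selection: compare candidates by the
# criteria above, in descending priority, using a lexicographic tuple key.
def pick_new_master(slaves):
    return max(
        slaves,
        key=lambda s: (s["gtid_io_pos"],        # latest event in relay log
                       s["gtid_current_pos"],   # most processed events
                       s["log_slave_updates"],  # prefer binlogging slaves
                       not s["disk_low"]),      # prefer servers with disk space
    )

candidates = [
    {"name": "slave1", "gtid_io_pos": 120, "gtid_current_pos": 118,
     "log_slave_updates": True,  "disk_low": False},
    {"name": "slave2", "gtid_io_pos": 120, "gtid_current_pos": 120,
     "log_slave_updates": False, "disk_low": False},
]
# Same relay log position, but slave2 has processed more events.
print(pick_new_master(candidates)["name"])  # → slave2
```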

Switchover swaps a running master with a running slave. It does the following:

  1. Prepare the old master for demotion:

     a. Stop any external replication.

     b. Kill connections from super-users since read_only does not affect them.

     c. Enable the read_only-flag to stop writes.

     d. Disable scheduled server events (if event handling is on).

     e. Run the commands in demotion_sql_file.

     f. Flush the binary log (FLUSH LOGS) so that all events are on disk.

  2. Wait for the new master to catch up with the old master.

  3. Promote the new master and redirect slaves as in failover steps 3 and 4. Also redirect the demoted old master.

  4. Check that all slaves are replicating.

Similar to failover, switchover is considered successful if the new master was successfully promoted.

Rejoin joins a standalone server to the cluster or redirects a slave replicating from a server other than the master. A standalone server is joined as follows:

  1. Run the commands in demotion_sql_file.

  2. Enable the read_only-flag.

  3. Disable scheduled server events (if event handling is on).

  4. Start replication: CHANGE MASTER TO and START SLAVE.

A server which is replicating from the wrong master is redirected simply with STOP SLAVE, RESET SLAVE, CHANGE MASTER TO and START SLAVE commands.

Reset-replication (added in MaxScale 2.3.0) deletes binary logs and resets GTIDs. This destructive command is meant for situations where the GTIDs in the cluster are out of sync while the actual data is known to be in sync. The operation proceeds as follows:

  1. Reset GTIDs and delete binary logs on all servers:

     a. Stop (STOP SLAVE) and delete (RESET SLAVE ALL) all slave connections.

     b. Enable the read_only-flag.

     c. Disable scheduled server events (if event handling is on).

     d. Delete binary logs (RESET MASTER).

     e. Set the sequence number of gtid_slave_pos to zero. This also affects gtid_current_pos.

  2. Prepare the new master:

     a. Disable the read_only-flag.

     b. Enable scheduled server events (if event handling is on). Events are only enabled if the cluster had a master server when starting the reset-replication operation. Only events that were enabled on the previous master are enabled on the new one.

  3. Direct other servers to replicate from the new master as in the other operations.

Manual activation

Cluster operations can be activated manually through the REST API or MaxCtrl. The commands are only performed when MaxScale is in active mode. The commands generally match their automatic versions. The exception is rejoin, where the manual command allows rejoining even when the joining server has empty GTIDs. This rule allows the user to force a rejoin on a server without binary logs.

All commands require the monitor instance name as the first parameter. Failover selects the new master server automatically and does not require additional parameters. Rejoin requires the name of the joining server as the second parameter. Replication reset accepts the name of the new master server as the second parameter. If not given, the current master is selected.

Switchover takes one to three parameters. If only the monitor name is given, switchover will autoselect both the slave to promote and the current master as the server to be demoted. If two parameters are given, the second parameter is interpreted as the slave to promote. If three parameters are given, the third parameter is interpreted as the current master. The user-given current master is compared to the master server currently deduced by the monitor and if the two are unequal, an error is given.

Example commands are below:

call command mariadbmon failover MyMonitor
call command mariadbmon rejoin MyMonitor OldMasterServ
call command mariadbmon reset-replication MyMonitor
call command mariadbmon reset-replication MyMonitor NewMasterServ
call command mariadbmon switchover MyMonitor
call command mariadbmon switchover MyMonitor NewMasterServ
call command mariadbmon switchover MyMonitor NewMasterServ OldMasterServ

The commands follow the standard module command syntax. All require the monitor configuration name (MyMonitor) as the first parameter. For switchover, the last two parameters define the server to promote (NewMasterServ) and the server to demote (OldMasterServ). For rejoin, the server to join (OldMasterServ) is required. Replication reset requires the server to promote (NewMasterServ).

It is safe to perform manual operations even with automatic failover, switchover or rejoin enabled since automatic operations cannot happen simultaneously with manual ones.

When a cluster modification is initiated via the REST-API, the URL path is of the form:

/v1/maxscale/modules/mariadbmon/<operation>?<monitor-instance>&<server-param1>&<server-param2>
  • <operation> is the name of the command: failover, switchover, rejoin or reset-replication.

  • <monitor-instance> is the monitor section name from the MaxScale configuration file.

  • <server-param1> and <server-param2> are server parameters as described above for MaxCtrl. Only switchover accepts both, failover doesn't need any and both rejoin and reset-replication accept one.

Given a MaxScale configuration file like

[Cluster1]
type=monitor
module=mariadbmon
servers=server1, server2, server3, server4
...

with the assumption that server2 is the current master, then the URL path for making server4 the new master would be:

/v1/maxscale/modules/mariadbmon/switchover?Cluster1&server4&server2

Example REST-API paths for other commands are listed below.

/v1/maxscale/modules/mariadbmon/failover?Cluster1
/v1/maxscale/modules/mariadbmon/rejoin?Cluster1&server3
/v1/maxscale/modules/mariadbmon/reset-replication?Cluster1&server3
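The path construction rules above can be sketched with a small helper. The function below is hypothetical, for illustration only; it simply joins the monitor name and optional server parameters with '&' after the '?'.

```python
# Hypothetical helper assembling a mariadbmon REST-API path from the
# operation name, the monitor section name and optional server parameters.
def mariadbmon_path(operation, monitor, *servers):
    parts = [monitor, *servers]
    return f"/v1/maxscale/modules/mariadbmon/{operation}?" + "&".join(parts)

# Switchover promoting server4 and demoting the current master server2:
print(mariadbmon_path("switchover", "Cluster1", "server4", "server2"))
# → /v1/maxscale/modules/mariadbmon/switchover?Cluster1&server4&server2
```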

Queued switchover

Most cluster modification commands wait until the operation either succeeds or fails. async-switchover is an exception, as it returns immediately. Otherwise, async-switchover works identically to a normal switchover command. Use the module command fetch-cmd-result to view the result of the queued command. fetch-cmd-result returns the status or result of the latest manual command, whether queued or not.

maxctrl call command mariadbmon async-switchover Cluster1
OK
maxctrl call command mariadbmon fetch-cmd-result Cluster1
{
    "links": {
        "self": "http://localhost:8989/v1/maxscale/modules/mariadbmon/fetch-cmd-result"
    },
    "meta": "switchover completed successfully."
}

Automatic activation

Failover can activate automatically if auto_failover is on. The activation begins when the master has been down at least failcount monitor iterations. Before modifying the cluster, the monitor checks that all prerequisites for the failover are fulfilled. If the cluster does not seem ready, an error is printed and the cluster is rechecked during the next monitor iteration.
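The failcount gating above can be sketched as follows. This is illustrative Python, not MaxScale source, and ignores the other prerequisites (passive mode, cluster readiness) that the monitor also checks.

```python
# Sketch of failcount gating: failover only begins once the master has
# been seen down for `failcount` consecutive monitor iterations.
def should_failover(master_down_history, failcount):
    recent = master_down_history[-failcount:]  # most recent iterations
    return len(recent) == failcount and all(recent)

print(should_failover([True, True], failcount=3))        # → False (too early)
print(should_failover([True, True, True], failcount=3))  # → True
```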

Switchover can also activate automatically with the switchover_on_low_disk_space setting. The operation begins if the master server is low on disk space; otherwise the operating logic is quite similar to automatic failover.

Rejoin stands for starting replication on a standalone server or redirecting a slave replicating from the wrong master (any server that is not the cluster master). The rejoined servers are directed to replicate from the current cluster master server, forcing the replication topology to a 1-master-N-slaves configuration.

A server is categorized as standalone if the server has no slave connections, not even stopped ones. A server is replicating from the wrong master if the slave IO thread is connected but the master server id seen by the slave does not match the cluster master id. Alternatively, the IO thread may be stopped or connecting but the master server host or port information differs from the cluster master info. These criteria mean that a STOP SLAVE does not yet set a slave as standalone.
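The categorization rules above can be sketched in Python. The field names mirror SHOW ALL SLAVES STATUS output but are simplified; this is illustrative, not MaxScale source.

```python
# Sketch of rejoin categorization: standalone, wrong master, or ok.
def rejoin_category(slave_conns, cluster_master):
    if not slave_conns:              # no slave connections, not even stopped
        return "standalone"
    for conn in slave_conns:
        if conn["io_running"] == "Yes":
            # Connected: compare the master server id seen by the slave.
            if conn["master_server_id"] != cluster_master["server_id"]:
                return "wrong master"
        elif (conn["master_host"], conn["master_port"]) != \
             (cluster_master["host"], cluster_master["port"]):
            # Stopped or connecting: compare host/port information instead.
            return "wrong master"
    return "ok"

master = {"server_id": 1, "host": "db1", "port": 3306}
# A stopped connection still pointing at the master: neither standalone
# nor wrong master, so STOP SLAVE alone does not trigger a rejoin.
print(rejoin_category(
    [{"io_running": "No", "master_server_id": 0,
      "master_host": "db1", "master_port": 3306}], master))  # → ok
```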

With auto_rejoin active, the monitor will try to rejoin any servers matching the above requirements. Rejoin does not obey failcount and will attempt to rejoin any valid servers immediately. When activating rejoin manually, the user-designated server must fulfill the same requirements.

Limitations and requirements

Switchover and failover are meant for simple topologies (one master and several slaves). Using these commands with complicated topologies (multiple masters, relays, circular replication) may give unpredictable results and should be tested before use on a production system.

The server cluster is assumed to be well-behaving with no significant replication lag (within failover_timeout/switchover_timeout) and all commands that modify the cluster (such as "STOP SLAVE", "CHANGE MASTER", "START SLAVE") complete in a few seconds (faster than backend_read_timeout and backend_write_timeout).

The backends must all use GTID-based replication, and the domain id should not change during a switchover or failover. Slave servers should not have extra local events so that GTIDs are compatible across the cluster.

Failover cannot be performed if MaxScale was started only after the master server went down. This is because MaxScale needs reliable information on the gtid domain of the cluster and the replication topology in general to properly select the new master. enforce_simple_topology=1 relaxes this requirement.

Failover may lose events. If a master goes down before sending new events to at least one slave, those events are lost when a new master is chosen. If the old master comes back online, the other servers have likely moved on with a diverging history and the old master can no longer join the replication cluster.

To reduce the chance of losing data, use semisynchronous replication. In semisynchronous mode, the master waits for a slave to receive an event before returning an acknowledgement to the client. This does not yet guarantee a clean failover. If the master fails after preparing a transaction but before receiving slave acknowledgement, it will still commit the prepared transaction as part of its crash recovery. If the slaves never saw this transaction, the old master has diverged from the cluster. See Configuring the Master Wait Point for more information. This situation is much less likely in MariaDB Server 10.6.2 and later, as the improved crash recovery logic will delete such transactions.

Even a controlled shutdown of the master may lose events. The server does not by default wait for all data to be replicated to the slaves when shutting down and instead simply closes all connections. Before shutting down the master with the intention of having a slave promoted, run switchover first to ensure that all data is replicated. For more information on server shutdown, see Binary Log Dump Threads and the Shutdown Process.

Switchover requires that the cluster is "frozen" for the duration of the operation. This means that no data modifying statements such as INSERT or UPDATE are executed and the GTID position of the master server is stable. When switchover begins, the monitor sets the global read_only flag on the old master backend to stop any updates. read_only does not affect users with the SUPER-privilege so any such user can issue writes during a switchover. These writes have a high chance of breaking replication, because the write may not be replicated to all slaves before they switch to the new master. To prevent this, any users who commonly do updates should NOT have the SUPER-privilege. For even more security, the only SUPER-user session during a switchover should be the MaxScale monitor user. This also applies to users running scheduled server events. Although the monitor by default disables events on the master, an event may already be executing. If the event definer has SUPER-privilege, the event can write to the database even through read_only.

When mixing rejoin with failover/switchover, the backends should have log_slave_updates on. The rejoining server is likely lagging behind the rest of the cluster. If the current cluster master does not have binary logs from the moment the rejoining server lost connection, the rejoining server cannot continue replication. This is an issue if the master has changed and the new master does not have log_slave_updates on.

If an automatic cluster operation such as auto-failover or auto-rejoin fails, all cluster modifying operations are disabled for failcount monitor iterations, after which the operation may be retried. Similar logic applies if the cluster is unsuitable for such operations, e.g. replication is not using GTID.

External master support

The monitor detects if a server in the cluster is replicating from an external master (a server that is not monitored by the monitor). If the replicating server is the cluster master server, then the cluster itself is considered to have an external master.

If a failover/switchover happens, the new master server is set to replicate from the cluster external master server. The username and password for the replication are defined in replication_user and replication_password. The address and port used are the ones shown by SHOW ALL SLAVES STATUS on the old cluster master server. In the case of switchover, the old master also stops replicating from the external server to preserve the topology.

After failover the new master is replicating from the external master. If the failed old master comes back online, it is also replicating from the external server. To normalize the situation, either have auto_rejoin on or manually execute a rejoin. This will redirect the old master to the current cluster master.

Configuration parameters

auto_failover

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: false

Enable automatic master failover. When automatic failover is enabled, MaxScale will elect a new master server for the cluster if the old master goes down. A server is assumed Down if it cannot be connected to, even if this is caused by incorrect credentials. Failover triggers if the master stays down for failcount monitor intervals. Failover will not take place if MaxScale is set to passive.

As failover alters replication, it requires more privileges than normal monitoring. See here for a list of grants.

Failover is designed to be used with simple master-slave topologies. More complicated topologies, such as multilayered or circular replication, are not guaranteed to always work correctly. Test before using failover with such setups.

auto_rejoin

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: false

Enable automatic joining of servers to the cluster. When enabled, MaxScale will attempt to direct servers to replicate from the current cluster master if they are not currently doing so. Replication will be started on any standalone servers. Servers that are replicating from another server will be redirected. This effectively enforces a 1-master-N-slaves topology. The current master itself is not redirected, so it can continue to replicate from an external master. Rejoin is also not performed on any server that is replicating from multiple sources, as this indicates a complicated topology (this rule is overridden by enforce_simple_topology).

This feature is often paired with auto_failover to redirect the former master when it comes back online. Sometimes this kind of rejoin will fail as the old master may have transactions that were never replicated to the current one. See limitations for more information.

As an example, consider the following series of events:

  1. Slave A goes down

  2. Master goes down and a failover is performed, promoting Slave B

  3. Slave A comes back

  4. Old master comes back

Slave A is still trying to replicate from the downed master, since it wasn't online during failover. If auto_rejoin is on, Slave A will quickly be redirected to Slave B, the current master. The old master will also rejoin the cluster if possible.

switchover_on_low_disk_space

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: false

If enabled, the monitor will attempt to switch over a master server that is low on disk space with a slave. The switch is only done if a slave without disk space issues is found. If maintenance_on_low_disk_space is also enabled, the old master (now a slave) will be put to maintenance during the next monitor iteration.

For this parameter to have any effect, disk_space_threshold must be specified for the server or the monitor. Also, disk_space_check_interval must be defined for the monitor.

switchover_on_low_disk_space=true

enforce_simple_topology

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: false

This setting tells the monitor to assume that the servers should be arranged in a 1-master-N-slaves topology and the monitor should try to keep it that way. If enforce_simple_topology is enabled, the settings assume_unique_hostnames, auto_failover and auto_rejoin are also activated regardless of their individual settings.

By default, mariadbmon will not rejoin servers with more than one replication stream configured into the cluster. Starting with MaxScale 6.2.0, when enforce_simple_topology is enabled, all servers will be rejoined into the cluster and any extra replication sources will be removed. This is done to make automated failover with multi-source external replication possible.

This setting also allows the monitor to perform a failover to a cluster where the master server has not been seen [Running]. This is usually the case when the master goes down before MaxScale is started. When using this feature, the monitor will guess the GTID domain id of the master from the slaves. For reliable results, the GTIDs of the cluster should be simple.

enforce_simple_topology=true

replication_user and replication_password

  • Type: string

  • Mandatory: No

  • Dynamic: Yes

  • Default: None

The username and password of the replication user. These are given as the values for MASTER_USER and MASTER_PASSWORD whenever a CHANGE MASTER TO command is executed.

Both the replication_user and replication_password parameters must be defined if a custom replication user is used. If neither parameter is defined, the CHANGE MASTER TO command will use the monitor credentials for the replication user.

The credentials used for replication must have the REPLICATION SLAVE privilege.

replication_password uses the same encryption scheme as other password parameters. If password encryption is in use, replication_password must be encrypted with the same key to avoid erroneous decryption.

replication_master_ssl

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: false

If set to ON, any generated CHANGE MASTER TO command will set MASTER_SSL=1 to enable encryption for the replication stream. This setting should only be enabled if the backend servers are configured for SSL. This typically means setting ssl_ca, ssl_cert and ssl_key in the server configuration file. Additionally, the credentials for the replication user should require an encrypted connection (e.g. ALTER USER repl@'%' REQUIRE SSL;).

If the setting is left OFF, MASTER_SSL is not set at all, which will preserve existing settings when redirecting a slave connection.

failover_timeout and switchover_timeout

  • Type: duration

  • Mandatory: No

  • Dynamic: Yes

  • Default: 90s

Time limit for failover and switchover operations. The default values are 90 seconds for both. switchover_timeout is also used as the time limit for a rejoin operation. Rejoin should rarely time out, since it is a faster operation than switchover.

The timeouts are specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the timeouts is seconds, a timeout specified in milliseconds will be rejected, even if the duration is longer than a second.

If no successful failover/switchover takes place within the configured time period, a message is logged and automatic failover is disabled. This prevents further automatic modifications to the misbehaving cluster.

verify_master_failure

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: true

Enable additional master failure verification for automatic failover. verify_master_failure enables this feature and master_failure_timeout defines the timeout.

Failure verification is performed by checking whether the slave servers are still connected to the master and receiving events. An event is either a change in the Gtid_IO_Pos field of the SHOW SLAVE STATUS output or a heartbeat event. Effectively, if a slave has received an event within the master_failure_timeout duration, the master is not considered down when deciding whether to failover, even if MaxScale cannot connect to the master. master_failure_timeout should be longer than the Slave_heartbeat_period of the slave connection to be effective.

If every slave loses its connection to the master (Slave_IO_Running is not "Yes"), master failure is considered verified regardless of timeout. This allows faster failover when the master properly disconnects.

For automatic failover to activate, the failcount requirement must also be met.
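The verification rule above can be sketched as follows. This is illustrative Python, not MaxScale source; times are simplified to plain numbers in seconds.

```python
# Sketch of master-failure verification: the master counts as alive if any
# slave received an event within master_failure_timeout, but failure is
# verified immediately if every slave's IO thread has disconnected.
def master_failure_verified(slaves, now, master_failure_timeout):
    if all(s["io_running"] != "Yes" for s in slaves):
        return True  # every slave lost its connection: verified regardless
    return all(now - s["last_event_time"] > master_failure_timeout
               for s in slaves)

slaves = [{"io_running": "Yes", "last_event_time": 95},
          {"io_running": "No",  "last_event_time": 50}]
# One slave received an event 5 s ago, within the 10 s timeout:
print(master_failure_verified(slaves, now=100, master_failure_timeout=10))
# → False
```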

master_failure_timeout

  • Type: duration

  • Mandatory: No

  • Dynamic: Yes

  • Default: 10s

The master failure timeout is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the timeout is seconds, a timeout specified in milliseconds will be rejected, even if the duration is longer than a second.

servers_no_promotion

  • Type: string

  • Mandatory: No

  • Dynamic: Yes

  • Default: None

This is a comma-separated list of server names that will not be chosen for master promotion during a failover or autoselected for switchover. This does not affect switchover if the user selects the server to promote. Using this setting can disrupt new master selection for failover such that a non-optimal server is chosen. At worst, this will cause replication to break. Alternatively, failover may fail if all valid promotion candidates are in the exclusion list.

servers_no_promotion=backup_dc_server1,backup_dc_server2

promotion_sql_file and demotion_sql_file

  • Type: string

  • Mandatory: No

  • Dynamic: Yes

  • Default: None

These optional settings are paths to text files with SQL statements in them. During promotion or demotion, the contents are read line-by-line and executed on the backend. Use these settings to execute custom statements on the servers to complement the built-in operations.

Empty lines or lines starting with '#' are ignored. Any results returned by the statements are ignored. All statements must succeed for the failover, switchover or rejoin to continue. The monitor user may require additional privileges and grants for the custom commands to succeed.
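The file-processing rules above can be sketched as follows. This is an illustrative Python sketch, not MaxScale source; `execute` stands in for running a statement on the backend and returning whether it succeeded.

```python
# Sketch of promotion/demotion SQL file processing: statements run
# line by line, blanks and '#' comments are skipped, and the first
# failed statement cancels the whole cluster operation.
def run_sql_statements(lines, execute):
    for line in lines:
        stmt = line.strip()
        if not stmt or stmt.startswith("#"):
            continue  # ignored line
        if not execute(stmt):
            return False  # a failure aborts failover/switchover/rejoin
    return True

executed = []
ok = run_sql_statements(
    ["# custom statement for the new master", "", "FLUSH TABLES;"],
    lambda stmt: executed.append(stmt) or True)
print(ok, executed)  # → True ['FLUSH TABLES;']
```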

When promoting a slave to master during switchover or failover, the promotion_sql_file is read and executed on the new master server after its read-only flag is disabled. The commands are run before starting replication from an external master, if any.

demotion_sql_file is run on an old master during demotion to slave, before the old master starts replicating from the new master. The file is also run before rejoining a standalone server to the cluster, as the standalone server is typically a former master server. When redirecting a slave replicating from the wrong master, the SQL file is not executed.

Since the queries in the files are run during operations which modify the replication topology, care is required. If promotion_sql_file contains data modification (DML) queries, the new master server may not be able to successfully replicate from an external master. demotion_sql_file should never contain DML queries, as these may not replicate to the slave servers before the slave threads are stopped, breaking replication.

promotion_sql_file=/home/root/scripts/promotion.sql
demotion_sql_file=/home/root/scripts/demotion.sql

handle_events

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: true

If enabled, the monitor continuously queries the servers for enabled scheduled events and uses this information when performing cluster operations, enabling and disabling events as appropriate.

When a server is being demoted, any events with "ENABLED" status are set to "SLAVESIDE_DISABLED". When a server is being promoted to master, events that are either "SLAVESIDE_DISABLED" or "DISABLED" are set to "ENABLED" if the same event was also enabled on the old master server last time it was successfully queried. Events are considered identical if they have the same schema and name. When a standalone server is rejoined to the cluster, its events are also disabled since it is now a slave.

The monitor does not check whether the same events were disabled and enabled during a switchover or failover/rejoin. All events that meet the criteria above are altered.
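The status transitions above can be sketched in Python. This is illustrative, not MaxScale source; events are identified by (schema, name) pairs as described, and `old_master_enabled` is the set of events that were ENABLED on the old master when it was last queried.

```python
# Sketch of scheduled-event handling during demotion and promotion.
def demote_status(status):
    # Demotion: ENABLED events become SLAVESIDE_DISABLED.
    return "SLAVESIDE_DISABLED" if status == "ENABLED" else status

def promote_status(event, old_master_enabled):
    # Promotion: re-enable an event only if it was enabled on the old master.
    schema_name, status = event
    if status in ("SLAVESIDE_DISABLED", "DISABLED") and \
       schema_name in old_master_enabled:
        return "ENABLED"
    return status

print(promote_status((("db", "purge_logs"), "SLAVESIDE_DISABLED"),
                     {("db", "purge_logs")}))  # → ENABLED
```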

The monitor does not enable or disable the event scheduler itself. For the events to run on the new master server, the scheduler should be enabled by the admin. Enabling it in the server configuration file is recommended.

Events running at high frequency may cause replication to break in a failover scenario. If an old master which was failed over restarts, its event scheduler will be on if set in the server configuration file. Its events will also remember their "ENABLED"-status and run when scheduled. This may happen before the monitor rejoins the server and disables the events. This should only be an issue for events running more often than the monitor interval or events that run immediately after the server has restarted.

Cooperative monitoring

As of MaxScale 2.5, MariaDB Monitor supports cooperative monitoring. This means that multiple monitors (typically in different MaxScale instances) can monitor the same backend server cluster and only one will be the primary monitor. Only the primary monitor may perform switchover, failover or rejoin operations. The primary also decides which server is the master. Cooperative monitoring is enabled with the cooperative_monitoring_locks setting. Even with this setting, only one monitor per server per MaxScale is allowed. This limitation can be circumvented by defining multiple copies of a server in the configuration file.

Cooperative monitoring uses server locks for coordinating between monitors. When cooperating, the monitor regularly checks the status of a lock named maxscale_mariadbmonitor on every server and acquires it if free. If the monitor acquires a majority of locks, it is the primary. If a monitor cannot claim a majority of locks, it is a secondary monitor.

The primary monitor of a cluster also acquires the lock maxscale_mariadbmonitor_master on the master server. Secondary monitors check which server this lock is taken on and only accept that server as the master. This arrangement is required so that multiple monitors can agree on which server is the master regardless of replication topology. If a secondary monitor does not see the master-lock taken, then it won't mark any server as [Master], causing writes to fail.

The lock setting defines how many locks are required for primary status. Setting cooperative_monitoring_locks=majority_of_all means that the primary monitor needs n_servers/2 + 1 (rounded down) locks. For example, a cluster of three servers needs two locks for majority, a cluster of four needs three, and a cluster of five needs three. This scheme is resistant against split-brain situations in the sense that multiple monitors cannot be primary simultaneously. However, a split may cause both monitors to consider themselves secondary, in which case a master server won't be detected.
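The majority arithmetic above is just floor division. A minimal sketch, with majority_of_running differing only in which server count is fed in:

```python
# Lock-majority arithmetic: majority_of_all counts every configured
# server, majority_of_running only the servers currently [Running].
def locks_needed(n_servers):
    return n_servers // 2 + 1

for n in (3, 4, 5):
    print(n, "servers ->", locks_needed(n), "locks")
# → 3 servers -> 2 locks, 4 servers -> 3 locks, 5 servers -> 3 locks
```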

Even without a network split, cooperative_monitoring_locks=majority_of_all will lead to neither monitor claiming lock majority once too many servers go down. This scenario is depicted in the image below. Only two out of four servers are running when three are needed for majority. Although both MaxScales see both running servers, neither is certain they have majority and the cluster stays in read-only mode. If the primary server is down, no failover is performed either.

Setting cooperative_monitoring_locks=majority_of_running changes the way n_servers is calculated. Instead of using the total number of servers, only servers currently [Running] are considered. This scheme adapts to multiple servers going down, ensuring that claiming lock majority is always possible. However, it can lead to multiple monitors claiming primary status in a split-brain situation. As an example, consider a cluster with servers 1 to 4 with MaxScales A and B, as in the image below. MaxScale A can connect to servers 1 and 2 (and claim their locks) but not to servers 3 and 4 due to a network split. MaxScale A thus assumes servers 3 and 4 are down. MaxScale B does the opposite, claiming servers 3 and 4 and assuming 1 and 2 are down. Both MaxScales claim two locks out of two available and assume that they have lock majority. Both MaxScales may then promote their own primaries and route writes to different servers.
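
The two lock-majority schemes can be sketched as a small calculation (a hedged Python illustration of the arithmetic described above, not MaxScale's actual implementation):

```python
def locks_needed(n_servers: int) -> int:
    # Primary status requires n_servers/2 + 1 locks, rounded down.
    return n_servers // 2 + 1

def is_primary(locks_held: int, total_servers: int, running_servers: int,
               mode: str = "majority_of_all") -> bool:
    # majority_of_all counts every configured server; majority_of_running
    # counts only servers currently in the [Running] state.
    n = total_servers if mode == "majority_of_all" else running_servers
    return locks_held >= locks_needed(n)

# Two of four servers running: majority_of_all cannot be satisfied,
# while majority_of_running can.
print(is_primary(2, 4, 2, "majority_of_all"))      # False
print(is_primary(2, 4, 2, "majority_of_running"))  # True
```

The same arithmetic also shows why majority_of_running is vulnerable to split-brain: in the four-server split described above, each MaxScale holds two locks and counts only two running servers, so both satisfy the majority test.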

The recommended strategy depends on which failure scenario is more likely and/or more destructive. If it's unlikely that multiple servers are ever down simultaneously, then majority_of_all is likely the safer choice. On the other hand, if split-brain is unlikely but multiple servers may be down simultaneously, then majority_of_running would keep the cluster operational.

To check if a monitor is primary, fetch monitor diagnostics with maxctrl show monitors or the REST API. The boolean field primary indicates whether the monitor has lock majority on the cluster. If cooperative monitoring is disabled, the field value is null. Lock information for individual servers is listed in the server-specific field lock_held. Again, null indicates that locks are not in use or the lock status is unknown.
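
As a sketch of how these diagnostics fields can be consumed programmatically (only the primary and lock_held field names come from the documentation; the JSON layout below is illustrative, not the exact REST API schema):

```python
import json

def cooperative_status(diag_json: str):
    """Extract cooperative-monitoring state from monitor diagnostics."""
    diag = json.loads(diag_json)
    # True/False when cooperative monitoring is in use, None when disabled.
    primary = diag.get("primary")
    # Per-server lock status; None means locks unused or status unknown.
    locks = {srv["name"]: srv.get("lock_held")
             for srv in diag.get("servers", [])}
    return primary, locks

sample = '{"primary": true, "servers": [{"name": "server1", "lock_held": true}]}'
print(cooperative_status(sample))  # (True, {'server1': True})
```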

If a MaxScale instance tries to acquire the locks but fails to get majority (perhaps another MaxScale was acquiring locks simultaneously) it will release any acquired locks and try again after a random number of monitor ticks. This prevents multiple MaxScales from fighting over the locks continuously as one MaxScale will eventually wait less time than the others. Conflict probability can be further decreased by configuring each monitor with a different monitor_interval.
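
The randomized retry can be pictured as follows (a minimal sketch; MaxScale's actual backoff logic is internal):

```python
import random

def ticks_until_retry(max_backoff_ticks: int = 10) -> int:
    # After a failed majority claim, release all acquired locks and wait
    # a random number of monitor ticks before retrying. Competing
    # MaxScales thus rarely retry on the same tick, and one eventually
    # wins the locks.
    return random.randint(1, max_backoff_ticks)
```

Using a different monitor_interval per monitor staggers the retries further, since a tick then corresponds to a different wall-clock duration in each MaxScale.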

The flowchart below illustrates the lock handling logic.

Releasing locks

Monitor cooperation depends on the server locks. The locks are connection-specific. The owning connection can manually release a lock, allowing another connection to claim it. Also, if the owning connection closes, the MariaDB Server process releases the lock. How quickly a lost connection is detected affects how quickly the primary monitor status moves from one monitor and MaxScale to another.

If the primary MaxScale or its monitor is stopped normally, the monitor connections are properly closed, releasing the locks. This allows the secondary MaxScale to quickly claim the locks. However, if the primary simply vanishes (broken network), the connection may just look idle. In this case, the MariaDB Server may take a long time before it considers the monitor connection lost. This time ultimately depends on TCP keepalive settings on the machines running MariaDB Server.

On MariaDB Server 10.3.3 and later, the TCP keepalive settings can be configured for just the server process. See Server System Variables for information on the settings tcp_keepalive_interval, tcp_keepalive_probes and tcp_keepalive_time. These settings can also be set on the operating system level, as described here.

As of MaxScale 6.4.16, 22.08.13, 23.02.10, 23.08.6 and 24.02.2, configuring TCP keepalive is no longer necessary as the monitor sets the session wait_timeout variable when acquiring a lock. This causes the MariaDB Server to close the monitor connection if the connection appears idle for too long. The value of wait_timeout used depends on the monitor interval and connection timeout settings, and is logged at MaxScale startup.

A monitor can also be ordered to manually release its locks via the module command release-locks. This is useful for manually changing the primary monitor. After running the release-command, the monitor will not attempt to reacquire the locks for one minute, even if it wasn't the primary monitor to begin with. This command can cause the cluster to become temporarily unusable by MaxScale. Only use it when there is another monitor ready to claim the locks.

maxctrl call command mariadbmon release-locks MyMonitor1

Rebuild server

The rebuild server-feature replaces the contents of a database server with the contents of another server. The source server is effectively cloned and all data on the target server is lost. This is useful when a slave server has diverged from the master server, or when adding a new server to the cluster. The MariaDB Server configuration files are not affected.

MariaDB-Monitor can perform this operation by running mariadb-backup on both the source and target servers. To do this, MaxScale needs to have ssh-access on the machines. Also, the following tools need to be installed on the source and target machines:

  1. mariadb-backup. Backs up and restores MariaDB Server contents. Installed e.g. with yum install MariaDB-backup.

  2. pigz. Compresses and decompresses the backup stream. Installed e.g. with yum install pigz.

  3. socat. Streams data from one machine to another. Likely already installed; if not, it can be installed e.g. with yum install socat.

The ssh_user and ssh_keyfile-settings define the SSH credentials MaxScale uses to access the servers. MaxScale must be able to run commands with sudo on both the source and target servers. mariadb-backup, on the other hand, needs to authenticate to the MariaDB Server being copied from. For this, MaxScale uses the monitor user. The monitor user may thus require additional privileges. See mariadb-backup documentation for more details.

When launched, the rebuild operation proceeds as below. If any step fails, the operation is stopped and the target server will be left in an unspecified state.

  1. Log in to both servers with ssh and check that the tools listed above are present (e.g. mariadb-backup -v should succeed).

  2. Check that the port used for transferring the backup is free on the source server. If not, kill the process holding it. This requires running lsof and kill.

  3. Test the connection by streaming a short message from the source host to the target.

  4. Launch mariadb-backup on the source machine, compress the stream and listen for an incoming connection. This is performed with a command like mariadb-backup --backup --safe-slave-backup --stream=xbstream | pigz -c | socat - TCP-LISTEN:<port>.

  5. Stop MariaDB-server on the target machine and delete all contents of the data directory /var/lib/mysql.

  6. On the target machine, connect to the source machine, read the backup stream, decompress it and write to the data directory. This is performed with a command like socat -u TCP:<host>:<port> STDOUT | pigz -dc | mbstream -x. This step can take a long time if there is much data to transfer.

  7. Check that the data directory is not empty.

  8. Prepare the backup on the target server with a command like mariadb-backup --use-memory=1G --prepare. This step can also take some time if the source server performed writes during data transfer.

  9. On the target server, change ownership of datadir contents to the mysql-user and start MariaDB-server.

  10. Read gtid from the data directory. Have the target server start replicating from the master.

The rebuild-operation is a monitor module command and is best launched with MaxCtrl. The command takes three arguments: the monitor name, target server name and source server name. The source server can be left out, in which case it is autoselected. When autoselecting, the monitor prefers to pick an up-to-date slave server. Due to the --safe-slave-backup-option, the slave will stop replicating until the backup data has been transferred.

maxctrl call command mariadbmon async-rebuild-server MariaDB-Monitor MyServer3 MyServer2

The operation does not launch if the target server is already replicating or if the source server is not a master or slave.

Steps 6 and 8 can take a long time depending on the size of the database and if writes are ongoing. During these steps, the monitor will continue monitoring the cluster normally. After each monitor tick the monitor checks if the rebuild-operation can proceed. No other monitor operations, either manual or automatic, can run until the rebuild completes.

Settings

ssh_user

  • Type: string

  • Mandatory: No

  • Dynamic: Yes

  • Default: None

SSH username. Used when logging in to backend servers to run commands.

ssh_keyfile

  • Type: path

  • Mandatory: No

  • Dynamic: Yes

  • Default: None

Path to a file with an SSH private key. Used when logging in to backend servers to run commands.

ssh_check_host_key

  • Type: boolean

  • Mandatory: No

  • Dynamic: Yes

  • Default: true

When logging in to backends, require that the server is already listed in the known_hosts-file of the user running MaxScale.

ssh_timeout

  • Type: duration

  • Mandatory: No

  • Dynamic: Yes

  • Default: 10s

The rebuild operation consists of multiple ssh commands. Most of the commands are assumed to complete quickly. If these commands take more than ssh_timeout to complete, the operation fails. Adjust this setting if rebuild fails due to ssh commands timing out. This setting does not affect steps 5 and 6, as these are assumed to take significant time.

ssh_port

  • Type: number

  • Mandatory: No

  • Dynamic: Yes

  • Default: 22

SSH port. Used for running remote commands on servers.

rebuild_port

  • Type: number

  • Mandatory: No

  • Dynamic: Yes

  • Default: 4444

The port which the source server listens on for a connection. The port must not be blocked by a firewall or listened on by any other program. If another process is listening on the port when rebuild is starting, MaxScale will attempt to kill the process.
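
Putting the rebuild-related settings together, a monitor section could look like the following (server names, user names and paths are hypothetical; adjust them to your environment):

```
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=MyServer1,MyServer2,MyServer3
user=monitor_user
password=monitor_pw
ssh_user=sshuser
ssh_keyfile=/home/sshuser/.ssh/id_rsa
rebuild_port=4444
```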

sudoers.d configuration

If giving MaxScale general sudo-access is out of the question, MaxScale must be allowed to run the specific commands required by the rebuild-operation. This can be achieved by creating a file with the commands in the /etc/sudoers.d-directory. In the example below, the user johnny is given the power to run commands as root. The contents of the file may need to be tweaked due to changes in install locations.

johnny ALL= NOPASSWD: /bin/systemctl stop mariadb
johnny ALL= NOPASSWD: /bin/systemctl start mariadb
johnny ALL= NOPASSWD: /usr/sbin/lsof
johnny ALL= NOPASSWD: /bin/kill
johnny ALL= NOPASSWD: /usr/bin/mariadb-backup
johnny ALL= NOPASSWD: /bin/mbstream
johnny ALL= NOPASSWD: /bin/du
johnny ALL= NOPASSWD: /bin/rm -rf /var/lib/mysql/*
johnny ALL= NOPASSWD: /bin/chown -R mysql\:mysql /var/lib/mysql/*
johnny ALL= NOPASSWD: /bin/cat

ColumnStore commands

Since MaxScale version 22.08, MariaDB Monitor can run ColumnStore administrative commands against a ColumnStore cluster. The commands interact with the ColumnStore REST-API present in recent ColumnStore versions and have been tested with MariaDB-Server 10.6 running the ColumnStore plugin version 6.2. None of the commands affect monitor configuration or replication topology. MariaDB Monitor simply relays the commands to the backend cluster.

MariaDB Monitor can fetch cluster status, add and remove nodes, start and stop the cluster, and set cluster read-only or readwrite. MaxScale only communicates with the first server in the servers-list.

Most of the commands are asynchronous, i.e. they do not wait for the operation to complete on the ColumnStore backend before returning to the command prompt. MariaDB Monitor itself, however, runs the command in the background and does not perform normal monitoring until the operation completes or fails. After an operation has started the user should use fetch-cmd-result to check its status. The examples below show how to run the commands using MaxCtrl. If a command takes a timeout-parameter, the timeout can be given in seconds (s), minutes (m) or hours (h).

ColumnStore command settings are listed here. At least cs_admin_api_key must be set.

Get status

Fetch cluster status. Returns the result as is. Status fetching has an automatic timeout of ten seconds.

maxctrl call command mariadbmon cs-get-status <monitor-name>
maxctrl call command mariadbmon async-cs-get-status <monitor-name>

Examples:

maxctrl call command mariadbmon cs-get-status MyMonitor
{
    "mcs1": {
        "cluster_mode": "readwrite",
        "dbrm_mode": "master",
<snip>

maxctrl call command mariadbmon async-cs-get-status MyMonitor
OK
maxctrl call command mariadbmon fetch-cmd-result MyMonitor
{
    "mcs1": {
        "cluster_mode": "readwrite",
        "dbrm_mode": "master",
<snip>

Add or remove node

Add or remove a node to/from the ColumnStore cluster.

maxctrl call command mariadbmon async-cs-add-node <monitor-name> <node-host> <timeout>
maxctrl call command mariadbmon async-cs-remove-node <monitor-name> <node-host> <timeout>

<node-host> is the hostname or IP of the node being added or removed.

Examples:

maxctrl call command mariadbmon async-cs-add-node MyMonitor mcs3 1m
OK
maxctrl call command mariadbmon fetch-cmd-result MyMonitor
{
    "node_id": "mcs3",
    "timestamp": "2022-05-05 08:07:51.518268"
}
maxctrl call command mariadbmon async-cs-remove-node MyMonitor mcs3 1m
OK
maxctrl call command mariadbmon fetch-cmd-result MyMonitor
{
    "node_id": "mcs3",
    "timestamp": "2022-05-05 10:46:46.506947"
}

Start and stop cluster

maxctrl call command mariadbmon async-cs-start-cluster <monitor-name> <timeout>
maxctrl call command mariadbmon async-cs-stop-cluster <monitor-name> <timeout>

Examples:

maxctrl call command mariadbmon async-cs-start-cluster MyMonitor 1m
OK
maxctrl call command mariadbmon fetch-cmd-result MyMonitor
{
    "timestamp": "2022-05-05 09:41:57.140732"
}
maxctrl call command mariadbmon async-cs-stop-cluster MyMonitor 1m
OK
maxctrl call command mariadbmon fetch-cmd-result MyMonitor
{
    "mcs1": {
        "timestamp": "2022-05-05 09:45:33.779837"
    },
<snip>

Set read-only or readwrite

maxctrl call command mariadbmon async-cs-set-readonly <monitor-name> <timeout>
maxctrl call command mariadbmon async-cs-set-readwrite <monitor-name> <timeout>

Examples:

maxctrl call command mariadbmon async-cs-set-readonly MyMonitor 30s
OK
maxctrl call command mariadbmon fetch-cmd-result MyMonitor
{
    "cluster-mode": "readonly",
    "timestamp": "2022-05-05 09:49:18.365444"
}
maxctrl call command mariadbmon async-cs-set-readwrite MyMonitor 30s
OK
maxctrl call command mariadbmon fetch-cmd-result MyMonitor
{
    "cluster-mode": "readwrite",
    "timestamp": "2022-05-05 09:50:30.718972"
}

Settings

cs_admin_port

Numeric, default: 8640. The REST-API port on the ColumnStore nodes. All nodes are assumed to listen on the same port.

cs_admin_port=8641

cs_admin_api_key

String. The API-key MaxScale sends to the ColumnStore nodes when making a REST-API request. Should match the value configured on the ColumnStore nodes.

cs_admin_api_key=somekey123

cs_admin_base_path

String, default: /cmapi/0.4.0. Base path sent with the REST-API request.
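
A monitor configured to run the ColumnStore commands could thus contain (values are hypothetical examples, not defaults to copy verbatim):

```
[MyMonitor]
type=monitor
module=mariadbmon
servers=mcs1,mcs2,mcs3
user=monitor_user
password=monitor_pw
cs_admin_port=8640
cs_admin_api_key=somekey123
```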

Other commands

fetch-cmd-result

Fetches the result of the last manual command. Requires monitor name as parameter. Most commands only return a generic success message or an error description. ColumnStore commands may return more data. Scheduling another command clears a stored result.

maxctrl call command mariadbmon fetch-cmd-result MariaDB-Monitor
"switchover completed successfully."

cancel-cmd

Cancels the latest operation, whether manual or automatic, if possible. Requires monitor name as parameter. A scheduled manual command is simply canceled before it can run. If a command is already running, it stops as soon as possible. The cancel-cmd itself does not wait for a running operation to stop. Use fetch-cmd-result or check the log to see if the operation has truly completed. Canceling is most useful for stopping a stalled rebuild operation.

maxctrl call command mariadbmon cancel-cmd MariaDB-Monitor
OK

Troubleshooting

Failover/switchover fails

See the Limitations and requirements-section.

Before performing failover or switchover, the monitor checks that prerequisites are fulfilled, printing any errors and warnings found. This should catch and explain most issues with failover or switchover not working. If the operations are attempted and still fail, then most likely one of the commands the monitor issued to a server failed or timed out. The log should explain which query failed.

A typical failure reason is that a command such as STOP SLAVE takes longer than the backend_read_timeout of the monitor, causing the connection to break. As of 2.3, the monitor will retry most such queries if the failure was caused by a timeout. The retrying continues until the total time for a failover or switchover has been spent. If the log shows warnings or errors about commands timing out, increasing the backend timeout settings of the monitor should help. Other settings to look at are query_retries and query_retry_timeout. These are general MaxScale settings described in the Configuration guide. Setting query_retries to 2 is a reasonable first try.

If switchover causes the old master (now slave) to fail replication, then most likely a user or perhaps a scheduled event performed a write while monitor had set read_only=1. This is possible if the user performing the write has "SUPER" or "READ_ONLY ADMIN" privileges. The switchover-operation tries to kick out SUPER-users but this is not certain to succeed. Remove these privileges from any users that regularly do writes to prevent them from interfering with switchover.

The server configuration files should have log-slave-updates=1 to ensure that a newly promoted master has binary logs of previous events. This allows the new master to replicate past events to any lagging slaves.

To print out all queries sent to the servers, start MaxScale with --debug=enable-statement-logging. This setting prints all queries sent to the backends by monitors and authenticators. The printed queries may include usernames and passwords.

Slave detection shows external masters

If a slave is shown in maxctrl as "Slave of External Server" instead of "Slave", the reason is likely that the "Master_Host"-setting of the replication connection does not match the MaxScale server definition. As of 2.3.2, the MariaDB Monitor by default assumes that the slave connections (as shown by SHOW ALL SLAVES STATUS) use the exact same "Master_Host" as used in the MaxScale configuration file server definitions. This is controlled by the setting assume_unique_hostnames.
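
If the replication connections cannot be changed to use the exact hostnames from the server definitions, the strict hostname matching can be disabled in the monitor section (a sketch; check the setting's documentation for the implications before using it):

```
assume_unique_hostnames=false
```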

Using the MariaDB Monitor With Binlogrouter

Since MaxScale 2.2 it's possible to detect a replication setup which includes a Binlog Server: the required action is to add the binlog server to the list of monitored servers, but only if its master_id identity is set.

CC BY-SA / Gnu FDL

MaxScale 22.08 Xpand Monitor

Xpand Monitor

Xpand Monitor

Overview

The Xpand Monitor monitors an Xpand cluster. It is capable of detecting the cluster setup and creating corresponding server instances within MaxScale.

Required Grants

The monitor user must have the following grants:

CREATE USER 'maxscale'@'maxscalehost' IDENTIFIED BY 'maxscale-password';
GRANT SELECT ON system.membership TO 'maxscale'@'maxscalehost';
GRANT SELECT ON system.nodeinfo TO 'maxscale'@'maxscalehost';
GRANT SELECT ON system.softfailed_nodes TO 'maxscale'@'maxscalehost';

Further, if you want to be able to softfail and unsoftfail a node via MaxScale, then the monitor user must have SUPER privileges:

GRANT SUPER ON *.* TO 'maxscale'@'maxscalehost';

Configuration

A minimal configuration for a monitor requires one server in the Xpand cluster, and a username and a password to connect to the server. Note that by default the Xpand monitor will only use that server in order to dynamically find out the configuration of the cluster; after startup it will completely rely upon information obtained at runtime. To change the default behaviour, please see the parameter dynamic_node_detection.

To ensure that the Xpand monitor will be able to start, it is advisable to provide more than one server to cater for the case that not all nodes are always up when MaxScale starts.

Note: All services that use servers monitored by xpandmon should use the cluster parameter to define the set of servers they use. This will guarantee that the services use servers that are valid members of the Xpand cluster.

[TheXpandMonitor]
type=monitor
module=xpandmon
servers=server1,server2,server3
user=myuser
password=mypwd

[MyService]
type=service
router=readconnroute
cluster=TheXpandMonitor
user=myuser
password=mypwd

Dynamic Servers

The server objects the Xpand monitor creates for each detected Xpand node will be named like

@@<name-of-xpand-monitor>:node-<id>

where <name-of-xpand-monitor> is the name of the Xpand monitor instance, as defined in the MaxScale configuration file, and <id> is the id of the Xpand node.

For instance, with the Xpand monitor defined as above and an Xpand cluster consisting of 3 nodes whose ids are 1, 2 and 3 respectively, the names of the created server objects will be:

@@TheXpandMonitor:node-1
@@TheXpandMonitor:node-2
@@TheXpandMonitor:node-3

When dynamic servers are created, the values for the configuration settings max_routing_connections, persistmaxtime, persistpoolmax and proxy_protocol are copied from the settings of the bootstrap servers. Note that the values of these settings must be identical on every bootstrap server.

Common Monitor Parameters

For a list of optional parameters that all monitors support, read theMonitor Common document.

Xpand Monitor optional parameters

These are optional parameters specific to the Xpand Monitor.

cluster_monitor_interval

Defines, in milliseconds, how often the monitor checks the state of the entire cluster. The default value is 60000 (1 minute), which should not be lowered as that may have an adverse effect on the cluster itself.

cluster_monitor_interval=120000ms

The interval is specified as documented here. If no explicit unit is provided, the value is interpreted as milliseconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected.

health_check_threshold

Defines how many times the health check may fail before the monitor considers a particular node to be down. The default value is 2.

health_check_threshold=3

dynamic_node_detection

By default, the Xpand monitor will only use the bootstrap nodes in order to connect to the Xpand cluster and then find out the cluster configuration dynamically at runtime.

That behaviour can be turned off with this optional parameter, in which case all Xpand nodes must manually be defined as shown below.

[Node-1]
type=server
address=192.168.121.77
port=3306
...

[Node-2]
...

[Node-3]
...

[TheXpandMonitor]
type=monitor
module=xpandmon
servers=Node-1, Node-2, Node-3
dynamic_node_detection=false

The default value of dynamic_node_detection is true.

See also health_check_port.

health_check_port

This optional parameter specifies which health check port to use when dynamic_node_detection has been disabled.

health_check_port=4711

The default value is 3581.

Note that this parameter is ignored unless dynamic_node_detection is false. Note also that the port must be the same for all nodes.

Commands

The Xpand monitor supports the following module commands.

softfail

With the softfail module command, a node can be softfailed via MaxScale. The command requires as argument the name of the Xpand monitor instance (as defined in the configuration file) and the name of the node to be softfailed.

For instance, with a configuration file like

[TheXpandMonitor]
type=monitor
module=xpandmon
...

then the node whose server name is @@TheXpandMonitor:node-1 can be softfailed like

$ maxctrl call command xpandmon softfail TheXpandMonitor @@TheXpandMonitor:node-1

If the softfailing of a node is successfully initiated, then the status of the corresponding MaxScale server object will be set to Draining, which will prevent new connections from being created to the node.

When the number of connections through MaxScale to the node has dropped to 0, its state will change to Drained. Note that the state Drained only tells that there are no connections to the node, not what the state of the softfailing operation is.

unsoftfail

With the unsoftfail module command, a node can be unsoftfailed via MaxScale. The command requires as argument the name of the Xpand monitor instance (as defined in the configuration file) and the name of the node to be unsoftfailed.

With a setup similar to the softfail case, a node can be unsoftfailed like:

$ maxctrl call command xpandmon unsoftfail TheXpandMonitor @@TheXpandMonitor:node-1

If a node is successfully unsoftfailed, then the Draining status of the corresponding MaxScale server object will be cleared.

SOFTFAILed nodes

During the cluster check, which is performed once per cluster_monitor_interval, the monitor will also check whether any nodes are being softfailed. The status of the corresponding server object of a node being softfailed will be set to Draining, which will prevent new connections from being created to that node.

When the number of connections through MaxScale to the node has dropped to 0, its state will change to Drained. Note that the state Drained only tells that there are no connections to the node, not what the state of the softfailing operation is.

If a node that was softfailed is UNSOFTFAILed then the Draining status will be cleared.

If the softfailing and unsoftfailing is initiated using the softfail and unsoftfail commands of the Xpand monitor, then there will be no delay between the softfailing or unsoftfailing being initiated and the Draining status being turned on/off.

CC BY-SA / Gnu FDL