2 of 44

Automated MariaDB Deployment and Administration

Automate MariaDB Server deployment and administration. This section covers tools and practices for streamlined installation, configuration, and ongoing management using binary packages.

A Comparison Between Automation Systems

This page compares the automation systems that are covered by this section of the MariaDB documentation. More information about these systems are presented in the relevant pages, and more systems may be added in the future.

Code Structure Differences

Different automation systems provide different ways to describe our infrastructure. Understanding how they work is the first step to evaluate them and choose one for our organization.

Ansible Code Structure

Ansible code consists of the following components:

An inventory determines which hosts Ansible should be able to deploy. Each host may belong to one or more groups. Groups may have children, forming a hierarchy. This is useful because it allows us to deploy on a group, or to assign variables to a group.
A role describes the state that a host, or group of hosts, should reach after a deploy.
A play associates hosts or groups to their roles. Each role/group can have more than one role.
A role consists of a list of tasks. Despite its name a task is not necessarily something to do, but something that must exist in a certain state.
Tasks can use variables. They can affect how a task is executed (for example a variable could be a file name), or even whether a task is executed or not. Variables exist at role, group or host level. Variables can also be passed by the user when a play is applied.
Playbooks are the code that is used to define tasks and variables.
Facts are data that Ansible retrieves from remote hosts before deploying. This is a very important step, because facts may determine which tasks are executed or how they are executed. Facts include, for example, the operating system family or its version. A playbook sees facts as pre-set variables.
Modules implement actions that tasks can use. Action examples are file (to declare that files and directories must exist) or mysql_variables (to declare MySQL/MariaDB variables that need to be set).

See Ansible Overview - concepts for more details and an example.

Puppet Code Structure

Puppet code consists of the following components:

An inventory file defines a set of groups and their targets (the members of a group). plugins can be used to retrieve groups and target dynamically, so they are equivalent to Ansible dynamic inventories.
A manifest is a file that describes a configuration.
A resource is a component that should run on a server. For example, "file" and "service" are existing support types.
An attribute relates to a resource and affects the way it is applied. For example, a resource of type "file" can have attributes like "owner" and "mode".
A class groups resources and variables, describing a logical part of server configuration. A class can be associated to several servers. A class is part of a manifest.
A module is a set of manifests and describes an infrastructure or a part of it.
Classes can have typed parameters that affect how they are applied.
Properties are variables that are read from the remote server, and cannot be arbitrarily assigned.
Facts are pre-set variables collected by Puppet before applying or compiling a manifest.

Architectural Differences

The architecture of the various systems is different. Their architectures determine how a deploy physically works, and what is needed to be able to deploy.

Ansible Architecture

Ansible architecture is simple. Ansible can run from any host, and can apply its playbooks on remote hosts. To do this, it runs commands via SSH. In practice, in most cases the commands will be run as superuser via sudo, though this is not always necessary.

Inventories can be dynamic. In this case, when we apply a playbook Ansible connects to remote services to discover hosts.

Ansible playbooks are applied via the ansible-playbook binary. Changes to playbooks are only applied when we perform this operation.

To recap, Ansible does not need to be installed on the server is administers. It needs an SSH access, and normally its user needs to be able to run sudo. It is also possible to configure a dynamic inventory, and a remote service to be used for this purpose.

Puppet Architecture

Puppet supports two types of architecture: agent-master or standalone. The agent-master architecture is recommended by Puppet Labs, and it is the most popular among Puppet users. For this reason, those who prefer a standalone architecture tend to prefer Ansible.

Agent-Master Architecture

When this architecture is chosen, manifests are sent to the Puppet master. There can be more than one master, for high availability reasons. All target hosts run a Puppet agent. Normally this is a service that automatically starts at system boot. The agent contacts a master at a given interval. It sends facts, and uses them to compile a catalog from the manifests. A catalog is a description of what exactly an individual server should run. The agent receives the catalog and checks if there are differences between its current configuration and the catalog. If differences are found, the agent applies the relevant parts of the catalog.

An optional component is PuppetDB. This is a central place where some data are stored, including manifests, retrieved facts and logs. PuppetDB is based on PostgreSQL and there are no plans to support MariaDB or other DBMSs.

If a manual change is made to a remove server, it will likely be overwritten the next time Puppet agent runs. To avoid this, the Puppet agent service can be stopped.

Standalone Architecture

As mentioned, this architecture is not recommended by Puppet Labs nor popular amongst Puppet users. It is similar to Ansible architecture.

Users can apply manifests from any host with Puppet installed. This could be their laptop but, in order to emulate the behavior of an agent-master architecture, normally Puppet runs on a dedicated node as a cronjob. The Puppet apply application will require facts from remote hosts, it will compile a catalog for each host, will check which parts of it need to be applied, and will apply them remotely.

If a manual change is made to a remove server, it will be overwritten the next time Puppet apply runs. To avoid this, comment out any cron job running Puppet apply, or comment out the target server in the inventory.

Inventory

As mentioned, Puppet supports plugins to retrieve the inventory dynamically from remote services. In an agent-master architecture, one has to make sure that each target host has access to these services. In a standalone architecture, one has to make sure that the hosts running Puppet apply have access to these services.

Storing Secrets

Often our automation repositories need to contain secrets, like MariaDB user passwords or private keys for SSH authentication.

Both Ansible and Puppet support integration with secret stores, like Hashicorp Vault. For Puppet integration, see Integrations with secret stores.

In the simplest case, Ansible allows encrypting secrets in playbooks and decrypting them during execution using ansible-vault. This implies a minimal effort to handle secrets. However, it is not the most secure way to store secrets. The passwords to disclose certain secrets need to be shared with the users who have the right to use them. Also, brute force attacks are possible.

Ecosystems and Communities

Automation software communities are very important, because they make available a wide variety of modules to handle specific software.

Ansible Ecosystem

Ansible is open source, released under the terms of the GNU GPL. It is produced by RedHat. RedHat has a page about Red Hat Ansible Automation Platform Partners, who can provide support and consulting.

Ansible Galaxy is a big repository of Ansible roles produced by both the vendor and the community. Ansible comes with ansible-galaxy, a tool that can be used to create roles and upload them to Ansible Galaxy.

At the time of this writing, Ansible does not have specific MariaDB official modules. MySQL official modules can be used. However, be careful not try to use features that only apply to MySQL. There are several community-maintained MariaDB roles.

Puppet Ecosystem

Puppet is open source, released under the GNU GPL. It is produced by a homonym company. The page Puppet Partners lists partners that can provide support and consulting about Puppet.

Puppet Forge is a big repository of modules produced by the vendor and by the community, as well as how-to guides.

Currently Puppet has many MariaDB modules.

Automating MariaDB Tasks with Events

MariaDB has an event scheduler that can be used to automate tasks, making them run at regular intervals of time. This page is about using events for automation. For more information about events themselves, and how to work with them, see event scheduler.

Pros and Cons of Using Events for Automation

Events can be compared to Unix cron jobs or Windows scheduled tasks. MariaDB events have at least the following benefits compared to those tools:

Events are system-independent. The same code can run on any system.
Events are written in procedural SQL. There is no need to install other languages or libraries.
If you use user-defined functions, you can still take advantage of them in your events.
Events run in MariaDB. An implication, for example, is that the results of queries remain in MariaDB itself and are not sent to a client. This means that network glitches don't affect events, there is no overhead due to data roundtrip, and therefore locks are held for a shorter time.

Some drawbacks of using events are the following:

Events can only perform tasks that can be developed in SQL. So, for example, it is not possible to send alerts. Access to files or remote databases is limited.
The event scheduler runs as a single thread. This means that events that are scheduled to run while another event is running will wait until the other event has stopped. This means that there is no guarantee that an event will run on exactly it's scheduled time. This should not be a problem as long as one ensures that events are short lived.
For more events limitations, see Event Limitations.

In many cases you may prefer to develop scripts in an external programming language. However, you should know that simple tasks consisting of a few queries can easily be implemented as events.

Good Practices

When using events to automate tasks, there are good practices one may want to follow.

Move your SQL code in a stored procedure. All the event will do is to call a stored procedures. Several events may call the same stored procedure, maybe with different parameters. The procedure may also be called manually, if necessary. This will avoid code duplication. This will separate the logic from the schedule, making it possible to change an event without a risk of making changes to the logic, and the other way around.

Just like cron jobs, events should log whether if they succeed or not. Logging debug messages may also be useful for non-trivial events. This information can be logged into a dedicated table. The contents of the table can be monitored by a monitoring tool like Grafana. This allows to visualize in a dashboard the status of events, and send alerts in case of a failure.

Examples

Some examples of tasks that could easily be automated with events:

Copying data from a remote table to a local table by night, using the CONNECT storage engine. This can be a good idea if many rows need be copied, because data won't be sent to an external client.
Periodically delete historical data. For example, rows that are older than 5 years. Nothing prevents us from doing this with an external script, but probably this wouldn't add any value.
Periodically delete invalid rows. In an e-commerce, they could be abandoned carts. In a messaging system, they could be messages to users that don't exist anymore.
Add a new partition to a table and drop the oldest one (partition rotation).

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Automating Upgrades with MariaDB.Org Downloads REST API

The MariaDB Foundation maintains a Downloads REST API. See the Downloads API documentation to find out all the tasks that you can accomplish with this API. Generally speaking, we can say that it provides information about MariaDB products and available versions. This allows to easily automate upgrades for MariaDB and related products.

The Downloads API exposes HTTPS endpoints that return information in JSON format. HTTP and JSON are extremely common standards that can be easily used with any programming language. All the information provided by the API is public, so no authentication is required.

How to Use the API with a Unix Shell

Linux shells are great for writing simple scripts. They are compatible to each other to some extent, so simple scripts can be run on almost any Unix/Linux system. In the following examples we'll use Bash.

On Linux, some programs you'll generally need to work with any REST API are:

curl, to call HTTP URLs and get their output.
js, to extract or transform information from a JSON document.

Example: Check When a New Version Becomes GA

A trivial use case is to write a script that checks the list of MariaDB GA major versions and, when something changes, send us an email. So we can test the newest GA version and eventually install it.

The script in this example will be extremely simple. We'll do it this way:

Retrieve the JSON object describing all MariaDB versions.
For each element of the array, only show the release_id and release_status properties, and concatenate them.
Apply a filter, so we only select the rows containing 'stable' but not 'old' (so we exclude 'Old Stable').
From the remaining rows, only show the first column (the version number).
If the results we obtained are different from the previously written file (see last point) send an email.
Save the results into a file.

This is something that we can easily do with a Unix shell:

#!/bin/bash

current_ga_versions=$(
    curl https://downloads.mariadb.org/rest-api/mariadb/ | \
    jq -r '.major_releases[] | .release_id + " " + .release_status' | \
    grep -i 'stable' | grep -vi 'old' | \
    cut -d ' ' -f 1
)

# create file if it doesn't exist, then compare version lists
touch ga_versions
previous_ga_versions=$( cat ga_versions )

echo "$current_ga_versions" > ga_versions

if [ "$current_ga_versions" != "$previous_ga_versions" ];
then
    mail -s 'NOTE: New MariaDB GA Versions' devops@example.com <<< 'There seems to be a new MariaDB GA version! Yay!'
fi

The only non-standard command here is jq. It is a great way to manipulate JSON documents, so if you don't know it you may want to take a look at jq documentation.

How to Use the API with a Python Script

To use the API with Python, we need a module that is able to send HTTP requests and parse a JSON output. The requests module has both these features. It can be installed as follows:

pip install requests

The following script prints stable versions to the standard output:

#!/usr/bin/env python

import requests

response = requests.get('https://downloads.mariadb.org/rest-api/mariadb/').json()

for x in response['major_releases']:
    if x['release_status'] == 'Stable':
        print(x['release_id'])

requests.get() makes an HTTP call of type GET, and requests.json() returns a dictionary representing the previously obtained JSON document.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

HashiCorp Vault and MariaDB

Vault is open source software for secret management provided by HashiCorp. It is designed to avoid sharing secrets of various types, like passwords and private keys. When building automation, Vault is a good solution to avoid storing secrets in plain text in a repository.

MariaDB and Vault may relate to each other in several ways:

MariaDB has a Hashicorp Key Management plugin, to manage and rotate SSH keys.
Users passwords can be stored in Vault.
MariaDB (and MySQL) can be used as a secret engine, a component which stores, generates, or encrypts data.
MariaDB (and MySQL) can be used as a backend storage, providing durability for Vault data.

For information about how to install Vault, see Install Vault, as well as MySQL/MariaDB Database Secrets Engine.

Vault Features

Vault is used via an HTTP/HTTPS API.

Vault is identity-based. Users login and Vault sends them a token that is valid for a certain amount of time, or until certain conditions occur. Users with a valid token may request to obtain secrets for which they have proper permissions.

Vault encrypts the secrets it stores.

Vault can optionally audit changes to secrets and secrets requests by the users.

Vault Architecture

Vault is a server. This allows decoupling the secrets management logic from the clients, which only need to login and keep a token until it expires.

The sever can actually be a cluster of servers, to implement high availability.

The main Vault components are:

Storage Backed: This is where the secrets are stored. Vault only send encrypted data to the backend storage.
HTTP API: This API is used by the clients, and provides an access to Vault server.
Barrier: Similarly to an actual barrier, it protects all inner Vault components. The HTTP API and the storage backend are outside of the barrier and could be accessed by anyone. All communications from and to these components have to pass through the barrier. The barrier verifies data and encrypts it. The barrier can have two states: sealed or unsealed. Data can only pass through when the barrier is unsealed. All the following components are located inside the barrier.
Auth Method: Handles login attempts from clients. When a login succeeds, the auth method returns a list of security policies to Vault core.
Token Store: Here the tokens generated as a result of a succeeded login are stored.
Secrets Engines: These components manage secrets. They can have different levels of complexity. Some of them simply expect to receive a key, and return the corresponding secret. Others may generate secrets, including one-time-passwords.
Audit Devices: These components log the requests received by Vault and the responses sent back to the clients.There may be multiple devices, in which case an Audit Broker sends the request or response to the proper device.

Dev Mode

It is possible to start Vault in dev mode:

vault server -dev

Dev mode is useful for learning Vault, or running experiments on some particular features. It is extremely insecure, because dev mode is equivalent to starting Vault with several insecure options. This means that Vault should never run in production in dev mode. However, this also means that all the regular Vault features are available in dev mode.

Dev mode simplifies all operations. Actually, no configuration is necessary to get Vault up and running in dev mode. It makes it possible to communicate with the Vault API from the shell without any authentication. Data is stored in memory by default. Vault is unsealed by default, and if explicitly sealed, it can be unsealed using only one key.

For more details, see "Dev" Server Mode in Vault documentation.

Vault Resources and References

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Orchestrator Overview

Orchestrator is no longer actively maintained.

Orchestrator is a MySQL and MariaDB high availability and replication management tool. It is released by Shlomi Noach under the terms of the Apache License, version 2.0.

Orchestrator provides automation for MariaDB replication in the following ways:

It can be used to perform certain operations, like repairing broken replication or moving a replica from one master to another. These operations can be requested using CLI commands, or via the GUI provided with Orchestrator. The actual commands sent to MariaDB are automated by Orchestrator, and the user doesn't have to worry about the details.
Orchestrator can also automatically perform a failover in case a master crashes or is unreachable by its replicas. If that is the case, Orchestrator will promote one of the replicas to a master. The replica to promote is chosen based on several criteria, like the server versions, the binary log formats in use and the datacenters locations.

Note that, if we don't want to use Orchestrator to automate operations, we can still use it as a dynamic inventory. Other tools can use it to obtain a list of existing MariaDB servers via its REST API or CLI commands.

Orchestrator has several big users, listed in the documentation Users page. It is also included in the PMM monitoring solution.

To install Orchestrator, see:

The install.md for a manual installation;
The links in README.md, to install Orchestrator using automation tools.

Supported Topologies

Currently, Orchestrator fully supports MariaDB GTID, replication, and semi-synchronous replication. While Orchestrator does not support Galera specific logic, it works with Galera clusters. For details, see Supported Topologies and Versions in Orchestrator documentation.

Architecture

Orchestrator consists of a single executable called orchestrator. This is a process that periodically connects to the target servers. It will run SQL queries against target servers, so it needs a user with proper permissions. When the process is running, a GUI is available via a web browser, at the URL 'localhost:3000'. It also exposes a REST API (see Using the web API in the Orchestrator documentation).

Orchestrator expects to find a JSON configuration file called orchestrator.conf.json, in /etc.

A database is used to store the configuration and the state of the target servers. By default, this is done using built-in SQLite. However, it is possible to use an external MariaDB or MySQL server instance.

If a cluster of Orchestrator instances is running, only one central database is used. One Orchestrator node is active, while the others are passive and are only used for failover. If the active node crashes or becomes unreachable, one of the other nodes becomes the active instance. The active_node table shows which node is active. Nodes communicate between them using the Raft protocol.

CLI Examples

As mentioned, Orchestrator can be used from the command-line. Here you can find some examples.

List clusters:

orchestrator -c clusters

Discover a specified instance and add it to the known topology:

orchestrator -c discover -i <host>:<port>

Forget about an instance:

orchestrator -c topology -i <host>:<port>

Move a replica to a different master:

orchestrator -c move-up -i <replica-host>:<replica-port> -d <master-host>:<master-port>

Move a replica up, so that it becomes a "sibling" of its master:

orchestrator -c move-up -i <replica-host>:<replica-port>

Move a replica down, so that it becomes a replica of its"sibling":

orchestrator -c move-below -i <replica-host>:<replica-port> -d <master-host>:<master-port>

Make a node read-only:

orchestrator -c set-read-only -i <host>:<port>

Make a node writeable:

orchestrator -c set-writeable -i <host>:<port>

The --debug and --stack options can be added to the above commands to make them more verbose.

Orchestrator Resources and References

The README.md file lists some related community projects, including modules to automate Orchestrator with Puppet and other technologies.

On GitHub you can also find links to projects that allow the use of automation software to deploy and manage Orchestrator.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Why to Automate MariaDB Deployments and Management

MariaDB includes a powerful configuration system. This is enough when we need to deploy a single MariaDB instance, or a small number of instances. But many modern organisations have many database servers. Deploying and upgrading them manually could require too much time, and would be error-prone.

Infrastructure as Code

Several tools exist to deploy and manage several servers automatically. These tools operate at a higher level, and execute tasks like installing MariaDB, running queries, or generating new configuration files based on a template. Instead of upgrading servers manually, users can launch a command to upgrade a group of servers, and the automation software will run the necessary tasks.

Servers can be described in a code repository. This description can include MariaDB version, its configuration, users, backup jobs, and so on. This code is human-readable, and can serve as a documentation of which servers exist and how they are configured. The code is typically versioned in a repository, to allow collaborative development and track the changes that occurred over time. This is a paradigm called Infrastructure as Code.

Automation code is high-level and one usually doesn’t care how operations are implemented. Their implementation is delegated to modules that handle specific components of the infrastructure. For example a module could equally work with apt and yum package managers. Other modules can implement operations for a specific cloud vendor, so we declare we want a snapshot to be done, but we don’t need to write the commands to make it happen. For special cases, it is of course possible to write Bash commands, or scripts in every language, and declare that they must be run.

Manual interventions on the servers will still be possible. This is useful, for example, to investigate performance problems. But it is important to leave the servers in the state that is described by the code.

This code is not something you write once and never touch again. It is periodically necessary to modify infrastructures to update some software, add new replicas, and so on. Once the base code is in place, making such changes is often trivial and potentially it can be done in minutes.

Automated Failover

Once replication is in place, two important aspects to automate are load balancing and failover.

Proxies can implement load balancing, redirecting the queries they receive to different server, trying to distribute the load equally. They can also monitor that MariaDB servers are running and in good health, thus avoiding sending queries to a server that is down or struggling.

However, this does not solve the problem with replication: if a primary server crashes, its replicas should point to another server. Usually this means that an existing replica is promoted to a master. This kind of changes are possible thanks to MariaDB GTID.

One can promote a replica to a primary by making change to existing automation code. This is typically simple and relatively quick to do for a human operator. But this operation takes time, and in the meanwhile the service could be down.

Automating failover will minimise the time to recover. A way to do it is to use Orchestrator, a tool that can automatically promote a replica to a primary. The choice of the replica to promote is done in a smart way, keeping into account things like the servers versions and the binary log format.

Resources

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Ansible and MariaDB

General information and hints on how to automate MariaDB deployments and configuration with Ansible, an open source tool to automate deployment, configuration and operations

Ansible Overview for MariaDB Users

Ansible is a tool to automate servers configuration management. It is produced by Red Hat and it is open source software released under the terms of the GNU GPL.

It is entirely possible to use Ansible to automate MariaDB deployments and configuration. This page contains generic information for MariaDB users who want to learn, or evaluate, Ansible.

For information about how to install Ansible, see Installing Ansible in Ansible documentation.

Automation Hubs

Normally, Ansible can run from any computer that has access to the target hosts to be automated. It is not uncommon that all members of a team has Ansible installed on their own laptop, and use it to deploy.

Red Hat offers a commercial version of Ansible called Ansible Tower. It consists of a REST API and a web-based interface that work as a hub that handles all normal Ansible operations.

An alternative is AWX. AWX is the open source upstream project from which many Ansible Tower features are originally developed. AWX is released under the terms of the Apache License 2.0. However, Red Hat does not recommend to run AWX in production.

AWX development is fast. It has several features that may or may not end up in Ansible Tower. Ansible Tower is more focused on making AWS features more robust, providing a stable tool to automate production environments.

Design Principles

Ansible allows us to write playbooks that describe how our servers should be configured. Playbooks are lists of tasks.

Tasks are usually declarative. You don't explain how to do something, you declare what should be done.

Playbooks are idempotent. When you apply a playbook, tasks are only run if necessary.

Here is a task example:

- name: Install Perl
  package:
    name: perl
    state: present

"Install Perl" is just a description that will appear on screen when the task is applied. Then we use the package module to declare that a package called "perl" should be installed. When we apply the playbook, if Perl is already installed nothing happens. Otherwise, Ansible installs it.

When we apply a playbook, the last information that appears on the screen is a recap like the following:

PLAY RECAP ***************************************************************************************************
mariadb-01        : ok=6    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

This means that six tasks were already applied (so no action was taken), and two tasks were applied.

As the above example shows, Ansible playbooks are written in YAML.

Modules (like package) can be written in any language, as long as they are able to process a JSON input and produce a JSON output. However the Ansible community prefers to write them in Python, which is the language Ansible is written in.

Concepts

A piece of Ansible code that can be applied to a server is called a playbook.

A task is the smallest brick of code in a playbook. The name is a bit misleading, though, because an Ansible task should not be seen as "something to do". Instead, it is a minimal description of a component of a server. In the example above, we can see a task.

A task uses a single module, which is an interface that Ansible uses to interact with a specific system component. In the example, the module is "package".

A task also has attributes, that describe what should be done with that module, and how. In the example above, "name" and "state" are both attributes. The state attribute exists for every module, by convention (though there may be exceptions). Typically, it has at least the "present" and "absent" state, to indicate if an object should exist or not.

Other important code concepts are:

An inventory determines which hosts Ansible should be able to deploy. Each host may belong to one or more groups. Groups may have children, forming a hierarchy. This is useful because it allows us to deploy on a group, or to assign variables to a group.
A role describes the state that a host, or group of hosts, should reach after a deploy.
A play associates hosts or groups to their roles. Each role/group can have more than one role.
A role is a playbook that describes how certain servers should be configured, based on the logical role they have in the infrastructure. Servers can have multiple roles, for example the same server could have both the "mariadb" and the "mydumper" role, meaning that they run MariaDB and they have mydumper installed (as shown later).
Tasks can use variables. They can affect how a task is executed (for example a variable could be a file name), or even whether a task is executed or not. Variables exist at role, group or host level. Variables can also be passed by the user when a play is applied.
Facts are data that Ansible retrieves from remote hosts before deploying. This is a very important step, because facts may determine which tasks are executed or how they are executed. Facts include, for example, the operating system family or its version. A playbook sees facts as pre-set variables.
Modules implement actions that tasks can use. Action examples are file (to declare that files and directories must exist) or mysql_variables (to declare MySQL/MariaDB variables that need to be set).

Example

Let's describe a hypothetical infrastructure to find out how these concepts can apply to MariaDB.

The inventory could define the following groups:

"db-main" for the cluster used by our website. All nodes belong to this group.
"db-analytics" for our replicas used by data analysts.
"dump" for one or more servers that take dumps from the replicas.
"proxysql" for one or more hosts that run ProxySQL.

Then we'll need the following nodes:

"mariadb-node" for the nodes in "db-main". This role describes how to setup nodes of a cluster using Galera.
"mariadb-replica" for the members of "db-analytics". It describes a running replica, and it includes the tasks that are necessary to provision the node if the data directory is empty when the playbook is applied. The hostname of the primary server is defined in a variable.
"mariadb". The aforementioned "mariadb-node" and "mariadb-replica" can be children of this group. They have many things in common (filesystem for the data directory, some basic MariaDB configuration, some installed tools...), so it could make sense to avoid duplication and describe the common traits in a super-role.
A "mariadb-backup" role to take backups with mariadb-backup, running jobs during the night. We can associate this role to the "db-main" group, or we could create a child group for servers that will take the backups.
"mariadb-dump" for the server that takes dumps with mariadb-dump. Note that we may decide to take dumps on a replica, so the same host may belong to "db-analytics" and "mariadb-dump".
"proxysql" for the namesake group.

Architecture

Ansible architecture is extremely simple. Ansible can run on any host. To apply playbooks, it connects to the target hosts and runs system commands. By default the connection happens via ssh, though it is possible to develop connection plugins to use different methods. Applying playbooks locally without establishing a connection is also possible.

Modules can be written in any language, though Python is the most common choice in the Ansible community. Modules receive JSON "requests" and facts from Ansible core, they are supposed to run useful commands on a target host, and then they should return information in JSON. Their output informs Ansible whether something has changed on the remote server and if the operations succeeded.

Ansible is not centralized. It can run on any host, and it is common for a team to run it from several laptops. However, to simplify things and improve security, it may be desirable to run it from a dedicated host. Users will connect to that host, and apply Ansible playbooks.

Ansible Resources and References

Ansible.com
AWX
Ansible Tower
Ansible Galaxy
Ansible on Wikipedia
Ansible Automation Platform YouTube channel
Ansible: Getting Started
MariaDB Deployment and Management with Ansible (video)

Further information about the concepts discussed in this page can be found in Ansible documentation:

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Deploying Docker Containers with Ansible

Ansible can be used to manage Docker container upgrades and configuration changes. Docker has native ways to do this, namely Dockerfiles and Docker Compose. But sometimes there are reasons to start basic containers from an image and then manage configuration with Ansible or similar software. See Benefits of Managing Docker Containers with Automation Software.

In this page we'll discuss how to use Ansible to manage Docker containers.

How to Deploy a Container with Ansible

Ansible has modules to manage the Docker server, Docker containers, and Docker Compose. These modules are maintained by the community.

A dynamic inventory plugin for Docker exists. It retrieves the list of existing containers from Docker.

Docker modules and the Docker inventory plugin communicate with Docker using its API. The connection to the API can use a TSL connection and supports key authenticity verification.

To communicate with Docker API, Ansible needs a proper Python module installed on the Ansible node (docker or docker-py).

Several roles exist to deploy Docker and configure it. They can be found in Ansible Galaxy.

References

Further information can be found in Ansible documentation.

Docker Guide.
docker_container module.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Deploying to Remote Servers with Ansible

If we manage several remote servers, running commands on them manually can be frustrating and time consuming. Ansible allows one to run commands on a whole group of servers.

This page shows some examples of ansible-playbook invocations. We'll see how to deploy roles or parts of them to remote servers. Then we'll see how to run commands on remote hosts, and possibly to get information from them. Make sure to read Ansible Overview first, to understand Ansible general concepts.

Pinging Remote Servers

Let's start with the simplest example: we just want our local Ansible to ping remote servers to see if they are reachable. Here's how to do it:

ansible -i production-mariadb all -m ping

Before proceeding with more useful examples, let's discuss this syntax.

ansible is the executable we can call to run a command from remote servers.
-i production-mariadb means that the servers must be read from an inventory called production-mariadb.
all means that the command must be executed against all servers from the above inventory.
-m ping specifies that we want to run the ping module. This is not the ping Linux command. It tells us if Ansible is able to connect a remote server and run a simple commands on them.

To run ping on a specific group or host, we can just replace "all" with a group name or host name from the inventory:

ansible -i production-mariadb main_cluster -m ping

Running Commands on Remote Servers

The previous examples show how to run an Ansible module on remote servers. But it's also possible to run custom commands over SSH. Here's how:

ansible -i production-mariadb all -a 'echo $PATH'

This command shows the value of $PATH on all servers in the inventory "production-mariadb".

We can also run commands as root by adding the -b (or --become) option:

# print a MariaDB variable
ansible -i production-mariadb all -b -a 'mysql -e "SHOW GLOBAL VARIABLES LIKE \'innodb_buffer_pool_size\';"'

# reboot servers
ansible -i production-mariadb all -b -a 'reboot'

Applying Roles to Remote Servers

We saw how to run commands on remote hosts. Applying roles to remote hosts is not much harder, we just need to add some information. An example:

ansible-playbook -i production-mariadb production-mariadb.yml

Let's see what changed:

ansible-playbook is the executable file that we need to call to apply playbooks and roles.
production-mariadb.yml is the play that associates the servers listed in the inventory to their roles.

If we call ansible-playbook with no additional arguments, we will apply all applicable roles to all the servers mentioned in the play.

To only apply roles to certain servers, we can use the -l parameter to specify a group, an individual host, or a pattern:

# Apply to the mariadb-main role role
ansible-playbook -i production-mariadb -l mariadb-main production-mariadb.yml

# Apply to the mariadb-main-01 host
ansible-playbook -i production-mariadb -l mariadb-main-01 production-mariadb.yml

# Apply to multiple hosts whose name starts with "mariadb-main-"
ansible-playbook -i production-mariadb -l mariadb-main-* production-mariadb.yml

We can also apply tasks from roles selectively. Tasks may optionally have tags, and each tag corresponds to an operation that we may want to run on our remote hosts. For example, a "mariadb" role could have the "timezone-update" tag, to update the contents of the timezone tables. To only apply the tasks with the "timezone-update" tag, we can use this command:

ansible-playbook -i production-mariadb --tag timezone-update production-mariadb.yml

Using tags is especially useful for database servers. While most of the technologies typically managed by Ansible are stateless (web servers, load balancers, etc.) database servers are not. We must pay special attention not to run tasks that could cause a database server outage, for example destroying its data directory or restarting the service when it is not necessary.

Check mode

We should always test our playbooks and roles on test servers before applying them to production. However, if test servers and production servers are not exactly in the same state (which means, some facts may differ) it is still possible that applying roles will fail. If it fails in the initial stage, Ansible will not touch the remote hosts at all. But there are cases where Ansible could successfully apply some tasks, and fail to apply another task. After the first failure, ansible-playbook will show errors and exit. But this could leave a host in an inconsistent state.

Ansible has a check mode that is meant to greatly reduce the chances of a failure. When run in check mode, ansible-playbook will read the inventory, the play and roles; it will figure out which tasks need to be applied; then it will connect to target hosts, read facts, and value all the relevant variables. If all these steps succeed, it is unlikely that running ansible-playbook without check mode will fail.

To run ansible-playbook in check mode, just add the --check (or -C) parameter.

References

Further documentation can be found in the Ansible website:

ansible tool.
ansible-playbook tool.
Validating tasks: check mode and diff mode.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Existing Ansible Modules and Roles for MariaDB

This page contains links to Ansible modules and roles that can be used to automate MariaDB deployment and configuration. The list is not meant to be exhaustive. Use it as a starting point, but then please do your own research.

Modules

At the time of writing, there are no MariaDB-specific modules in Ansible Galaxy. MySQL modules can be used. Trying to use MySQL-specific features may result in errors or unexpected behavior. However, the same applies when trying to use a feature not supported by the MySQL version in use.

Currently, the MySQL collection in Ansible Galaxy contains at least the following modules:

mysql_db: manages MySQL databases.
mysql_info: gathers information about a MySQL server.
mysql_query: runs SQL queries against MySQL.
mysql_replication: configures and operates asynchronous replication.
mysql_user: creates, modifies and deletes MySQL users.
mysql_variables: manages MySQL configuration.

Note that some modules only exist as shortcuts, and it is possible to use mysql_query instead. However, it is important to notice that mysql_query is not idempotent. Ansible does not understand MySQL queries, therefore it cannot check whether a query needs to be run or not.

To install this collection locally:

ansible-galaxy collection install community.mysql

MariaDB Corporation maintains a ColumnStore playbook on GitHub.

Other Useful Modules

Let's see some other modules that are useful to manage MariaDB servers.

shell and command

Modules like shell and command allow one to run system commands.

To deploy on Windows, win_shell and win_command can be used.

Among other things, it is possible to use one of these modules to run MariaDB queries:

- name: Make the server read-only
  # become root to log into MariaDB with UNIX_SOCKET plugin
  become: yes
  shell: $( which mysql ) -e "SET GLOBAL read_only = 1;"

The main disadvantage with these modules is that they are not idempotent, because they're meant to run arbitrary system commands that Ansible can't understand. They are still useful in a variety of cases:

To run queries, because mysql_query is also not idempotent.
In cases when other modules do not allow us to use the exact arguments we need to use, we can achieve our goals by writing shell commands ourselves.
To run custom scripts that implement non-trivial logic. Implementing complex logic in Ansible tasks is possible, but it can be tricky and inefficient.
To call command-line tools. There may be specific roles for some of the most common tools, but most of the times using them is an unnecessary complication.

copy and template

An important part of configuration management is copying configuration files to remote servers.

The copy module allows us to copy files to target hosts. This is convenient for static files that we want to copy exactly as they are. An example task:

- name: Copy my.cnf
  copy:
    src: ./files/my.cnf.1
    dest: /etc/mysql/my.cnf

As you can see, the local name and the name on remote host don't need to match. This is convenient, because it makes it easy to use different configuration files with different servers. By default, files to copy are located in a files subdirectory in the role.

However, typically the content of a configuration file should vary based on the target host, the group and various variables. To do this, we can use the template module, which compiles and copies templates written in Jinja.

A simple template task:

- name: Compile and copy my.cnf
  copy:
    src: ./templates/my.cnf.j2
    dest: /etc/mysql/my.cnf

Again, the local and the remote names don't have to match. By default, Jinja templates are located in a templates subdirectory in the role, and by convention they have the .j2 extension. This is because Ansible uses Jinja version 2 for templating, at the time writing.

A simple template example:

## WARNING: DO NOT EDIT THIS FILE MANUALLY !!
## IF YOU DO, THIS FILE WILL BE OVERWRITTEN BY ANSIBLE

[mysqld]
innodb_buffer_pool_size = {{ innodb_buffer_pool_size }}

<div data-gb-custom-block data-tag="if" data-0='true' data-1='true' data-2='true' data-3='true'>

connect_work_size = {{ connect_work_size }}

</div>

Other Common Modules

The following modules are also often used for database servers:

package, apt or yum. Package is package manager-agnostic. Use them to install, uninstall and upgrade packages.
user, useful to create the system user and group that run MariaDB binary.
file can be used to make sure that MariaDB directories (like the data directory) exist and have proper permissions. It can also be used to upload static files.
template allows to create configuration files (like my.cnf) more dynamically, using the Jinja template language.
service is useful after installing MariaDB as a service, to start it, restart it or stop it.

Roles

Specific roles exist for MariaDB in Ansible Galaxy. Using them for MariaDB is generally preferable, to be sure to avoid incompatibilities and to probably be able to use some MariaDB specific features. However, using MySQL or Percona Server roles is also possible. This probably makes sense for users who also administer MySQL and Percona Server instances.

To find roles that suits you, check Ansible Galaxy search page. Most roles are also available on GitHub.

You can also search roles using the ansible-galaxy tool:

ansible-galaxy search mariadb

Installing MariaDB .deb Files with Ansible

This page refers to the operations described in Installing MariaDB .deb Files. Refer to that page for a complete list and explanation of the tasks that should be performed.

Here we discuss how to automate such tasks using Ansible. For example, here we show how to install a package or how to import a GPG key; but for an updated list of the necessary packages and for the keyserver to use, you should refer to Installing MariaDB .deb Files.

Adding apt Repositories

To add a repository:

- name: Add specified repository into sources list
  ansible.builtin.apt_repository:
    repo: deb [arch=amd64,arm64,ppc64el] http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.3/ubuntu bionic main
    state: present

If you prefer to keep the repository information in a source list file in the Ansible repository, you can upload that file to the target hosts in this way:

- name: Create a symbolic link
  ansible.builtin.file:
    src: ./file/mariadb.list
    dest: /etc/apt/sources.list.d/
    owner: root
    group: root
    mod: 644
    state: file

Updating the Repository Cache

Both the Ansible modules ansible.builtin.apt and ansible.builtin.apt_repository have an update_cache attribute. In ansible.builtin.apt it is set to "no" by default. Whenever a task sets it to 'yes', apt-get update is run on the target system. You have three ways to make sure that repositories are updated.

The first is to use ansible.builtin.apt_repository to add the desired repository, as shown above. So you only need to worry about updating repositories if you use the file method.

The second is to make sure that update_cache is set to 'yes' when you install a repository:

- name: Install foo
  apt:
    name: foo
    update_cache: yes

But if you run certain tasks conditionally, this option may not be very convenient. So the third option is to update the repository cache explicitly as a separate task:

- name: Update repositories
  apt:
    - update_cache: yes

Importing MariaDB GPG Key

To import the GPG key for MariaDB we can use the ansible.builtin.apt_key Ansible module. For example:

- name: Add an apt key by id from a keyserver
  ansible.builtin.apt_key:
    keyserver: hkp://keyserver.ubuntu.com:80
    id: 0xF1656F24C74CD1D8

Installing Packages

To install Deb packages into a system:

- name: Install software-properties-common
  apt:
    name: software-properties-common
    state: present

To make sure that a specific version is installed, performing an upgrade or a downgrade if necessary:

- name: Install foo 1.0
  apt:
    name: foo=1.0

To install a package or upgrade it to the latest version, use: state: latest.

To install multiple packages at once:

- name: Install the necessary packages
  apt:
    pkg:
    - pkg1
    - pkg2=1.0

If all your servers run on the same system, you will always use ansible.builtin.apt and the names and versions of the packages will be the same for all servers. But suppose you have some servers running systems from the Debian family, and others running systems from the Red Hat family. In this case, you may find convenient to use two different task files for two different types of systems. To include the proper file for the target host's system:

- include: mariadb-debian.yml
  when: "{{ ansible_facts['os_family'] }} == 'Debian'

The variables you can use to run the tasks related to the proper system are:

There is also a system-independent package module, but if the package names depend on the target system using it may be of very little benefit.

Managing Secrets in Ansible

An Ansible role often runs commands that require certain privileges, so it must perform some forms of login, using passwords or key pairs. In the context of database automation, we normally talk about: SSH access, sudo access, and access to MariaDB. If we write these secrets (passwords or private keys) in clear text in an Ansible repository, anyone who has access to the repository can access them, and this is not what we want.

Let's see how we can manage secrets.

The SSH Password or Keys

Most of the times, Ansible connects to the target hosts via SSH. It is common to use the system username and the SSH keys installed in /.ssh, which is the SSH clients default. In this case, nothing has to be done on the clients to be able to allow Ansible to use SSH, as long as they are already able to connect to the target hosts.

It is also possible to specify a different username as ANSIBLE_REMOTE_USER and an SSH configuration file as ANSIBLE_NETCONF_SSH_CONFIG. These settings can be specified in Ansible configuration file or as environment variables.

ANSIBLE_ASK_PASS can be specified. If this is the case, Ansible will prompt the user asking to type an SSH password.

As a general rule, any configuration that implies communicating sensible information to the persons who will connects to a system implies some degree of risk. Therefore, the most common choice is to allow users to login into remote systems with their local usernames, using SSH keys.

Once Ansible is able to connect remote hosts, it can also be used to install the public keys of some users to grant them access. Sharing these keys implies no risk. Sharing private keys is never necessary, and must be avoided.

MariaDB has a UNIX_SOCKET plugin that can be used to let some users avoid entering a password, as far as they're logged in the operating system. This authentication method is used by default for the root user. This is a good way to avoid having one more password and possibly writing to a .my.cnf file so that the user doesn't have to type it.

Even for users who connect remotely, it is normally not necessary to insert passwords in an Ansible file. When we create a user with a password, a hash of the original password is stored in MariaDB. That hash can be found in the mysql.user table. To know the hash of a password without even creating a user, we can use the PASSWORD() function:

SELECT PASSWORD('my_password12') AS hash;

When we create a user, we can actually specify a hash instead of the password the user will have to type:

CREATE USER user@host IDENTIFIED BY PASSWORD '*54958E764CE10E50764C2EECBB71D01F08549980';

ansible-vault

Even if you try to avoid sharing secrets, it's likely you'll have to keep some in Ansible. For example, MariaDB users that connect remotely have passwords, and if we want Ansible to create and manage those users, the hashes must be placed somewhere in our Ansible repository. While a hash cannot be converted back to a password, treating hashes as secrets is usually a good idea. Ansible provides a native way to handle secrets: ansible-vault.

In the simplest case, we can manage all our passwords with a single ansible-vault password. When we add or change a new password in some file (typically a file in host_vars or group_vars) we'll use ansible-vault to crypt this password. While doing so, we'll be asked to insert our ansible-vault password. When we apply a role and Ansible needs to decrypt this password, it will ask us to enter again our ansible-vault password.

ansible-vault can use more than one password. Each password can manage a different set of secrets. So, for example, some users may have the password to manage regular MariaDB users passwords, and only one may have the password that is needed to manage the root user.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Running mariadb-tzinfo-to-sql with Ansible

For documentation about the mariadb-tzinfo-to-sql utility, see mysql_tzinfo_to_sql. This page is about running it using Ansible.

Installing or Upgrading the Package

First, we should install mariadb-tzinfo-to-sql if it is available on our system. For example, to install it on Ubuntu, we can use this task. For other systems, use the proper module and package name.

- name: Update timezone info
  tags: [ timezone-update ]
  apt:
    name: tzdata
    state: latest
    install_recommends: no
  register: timezone_info

This task installs the latest version of the tzdata, unless it is already installed and up to date. We register the timezone_info variables, so we can only run the next task if the package was installed or updated.

We also specify a timezone-update tag, so we can apply the role to only update the timezone tables.

Running the Script

The next task runs mariadb-tzinfo-to-sql.

- name: Move system timezone info into MariaDB
  tags: [ timezone-update ]
  shell: >
    mysql_tzinfo_to_sql /usr/share/zoneinfo \
      | grep -v "^Warning" \
      | mysql --database=mysql
  when: timezone_info.changed

We use the shell module to run the command. Running a command in this way is not idempotent, so we specify when: timezone_info.changed to only run it when necessary. Some warnings may be generated, so we pipe the output of mysql_tzinfo_to_sql to grep to filter warnings out.

Using Galera

If we're using MariaDB Galera Cluster we'll want to only update the timezone tables in one node, because the other nodes will replicate the changes. For our convenience, we can run this operation on the first node. If the nodes hostnames are defined in a list called cluster_hosts, we can check if the current node is the first in this way:

when: timezone_info.changed and inventory_hostname == cluster_hosts[0].hostname

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Puppet and MariaDB

General information and hints on how to automate MariaDB deployments and configuration with Puppet, an open source tool for deployment, configuration, and operations.

Puppet Overview

Puppet is a tool to automate servers configuration management. It is produced by Puppet Inc, and released under the terms of the Apache License, version 2.

It is entirely possible to use Ansible to automate MariaDB deployments and configuration. This page contains generic information for MariaDB users who want to learn, or evaluate, Puppet.

Puppet modules can be searched using Puppet Forge. Most of them are also published on GitHub with open source licenses. Puppet Forge allows filtering modules to only view the most reliable: supported by Puppet, supported by a Puppet partner, or approved.

For information about installing Puppet, see Installing and upgrading in Puppet documentation.

Design Principles

With Puppet, you write manifests that describe the resources you need to run on certain servers and their attributes.

Therefore manifests are declarative. You don't write the steps to achieve the desired result. Instead, you describe the desired result. When Puppet detects differences between your description and the current state of a server, it decides what to do to fix those differences.

Manifests are also idempotent. You don't need to worry about the effects of applying a manifest twice. This may happen (see Architecture below) but it won't have any side effects.

Defining Resources

Here's an example of how to describe a resource in a manifest:

file { '/etc/motd':
  content => '',
  ensure => present,
}

This block describes a resource. The resource type is file, while the resource itself is /etc/motd. The description consists of a set of attributes. The most important is ensure, which in this case states that the file must exist. It is also common to use this resource to indicate that a file (probably created by a previous version of the manifest) doesn't exist.

These classes of resource types exist:

Built-in resources, or Puppet core resources: Resources that are part of Puppet, maintained by the Puppet team.
Defined resources: Resources that are defined as a combination of other resources. They are written in the Puppet domain-specific language.
Custom resources: Resources that are written by users, in the Ruby language.

To obtain information about resources:

# list existing resource types
puppet resource --types
# print information about the file resource type
puppet describe file

To group several resources in a reusable class:

class ssh_server {
  file { '/etc/motd':
    content => '',
    ensure => present,
  }
  file { '/etc/issue.net':
    content => '',
    ensure => present,
  }
}

There are several ways to include a class. For example:

include Class['ssh_server']

Defining Nodes

Puppet has a main manifest that could be a site.pp file or a directory containing .pp files. For simple infrastructures, we can define the nodes here. For more complex infrastructures, we may prefer to import other files that define the nodes.

Nodes are defined in this way:

node 'maria-1.example.com' {
  include common
  include mariadb
}

The resource type is node. Then we specify a hostname that is used to match this node to an existing host. This can also be a list of hostnames, a regular expression that matches multiple nodes, or the default keyword that matches all hosts. To use a regular expression:

node /^(maria|mysql)-[1-3]\.example\.com$/ {
  include common
}

Concepts

The most important Puppet concepts are the following:

Target: A host whose configuration is managed via Puppet.
Group: A logical group of targets. For example there may be a mariadb group, and several targets may be part of this group.
Facts: Information collected from the targets, like the system name or system version. They're collected by a Ruby gem called Facter. They can be core facts (collected by default) or custom facts (defined by the user).
Manifest: A description that can be applied to a target.
Catalog: A compiled manifest.
Apply: Modifying the state of a target so that it reflects its description in a manifest.
Module: A set of manifests.
Resource: A minimal piece of description. A manifest consists of a piece of resources, which describe components of a system, like a file or a service.
Resource type: Determines the class of a resource. For example there is a file resource type, and a manifest can contain any number of resources of this type, which describe different files.
Attribute: It's a characteristic of a resource, like a file owner, or its mode.
Class: A group of resources that can be reused in several manifests.

Architecture

Depending on how the user decides to deploy changes, Puppet can use two different architectures:

An Agent-master architecture. This is the preferred way to use Puppet.
A standalone architecture, that is similar to Ansible architecture.

Agent-master Architecture

A Puppet master stores a catalog for each target. There may be more than one Puppet master, for redundancy.

Each target runs a Puppet agent in background. Each Puppet agent periodically connects to the Puppet master, sending its facts. The Puppet master compiles the relevant manifest using the facts it receives, and send back a catalog. Note that it is also possible to store the catalogs in PuppetDB instead.

Once the Puppet agent receives the up-to-date catalog, it checks all resources and compares them with its current state. It applies the necessary changes to make sure that its state reflects the resources present in the catalog.

Standalone Architecture

With this architecture, the targets run Puppet apply. This application usually runs as a Linux cron job or a Windows scheduled task, but it can also be manually invoked by the user.

When Puppet apply runs, it compiles the latest versions of manifests using the local facts. Then it checks every resource from the resulting catalogs and compares it to the state of the local system, applying changes where needed.

Newly created or modified manifests are normally deployed to the targets, so Puppet apply can read them from the local host. However it is possible to use PuppetDB instead.

PuppetDB

PuppetDB is a Puppet node that runs a PostgreSQL database to store information that can be used by other nodes. PuppetDB can be used with both the Agent-master and the standalone architectures, but it is always optional. However it is necessary to use some advanced Puppet features.

PuppetDB stored the following information:

The latest facts from each target.
The latest catalogs, compiled by Puppet apply or a Puppet master.
Optionally, the recent history of each node activities.

External Node Classifiers

With both architectures, it is possible to have a component called an External Node Classifier (ENC). This is a script or an executable written in any language that Puppet can call to determine the list of classes that should be applied to a certain target.

An ENC received a node name in input, and should return a list of classes, parameters, etc, as a YAML hash.

Bolt

Bolt can be used in both architectures to run operations against a target or a set of targets. These operations can be commands passed manually to Bolt, scripts, Puppet tasks or plans. Bolt directly connects to targets via ssh and runs system commands.

See Bolt Examples to get an idea of what you can do with Bolt.

hiera

hiera is a hierarchical configuration system that allows us to:

Store configuration in separate files;
Include the relevant configuration files for every server we automate with Puppet.

See Puppet hiera Configuration System for more information.

Puppet Resources

More information about the topics discussed in this page can be found in the Ansible documentation:

Puppet Glossary in Puppet documentation.
Overview of Puppet's architecture in Puppet documentation.
PuppetDB documentation.
Classifying nodes in Puppet documentation.
Hiera in Puppet documentation.
Bolt documentation.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Deploying Docker Containers with Puppet

Puppet can also be used to manage Docker container upgrades and configuration changes. Docker has more specific tools for this purpose, but sometimes there are reasons to choose alternatives. See Benefits of Managing Docker Containers with Automation Software.

In this page you will find out what managing Docker with Puppet looks like. All the snippets in this page use the docker resource type, supported by the Puppet company.

How to Install, Upgrade or Uninstall Docker with Puppet

Installing or upgrading Docker is simple:

class { 'docker':
  use_upstream_package_source => false,
  version => '17.09.0~ce-0~debian',
}

In this example we are using our system's repositories instead of Docker official repositories, and we are specifying the desired version. To upgrade Docker later, all we need to do is to modify the version number. While specifying a version is not mandatory, it is a good idea because it makes our manifest more reproducible.

To uninstall Docker:

class { 'docker':
  ensure => absent
}

Check the docker resource type documentation to find out how to use more features: for example you can use Docker Enterprise Edition, or bind the Docker daemon to a TCP port.

How to Build or Pull Docker Images with Puppet

To pull an image from Dockerhub:

docker::image { 'mariadb:10.0': }

We specified the 10.0 tag to get the desired MariaDB version. If we don't, the image with the latest tag will be used. Note that this is not desirable in production, because it can lead to unexpected upgrades.

You can also write a Dockerfile yourself, and then build it to create a Docker image. To do so, you need to instruct Puppet to copy the Dockerfile to the target and then build it::

file { '/path/to/remote/Dockerfile':
  ensure => file,
  source => 'puppet:///path/to/local/Dockerfile',
}

docker::image { 'image_name':
  docker_file => '/path/to/remote/Dockerfile'
}

It is also possible to subscribe to Dockerfile changes, and automatically rebuild the image whenever a new file is found:

docker::image { 'image_name':
  docker_file => '/path/to/remote/Dockerfile'
  subscribe => File['/path/to/remote/Dockerfile'],
}

To remove an image that was possibly built or pulled:

docker::image { 'mariadb':
  ensure => absent
}

How to Deploy Containers with Puppet

To run a container:

docker::run { 'mariadb-01':
    image   => 'mariadb:10.5',
    ports   => ['3306:6606']
}

mariadb-01 is the contained name. We specified the optional 10.5 tag, and we mapped the guest port 3306 to the host port 6606. In production, you normally don't map ports because you don't need to connect MariaDB clients from the host system to MariaDB servers in the containers. Third-party tools can be installed as separate containers.

References

docker resource type documentation, in Puppet documentation.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Existing Puppet Modules for MariaDB

This page contains links to Puppet modules that can be used to automate MariaDB deployment and configuration. The list is not meant to be exhaustive. Use it as a starting point, but then please do your own research.

Puppet Forge

Puppet Forge is the website to search for Puppet modules, maintained by the Puppet company. Modules are searched by the technology that needs to be automated, and the target operating system.

Search criteria include whether the modules are supported by Puppet or its partners, and whether a module is approved by Puppet. Approved modules are certified by Puppet based on their quality and maintenance standards.

Acceptance Tests

Some modules that support the Puppet Development Kit allow some types of acceptance tests.

We can run a static analysis on a module's source code to find certain bad practices that are likely to be a source of bugs:

pdk validate

If a module's authors wrote unit tests, we can run them in this way:

pdk test unit

Supported Modules for MariaDB

At the time of writing, there are no supported or approved modules for MariaDB.

However there is a mysql module supported by Puppet, that supports the Puppet Development Kit. Though it doesn't support MariaDB-specific features, it works with MariaDB. Its documentation shows how to use the module to install MariaDB on certain operating systems.

Several unsupported and not approved modules exist for MariaDB and MaxScale.

Resources and References

Puppet Forge website.
Puppet Development Kit documentation.
Modules overview in Puppet documentation.
Beginner's guide to writing modules in Puppet documentation.
Puppet Supported Modules page in Puppet Forge.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Puppet hiera Configuration System

hiera is part of Puppet. It is a hierarchical configuration system that allows us to:

Store configuration in separate files;
Include the relevant configuration files for every server we automate with Puppet.

hiera Configuration Files

Each hierarchy allows to one choose the proper configuration file for a resource, based on certain criteria. For example criteria may include node names, node groups, operating systems, or datacenters. Hierarchies are defined in a hiera.yaml file, which also defines a path for the files in each hierarchy.

Puppet facts are commonly used to select the proper files to use. For example, a path may be defined as "os/%{facts.os.name}.yaml". In this case, each resource will use a file named after the operating system it uses, in the os directory. You may need to use custom facts, for example to check which microservices will use a MariaDB server, or in which datacenter it runs.

We do not have to create a file for each possible value of a certain fact. We can define a default configuration file with settings that are reasonable for most resources. Other files, when included, will override some of the default settings.

A hiera configuration file will look like this:

version: 5
defaults:
  datadir: global
  data_hash: yaml_data

hierarchy:
  - name: "Node data"
    path: "nodes/%{trusted.certname}.yaml"

  - name: "OS data"
    path: "os/%{facts.os.family}.yaml"

  - name: "Per-datacenter business group data" # Uses custom facts.
    path: "location/%{facts.whereami}/%{facts.group}.yaml"

This file would include the global files, the OS-specific files and the node-specific files. Each hierarchy will override settings from previous hierarchies.

We can actually have several hiera configuration files. hiera.yaml is the global file. But we will typically have additional hiera configuration files for each environment. So we can include the configuration files that apply to production, staging, etc, plus global configuration files that should be included for every environment.

Importantly, we can also have hiera configuration files for each module. So, for example, a separate mariadb/hiera.yaml file may defined the hierarchies for MariaDB servers. This allow us to define, for example, different configuration files for MariaDB and for MaxScale, as most of the needed settings are typically different.

Configuration files

You probably noticed that, in the previous example, we defined data_hash: yaml_data, which indicates that configuration files are written in YAML. Other allowed formats are JSON and HOCON. The data_hash setting is defined in defaults, but it can be overridden by hierarchies.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Bolt Examples

This page shows some examples of what we can do with Bolt to administer a set of MariaDB servers. Bolt is a tool that is part of the Puppet ecosystem.

For information about installing Bolt, see Installing Bolt in Bolt documentation.

Inventory Files

The simplest way to call Bolt and instruct it to do something on some remote targets is the following:

bolt ... --nodes 100.100.100.100,200.200.200.200,300,300,300,300

However, for non-trivial setups it is usually better to use an inventory file. An example:

targets:
  - uri: maria-1.example.com
    name: maria_1
    alias: mariadb_main
  ...

In this way, it will be possible to refer the target by name or alias.

We can also define groups, followed by the group members. For example:

groups:
  - name: mariadb-staging
    targets:
        - uri: maria-1.example.com
        name: maria_1
        - uri: maria-2.example.com
        name: maria_2
  - name: mariadb-production
    targets:
      ...
...

With an inventory of this type, it will be possible to run Bolt actions against all the targets that are members of a group:

bolt ... --nodes mariadb-staging

In the examples in the rest of the page, the --targets parameter will be indicated in this way, for simplicity: --targets <targets>.

Running Commands on Targets

The simplest way to run a command remotely is the following:

bolt command run 'mariadb-admin start-all-slaves' --targets <targets>

Copying Files

To copy a file or a whole directory to targets:

bolt file upload /path/to/source /path/to/destination --targets <targets>

To copy a file or a whole directory from the targets to the local host:

bolt file download /path/to/source /path/to/destination --targets <targets>

Running Scripts on Targets

We can use Bolt to run a local script on remote targets. Bolt will temporarily copy the script to the targets, run it, and delete it from the targets. This is convenient for scripts that are meant to only run once.

bolt script run rotate_logs.sh --targets <targets>

Running Tasks on Targets

Puppet tasks are not always as powerful as custom scripts, but they are simpler and many of them are idempotent. The following task stops MariaDB replication:

bolt task run mysql::sql --targets <targets> sql="STOP REPLICA"

Applying Puppet Code on Targets

It is also possible to apply whole manifests or portions of Puppet code (resources) on the targets.

To apply a manifest:

bolt apply manifests/server.pp  --targets <targets>

To apply a resource description:

bolt apply --execute "file { '/etc/mysql/my.cnf': ensure => present }" --targets <targets>

Bolt Resources and References

Further information about the concepts explained in this page can be found in Bolt documentation:

Inventory Files in Bolt documentation.
Applying Puppet code in Bolt documentation.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

MariaDB Containers

Containers are an OCI standard format for software images and their specified time all bundled up into a single distributable time. They can be used for production, development or testing.

Docker Inc. run a Docker Official Images program to provide users with an essential base implementation of MariaDB in a container and to exemplify best practices of a container.

The containers are available on Docker Hub as docker.io/library/mariadb though many container runtime implementation will fill in the docker.io/library where the host/path isn't specified.

The containers are in a Open Container Initiative format that allows the containers to be interoperable with a number of container runtime implementations. Docker, or more fully Docker Engine, is just one of the many available runtimes.

Many people use MariaDB Docker Official Image containers in CI systems like GitHub Actions, though its possible to use these in production environments like kubernetes.

The MariaDB Server container images are available with a number of tags:

A full version, like 10.11.5
A major version like 10.11
The most recent stable GA version - latest
The most recent stable LTS version - lts

Versions that aren't stable will be suffixed with -rc, or -alpha to clearly show their release status, and enables Renovatebot and other that follow semantic versioning to follow updates.

For a consistent application between testing an production environment using the SHA hash of the image is recommended like docker.io/library/mariadb@sha256:29fe5062baf36bae8ec68f21a3dce4f0372dadc185e687624f1252fc49d91c67. There is a list of mapping and history of tags to SHA hash on the Docker Library repository.

Adding Plugins to the MariaDB Docker Official Image

MariaDB has many plugins. Most are not enabled by default, some are in the mariadb container, while others need to be installed from additional packages.

The following methods summarize Installing plugins in the MariaDB Docker Library Container (mariadb.org blog post) on this topic.

Which Plugins Does the Container Contain?

To see which plugins are available in the mariadb:

$ docker run --rm mariadb:latest ls -C /usr/lib/mysql/plugin

Enabling a Plugin Using Flags

Using the --plugin-load-add flag with the plugin name (can be repeated), the plugins will be loaded and ready when the container is started:

For example, to enable the simple\_password\_check plugin:

$ docker run --name some-%%REPO%% -e MARIADB_ROOT_PASSWORD=my-secret-pw --network=host -d mariadb:latest --plugin-load-add=simple_password_check

Enabling a Plugin in the Configuration Files

plugin-load-add` can be used as a configuration option to load plugins. The example below loads the FederatedX Storage Engine.

$ printf "[mariadb]\nplugin-load-add=ha_federatedx\n" > /my/custom/federatedx.conf
$ docker run --name some-mariadb -v /my/custom:/etc/mysql/conf.d -e MARIADB_ROOT_PASSWORD=my-secret-pw -d mariadb:latest

Install a Plugin Using SQL in /docker-entrypoint-initdb.d

INSTALL SONAME can be used to install a plugin as part of the database initialization.

Create the SQL file used in initialization:

$ echo 'INSTALL SONAME "disks";' > my_initdb/disks.sql

In this case, the my\_initdb is a /docker-entrypoint-initdb.d directory per "Initializing a fresh instance" section above.

Identifying Additional Plugins in Additional Packages

A number of plugins are in separate packages to reduce their installation size. The package names of MariaDB-created plugins can be determined using the following command:

$ docker run --rm mariadb:latest sh -c 'apt-get update -qq && apt-cache search mariadb-plugin'

Creating an Image With Plugins From Additional Packages

A new image needs to be created when using additional packages. The mariadb image can however be used as a base:

In the following, the CONNECT Storage Engine is installed:

FROM mariadb:latest
RUN apt-get update && \
    apt-get install mariadb-plugin-connect -y && \
    rm -rf /var/lib/apt/lists/*

Installing plugins from packages creates a configuration file in the directory /etc/mysql/mariadb.conf.d/ that loads the plugin on startup.

_{This page is licensed: CC BY-SA / Gnu FDL}

Benefits of Managing MariaDB Containers with Orchestration Software

In this page we'll discuss why automating containers with software like Ansible or Puppet may be desirable in some cases. To talk about this, we'll first need to discuss why containers are defined ephemeral, and how this applies to containerized database servers (particularly MariaDB).

During the discussion, we should keep in mind that Docker Engine, CRI-I, containerd, Mirantis Container Runtime, Podman and other OCI container runtimes can be used to setup production and/or development environments. These use cases are very different from a database perspective: a production database may be big, and typically contains data that we don't want to lose. Development environments usually contain small sample data that can be rebuilt relatively quickly. This page focuses on the latter case.

Container Ephemeral Nature

Images are an OCI specified format that can be compiled from Dockerfiles as one of the ways. Containers are the OCI runtime specified way of creating a runtime version of an images. Normally, a container is not modified from the moment it is created. In other words, containers are usually designed to be ephemeral, meaning that they can be destroyed and replaced with new containers at any time. Provided that there is proper redundancy (for example, there are several web servers running the same services) destroying one container and starting a new one of the same type won't cause any damage.

We will discuss a bit later how this applies to MariaDB, and more generally to database servers.

When something should change, for example some software version or configuration, normally Dockerfiles are updated and containers are recreated from the latest image versions. For this reason, containers shouldn't contain anything that shouldn't be lost, and recreating them should be an extremely cheap operation. Docker Compose or the Swarm mode are used to declare which containers form a certain environment, and how they communicate with each other.

On the contrary, Ansible and Puppet are mainly built to manage the configuration of existing servers. It doesn't recreate servers, it changes their configuration. So Docker and Ansible have very different approaches. For this reason, Ansible and Puppet are not frequently used to deploy containers to production. However, using them together can bring some benefits, especially for development environments.

More on this later in the page. First, we need to understand how these concepts apply to database servers.

Stateful Technologies

Using ephemeral containers works very well for stateless technologies, like web servers and proxies. These technologies virtually only need binaries, configuration and small amounts of data (web pages). If some data need to be restored after a container creation, it will be a fast operation.

In the case of a database, the problem is that data can be large and need to be written somewhere. We don't want all databases to disappear when we destroy a container. Even if we had an up-to-date backup, restoring it would take time.

However, OCI Containers has features called volumes. A volume is a directory in the host system mapped to a directory in one or more containers. Volumes are not destroyed when containers are destroyed. They can be used to share data between any number of containers and the host system. Therefore, they are also a good way to persist data.

Suppose a MariaDB container called mariadb-main-01 uses a volume that is mapped to /var/docker/volumes/mariadb-main. At some point we want to use a more recent MariaDB version. As explained earlier, the container way to do this is to destroy the container and create a new one that uses a more recent version of the MariaDB image.

So, we will destroy mariadb-main-01. The volume is still there. Then we create a new container with the same name, but based on a newer image. We make sure to link the volume to the new container too, so it will be able to use /var/docker/volumes/mariadb-main again. At this point we may want to run mariadb-upgrade, but apart from that, everything should just work.

The container runtime implementations also provide the opportunity to create a volume with an explicit name and this is also persistent. The actual location on the filesystem is managed by the runtime.

The above described steps are simple, but running them manually is time consuming and error-prone. Automating them with some automation software like Ansible or Puppet is often desirable.

Ways to Deploy Containers

Containers can be deployed in the following ways:

Manually. See Installing and Using MariaDB via Docker. This is not recommended for production, or for complex environments. However, it can easily be done for the simplest cases. If we want to make changes to our custom images, we'll need to modify the Dockerfiles, destroy the containers and recreate them.
With Docker Compose. See Setting Up a LAMP Stack with Docker Compose for a simple example. When we modify a Dockerfile, we'll need to destroy the containers and recreate them, which is usually as simple as running docker compose down followed by docker compose up. After changing docker-cmpose.yml (maybe to add a container or a network) we'll simply need to run docker compose up again, because it is idempotent.
Using Ansible, Puppet or other automation software, as mentioned before. We can use Ansible or Puppet to create the containers, and run them again every time we want to apply some change to the containers. This means that the containers are potentially created once and modified any number of times.

In all these cases, it is entirely possible to add Vagrant to the picture. Vagrant is a way to deploy or provision several hosts, including virtual machines (the most common case), and containers. It is agnostic in regarding the underlying technology, so it can deploy to a virtual machine, a container, or even a remote server in the same way. Containers can work with Vagrant in two ways:

As a provisioner. In this case Vagrant will most commonly deploy a virtual machine, and will use Docker to setup the applications that need to run in it, as containers. This guarantees a higher level of isolation, compared to running the containers in the local host. Especially if you have different environments to deploy locally, because you can have them on different virtual machines.
As a provider. Vagrant will deploy one or more containers locally. Once each container is up, Vagrant can optionally use a provisioner on it, to make sure that the container runs the proper software with proper configuration. In this case, Ansible, Puppet or other automation software can be used as a provisioner. But again, this is optional: it is possible to make changes to the Dockerfiles and recreate the containers every time.

Benefits of Managing Containers with Automation Software

Containers can be entirely managed with Docker Compose or the Swarm mode. This is often a good idea.

However, choosing to use automation software like Ansible or Puppet has some benefits too. Benefits include:

Containers allow working without modifying the host system, and their creation is very fast. Much faster than virtual machines. This makes containers desirable for development environments.
As explained, making all containers ephemeral and using volumes to store important data is possible. But this means adding some complexity to adapt an ephemeral philosophy to technologies that are not ephemeral by nature (databases). Also, many database professionals don't like this approach. Using automation software allows easily triggering upgrades and configuration changes in the containers, treating them as non-ephemeral systems.
Sometimes containers are only used in development environments. If production databases are managed via Ansible, Puppet, or other automation software, this could lead to some code duplication. Dealing with configuration changes using the same procedures will reduce the cost of maintenance.
While recreating containers is fast, being able to apply small changes with Ansible or Puppet can be more convenient in some cases: particularly if we write files into the container itself, or if recreating a container bootstrap involves some lengthy procedure.
Trying to do something non-standard with Dockerfiles can be tricky. For example, running two processes in a container is possible but can be problematic, as containers are designed to run single main process per container. However there are situations when this is desirable. For example PMM containers run several different processes. Launching additional processes with Ansible or Puppet may be easier than doing it with a Dockerfile.

With all this in mind, let's see some examples of cases when managing containers with Ansible, Puppet or other automation software is preferable, rather than destroying containers every time we want to make a change:

We use Ansible or Puppet in production, and we try to keep development environments as similar as possible to production. By using Ansible/Puppet in development too, we can reuse part of the code.
We make changes to the containers often, and recreating containers is not as fast as it should be (for example because a MariaDB dump needs to be restored).
Creating a container implies some complex logic that does not easily fit a Dockerfile or Docker Compose (including, but not limited to, running multiple processes per container).

That said, every case is different. There are environments where these advantages do not apply, or bring a very small benefit. In those cases, the cost of adding some automation with Ansible, Puppet or similar software is probably not justified.

How to Deploy to Container from Orchestration Software

Suppose you want to manage containers configuration with Ansible.

At a first glance, the simplest way is to run Ansible in the host system. It will need to connect to the containers via SSH, so they need to expose the 22 port. But we have multiple containers, so we'll need to map the 22 port of each container to a different port in the host. This is hard to maintain and potentially insecure: in production you want to avoid exposing any container port to the host.

A better solution is to run Ansible itself in a container. The playbooks will be in a container volume, so we can access them from the host system to manage them more easily. The Ansible container will communicate with other containers using a container network, using the standard 22 port (or another port of your choice) for all containers.

Container Backup and Restoration

MariaDB databases in containers need backup and restore like their non-container equivalents.

Logicial Backups

In this section, we will assume that the MariaDB container has been created as follows:

$ docker volume create mariadb_data
$ docker volume create mariadb_backup
$ docker run --rm \
  -v mariadb_data:/var/lib/mysql \
  -v mariadb_backup:/backup \
  mariadb \
  chown -R mysql:mysql /var/lib/mysql /backup
$ docker run -d --name mariadb \
  -v mariadb_data:/var/lib/mysql \
  -v mariadb_backup:/backup \
  -e MARIADB_ROOT_PASSWORD='MariaDB11!' \
  <mariadb-image>

Backup

mariadb-dump is in the Docker Official Image and can be used as follows:

$ docker exec mariadb \
  sh -c 'mariadb-dump --all-databases -u root -p"$MARIADB_ROOT_PASSWORD" > backup/db.sql'

Restoring Data from Dump Files

For restoring data, you can use the following docker exec command:

$ docker exec mariadb \
  sh -c 'mariadb -u root -p"$MARIADB_ROOT_PASSWORD" < backup/db.sql'

Physical Backups

mariadb-backup is in the Docker Official Image.

Backup

mariadb-backup can create a backup as follows:

To perform a backup using mariadb-backup, a second container is started that shares the original container's data directory. An additional volume for the backup needs to be included in the second backup instance. Authentication against the MariaDB database instance is required to successfully complete the backup. In the example below, a mysql@localhost user is used with the MariaDB server's Unix socket shared with the backup container.

Note: Privileges listed here are for 10.5+. For an exact list, see mariadb-backup: Authentication and Privileges.

$ docker volume create mariadb_data
$ docker volume create mariadb_backup
$ docker run --rm \
  -v mariadb_data:/var/lib/mysql \
  -v mariadb_backup:/backup \
  mariadb \
  chown -R mysql:mysql /var/lib/mysql /backup
$ docker run -d --name mariadb \
  -v mariadb_data:/var/lib/mysql \
  -v mariadb_backup:/backup \
  -e MARIADB_ROOT_PASSWORD='MariaDB11!' \
  -e MARIADB_MYSQL_LOCALHOST_USER=1 \
  -e MARIADB_MYSQL_LOCALHOST_GRANTS='RELOAD, PROCESS, LOCK TABLES, BINLOG MONITOR' \
  <mariadb-image>

mariadb-backup will run as the mysql user in the container, so the permissions on /backup will need to ensure that it can be written to by this user:

$ docker exec --user mysql mariadb mariadb-backup --backup --target-dir=backup

Restore

These steps restore the backup made with mariadb-backup.

At some point before doing the restore, the backup needs to be prepared. The prepare must be done with the same MariaDB version that performed the backup. Perform the prepare like this:

$ docker run --rm \
  --name mariadb-restore \
  -v mariadb_backup:/backup \
  <mariadb-image> \
  mariadb-backup --prepare --target-dir=backup

Now that the image is prepared, start the container with both the data and the backup volumes and restore the backup. The data directory must be empty to perform this action:

$ docker volume create mariadb_restore
$ docker run --rm \
   -v mariadb_restore:/var/lib/mysql \
   --name mariadb-restore-change-permissions \
  <mariadb-image> \
  chown mysql: /var/lib/mysql
$ docker run --rm \
  --name mariadb-restore \
  -v mariadb_restore:/var/lib/mysql \
  -v mariadb_backup:/backup \
  --user mysql \
  <mariadb-image> \
  mariadb-backup --copy-back --target-dir=backup

With mariadb_restore volume containing the restored backup, start normally as this is an initialized data directory. At this point a later version of <mariadb-image> container can be used:

$ docker run -d --name mariadb \
  -v mariadb_restore:/var/lib/mysql \
  -e MARIADB_AUTO_UPGRADE=1 \
  -e MARIADB_ROOT_PASSWORD='MariaDB11!' \
  <mariadb-image>

On the environment variables here:

MARIADB_AUTO_UPGRADE here in addition to upgrading the system tables ensures there is a healthcheck user.
MARIADB_ROOT_PASSWORD is a convenience if any scripts, like logical backup above, use the environment variable. This environment variable is not strictly required.

For further information on mariadb-backup, see mariadb-backup Overview.

_{This page is licensed: CC BY-SA / Gnu FDL}

Container Security Concerns

When using containers in production, it is important to be aware of container security concerns.

Host System Security

Depending on the container runtime, containers may be running on the host system's kernel or a kernel shared with other containers. If this kernel has security bugs, those bugs are also present in the containers. Malicious containers may attempt to explain a kernel vulnerability to impact the confidentiality, integrity or availability of other containers.

In particular, Linux based containers have a container runtime that can use the following features:

Namespaces, to isolate containers from each other and make sure that a container can't establish unauthorized connections to another container.
Seccomp security profiles.
Rootless operation in Docker, or Rootless Podman
cgroups, to limit the resources (CPU, memory, IO) that each container can consume.

The administrators of a system should be particularly careful to upgrade the kernel whenever security bugs to these features are fixed.

It is important to note that when we upgrade the kernel, runC or Docker itself we cause downtime for all the containers running on the system.

Images Security

Containers are built from images. If security is a major concern, you should make sure that the images you use are secure.

If you want to be sure that you are pulling authentic images, you should only pull images signed with Docker Content Trust. Signing only ensure authenticity or origin, it doesn't dictate that entity is trustworthy.

Updated images should be used. An image usually downloads packages information at build time. If the image is not recently built, a newly created container will have old packages. Updating the packages on container creation and regularly re-updating them will ensure that the container uses packages with the most recent versions. Rebuilding an image often will reduce the time necessary to update the packages the first time.

Security bugs are usually important for a database server, so you don't want your version of MariaDB to contain known security bugs. But suppose you also have a bug in Docker, in runC, or in the kernel. A bug in a user-facing application may allow an attacker to exploit a bug in those lower level technologies. So, after gaining access to the container, an attacker may gain access to the host system. This is why system administrators should keep both the host system and the software running in the containers updated.

References

For more information, see the following links:

Container Security from Red Hat.
Docker security on Docker documentation.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Creating a Custom Container Image

OCI containers, frequently and incorrectly called Docker containers, are created from OCI images. An image contains software that can be launched, including the underlying system. A container is an instance of that software.

When we want to automate MariaDB, creating an image with MariaDB and the desired configuration, we may want to create an image by ourselves, which fulfils our needs.

Images Architecture

One "source code" of an image is a Dockerfile. A Dockerfile is written in Docker specific language, and can be compiled into an image by the docker binary, using the docker build command. It can also be compiled by buildah using buildah bud.

Most images are based on another image. The base image is specified at the beginning of the Dockerfile, with the FROM directive. If the base image is not present in the local system, it is downloaded from the repository specified, or if not specified, from the default repository of the build program. This is often Docker Hub. For example, we can build a mariadb-rocksdb:10.5 image starting from the debian:13 image. In this way, we'll have all the software included in a standard Debian image, and we'll add MariaDB and its configuration upon that image.

All the following Dockerfile directives are compiled into a new Docker image, identified by an SHA256 string. Each of these images is based on the image compiled from the previous directive. A physical compiled image can serve as a base for any number of images. This mechanism saves a lot of disk space, download time and build time.

The following diagram shows the relationship between Dockerfiles, images and containers:

Dockerfile Syntax

Here's a simple Dockerfile example:

FROM ubuntu:20.04

RUN apt-get update
RUN apt-get install -y mariadb-server

EXPOSE 3306

LABEL version="1.0"
LABEL description="MariaDB Server"

HEALTHCHECK --start-period=5m \
  CMD mariadb -e 'SELECT @@datadir;' || exit 1

CMD ["mariadbd"]

This example is not very good for practical purposes, but it shows what a Dockerfile looks like.

First, we declare that the base image to use is ubuntu:20.04.

Then we run some commands to install MariaDB from the Ubuntu default repositories and stop the MariaDB service.

We define some metadata about the image with LABEL. Any label is valid.

We declare that the port 3306 (MariaDB default port) should be exposed. However, this has no effect if the port is not exposed at container creation.

We also define a healthcheck. This is a command that is run to check if the container is healthy. If the return code is 0 the healthcheck succeeds, if it's 1 it fails. In the MariaDB specific case, we want to check that it's running and able to answer a simple query. This is better than just checking that MariaDB process is running, because MariaDB could be running but unable to respond, for example because max_connections was reached or data si corrupted. We read a system variable, because we should not assume that any user-created table exists. We also specify --start-period to allow some time for MariaDB to start, keeping in mind that restarting it may take some time if some data is corrupted. Note that there can be only one healthcheck: if the command is specified multiple times, only the last occurrence will take effect.

Finally, we start the container command: mariadbd. This command is run when a container based on this image starts. When the process stops or crashes, the container will immediately stop.

Note that, in a container, we normally run mariadbd directly or in an entrypoint script exec mariadbd, rather than running mysqld_safe or running MariaDB as a service. Containers restart can be handled by the container service. See automatic restart.

See the documentation links below to learn the syntax allowed in a Dockerfile.

Using Variables

It is possible to use variables in a Dockerfile. This allows us, for example, to install different packages, install different versions of a package, or configure software differently depending on how variables are set, without modifying the Dockerfile itself.

To use a variable, we can do something like this:

FROM ubuntu:20.04

ARG MARIADB_CONFIG_FILE

...

ENTRYPOINT mariadbd --defaults-file=$MARIADB_CONFIG_FILE

Here ARG is used after the FROM directive, thus the variable cannot be used in FROM. It is also possible to declare a variable before FROM, so we can use a variable to select the base image to use or its tag, but in this case the variable cannot be used after the FROM directive, unless ARG is re-declared after the FROM. Here is an example:

ARG UBUNTU_VERSION
FROM ubuntu:$UBUNTU_VERSION

# Uncomment for the build error to be avoided
# ARG UBUNTU_VERSION

# But this will cause a build error:
RUN echo 'Ubuntu version: $UBUNTU_VERSION' > /var/build_log

We'll have to assign variables a value when we build the Dockerfile, in this way:

docker build --build-arg UBUNTU_VERSION=20.04 .

Note that Dockerfile variables are just placeholders for values. Dockerfiles do not support assignment, conditionals or loops.

Versioning and Deploying Images

Dockerfiles are normally versioned, as well as the files that are copied to the images.

Once an image is built, it can be pushed to a container registry. Whenever an image is needed on a host to start containers from it, it is pulled from the registry.

Container registries

A default container registry for OCI images is Docker Hub. It contains Docker Official Images maintained by the Docker Library team and the community. Any individual or organization can open an account and push images to Docker Hub. Most Docker images are open source: the Dockerfiles and the needed files to build the images are usually on GitHub.

It is also possible to setup a self-hosted registry. Images can be pushed to that registry and pulled from it, instead of using Docker Hub. If the registry is not publicly accessible, it can be used to store images used by the organization without making them publicly available.

But a self-hosted registry can also be useful for open source images: if an image is available on Docker Hub and also on a self-hosted registry, in case Docker Hub is down or not reachable, it will still be possible to pull images.

Choosing Image Names and Tags

The names of images developed by the community follow this schema:

repository/maintainer/technology

It doesn't matter if the maintainer is an individual or an organization. For images available on Docker Hub, the maintainer is the name of a Docker Hub account.

Official images maintained by the Docker Library maintainers have the implicit name of library filled in by the container fetching tool. For example, the official MariaDB image is called mariadb which is an alias for docker.io/library/mariadb.

All images have a tag, which identifies the version or the variant of an image. For example, all MariaDB versions available on Docker are used as image tags. MariaDB 10.11 is called mariadb:10.11.

By conversion, tags form a hierarchy. So for example, there is a 10.1.1 tag whose meaning will not change over time. 10.5 will always identify the latest stable version in the 10.5 branch. For some time it was 10.5.1, then it became 10.5.2, and so on.

When we pull an image without specifying a tag (ie, docker pull mariadb), we are implicitly requiring the image with the latest tag. This is even more mutable: at different periods of time, it pointed to the latest 10.0 version, to the latest 10.1 version, and so on.

In production, it is always better to know for sure which version we are installing. Therefore it is better to specify a tag whose meaning won't change over time, like 10.5.21. To keep to a latest LTS version, the lts can be used.

Pushing and Pulling Images

To pull an image from Docker Hub or a self-hosted registry, we use the docker pull command. For example:

docker pull mariadb:10.5

This command downloads the specified image if it is not already present in the system, or if the local version is not up to date.

After modifying a Dockerfile, we can build an image in this way:

docker build .

This step can be automated by services like Docker Hub and GitHub. Check those service's documentation to find out how this feature works.

Once an image is created, it can be pushed to a registry. We can do it in this way:

docker push <image_name>:<tag>

Docker Content Trust

Docker has a feature called Docker Content Trust (DCT). It is a system used to digitally sign images, based on PEM keys. For environments where security is a major concern, it is important to sign images before pushing them. This can be done with both Docker Hub and self-hosted registries.

Good Practices and Caveats

As mentioned, a Dockerfile is built by creating a new image for each directive that follows FROM. This leads to some considerations.

Sometimes it can be a good idea to run several shell commands in a single RUN directive to avoid creating images that are not useful.
Modifying a directive means that all subsequent directives also need to be rebuilt. When possible, directives that are expected to change often should follow directives that will change seldom.
Directives like LABEL or EXPOSE should be placed close to the end of Dockerfiles. In this way they will be rebuilt often, but this operation is cheap. On the other side, changing a label should not trigger a long rebuild process.
Variables should be used to avoid Dockerfiles proliferation. But if a variable is used, changing its value should be tested. So, be sure not to use variables without a good reason.
Writing logic into a Dockerfile is impossible or very hard. Call shell scripts instead, and write your logic into them. For example, in a shell script it is easy to perform a certain operation only if a variable is set to a certain value.
If you need MariaDB containers with different configurations or different sets of plugins, use the method explained above. Do not create several Dockerfiles, with different tags, for each desired configuration or plugin set. This may lead to undesired code duplication and increased maintenance costs.

References

More details can be found in the Docker documentation:

Deploy MariaDB Enterprise Server with Docker

MariaDB Corporation provides Docker images for MariaDB Enterprise Server in the MariaDB Enterprise Docker Registry.

Docker provides multiple benefits:

Docker is an open platform for developing, shipping, and running applications that allows you to separate your applications from your infrastructure.
Docker images are portable. A Docker image can be deployed in a Docker container on any system using the Docker platform, regardless of the host operating system.
Docker containers are isolated from the host operating system and from other Docker containers.

If you want to deploy MariaDB Enterprise Server without Docker, alternative deployment methods are available.

Use Cases

MariaDB Enterprise Server can be deployed with Docker to support use cases that require software to be rapidly deployed on existing infrastructure, such as:

Continuously create and destroy automated testing environments as part of a continuous integration (CI) pipeline
Create a small test environment on a local workstation
Create multiple isolated test environments on the same host
Deployment alongside related containers using Docker Compose

Compatibility

The following products and versions can be deployed using the MariaDB Enterprise Docker Registry:

MariaDB Enterprise Server 10.5
MariaDB Enterprise Server 10.6
MariaDB Enterprise Server 11.4

For details about which storage engines and plugins are supported in the images for each version, see "MariaDB Enterprise Docker Registry".

Deploy Enterprise Server in a Docker Container

To deploy MariaDB Enterprise Server in a Docker container, follow the instructions below.

Step 1: Retrieve Customer Download Token

MariaDB Corporation requires customers to authenticate when logging in to the MariaDB Enterprise Docker Registry. A customer-specific Customer Download Token must be provided as the password.

Customer Download Tokens are available through the MariaDB Customer Portal.

To retrieve the customer download token for your account:

Navigate to the Customer Download Token at the MariaDB Customer Portal.
Log in using your MariaDB ID.
Copy the Customer Download Token to use as the password when logging in to the MariaDB Enterprise Docker Registry.

Step 2: Log In to Docker Registry

$ docker login docker.mariadb.com

When prompted, enter the login details:

As the user name, enter the email address associated with your MariaDB ID.
As the password, enter your Customer Download Token.

The login details will be saved.

Confirm the login details were saved by checking the /.docker/config.json file for a JSON object named "docker.mariadb.com" inside an "auths" parent JSON object:

$ cat ~/.docker/config.json
{
   "auths": {
      "docker.mariadb.com": {
         "auth": "<auth_hash>"
      }
   }
}

Step 3: Choose an Image Tag

The enterprise-server repository in the MariaDB Enterprise Docker Registry contains images for different MariaDB Enterprise Server releases using specific tags. Before continuing, you will need to decide which tag to use.

To deploy a container using the most recent image for the latest MariaDB Enterprise Server release series (currently 11.4), use the latest tag.

For additional information, see "MariaDB Enterprise Docker Registry: Supported Tags".

Step 4: Pull Docker Image

Pull the Docker image with the chosen tag by executing docker pull:

$ docker pull docker.mariadb.com/enterprise-server:latest

latest: Pulling from enterprise-server
5d87d5506868: Pull complete
Digest: sha256:68795ca747901e3402e30dab71d6d8bc72bce727db3b9e4888979468be77d250
Status: Downloaded newer image for docker.mariadb.com/enterprise-server:latest
docker.mariadb.com/enterprise-server:latest

Confirm the Docker image has been pulled by executing docker images:

$ docker images \
   --filter=reference='docker.mariadb.com/enterprise-server'

REPOSITORY                             TAG       IMAGE ID       CREATED        SIZE
docker.mariadb.com/enterprise-server   latest    dd17291aa340   3 months ago   451MB

Step 5: Create a Container

Create a container using the pulled Docker image by executing docker run:

$ docker run --detach \
   --name mariadb-es-latest \
   --env MARIADB_ROOT_PASSWORD='YourSecurePassword123!' \
   --publish '3307:3306/tcp' \
   docker.mariadb.com/enterprise-server:latest \
   --log-bin=mariadb-bin \
   <other mariadbd command-line options>

3082ab69e565be21c6157bb5a3d8c849ec03a2c51576778ac417a8a3aa9e7537

Configure the container and set the root password using environment variables by setting the --env command-line option.
Configure TCP port bindings for the container by setting the --publish or --publish-all command-line options.
Configure MariaDB Enterprise Server by setting mariadbd command-line options.

Confirm the container is running by executing docker ps:

$ docker ps \
   --all \
   --filter ancestor='docker.mariadb.com/enterprise-server:latest'

CONTAINER ID   IMAGE                                         COMMAND                  CREATED          STATUS                      PORTS      NAMES
3082ab69e565   docker.mariadb.com/enterprise-server:latest   "/es-entrypoint.sh -…"   12 seconds ago   Up 11 seconds               3306/tcp   mariadb-es-latest

By default, Docker uses Docker bridge networking for new containers. For details on how to use host networking for new containers, see "Create a Container with Host Networking".

Step 6: Connect to Container

Connect to the container by executing MariaDB Client on the container using docker exec:

$ docker exec --interactive --tty \
   mariadb-es-latest \
   mariadb \
   --user=root \
   --password

Confirm the container is using the correct version of MariaDB Enterprise Server by querying the version system variable with the SHOW GLOBAL VARIABLES statement:

SHOW GLOBAL VARIABLES
   LIKE 'version'\G

*************************** 1. row ***************************
Variable_name: version
        Value: 11.4.4-2-MariaDB-enterprise-log

Exit the container using exit:

exit

Bye

Step 7: Stop Container

Stop a Docker container using docker stop:

$ docker stop mariadb-es-latest

mariadb-es-latest

Confirm the container is stopped by executing docker ps:

$ docker ps \
   --all \
   --filter ancestor='docker.mariadb.com/enterprise-server:latest'

CONTAINER ID   IMAGE                                         COMMAND                  CREATED         STATUS                            PORTS       NAMES
3082ab69e565   docker.mariadb.com/enterprise-server:latest   "/es-entrypoint.sh -…"   2 minutes ago   Exited (143) About a minute ago               mariadb-es-latest

Step 8: Remove Container

Remove a Docker container using docker rm:

$ docker rm mariadb-es-latest

mariadb-es-latest

Confirm the container is removed by executing docker ps:

$ docker ps \
   --all \
   --filter ancestor='docker.mariadb.com/enterprise-server:latest'

CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

Docker and AWS EC2

This process shows how to deploy MariaDB in a Docker container running on an EC2 instance. First we'll create the EC2 VM, then we'll deploy Docker to it. After that, we'll pull the MariaDB Docker image which we'll use to create a running container with a MariaDB instance. Finally, we'll load a sample database into the MariaDB instance.

Create a VM in AWS EC2

Install MariaDB client on your local machine, either bundled with Maria DB server or standalone.
Login to AWS, navigate to EC2 service home
Choose Region for EC2 in the upper right corner of the console
Launch (1) Instance, giving instance a name (e.g. mrdb-ubuntu-docker-use1) and create or re-use a key pair
Choose Ubuntu 22.04 or similar free tier instance
Choose hardware, t2.micro or similar free tier instance
Create Key Pair with name (e.g. mrdb-docker-aws-pk.pem if using openSSH at the command line, or mrdb- docker-aws-pk..ppk for use with programs like PuTTY.)
Create or select a security group where SSH is allowed from anywhere 0.0.0.0/0. If you’d like to make this more secure, it can be restricted to a specific IP address or CIDR block.

{{aws-firewall}}

Accept remaining instance creation defaults and click “launch instance”.
Save the *.pem or *.ppk keyfile on your local hard drive when prompted. You will need it later. If you’re on Linux, don’t forget to change permissions on the downloaded *.pem / *.ppk key file:

$ chmod 400 mrdb-docker-pk.pem

Click into the instance summary (EC2 > Instances > Instance ID) and click on the “security” tab towards the bottom.

{{security-group}}

In the relevant security group for your instance, Create an inbound rule so that TCP port 3306 is open, allowing external connections to Maria DB (like your local command line client for MariaDB). Double check that port 22 is open while you're there for SSH.

Install Docker on the EC2 VM

For more detailed instructions, refer to Installing and Using MariaDB via Docker

Back in the instance summary (EC2 > Instances > Instance ID), copy the public IP (e.g. ww.xx.yyy.zzz)

{{aws-instance-ip}}

Open terminal window, navigate to the directory with private key (*.pem or *.ppk) file and start a SSH remote shell session by typing:

$ ssh -i mrdb-docker-pk.pem ubuntu@ww.xx.yyy.zzz

(switch ww.xx.yyy.zzz for your IP address from step 14).

Are you sure you want to continue connecting (yes/no/[fingerprint])? Say yes
Escalate to root

$ sudo su

Install Docker

$ curl -fsSL https://get.docker.com | sudo sh

Pull the MariaDB Docker image and create the container

Pull MariaDB Docker image

$ docker pull mariadb:lts

Start MDRB docker process

at your terminal / command line, type:

$ docker run --detach --name mariadb-docker -v \Users\YouUID\Documents\YourDirName:/var/lib/mysql:Z -p 3306:3306 -e MARIADB_ROOT_PASSWORD=yoursecurepassword mariadb:lts

The -v flag mounts a directory that you choose as /var/lib/mysql will ensure that the volume is persistent. Windows file paths like C:\Users\YouUID\Documents\YourDirName should be represented as above. Linux file paths should also be absolute vs. relative. Obviously replace the root password with something that is a bit more secure than you see in this example for anything other than development purposes.

Shell into container

$ docker exec -it mariadb-docker bash

Using the root password specified in step 20, type:

$ mariadb -pyoursecurepassword

Setup admin account with permission for remote connection, configure access control

MariaDB [(none)]> CREATE USER 'admin'@'%' IDENTIFIED BY 'admin';

MariaDB [(none)]> GRANT ALL ON *.* to 'admin'@'%' WITH GRANT OPTION;

MariaDB [(none)]> SHOW GRANTS FOR admin;

Obviously replace these passwords with something that is a bit more secure than you see in this example for anything other than development purposes.

Setup service account for your app with permission for remote connection, configure access control

MariaDB [(none)]> CREATE USER 'yourappname'@'%' IDENTIFIED BY 'yoursecurepassword';

MariaDB [(none)]> GRANT INSERT, UPDATE, DELETE ON *.* to 'yourappname'@'%';

MariaDB [(none)]> SHOW GRANTS FOR yourappname;

Obviously replace these passwords with something that is a bit more secure than you see in this example for anything other than development purposes.

Load up your database from your preexisting SQL script that contains CREATE DATABASE; USE DATABASE; and CREATE TABLE statements.

In a new local terminal window, not your SSH session, change directory to the directory containing your database creation script, say, init.sql in this example. Type:

$ mariadb --host=ww.xx.yyy.zzz --port=3306 --user=admin --password=admin -e “SOURCE init.sql”

(switch ww.xx.yyy.zzz for your IP address from step 14).

_{This page is licensed: CC BY-SA / Gnu FDL}

Docker and Google Cloud

This process shows how to deploy MariaDB in a Docker container running on an GCE instance. First we'll create the GCE VM, then we'll deploy Docker to it. After that, we'll pull the MariaDB Docker image which we'll use to create a running container with a MariaDB instance. Finally, we'll load a sample database into the MariaDB instance.

Create a VM in Google Cloud Compute Engine

Install MariaDB client on your local machine, either bundled with Maria DB server or standalone.
Login to Google Cloud, navigate to VM instances
Enable Compute Engine API if you haven’t already.
Click create instance, give instance a name (e.g. mrdb-ubuntu-docker-use1b), choose a region and zone.
Machine configuration: Choose general-purpose / E2 micro

Boot Disk > Change

Switch the operating system to a modern Ubuntu release x86/64 CPU architecture, or similar free tier offering.

Create a firewall rule in the Firewall Policies section of the console. After naming it, change the targets, add 0.0.0.0/0 as a source IP range, and open TCP port 3306. Then Click create.
Connect using Google Cloud’s built in browser SSH. Accept all prompts for authorization.

Install Docker on the GCE VM

For more detailed instructions, refer to Installing and Using MariaDB via Docker

Escalate to root Escalate to root

$ sudo su

Install Docker

$ curl -fsSL get.docker.com | sudo sh

Pull Docker image

$ docker pull mariadb:lts

Start MDRB docker process

at your terminal / command line, type:

$ docker run --detach --name mariadb-docker -v \Users\YouUID\Documents\YourDirName:/var/lib/mysql:Z -p 3306:3306 -e MARIADB_ROOT_PASSWORD=yoursecurepassword mariadb:lts

Shell into container $ docker exec -it mariadb-docker bash
Login to MRDB inside container

Using the root password specified in step 12, type:

$ mariadb -pyoursecurepassword

Setup admin account with permission for remote connection, configure access control Execute these SQL commands in sequence:

MariaDB [(none)]> CREATE USER 'admin'@'%' IDENTIFIED BY 'admin';

MariaDB [(none)]> GRANT ALL ON . to 'admin'@'%' WITH GRANT OPTION;

MariaDB [(none)]> SHOW GRANTS FOR admin;

Obviously replace these passwords with something that is a bit more secure than you see in this example for anything other than development purposes.

Setup service account for your app with permission for remote connection, configure access control Execute these SQL commands in sequence:

MariaDB [(none)]> CREATE USER 'yourappname'@'%' IDENTIFIED BY 'yoursecurepassword';

MariaDB [(none)]> GRANT INSERT, UPDATE, DELETE ON . to 'yourappname'@'%';

MariaDB [(none)]> SHOW GRANTS FOR yourappname;

Obviously replace these passwords with something that is a bit more secure than you see in this example for anything other than development purposes.

Load up your database from your preexisting SQL script that contains CREATE DATABASE; USE DATABASE; and CREATE TABLE statements.

Copy the external IP address of your VM instance from the Console in the VM instances list.

In a new local terminal window, not your SSH session, change directory to the directory containing your database creation script, say, init.sql in this example.

Type: $ mariadb --host=ww.xx.yyy.zzz --port=3306 --user=admin --password=admin -e “SOURCE init.sql” (switch ww.xx.yyy.zzz for your IP address from step 17.

_{This page is licensed: CC BY-SA / Gnu FDL}

Docker and Microsoft Azure

This process shows how to deploy MariaDB in a Docker container running on an Azure VM instance. First we'll create the Azure VM, then we'll deploy Docker to it. After that, we'll pull the MariaDB Docker image which we'll use to create a running container with a MariaDB instance. Finally, we'll load a sample database into the MariaDB instance.

Create a VM in Azure

Install MariaDB client on your local machine, either bundled with Maria DB server or standalone.
Login to Azure, navigate to Azure Virtual Machine
Create VM. Give the VM a name (e.g. mrdb-ubuntu-docker-use1), and create new or use an existing resource group. Selection region and availability zone, and choose Ubuntu 22.04 LTS x64 (free services eligible).

Choose the VM instance size, like a B1s or similar free tier. Note that Azure free works on a credit based system for new accounts

Configure an administrator account and generate a new key pair, and give the key pair a name.

Click "Review + Create" at the very bottom of the "create virtual machine" page to create the VM.
Download the SSH keys and them in a safe place, you will need them later. For this example, let's name the key file mrdb-docker-pk.pem.

If your local machine is Linux or you are using WSL on Windows, open a terminal window and: $ mv /mnt/c/ /.ssh/ $ chmod 400 /.ssh/

Once the VM is deployed, "click to resource" to get back to the virtual machine's overview page.
From the overview page, the left-hand navigation, choose settings > networking.
Click "add inbound port rule"
Configure the port rule to allow port TCP 3306 inbound (mySQL) so that you can make external connections from your local Maria DB command line client, to the dockerized Maria DB instance in your Azure Linux VM.
Navigate back to the virtual machine's overview page. Then copy the public IP address to the clipboard.

Install Docker on the Azure VM

For more detailed instructions, refer to Installing and Using MariaDB via Docker

Open terminal window, referencing the path to the private key (*.pem or *.ppk) file, and start a SSH remote shell session by typing:

$ ssh -i /.ssh/mrdb-docker-pk.pem azureuser@ww.xx.yyy.zzz

(switch ww.xx.yyy.zzz for your IP address from step 12, and replace "mrdb-docker-pk.pem" with your keyfile name if you chose something different).

If you forget your administrator account details, simply go to the left-hand navigation and choose settings > connect, and Azure will display the public IP address, admin username, and port for you.

Are you sure you want to continue connecting (yes/no/[fingerprint])? Say yes
Escalate to root

$ sudo su

Microsoft Azure on two machines come with docker preinstalled. For any reason you need to reinstall it , chose another machine type is not have docker preinstalled, you can install docker inside your SSH session with cURL by typing:

$ curl -fsSL get.docker.com | sudo sh

Pull the MariaDB Docker image and create the container

Pull MariaDB Docker image

$ docker pull mariadb:lts

Start MDRB docker process

at your terminal / command line, type:

$ docker run --detach --name mariadb-docker -v \Users\YouUID\Documents\YourDirName:/var/lib/mysql:Z -p 3306:3306 -e MARIADB_ROOT_PASSWORD=yoursecurepassword mariadb:lts

Shell into container

$ docker exec -it mariadb-docker bash

Using the root password specified in step 20, type:

$ mariadb -pyoursecurepassword

Setup admin account with permission for remote connection, configure access control

MariaDB [(none)]> CREATE USER 'admin'@'%' IDENTIFIED BY 'admin';

MariaDB [(none)]> GRANT ALL ON . to 'admin'@'%' WITH GRANT OPTION;

MariaDB [(none)]> SHOW GRANTS FOR admin;

Obviously replace these passwords with something that is a bit more secure than you see in this example for anything other than development purposes.

Setup service account for your app with permission for remote connection, configure access control

MariaDB [(none)]> CREATE USER 'yourappname'@'%' IDENTIFIED BY 'yoursecurepassword';

MariaDB [(none)]> GRANT INSERT, UPDATE, DELETE ON . to 'yourappname'@'%';

MariaDB [(none)]> SHOW GRANTS FOR yourappname;

Obviously replace these passwords with something that is a bit more secure than you see in this example for anything other than development purposes.

Load up your database from your preexisting SQL script that contains CREATE DATABASE; USE DATABASE; and CREATE TABLE statements.

In a new local terminal window, not your SSH session, change directory to the directory containing your database creation script, say, init.sql in this example. Then type:

$ mariadb --host=ww.xx.yyy.zzz --port=3306 --user=admin --password=admin -e “SOURCE init.sql”

(switch ww.xx.yyy.zzz for your IP address from step 12).

_{This page is licensed: CC BY-SA / Gnu FDL}

Docker Official Image Frequently Asked Questions

Frequently asked questions about the Docker Official Image

How to Reset Passwords

If you have an existing data directory and wish to reset the root and user passwords, and to create a database which the user can fully modify, perform the following steps.

First create a passwordreset.sql file:

CREATE USER IF NOT EXISTS root@localhost IDENTIFIED BY 'thisismyrootpassword';
SET PASSWORD FOR root@localhost = PASSWORD('thisismyrootpassword');
GRANT ALL ON *.* TO root@localhost WITH GRANT OPTION;
GRANT PROXY ON ''@'%' ON root@localhost WITH GRANT OPTION;
CREATE USER IF NOT EXISTS root@'%' IDENTIFIED BY 'thisismyrootpassword';
SET PASSWORD FOR root@'%' = PASSWORD('thisismyrootpassword');
GRANT ALL ON *.* TO root@'%' WITH GRANT OPTION;
GRANT PROXY ON ''@'%' ON root@'%' WITH GRANT OPTION;
CREATE USER IF NOT EXISTS myuser@'%' IDENTIFIED BY 'thisismyuserpassword';
SET PASSWORD FOR myuser@'%' = PASSWORD('thisismyuserpassword');
CREATE DATABASE IF NOT EXISTS databasename;
GRANT ALL ON databasename.* TO myuser@'%';

Adjust myuser, databasename and passwords as needed.

Then:

$ docker run --rm -v /my/own/datadir:/var/lib/mysql -v /my/own/passwordreset.sql:/passwordreset.sql:z %%IMAGE%%:latest --init-file=/passwordreset.sql

On restarting the MariaDB container in this /my/own/datadir, the root and myuser passwords will be reset.

Temp Server Start Timeout

Question, are you getting errors like the following where a temporary server start fails to succeed in 30 seconds?

Example of log:

2023-01-28 12:53:42+00:00 [Note] [Entrypoint]: Starting temporary server
2023-01-28 12:53:42+00:00 [Note] [Entrypoint]: Waiting for server startup
2023-01-28 12:53:42 0 [Note] mariadbd (server 10.10.2-MariaDB-1:10.10.2+maria~ubu2204) starting as process 72 ...
....
2023-01-28 12:53:42 0 [Note] InnoDB: Setting file './ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ...
2023-01-28 12:54:13 0 [Note] mariadbd: ready for connections.
Version: '10.10.2-MariaDB-1:10.10.2+maria~ubu2204'  socket: '/run/mysqld/mysqld.sock'  port: 0  mariadb.org binary distribution
2023-01-28 12:54:13+00:00 [ERROR] [Entrypoint]: Unable to start server.

The timeout on a temporary server start is a quite generous 30 seconds.

The lack of a message like the following indicates it failed to complete writing a temporary file of 12MiB in 30 seconds.

2023-01-28 12:53:46 0 [Note] InnoDB: File './ibtmp1' size is now 12.000MiB.

If the datadir where this is stored is remote storage maybe it's a bit slow. It's ideal to have an InnoDB temporary path local so this can be configured using the command or configuration setting:

innodb_temp_data_file_path=/dev/shm/ibtmp1:12M:autoextend

Note: depending on container runtime this space may be limited.

Creating a replication pair

MARIADB_REPLICATION_USER / MARIADB_REPLICATION_PASSWORD specify the authentication for the connection. The MARIADB_MASTER_HOST is the indicator that it is a replica and specifies the container aka hostname, of the master.

A docker-compose.yml example:

version: "3"
services:
  master:
    image: mariadb:latest
    command: --log-bin --log-basename=mariadb
    environment:
      - MARIADB_ROOT_PASSWORD=password
      - MARIADB_USER=testuser
      - MARIADB_PASSWORD=password
      - MARIADB_DATABASE=testdb
      - MARIADB_REPLICATION_USER=repl
      - MARIADB_REPLICATION_PASSWORD=replicationpass
    healthcheck:
      test: ["CMD", "healthcheck.sh", "--connect", "--innodb_initialized"]
      interval: 10s
      timeout: 5s
      retries: 3
  replica:
    image: mariadb:latest
    command: --server-id=2 --log-basename=mariadb
    environment:
      - MARIADB_ROOT_PASSWORD=password
      - MARIADB_MASTER_HOST=master
      - MARIADB_REPLICATION_USER=repl
      - MARIADB_REPLICATION_PASSWORD=replicationpass
      - MARIADB_HEALTHCHECK_GRANTS=REPLICA MONITOR
    healthcheck:
      test: ["CMD", "healthcheck.sh", "--connect", "--replication_io", "--replication_sql", "--replication_seconds_behind_master=1", "--replication"]
      interval: 10s
      timeout: 5s
      retries: 3
    depends_on:
      master:
        condition: service_healthy

Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler.

This will show up in the container log as:

2024-01-29 17:38:13 0 [ERROR] Incorrect definition of table mysql.event: expected column 'definer' at position 3 to have type varchar(, found type char(141).
2024-01-29 17:38:13 0 [ERROR] mariadbd: Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler.

The cause is the underlying table has change structure from the last MariaDB version. The easiest solution to this is to start the container with the environment variable MARIADB_AUTO_UPGRADE=1 and system tables will be updated. This is safe to keep on as it detects the version installed. The next start should not show this error.

InnoDB: Upgrade after a crash is not supported. The redo log was created with MariaDB X.Y.Z

This will show up in the error log as:

2022-05-23 12:29:20 0 [ERROR] InnoDB: Upgrade after a crash is not supported. The redo log was created with MariaDB 10.5.4.
2022-05-23 12:29:20 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error

This is attempting to start on a higher MariaDB version when the shutdown of the previous version crashed.

By crashed, it means the MariaDB was force killed or had a hard power failure. MariaDB, being a durable database, can recover from these, if started with the same version. The redo log however is a less stable format, so the recovery has to be on the same Major.Minor version, in this case 10.5. This error message is saying that you when from force killed MariaDB to a later version.

So whenever you encounter this message. Start with the again with the tag set to the version in the error message, like 10.5.4, or as the redo long format is consistent in the Major.Minor version 10.5 is sufficient. After this has been started correctly, cleanly shut the service down and it will be recovered.

The logs on shutdown should have a message like:

2023-11-06 10:49:23 0 [Note] InnoDB: Shutdown completed; log sequence number 84360; transaction id 49
2023-11-06 10:49:23 0 [Note] mariadbd: Shutdown complete

After you see this, you can update your MariaDB tag to a later version.

Every MariaDB start gives permission denied messages

2024-02-06 03:03:18+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:10.11.6+maria~ubu2204 started.
/usr/local/bin/docker-entrypoint.sh: line 600: /var/lib/mysql//mysql_upgrade_info: Permission denied
2024-02-06 03:03:18+00:00 [Note] [Entrypoint]: MariaDB upgrade (mariadb-upgrade) required, but skipped due to $MARIADB_AUTO_UPGRADE setting
2024-02-06  3:03:18 0 [Warning] Can't create test file '/var/lib/mysql/80a2bb81d698.lower-test' (Errcode: 13 "Permission denied")
2024-02-06  3:03:18 0 [Note] Starting MariaDB 10.11.6-MariaDB-1:10.11.6+maria~ubu2204 source revision fecd78b83785d5ae96f2c6ff340375be803cd299 as process 1
2024-02-06  3:03:18 0 [ERROR] mariadbd: Can't create/write to file './ddl_recovery.log' (Errcode: 13 "Permission denied")
2024-02-06  3:03:18 0 [ERROR] DDL_LOG: Failed to create ddl log file: ./ddl_recovery.log
2024-02-06  3:03:18 0 [ERROR] Aborting

Or:

2024-08-16  4:54:05 0 [ERROR] InnoDB: Operating system error number 13 in a file operation.
2024-08-16  4:54:05 0 [ERROR] InnoDB: The error means mariadbd does not have the access rights to the directory.

In this case, the container is running as a user that, inside the container, does not have write permissions on the datadir /varlib/mysql.

Bad magic header in tc log

From the transaction coordinator log this is a corrupted file. This will have a log message like the following:

2024-05-21  8:55:58 0 [Note] Recovering after a crash using tc.log
2024-05-21  8:55:58 0 [ERROR] Bad magic header in tc log
2024-05-21  8:55:58 0 [ERROR] Crash recovery failed. Either correct the problem (if it's, for example, out of memory error) and restart, or delete tc log and start server with --tc-heuristic-recover={commit|rollback}
2024-05-21  8:55:58 0 [ERROR] Can't init tc log
2024-05-21  8:55:58 0 [ERROR] Aborting

The cause of this is headed by the first not, its a crash recovery. Like the Every MariaDB start is a crash recovery answer below, this is an indication that MariaDB wasn't given enough time by the container runtime to shutdown cleanly. While MariaDB was shutdown, the new version that was started is a newer MariaDB and doesn't recognise the updated magic information in the header.

MariaDB should always perform crash recovery with the same version that actually crashed, the same major/minor number at least.

As such the solution is to restart the container with the previous MariaDB version that was running, and configure the container runtime to allow a longer stop time. See the Every MariaDB start is a crash recovery answer below to see if the timeout is sufficiently extended.

Every MariaDB start is a crash recovery

Do you get on every start:

db-1  | 2023-02-25 19:10:02 0 [Note] Starting MariaDB 10.11.2-MariaDB-1:10.11.2+maria~ubu2204-log source revision cafba8761af55ae16cc69c9b53a341340a845b36 as process 1                                                                                                         
db-1  | 2023-02-25 19:10:02 0 [Note] mariadbd: Aria engine: starting recovery                                                                                                                                                                                                   
db-1  | tables to flush: 3 2 1 0                                                                                                                                                                                                                                                
db-1  |  (0.0 seconds);                                                                                                                                                                                                                                                         
db-1  | 2023-02-25 19:10:02 0 [Note] mariadbd: Aria engine: recovery done  
...
db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: Initializing buffer pool, total size = 32.000GiB, chunk size = 512.000MiB
db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: Completed initialization of buffer pool
db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
db-1  | 2023-02-26 13:03:29 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=193796878816

Container runtimes are assume to start and stop very quickly. Check the shutdown logs. They may be a log like:

db-1  | 2023-02-26 13:03:17 0 [Note] InnoDB: Starting shutdown...                                                                                                                                                                                                               
db-1  | 2023-02-26 13:03:17 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool                                                                                                                                                                            
db-1  | 2023-02-26 13:03:17 0 [Note] InnoDB: Restricted to 519200 pages due to innodb_buf_pool_dump_pct=25                                                                                                                                                                      
db-1  | 2023-02-26 13:03:17 0 [Note] InnoDB: Buffer pool(s) dump completed at 230226 13:03:17                                                                                                                                                                                   
db-1 exited with code 0

Note that the logs didn't include the following messages:

db-1  | 2023-02-26 13:03:43 0 [Note] InnoDB: Shutdown completed; log sequence number 46590; transaction id 15
db-1  | 2023-02-26 13:03:43 0 [Note] mariadbd: Shutdown complete

As these messages aren't here, the container was killed off before it could just down cleanly. When this happens, the startup will be a crash recovery and you won't be able to upgrade your MariaDB instance (previous FAQ) to the next Major.Minor version.

Solution is to extend the timeout in the container runtime to allow MariaDB to complete its shutdown.

How do I create a MariaDB-backup of the data?

docker volume create backup
docker run --name mdb -v backup:/backup -v datavolume:/var/lib/mysql mariadb
docker exec mdb mariadb-backup --backup --target-dir=/backup/d --user root --password soverysecret
docker exec mdb mariadb-backup --prepare --target-dir=/backup/d
docker exec mdb sh -c '[ ! -f /backup/d/.my-healthcheck.cnf ] && cp /var/lib/mysql/.my-healthcheck.cnf /backup/d'
docker exec --workdir /backup/d mdb tar -Jcf ../backup.tar.xz .
docker exec mdb  rm -rf /backup/d

How do I restore from a MariaDB-backup

With the backup prepared like previously:

docker run -v backup:/docker-entrypoint-initdb.d -v newdatavolume:/var/lib/mysql mariadb

How to start MariaDB with Apptainer

Because Apptainer has all the filesystems readonly except or the volume, the /run/mysqld directory is used as a pidfile and socket directory. An easy way is to mark this as a scratch directory.

mkdir mydatadir
apptainer run --no-home --bind $PWD/mydatadir:/var/lib/mysql --env MARIADB_RANDOM_ROOT_PASSWORD=1 --net --network-args "portmap=3308:3306/tcp" --fakeroot --scratch=/run/mysqld  docker://mariadb:10.5

Alternately:

apptainer run --no-home --bind $PWD/mydatadir:/var/lib/mysql --env MARIADB_RANDOM_ROOT_PASSWORD=1 --net --network-args "portmap=3308:3306/tcp" --fakeroot   docker://mariadb:10.5 --socket=/var/lib/mysql/mariadb.sock --pid-file=/var/lib/mysql/mariadb.pid

Why does the MariaDB container start as root?

The MariaDB entrypoint briefly starts as root, and if a explicit volume is there, the owner of this volume will be root. To allow MariaDB to use the CHOWN capability to change to the volume owner to a user that can write to this volume, it needs to be briefly root. After this one action is taken, the entrypoint uses gosu to drop to a non-root user and continues execution. There is no accessible exploit vector to remotely affect the container startup when it is briefly running as the root user.

Can I run the MariaDB container as an arbitrary user?

Yes. using the user: 2022 in a compose file, or --user 2022 as a command will run the entrypoint as the user id 2022. When this occurs, it is assumed that the volume of the datadir has the right permissions for MariaDB to access the datadir. This can be useful if your local user is user id 2022 and your datadir is owned locally by this user. Note inside the container there isn't the same user names outside the container defined, so working with numbers is more portable.

_{This page is licensed: CC BY-SA / Gnu FDL}

Installing and Using MariaDB via Docker

Sometimes we want to install a specific version of MariaDB, MariaDB ColumnStore, or MaxScale on a certain system, but no packages are available. Or maybe, we simply want to isolate MariaDB from the rest of the system, to be sure that we won't cause any damage.

A virtual machine would certainly serve the scope. However, this means installing a system on the top of another system. It requires a lot of resources.

In many cases, the best solution is using containers. Docker is a framework that runs containers. A container is meant to run a specific daemon, and the software that is needed for that daemon to properly work. Docker does not virtualize a whole system; a container only includes the packages that are not included in the underlying system.

Docker requires a very small amount of resources. It can run on a virtualized system. It is used both in development and in production environments. Docker is an open source project, released under the Apache License, version 2.

Note that, while your package repositories could have a package called docker, it is probably not the Docker we are talking about. The Docker package could be called docker.io or docker-engine.

For information about installing Docker, see Get Docker in Docker documentation.

Installing Docker on Your System with the Universal Installation Script

The script below will install the Docker repositories, required kernel modules and packages on the most common Linux distributions:

curl -sSL https://get.docker.com/ | sh

Starting dockerd

On some systems you may have to start the dockerd daemon yourself:

sudo systemctl start docker
sudo gpasswd -a "${USER}" docker

If you don't have dockerd running, you will get the following error for most docker commands: installing-and-using-mariadb-via-docker Cannot connect to the Docker daemon at unix:/var/run/docker.sock. Is the docker daemon running? &#xNAN;<>

Using MariaDB Images

The easiest way to use MariaDB on Docker is choosing a MariaDB image and creating a container.

Downloading an Image

You can download a MariaDB image for Docker from the Offical Docker MariaDB, or choose another image that better suits your needs. You can search Docker Hub (the official set of repositories) for an image with this command:

docker search mariadb

Once you have found an image that you want to use, you can download it via Docker. Some layers including necessary dependencies will be downloaded too. Note that, once a layer is downloaded for a certain image, Docker will not need to download it again for another image.

For example, if you want to install the default MariaDB image, you can type:

docker pull mariadb:10.4

This will install the 10.4 version. Versions 10.2, 10.3, 10.5 are also valid choices.

You will see a list of necessary layers. For each layer, Docker will say if it is already present, or its download progress.

To get a list of installed images:

docker images

Creating a Container

An image is not a running process; it is just the software needed to be launched. To run it, we must create a container first. The command needed to create a container can usually be found in the image documentation. For example, to create a container for the official MariaDB image:

docker run --name mariadbtest -e MYSQL_ROOT_PASSWORD=mypass -p 3306:3306 -d docker.io/library/mariadb:10.3

mariadbtest is the name we want to assign the container. If we don't specify a name, an id will be automatically generated.

10.2 and 10.5 are also valid target versions:

docker run --name mariadbtest -e MYSQL_ROOT_PASSWORD=mypass -p 3306:3306 -d docker.io/library/mariadb:10.2

docker run --name mariadbtest -e MYSQL_ROOT_PASSWORD=mypass -p 3306:3306  -d docker.io/library/mariadb:10.5

Optionally, after the image name, we can specify some options for mariadbd. For example:

docker run --name mariadbtest -e MYSQL_ROOT_PASSWORD=mypass -p 3306:3306 -d mariadb:10.3 --log-bin --binlog-format=MIXED

Docker will respond with the container's id. But, just to be sure that the container has been created and is running, we can get a list of running containers in this way:

docker ps

We should get an output similar to this one:

CONTAINER ID        IMAGE                      COMMAND                CREATED             STATUS              PORTS               NAMES
819b786a8b48        mariadb                    "/docker-entrypoint.   4 minutes ago       Up 4 minutes        3306/tcp            mariadbtest

Running and Stopping the Container

Docker allows us to restart a container with a single command:

docker restart mariadbtest

The container can also be stopped like this:

docker stop mariadbtest

The container will not be destroyed by this command. The data will still live inside the container, even if MariaDB is not running. To restart the container and see our data, we can issue:

docker start mariadbtest

With docker stop, the container will be gracefully terminated: a SIGTERM signal will be sent to the mariadbd process, and Docker will wait for the process to shutdown before returning the control to the shell. However, it is also possible to set a timeout, after which the process will be immediately killed with a SIGKILL. Or it is possible to immediately kill the process, with no timeout.

docker stop --time=30 mariadbtest
docker kill mariadbtest

In case we want to destroy a container, perhaps because the image does not suit our needs, we can stop it and then run:

docker rm mariadbtest

Note that the command above does not destroy the data volume that Docker has created for /var/lib/mysql. If you want to destroy the volume as well, use:

docker rm -v mariadbtest

Automatic Restart

When we start a container, we can use the --restart option to set an automatic restart policy. This is useful in production.

Allowed values are:

no: No automatic restart.
on-failure: The container restarts if it exits with a non-zero exit code.
unless-stopped: Always restart the container, unless it was explicitly stopped as shown above.
always: Similar to unless-stopped, but when Docker itself restarts, even containers that were explicitly stopped will restart.

It is possible to change the restart policy of existing, possibly running containers:

docker update --restart always mariadb
# or, to change the restart policy of all containers:
docker update --restart always $(docker ps -q)

A use case for changing the restart policy of existing containers is performing maintenance in production. For example, before upgrading the Docker version, we may want to change all containers restart policy to always, so they will restart as soon as the new version is up and running. However, if some containers are stopped and not needed at the moment, we can change their restart policy to unless-stopped.

Pausing Containers

A container can also be frozen with the pause command. Docker will freeze the process using croups. MariaDB will not know that it is being frozen and, when we unpause it, MariaDB will resume its work as expected.

Both pause and unpause accept one or more container names. So, if we are running a cluster, we can freeze and resume all nodes simultaneously:

docker pause node1 node2 node3
docker unpause node1 node2 node3

Pausing a container is very useful when we need to temporarily free our system's resources. If the container is not crucial at this moment (for example, it is performing some batch work), we can free it to allow other programs to run faster.

Troubleshooting a Container

If the container doesn't start, or is not working properly, we can investigate with the following command:

docker logs mariadbtest

This command shows what the daemon sent to the stdout since the last attempt of starting - the text that we typically see when we invoke mariadbd from the command line.

On some systems, commands such as docker stop mariadbtest and docker restart mariadbtest may fail with a permissions error. This can be caused by AppArmor, and even sudo won't allow you to execute the command. In this case, you will need to find out which profile is causing the problem and correct it, or disable it. Disabling AppArmor altogether is not recommended, especially in production.

To check which operations were prevented by AppArmor, see AppArmor Failures in AppArmor documentation.

To disable a profile, create a symlink with the profile name (in this example, mariadbd) to etc/apparmor.d/disable, and then reload profiles:

ln -s /etc/apparmor.d/usr.sbin.mariadbd /etc/apparmor.d/disable/
sudo apparmor_parser -R /etc/apparmor.d/usr.sbin.mariadbd

For more information, see Policy Layout in AppArmor documentation.

After disabling the profile, you may need to run:

sudo service docker restart
docker system prune --all --volumes

Restarting the system will then allow Docker to operate normally.

Accessing the Container

To access the container via Bash, we can run this command:

docker exec -it mariadbtest bash

Now we can use normal Linux commands like cd, ls, etc. We will have root privileges. We can even install our favorite file editor, for example:

apt-get update
apt-get install vim

In some images, no repository is configured by default, so we may need to add them.

Note that if we run mariadb-admin shutdown or the SHUTDOWN command to stop the container, the container will be deactivated, and we will automatically exit to our system.

Connecting to MariaDB from Outside the Container

If we try to connect to the MariaDB server on localhost, the client will bypass networking and attempt to connect to the server using a socket file in the local filesystem. However, this doesn't work when MariaDB is running inside a container because the server's filesystem is isolated from the host. The client can't access the socket file which is inside the container, so it fails to connect.

Therefore connections to the MariaDB server must be made using TCP, even when the client is running on the same machine as the server container.

Find the IP address that has been assigned to the container:

docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' mariadbtest

You can now connect to the MariaDB server using a TCP connection to that IP address.

Forcing a TCP Connection

After enabling network connections in MariaDB as described above, we will be able to connect to the server from outside the container.

On the host, run the client and set the server address ("-h") to the container's IP address that you found in the previous step:

mysql -h 172.17.0.2 -u root -p

This simple form of the connection should work in most situations. Depending on your configuration, it may also be necessary to specify the port for the server or to force TCP mode:

mysql -h 172.17.0.2 -P 3306 --protocol=TCP -u root -p

Port Configuration for Clustered Containers and Replication

Multiple MariaDB servers running in separate Docker containers can connect to each other using TCP. This is useful for forming a Galera cluster or for replication.

When running a cluster or a replication setup via Docker, we will want the containers to use different ports. The fastest way to achieve this is mapping the containers ports to different port on our system. We can do this when creating the containers (docker run command), by using the -p option, several times if necessary. For example, for Galera nodes we will use a mapping similar to this one:

-p 4306:3306 -p 5567:5567 -p 5444:5444 -p 5568:5568

Installing MariaDB on Another Image

It is possible to download a Linux distribution image, and to install MariaDB on it. This is not much harder than installing MariaDB on a regular operating system (which is easy), but it is still the hardest option. Normally we will try existing images first. However, it is possible that no image is available for the exact version we want, or we want a custom installation, or perhaps we want to use a distribution for which no images are available. In these cases, we will install MariaDB in an operating system image.

Daemonizing the Operating System

First, we need the system image to run as a daemon. If we skip this step, MariaDB and all databases will be lost when the container stops.

To demonize an image, we need to give it a command that never ends. In the following example, we will create a Debian Jessie daemon that constantly pings the 8.8.8.8 special address:

docker run --name debian -p 3306:3306 -d debian /bin/sh -c "while true; do ping 8.8.8.8; done"

Installing MariaDB

At this point, we can enter the shell and issue commands. First we will need to update the repositories, or no packages will be available. We can also update the packages, in case some of them are newer than the image. Then, we will need to install a text editor; we will need it to edit configuration files. For example:

# start an interactive Bash session in the container
docker exec -ti debian bash
apt-get -y update
apt-get -y upgrade
apt-get -y install vim

Now we are ready to install MariaDB in the way we prefer.

MariaDB Container Cheat Sheet

Get the images

Images can be found on MariaDB Docker Hub. To get the list of images run

$ docker images -a

Create the network

$ docker network create mynetwork

It is good practice to create the container network and attach the container to the network.

Start the container with server options

To start the container in the background with the MariaDB server image run:

$ docker run --rm --detach \
--env MARIADB_ROOT_PASSWORD=sosecret \
--network mynetwork \
--name mariadb-server \
mariadb:latest

Additionally |environment variables are also provided.

Get the list of running containers

$ docker ps
CONTAINER ID   IMAGE            COMMAND                  CREATED          STATUS          PORTS      NAMES
ad374ec8a272   mariadb:latest   "docker-entrypoint.s…"   3 seconds ago    Up 1 second     3306/tcp   mariadb-server

Note: specify the flag -a in case you want to see all containers

Start the client from the container

To start the mariadb client inside the created container and run specific commands, run the following:

$ docker exec -it mariadb-server mariadb -psosecret -e "SHOW PLUGINS"

Inspect logs of a container

$ docker logs mariadb-server

In the logs you can find status information about the server, plugins, generated passwords, errors and so on.

Restart the container

$ docker restart mariadb-server

Run commands within the container

$ docker exec -it mariadb-server bash

Use a volume to specify configuration options

$ docker run --detach --env MARIADB_USER=anel \
  --env MARIADB_PASSWORD=anel \
  --env MARIADB_DATABASE=my_db \
  --env MARIADB_RANDOM_ROOT_PASSWORD=1 \
  --volume $PWD/my_container_config:/etc/mysql/conf.d:z \
  --network mynetwork \
  --name mariadb-server1 \
   mariadb:latest

One can specify custom configuration files through the /etc/mysql/conf.d volume during container startup.

Use a volume to specify grants during container start

$ docker run --detach --env MARIADB_USER=anel\
  --env MARIADB_PASSWORD=anel \
  --env MARIADB_DATABASE=my_db \
  --env MARIADB_RANDOM_ROOT_PASSWORD=1 \
  --volume $PWD/my_init_db:/docker-entrypoint-initdb.d \
  --network mynetwork \
  --name mariadb-server1 \
   mariadb:latest

User created with the environment variables has full grants only to the MARIADB_DATABASE. In order to override those grants, one can specify grants to a user, or execute any SQL statements from host file to docker-entrypoint-initdb.d. In the local_init_dir directory we can find the file, created like this:

$ echo "GRANT ALL PRIVILEGES ON *.* TO anel;" > my_init_db/my_grants.sql

MariaDB Server Docker Official Image Environment Variables

When you start the image, you can adjust the initialization of the MariaDB Server instance by passing one or more environment variables on the docker run command line. Do note that all of the variables below, except MARIADB_AUTO_UPGRADE, will have no effect if you start the container with a data directory that already contains a database: any pre-existing database will always be left untouched on container startup.

From tag 10.2.38, 10.3.29, 10.4.19, 10.5.10 onwards, and all 10.6 and later tags, the MARIADB_* equivalent variables are provided. MARIADB_* variants will always be used in preference to MYSQL_* variants.

One of MARIADB_ROOT_PASSWORD_HASH, MARIADB_ROOT_PASSWORD, MARIADB_ALLOW_EMPTY_ROOT_PASSWORD, or MARIADB_RANDOM_ROOT_PASSWORD (or equivalents, including *_FILE), is required. The other environment variables are optional.

MARIADB_ROOT_PASSWORD_HASH / MARIADB_ROOT_PASSWORD / MYSQL_ROOT_PASSWORD

This specifies the password that will be set for the MariaDB root superuser account.

MARIADB_ALLOW_EMPTY_ROOT_PASSWORD / MYSQL_ALLOW_EMPTY_PASSWORD

Set to a non-empty value, like 1, to allow the container to be started with a blank password for the root user. NOTE: Setting this variable to yes is not recommended unless you really know what you are doing, since this will leave your MariaDB instance completely unprotected, allowing anyone to gain complete superuser access.

MARIADB_RANDOM_ROOT_PASSWORD / MYSQL_RANDOM_ROOT_PASSWORD

Set to a non-empty value, like yes, to generate a random initial password for the root user. The generated root password will be printed to stdout (GENERATED ROOT PASSWORD: .....).

MARIADB_ROOT_HOST / MYSQL_ROOT_HOST

This is the hostname part of the root user created. By default this is %, however it can be set to any default MariaDB allowed hostname component. Setting this to localhost will prevent any root user being accessible except via the unix socket.

MARIADB_DATABASE / MYSQL_DATABASE

This variable allows you to specify the name of a database to be created on image startup.

MARIADB_USER / MYSQL_USER, MARIADB_PASSWORD_HASH / MARIADB_PASSWORD / MYSQL_PASSWORD

Both user and password variables, along with a database, are required for a user to be created. This user will be granted all access (corresponding to GRANT ALL) to the MARIADB_DATABASE database.

Do not use this mechanism to create the root superuser, that user gets created by default with the password specified by the MARIADB_ROOT_PASSWORD / MYSQL_ROOT_PASSWORD variable.

MARIADB_MYSQL_LOCALHOST_USER / MARIADB_MYSQL_LOCALHOST_GRANTS

Set MARIADB_MYSQL_LOCALHOST_USER to a non-empty value to create the mysql@locahost database user. This user is especially useful for a variety of health checks and backup scripts.

The mysql@localhost user gets USAGE privileges by default. If more access is required, additional global privileges in the form of a comma separated list can be provided. If you are sharing a volume containing MariaDB's unix socket (/var/run/mysqld by default), privileges beyond USAGE can result in confidentiality, integrity and availability risks, so use a minimal set. Its also possible to use for Mariadb-backup. The healthcheck.sh script also documents the required privileges for each health check test.

MARIADB_HEALTHCHECK_GRANTS

Set MARIADB_HEALTHCHECK_GRANTS to the grants required to be given to the healthcheck@localhost, healthcheck@127.0.0.1, healthcheck@::1, users. When not specified the default grant is USAGE.

The main value used here will be [REPLICA MONITOR](../../../../../reference/sql-statements-and-structure/sql-statements/account-management-sql-commands/grant.md#replica-monitor) for the [healthcheck --replication](using-healthcheck-sh.md) test.

MARIADB_INITDB_SKIP_TZINFO / MYSQL_INITDB_SKIP_TZINFO

By default, the entrypoint script automatically loads the timezone data needed for the CONVERT_TZ() function. If it is not needed, any non-empty value disables timezone loading.

MARIADB_AUTO_UPGRADE / MARIADB_DISABLE_UPGRADE_BACKUP

Set MARIADB_AUTO_UPGRADE to a non-empty value to have the entrypoint check whether mariadb-upgrade needs to run, and if so, run the upgrade before starting the MariaDB server.

Before the upgrade, a backup of the system database is created in the top of the datadir with the name system_mysql_backup_*.sql.zst. This backup process can be disabled with by setting MARIADB_DISABLE_UPGRADE_BACKUP to a non-empty value.

If MARIADB_AUTO_UPGRADE is set, and the .my-healthcheck.cnf file is missing, the healthcheck users are recreated if they don't exist, MARIADB_HEALTHCHECK_GRANTS grants are given, the passwords of the healthcheck users are reset to a random value and the .my-healthcheck.cnf file is recreated with the new password populated.

MARIADB_MASTER_HOST

When specified, the container will connect to this host and replicate from it.

MARIADB_REPLICATION_USER / MARIADB_REPLICATION_PASSWORD_HASH / MARIADB_REPLICATION_PASSWORD

When MARIADB_MASTER_HOST is specified, MARIADB_REPLICATION_USER and MARIADB_REPLICATION_PASSWORD will be used to connect to the master.

When not specified, the MARIADB_REPLICATION_USER will be created with the REPLICATION REPLICA grants required to a client to start replication.

_{This page is licensed: CC BY-SA / Gnu FDL}

Setting Up a LAMP Stack with Docker Compose

Docker Compose is a tool that allows one to declare which Docker containers should run, and which relationships should exist between them. It follows the infrastructure as code approach, just like most automation software and Docker itself.

For information about installing Docker Compose, see Install Docker Compose in Docker documentation.

The `docker-compose.yml` File

When using Docker Compose, the Docker infrastructure must be described in a YAML file called docker-compose.yml.

Let's see an example:

version: "3"

services:
  web:
    image: "apache:${PHP_VERSION}"
    restart: 'always'
    depends_on:
      - mariadb
    restart: 'always'
    ports:
      - '8080:80'
    links:
      - mariadb
  mariadb:
    image: "mariadb:${MARIADB_VERSION}"
    restart: 'always'
    volumes: 
      - "/var/lib/mysql/data:${MARIADB_DATA_DIR}"
      - "/var/lib/mysql/logs:${MARIADB_LOG_DIR}"
      - /var/docker/mariadb/conf:/etc/mysql
    environment:
      MYSQL_ROOT_PASSWORD: "${MYSQL_ROOT_PASSWORD}"
      MYSQL_DATABASE: "${MYSQL_DATABASE}"
      MYSQL_USER: "${MYSQL_USER}"
      MYSQL_PASSWORD: "${MYSQL_PASSWORD}"

In the first line we declare that we are using version 3 of the Docker compose language.

Then we have the list of services, namely the web and the mariadb services.

Let's see the properties of the services:

port maps the 8080 container port to the 80 host system port. This is very useful for a development environment, but not in production, because it allows us to connect our browser to the containerized web server. Normally there is no need to connect to MariaDB from the host system.
links declares that this container must be able to connect mariadb. The hostname is the container name.
depends_on declares that mariadb needs to start before web. This is because we cannot do anything with our application until MariaDB is ready to accept connections.
restart: always declares that the containers must restart if they crash.
volumes creates volumes for the container if it is set in a service definition, or a volume that can be used by any container if it is set globally, at the same level as services. Volumes are directories in the host system that can be accessed by any number of containers. This allows destroying a container without losing data.
environment sets environment variables inside the container. This is important because in setting these variables we set the MariaDB root credentials for the container.

About Volumes

It is good practice to create volumes for:

The data directory, so we don't lose data when a container is created or replaced, perhaps to upgrade MariaDB.
The directory where we put all the logs, if it is not the datadir.
The directory containing all configuration files (for development environments), so we can edit those files with the editor installed in the host system. Normally no editor is installed in containers. In production we don't need to do this, because we can copy files from a repository located in the host system to the containers.

Note that Docker Compose variables are just placeholders for values. Compose does not support assignment, conditionals or loops.

Using Variables

In the above example you can see several variables, like ${MARIADB_VERSION}. Before executing the file, Docker Compose will replace this syntax with the MARIADB_VERSION variable.

Variables allow making Docker Compose files more re-usable: in this case, we can use any MariaDB image version without modifying the Docker Compose file.

The most common way to pass variables is to write them into a file. This has the benefit of allowing us to version the variable file along with the Docker Compose file. It uses the same syntax you would use in BASH:

PHP_VERSION=8.0
MARIADB_VERSION=10.5
...

For bigger setups, it could make sense to use different environment files for different services. To do so, we need to specify the file to use in the Compose file:

services:
  web:
    env_file:
      - web-variables.env
...

Docker Compose Commands

Docker Compose is operated using docker-compose. Here we'll see the most common commands. For more commands and for more information about the commands mentioned here, see the documentation.

Docker Compose assumes that the Composer file is located in the current directory and it's called docker-compose.yml. To use a different file, the -f <filename> parameter must be specified.

To pull the necessary images:

docker-compose pull

Containers described in the Compose file can be created in several ways.

To create them only if they do not exist:

docker-compose up --no-recreate

To create them if they do not exist and recreate them if their image or configuration have changed:

docker-compose up

To recreate containers in all cases:

docker-compose up --force-recreate

Normally docker-compose up starts the containers. To create them without starting them, add the --no-start option.

To restart containers without recreating them:

docker-compose restart

To kill a container by sending it a SIGKILL:

docker-compose kill <service>

To instantly remove a running container:

docker-compose rm -f <serice>

To tear down all containers created by the current Compose file:

docker-compose down

Docker Compose Resources and References

Overview of Docker Compose in the Docker documentation.
Compose file in the Docker documentation.
Docker Compose on GitHub.

Further information about the concepts explained in this page can be found in Docker documentation:

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

using-healthcheck-sh

Using Healthcheck.sh

The healthcheck.sh script is part of the Docker Official Images of MariaDB Server. The script is part of the repository of the Docker Official Image of MariaDB Server.

The script processes a number of argument and tests, together, in strict order. Arguments pertaining to a test must occur before the test name. If a test fails, no further processing is performed. Both arguments and tests begin with a double-hyphen.

By default, (since 2023-06-27), official images will create healthcheck@localhost, healthcheck@127.0.0.1, healthcheck@::1 users with a random password and USAGE privileges. MARIADB_HEALTHCHECK_GRANTS can be used for --replication where additional grants are required. This is stored in .my-healthcheck.cnf in the datadir of the container and passed as the --defaults-extra-file to the healthcheck.sh script if it exists. The .my-healthcheck.cnf also sets protocol=tcp for the mariadb so --connect is effectively there on all tests.

The [MARIADB_AUTO_UPGRADE=1](mariadb-server-docker-official-image-environment-variables.md#mariadb_auto_upgrade) will regenerate the .my-healthcheck.cnf file if missing and recreate the healthcheck users of the database with a new random password. The current port configuration of the MariaDB container is written into this file.

The MARIADB_MYSQL_LOCALHOST_USER=1, MARIADB_MYSQL_LOCALHOST_GRANTS environment variables can also be used, but with the creation of the healthcheck user, these are backwards compatible.

Compose File Example

An example of a compose file that uses the healthcheck.sh to determine a healthy service as a depedency before starting a wordpress service:

version: "3"
services:
  mariadb:
    image: mariadb:lts
    environment:
      - MARIADB_DATABASE=testdb
      - MARIADB_USER=testuser
      - MARIADB_PASSWORD=password
      - MARIADB_RANDOM_ROOT_PASSWORD=1
    healthcheck:
      test: ["CMD", "healthcheck.sh", "--connect", "--innodb_initialized"]
      start_period: 10s
      interval: 10s
      timeout: 5s
      retries: 3
  wordpress:
    image: wordpress
    environment:
      - WORDPRESS_DB_HOST=mariadb
      - WORDPRESS_DB_NAME=testdb
      - WORDPRESS_DB_USER=testuser
      - WORDPRESS_DB_PASSWORD=password
    depends_on:
      mariadb:
        condition: service_healthy

Tests

--connect

This is active when a external user can connect to the TCP port of MariaDB Server. This strictly tests just the TCP connection and not if any authentication works.

--innodb_initialized

This test is true when InnoDB has completed initializing. This includes any rollback or crash recovery that may be occurring in the background as MariaDB is starting.

The connecting user must have USAGE privileges to perform this test.

--innodb_buffer_pool_loaded

This indicates that the buffer pool dump previously saved has been completed loaded into the InnoDB Buffer Pool and as such the server has a hot cache ready for use. This checks the innodb_buffer_pool_load_status for a "complete" indicator.

This test doesn't check if innodb-system-variables/#innodb_buffer_pool_load_at_startupinnodb_buffer_pool_load_at_startup is set at startup.

The connecting user must have USAGE privileges to perform this test.

--galera_online

This indicates that the galera node is online by the wsrep_local_state variable. This includes states like "joining" and "donor" where it cannot serve SQL queries.

The connecting user must have USAGE privileges to perform this test.

--replication

This tests a replica based on the --replication_* parameters. The replica test must pass all of the subtests to be true. The subtests are:

io - the IO thread is running
sql - the sql thread is running
seconds_behind_master - the replica is less than X seconds behind the master.
sql_remaining_delay - the delayed replica is less than X seconds behind the master's execution of the same SQL.

These are tested for all connections, if --replication_all is set (default), or --replication_name.

The connecting user must have REPLICATION_CLIENT if using a version less than MariaDB 10.5, or REPLICA MONITOR for MariaDB 10.5 or later.

--mariadbupgrade

This healthcheck indicates that the mariadb is upgrade to the current version.

Parameters

--replication_all

Checks all replication sources

--replication_name=n

Sets the multisource connection name tested. Unsets --replication_all.

--replication_io

IO thread is running

--replication_sql

SQL thread is running

--replication_seconds_behind_master=n

Less than or equal this seconds of delay

--replication_sql_remaining_delay=n

Less than or equal this seconds of remaining delay

--su=n

Change to this user. Can only be done once as the root user is default for healthchecks.

--su-mysql

Change to the mysql unix user. Like --su this respawns the script so will reset all parameters. Should be the first argument. The MARIADB_MYSQL_LOCALHOST_USER=1 environment variable is designed around usage here.

--datadir=n

For the --mariadbupgrade test where the upgrade file is located.

--no-defaults --defaults-file --defaults-extra-file --defaults-group-suffix

These are passed to mariadb shell for all tests except --mariadbupgrade

Examples

healthcheck.sh --su-mysql --connect --innodb_initialized

Switch to mysql user, and check if can connect and the innodb is initialized.

healthcheck.sh --su-mysql --connect --replication_io --replication_sql --replication_seconds_behind_master=600  --replication_sql_remaining_delay=30 ----replication_name=archive --replication --replication_seconds_behind_master=10  --replication_name=channel1 --replication

Switch to mysql user, check if connections can be made, for the replication channel "archive", ensure io and sql threads are running and the seconds behind master < 600 seconds and the sql remaining delay < 30 seconds. For the "channel1", the seconds behind master is limit to 10 seconds maximum.

_{This page is licensed: CC BY-SA / Gnu FDL}

Kubernetes and MariaDB

General information and hints on deploying MariaDB Kubernetes (K8s) containers, an open source container orchestration system which automates deployments, horizontal scaling, configuration, and operat

Kubernetes Operators for MariaDB

Operators basically instruct Kubernetes about how to manage a certain technology. Kubernetes comes with some default operators, but it is possible to create custom operators. Operators created by the community can be found on OperatorHub.io.

Custom Operators

Kubernetes provides a declarative API. To support a specific (i.e. MariaDB) technology or implement a desired behavior (i.e. provisioning a replica), we extend Kubernetes API. This involves creating two main components:

A custom resource.
A custom controller.

A custom resource adds an API endpoint, so the resource can be managed via the API server. It includes functionality to get information about the resource, like a list of the existing servers.

A custom controller implements the checks that must be performed against the resource to check if its state should be corrected using the API. In the case of MariaDB, some reasonable checks would be verifying that it accepts connections, replication is running, and a server is (or is not) read only.

MariaDB Enterprise Operator

MariaDB Enterprise Operator provides a seamless way to run and operate containerized versions of MariaDB Enterprise Server and MaxScale on Kubernetes, allowing you to leverage Kubernetes orchestration and automation capabilities. This document outlines the features and advantages of using Kubernetes and the MariaDB Enterprise Operator to streamline the deployment and management of MariaDB and MaxScale instances.

Find the documentation here.

MariaDB Community Operator

mariadb-operator is a Kubernetes operator that allows you to run and operate MariaDB in a cloud native way. It aims to declaratively manage MariaDB instances using Kubernetes CRDs instead of imperative commands.

It's available in both Artifact Hub and Operator Hub and supports the following features:

Easily provision and configure MariaDB servers in Kubernetes.
Multiple HA modes: Galera Cluster or MariaDB Replication.
Automated Galera primary failover and cluster recovery.
Advanced HA with MaxScale: a sophisticated database proxy, router, and load balancer for MariaDB.
Flexible storage configuration. Volume expansion.
Physical backups based on mariadb-backup and Kubernetes VolumeSnapshots.
Logical backups based on mariadb-dump.
Multiple backup storage types: S3 compatible, PVCs, Kubernetes volumes and VolumeSnapshots.
Flexible backup configuration: scheduling, compression, retention policy, timeout, staging area...
Target recovery time: restore the closest available backup to the specified time.
Bootstrap new instances from: Physical backups, logical backups, S3, PVCs, VolumeSnapshots...
Cluster-aware rolling update: roll out replica Pods one by one, wait for each of them to become ready, and then proceed with the primary Pod, using ReplicasFirstPrimaryLast.
Manual update strategies: OnDelete and Never.
Automated data-plane updates.
my.cnf change detection. Automatically trigger updates when my.cnf changes.
Suspend operator reconciliation for maintenance operations.
Issue, configure and rotate TLS certificates and CAs.
Native integration with cert-manager. Automatically create Certificate resources.
Prometheus metrics via mysqld-exporter and maxscale-exporter.
Native integration with prometheus-operator. Automatically create ServiceMonitor resources.
Declaratively manage SQL resources: users, grants and logical databases.
Configure connections for your applications.
Orchestrate and schedule sql scripts.
Validation webhooks to provide CRD immutability.
Additional printer columns to report the current CRD status.
CRDs designed according to the Kubernetes API conventions.
Install it using helm, OLM or static manifests.
Multiple deployment modes: cluster-wide and single namespace.
Helm chart to deploy MariaDB clusters and its associated CRs.
Multi-arch distroless image.
GitOps friendly.

Please, refer to the documentation, the API reference and the example suite for further detail.

_{This page is licensed: CC BY-SA / Gnu FDL}

Kubernetes Overview for MariaDB Users

Kubernetes, or K8s, is software to orchestrate containers. It is released under the terms of an open source license, Apache License 2.0.

Kubernetes was originally developed by Google. Currently it is maintained by the Cloud Native Computing Foundation (CNCF), with the status of Graduated Project.

For information about how to setup a learning environment or a production environment, see Getting started in Kubernetes documentation.

Architecture

Kubernetes runs in a cluster. A cluster runs a workload: a set of servers that are meant to work together (web servers, database servers, etc).

A Kubernetes cluster consists of the following components:

Nodes run containers with the servers needed by our applications.
Controllersconstantly check the cluster nodes current state, and compare it with the desired state.
A Control Plane is a set of different components that store the cluster desired state and take decisions about the nodes. The Control Plane provides an API that is used by the controllers.

For more information on Kubernetes architecture, see Concepts and Kubernetes Components in Kubernetes documentation.

Nodes

A node is a system that is responsible to run one or more pods. A pod is a set of containers that run a Kubernetes workload or part of it. All containers that run in the same pod are also located on the same node. Usually identical pods run on different nodes for fault tolerance.

For more details, see Nodes in the Kubernetes documentation.

Every node must necessarily have the following components:

kubelet
kube-proxy
A container runtime

kubelet

kubelet has a set of PodSpecs which describe the desired state of pods. It checks that the current state of the pods matches the desired state. It especially takes care that containers don't crash.

kube-proxy

In a typical Kubernetes cluster, several containers located in different pods need to connect to other containers, located in the same pods (for performance and fault tolerance reasons). Therefore, when we develop and deploy an application, we can't know in advance the IPs of the containers to which it will have to connect. For example, an application server may need to connect to MariaDB, but the MariaDB IP will be different for every pod.

The main purpose of kube-proxy is to implement the concept of Kubernetes services. When an application needs to connect to MariaDB, it will connect to the MariaDB service. kube-proxy will receive the request and will redirect it to a running MariaDB container in the same pod.

Container Runtime

Kubernetes manages the containers in a pod via a container runtime, or container manager, that supports the Kubernetes Container Runtime Interface (CRI). Container runtimes that meet this requisite are listed in the Container runtimes page in the Kubernetes documentation. More information about the Container Runtime Interface can be found on GitHub.

Originally, Kubernetes used Docker as a container runtime. This was later deprecated, but Docker images can still be used using any container runtime.

Controllers

Controllers constantly check if there are differences between the pod's current state and their desired state. When differences are found, controllers try to fix them. Each node type controls one or more resource types. Several types of controllers are needed to run a cluster.

Most of the actions taken by the controllers user the API server in the Control Plane. However, this is not necessarily true for custom controllers. Also, some actions cannot be performed via the Control Plane. For example, if some nodes crashed, adding new nodes involves taking actions outside of the Kubernetes cluster, and controllers will have to do this themselves.

It is possible to write custom controllers to perform checks that require knowledge about a specific technology. For example, a MariaDB custom controller may want to check if replication is working by issuing SHOW REPLICA STATUS commands. This logic is specific to the way MariaDB works, and can only be implemented in a customer controller. Custom controllers are usually part of operators.

For more information, see Controllers in the Kubernetes documentation.

Control Plane

The control plane consists of the following components.

For more information about the control plane, see Control Plane Components in Kubernetes documentation.

API Server

An API Server exposes API functions both internally and externally. It is essential to coordinate Kubernetes components so that they react to node's change of state, and it allows the user to send commands.

The default implementation of the API Server is kube-apiserver. It is able to scale horizontally and to balance the load between its instances.

kube-controller-manager

Most controllers run in this component.

etcd

etcd contains all data used by a Kubernetes cluster. It is a good idea to take regular backups of etcd data.

kube-scheduler

When a new pod is created, kube-scheduler decides which node should host it. The decision is made based on several criteria, like the resource requirements for the pod.

cloud-controller-manager

cloud-controller-manager implements the logic and API of a cloud provider. It receives requests from the API Server and performs specific actions, like creating an instance in AWS. It also runs controllers that are specific to a cloud vendor.

Clients and Tools

Kubernetes comes with a set of tools that allow us to communicate with the API server and test a cluster.

kubectl

kubectl allows communication with the API server and run commands on a Kubernetes cluster.

kubeadm

kubeadm allows creating a Kubernetes cluster that is ready to receive commands from kubectl.

kind and minikube

These tools are meant to create and manage test clusters on a personal machine. They work on Linux, MacOS and Windows. kind creates a cluster that consists of Docker containers, therefore it requires Docker to be installed. minikube runs a single-node cluster on the local machine.

Kubernetes Resources and References

Kubernetes website.
Kubernetes on Wikipedia.
Kubernetes organization on GitHub.
The Official MariaDB Operator
OperatorHub.io
Kubernetes Community Forums.
(video) MariaDB database clusters on Kubernetes, by Pengfei Ma, at MariaDB Server Fest 2020.
Series of posts by Anel Husakovic on the MariaDB Foundation blog:

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Vagrant and MariaDB

Vagrant is an open source tool to quickly setup machines that can be used for development and testing. They can be local virtual machines, Docker containers, AWS EC2 instances, and so on

Creating a Vagrantfile

In this page we discuss how to create a Vagrantfile, which you can use to create new boxes or machines. This content is specifically written to address the needs of MariaDB users.

A Basic Vagrantfile

A Vagrantfile is a Ruby file that instructs Vagrant to create, depending on how it is executed, new Vagrant machines or boxes. You can see a box as a compiled Vagrantfile. It describes a type of Vagrant machines. From a box, we can create new Vagrant machines. However, while a box is easy to distribute to a team or to a wider public, a Vagrantfile can also directly create one or more Vagrant machines, without generating any box.

Here is a simple Vagrantfile example:

Vagrant.configure("2") do |config|
  config.vm.box = "hashicorp/bionic64"
  config.vm.provider "virtualbox"
  config.vm.provision :shell, path: "bootstrap.sh"
end

Vagrant.configure("2") returns the Vagrant configuration object for the new box. In the block, we'll use the config alias to refer this object. We are going to use version 2 of Vagrant API.

vm.box is the base box that we are going to use. It is Ubuntu BionicBeaver (18.04 LTS), 64-bit version, provided by HashiCorp. The schema for box names is simple: the maintainer account in Vagrant Cloud followed by the box name.

We use vm.provision to specify the name of the file that is going to be executed at the machine creation, to provision the machine. bootstrap.sh is the conventional name used in most cases.

To create new Vagrant machines from the Vagrantfile, move to the directory that contains the Vagrant project and run:

vagrant up

To compile the Vagrantfile into a box:

vagrant package

These operations can take time. To preventively check if the Vagrantfile contains syntax errors or certain types of bugs:

vagrant validate

Providers

A provider allows Vagrant to create a Vagrant machine using a certain technology. Different providers may enable a virtual machine manager (VirtualBox, VMWare, Hyper-V...), a container manager (Docker), or remote cloud hosts (AWS, Google Compute Engine...).

Some providers are developed by third parties. app.vagrant.com supports search for boxes that support the most important third parties providers. To find out how to develop a new provider, see Plugin Development: Providers.

Provider options can be specified. Options affect the type of Vagrant machine that is created, like the number of virtual CPUs. Different providers support different options.

It is possible to specify multiple providers. In this case, Vagrant will try to use them in the order they appear in the Vagrantfile. It will try the first provider; if it is not available it will try the second; and so on.

Here is an example of providers usage:

Vagrant.configure("2") do |config|
    config.vm.box = "hashicorp/bionic64"
    config.vm.provider "virtualbox" do |vb|
        vb.customize ["modifyvm", :id, "--memory", 1024 * 4]
    end
    config.vm.provider "vmware_fusion"
end

In this example, we try to use VirtualBox to create a virtual machine. We specify that this machine must have 4G of RAM (1024M * 4). If VirtualBox is not available, Vagrant will try to use VMWare.

This mechanism is useful for at least a couple of reasons:

Different users may use different systems, and maybe they don't have the same virtualization technologies installed.
We can gradually move from one provider to another. For a period of time, some users will have the new virtualization technology installed, and they will use it; other users will only have the old technology installed, but they will still be able to create machines with Vagrant.

Provisioners

We can use different methods for provisioning. The simplest provisioner is shell, that allows one to run a Bash file to provision a machine. Other provisioners allow setting up the machines using automation software, including Ansible, Puppet, Chef and Salt.

To find out how to develop a new provisioner, see Plugin Development: Provisioners.

The `shell` Provisioner

In the example above, the shell provisioner runs boostrap.sh inside the Vagrant machine to provision it. A simple bootstrap.sh may look like the following:

#!/bin/bash

apt-get update
apt-get install -y

To find out the steps to install MariaDB on your system of choice, see the Getting, Installing, and Upgrading MariaDB section.

You may also want to restore a database backup in the new Vagrant machine. In this way, you can have the database needed by the application you are developing. To find out how to do it, see Backup and Restore Overview. The most flexible type of backup (meaning that it works between different MariaDB versions, and in some cases even between MariaDB and different DBMSs) is a dump.

On Linux machines, the shell provisioner uses the default shell. On Windows machines, it uses PowerShell.

Uploading Files

If we use the shell provisioner, we need a way to upload files to the new machine when it is created. We could use the file provisioner, but it works by connecting the machine via ssh, and the default user doesn't have permissions for any directory except for the synced folders. We could change the target directory owner, or we could add the default user to a group with the necessary privileges, but these are not considered good practices.

Instead, we can just put the file we need to upload somewhere in the synced folder, and then copy it with a shell command:

cp ./files/my.cnf /etc/mysql/conf.d/

Provisioning Vagrant with Ansible

Here is an example of how to provision a Vagrant machine or box by running Ansible:

Vagrant.configure("2") do |config|
  ...
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "vagrant.yml"
  end
end

With the Ansible provisioner, Ansible runs in the host system and applies a playbook in the guest system. In this example, it runs a playbook called vagrant.yml. The Ansible Local provisioner runs the playbook in the vagrant machine.

For more information, see Using Vagrant and Ansible in the Ansible documentation. For an introduction to Ansible for MariaDB users, see Ansible and MariaDB.

Provisioning Vagrant with Puppet

To provision a Vagrant machine or box by running Puppet:

Vagrant.configure("2") do |config|
  ...
  config.vm.provision "puppet" do |puppet|
    puppet.manifests_path = "manifests"
    puppet.manifest_file = "default.pp"
  end
end

In this example, Puppet Apply runs in the host system and no Puppet Server is needed. Puppet expects to find a manifests directory in the project directory. It expects it to contain default.pp, which will be used as an entry point. Note that puppet.manifests_path and puppet.manifest_file are set to their default values.

Puppet needs to be installed in the guest machine.

To use a Puppet server, the puppet_server provisioner can be used:

Vagrant.configure("2") do |config|
  ...
  config.vm.provision "puppet_server" do |puppet|
    puppet.puppet_server = "puppet.example.com"
  end
end

See the Puppet Apply provisioner and the Puppet Agent Provisioner.

For an introduction to Puppet for MariaDB users, see Puppet and MariaDB.

To restore a backup into MariaDB, in most cases we need to be able to copy it from the host system to the box. We may also want to occasionally copy MariaDB logs from the box to the host system, to be able to investigate problems.

The project directory (the one that contains the Vagrantfile) by default is shared with the virtual machine and mapped to the /vagrant directory (the synced folder). It is a good practice to put there all files that should be shared with the box when it is started. Those files should normally be versioned.

The synced folder can be changed. In the above example, we could simply add one line:

config.vm.synced_folder "/host/path", "/guest/path"

The synced folder can also be disabled:

config.vm.synced_folder '.', '/vagrant', disabled: true

Note that multiple Vagrant machines may have synced folders that point to the same directory on the host system. This can be useful in some cases, if you prefer to test some functionalities quickly, rather that replicating production environment as faithfully as possible. For example, to test if you're able to take a backup from one machine and restore it to another, you can store the backup in a common directory.

Network Communications

It is often desirable for a machine to be able to communicate with "the outside". This can be done in several ways:

Private networks;
Public networks;
Exposing ports to the host.

Remember that Vagrant doesn't create machines itself; instead, it asks a provider to create and manage them. Some providers support all of these communication methods, while others may only support some of them, or even none at all. When you create a Vagrantfile that uses one of these networking features, it is implicit that this can only happen if the provider you are using supports them. Check your provider's documentation to find out which features it supports.

The default provider, VirtualBox, supports all of these communication methods, including multiple networks.

Private Networks

A private network is a networks that can only be accesses by machines that run on the same host. Usually this also means that the machines must run on the same provider (for example, they all must be VirtualBox virtual machines).

Some providers support multiple private networks. This means that every network has a different name and can be accessed by different machines.

The following line shows how to create or join a private network called "example", where this machine's IP is assigned by the provider via DHCP:

config.vm.network 'private_network', name: 'example', type: 'dhcp'

While this is very convenient to avoid IP conflicts, sometimes you prefer to assign some IP's manually, in this way:

config.vm.network 'private_network', name: 'example', ip: '111.222.111.222'

Public Networks

As explained above, public networks are networks that can be accessed by machines that don't run on the same host with the same provider.

To let a machine join a public network:

# use provider DHCP:
config.vm.network "public_network", use_dhcp_assigned_default_route: true

# assign ip manually:
config.vm.network "public_network", ip: "111.222.111.222"

To improve security, you may want to configure a gateway:

config.vm.provision "shell", run: "always", inline: "route add default gw 111.222.111.222"

Exposing Ports

Vagrant allows us to map a TCP or UDP port in a guest system to a TCP or UDP port in the host system. For example, you can map a virtual machine port 3306 to the host port 12345. Then you can connect MariaDB in this way:

mariadb -hlocalhost -P12345 -u<user> -p<password>

You are not required to map a port to a port with a different number. In the above example, if the port 3306 in your host is not in use, you are free to map the guest port 3306 to the host port 3306.

There are a couple of caveats:

You can't map a single host port to multiple guest ports. If you want to expose the port 3306 from multiple Vagrant machines, you'll have to map them to different host ports. When running many machines this can be hard to maintain.
Ports with numbers below 1024 are privileged ports. Mapping privileged ports requires root privileges.

To expose a port:

config.vm.network 'forwarded_port', guest: 3306, host: 3306

Use Cases

Suppose you run MariaDB and an application server in two separate Vagrant machines. It's usually best to let them communicate via a private network, because this greatly increases your security. The application server will still need to expose ports to the host, so the application can be tested with a web browser.

Suppose you have multiple environments of the same type, like the one described above. They run different applications that don't communicate with each other. In this case, if your provider supports this, you will run multiple private networks. You will need to expose the applications servers ports, mapping them to different host ports.

You may even want to implement different private networks to create an environment that reflects production complexity. Maybe in production you have a cluster of three MariaDB servers, and the application servers communicate with them via a proxy layer (ProxySQL, HAProxy, or MaxScale). So the applications can communicate with the proxies, but have no way to reach MariaDB directly. So there is a private network called "database" that can be accessed by the MariaDB servers and the proxy servers, and another private network called "application" that can be accessed by the proxy servers and the application servers. This requires that your provider supports multiple private networks.

Using public networks instead of private one will allow VMs that run on different hosts to be part of your topology. In general this is considered as an insecure practice, so you should probably ask yourself if you really need to do this.

References

The vagrant-mariadb-examples repository is an example of a Vagrantfile that creates a box containing MariaDB and some useful tools for developers.

Further information can be found in Vagrant documentation.

Vagrant Overview for MariaDB Users

Vagrant is a tool to create and manage development machines (Vagrant boxes). They are usually virtual machines on the localhost system, but they could also be Docker containers or remote machines. Vagrant is open source software maintained by HashiCorp and released under the MIT license.

Vagrant benefits include simplicity, and a system to create test boxes that is mostly independent from the technology used.

For information about installing Vagrant, see Installation in Vagrant documentation.

In this page we discuss basic Vagrant concepts.

Vagrant Concepts

A Vagrant machine is compiled from a box. It can be a virtual machine, a container or a remote server from a cloud service.

A box is a package that can be used to create Vagrant machines. We can download boxes from app.vagrantup.com, or we can build a new box from a Vagrantfile. A box can be used as a base for another box. The base boxes are usually operating system boxes downloaded from app.vagrantup.com.

A provider is responsible for providing the virtualization technology that will run our machine.

A provisioner is responsible for installing and configuring the necessary software on a newly created Vagrant machine.

Example

The above concepts are probably easier to understand with an example.

We can use an Ubuntu box as a base to build a Vagrant machine with MariaDB. So we write a Vagrantfile for this purpose. In the Vagrantfile we specify VirtualBox as a provider. And we use the Ansible provisioner to install and configure MariaDB. Once we finish this Vagrantfile, we can run a Vagrant command to start a Vagrant machine, which is actually a VirtualBox VM running MariaDB on Ubuntu.

The following diagram should make the example clear:

Vagrantfiles

A Vagrantfile is a file that describes how to create one or more Vagrant machines. Vagrantfiles use the Ruby language, as well as objects provided by Vagrant itself.

A Vagrantfile is often based on a box, which is usually an operating system in which we are going to install our software. For example, one can create a MariaDB Vagrantfile based on the ubuntu/trusty64 box. A Vagrantfile can describe a box with a single server, like MariaDB, but it can also contain a whole environment, like LAMP. For most practical use cases, having the whole environment in a single box is more convenient.

Boxes can be searched in Vagrant Cloud. Most of their Vagrantfiles are available on GitHub. Searches can be made, among other things, by keyword to find a specific technology, and by provider.

Providers

A provider adds support for creating a specific type of machines. Vagrant comes with several providers, for example:

VirtualBox allows one to create virtual machines with VirtualBox.
Microsoft-Hyper-V allows one to create virtual machines with Microsoft Hyper-V.
Docker allows one to create Docker containers. On non-Linux systems, Vagrant will create a VM to run Docker.

Alternative providers are maintained by third parties or sold by HashiCorp. They allow one to create different types of machines, for example using VMWare.

Some examples of useful providers, recognized by the community:

If you need to create machines with different technologies, or deploy them to unsupported cloud platforms, you can develop a custom provider in Ruby language. To find out how, see Plugin Development: Providers in Vagrant documentation. The Vagrant AWS Provider was initially written as an example provider.

Provisioners

A provisioner is a technology used to deploy software to the newly created machines.

The simplest provisioner is shell, which runs a shell file inside the Vagrant machine. powershell is also available.

Other providers use automation software to provision the machine. There are provisioners that allow one to use Ansible, Puppet, Chef or Salt. Where relevant, there are different provisioners allowing the use of these technologies in a distributed way (for example, using Puppet apply) or in a centralized way (for example, using a Puppet server).

It is interesting to note that there is both a Docker provider and a Docker provisioner. This means that a Vagrant machine can be a Docker container, thanks to the docker provisioner. Or it could be any virtualisation technology with Docker running in it, thanks to the docker provisioner. In this case, Docker pulls images and starts containers to run the software that should be running in the Vagrant machine.

If you need to use an unsupported provisioning method, you can develop a custom provisioner in Ruby language. See Plugin Development: Provisioners in Vagrant documentation.

Plugins

It is possible to install a plugin with this command:

vagrant plugin install <plugin_name>

A Vagrantfile can require that a plugin is installed in this way:

require 'plugin_name'

A plugin can be a Vagrant plugin or a Ruby gem installable from rubygems.org. It is possible to install a plugin that only exists locally by specifying its path.

Changes in Vagrant 3.0

HashiCorp published an article that describes its plans for Vagrant 3.0.

Vagrant will switch to a client-server architecture. Most of the logic will be stored in the server, while the development machines will run a thin client that communicates with the server. It will be possible to store the configuration in a central database.

Another notable change is that Vagrant is switching from Ruby to Go. For some time, it will still be possible to use Vagrantfiles and plugins written in Ruby. However, in the future Vagrantfiles and plugins should be written in one of the languages that support gRPC (not necessarily Go). Vagrantfiles can also be written in HCL, HashiCorp Configuration Language.

Vagrant Commands

This is a list of the most common Vagrant commands. For a complete list, see Command-Line Interface in Vagrant documentation.

To list the available machines:

vagrant box list

To start a machine from a box:

cd /box/directory
vagrant up

To connect to a machine:

vagrant ssh

To see all machines status and their id:

vagrant global-status

To destroy a machine:

vagrant destroy <id>

Vagrant Resources and References

Here are some valuable websites and pages for Vagrant users.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Vagrant Security Concerns

Databases typically contain information to which access should be restricted. For this reason, it's worth discussing some security concerns that Vagrant users should be aware of.

Access to the Vagrant Machine

By default, Vagrant machines are only accessible from the localhost. SSH access uses randomly generated key pairs, and therefore it is secure.

The password for root and vagrant is "vagrant" by default. Consider changing it.

Synced Folders

By default, the project folder in the host system is shared with the machine, which sees it as /vagrant. This means that whoever has access to the project folder also has read and write access to the synced folder. If this is a problem, make sure to properly restrict the access to the synced folder.

If we need to exchange files between the host system and the Vagrant machine, it is not advisable to disable the synced folder. This is because the only alternative is to use the file provider, which works by copying files to the machine via ssh. The problem is that the default ssh user does not have permissions to write to any directory by default, and changing this would be less secure than using a synced folder.

When a machine is provisioned, it should read the needed files from the synced folder or copy them to other places. Files in the synced folder should not be accessed by the Vagrant machine during its normal activities. For example, it is fine to load a dump from the synced folder during provisioning; and it is fine to copy configuration files from the synced folder to directories in /etc during provisioning. But it is a bad practice to let MariaDB use table files located in the synced folder.

Reporting Security Bugs

Note that security bugs are not reported as normal bugs. Information about security bugs are not public. See Security at HashiCorp for details.

Content initially contributed by Vettabase Ltd.

_{This page is licensed: CC BY-SA / Gnu FDL}

Automated MariaDB Deployment and Administration

A Comparison Between Automation Systems

Code Structure Differences

Ansible Code Structure

Puppet Code Structure

Architectural Differences

Ansible Architecture

Puppet Architecture

Agent-Master Architecture

Standalone Architecture

Inventory

Storing Secrets

Ecosystems and Communities

Ansible Ecosystem

Puppet Ecosystem

See Also

Automating MariaDB Tasks with Events

Pros and Cons of Using Events for Automation

Good Practices

Examples

Automating Upgrades with MariaDB.Org Downloads REST API

How to Use the API with a Unix Shell

Example: Check When a New Version Becomes GA

How to Use the API with a Python Script

HashiCorp Vault and MariaDB

Vault Features

Vault Architecture

Dev Mode

Vault Resources and References

Orchestrator Overview

Supported Topologies

Architecture

CLI Examples

Orchestrator Resources and References

Why to Automate MariaDB Deployments and Management

Infrastructure as Code

Automated Failover

Resources

Ansible and MariaDB

Ansible Overview for MariaDB Users

Automation Hubs

Design Principles

Concepts

Example

Architecture

Ansible Resources and References

Deploying Docker Containers with Ansible

How to Deploy a Container with Ansible

References

Deploying to Remote Servers with Ansible

Pinging Remote Servers

Running Commands on Remote Servers

Applying Roles to Remote Servers

Check mode

References

Existing Ansible Modules and Roles for MariaDB

Modules

Other Useful Modules

shell and command

copy and template

Other Common Modules

Roles

See Also

Installing MariaDB .deb Files with Ansible

Adding apt Repositories

Updating the Repository Cache

Importing MariaDB GPG Key

Installing Packages

See Also

Managing Secrets in Ansible

The SSH Password or Keys

Avoiding Sharing Secrets

ansible-vault

Running mariadb-tzinfo-to-sql with Ansible

Installing or Upgrading the Package

Running the Script

Using Galera

Puppet and MariaDB

Puppet Overview

Design Principles