Data is valuable, and owners want to protect it against loss. The first choice is usually a backup; however, the disadvantage of the classic periodic backup is that data that has changed after the last backup is lost. A potential solution is replication – that is, transferring all the changes to a storage location that is a copy of the working infrastructure. This approach also supports load balancing if the second system is made available to users for read-only access. To define the terms, I'll conduct a brief tour through the different types of replication (Figure 1).
In a "standalone" instance, nothing is replicated, and "cold standby" means that the backup system only starts up after its counterpart has failed. Here, too, it is not absolutely necessary to replicate; it is sufficient if the backup system can access a storage medium that it shares with the primary system.
A "warm standby" has a primary machine that handles all the write and read operations, with a secondary machine also running and continuously receiving replicas. The secondary machine can jump in at any time without losing data. If it also handles some of the primary machine's read workload, the setup is referred to as "hot standby." When both sides read and write at the same time and reciprocally sync the changes, it is known as a "master-master" setup.
A classic high-availability scenario comprises at least three computers so that operations do not depend on a single machine should one fail. Because load distribution also often plays a major role, replication kills two birds with one stone.
MySQL has had an easy-to-install replication feature for more than 15 years. Today, three companies are officially working on its further development: MySQL itself, which was acquired by Oracle in 2009; MariaDB, which some original and core developers of MySQL in Finland founded after the MySQL takeover by Oracle; and other former MySQL employees in the United States who founded Percona, which also deals with issues such as backup, high availability, and replication in Oracle's MySQL and MariaDB.
For this article, I used MariaDB 10.1 on Ubuntu 16.04 and XtraBackup 2.4 by Percona, which also works on other Linux distributions and with MySQL.
A prerequisite for replication operations – whether warm or hot – is that you have identical data available on both the primary computer, or master, and the secondary computer, or slave. This parity can be achieved, for example, by copying the database directory or – the better way described here – using a backup.
Caution is advisable with simple copies. Some tools try to duplicate the data and log directory during operation without taking into account the cache and active transactions. Such filesystem backups usually do not let you recover the database system. Other tools stop the database system, back up the filesystem, and boot the database again, which may guarantee a restorable backup; however, stopping a system in 24/7 operation is not acceptable for many applications. Tools that use dumps cost a huge amount of performance, and, at the most, the system can only be restored up to the time of the dump.
The XtraBackup system by Percona is different: With its help, you can restore a MariaDB/MySQL database up to a point later than the last regular backup (point in time recovery). The only condition is that you need the required log data. XtraBackup is also well suited to set up a new replication node (slave) quickly and easily, without shutting down the master or an existing slave.
Adjusting the Settings
State-of-the-art Linux systems use the UTF-8 character set by default. Although clients running on the console (e.g., the MariaDB/MySQL command-line client) and the commands from XtraBackup may speak UTF-8, the Latin-1 character set is commonly the default in MySQL and MariaDB. The
mariadb.cnf configuration file located on Ubuntu 16.04 under
/etc/mysql/conf.d has the UTF-8 character set settings commented out but in place. I advise you to change the client (slave) as well as the complete server (master) to UTF-8 (Listing 1).
Listing 1: mariadb.cnf
[...] # MariaDB-specific config file. # Read by /etc/mysql/my.cnf [client] # Default is Latin1, if you need UTF-8 set this (also in server section) default-character-set = utf8 [mysqld] # # * Character sets # # Default is Latin1, if you need UTF-8 set all this (also in client section) # character-set-server = utf8 collation-server = utf8_general_ci character_set_server = utf8 collation_server = utf8_general_ci [...]
A configuration file,
/etc/mysql/conf.d/mysql_safe_syslog.cnf, is also created during the installation, which prevents an error log from being written. To enable the error log, you need to comment out
skip_log_error and then configure the path (Listing 2). If you will be running the instance as a master later on, make sure the
bind-address parameter in
/etc/mysql/my.cnf does not point to
Listing 2: mysqld_safe_syslog.cnf
[...] [mysqld_safe] #skip_log_error log-error = /var/log/mysql/error.log syslog [...]
To write error messages to the log, you need to configure the path for
log_warnings parameter determines what to log. The value
0 means that logging is disabled. A value greater than
1 additionally logs aborted connections.
In a replication setup, each node requires a unique positive integer for the
server-id. A popular option is to number the machines incrementally; alternatively, you can use the last digits of the IPv4 address. For XtraBackup – just as for the replication master – you need to enable the binary log (
binlog) and set the appropriate path. The system indexes the binlog; the index file is specified with
Today: Line Oriented
Formerly, admins often relied on statement-based replication, which creates a small logfile but also has disadvantages. For example, values provided by user-defined functions or stored procedures can differ from call to call. An example would be
now(). If this statement is repeated on the slave, it must produce a different value than previously on the master.
UPDATE – both with
LIMIT but without
ORDER BY – are non-deterministic functions that can lead to different results on both sides. Also, a number of statements are fundamentally non-replicable, including
GET lock(), and
That's why admins today prefer row-based replication. Although it might generate more data, at least you can replicate all the changes. The binlog format is set to
row for this. Depending on the application, the binlog directory can grow very quickly. The
expire_logs_days variable specifies when to delete the binlog files. In principle, all binlogs that originated during and after a backup should remain in place until the next backup is in the can. However, because they occupy so much space, Debian and Ubuntu have set their retention period to 10 days by default, which also can be too long. Three to five days are common.
max_binlog_size variable sets the size of a single binlog file. The default value is 1GB, but Debian and Ubuntu installations define
max_binlog_size as 100MB. XtraBackup also wants to create a separate file for every InnoDB table, which is enabled by the
innodb_file_per_table variable (Listing 3).
Listing 3: Replication Settings
[mysqld] [...] #bind-address = 127.0.0.1 [...] log-error=/var/log/mysql/error.log log_warnings = 2 [...] server-id = 1 [...] log_bin = /var/log/mysql/mariadb-bin log_bin_index = /var/log/mysql/mariadb-bin.index binlog_format = row [...] expire_logs_days = 10 max_binlog_size = 100M [...] innodb_file_per_table = 1 [...]
Now all the preparations are in place. For the system to apply the modified configurations, you need to reboot the database server. On Debian or Ubuntu this is done by entering:
sudo service mysql restart
If you later want to set up a replication system, it is advisable to create an extra role (user). Because the slaves open the connection, the corresponding host for the slaves needs to be specified for the role. It also makes sense to create your own separate user role for the backup. Because XtraBackup runs on the same machine, the host can be
localhost here. Both roles and users need superuser rights and should be protected by a password. The example in Listing 4 shows the required SQL steps.
Listing 4: Role Configuration
CREATE USER replication@'192.168.23.%'; CREATE USER backup@localhost; GRANT ALL ON *.* TO replication@'192.168.23.%' WITH GRANT OPTION; GRANT ALL ON *.* TO backup@localhost WITH GRANT OPTION; SET PASSWORD FOR replication@'192.168.23.%' = PASSWORD('<a_good_Password>'); SET PASSWORD FOR backup@localhost = PASSWORD('<a_good_Password>');
The version of Percona XtraBackup supplied by Ubuntu 16.04 does not work with MariaDB 10.1 for Ubuntu 16.04. However, in just a few steps, you can install the correct version for the system directly from Percona, and the system will detect all future updates for this version. An installation guide is available online , and Listing 5 show the steps needed for the sample installation.
Listing 5: XtraBackup Installation
wget https://repo.percona.com/apt/percona-release_0.1-3.wily_all.deb sudo dpkg -i percona-release_0.1-3.wily_all.deb sudo apt-get update sudo apt-get install percona-xtrabackup-24
XtraBackup makes a copy of the data directory, so make sure you have enough disk space. XtraBackup can be used with
rsync, although incremental backups are also possible, of course. You start a backup with the
innobackupex command shown in Listing 6, which must receive the data connection to the database server. The
--throttle parameter determines how many I/O operations per second are allowed. This prevents the current system from being impaired by the I/O load of the backup process. The
--parallel parameter determines how many threads the backup uses at the same time.
Listing 6: Starting a Backup
innobackupex --user=backup --password=<a_good_Password> --throttle=60 --parallel=3 --safe-slave-backup "/mnt/backup/mariadb/" >> /mnt/backup/mariadb/backup.log 2>&1
To avoid impeding an ongoing replication, the option
--safe-slave-backup stops the active slave threads once the current transactions are finished. After the backup, the slave threads start up again automatically. By default, the backup creates a subdirectory in the specified directory with the current timestamp as its name, which can be prevented with
--no-timestamp. Redirecting warnings and errors to a file is advisable.
To track the changes that occur while copying the data directory, run
innobackupex again with the
--apply-log option. With the
--use-memory option, you can define how much memory should be available during the operation. The
apply-log parameter specifies the path to the backup (Listing 7).
Listing 7: Tracking Changes
innobackupex --user=backup --password=<a_good_Password> --safe-slave-backup --use-memory=1GB --apply-log "/mnt/backup/mariadb/2016-05-17-21_26_32">>/mnt/backup/mariadb/backup_apply.log 2>&1
A slave can use the same version as the master or a more recent version of the database management system. However, you should install the latest stable version, even if an older version is provided by the distribution. MariaDB is available from multiple mirrors. Detailed instructions for installing the database on the different operating systems are available online .
The configurations, which you will find in
/etc/mysql on the slave for Debian or Ubuntu, need to be modified as per the master (see above). A unique
server_id is important. Binlogs do not need to be enabled on the slave. Because the replication transfers the user and passwords from the master, the password needs to be copied from the master to
debian.cnf to access
debian-sys-maint; alternatively, after setting up the slave, use the SQL
SET PASSWORD command to change the password in the configuration file. Stop the server to do so with:
sudo service mysql stop
Now exchange the data directory, which on Debian/Ubuntu is typically found in
/var/lib/mysql, with the same directory from the backup. Before deleting, check once more which users and which groups are listed as file and subdirectory owners, and adapt the permissions if necessary. In a default installation, all files and subdirectories, as well as the
/var/lib/mysql directory, should belong to the
mysql user; then, restart the server:
sudo service mysql start
Even if no other version has been installed, it make sense to run
-f. Beware: The master has also taken on the root password:
sudo mysql_upgrade -f -u root -p <root_password_for_master>
In the data directory you will find the human-readable
xtrabackup_binlog_info file. It states the position in the binlog file at which replication has to start:
mariadb-bin.000008 615822 0-1-625
Using this information and the IP address of the master, the slave is now configurable via SQL (Listing 8). The server is launched using the SQL
START SLAVE command; the SQL
SHOW SLAVE STATUS command (Listing 9) naturally checks the status of the slave.
Listing 8: Slave Configuration
CHANGE MASTER TO MASTER_USER='replication', MASTER_PASSWORD='<a_good_Password>', MASTER_HOST='192.168.23.192', MASTER_LOG_FILE='mariadb-bin.00008', MASTER_LOG_POS=615822;
Listing 9: SHOW SLAVE STATUS
01 *************************** 1. row *************************** 02 Slave_IO_State: Queueing master event to the relay log 03 Master_Host: 192.168.23.192 04 Master_User: replication 05 Master_Port: 3306 06 Connect_Retry: 60 07 Master_Log_File: mariadb-bin.000008 08 Read_Master_Log_Pos: 140927111 09 Relay_Log_File: mysqld-relay-bin.000002 10 Relay_Log_Pos: 11109399 11 Relay_Master_Log_File: mariadb-bin.000008 12 Slave_IO_Running: Yes 13 Slave_SQL_Running: Yes 14 Replicate_Do_DB: 15 Replicate_Ignore_DB: 16 Replicate_Do_Table: 17 Replicate_Ignore_Table: 18 Replicate_Wild_Do_Table: 19 Replicate_Wild_Ignore_Table: 20 Last_Errno: 0 21 Last_Error: 22 Skip_Counter: 0 23 Exec_Master_Log_Pos: 11724682 24 Relay_Log_Space: 140312671 25 Until_Condition: None 26 Until_Log_File: 27 Until_Log_Pos: 0 28 Master_SSL_Allowed: No 29 Master_SSL_CA_File: 30 Master_SSL_CA_Path: 31 Master_SSL_Cert: 32 Master_SSL_Cipher: 33 Master_SSL_Key: 34 Seconds_Behind_Master: 94154 35 Master_SSL_Verify_Server_Cert: No 36 Last_IO_Errno: 0 37 Last_IO_Error: 38 Last_SQL_Errno: 0 39 Last_SQL_Error: 40 Replicate_Ignore_Server_Ids: 41 Master_Server_Id: 1 42 Master_SSL_Crl: 43 Master_SSL_Crlpath: 44 Using_Gtid: No 45 Gtid_IO_Pos: 46 Replicate_Do_Domain_Ids: 47 Replicate_Ignore_Domain_Ids: 48 Parallel_Mode: conservative 49 1 row in set (0.01 sec)
SLAVE_SQL_Running (lines 12 and 13) are set to
Yes, then the slave is running. If one of the two parameters contains the value
No, then you should see an error message below
Last_SQL_Error (lines 37 and 39).
Seconds_Behind_Master information (line 34) is important. With a fresh slave, it takes a while until it has caught up with all the binlogs. The slave is now set up and can be used to load-balance read operations.