Oracle has announced that it now provides support for DRBD with MySQL – this means a single point of support for the entire MySQL/DRBD/Pacemaker/Corosync/Linux stack! As part of this, we’ve released a new white paper which steps you through everything you need to do to configure this High Availability stack. The white paper provides a step-by-step guide to installing, configuring, provisioning and testing the complete MySQL and DRBD stack, including:
MySQL Database
DRBD kernel module and userland utilities
Pacemaker and Corosync cluster messaging and management processes
Oracle Linux operating system
DRBD is an extremely popular way of adding a layer of High Availability to a MySQL deployment – especially when the 99.999% availability delivered by MySQL Cluster isn’t needed. It can be implemented without the shared storage required by typical clustering solutions (MySQL Cluster doesn’t need it either) and so it can be a very cost-effective solution for Linux environments.
Introduction to MySQL on DRBD/Pacemaker/Corosync/Oracle Linux
Fig 1 – MySQL-DRBD Stack
Figure 1 illustrates the stack that can be used to deliver a level of High Availability for the MySQL service.
At the lowest level, two hosts are required in order to provide physical redundancy; if using a virtual environment, those two hosts should be on different physical machines. Importantly, no shared storage is required. At any point in time, the services will be active on one host and in standby mode on the other.
Pacemaker and Corosync combine to provide the clustering layer that sits between the services and the underlying hosts and operating systems. Pacemaker is responsible for starting and stopping services – ensuring that they’re running on exactly one host, delivering high availability and avoiding data corruption. Corosync provides the underlying messaging infrastructure between the nodes that enables Pacemaker to do its job; it also handles node membership within the cluster and informs Pacemaker of any changes.
The core Pacemaker process does not have built-in knowledge of the specific services to be managed; instead, agents are used that provide a wrapper for the service-specific actions. For example, in this solution we use agents for Virtual IP Addresses, MySQL and DRBD – these are all existing agents and come packaged with Pacemaker. This white paper demonstrates how to configure Pacemaker to use these agents to provide a High Availability stack for MySQL.
The essential services managed by Pacemaker in this configuration are DRBD, MySQL and the Virtual IP Address that applications use to connect to the active MySQL service.
DRBD synchronizes data at the block-device level (typically a spinning or solid-state disk) – transparently to the application, database and even the file system. DRBD requires the use of a journaling file system such as ext3 or ext4. For this solution it acts in an active-standby mode – this means that, at any point in time, the directories being managed by DRBD are accessible for reads and writes on exactly one of the two hosts and inaccessible (even for reads) on the other. Any changes made on the active host are synchronously replicated to the standby host by DRBD.
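As an illustration (not taken verbatim from the white paper), a minimal DRBD resource definition for a two-node, active-standby setup like this might look as follows – the host names, backing partition and replication addresses are placeholders for your own environment:
resource clusterdb_res {
  protocol C;                        # fully synchronous replication
  on host1.localdomain {
    device    /dev/drbd0;            # the DRBD block device MySQL will sit on
    disk      /dev/sdb1;             # the local backing partition
    address   192.168.5.100:7789;    # this host's replication link
    meta-disk internal;
  }
  on host2.localdomain {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.5.101:7789;
    meta-disk internal;
  }
}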
Setting up MySQL with DRBD/Pacemaker/Corosync/Oracle Linux
Fig 2 – Target network config
Figure 2 shows the network configuration used in this paper – note that for simplicity a single network connection is used but for maximum availability in a production environment you should consider redundant network connections.
A single Virtual IP (VIP) is shown in the figure (192.168.5.102) and this is the address that the application will connect to when accessing the MySQL database. Pacemaker is responsible for migrating this VIP between the two hosts.
One of the final steps in configuring Pacemaker is to add network connectivity monitoring so that an isolated host will stop its MySQL service, avoiding a “split-brain” scenario. This is achieved by having each host ping an external IP address (one that is not part of the cluster) – in this case the network router (192.168.5.1).
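As a sketch of the kind of Pacemaker resource this monitoring uses (crm-shell syntax; the resource names are examples rather than the white paper’s exact configuration):
primitive p_ping ocf:pacemaker:ping params host_list="192.168.5.1" multiplier="100" op monitor interval="15s"
clone cl_ping p_ping meta interleave="true"
Each host then carries a pingd attribute that a location constraint can test to keep the MySQL resources off a host that has lost connectivity – the constraint appears in the configuration sketch after Figure 4.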
Fig 3 – Locations of files
Figure 3 shows where the MySQL files will be stored. The MySQL binaries as well as the socket (mysql.sock) and process-id (mysql.pid) files are stored in a regular partition – independently on each host (under /var/lib/mysql/). The MySQL Server configuration file (my.cnf) and the database files (data/*) are stored in a DRBD-controlled file system that at any point in time is only available on one of the two hosts; it is mounted under /var/lib/mysql_drbd/.
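To make that layout concrete, the relevant lines of a my.cnf following Figure 3 might look like this (a sketch rather than the paper’s exact file – remember that the my.cnf itself lives on the DRBD file system):
[mysqld]
# Per-host partition: socket and pid file stay local to each host
socket=/var/lib/mysql/mysql.sock
pid-file=/var/lib/mysql/mysql.pid
# DRBD-managed file system: only mounted on the currently active host
datadir=/var/lib/mysql_drbd/data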
Fig 4 – Clustered resources
The white paper steps through setting all of this up, as well as the resources in Pacemaker/Corosync that allow detection of a problem and the failover of the storage (DRBD), database (MySQL) and the Virtual IP address used by the application to access the database – all in a coordinated way, of course. As you’ll notice in Figure 4, this involves setting up quite a few entities and relationships – the paper goes through each one.
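For illustration, the kind of crm-shell configuration that ties these entities together might look like the following. This is a sketch that assumes the DRBD resource is called clusterdb_res and that /dev/drbd0 is mounted on /var/lib/mysql_drbd (as in the earlier sketches); the names and parameter values are examples, not necessarily those used in the white paper:
# DRBD as a master/slave resource - Pacemaker decides which host is Primary
primitive p_drbd_mysql ocf:linbit:drbd params drbd_resource="clusterdb_res" op monitor interval="15s"
ms ms_drbd_mysql p_drbd_mysql meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
# File system, Virtual IP and MySQL grouped so they start/stop together
primitive p_fs_mysql ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/var/lib/mysql_drbd" fstype="ext4"
primitive p_ip_mysql ocf:heartbeat:IPaddr2 params ip="192.168.5.102" cidr_netmask="24"
primitive p_mysql ocf:heartbeat:mysql params config="/var/lib/mysql_drbd/my.cnf" datadir="/var/lib/mysql_drbd/data" op monitor interval="30s"
group g_mysql p_fs_mysql p_ip_mysql p_mysql
# The group must run where DRBD is Master, and only after it has been promoted there
colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start
# Keep the group on a host that can still reach the router (pingd attribute from the earlier ping sketch)
location l_mysql_connected g_mysql rule -inf: not_defined pingd or pingd lte 0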
Deploying a well-configured cluster has just got a lot easier! Oracle has released a new auto-installer/configurator for MySQL Cluster that makes the process extremely simple while making sure that the cluster is well configured for your application. The installer is part of MySQL Cluster 7.3 and so is not yet GA, but it can also be used with MySQL Cluster 7.2. A single command launches the web-based wizard, which then steps you through configuring the cluster; to keep things even simpler, it will automatically detect the resources on your target machines and use these results, together with the type of workload you specify, to determine values for the key configuration parameters.
Tutorial Video
Before going through the detailed steps, here’s a demonstration of the auto-installer in action…
Downloading and running the wizard
The software can be downloaded from MySQL Labs; just select the MySQL-Cluster-Auto-Installer build, unzip the file and then run it. To run on Windows, just double-click setup.bat – note that if you installed from the MSI and didn’t change the install directory then this will be located somewhere like C:\Program Files (x86)\MySQL\MySQL Cluster 7.2. On Linux, just run ndb_setup.
Creating your cluster
Landing page
When you run the auto-installer it starts a small web server and then (if possible) automatically connects your web browser to it – presenting you with the first page of the wizard. If this isn’t possible (for example the server isn’t running a desktop environment), then you can connect to it remotely using the URL http://your-server-name-goes-here:8081/index.html. It may take a number of seconds to load and so please be patient. Note that the machine where you run this doesn’t need to be a host that will be included in the cluster.
From the landing page, just click on the “Create new MySQL Cluster” icon to get started.
On the next page you need to specify the list of servers that will form the cluster. The machine where the installer is being run needs ssh access to all of the cluster hosts (and access to those machines must already have been approved from this one – if you’re uncertain, just manually connect to each one using an ssh client first).
By default, the wizard assumes that ssh keys have been set up (so that a password isn’t needed) – if that isn’t the case, just un-check the checkbox and provide your username and password.
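If you haven’t set up key-based ssh access yet, a minimal sketch looks like this (assuming OpenSSH; the user and host names are placeholders for your own environment):
# Generate a key pair on the machine running the installer (accept the defaults)
ssh-keygen -t rsa
# Copy the public key to each target host so that password-less login works
ssh-copy-id user@host1
ssh-copy-id user@host2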
On this page, you also get to specify what “type” of cluster you want; if you’re experimenting for the first time then it’s probably safest to stick with “Simple testing”, but for a production system you’d want to specify the kind of application and whether it will be a write-intensive application.
Auto-discovery of target host resources
On the next page, you will see the wizard attempt to auto-detect the resources on your target machines. If this fails then you can enter the data manually.
You can also overwrite the resource values (for example, if you don’t want the cluster to use up a big share of the memory on the target systems then just overwrite the amount of memory).
Overwriting the default directories on the target systems
It’s also on this page that you can specify where the MySQL Cluster software is stored on each of the hosts (if the defaults aren’t correct) – this should be the path to where you unzipped the MySQL Cluster tar-ball/zip file – as well as where the data (and configuration files) should be stored. You can just overwrite the values or select multiple rows and hit the “edit” button.
Defining processes
The following page presents you with a default set of nodes (processes) and how they’ll be distributed across all of the target hosts – if you’re happy with the proposal then just advance to the next page. So what can you change?
Add extra nodes
Move nodes from one host to another (just drag and drop)
Delete nodes
Change a node from one type to another
Add process
The diagram to the right shows an example of adding an extra MySQL Server.
On the next screen you’re presented with some of the key configuration parameters that have been set (behind the scenes, the wizard sets many more) that you might want to override; if you’re happy then just progress to the next screen. If you do want to make any changes then make them here before continuing. If you’d previously selected anything other than “simple” for the kind of cluster to create then you can check the “Show advanced configuration options” box in order to view/modify more parameters.
Deployment in progress
On the final screen you can review the details of the final recommended configuration and then just hit “Deploy and start cluster” and it will do just that. Depending on the complexity of the cluster, it can take a while to deploy and start everything but you’re shown a progress bar together with an explanation of what stage the process is at.
If for some reason you prefer or need to start the processes manually, this page also shows you the commands that you’d need to run (as well as the configuration files if you need to create them manually).
Once the wizard declares the process complete, you can check for yourself before going ahead and starting your testing:
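For example, the ndb_mgm management client will report the status of all of the nodes; a minimal check (assuming you run it on the host where the management node is running, or point it at that host with -c) is:
ndb_mgm -e show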
If you’re a user of MySQL Workbench then you may have noticed a pocket knife icon appear in the top right hand corner – click on that and a terminal opens which gives you access to the MySQL utilities. In this post I’m focussing on the replication utilities but you can also refer to the full MySQL Utilities documentation.
What I’ll step through is how to use these utilities to:
Set up replication from a single master to multiple slaves
Automatically detect the failure of the master and promote one of the slaves to be the new master
Introduce the old master back into the topology as a new slave and then promote it to be the master again
Tutorial Video
Before going through the steps in detail here’s a demonstration of the replication utilities in action…
For many people, the instinctive reaction is to deploy a fully automated system that detects when the master database fails and then fails over (promotes a slave to be the new master) without human intervention. For many applications this may be the correct approach.
There are inherent risks to this though – What if the failover implementation has a flaw and fails (after all, we probably don’t test this out in the production system very often)? What if the slave isn’t able to cope with the workload and makes things worse? Is it just a transitory glitch and would the best approach have been just to wait it out?
Following a recent, high profile outage there has been a great deal of debate on the topic between those that recommend auto-failover and those that believe it should only ever be entrusted to a knowledgeable (of the application and the database architecture) and well informed (of the state of the database nodes, application load etc.) human. Of course, if the triggering of the failover is to be left to a human then you want that person to have access to the information they need and an extremely simple procedure (ideally a single command) to execute the failover. Probably the truth is that it all depends on your specific circumstances.
The MySQL replication utilities aim to support you whichever camp you belong to:
In the fully automated mode, the utilities will continually monitor the state of the master and in the event of its failure identify the best slave to promote – by default it will select the one that is most up-to-date and then apply any changes that are available on other slaves but not on this one before promoting it to be the new master. The user can override this behaviour (for example by limiting which of the slaves are eligible for promotion). The user is also able to bind in their own programs to be run before and after the failover (for example, to inform the application).
In the monitoring mode, the utility still continually checks the availability of the master, and informs the user if it should fail. The user then executes a single command to fail over to their preferred slave.
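For example, mysqlfailover can be started so that it only monitors the topology and never promotes a slave itself – the option for this should be --failover-mode=fail, but it’s worth checking against your version of the utilities:
mysqlfailover --master=root@utils1:3306 --discover-slaves-login=root --failover-mode=fail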
Step 1. Make sure MySQL Servers are configured correctly
For some of the utilities, it’s important that you’re using Global Transaction IDs; binary logging needs to be enabled; and you may as well use the new crash-safe slave functionality… It’s beyond the scope of this post to go through all of those, so instead I’ll just give example configuration files for the 5 MySQL Servers that will be used:
my1.cnf
[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
disable-gtid-unsafe-statements=true # Use enforce-gtid-consistency from 5.6.9+
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
datadir=/home/billy/mysql/data1
server-id=1
log-bin=util11-bin.log
report-host=utils1
report-port=3306
socket=/home/billy/mysql/sock1
port=3306
my2.cnf
[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
disable-gtid-unsafe-statements=true # Use enforce-gtid-consistency from 5.6.9+
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
datadir=/home/billy/mysql/data2
server-id=2
log-bin=util12-bin.log
report-host=utils1
report-port=3307
socket=/home/billy/mysql/sock2
port=3307
my3.cnf
[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
disable-gtid-unsafe-statements=true # Use enforce-gtid-consistency from 5.6.9+
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
datadir=/home/billy/mysql/data3
server-id=3
log-bin=util2-bin.log
report-host=utils2
report-port=3306
socket=/home/billy/mysql/sock3
port=3306
my4.cnf
[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
disable-gtid-unsafe-statements=true # Use enforce-gtid-consistency from 5.6.9+
master-info-repository=TABLE
relay-log-info-repository=TABLE
master-info-file=/home/billy/mysql/master4.info
datadir=/home/billy/mysql/data4
server-id=4
log-bin=util4-bin.log
report-host=utils2
report-port=3307
socket=/home/billy/mysql/sock4
port=3307
The utilities are actually going to be run from a remote host, so that host must be able to access each of the MySQL Servers; this means a user has to be granted remote access (note that the utilities will automatically create the replication user):
[billy@utils1 ~]$ mysql -h 127.0.0.1 -P3306 -u root -e "grant all on *.* to root@'%' with grant option;"
[billy@utils1 ~]$ mysql -h 127.0.0.1 -P3307 -u root -e "grant all on *.* to root@'%' with grant option;"
[billy@utils2 ~]$ mysql -h 127.0.0.1 -P3306 -u root -e "grant all on *.* to root@'%' with grant option;"
[billy@utils2 ~]$ mysql -h 127.0.0.1 -P3307 -u root -e "grant all on *.* to root@'%' with grant option;"
[billy@utils2 ~]$ mysql -h 127.0.0.1 -P3308 -u root -e "grant all on *.* to root@'%' with grant option;"
OK – that’s the most painful part of the whole process out of the way!
Set up replication
While there are extra options (such as specifying what username/password to use for the replication user or providing a password for the root user) I’m going to keep things simple and use the defaults as much as possible. The following commands are run from the MySQL Utilities terminal – just click on the pocket-knife icon in MySQL Workbench.
mysqlreplicate --master=root@utils1:3306 --slave=root@utils1:3307
# master on utils1: ... connected.
# slave on utils1: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.
mysqlreplicate --master=root@utils1:3306 --slave=root@utils2:3306
# master on utils1: ... connected.
# slave on utils2: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.
mysqlreplicate --master=root@utils1:3306 --slave=root@utils2:3307
# master on utils1: ... connected.
# slave on utils2: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.
mysqlreplicate --master=root@utils1:3306 --slave=root@utils2:3308
# master on utils1: ... connected.
# slave on utils2: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.
That’s it, replication has now been set up from one master to four slaves.
You can now check that the replication topology matches what you intended:
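One way to do that is with the mysqlrplshow utility, which maps out the topology starting from the master (command only – the topology graph it prints is omitted here):
mysqlrplshow --master=root@utils1:3306 --discover-slaves-login=root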
You can also check that any of the replication relationships is correctly configured:
mysqlrplcheck --master=root@utils1 --slave=root@utils2
# master on utils1: ... connected.
# slave on utils2: ... connected.
Test Description Status
---------------------------------------------------------------------------
Checking for binary logging on master [pass]
Are there binlog exceptions? [pass]
Replication user exists? [pass]
Checking server_id values [pass]
Is slave connected to master? [pass]
Check master information file [pass]
Checking InnoDB compatibility [pass]
Checking storage engines compatibility [pass]
Checking lower_case_table_names settings [pass]
Checking slave delay (seconds behind master) [pass]
# ...done.
Including the -s option would have included the output that you’d expect to see from SHOW SLAVE STATUS\G on the slave.
Automated monitoring and failover
The previous section showed how you can save some serious time (and opportunity for user-error) when setting up MySQL replication. We now look at using the utilities to automatically monitor the state of the master and then automatically promote a new master from the pool of slaves. For simplicity I’ll stick with default values wherever possible but note that there are a number of extra options available to you such as:
Constraining which slaves are eligible for promotion to master; the default is to take the most up-to-date slave
Binding in your own scripts to be run before or after the failover (e.g. to inform your application to switch to the new master)
Having the utility monitor the state of the servers but not automatically initiate failover
Here is how to set it up:
mysqlfailover --master=root@utils1:3306 --discover-slaves-login=root --rediscover
MySQL Replication Failover Utility
Failover Mode = auto Next Interval = Wed Aug 15 13:19:30 2012
Master Information
------------------
Binary Log File Position Binlog_Do_DB Binlog_Ignore_DB
util11-bin.000001 2586
Replication Health Status
+---------+-------+---------+--------+------------+---------+
| host | port | role | state | gtid_mode | health |
+---------+-------+---------+--------+------------+---------+
| utils1 | 3306 | MASTER | UP | ON | OK |
| utils1 | 3307 | SLAVE | UP | ON | OK |
| utils2 | 3306 | SLAVE | UP | ON | OK |
| utils2 | 3307 | SLAVE | UP | ON | OK |
| utils2 | 3308 | SLAVE | UP | ON | OK |
+---------+-------+---------+--------+------------+---------+
Q-quit R-refresh H-health G-GTID Lists U-UUIDs
mysqlfailover will then continue to run, refreshing the state – just waiting for something to go wrong.
Rather than waiting, I kill the master MySQL Server:
mysqladmin -h utils1 -P3306 -u root shutdown
Checking with the still-running mysqlfailover, we can see that it has promoted utils1:3307 to be the new master.
MySQL Replication Failover Utility
Failover Mode = auto Next Interval = Wed Aug 15 13:21:13 2012
Master Information
------------------
Binary Log File Position Binlog_Do_DB Binlog_Ignore_DB
util12-bin.000001 7131
Replication Health Status
+---------+-------+---------+--------+------------+---------+
| host | port | role | state | gtid_mode | health |
+---------+-------+---------+--------+------------+---------+
| utils1 | 3307 | MASTER | UP | ON | OK |
| utils2 | 3306 | SLAVE | UP | ON | OK |
| utils2 | 3307 | SLAVE | UP | ON | OK |
| utils2 | 3308 | SLAVE | UP | ON | OK |
+---------+-------+---------+--------+------------+---------+
Q-quit R-refresh H-health G-GTID Lists U-UUIDs
Add the recovered MySQL Server back into the topology
After restarting the failed MySQL Server, it can be added back into the mix as a slave to the new master:
mysqlreplicate --master=root@utils1:3307 --slave=root@utils1:3306
# master on utils1: ... connected.
# slave on utils1: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.
The output from mysqlfailover (still running) confirms the addition:
MySQL Replication Failover Utility
Failover Mode = auto Next Interval = Wed Aug 15 13:24:38 2012
Master Information
------------------
Binary Log File Position Binlog_Do_DB Binlog_Ignore_DB
util12-bin.000001 7131
Replication Health Status
+---------+-------+---------+--------+------------+---------+
| host | port | role | state | gtid_mode | health |
+---------+-------+---------+--------+------------+---------+
| utils1 | 3307 | MASTER | UP | ON | OK |
| utils1 | 3306 | SLAVE | UP | ON | OK |
| utils2 | 3306 | SLAVE | UP | ON | OK |
| utils2 | 3307 | SLAVE | UP | ON | OK |
| utils2 | 3308 | SLAVE | UP | ON | OK |
+---------+-------+---------+--------+------------+---------+
Q-quit R-refresh H-health G-GTID Lists U-UUIDs
If it were important that the recovered MySQL Server be restored as the master then it is simple to manually trigger the promotion (after quitting out of mysqlfailover):
mysqlrpladmin --master=root@utils1:3307 --new-master=root@utils1:3306 --demote-master \
  --discover-slaves-login=root switchover
# Discovering slaves for master at utils1:3307
# Checking privileges.
# Performing switchover from master at utils1:3307 to slave at utils1:3306.
# Checking candidate slave prerequisites.
# Waiting for slaves to catch up to old master.
# Stopping slaves.
# Performing STOP on all slaves.
# Demoting old master to be a slave to the new master.
# Switching slaves to new master.
# Starting all slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Switchover complete.
#
# Replication Topology Health:
+---------+-------+---------+--------+------------+---------+
| host | port | role | state | gtid_mode | health |
+---------+-------+---------+--------+------------+---------+
| utils1 | 3306 | MASTER | UP | ON | OK |
| utils1 | 3307 | SLAVE | UP | ON | OK |
| utils2 | 3306 | SLAVE | UP | ON | OK |
| utils2 | 3307 | SLAVE | UP | ON | OK |
| utils2 | 3308 | SLAVE | UP | ON | OK |
+---------+-------+---------+--------+------------+---------+
# ...done.
As always, we’d really appreciate people trying this out and giving us feedback!