The new Development Release for MySQL 5.6 contains a great feature that our users have been asking for for a while (work log 344 first raised in 2010!) – delayed replication.
The concept (and as you’ll see the execution) is extremely simple. If a user makes a mistake on the master – such as dropping some critical tables – then we want to give them the opportunity to recover the situation by using the data held on one of the slaves. The problem is that the slave is busily trying to keep up with the master and in all likelihood will have dropped these tables before the user has time to pull the plug on the replication stream. What this feature does is give the DBA the option to introduce a configurable delay into the replication process so that they have time to cut it off before the mistake is propagated.
This blog explains how this works, how to set that up and then how to bring the slave up to date (to the point in time just before the mistake was made on the master).
To understand how this is implemented, it helps to have a little bit of background on how MySQL replication is implemented. When a change is made on the master, it is applied to the master’s local disk copy and then written to the binary log. The change is then asynchronously (but normally immediately) copied from the master’s binary log to the relay log on the slave; from there an SQL thread on the slave will read the change from the relay log and apply it to the slave’s copy of the data.
This feature works by allowing the user to configure a delay between when the change is applied on the master and when that change is taken from the relay log and applied to the slave. Note that if the master fails during this delay period then the change is not lost as it is has already been safely recorded in the slave’s relay log.
As the delay is implemented on the slave, you are free to use ‘real-time’ replication to one slave (to allow the fastest possible failover if the master fails) and delayed replication to a second slave to guard against user error. This is the setup that this post steps through.
For simplicity, all three MySQL Servers will be run on a single host but each uses a different port number as shown in the diagram. “slave” will apply changes as quickly as it can while “slave2” will introduce a delay when applying changes from its relay log.
Setting up the first slave is very standard:
master> CREATE USER repl_user@localhost; master> GRANT REPLICATION SLAVE ON *.* TO repl_user@localhost IDENTIFIED BY 'pw';
slave> CHANGE MASTER TO -> MASTER_HOST = 'localhost', -> MASTER_PORT = 3306, -> MASTER_USER = 'repl_user', -> MASTER_PASSWORD = 'pw';
slave> start slave;
slave2> CHANGE MASTER TO -> MASTER_HOST = 'localhost', -> MASTER_PORT = 3306, -> MASTER_USER = 'repl_user', -> MASTER_PASSWORD = 'pw', -> MASTER_DELAY = 20;
slave2> START SLAVE;
master> CREATE DATABASE clusterdb;USE clusterdb; master> CREATE TABLE towns (Town VARCHAR(20));
master> REPLACE INTO towns VALUES ("Maidenhead"),("Bray");
slave> SELECT * FROM towns;
+------------+
| Town |
+------------+
| Maidenhead |
| Bray |
+------------+
slave2> SELECT * FROM towns; Empty set (0.00 sec) slave2> SELECT * FROM towns; +------------+ | Town | +------------+ | Maidenhead | | Bray | +------------+
master> REPLACE INTO towns VALUES ("Cookham"),("Marlow");
master> DROP TABLE towns;
slave> SELECT * FROM tables;
ERROR 1146 (42S02): Table 'clusterdb.tables' doesn't exist
slave2> STOP SLAVE; slave2> SELECT * FROM towns; +------------+ | Town | +------------+ | Maidenhead | | Bray | +------------+
This is a good start, while slave has dropped the table, it still exists on slave2. Unfortunately, slave2 is missing the additions to the table that were made just before the mistake was made. The next step is to bring slave 2 almost up to date – stopping just before the table was dropped. To do this we need to find the position within the master’s binary log just before the table was dropped – this can be done using the SHOW BINLOG EVENTS command on the master. Once we have that position (file-name + position) we can tell slave 2 to catch up just to that point using START SLAVE UNTIL
master> SHOW BINLOG EVENTSG
....
*************************** 10. row *************************** Log_name: ws2-bin.000001 Pos: 842 Event_type: Query Server_id: 1 End_log_pos: 957 Info: use `clusterdb`; REPLACE INTO towns VALUES ("Cookham"),("Marlow") *************************** 11. row *************************** Log_name: ws2-bin.000001 Pos: 957 Event_type: Xid Server_id: 1 End_log_pos: 984 Info: COMMIT /* xid=32 */ *************************** 12. row *************************** Log_name: ws2-bin.000001 Pos: 984 Event_type: Query Server_id: 1 End_log_pos: 1096 Info: use `clusterdb`; DROP TABLE `towns` /* generated by server */
slave2> START SLAVE UNTIL -> MASTER_LOG_FILE='ws2-bin.000001', -> MASTER_LOG_POS=984;
slave2> SELECT * FROM towns; +------------+ | Town | +------------+ | Maidenhead | | Bray | | Cookham | | Marlow | +------------+
Success! Now slave2 contains exactly the data we need. After this it’s up to you what to do next; typically this could involve promoting slave2 to be the new master.
If you want to try this out for yourselves then you can download the MySQL 5.6 Milestone Development Release from dev.mysql.com (select the Development Maintenance Release sub-tab to get MySQL 5.6).
This is great. It is currently available for all versions through Maatkit, but that is not as simple to use and inherently less reliable.
Hello Andrew.
First of all I apologise for making questions in you won blog. But I’m really stuck here with a problem which I think you can help me!
I’ve installed mysql cluster and it’s all running smoothly (1 manage server and 2 data nodes).
I’m trying to intall mysql-php in the data nodes so that I can user phpmyadmin but when I run yum install php-mysql.x86_64 I got the conflict error: MySQL-Cluster-gpl-server conflicts with mysql.
Are the not compatible? Because I reallu need the two of them.
Thanks.
Regards,
Vilhena
Hi
Right now i am using 5.5 version of MySQL. Currently this MASTER_DELAY option is available with 5.6 …! .
So is there any possibility to make use of MASTER_DELAY command in version 5.5 ???
Please suggest me…
In general, we don’t retrofit new features to GA releases (users don’t want to have to retest their applications when there are new maintenance releases) and there are no plans to back-port delayed replication,
Regards, Andrew.