The new Development Release for MySQL 5.6 contains a great feature that our users have been asking for for a while (work log 344 first raised in 2010!) – delayed replication.
Stop mistake being propagated
The concept (and as you’ll see the execution) is extremely simple. If a user makes a mistake on the master – such as dropping some critical tables – then we want to give them the opportunity to recover the situation by using the data held on one of the slaves. The problem is that the slave is busily trying to keep up with the master and in all likelihood will have dropped these tables before the user has time to pull the plug on the replication stream. What this feature does is give the DBA the option to introduce a configurable delay into the replication process so that they have time to cut it off before the mistake is propagated.
This blog explains how this works, how to set that up and then how to bring the slave up to date (to the point in time just before the mistake was made on the master).
MySQL Replication Implementation
To understand how this is implemented, it helps to have a little bit of background on how MySQL replication is implemented. When a change is made on the master, it is applied to the master’s local disk copy and then written to the binary log. The change is then asynchronously (but normally immediately) copied from the master’s binary log to the relay log on the slave; from there an SQL thread on the slave will read the change from the relay log and apply it to the slave’s copy of the data.
This feature works by allowing the user to configure a delay between when the change is applied on the master and when that change is taken from the relay log and applied to the slave. Note that if the master fails during this delay period then the change is not lost as it is has already been safely recorded in the slave’s relay log.
Immediate & Delayed Replication
As the delay is implemented on the slave, you are free to use ‘real-time’ replication to one slave (to allow the fastest possible failover if the master fails) and delayed replication to a second slave to guard against user error. This is the setup that this post steps through.
For simplicity, all three MySQL Servers will be run on a single host but each uses a different port number as shown in the diagram. “slave” will apply changes as quickly as it can while “slave2” will introduce a delay when applying changes from its relay log.
Setting up the first slave is very standard:
master> CREATE USER repl_user@localhost;
master> GRANT REPLICATION SLAVE ON *.* TO
repl_user@localhost IDENTIFIED BY 'pw';
slave> CHANGE MASTER TO
-> MASTER_HOST = 'localhost',
-> MASTER_PORT = 3306,
-> MASTER_USER = 'repl_user',
-> MASTER_PASSWORD = 'pw';
slave> start slave;
When setting up the delayed slave, one extra parameter is included in the CHANGE MASTER command:
slave2> CHANGE MASTER TO
-> MASTER_HOST = 'localhost',
-> MASTER_PORT = 3306,
-> MASTER_USER = 'repl_user',
-> MASTER_PASSWORD = 'pw',
-> MASTER_DELAY = 20;
slave2> START SLAVE;
The MASTER_DELAY parameter indicates the delay in seconds (here I’ve used 20 seconds – in a production environment you’d probably want to give your self much longer.
Prior to this, a simple table had already been created:
master> CREATE DATABASE clusterdb;USE clusterdb;
master> CREATE TABLE towns (Town VARCHAR(20));
Next we check that changes are immediately applied on slave while delayed on slave2:
master> REPLACE INTO towns VALUES ("Maidenhead"),("Bray");
slave> SELECT * FROM towns;
+------------+
| Town |
+------------+
| Maidenhead |
| Bray |
+------------+
slave2> SELECT * FROM towns;
Empty set (0.00 sec)
slave2> SELECT * FROM towns;
+------------+
| Town |
+------------+
| Maidenhead |
| Bray |
+------------+
The next step is to simulate a user error; I add some more data to the table and then drop the table. Following on from that I stop the replication on slave2 before the mistake is applied:
master> REPLACE INTO towns VALUES ("Cookham"),("Marlow");
master> DROP TABLE towns;
slave> SELECT * FROM tables;
ERROR 1146 (42S02): Table 'clusterdb.tables' doesn't exist
slave2> STOP SLAVE;
slave2> SELECT * FROM towns;
+------------+
| Town |
+------------+
| Maidenhead |
| Bray |
+------------+
Bring the Slave up to the Desired Point in Time
This is a good start, while slave has dropped the table, it still exists on slave2. Unfortunately, slave2 is missing the additions to the table that were made just before the mistake was made. The next step is to bring slave 2 almost up to date – stopping just before the table was dropped. To do this we need to find the position within the master’s binary log just before the table was dropped – this can be done using the SHOW BINLOG EVENTS command on the master. Once we have that position (file-name + position) we can tell slave 2 to catch up just to that point using START SLAVE UNTIL . Once that has been done, I check that the extra 2 inserts have been applied to slave2:
master> SHOW BINLOG EVENTSG
....
*************************** 10. row ***************************
Log_name: ws2-bin.000001
Pos: 842
Event_type: Query
Server_id: 1
End_log_pos: 957
Info: use `clusterdb`; REPLACE INTO towns VALUES ("Cookham"),("Marlow")
*************************** 11. row ***************************
Log_name: ws2-bin.000001
Pos: 957
Event_type: Xid
Server_id: 1
End_log_pos: 984
Info: COMMIT /* xid=32 */
*************************** 12. row ***************************
Log_name: ws2-bin.000001
Pos: 984
Event_type: Query
Server_id: 1
End_log_pos: 1096
Info: use `clusterdb`; DROP TABLE `towns` /* generated by server */
slave2> START SLAVE UNTIL
-> MASTER_LOG_FILE='ws2-bin.000001',
-> MASTER_LOG_POS=984;
slave2> SELECT * FROM towns;
+------------+
| Town |
+------------+
| Maidenhead |
| Bray |
| Cookham |
| Marlow |
+------------+
Success! Now slave2 contains exactly the data we need. After this it’s up to you what to do next; typically this could involve promoting slave2 to be the new master.
If you want to try this out for yourselves then you can download the MySQL 5.6 Milestone Development Release from dev.mysql.com (select the Development Maintenance Release sub-tab to get MySQL 5.6).