Using MySQL 5.6.13 under CentOS 6.4, I had a replication failure on one master/slave topology because the master binary log had the following entry that was intentionally written by the MySQL server.
$ mysqlbinlog --start-position=244670849 mysql-bin.000029 ... # at 244670849 #131028 19:31:38 server id 39 end_log_pos 244670906 CRC32 0xc5d084ec # Incident: LOST_EVENTS RELOAD DATABASE; # Shall generate syntax error # at 244670906 ...
The question is why? I do not know the answer. Searching online indicates this can occur in a MySQL Cluster environment, and can occur around the use of GRANT statements. Neither of these situations are applicable here. This site runs 4 different master/slave topologies all running the same version, and this has been the only occurrence.
The message, if taken literally could indicate that the master server failed to write all events to the master binary log, but instead of reporting any information (say in the MySQL error log), it basically tells you the binary log (aka the replication stream) is now incomplete and you need to rebuild your entire replication topology. If this is true, then this is a poor explanation, and there is no MySQL documentation I have found to shed light here.
The situation of course does not manifest to the end user on the master (or say monitoring the MySQL error log), but on the slave with a message similar to:
mysql> SHOW SLAVE STATUS\G ... Relay_Master_Log_File: mysql-bin.000029 Exec_Master_Log_Pos: 244670849 ... Slave_IO_Running: Yes Slave_SQL_Running: No Last_Errno: 1590 Last_Error: The incident LOST_EVENTS occured on the master. Message: error writing to the binary log ...
What appears to happen on the master, is the LOST_EVENTS binary log event is written, the binary logs are flushed, as indicated by the file sizes.
-rw-rw---- 1 mysql mysql 1073742236 Oct 27 23:33 mysql-bin.000027 -rw-rw---- 1 mysql mysql 1073741890 Oct 28 11:28 mysql-bin.000028 -rw-rw---- 1 mysql mysql 244670953 Oct 28 19:31 mysql-bin.000029 -rw-rw---- 1 mysql mysql 1073742184 Oct 29 08:55 mysql-bin.000030
Further analysis of the statements before the event in the master log show nothing of incident. Analysis of the binary logs at the error and subsequently how second timestamps that are identical, so there is no *obvious* loss of operations, but when there are 100s of transactions per second, using second granularity is ineffective.
When confirming there were no events after this position on the slave, and the binary logs had been flushed the solution taken was to reset the slave to the next valid event.
mysql> STOP SLAVE; mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000030', MASTER_LOG_POS=120; mysql> START SLAVE: