MySQL error 1236 When using GTID

The Georgia picture The Georgia · Jul 15, 2016 · Viewed 12.4k times · Source

I want to create a replica to my Percona Server with GTID enabled, but got this error when i show slave status:

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'

Normally, i would stop my slave, reset it, reset master (on the slave), and get new GTID_PURGED value from the master. But this time around, the master has a very unusual value(s) and i am not sure how to determine which one to use:

mysql> show master status\G
*************************** 1. row ***************************
             File: mysqld-bin.000283
         Position: 316137263
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 1570dee1-165b-11e6-a4a2-00e081e93212:1-3537,
c73f3ee7-e8d4-ee19-6507-f898a9930ccd:1-18609,
cdb70eaa-f753-ee1b-5c95-ecb8024ae729:1-2357789559:2357789561-2357790104:2357790106-2514115701:2514115703-2514115705:2514115707-2546512667
1 row in set (0.00 sec)

From the slave with the new backup copy, i get this:

root@ubuntu:/var/lib/mysql# cat xtrabackup_binlog_info
mysqld-bin.000283       294922064       1570dee1-165b-11e6-a4a2-00e081e93212:1-3537,
c73f3ee7-e8d4-ee19-6507-f898a9930ccd:1-18609,
cdb70eaa-f753-ee1b-5c95-ecb8024ae729:1-2357789559:2357789561-2357790104:2357790106-2514115701:2514115703-2514115705:2514115707-2546400960

One more thing, i just purged the binary logs on the master before i made a backup. automatic binlog purge is set to 7 days. So i know its not because the bin log has been purged as the error is suggesting.

I am running Ubuntu 14:04, and Percona server version 5.6.31-77.

How can i resolve this issue? What is the correct value of the master's GTID_PURGED?

Answer

Obivan picture Obivan · Jan 30, 2017

mysql 5.6 GTID replication errors and fixes What is GTID? 

4c2ad77f-697e-11e3-b2c3-c80aa9f17dc4
  • This is the server's 128 bit identification number (SERVER_UUID). It identifies where the transaction was originated. Every server has its own SERVER_UUID. What problems GTID solves?

  • It is possible to identify a transaction uniquely across the replication servers. Make the automation of failover process much easier. There is no need to do calculations, inspect the binary log and so on. Just MASTER_AUTO_POSITION=1.

  • At application level, it is easier to do WRITE/READ split. After a write on the MASTER, you have a GTID so just check if that GTID has been executed on the SLAVE that you use for reads.
  • Development of new automation tools isn't a pain now. How can I implement it?

Three variables are needed in ALL servers of the replication chain

  • gtid_mode: It can be ON or OFF (not 1 or 0). It enables the GTID on the server.
  • log_bin: Enable binary logs. Mandatory to create a replication environment.
  • log-slave-updates: Slave servers must log the changes that come from the master in its own binary log.
  • enforce-gtid-consistency: Statements that can't be logged in a transactionally safe manner are denied by the server. ref: http://dev.mysql.com/doc/refman/5.6/en/replication-gtids-howto.html

Replication errors and fixes:

"'Got fatal error 1236 from master when reading data from binary log: "The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires." slave_io thread stop running.

Resolution: Considering following are the master – slave UUID's

MASTER UUID: 4c2ad77f-697e-11e3-b2c3-c80aa9f17dc4
SLAVE UUID: 5b37def1-6189-11e3-bee0-e89a8f22a444

Steps:

slave>stop slave;

slave> FLUSH TABLES WITH READ LOCK;

slave>show master status;

'4c2ad77f-697e-11e3-b2c3-c80aa9f17dc4:1-83345127,5b37def1-6189-11e3-bee0-e89a8f22a444:1-13030:13032-13317:13322-13325:13328-653183:653185-654126:654128-1400817:1400820-3423394:3423401-5779965′

(HERE 83345127  Last GTID executed on master and 5779965 Last slave GTID executed on Master )

slave> reset master;

slave>set global GTID_PURGED='4c2ad77f-697e-11e3-b2c3-c80aa9f17dc4:1-83345127,5b37def1-6189-11e3-bee0-e89a8f22a444:1-5779965′;

slave>start slave;

slave>unlock  tables;

  slave>show slave status;

NOTE: After this Re-start slave other chain-slaves if they stop replicating;

ERROR: 'Error "Table … 'doesn"t exist" on query. Default database: …Query: "INSERT INTO OR Last_SQL_Error: ….Error 'Duplicate entry' SKIP Transaction on slave (slave_sql Thread stop running) NOTE:

  • SQL_SLAVE_SKIP_COUNTER doesn't work anymore with GTID.
  • We need to find what transaction is causing the replication to fail.
    • – From binary log
    • – From SHOW SLAVE STATUS (retrieved vs executed) Type of errors: (check last sql error in show slave status)

Resolution: Considering following are the master – slave UUID's

MASTER UUID: 4c2ad77f-697e-11e3-b2c3-c80aa9f17dc4
SLAVE UUID: 5b37def1-6189-11e3-bee0-e89a8f22a444

slave>show slave status;

copy the 'Executed_Gtid_Set' value. '4c2ad77f-697e-11e3-b2c3-c80aa9f17dc4:1-659731804,5b37def1-6189-11e3-bee0-e89a8f22a444:1-70734947-80436012:80436021-80437839'

-Seems that slave (with uuid 5b37def1-6189-11e3-bee0-e89a8f22a444) transaction '80437840' is causing the problem here.

slave> STOP SLAVE;

slave> SET GTID_NEXT="5b37def1-6189-11e3-bee0-e89a8f22a444:80437840";  (last_executed_slave_gtid_on_master + 1)

slave> BEGIN; COMMIT;

slave> SET GTID_NEXT="AUTOMATIC";

slave> START SLAVE;

slave>  show slave status;

and it's ALL SET !!!