MySQL resiliency after a crash and relay log files
One of my clients had a major power failure at his (new!) data center the other week, and one side effect was quite a bit of data corruption (on an ext3 filesystem). We lost the LDAP database (which we recovered from the LDAP slave), and we also lost a MySQL relay log file on a MySQL "slave".
MySQL appears to use these "relay log" files as a buffer between reading from a master server and feeding into a slave; it's supposed to make things more resilient.
Of course, if you end up with a missing file, you end up with the following in your mysqld.log file (on Linux):
070304 22:24:43 [ERROR] Failed to open the relay log './<host>-relay-bin.000057' (relay_log_pos 494316200)
070304 22:24:43 [ERROR] Could not find target log during relay log initialization
This error prevents the slave from starting, and you get the following error instead:
ERROR 1201 (HY000): Could not initialize master info structure; more error messages can be found in the MySQL error log
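Before rebuilding anything, it can help to confirm which relay logs are actually gone. The following is a minimal sketch, not a definitive tool: it assumes the slave keeps its relay log index (normally named <host>-relay-bin.index) in the data directory and that the index lists one relay log per line. The function name missing_relay_logs is made up for this example.

```shell
# Print every relay log named in the index file that no longer exists on disk.
# Assumption: relative entries (e.g. ./<host>-relay-bin.000057) are relative
# to the directory that holds the index file.
missing_relay_logs() {
    index_file=$1
    index_dir=$(dirname "$index_file")
    # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
    while IFS= read -r entry || [ -n "$entry" ]; do
        [ -n "$entry" ] || continue          # skip blank lines
        case $entry in
            /*) path=$entry ;;               # absolute path: use as-is
            *)  path=$index_dir/$entry ;;    # relative path: resolve against the index dir
        esac
        [ -e "$path" ] || printf '%s\n' "$entry"
    done < "$index_file"
}
```

You would call it as missing_relay_logs /var/lib/mysql/<host>-relay-bin.index; no output means every relay log listed in the index is still present.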
The solution that I found that finally worked was to:
- Create a new dump of the master database: issue FLUSH TABLES WITH READ LOCK, record the master log position (as shown by SHOW MASTER STATUS), etc.
- Load the dump into the slave. (Thus far we're following regular directions for creating a slave.)
- Stop MySQL.
- cd /var/lib/mysql; rm *relay-bin* master.info (This gets rid of all the information about the old slave.) (DANGER: Like all other instructions to delete files, use this at your own risk. Make backups of everything if you're unsure.)
- Restart MySQL.
- Execute the CHANGE MASTER command that you use to set up the slave, with the master log file and position you recorded in the first step.
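If deleting files outright makes you nervous, the file-removal step above can be sketched as a move into a backup directory instead. This is only an illustration under stated assumptions: the function name stash_slave_state and the backup location are hypothetical, the file patterns are the same ones used in the rm command above, and it should only be run while MySQL is stopped.

```shell
# Move old relay logs and master.info out of the data directory
# instead of deleting them. Run only while mysqld is stopped.
stash_slave_state() {
    datadir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for f in "$datadir"/*relay-bin* "$datadir"/master.info; do
        # A glob that matches nothing stays literal, so skip paths that do not exist.
        [ -e "$f" ] || continue
        mv -- "$f" "$backup_dir"/ || return 1
    done
}
```

For example, stash_slave_state /var/lib/mysql /root/old-slave-state (the second path is just a placeholder) leaves the rest of the data directory untouched. Once the rebuilt slave is replicating again, the backup directory can be removed.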
Comments
Hello,
Find the /var/lib/mysql/relay-log.info file. In this file, try changing the mysqld-relay-bin.x name to the real name found in the /var/run/mysqld directory. It worked for me without redumping data from the master.
Posted by: Kory G | December 17, 2007 10:14 AM
I tried that--this is the case where a binlog might be missing altogether...in which case, you've got to rebuild the slave from scratch, and you have to wipe out the old binlogs and slave information.
Posted by: Jerry B. Altzman | December 17, 2007 10:18 AM