The MySQL symlink trap

Many users of MySQL install and use the standard directories for MySQL data and binary logs. Generally this is /var/lib/mysql.
As your system grows and you need more disk space on the general OS partition that commonly holds /tmp, /usr and often /home, you create a dedicated partition, for example /mysql. The MySQL data, binary logs etc are then moved to this partition (hopefully in dedicated directories). For example data is placed in /mysql/data.
Often however, a symbolic link (symlink) is used to so MySQL still refers to the data in /var/lib/mysql.

When it comes to removing the symlink and correctly configuring MySQL, you first stop MySQL and correctly defining the datadir my.cnf variable to point to the right location. However, MySQL still keeps the legacy directory information around and this will cause MySQL replication to fail in several ways when you attempt to restart your MySQL instance.

The binary log index, the relay log index, and the relay log info files all contain the legacy path. MySQL does not make it easy to also determine these actual files.

The relay_log_index variable defines the index, but defaults to [relay_log].index when not defined, so with SHOW GLOBAL VARIABLES this may be blank.
log-bin-index is an configurable option, but no matching global variable. It defaults to [log-bin].index.
relay_log_info will contain a value, generally only a file that is relevant to the data directory.

In these situations, your only option to to manually edit these files, specifying the new datadir (or log-bin) path in order to correctly remove symlinks.

The best advice, is to consider the design of your system first, and never place data in default locations if you feel this has to be modified later. Define those dedicated directories before you start using your MySQL instance.

Tagged with: Databases MySQL

Related Posts

More CPUs or Newer CPUs

In a CPU-bound database workload, regardless of price, would you scale-up or scale-new? What if price was the driving factor, would you scale-up or scale-new? I am using as a baseline the first available AWS Graviton2 processor for RDS (r6g).

Read more

An Interesting Artifact with AWS RDS Aurora Storage

As part of using public datasets with my own Benchmarking Suite I wanted upsize a dataset for larger volume testing. I have always used the INFORMATION_SCHEMA.TABLES data_length and index_length columns as a sufficiently accurate measurement for actual disk space used.

Read more

How long does it take the ReadySet cache to warm up?

During my setup of benchmarking I run a quick test-sysbench script to ensure my configuration is right before running an hour+ duration test. When pointing to a Readyset cache where I have cached the 5 queries used in the sysbench test, but I have not run any execution of the SQL, throughput went up 10x in 5 seconds.

Read more