The MySQL symlink trap

Many users of MySQL install and use the standard directories for MySQL data and binary logs. Generally this is /var/lib/mysql.
As your system grows and you need more disk space on the general OS partition that commonly holds /tmp, /usr and often /home, you create a dedicated partition, for example /mysql. The MySQL data, binary logs etc are then moved to this partition (hopefully in dedicated directories). For example data is placed in /mysql/data.
Often however, a symbolic link (symlink) is used to so MySQL still refers to the data in /var/lib/mysql.

When it comes to removing the symlink and correctly configuring MySQL, you first stop MySQL and correctly defining the datadir my.cnf variable to point to the right location. However, MySQL still keeps the legacy directory information around and this will cause MySQL replication to fail in several ways when you attempt to restart your MySQL instance.

The binary log index, the relay log index, and the relay log info files all contain the legacy path. MySQL does not make it easy to also determine these actual files.

The relay_log_index variable defines the index, but defaults to [relay_log].index when not defined, so with SHOW GLOBAL VARIABLES this may be blank.
log-bin-index is an configurable option, but no matching global variable. It defaults to [log-bin].index.
relay_log_info will contain a value, generally only a file that is relevant to the data directory.

In these situations, your only option to to manually edit these files, specifying the new datadir (or log-bin) path in order to correctly remove symlinks.

The best advice, is to consider the design of your system first, and never place data in default locations if you feel this has to be modified later. Define those dedicated directories before you start using your MySQL instance.

Tagged with: Databases MySQL

Related Posts

Creating a More Realistic Benchmark

Common benchmark approaches fall into two general categories, synthetic testing and realistic testing. You have the most generic operations from a synthetic test, starting with the most simple example using a single table, a single column, and for a single DML operation.

Read more

Testing, Benchmarking, Evaluating

Testing and benchmarking are widely used terms in software technology, each serving a distinct purpose and goal. With the increasing adoption of AI in software development, the term evaluating has become significant and with this the re-emergence of what is quality assurance.

Read more

Your Attack Vector Extends Beyond Production Systems

A common data security issue is the unprotected copying of production data to non-production environments without any redaction, masking, or filtering. This practice poses a serious risk. A malicious actor will target the weakest link in your infrastructure, including non-production accounts and the developer systems accessing them.

Read more