When mysqldump --no-set-names matters

I had a perplexing problem yesterday: after a mysqldump and restore, the source table and the restored copy produced different checksums when compared with Maatkit's mk-table-checksum.

mk-table-checksum --algorithm=BIT_XOR h=192.168.X.XX,u=user,p=password --databases=db1 --tables=c
DATABASE TABLE   CHUNK HOST         ENGINE      COUNT         CHECKSUM TIME WAIT STAT  LAG
db1      c           0 192.168.X.XX InnoDB     215169         d1d52a31    2    0 NULL NULL
mk-table-checksum --algorithm=BIT_XOR h=localhost,u=user,p=password --databases=db1 --tables=c
DATABASE TABLE   CHUNK HOST      ENGINE      COUNT         CHECKSUM TIME WAIT STAT  LAG
db1      c           0 localhost InnoDB     215169         91e7f182    0    0 NULL NULL

It made no sense until I reviewed the mysqldump settings I was using and realized I was passing --no-set-names.

So just what does this option remove? Here is a diff of the mysqldump output with and without it.

5a6,10
>
> /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
> /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
> /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
> /*!40101 SET NAMES utf8 */;
153a159,161
> /*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
> /*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
> /*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
156c164

As you can see, without the option the dump executes a SET NAMES utf8. The problem here is that I’m exporting a table defined with DEFAULT CHARSET=latin1, and none of its columns are defined as utf8.
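To make the character set concern concrete, here is a minimal Python sketch of one plausible way a checksum can diverge. It is an illustration under assumptions, not a diagnosis of this particular table: the sample value is hypothetical, and `zlib.crc32` is only a stand-in for a byte-level checksum, not the exact computation mk-table-checksum performs.

```python
import zlib

# Hypothetical sample value; the real table contents are not shown in this post.
text = "café"

# The same characters are stored as different bytes in each character set.
latin1_bytes = text.encode("latin-1")  # 4 bytes, "é" is the single byte 0xE9
utf8_bytes = text.encode("utf-8")      # 5 bytes, "é" is the pair 0xC3 0xA9

# A byte-level checksum therefore differs even though the text "looks" identical.
crc_latin1 = zlib.crc32(latin1_bytes)
crc_utf8 = zlib.crc32(utf8_bytes)

# Reading UTF-8 bytes as if they were latin1 yields double-encoded text.
mojibake = utf8_bytes.decode("latin-1")

print(latin1_bytes, utf8_bytes)
print("checksums differ:", crc_latin1 != crc_utf8)
print("misread as latin1:", mojibake)
```

If the connection character set used for the dump does not match the one used for the restore, non-ASCII values can end up stored as different bytes on the two servers, which would be enough to change a checksum while leaving the row count the same.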

I’m no expert in character sets, but this strikes me as strange. The immediate problem is resolved, but not to my satisfaction or comfort level.

Tagged with: Databases MySQL
