Binary Log Replayer

When using the replication slave stream, or mysql command line client and mysqlbinlog output from a binary/relay log, all statements are executed in a single thread as quickly as possible.

I am seeking a tool to simulate the replay of the binary/relay log for a benchmark at a pace that is more representative to original statements. For a simple example, if the Binary Log has 3 transactions in the first second, 2 transactions in the second second, and 5 transactions in the third second, I am wanting to simulate the replay to take roughly 3 seconds, not as fast as possible (which would be sub-second). The tool should try to wait the remainder of a second before processing SQL statements in the incoming stream.

Does anybody know of a tool that currently provides this type of functionality? Any input appreciated before I create my own.

Comments

  1. Silvia says

    I don’t believe the bin logs store information about how concurrent the queries were. Plus the replication log won’t have non DMLs. Maybe after global transaction Ids are GA we’ll get closer to what you want to do? :)

  2. ronald says

    Silvia,

    I am not concerned about concurrency, especially as the SQL_THREAD is single concurrency.
    I am also only wanting DML statements, so the Binary Log is the ideal source and does contain the data I need to make the determination.

    #111222 18:48:41 server id 1 end_log_pos 2453 Query thread_id=87603246 exec_time=1 error_code=0
    INSERT 1
    INSERT 2
    INSERT 3
    #111222 18:48:42 server id 1 end_log_pos 2988 Query thread_id=87603254 exec_time=1 error_code=0
    UPDATE 1
    UPDATE 2

  3. Dmitri Mikhailov says

    Ronald,

    I am not sure if such a tool exists, if you decide to write your own, it should be fairly simple to modify Tungsten replicator’s MySQL applier code in order to achieve the desired behavior http://code.google.com/p/tungsten-replicator/

    You can also try contacting Continuent folks they might help you with this, maybe even add this feature to their product.

    Regards,

    Dmitri

  4. Matthew Montgomery says

    I think probably the most direct way to do this is to patch mysqlbinlog so that it compares the timestamp from the last and current transactions, then inserts a SELECT SLEEP(n); before writing out any of the queries from the current transaction.

  5. ronald says

    @Dmitri Good suggestion
    @Matt The issue with inserting a SELECT SLEEP(n) is when generating the SQL, ‘n’ is unknown. ‘n’ is only known after execution of the statements and determining the time (if any) you should sleep before processing the next second’s SQL statements.

  6. Matthew Montgomery says

    You can compute ‘n’ by comparing the difference between the last and current timestamp.

    I also found how to do it with awk so no patching is required.

    mysqlbinlog binlog.?????? | awk -F = ‘/SET TIMESTAMP=/{if (timestamp) d=$2-timestamp; print; print “DO SLEEP(” d+0 “);”; timestamp=$2+0; next} 1′

  7. says

    Hi Ronald! You can do this in Tungsten Replicator by writing a filter that inserts a delay into the replication stream. There’s already a time-delay filter available in the build. Check out http://code.google.com/p/tungsten-replicator/source/browse/trunk/replicator/src/java/com/continuent/tungsten/replicator/filter/TimeDelayFilter.java. You would need to change this to look at header timestamps and add suitable delay between transactions. The nice thing is that if you get this to work everything else from reading binlogs to applying on the slave is already implemented.

  8. Tom H (@tolland) says

    presumably if you are replaying the SQL into mysql like so;

    # mysqlbinlog mysql-bin.000019 | mysql

    then you could run it through the pipe viewer command with a rate limit set very low, like so;

    # mysqlbinlog mysql-bin.000019 | pv -L 100 | mysql

Trackbacks