Thursday, December 16, 2010

Semi-sync replication (5.5)

This is a bit unorthodox, since I normally write only about MySQL Cluster here, but I wanted to understand how much latency semi-sync replication adds.

The test was very simple:
  • two mysql servers, interconnected (same switch) on a 1 Gig-E network
  • one table (see below)
  • comparing insert performance (one thread) with 'no replication at all' and 'semi sync replication enabled'.
  • bencher (had to hack it a bit to make it work with vanilla mysql) running one thread, inserting 4B+128B+4B = 136B of data
CREATE TABLE `t1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`data` varchar(255) DEFAULT NULL,
`ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `ts` (`ts`)
) ENGINE=InnoDB
Test 1 - no replication
src/bencher -t1 -r 30   -q "insert into t1(data) values('012345678901234567890123456789001234567890123456789012345678900123456789012345678901234567890012345678901567890123456789001234567')" \
-i "truncate table t1" -o no_replication

Summary:
--------------------------
Average Throughput = 7451.26 tps (stdev=969.24)
Average Latency (us)=134.31 (stdev=2904.26)
95th Percentile (us)=110.00
(I haven't investigated why the stdev is so high. It looks like a few requests take a really long time.)
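
A high stdev next to a low 95th percentile is what a small number of very slow requests would produce. The Python sketch below illustrates this with a made-up sample. The values (9,990 requests at 100 us and 10 at 30,000 us) are purely hypothetical and are not bencher output; the real distribution was not studied.

```python
import statistics

# Hypothetical sample, NOT measured data: 9,990 fast requests
# and 10 very slow ones (all values in microseconds).
latencies_us = [100.0] * 9990 + [30000.0] * 10

mean_us = statistics.mean(latencies_us)
stdev_us = statistics.pstdev(latencies_us)

# Nearest-rank 95th percentile, using integer arithmetic for the rank
# so that no floating-point rounding can shift the index.
ordered = sorted(latencies_us)
rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
p95_us = ordered[rank - 1]

print(f"mean={mean_us:.2f} us  stdev={stdev_us:.2f} us  p95={p95_us:.2f} us")
```

In this sample only 0.1% of the requests are slow, so the 95th percentile stays at the fast value while the stdev ends up several times larger than the mean.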

Test 2 - Semi-sync replication
src/bencher -t1 -r 30   -q "insert into t1(data) values('012345678901234567890123456789001234567890123456789012345678900123456789012345678901234567890012345678901567890123456789001234567')" \
-i "truncate table t1" -o no_replication

Summary:
--------------------------
Average Throughput = 2377.10 tps (stdev=210.42)
Average Latency (us)=421.46 (stdev=2644.61)
95th Percentile (us)=362.67
From this simple test, the increased latency with semi-sync replication is not surprising at all: semi-sync adds a network round trip to the slave MySQL server before the commit returns, so the difference in latency between 'no replication' and 'semi-sync' is very much the expected network overhead. (The factor is similar to comparing a read in MySQL Cluster with a write, which requires 2PC and network communication between the data nodes.)
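
To put numbers on the overhead, here is a small Python sketch that works only from the two bencher summaries above. The variable names are mine. The single-thread check (throughput roughly equals 1 / latency) is an assumption about how one client thread behaves, not something bencher reports.

```python
# Figures copied from the two bencher summaries (one client thread).
no_repl = {"tps": 7451.26, "latency_us": 134.31}
semi_sync = {"tps": 2377.10, "latency_us": 421.46}

# Extra time per insert when semi-sync is enabled.
added_latency_us = semi_sync["latency_us"] - no_repl["latency_us"]

# Relative slowdown, seen from latency and from throughput.
latency_factor = semi_sync["latency_us"] / no_repl["latency_us"]
throughput_factor = no_repl["tps"] / semi_sync["tps"]

# Sanity check (assumption): with a single thread issuing inserts
# back to back, throughput should be close to 1 / average latency.
implied_tps_no_repl = 1_000_000 / no_repl["latency_us"]
implied_tps_semi_sync = 1_000_000 / semi_sync["latency_us"]

print(f"added latency per insert: {added_latency_us:.2f} us")
print(f"latency factor: {latency_factor:.2f}x")
print(f"throughput factor: {throughput_factor:.2f}x")
print(f"implied tps: {implied_tps_no_repl:.0f} vs {implied_tps_semi_sync:.0f}")
```

Semi-sync adds roughly 287 us per insert, a factor of about 3.1 in both latency and throughput. The implied single-thread throughput is within 1% of the measured figures, so the two summaries are consistent with each other.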

So the next step now is to compare this to synchronous replication (cluster), but this will be another time.

InnoDB config, for the record:
innodb_buffer_pool_size=2048M
innodb_log_file_size=256M
innodb_log_files_in_group=3
innodb_file_format=barracuda
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit=2

2 comments:

Matthew Montgomery said...

It would also be good to have the standard asynchronous replication benchmark, so we can see how much extra latency is added by the binlogging and how much is added by semi-sync.

Mark Callaghan said...

Semi-sync rate limits a too-busy connection. In some cases that is good; for example, it can keep the master from getting too far ahead of the slave IO thread. Replication lag in the SQL thread is annoying, but lag in the IO thread is dangerous, as a master failure can mean that you lose committed transactions that never reached the slave.

7,000 inserts per second is hard to achieve on real InnoDB deployments as the binlog will be enabled, group commit is not supported and there are a few fsyncs per commit. Even when fsync is fast because of flash or HW RAID, transactions are likely to take longer on a busy master with concurrent clients and the overhead from semi-sync should be less.

Of course all of this is from the perspective of someone used to InnoDB response times. I imagine that throughput, response time and variance are much better on NDB.