Reputation: 1092
how do you handle UNIQUE indexes in a master-master active-active replication environment?
SQL schema example:
CALLS
id INTEGER PRIMARY AUTO_INCREMENT
caller_id INTEGER + INDEX
start_date datetime
end_date datetime
...
CALLERS
id INTEGER PRIMARY AUTO_INCREMENT
phone VARCHAR UNIQUE
country_id CHAR(2)
city_id INTEGER
...
In a heavy loaded (lot of inserts) environment, requests can be sent to both master (for perf, high availability, 0 downtime schema changes reasons).
if 2 CALLERS records are created at the same time (yes it can) on both masters:
In a replication mode=row with a ON DUPLICATE KEY UPDATE the replication breaks too (since the ON DUPLICATE KEY UPDATE is not applied on the opposite master)
In a replication mode=statement the replication breaks too
In a replication mode=statement with ON DUPLICATE KEY UPDATE the replication keep working... but we can have inconsistent records:
if 2 CALLERS are created at the same time on both masters,
What are the best solutions to handle this really simple case?
I thought to solutions like this, but they have big drawbacks
don't use UNIQUE indexe in these cases by merging CALLERS and CALLS table, but this is a huge functional loss... and keep using ROW based replication. but:
don't use AUTO_INCREMENT (don't let MySQL choose the id values), but:
Upvotes: 3
Views: 1815
Reputation: 142518
Multi-master with standard replication does not improve write performance -- all writes must be applied to all Masters.
Multi-master with standard replication has a number of problems with UNIQUE
and PRIMARY
keys. This is the main reason most people avoid writing to both Masters.
AUTO_INCREMENT
problems can usually be prevented by using auto_increment_increment
= the number of masters, and auto_increment_offset
set to different values (1,2,..). That is what Galera does under the covers.
ROW
versus STATEMENT
-- should not make any difference.
Galera has many advantages, especially if you have at least 3 nodes. Especially since HA is a goal.
You have lots of INSERTs
? Are they single-row, or batched, or LOAD DATA
? Single-row is much slower (in any context) than either of the others. In all contexts, you must check for errors. In some cases you can get a "deadlock" and "rollback", necessitating replay of the transaction.
Beware of IGNORE
option -- it can "burn" auto_increment ids. More discussion.
See also Techniques for really high speed ingestion. Perhaps you can get adequate speed on a single machine.
Side note: country_id CHAR(2)
should be CHARACTER SET ascii
; with utf8
it would occupy 6 bytes always.
Upvotes: 2