Reputation: 303
Hello, I'm using Galera with 10.1.12-MariaDB, and the SST method is xtrabackup-v2.
Please don't recommend SST=rsync; it doesn't work for me.
I have a healthy cluster of 8 nodes. Sometimes one or a few nodes go down.
I just run service mysql start on them, and they successfully reconnect to the cluster and all is OK.
BUT sometimes, when the disconnected nodes have been down for a few days, I can't reconnect them to the cluster.
After a few tries I run rm -fr /var/lib/mysql/* and rm -fr /var/log/mysql/*
and that doesn't help either; the nodes log this message in syslog:
mysqld: [ERROR] Binlog file '/var/log/mysql/mariadb-bin.003079' not found in binlog index, needed for recovery. Aborting.
I know how to deal with this: when I have nodes that can't connect to the cluster with the message above, I can recover the cluster by doing this:
rm -fr /var/log/mysql/*
service mysql start
But the problem is:
I can't take down all the production nodes, including the last one, because I need 8 nodes to serve a big site's traffic, and a single running node goes down immediately when all the traffic hits it (because of overload, of course).
QUESTION IS:
Please help me. How do I connect nodes to the cluster when they won't connect and fail with the error mysqld: [ERROR] Binlog file '/var/log/mysql/mariadb-bin.003079' not found in binlog index, needed for recovery. Aborting.
Upvotes: 1
Views: 2015
Reputation: 142298
How big is the gcache? That controls whether IST can be used for re-attaching a node or not.
What is the value of expire_logs_days? Is it so small that the binlog was lost before you tried to connect? If you lost one node, and need another as the donor for SST, you still have 6 to serve the 'big site'. It sounds like you need to increase the deployment to maybe 10 nodes in order to handle the site even when nodes wink out.
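To answer the two questions above, something like the following could be run on a healthy node. This is only a sketch: the helper name extract_gcache_size is made up for illustration, and it assumes the mysql client can log in without extra options (for example via ~/.my.cnf). The gcache size is not a separate variable; it is one of the entries in wsrep_provider_options, so the helper just picks it out of that string.

```shell
# Hypothetical helper: pull the gcache.size entry out of a
# wsrep_provider_options string ("key = value; key = value; ...").
extract_gcache_size() {
    printf '%s\n' "$1" \
        | tr ';' '\n' \
        | sed -n 's/^[[:space:]]*gcache\.size[[:space:]]*=[[:space:]]*//p'
}

# Usage against a live node (assumes client credentials are already configured):
#   opts=$(mysql -N -B -e "SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options'")
#   extract_gcache_size "$opts"
#   mysql -N -B -e "SHOW GLOBAL VARIABLES LIKE 'expire_logs_days'"
```

If the gcache turns out to be too small to cover the writes made during a few days of downtime, IST is not possible and the node has to fall back to SST.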
It sounds like you are stuck with SST.
Take a look at the slowlog to see if some queries are taking so long that they are, indirectly, forcing you to have so many machines. Fixing a couple of queries is a lot 'cheaper' than adding extra machines.
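If the slowlog is not enabled yet, a rough way to turn it on at runtime is sketched below. The function name enable_slow_log and the 1-second default threshold are my own choices, not anything from your setup, and it again assumes the mysql client can authenticate on its own. Note that SET GLOBAL is not replicated by Galera, so this has to be run on each node you want to inspect.

```shell
# Hypothetical helper: enable the slow query log on the local node.
# $1 is the threshold in seconds (defaults to 1, an arbitrary starting point).
enable_slow_log() {
    threshold="${1:-1}"
    # The new long_query_time only applies to connections opened afterwards.
    mysql -e "SET GLOBAL slow_query_log = 1; SET GLOBAL long_query_time = ${threshold};"
}

# Usage:
#   enable_slow_log 2
#   mysql -N -B -e "SHOW GLOBAL VARIABLES LIKE 'slow_query_log_file'"   # where the log is written
#   mysqldumpslow -s t -t 10 <that file>                                # top 10 queries by total time
```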
Upvotes: 1