Sivaram Kannan

Reputation: 31

Recovering an orderer from an earlier backup caused consistency issues with Peer

I am still a blockchain noob, so pardon me if I have misunderstood some basics.

I have an HLF 1.4.2 cluster running in GKE with 3 orderers (Raft) and 3 orgs (2 peers each). A couple of days ago the orderers' PVCs hit disk full and all the orderers started failing. Since I could not expand the volumes live, I scaled down the orderers, took a backup of the data from the PVCs and restored it onto expanded volumes.

While restoring the volumes, I started each orderer immediately after restoring its volume data. So when the second orderer was started, the orderers reached quorum and the peers started writing. When the 3rd orderer was then brought up, it refused to join the cluster with the error below:

2020-08-25 13:17:33.891 UTC [orderer.consensus.etcdraft] Step -> INFO a2d 2 [term: 2] received a MsgHeartbeat message with higher term from 1 [term: 4] channel=mychannel node=2
2020-08-25 13:17:33.892 UTC [orderer.consensus.etcdraft] becomeFollower -> INFO a2e 2 became follower at term 4 channel=mychannel node=2
2020-08-25 13:17:33.892 UTC [orderer.consensus.etcdraft] commitTo -> PANI a2f tocommit(19) is out of range [lastIndex(17)]. Was the raft log corrupted, truncated, or lost? channel=mychannel node=2
panic: tocommit(19) is out of range [lastIndex(17)]. Was the raft log corrupted, truncated, or lost?
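
To illustrate the panic above, here is a minimal Python sketch of the kind of bounds check a Raft follower performs when the leader tells it to advance its commit index. It is a simplified model, not the etcd/raft source; the function name `commit_to`, the exception class and the starting commit index of 15 are hypothetical. Only the values 19 and 17 come from the log.

```python
class RaftLogOutOfRange(Exception):
    """Raised when asked to commit past the end of the local log (hypothetical name)."""


def commit_to(committed: int, last_index: int, tocommit: int) -> int:
    """Return the new commit index, or raise if the local log is too short.

    committed  -- index this node has already committed
    last_index -- index of the last entry present in this node's log
    tocommit   -- commit index announced by the leader
    """
    if committed < tocommit:
        if last_index < tocommit:
            # The node cannot commit entries it does not have on disk.
            raise RaftLogOutOfRange(
                f"tocommit({tocommit}) is out of range [lastIndex({last_index})]"
            )
        return tocommit
    # Commit index never moves backwards.
    return committed


# Values from the log: the leader announces 19, the restored node only has 17.
try:
    commit_to(committed=15, last_index=17, tocommit=19)
    panicked = False
except RaftLogOutOfRange as err:
    panicked = True
    message = str(err)
```

In this model, a node whose restored log ends before the leader's announced commit index cannot proceed, which is consistent with the orderer refusing to join.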

Since I was not sure how to recover from this, I decided to restore all 3 orderers from the earlier backup I had. After the restore, all the orderers successfully came back into quorum, but multiple peers started giving the error below:

2020-08-26 11:03:46.482 UTC [gossip.state] deliverPayloads -> PANI 03f Cannot commit block to the ledger due to unexpected Previous block hash. Expected PreviousHash = [3e39ea03143fdb09a14fb92b6b429236f57dbe52acbeac9c797f6ebdeef1aa79], PreviousHash referred in the latest block= [1823042217af3f7836e7b6d9b933dc9c008fc71f85e5465c62b78217194a6b3a]
  1. Is it possible to recover from this state?
  2. The official backup and recovery documentation does not talk about the dependency between orderer and peer ledger data. So, if I am taking a backup, the only way to get a consistent backup is to stop both the orderers and the peers before taking it, right?
  3. I compared the backups from the 3 orderers and the files look identical in all 3, so can I restore all 3 orderers from the backup of a single orderer?

Upvotes: 0

Views: 376

Answers (1)

aldred

Reputation: 853

  1. It seems that when you restored the orderers, you didn't restore the peers. So now the peers have more blocks than the orderers. You should always ensure the peer's block height is the same as or lower than the orderer's. If it's the same, then great, nothing needs to be done. If it's lower, the peer will pull the newer blocks from the orderer.

  2. Yes, it is recommended to stop both the orderers and the peers before taking a backup.

  3. I'm not sure. Might need the other HF experts to chime in.
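
To make point 1 concrete, here is a rough Python sketch of why a peer that is ahead of a restored orderer rejects new blocks, while a peer that is behind can catch up. It is a toy model under stated assumptions: `block_hash` is a simplified stand-in (Fabric hashes an encoded block header, not this string), and the heights, transaction names and helper functions are all hypothetical.

```python
import hashlib


def block_hash(number, previous_hash, data):
    # Simplified stand-in for the real header hash (assumption, not Fabric's encoding).
    payload = f"{number}|{previous_hash}|{data}".encode()
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, data):
    """Create the next block on top of `chain` and append it."""
    previous_hash = chain[-1]["hash"] if chain else ""
    number = len(chain)
    block = {"number": number, "previous_hash": previous_hash, "data": data}
    block["hash"] = block_hash(number, previous_hash, data)
    chain.append(block)
    return block


def can_commit(ledger, block):
    """A ledger accepts a block only if it extends the ledger's own last block."""
    if block["number"] != len(ledger):
        return False
    expected = ledger[-1]["hash"] if ledger else ""
    return block["previous_hash"] == expected


# Before the incident: orderer and peer agree on 6 blocks (numbers 0..5).
orderer = []
for i in range(6):
    append_block(orderer, f"tx-{i}")
peer_ahead = [dict(b) for b in orderer]

# The orderer is restored from an older backup holding only 4 blocks,
# then orders new transactions, producing different blocks 4 and 5.
orderer = orderer[:4]
append_block(orderer, "new-tx-a")
append_block(orderer, "new-tx-b")
block_6 = append_block(orderer, "new-tx-c")

# The peer that kept its 6 original blocks cannot link block 6 to its own block 5.
fork_rejected = not can_commit(peer_ahead, block_6)

# A peer at height 4 (same as the backup) can simply pull block 4 onward.
peer_behind = [dict(b) for b in orderer[:4]]
lagging_ok = can_commit(peer_behind, orderer[4])
```

Here `fork_rejected` and `lagging_ok` both end up `True`: the peer that is ahead sees a previous-hash mismatch, while the peer at or below the orderer's height extends its ledger normally.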

Upvotes: 1
