Jal
Jal

Reputation: 2312

What is lastApplied and matchIndex in raft protocol for volatile state in server?

I am using the following pdf as reference.enter image description here

It says that lastApplied is the highest log entry applied to state machine, but how is that any different than the commitIndex?

Also is the matchIndex on leader just the commitIndex on followers? If not what is the difference?

Upvotes: 4

Views: 3011

Answers (4)

Aurora Olian
Aurora Olian

Reputation: 1

I have meet the same problem you have. But I have completed the second task of Raft(the project in course MIT6.5840 and comprehend the purpose of the state. I think it would be more useful if I tell you how these variables work when coding instead of their properties!

  • The nextIndex[] and matchIndex[] are volatile state on the leader. nextIndex[] is used to constitute the LOG Matching Property as referred in 5.3 Log replication in the paper. It used to meet the latest same logs. When we succeed in AppendEntries, we will update both nextIndex[] and matchIndex[]. So what is matchIndex[] for? It's used to update the leader's commitIndex, then other followers will update their commitIndex by argument leaderCommit.
  • When followers successfully replicate logs and see leaderCommit higher than its, commitIndex will be updated. But note that there are more restrictions to change values. You can see more details in the paper.
  • The lastApplied is the entry to state machine. We need to send commit log to the channel applyCh(and then to the state machine) by comparing lastApplied and commitIndex, otherwise the test program can't see the updated logs.

If you don't have no idea, you can see Raft Visualization in its official website

Upvotes: 0

Aelous
Aelous

Reputation: 1

  1. commit is not mean already applied, there is time different between them. but eventually applied will catch up commit index.
  2. matchIndex[i] which is saved in leader is equal to follower_i's commitIndex, and they are try to catch up to nextIndex

Upvotes: -1

Lifu Huang
Lifu Huang

Reputation: 12869

Your observation is reasonable: most of the time, nextIndex equals matchIndex + 1, but it is not always the case.

For example, when a leader is initiated, matchIndex is initiated to the 0, while nextIndex is initiated to the last log index + 1.

The difference here is because these two fields are used for different purposes: matchIndex is an accurate value indicating the index up to which all the log entries in leader and follower match. However, nextIndex is only an optimistic "guess" indicating which index the leader should try for the next AppendEntries operation, it can be a good guess (i.e. it equals matchIndex + 1) in which case the AppendEntries operation will succeed, but it can also be a bad guess (e.g. in the case when a leader was just initiated) in which case the AppendEntries will fail so that the leader will decrement nextIndex and retry.

As for lastApplied, it's simply another accurate value indicating the index up to which all the log entries in a follower have been applied to the underlying state machine. It's similar to matchIndex in that they both are both accurate values instead of heuristic "guess", but they really mean different things and serve for different purposes.

Upvotes: 6

Michael Deardeuff
Michael Deardeuff

Reputation: 10707

... lastApplied is the highest log entry applied to state machine, but how is that any different than the commitIndex?

These are different in a practical system because the component that commits the data in the log is typically separate from the component that applies it to replicated state machine or database. The commitIndex is typically just nanoseconds or maybe a few milliseconds more up-to-date than lastApplied.

Is the matchIndex on leader just the commitIndex on followers? If not what is the difference?

They are different. There is a period of time when the data is on a server and not yet committed, such as during the replication itself.

The leader keeps track of the latest un-committed data on each of its peers and only need to send log[matchIndex[peer], ...] to each peer instead of the whole log. This is especially useful if the peer is significantly behind the leader; because the leader can update the peer with a series of small AppendEntries calls, incrementally bringing the peer up to date.

Upvotes: 3

Related Questions