The idea behind the Raft consensus algorithm is that the nodes (i.e., server computers) collectively select a leader, and the remaining nodes become the followers. The leader is responsible for state transition log replication across the followers under the closed distributed environment, assuming that all the nodes are trustworthy and have no malicious intent.
The basic idea of Raft came from the fact that in a distributed environment, we can come to a consensus based on the Paxos algorithm and elect a leader. Interestingly, if we have a leader in the system, we can avoid multiple proposers proposing something altogether.
In the case of Paxos, we don’t have any straightforward mechanism to elect a leader. However, to elect a leader, multiple proposers propose the thing simultaneously. Consequently, the protocol becomes complex, and the acceptors have to accept one of the proposals from the proposer. In that case, we use the highest proposal number for the tie-breaking mechanism and embed a certain algorithm in Paxos to ensure that every proposal coming from a different proposer is unique. Thus, all these internal details make the Paxos more complicated.
In a distributed environment and under a synchronous assumption (closed environment), it is possible to design a consensus algorithm. First, we will elect a leader and then the tasks of the leader to propose something. There will be a single proposer, and all the acceptors are followers of the leader. They may either accept or reject the leader’s opinion.
Raft Overview
The system starts up and has a set of follower nodes. The follower nodes look for a leader. If a timeout happens, there is no leader, and there is a need to elect a leader in the system. A few candidates stand for a leader in the election process, and the remaining nodes vote for the candidate. The candidates who receive the majority votes become the leader. The leader proposes a proposal, and the followers can either vote for or against that proposal.
An example from the database replication: We have distributed multiple replicated servers, and we want to build a consensus among these multiple replicated servers. Whenever some transactions are coming up from the clients, we want these replicated servers to decide whether to commit those transactions collectively.
Raft Consensus Algorithm
Electing the Leader: Voting Request
The first part of the Raft is to elect a leader, and for that, there should be some leader candidates. The nodes sense the network, and if there is no leader candidate, then one of the nodes will announce that I want to be a leader. The leader candidate requests the votes. This voting request contains 2 parameters:
- Team: The last calculated number known to candidate + 1.
- Index: Committed transactions available to the candidate.
These algorithms work in multiple rounds, and the term indicates a new voting round. If the last voting finishes, then the next term will be old term number + 1; The index indicates committed transactions available to the candidate. It is just like an increasing number to distinguish between already committed and new transactions.
Electing the Leader: Follower Node’s Decision Making
Once the nodes receive a voting request, their task is to vote pro or against the candidate. So, this is the mechanism to elect a leader in the Raft consensus algorithm. Each node compares the received term and index with the corresponding current known values.
- The node(i) receives the voting request. It compares the already seen team with the newly received team. If a newly received team is less than the already seen team, then it discards because the node considers this request as an old request.
- The newly received team is greater than the already seen team. It checks for the newly received index number with the already seen index number. If the newly received index number is greater than already seen, it votes for the candidate; else, it declines.
Electing the Leader: Majority Voting
Every node sends their vote and candidates who get majority vote becomes a leader, and commit the corresponding log entry. in other words, If a certain leader candidate, receives majority of the vote from the nodes, then that particular candidate becomes a leader and other becomes the follower of that node.
Multiple Leader Candidates: Current Leader Failure
Let us understand a scenario where there is a leader, and three followers and the current team is 10, and the commit index value is 100. Suppose the leader node has failed or followers didn’t receive a heartbeat message within the heartbeat timeout period.
After the timeout, one of the nodes will become a leader candidate, initiates a leader election, and becomes a new leader with team 11 and commit index value 100. The new leader periodically sends the heartbeat message to everyone to indicate his presence.
In the meantime, the old leader gets recovered, and he also receives a heartbeat message from the new leader. The old leader understands that a new term has started. Then the old leader will change his status from leader to follower. So this is the one way to handling a new leader by utilizing the team parameter.
Multiple Leader Candidates: Simultaneous Request Vote
Let us understand a scenario where there is a leader, and three followers and the current team is 20, and the commit index value is 200. Suppose the leader node has failed or followers didn’t receive a heartbeat message within the heartbeat timeout period. It may be possible that multiple followers sense the timeout period simultaneously and become a leader candidate, and initiates the leader election procedure independently. Two nodes send the request messages with team 21 at the same time in the network.
There are two leader candidates, and both are sending voting request messages, at the same time, for round (term) 21. Then, they look for the majority voting. In this example, the first candidate receives two votes, and the second candidate receives one vote, so based on the majority voting, the first candidate is a winner.
The node which gets the majority votes send a heartbeat message to everyone. Another leader candidate also received the heartbeat message from the winner, and this leader candidate falls back to a follower from the leader candidate.
Committing Entry Log
In the above sections, we have seen the procedure to elect a leader and other special cases. Now we will understand how the transactions are managed in a closed distributed environment. Let us consider that the current term value is 10, and the index value is 100, which means most of the nodes have seen and committed transaction index value number 100.
The leader proposes a new transaction, adds an entry log with term 10 and the new transaction index value as 101. Further, the leader sends a message called append entries to all the followers, and they collectively vote either for or against this transaction.
The leader receives the vote for this transaction index value 101. The followers’ node votes for or against this transaction. If the majority says that they are fine with committing this particular log. Then, the leader considers that the transaction log is approved by the followers.
After successful acceptance of the entry log, the leader sends an accept message based on the majority voting to all the individual followers to update the committed index to 101.
Handling Failure
Multiple kinds of failures exist in the environment. However, Paxos and Raft consensus algorithms only support Crash or Network fault. The followers may have crashed, but the system can tolerate up to N/2 -1, where N is the total number of nodes in the environment, as it does not affect the system due to the majority voting. This indicates that the majority of the followers are non-faulty, and they can send a vote. The leader can take the majority decision whether to accept or reject a particular transaction.
References
- NPTEL lecture series on Blockchains Architecture, Design and Use Cases by Prof. Sandip Chakraborty, IIT Kharagpur.
1,296 total views, 1 views today