Paxos and Raft consensus algorithm can tolerate up to N/2 – 1 number of Crash or Network fault, where N is the total number of nodes in the network. However, what if the nodes behave maliciously? This particular class of fault is called Byzantine fault, and the Byzantine fault came from the interesting Byzantine Generals Problem.
Let us understand this concept with an example: The army wants to attack a certain fort, and this mission is lead by the army General, having two troupes. So, based on the scenario, the General can either make an order to attack from different sides or retreat. When the General is trustworthy, he orders the same to both the troupes.
However, when the General is malicious. The General sends an attack message to one troupe and a retreat message to another troupe. If the General becomes faulty, it becomes difficult for them to find out what to do. This particular problem is called the Byzantine Generals Problem, where a particular node can behave maliciously.
In Raft and Paxos algorithm, which heavily relies on that, the faulty nodes never send any vote, and the non-faulty nodes send a correct vote. This particular assumption does not hold in the case of Byzantine fault. In Byzantine fault, it may happen that the faulty node selectively sends votes to some of the nodes.
This general class of faults in a distributed system under a closed environment is called a Byzantine Generals problem or Byzantine fault-tolerant problem. We will develop a fault-tolerant architecture where the system will be able to tolerate Byzantine fault.
Let us consider the three Byzantine Generals Problem, where we have three Generals, and we are using the Lamport terms to denote this kind of Byzantine Generals Problem.
In this architecture, we have one Commander and two different Lieutenants. The Commander sends the message to the Lieutenants. The Lieutenant can share the messages among themselves using the message passing technique and try to determine whether the Commander is faulty or the Lieutenant is faulty. Let us look into the cases from the perspective of three Generals and design a solution for this Byzantine fault-tolerant system.
Three Byzantine Generals Problem: Lieutenant Faulty
This is the case when one of the Lieutenant is faulty: The Commander sends a retreat message to both the Lieutenants(1,2), and the Lieutenant (2), who is faulty, sends an attack message rather than an actual retreat message to Lieutenant (1). But, in general, by integrity condition of the Army, the Lieutenant (1) is bound to obey the Commander message. Although Lieutenant (2) is faulty, and the Commander is correct. Thus, the entire system still works correctly.
Three Byzantine Generals Problem: Commander Faulty
This is the case when the Commander is faulty: The Commander sends different messages to different Lieutenants. He sends a retreat message to Lieutenants(1) and an attack message to Lieutenant (2). When Lieutenants exchange the messages, then both may feel that the other one is faulty. Then, they follow the integrity condition of the Army, and both obey the Commander’s message. Thus, this contradicts the agreement condition and no solution for three generals, including one faulty commander. Even the majority principle wouldn’t work here because both the Lieutenants had received two messages {attack,retret} which may confuse both the Lieutenants.
Four Byzantine Generals Problem: Lieutenant Faulty
In this case, there are four Byzantine Generals, and one of the Lieutenant is faulty. They communicate with each other using the message passing technique.
The Commander sends a message to all the Lieutenant, and later the Lieutenants share the received message. However, in this case, Lieutenant(2) is faulty, so he/she may send the incorrect message to others, whereas the remaining Lieutenants exchange the correct message with each other.
If they go for the majority voting principle, they can analyze and decide to retreat. This is illustrated as follows:
- Lieutenant (1) = {Retreat, Attack, Retreat) = Retreat
- Lieutenant (3) = {Retreat, Attack, Retreat) = Retreat
Lieutenant (1) receives two retreat messages from Lieutenant (3) and the Commander but an attack message from Lieutenant (2). Whereas Lieutenant (3) also receives two retreat messages from Lieutenant (2) and the Commander but an attack message from Lieutenant (2).
We observe that when all the correct Lieutenants follow the majority voting principle, although Lieutenant (2) is a Byzantine node, Lieutenant (1) & Lieutenant (3) can correctly decode the message. However, the same concept is not valid if two Lieutenants become faulty.
Four Byzantine Generals Problem: Commander Faulty
In this case, there are four Byzantine Generals, out of which three are non-faulty Lieutenants and a faulty Commander. They communicate with each other using the message passing technique. The Commander sends the differing messages to Lieutenants, and the Lieutenants exchange the received message.
The Commander sends the retreat message to two Lieutenants (1,3) and an attack message to Lieutenant (2). However, the Lieutenants exchange messages correctly. If they go for the majority voting principle, they can analyze and decide to Retreat. This is illustrated as follows:
- Lieutenant (1) = {Retreat, Attack, Retreat) = Retreat
- Lieutenant (2) = {Retreat, Attack, Retreat} = Retreat
- Lieutenant (3) = {Attack, Retreat, Retreat) = Retreat
We observe that when all the correct Lieutenants follow the majority voting principle, although Commander is a Byzantine node, Lieutenant (1), Lieutenant (2) & Lieutenant (3) can correctly decode the message. However, if the Commander sends an attack message to more than one Lieutenant and then using the majority voting principle, they can analyze and decide to Attack. When the Commander is faulty, the system will reach a consensus with the value that the Commander has proposed to the majority of the Lieutenants.
In general, to ensure the correct consensus in the system, and when there is a maximum of F numbers of faulty Lieutenants, there should be 2*F + 1 number of Lieutenants, plus a Commander in the system.
Byzantine Generals Model
In the Byzantine Generals Model, we assume an N number of processes out of which at most F number of the processes can be faulty while ensuring that the system has a total of 2*F + 1 number of processes. The system should be fully connected, and receivers always know the identity of the senders. The other requirements are that the system should synchronous and having a reliable communication medium. The synchronous system means that every process should receive all the messages within some predefined timeout duration.
References
- NPTEL lecture series on Blockchains Architecture, Design and Use Cases by Prof. Sandip Chakraborty, IIT Kharagpur.
906 total views, 1 views today