Category: summary. Tags: #mechanismdesignandgametheory #scalability #attack
By Subhra Mazumdar @subhramazumdar, Prabal Banerjee @prabal, Abhinandan Sinha, Sushmita Ruj and Bimal Roy.
TLDR
 Hashed Timelock Contract (HTLC) in Lightning Network is susceptible to a griefing attack where an attacker can block several channels and stall payments by mounting this attack.A stateoftheart countermeasure, Hashed Timelock Contract with GriefingPenalty (HTLCGP) is found to work under the classical assumption of participants being either honest or malicious but fails for rational participants.
 We introduce a gametheoretic model for analyzing griefing attacks in HTLC, and use this model to analyze griefing attacks in HTLCGP. We conjecture that it is impossible to design an efficient protocol that will penalize a malicious participant with the current Bitcoin scripting system.
 To further increase the cost of attack, we introduce the concept of guaranteed minimum compensation, denoted as \zeta, and modify HTLCGP into \textrm{HTLCGP}^{\zeta}. Experimental results justify that \textrm{HTLCGP}^{\zeta} is better than HTLCGP to counter griefing attacks.
Core Research Question
How does the griefing attack impact Lightning Network, and how effective is a penalization mechanism in countering this attack?
Citation
Background

Lightning Network: It is a layer 2 solution for scalability issues in Bitcoin blockchain. It is modeled as a bidirected graph G:=(V,E), where V is the set of nodes and E \subseteq V \times V is the set of payment channels opened between a pair of nodes. Every node charges a processing fee for processing payment requests, defined by a function f, where f: \mathbb{R}^{+}\rightarrow \mathbb{R}^+. Each payment channel (U_i,U_j) \in E has an initial capacity locked(U_i,U_j), denoting the amount deposited by U_i in the channel. t_{i,j} is the timestamp at which the channel was opened. In the context of the Bitcoin blockchain, this will be the block height. T is the expiration time of the channel id_{i,j}, i.e., once a channel is opened, it remains active till time T units (each channel in Lightning Network has an infinite lifetime. However, we assume an upper bound on the channel lifetime for our analysis. Setting channel expiration time has been used in the literature as well). remain(U_i,U_j) signifies the residual amount of coins U_i can transfer to U_j via offchain transactions. M denotes the average fee for mining a Bitcoin transaction.

Hashed Timelock Contract: It is used for forwarding conditional payments across parties not directly connected by the payment channel. Suppose a payer U_0 wants to transfer funds to a payee U_n via an nhop route P=\langle U_0,U_1,U_2,\ldots,U_n \rangle in the network. U_n creates a condition Y defined by Y=\mathcal{H}(x) where x is a random string and \mathcal{H} is a cryptographic hash function. U_n shares the condition Y with U_0. The latter uses Y for conditional payment across the whole payment path. Between any pair of adjacent nodes (U_{i1},U_{i}) in P, where i \in [1,n], the hashed timelock contract is defined by HTLC(U_{i1},U_{i},Y,b,t_{i1}), where t_{i1}=t_i+\Delta, where \Delta is the worstcase confirmation time for a transaction to get confirmed onchain. The contract implies that U_{i1} locks b coins in the offchain contract. The coins locked can be claimed by the party U_{i} only if it releases the correct preimage x': Y=\mathcal{H}(x') within time t_{i1}. If \mathcal{H} is a collisionresistant hash function, then x'\neq x with negligible probability. If U_{i} doesn’t release the preimage within t_{i1}, then U_{i1} settles the dispute onchain by broadcasting the transaction. The channel between U_{i1} and U_{i} is closed and U_{i1} unlocks the coins from the contract. If both the parties mutually decide to settle offchain then U_{i} either releases the preimage and claims b coins from U_{i1}. If U_i decides to reject the payment then b coins are refunded to U_{i1}.

Dynamic Games of Incomplete Information or Sequential Bayesian Game: In this class of games, players move in sequence, with at least one player being uncertain about another player’s payoff. We define a belief system and a player’s behavioral strategy to approach these games. A type space for a player is the set of all possible types of that player. A belief system in a dynamic game describes the uncertainty of that player of the types of the other players. A behavioral strategy of a player i is a function that assigns to each of i 's information set a probability distribution over the set of actions to the player i at that information set, with the property that each probability distribution is independent of every other distribution. A dynamic game of incomplete information is defined as a tuple that consists of (i) a set of players \mathcal{I}; (ii) a sequence of histories H^m at the m^{th} stage of the game, each history assigned to one of the players (or to Nature/Chance); (iii) an information partition. The partition determines which of histories assigned to a player are in the same information set; (iv) a set of pure strategies for each player i, denoted as S_i; (v) a set of types for each player i: \theta_i \in \Theta_i; (vi) a payoff function for each player i: u_i(s_1,s_2,\ldots,s_l,\theta_1,\theta_2,\ldots,\theta_l); (vii) a joint probability distribution p(\theta_1,\theta_2,\ldots,\theta_l) over types. Perfect Bayesian equilibrium is used to analyze dynamic games with incomplete information.

Hashed Time Locked Contract with GriefingPenalty (HTLCGP): It is a tworound protocol.
 The first round, or Cancellation round, involves locking of penalty. The round is initiated by the payee and proceeds in the reverse direction. The penalty locked by a party serves as a guarantee against a payment forwarded.
 The second round, or the Payment round, involves locking the payment value in offchain contracts from payer to payee.
We explain the protocol with the help of an example shown below. Alice wants to transfer b units to Bob. Each party that forwards a payment must be guaranteed by its counterparty to receive compensation if there is an incidence of a griefing attack. The compensation charged must be proportional to the collateral locked in the path. We define this proportionality constant as the rate of griefingpenalty per unit time, denoted as \gamma.
The first round termed as Cancellation round proceeds in the following way:
 Bob has to lock \gamma b D_1+\gamma b D_2+\gamma b D coins for duration D. This amount is the cumulative penalty to be distributed among Alice, Dave and Charlie, if Bob griefs.
 After Charlie receives the cancellation contract, he locks \gamma b D_1+\gamma b D_2 in the contract formed with Dave for D_2 units.
 The latter locks \gamma b D_1 coins in the contract formed with Alice for D_1 units of time.
The second round, termed as Payment round proceeds similarly as in HTLC.
 Payment value b is forwarded from Alice to Bob via the intermediaries.
 Since the least timeout period is D, one might question why Bob must take into account the lock time of the other contracts while locking penalty. If Bob locks 3\gamma b D coins, Charlie locks 2 \gamma b D_2 and Dave locks \gamma b D_1 as penalty, and Bob griefs, then Charlie keeps the compensation \gamma b D and forwards 2\gamma b D to Dave. Dave is greedy and refuses to cancel the offchain contract with Charlie. After elapse of D_2, he goes onchain and claims 2\gamma D_2. Since D_2>D, Charlie incurs a loss of 2\gamma b (D_2D). Thus, we account for the lock time of each contract while calculating compensation to prevent the loss of coins of uncorrupt parties.
Suppose Bob griefs. The following steps get triggered:
 Bob pays a compensation of \gamma b D_1+\gamma b D_2+\gamma b D units to Charlie, as per the terms of the contract. After D expires, Charlie goes onchain. He closes the channel, unlocks b coins, and claims compensation.
 Charlie requests Dave to cancel the offchain contract, offering a compensation of \gamma b D_1 +\gamma b D_2. Dave cancels the contract offchain, unlocks b units from the contract, and claims the compensation from Charlie. If Charlie decides to grief, Dave can claim the compensation by going onchain and closing the channel.
 Dave requests Alice to cancel the contract by offering a compensation of \gamma b D_1. Except for Bob, none of the parties lose coins.
Problem with HTLCGP: The drawback is that it does not consider an attacker to be rational. If Bob intends to mount griefing attack, he will choose a strategy that avoids paying any penalty. He will resolve the payment offchain with Charlie just before the contract’s locktime elapses. In this way, he manages to lock Alice, Dave and Charlie’s coins without compensating them. We realize that the underlying punishment mechanism in HTLCGP is not effective.
Summary
 This is the first attempt to propose a twoplayer gametheoretic model for analyzing griefing attacks in HTLC. The first player receives a conditional payment and makes a decision of forwarding the same to the second player based on the belief that the latter may be corrupt or uncorrupt.
 We use the same game model to analyze the griefing attacks in HTLCGP. From the analysis, we conjecture that it is impossible to design an effective countermeasure without changing the Bitcoin scripting system.
 We analyze the impact of the penalty on the attacker’s behavior and infer that HTLCGP is weakly effective in countering the attack in certain conditions.
 We introduce the concept of guaranteed minimum compensation, \zeta, and propose a protocol, \textrm{HTLCGP}^{\zeta}, for disincentivizing griefing attack. Our experimental analysis shows that the total coins locked by the attacker drops to 28\% when the guaranteed minimum compensation is 2.5\% of the transaction amount and the maximum allowed path length is set to 10 in \textrm{HTLCGP}^{\zeta}. This quantity is 27\% less than the coins locked in HTLCGP, proving that \textrm{HTLCGP}^{\zeta} is far more effective than HTLCGP in countering the griefing attack. The code is provided in GitHub.
Method
An attacker with budget \mathcal{B}_{EX} tries to disrupt the network by incentivizing a certain fraction of nodes to mount the griefing attack. We make the following assumptions
 If a node has accepted the bribe, then it implies that the expected earning by cooperating with other nodes is lower than the bribe, and hence it has chosen to be a corrupt node. Such nodes act as per the instructions received from the attacker.
 A corrupt node knows whether another node is corrupt or uncorrupt. An uncorrupt nodes lacks this information.
We discuss the bribe offered to a corrupt node and the method for mounting a griefing attack:
 Bribe offered per attack. Given a payment send across the network is of value \alpha, the attacker fixes the bribe offered to a node to L coins, where L= \alpha+I_{D,\alpha}+C. Here C is the auxiliary cost for routing payment and opening new channels, if needed. I_{D,\alpha} coins are used to compensate the node for keeping \alpha coins unutilized for the next D units of time. We assume I_{D,\alpha}\approx 2 O(r_{U},D,\alpha), \forall U\in V so that a corrupt node gains at least O(r_{U},D,\alpha) inspite of locking \alpha coins.
 Method for mounting Griefing Attack. The attacker instructs the corrupt node to execute a selfpayment (i.e., S=R) of \alpha coins via a route of maximum allowed path length n in order to maximize the damage. The least HTLC timeout period is t_{n1}\approx D. After the conditional payment reaches the payee U_n, it intentionally stops responding, locking a collateral of (n1)\alpha for the next D units in the path routing payment.
Game Model for HTLC

Choice of players: We assume that all the miners in the underlying blockchain are honest, and only nodes in Lightning Network can be the strategic players. In the path P, a node U_{i1} locks an amount \alpha_{i1} with the offchain contract formed with U_{i}, hence it will be bothered with U_{i}'s nature and the corresponding action. Except for U_0, none of the intermediaries routing the payment knows the recipient’s identity. U_{i1} makes a decision of whether to forward a payment to U_i based on the belief of the U_i's type (discussed next). We model the griefing attack as an interaction between pair of nodes U_{i1} and U_{i}, i \in [1,\kappa] in path P routing payment in the network. Player forms a belief about the type of the other players based on either their position in the network or past interaction.

Action Space: We define the actions of U_{i1} and U_i:

U_{i1}'s action (S_{U_{i1}}): It can either forward (F) the conditional payment to U_{i} or it can choose to not forward (NF). If it chooses to forward the payment, it forms a contract with U_{i}, locking the designated amount in the channel id_{i1,i} for time t_{i1}, which is the HTLC timeout period. If i>1, then U_{i1} gets a fee of f(\alpha_{i1}) from U_{i2} contingent to the release of solution by U_i. For U_0, the satisfaction level is proportional to success of payment. In this case, we consider f(\alpha_0) as the level of satisfaction. If U_i delays, then opportunity cost of coins locked in the offchain contract increases, and a loss is incurred. If U_{i} doesn’t respond within t_{i1}, then U_{i1} closes the channel and withdraws its coins from the contract.

U_i's action (S_{U_{i}}): If U_{i1} has forwarded the payment, then U_{i} can choose its action from the following: accept the payment or Ac, reject the payment or Rt, wait and then accept or W & Ac, wait and then reject or W & Rt, and grief or Gr. If U_{i1} does not forward the payment, the game aborts.

We define the sequential Bayesian game \Gamma_{HTLC} as the tuple \Gamma_{HTLC}=\langle N, (\Theta_{U_{i1}},\Theta_{U_{i}}), (S_{U_{i1}},S_{U_{i}}),p_{U_{i1}},(u_{U_{i1}},u_{U_{i}})\rangle, where N =\{U_{i1},U_{i}\} where i \in [1,\kappa]. The type of player U_{i} is defined as \Theta_{U_{i}}=\{\textrm{Corrupt(co),} \textrm{ Uncorrupt (uco)}\}.
The probability function p_{U_{i1}} is a function from \Theta_{U_{i1}} into p(\Theta_{U_{i}}), where the p(\Theta_{U_{i}}) denotes the set of probability distribution over \Theta_{U_{i}}, i.e., p_{U_{i1}}(Corrupt)=\theta_i, p_{U_{i1}}(Uncorrupt)=1\theta_i. The payoff function u_{k}: \Theta \times S \rightarrow \mathbb{R} for any player k \in \{U_{i1},U_{i}\}, where \Theta=\Theta_{U_{i}} and S=S_{U_{i1}}\times S_{U_{i}}, is such that for any profile of actions and any profile of types (\hat{\theta},s) \in \Theta \times S, specifies the payoff the player k would get, if the player’s actual type were all as in \hat{\theta} and the players all chose their action as in s.
The game begins with Nature N choosing the type of U_{i}, either corrupt or uncorrupt, respectively. U_{i1} believes that a corrupt U_{i} will be selected with probability \theta_i, whereas an uncorrupt U_{i} will be selected with probability 1\theta_i. After N makes its move, U_{i1} selects its strategy based on the belief of U_i's type. U_i chooses its strategy after U_{i1} has forwarded the payment. If U_{i1} chooses not to forward, then either party receives a payoff 0 since no offchain contract got established, i.e., u_{U_{i1}}(\theta_b,NF,s_b)=u_{U_{i}}(\theta_b,NF,s_b)=0, \theta_b \in \Theta_{U_{i}} and s_b \in S_{U_{i}}.
Game Analysis
We infer from the payoff model that the corrupt node can select either of the strategies for mounting the attack 
 (i) Reject the payment just before lock time D elapses: U_n rejects the conditional payment forwarded by U_{n1} offchain just before the contract’s lock time elapses.
 (ii) U_n does not respond: This is as per the conventional definition of griefing. U_{n1} closes the channel unilaterally after the contract’s lock time expires.
Analysis for HTLCGP and effectiveness
We use the same game model for analysing griefing attack in HTLCGP. The analysis in shows that a rational corrupt node will cancel the payment offchain just before the contract lock time elapses i.e., at time D\delta, where \delta\rightarrow 0. The corrupt node avoids paying any penalty but still manages to mount the attack. We cannot protect uncorrupt parties from griefing attacks unless we do not account for the intermediate delay in resolving payments. Since Bitcoin scripting language is not Turingcomplete, we cannot have a single offchain contract where we can define penalty as a function of time. We conjecture that it is impossible to design an efficient protocol that will penalize the attacker and compensate the victims of a griefing attack with the current Bitcoin scripting system.
Instead of focussing on the victims, we analyze the protocol from the attacker’s point of view. The latter has an objective of maximizing the damage by locking as much network liquidity possible for the given budget \mathcal{B}_{EX}. The attacker will continue to invest in the network if the return on investment is good enough. If the return on investment diminishes, the attacker will refrain from mounting the attack and instead prefer to invest in another activity. In HTLCGP, the introduction of penalty led to locking extra coins, increasing the cost of the attack. The attacker will be able to corrupt fewer nodes compared to HTLC.
We define a metric, capacity locked in a path routing payment, that indirectly determines the success rate of the griefing attack. It is the summation of the coins locked in the offchain contract instantiated on the channel forming the path. Ignoring the processing fee (negligible quantity), assuming all the payments executed are of value \alpha and the bribe offered per instance is L, the attacker can corrupt \frac{\mathcal{B}_{EX}}{L} nodes in the networks.
Given the total budget of the attack is \mathcal{B}_{EX}, incentive per attack being L, transaction value per payment being \alpha, HTLC timeout period is D, time taken to settle a transaction onchain being \Delta, n is the maximum allowed path length and a corrupt recipient rejects the payment at time t'=D\delta, where \delta \rightarrow 0, the capacity locked upon using HTLCGP is less than the capacity locked in HTLC, the loss percent being \frac{\gamma n (\frac{D}{2}+\frac{\Delta (n2)}{6})}{1+\gamma n (D+\frac{(n1)\Delta}{2})}
Inference: We observe that the loss percent \frac{\gamma n (\frac{D}{2}+\frac{\Delta (n2)}{6})}{1+\gamma n (D+\frac{(n1)\Delta}{2})} is dependent on \gamma. If \gamma is too low, the loss percent is not substantial and the attacker can still consider stalling the network. Hence the payment protocol HTLCGP is weakly effective in disincentivizing the attacker.
Guaranteed Minimum Compensation
If the incidence of griefing attack increases in the network, rate of griefingpenalty \gamma can be increased in HTLCGP but this reduces the success rate of payment due to reduced liquidity in the network. Our objective is to increase the cost of the attack without forcing uncorrupt parties to lock high penalties. Since corrupt parties are asked to route selfpayment via the longest available path, cost of the attack increases if the maximum path length allowed for routing payments decreases. Thus we must design a mechanism by which one can adjust the maximum allowed path length based on the rate of incidence of griefing attacks in the network. We introduce a new parameter \zeta, termed as Guaranteed Minimum Compensation. This sets a threshold on the minimum compensation a victim is entitled to, if it is affected by griefing attack.
By introducing this parameter, we can:
 Control the rate of griefingpenalty
 Control the maximum path length for routing a payment
\textrm{HTLCGP}^{\zeta} and its effectiveness
We propose a new protocol, \textrm{HTLCGP}^{\zeta}. The steps of the protocol are same as HTLCGP, except that maximum path length used for routing payment can be controlled by increasing or decreasing \zeta.
Given the total budget of the attack is \mathcal{B}_{EX}, incentive per attack being L, transaction value per payment being \alpha, HTLC timeout period is D, time taken to settle a transaction onchain being \Delta, n is the maximum allowed path length for HTLC, \tilde{n}^{\zeta,k} is the maximum allowed path length for \textrm{HTLCGP}^{\zeta} for a given pair of \zeta and k, and a corrupt recipient rejects the payment at time t'=D\mu, where \mu \rightarrow 0, the capacity locked in \textrm{HTLCGP}^{\zeta} is less than the capacity locked in HTLC, the loss percent being \frac{n\tilde{n}^{\zeta,k}}{(n1)\Big(1+\gamma^{\zeta,k} n D+ \gamma^{\zeta,k} n \Delta \frac{n1}{2}\Big)}+\frac{\gamma^{\zeta,k} \tilde{n}^{\zeta,k} ((n1)(D+\frac{(\tilde{n}^{\zeta,k}1)\Delta}{2})\frac{\tilde{n}^{\zeta,k}1}{2}(D+\frac{(2\tilde{n}^{\zeta,k})\Delta}{3}))}{(n1)\Big(1+\gamma^{\zeta,k} n D+ \gamma^{\zeta,k} n \Delta \frac{n1}{2}\Big)}
Inference: The loss percent is dominated by the term \frac{n\tilde{n}^{\zeta,k}}{n1}. For a given k, if \zeta increases, the maximum path length {\tilde{n}^{\zeta,k}} decreases, and so the loss incurred increases. In that case, the attacker would prefer to invest in other activities with higher returns rather than mount an attack on the network.
Results
We define the strategy used by the attacker in the network. The latter corrupts the nodes that are either pendant vertices or have just one channel in the network. It is easier and more costeffective to target such peripheral nodes than nodes with high centrality. A highly central node earns a higher profit as transactions tend to get routed through such nodes. Also, the attacker needs to offer a higher incentive per attack, which may not be a good strategy. On the other hand, peripheral nodes can be easily incentivized to deviate, as they haven’t gained much trust in the network. Such nodes do not expect to earn much by behaving altruistically. While analyzing the effectiveness of HTLCGP and \textrm{HTLCGP}^{\zeta}, we assume that U_{i1} always forward the payment request to U_i.
The dataset and parameters used in our experiments are as follows:
 (i) HTLCGP: The set of experiments are divided into two parts. Transaction value is varied between 10000\textrm{}100000 satoshis and \gamma \in [10^{7},10^{3}]. In the first part, we analyze the decrease in capacity locked when a penalty is introduced. The budget of the attacker is varied between 0.05 \ BTC6.25 \ BTC. The path length is set to n=20 and D is set to 100. In the second part, we analyze the rate of the successful transaction in the absence of a griefing attack. We vary the number of transactions between 30009000 and path length \kappa is varied between 5 and 20.
 (ii) \textrm{HTLCGP}^{\zeta}: We analyze the further decrease in capacity locked upon introduction of penalty, as well as guaranteed minimum compensation in the payment protocol for a given budget of the attacker. The budget of the attacker is varied between 0.05 BTC6.25 BTC. The transaction value is varied between 10000 satoshis to 100000 satoshis. We vary the parameter k between 0.005 to 2. For a fixed k, \zeta is varied so that path length ranges between 2 to 20. Both D and \Delta are set to 100.
Observations

Effectiveness of HTLCGP: The plot in (a) below shows that the capacity locked drops from 90% to 20% when \gamma is varied between 10^{7} to 10^{3}. We see a sharp decrease in capacity locked when \gamma increases from 10^{6} to 10^{5}, with the capacity locked dropping from 82\% to 50\%. When \gamma is 10^{4}, the capacity locked drops to 25\%. We also observe in (b) that the ratio of successful transaction executed drops to 54% when \gamma is 10^{3} and it is around 99% when \gamma is 10^{7}.

Effectiveness of \textrm{HTLCGP}^{\zeta}: k is varied between 0.005 and 2, and for each k, the factor \zeta is varied so that the maximum path length ranges between 2 and 20. We observe that on varying k and \zeta, \gamma varies between 10^{7} to 10^{3}. The drop in capacity locked in the network ranges between 18\% to 46\%. The drop in capacity locked is significant for the lower value of \gamma and the difference reduces for \gamma>10^{5}. Table II provides a comparative analysis of percentage loss in capacity between HTLCGP and \textrm{HTLCGP}^{\zeta}.
Discussion and Key Takeaways
 A corrupt node can still mount the attack by canceling the payment just before the offchain contract’s lock time elapses. However, we intend to study the impact of the reduced maximum allowed path length \tilde{n}^{\zeta,k} on the effective capacity locked by the attacker in the network.
 Penalization mechanism is not effective as it leads to reduction in network’s liquidity. Combining penalization mechanism with controlling the maximum allowed path length for payment proves beneficial in curbing the impact of such attacks, safeguarding the interest of honest participants in the network.
Implications and Followups
 We would also like to analyze the griefing attack in the BAR (Byzantine Altruistic and Rational) model.
 We would like to propose an incentivecompatible countermeasure in presence of rational miners. Tsabary et al. had proposed MADHTLC to counter bribery attack in HTLC but they have not discussed griefing attack. Our objective would be to combine the best of the both world and propose a stronger protocol.
Applicability
Our model can serve as the basis for analyzing the behavior of network pariticipants when a countermeasure is used for preventing griefing attack. We can also use it to assign reputation to the nodes in the network. Based on the reputation of the node, it will be chosen as intermediary for routing payments in future.