Research Summary: FlashSyn: Flash Loan Attack Synthesis via Counter Example Driven Approximation

TLDR

  • FlashSyn is the first automated end-to-end program synthesis tool for detecting flash loan exploitable vulnerabilities and synthesizing profitable attack vectors as proofs.
  • This is a novel numerical approximation technique to handle sophisticated logics in the implementation of decentralized finance (DeFi) smart contracts. FlashSyn uses two numerical methods – polynomials and interpolations.
  • We used a counter-example driven refinement technique to improve the precision of the approximation.
  • FlashSyn is evaluated on 12 DeFi protocols from three blockchains that were victims to flash loan attacks and 2 DeFi protocols from Damn Vulnerable DeFi challenges. FlashSyn successfully generates profitable attack vectors for all cases.

Citation

Chen, Zhiyang and Beillahi, Sidi Mohamed and Long, Fan. “FlashSyn: Flash Loan Attack Synthesis via Counter Example Driven Approximation.” arXiv preprint arXiv:2206.10708 [2206.10708] FlashSyn: Flash Loan Attack Synthesis via Counter Example Driven Approximation

Background

Introduction

Flashloan-enabled attacks have caused tremendous financial losses to the DeFi ecosystem exceeding 0.5 billion US dollars in just the last two years. There is a dire need for tools and mechanisms to help smart contract developers write safer protocols. We propose a new approach, FlashSyn. FlashSyn takes a given smart contract’s functions and key states and synthesizes flash loan attacks targeting those contracts.

Terminologies

  • Program Synthesis: A programming language method to automatically find a program that satisfies user-provided specifications.

  • Numerical Method: A popular mathematical tool to find approximate solutions for complex problems.

  • Optimization: A popular mathematical tool to find input values that maximize (or minimize) an objective function subject to some constraints.

  • Decentralized Finance (DeFi): A peer-to-peer financial system built on top of blockchains. A few DeFi protocols dominate the DeFi market and serve as references and callees for other decentralized applications: stable coins, price oracles, decentralized exchanges (DEXes), aggregators, and lending and borrowing providers.

  • Flash Loan: In the DeFi ecosystem, lenders can offer flash loans to borrowers without upfront collaterals deposits, i.e., loans that are only valid within a blockchain transaction and must be repaid with some fees by the end of that transaction. Adversaries can use flash loans to gather large amounts of assets to launch costly exploitation targeting DeFi protocols.

  • DeFi Protocol State: the state of the DeFi protocol stored on the blockchain which includes among others the amounts of tokens the protocol controls.

  • Action: A function call, through transactions, to a DeFi smart contract.

  • Attack: A sequence of actions. An attack is feasible if at the end of the attack the initiator of the sequence of actions gains positive profit.

Use Case

FlashSyn aims to find flash loan attacks that:

  • involve calling functions in existing DeFi smart contracts only
  • can be executed atomically within a single transaction.

This work is useful to smart contract developers who want test their protocols against flash loans enabled attacks. In particular, we believe this approach is suitable for new DeFi protocols that interact with existing DeFi protocols, e.g., DEXes, and lending/borrowing protocols.

Summary

Challenges

In developing FlashSyn, we had to overcome a number of technical challenges:

  • The underlying logic of DeFi smart contracts is often too sophisticated to reason about. Additionally, some DeFi protocols interact with closed-source smart contracts which prevents even simple code inspection.
  • Since the number of possible attacks is exponential with respect to the number of actions in a DeFi protocol, a naive search algorithm that enumerates all possible attacks and executes them locally in a forked environment will be notoriously slow. Thus, it will be impossible to check every possible attack with all possible input parameters.

To address the first challenge: FlashSyn approximates the behaviors of actions in a given DeFi protocol and then uses the approximation to find candidate attacks.

To address the second challenge: FlashSyn first employs a set of heuristics to prune possible attacks, for instance FlashSyn prunes out a sequence that consists of repeating the same action.

For a feasible attack, FlashSyn then uses an off-the-shelf optimizer to find the input parameters that maximize the profit of the attack initiator.

Preparation and Methodology

FlashSyn is implemented in a tool with the same name. Given a DeFi protocol in the form of a set of smart contract functions and key states for each function, FlashSyn first executes transactions in a forked blockchain environment to collect data points and approximate the actions in those smart contracts.

Then, based on numerical methods, FlashSyn uses these data points to build approximated functions for the above actions. It utilizes these functions to find an attack that is estimated to have a positive profit using the optimizer to prune the input parameters.

FlashSyn validates whether the estimated profit of the discovered attack is accurate by executing the attack on the original contracts and checking the actual profit it obtains against the estimated one.

If the difference between the actual and estimated profits is over a fixed accuracy then FlashSyn considers this as a witness of the inaccuracy (counter example) of the approximations. FlashSyn uses this information to extract new data points, plug them into the numerical methods, and refine the approximations iteratively. We call this process counter example guided data collection.

FlashSyn uses two different numerical methods to infer approximated actions, namely multivariate polynomial approximation and multivariate interpolation.

Results

FlashSyn is evaluated using 12 real-world flash loan attacks on three different blockchains (Ethereum, Binance Smart Chain, and Fantom) and 2 imaginary flash loan attacks that are part of a DeFi security challenge proposed by the community. The preliminary results of the evaluation show FlashSyn is able to find flash loan attacks for all cases and in some cases FlashSyn was able to find more profitable attacks. In the table below we give a summary of the FlashSyn evaluation:

Input Tokens Output Tokens Polynomial Interpolation
Blockchain Application Actual Profit # Actions Profit Time(s) Profit Time(s)
ETH ETH, WBTC ETH, WBTC bZx1 1194 2 2336 106 2331 60
USDT, USDC USDT, USDC Harvest_USDC 338448 4 82152 343 25929 1166
USDT, USDC USDT, USDC Harvest_USDT 307416 4 60203 82 57187 2284
WETH, DAI WETH, DAI, USDC Warp 1693523 6 2528667 374 - -
DAI DAI Eminence 1674278 5 839155 510 1757458 941
ETH ETH, USDC, USDT, DAI Cheesebank 3335773 7 1762771 4180 111011 120
BSC BUSD BUSD ElevenFi 129741 5 104261 94 105407 93
BUSD BUSD bEarnFi 13838(1 loop) 2 13811 201 12516 481
CAKE CAKE WBNB ApeRocket 1345 6 778 570 1258 1532
WBNB WBNB Novo 24857 4 16401 71 14060 242
WBNB WBNB WDoge 78 5 75 25 75 66
Fantom USDC USDC OneRing 1534752 2 1408293 71 1838593 116
Damn Vulnerable DeFi DVT, ETH DVT, ETH Puppet 93000 2 102160 121 102982 204
DVT, ETH DVT, ETH PuppetV2 953100 3 915560 112 243642 144

Findings

  • FlashSyn synthesizes flash loan attacks indicating vulnerabilities in smart contracts code, thus simulating the work of the experienced users who exploited those vulnerabilities before.

  • The counter-example guided data collection allows FlashSyn to refine the approximation efficiently since the data points collected through this process contain very interesting features that improve the accuracy of the approximation. This leads to faster convergence between estimated and the actual profits of attacks enumerated by FlashSyn.

  • The results of the evaluation show that in some cases FlashSyn finds the most profitable attack using multivariate polynomial approximation while in other cases FlashSyn finds the most profitable attack using multivariate interpolation approximation method.

  • Our belief is that when the values of the input parameters of attacks do not deviate much from the values used to collect data points, the interpolation approximation method performs well. On the other hand, when the values of the input parameters deviate considerably from the values used to collect data points, the polynomial approximation method performs well.

Actionables

We see two ways to prevent flash loan attacks.

  • Developers can insert a mandatory time gap between two contract function calls. For example, assets can only be withdrawn at least 2 blocks after they are deposited.
  • DeFi contracts should use cumulative price. Instead of directly calculating token prices based on instant DEX states, developers can calculate time-weighted prices. Many DEX protocols like Uniswap/Chainlink already have such price oracles.

Further Reading

For more information about the mathematics of flashloan attacks, please see https://2003.03810.pdf .

Damn Vulnerable DeFi, a wargame to learn offensive security of DeFi smart contracts, contains data from the last two benchmarks collected, and information related to non-flash loan DeFi attacks (such as NFT attacks).

10 Likes

Hello community!
This is still an on-going work from our research group. All suggestions/comments are welcomed!

2 Likes

Thank you for posting your research to the forum! I’m still digesting much of it, but I am curious about the actionable recommendations mentioned.

These two seem like they would be effective in preventing flash loan attacks, I am curious if there are unintended consequences for those using flash loans for other purposes?

1 Like

Thanks for your question! @zube.paul

The first recommendation actually prevents flash loan usage in general, not only flash loan attacks. I suggest having it to protect functions which can transfer large assets, and those that are not supposed to be in the same transaction. For example, normal users are not likely to deposit assets and withdraw them all in the same transaction.

The second recommendation is used to keep a robust price, resistant to price manipulation. I’ve seen so many flash loan attacks whose root causes are calculating asset price based on instant DEX states, which can be easily manipulated within one block using flash loan. This recommendation actually helps prevent design flaws and does nothing to prevent flash loan usage in general. For example, flash loan-assisted arbitrage can still happen if there is price difference compared with other DEXes.

2 Likes

@zhiychen This is a fantastic research that will further open up DeFi ecosystem and scale up interactions with Flash Loan. Just as has bern acknowledged in the work, Flash Loan malicious attack is the most pronounced in DeFi. The research will be very useful not only to developers but to l players in DeFi ecosystem

1 Like

Could you expand a little more on your thoughts on how this research might help open up/scale up interactions with flash loans in the DeFi space? I certainly see this providing good insights into security practices, but could you extend the reasoning to why that would increase interactions with flash loans in DeFi?

1 Like

@zhiychen I enjoyed reading through your summary. I was able to learn one or two things especially in regards to ways in preventing flash loan attacks and exploitation. I will say that the slow response times from DeFi platform developers are also one of the main elements that allow hackers to get away with flash loan attacks and exploitation. And I feel like we can’t really blame them as flash loan exploits and attacks are typically hard to spot until it’s too late.

I read an article somewhere on the internet that suggested the use of flash loan attack detection tools to mitigate and prevent flash loan attacks and exploitation. Tools like OpenZeppelin defender have been launched to help project managers to detect smart contract attacks and also respond swiftly and neutralize such attacks.

I wanted to know your take in using them to prevent flash loan attacks and exploitation since it wasn’t mentioned in the summary. You only gave two ways to prevent flash loan attacks. Thank you.

2 Likes

@Samuel94 Thanks for your interest! We hope FlashSyn can help indentify design flaws inside smart contracts and help build a safer DeFi system!

@zube.paul Actually the focus on this work is vulnerabilities/design flaws which can be exploited via flash loans. I don’t think this research has something with scaling up interactions with flash loans.

To scale up flash loan interactions, I think it requires future DeFi developers to build new innovative projects where flash loans can be used :) Let’s wait and see.

@Cashkid18
Actually I didn’t know OpenZeppelin has such service. Thanks for mentioning OpenZeppelin defender! I scanned their docs and found it’s great resources for smart contract security researchers!

In terms of preventing flash loan attacks, FlashSyn’s use cases and OpenZeppelin defender’s use cases are different.

FlashSyn’s use cases are described in Use Case section in the summary. I also want to highlight one thing: if you only look at one function or one smart contract, it can be 100% secure. However, vulnerabilities may show up when different functions of different contracts are called in sequence with specific parameters. One example is oracle manipulation. These vulnerabilities are hard to spot. FlashSyn automates the searching process via approximations.

OpenZeppelin defender’s use cases, as far as I understand, are mainly focus on montoring and automating smart contracts. It does provide good security advice in its Advisor section, but they are more general. Overall, it’s a great service for smart contract developers to build a secure protocol, preventing common bugs like re-entrancy. But it cannot be used to detect specific types of vulnerabilities. And I don’t think OpenZeppelin defender can “neutralize” such attacks if such vulnerabilities do exist. Experienced attackers can finish all steps and take profit away in just one transaction. When it happens, it’s too late.

BTW, you mentioned you came across some flash loan detection tools, do you still have their links? I did some literature review before I started this project. I thought FlashSyn was the first(and maybe the only) flash loan attack detection tool so far.

Wow… Thanks for this your great and clear explanation, I think I understand the whole thing better now. I appreciate the response.i will check for the link and send it for you here.

2 Likes

Thanks for sharing.

What does the time column aim to communicate? Flash loans occur in one adversarial block. (3-30 seconds)
With that and the patterns used for training as context: Is this expected to effectively translate to real world conditions capabilities?

2 Likes

Hello @parseb Thanks for your question!

Flash loans occur only in one transaction. So attackers have to get profit within one transaction otherwise they cannot pay back flash loans. An observation is that in flash loan attacks, multiple functions of the victim contracts are called to get profit. For example, attackers deposit some tokens, manipulate the prices, then withdraw abnormally more tokens. In this case, attackers have to deposit and withdraw in one transaction. If we make sure deposit and withdraw cannot be called within two block, for example. Then it’s harder for attackers to exploit the vulnerabilities here.

Another underlying reason of the time column is to add uncertainty to a possible attack. Recently there are some DeFi attacks in which the attackers use their own tokens to perform the attack, without flash loans. In this case, an attack can span multiple blocks. The time column design here gives bots/whitehats more time to respond and may possibly stop the hack.

4 Likes

I still do not understand what those seconds measure and how they relate to block production time.

2 Likes

@zhiychen
I’ve enjoyed reading over your list. I was able to retain one or two items, especially with strategies for preventing flash loan abuse and assaults, but I am curious about practically all of the significant proposals made. I’ll add that one of the key factors allowing programmers to encourage absence with streak credit assaults and exploitation is the DeFi stage engineers’ slow response times. I think that this research will offer insightful information on security procedures.

2 Likes

@parseb Thank you for the question!

If you mean the seconds inside my table, it represents the time FlashSyn takes to find a attack vetor which reveals the vulnerabilities inside the victim contracts. It has nothing to do with block production time :)

3 Likes

Thank you for your comments!

Yes. Engineers’ slow response time is definitely one reason why those exploitations happen. I’ve seen a few cases when the community already warned developers of the bugs but the developers did not pay enough attention. FlashSyn tries to reveal the bugs/design flaws before the exploitations really happen.

3 Likes

The question is, does it work outside a lab environment since the time to find an attack is always bigger than the time it takes attackers to exploit a vulnerability? (in all instances bigger than block production time) What am I misunderstanding?

2 Likes

Great article, thanks for sharing.

How would you stack up your solution against the SMT solver presented in this article (Defiposer-SMT)?

What could be the practical differences in the results returned by your implementation?

Edit: I’ll develop my thoughts as suggested by @zube.paul . As I see it, both tools are able to model the inter-operability of modulable DeFi protocols to search for potential vulnerabilities leading to flash-loan exploits.
I get that the method is different (SMT solver vs. counter-example driven approximation refinement) and both methods obviously carry pros and cons - especially concerning modelling - but what I’d like to know is how FlashSyn compare to a SMT-solver in terms of returned results (to be specific, I’m talking about accuracy, false positives/negatives, speed of execution, number of vulnerabilities found, nature of the vulnerabilities found etc…), or in other words, when is one more adapted than the other over a specific situation.

2 Likes

@Fuwala Welcome to the forum and thank you for contributing to this thread. Would you be willing to provide us with the comparison you see between SMT solver and FlashSyn? A little depth into your thought process would be greatly appreciated.

1 Like