Research Summary: Pied-Piper: Revealing the Backdoor Threats in Ethereum ERC Token Contracts

TLDR

  • With the development of decentralized networks, smart contracts, especially those for ERC tokens, are attracting more and more Dapp users to implement their applications. Some functions in ERC token contracts can be invoked only by a specific group of accounts. Among those functions, some can even influence other accounts or the whole system without prior notice or permission. These functions are referred to as contract backdoors. Once exploited by an attacker, they can cause property losses and harm users’ privacy.
  • In this work, we propose Pied-Piper, a hybrid analysis method that integrates datalog analysis and directed fuzzing to detect backdoor threats in Ethereum ERC token contracts. First, datalog analysis is applied to abstract the data structures and identification rules related to the threats for preliminary static detection. Then, directed fuzzing is applied to eliminate false positives caused by the static analysis.
  • We systematically investigated the 5 common types of backdoor problems in smart contracts. Then we implemented Pied-Piper and conducted several experiments to show its effectiveness. With Pied-Piper, we found 189 previously unknown threats in 13484 real-world smart contracts, and 4 of them have been assigned CVE IDs.

Core Research Question

What are the typical backdoor threats in ERC token contracts, and how can they be detected?

Citation

Fuchen Ma, Meng Ren, Lerong Ouyang, Yuanliang Chen, Juan Zhu, Ting Chen, Yingli Zheng, Xiao Dai, Yu Jiang, and Jiaguang Sun. 2022. Pied-Piper: Revealing the Backdoor Threats in Ethereum ERC Token Contracts. ACM Trans. Softw. Eng. Methodol. Just Accepted (August 2022). https://doi.org/10.1145/3560264

Background

  • ERC Token Contract: ERC-20 is the technical standard for fungible tokens created using the Ethereum blockchain.
  • Datalog Analysis: A Datalog analysis declares input/output program relations, each over one or more program domains, and provides rules (constraints) specifying how to compute the output relations from the input relations.
  • Fuzz testing: Fuzzing is a promising technique for vulnerability detection. It produces random inputs for the target programs and tries to trigger the program’s unexpected behaviors.
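
To make the Datalog idea concrete, here is a minimal Python sketch of bottom-up rule evaluation over a toy set of facts. The relations (`edge`, `reach`), the function `reachable`, and the block names in `EDGES` are hypothetical illustrations and are not the rules used by Pied-Piper.

```python
def reachable(edges):
    """Compute the output relation reach(x, y) from the input relation edge(x, y).

    The two rules, written in Datalog notation:
        reach(X, Y) :- edge(X, Y).
        reach(X, Z) :- reach(X, Y), edge(Y, Z).
    """
    reach = set(edges)  # first rule: every edge fact is a reach fact
    while True:
        # second rule: join reach(X, Y) with edge(Y, Z) on the shared Y
        derived = {(x, z) for (x, y) in reach for (y2, z) in edges if y == y2}
        new_facts = derived - reach
        if not new_facts:  # fixpoint: no rule produces anything new
            return reach
        reach |= new_facts


# Hypothetical input facts: control-flow edges between basic blocks.
EDGES = {
    ("entry", "check_owner"),
    ("check_owner", "update_balance"),
    ("update_balance", "exit"),
}

if __name__ == "__main__":
    for fact in sorted(reachable(EDGES)):
        print(fact)
```

The loop repeats the rules until no new facts appear, which is the usual way a Datalog engine computes output relations from input relations.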

Summary

  • The key insight of this paper is that some functions in ERC token contracts can influence other accounts or the whole system without prior notice or permission. Once exploited by an attacker, they can cause property losses and harm users’ privacy.
  • Smart contract binary code can be interpreted as an IR and constructed as a CFG, where we can extract facts for further datalog analysis.
  • We first summarize 5 common types of smart contract backdoors including Arbitrary Transfer, Generate Token After ICO, Destroy Token, Disable Transferring, and Freeze Account. We analyze the patterns of these backdoors and show how developers may avoid them.
  • Then we designed datalog rules to detect the backdoor threats in smart contracts based on the CFG facts. All the rules are listed in Section 5.1 of the paper. Specifically, they cover basic structures, basic relations, data structure identification rules, function type identification rules, and backdoor threat identification rules. These rules recognize the data flow of the smart contract and check whether there is a vulnerability.
  • In Section 5.2, we introduced a directed fuzzing engine to eliminate the false positives reported by the datalog analysis. The directed fuzzing first deletes the onlyowner modifier. It then generates an initial seed for the first execution and executes the suspected function given by the datalog analysis engine. After each round of execution, Pied-Piper mutates the seed by reordering the function calls and changing the inputs randomly.
  • In Section 6, we give the details of the implementation of Pied-Piper and show how effective it is by answering two research questions: 1) Is Pied-Piper accurate in detecting backdoor problems, i.e., any false positives or false negatives? 2) Is Pied-Piper efficient in detecting backdoor problems in real-world smart contracts?
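
The fuzzing steps described in the summary can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the Pied-Piper engine: `strip_modifier`, `mutate`, `fuzz`, the `execute` callback, and the modifier spelling `onlyOwner` are all hypothetical names chosen for the example, and a seed is modeled simply as a list of (function name, integer arguments) calls.

```python
import random
import re


def strip_modifier(header, modifier="onlyOwner"):
    """Remove an access-control modifier from a single function header line.

    Naive text-level rewrite for illustration only.
    """
    return re.sub(r"\b" + re.escape(modifier) + r"\b\s*", "", header)


def mutate(seed, rng, max_value=2**256 - 1):
    """Return a new call sequence: same calls, shuffled order, fresh random inputs."""
    mutated = [
        (name, tuple(rng.randint(0, max_value) for _ in args))
        for name, args in seed
    ]
    rng.shuffle(mutated)
    return mutated


def fuzz(execute, seed, rounds, rng):
    """Run `execute` on the seed, then on successive mutations of it.

    `execute` is a hypothetical callback that returns True when a call
    sequence reaches the suspected statement. Returns the sequences that did.
    """
    hits = []
    current = seed
    for _ in range(rounds):
        if execute(current):
            hits.append(current)
        current = mutate(current, rng)
    return hits


# Hypothetical initial seed: two calls with placeholder integer arguments.
SEED = [("transfer", (1, 2, 3)), ("destroy", (4, 5))]
```

A fixed `random.Random(...)` instance can be passed as `rng` to make a fuzzing run reproducible.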

Method

  • We summarized the common types of backdoor threats in smart contracts.
    • As illustrated in Section 4, we collected and read more than 50 relevant news reports about ERC token contract backdoors from recent years.
    • In addition, we consulted many industrial programmers engaged in smart contract development and collected many opinions about the definition of ERC token contract backdoors. We contacted 10 smart contract and blockchain developers during this study. We collected and analyzed these blogs and reports by checking the source code of the corresponding smart contracts with backdoor threats. Then we distributed our findings to the developers.
    • The final list of threats is defined by merging all the opinions from the developers.
  • Conduct datalog analysis on smart contracts.
    • Pied-Piper will first construct a CFG based on the contract’s source code and collect some basic data structures and relations of the CFG.
    • Then, Pied-Piper defines identification rules for specific data structures related to backdoor functions. Based on these data structures, Pied-Piper identifies some function types, such as transfer and approve functions. Finally, Pied-Piper detects a backdoor risk based on well-defined rules. The datalog analysis will give a preliminary report on the three types of backdoor problems.
  • However, the static analysis of “Transfer In Tokens” type is not sound, and Pied-Piper uses a fuzzing engine to eliminate the false positives. The fuzzing engine will compile the contract and construct a new CFG with target label and node distance according to the location of the potential threats reported by the datalog analysis. If the guided fuzzing engine can reach the target statements and trigger a protection mechanism, the reported function is not a real threat, and the false positive could be eliminated precisely.
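
As a rough sketch of the "node distance" idea, the snippet below computes how many edges each CFG node is from a target node, which a directed fuzzer could use to prefer inputs that get closer to the target. The graph, the node names, and the helpers `distances_to_target` and `is_false_positive` are hypothetical; the actual distance metric used by Pied-Piper may differ.

```python
from collections import deque


def distances_to_target(cfg, target):
    """Shortest number of edges from each node to `target`.

    `cfg` maps a node to the list of its successors. Nodes that cannot
    reach the target are absent from the result.
    """
    # Reverse the edges so that a BFS from the target walks backwards.
    reverse = {}
    for src, dsts in cfg.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)

    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for pred in reverse.get(node, []):
            if pred not in dist:
                dist[pred] = dist[node] + 1
                queue.append(pred)
    return dist


def is_false_positive(reached_target, protection_triggered):
    """Decision rule from the summary: reaching the target and triggering a
    protection mechanism means the reported function is not a real threat."""
    return reached_target and protection_triggered


# Hypothetical CFG of a suspected function.
CFG = {
    "entry": ["guard"],
    "guard": ["revert", "transfer"],
    "transfer": ["exit"],
    "revert": [],
    "exit": [],
}
```

For this toy graph, calling `distances_to_target(CFG, "transfer")` assigns a distance only to the nodes that lie on a path to the target.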

Results

  • Benchmarks: We prepared two datasets for the evaluation. The first is a manually created dataset of 200 smart contracts with certain types of backdoor problems. A backdoor function is manually embedded in each contract with the help of smart contract developers, and each type is embedded into 40 smart contracts. The second is collected from real-world smart contracts. We wrote a crawler script to download the source code of smart contracts from Etherscan. In total, we got 13484 real-world smart contracts to evaluate the effectiveness of Pied-Piper on real backdoor problem detection.
  • Measurement metrics: We evaluated the effectiveness as well as the efficiency of Pied-Piper in our experiment. Specifically, we calculated the false positive samples and the analysis overhead.
  • Results: With the combination of datalog analysis and directed fuzzing, Pied-Piper successfully reported all 200 cases in the manual dataset without any false positives or false negatives. In the 13484 real-world contracts, it found 189 real threats, and 3 mislabeled samples were corrected by the dynamic fuzzer. As for the time overhead, Pied-Piper takes 8.03 seconds on average to analyze a single contract.
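
For a sense of scale, the reported figures can be combined in a short back-of-the-envelope calculation. The derived numbers are estimates, not results from the paper: they assume the 8.03-second average applies to the real-world dataset and that contracts are analyzed one after another.

```python
# Figures taken from the summary above.
contracts = 13484        # real-world contracts crawled from Etherscan
threats = 189            # real threats found in those contracts
avg_seconds = 8.03       # average analysis time per contract
types, per_type = 5, 40  # backdoor types and manual contracts per type

# Size of the manual dataset implied by 5 types x 40 contracts each.
manual_total = types * per_type

# Threats found per 100 real-world contracts.
threats_per_100 = 100 * threats / contracts

# Estimated sequential analysis time for the whole real-world dataset
# (assumption: the average holds for this dataset, no parallelism).
total_hours = contracts * avg_seconds / 3600

print(manual_total, round(threats_per_100, 2), round(total_hours, 1))
```

Under those assumptions, the whole real-world dataset would take on the order of 30 hours to analyze sequentially, and roughly 1.4 threats are found per 100 contracts.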

Discussion and Key Takeaways

  • Feature or Bug of Backdoor Threats: The functions we discussed in this paper may have legitimate uses, for example as a remedy when an attacker has stolen some coins by some means. However, as we can see from the Soarcoin example, they can be abused and cause a significant loss to regular users. Besides, it is hard to tell whether it was the owner or a hacker with the stolen private key who took advantage of these high-authority functions. We think the developers of smart contracts should try their best to secure the code rather than develop high-risk remedial measures. Furthermore, to avoid these problems, developers can standardize the development of smart contracts accordingly and control the group of privileged accounts.
  • Fairness of Manual Datasets: The first dataset, used to evaluate the accuracy of Pied-Piper, was manually embedded with arbitrary transfer problems. We built this dataset by analyzing representative contracts on Ethereum and studying existing threat reports. It may not cover all the possible forms of the threats. We consulted many smart contract developers to inject those threats, and this manual dataset is our best effort.

Implications and Follow-Ups

  • More backdoor types: Pied-Piper is a framework for backdoor hunting. As new patterns are recognized, more rules could be added to detect other backdoor threats.
  • Automatic program repair: It would be valuable to automatically repair the threats that Pied-Piper finds in smart contracts.

Applicability

  • Pied-Piper can be used to give suggestions to investors in Web3. If a token contract has backdoor risks, investors should be cautious about investing in it.

Thank you for this amazing research summary @fuchen. It is quite educating and presents a clear idea of the topic in a concise way.

I can see that the fuzzing mechanism and datalog analysis adopted by Pied-Piper have a 100% success rate.

However, this leaves open the possibility that the fuzzing engine used by Pied-Piper may not reach a target as presented here.

What happens when the guided fuzzing engine does not reach its target and triggers the protection mechanism? Does this present all the false positives as threats?

Thanks for the amazing research summary

We believe that if the fuzz engine does not reach the target and trigger the protection mechanism, the potential bug reported by the static analysis engine is a true positive. Indeed, there may be cases in which the fuzz engine cannot eliminate a false alarm. However, we didn’t find such situations in our experiment. In fact, the evaluation results show that the datalog engine gives out only a few false alarms. This may indicate that the cases I mentioned above are really rare in token contracts in the wild.

Hope you like it!!
And feel free to ask any questions!!

@fuchen, I am impressed by the short analysis and output time of your new tool, Pied-Piper. Also, that’s a very clever name you chose.

On Backdoors, Smart Contracts, and Vulnerabilities

Smart contracts have been notorious for their vulnerabilities, hacks, and exploits, but they are indispensable and have also made considerable advances. According to Certik, as published on CoinDesk, about $1.3 billion was lost in DeFi through smart contract vulnerabilities in 2021. This underscores the importance of research like this, which proposes a solution to an existing problem.

Attacks on Smart Contracts, a research summary on the Forum, omitted Backdoor Attacks. The research paper does not contain an exhaustive list, so it is understandable. Nevertheless, it would have been a good addition.

Backdoor attacks are notorious for privacy breaches and loss of assets, thus violating the privacy techniques outlined in this research summary.

Notable Points from the Summary

  1. A backdoor threat is a vulnerability in Ethereum smart contracts that can lead to privacy breaches and loss of assets.

  2. Contract backdoors are a necessary evil: they can be useful in the right hands and abused in the wrong hands.

  3. Backdoor attacks are perpetrated by attackers who exploit special accounts and special functions on Ethereum smart contracts.

  4. Contract backdoors are like an emergency backdoor from which a thief can sneak in without permission and rob some select rooms or even a whole house.

Questions
A. Speaking about the accessibility to invoking functions in smart contracts, hypothetically, what kind of accounts or group of accounts can be granted this access?

B. Just as a sneak peek, can you please mention some of the smart contract functions that can trigger a backdoor attack?

Moving the Research Forward
Is it possible for Pied-Piper to develop some kind of severity scale to classify each detected backdoor threat? This way auditors can easily give the results adequate attention. Or, is there a system close to this that Pied-Piper implements?

@Ulysses , thanks for sharing this, it gives a new perspective into smart contract backdoor threats. As a relatively new person in web 3, it was valuable to me.

Welcome to the community. Glad that you found this helpful. Since you are new to web 3 and to SCRF, you can check this resource; it could be helpful too.

SCRF is also organising a web 3 writing cohort. You can register here

Thank you @fuchen for your input on this research paper. Here is my own take on it, by way of advice.
To avoid the harm caused by backdoors, I would offer advice to both Dapp users and smart contract developers.
For Dapp users: pay attention to the transfer, minting, or destroying functions of the smart contracts behind the Dapp. If these functions can be called only by a specific group of accounts and can affect other accounts’ balances, they may be leveraged to cause a huge loss.

Therefore, users should be careful about putting their digital assets into Dapps with such functions.

For smart contract developers: backdoor threats may affect the trustworthiness of the Dapp and, if leveraged by malicious developers, will damage the ecosystem of your applications. I think it is essential to avoid such threats during the smart contract development process.

Sorry for the late reply! Thank you for the comments, and the web3 resource is really helpful.

Answers to the Questions

Here are my answers to the two questions: (Really good questions by the way! )

A. I believe that the accounts that have the privilege to access these functions are generally the owners or administrators. However, we recently found that the composition of these accounts is complex in some cases. For example, in some distributed governance contracts, this group contains many accounts that have the right to vote on a proposal. Without well-designed models, this could be really dangerous.

B. Generally, the functions are related to the token transfer process. Some examples are transfer(address from, address to, uint256 amount) and destroy(address from, uint256 amount).
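
To illustrate why functions with these signatures are risky, here is a toy Python model of a token whose privileged account can move or destroy anyone's balance. `ToyToken`, its methods, and the account names are hypothetical and only mimic the two signatures mentioned above; this is not code from a real contract.

```python
class ToyToken:
    """Toy in-memory token with two owner-only, backdoor-style functions."""

    def __init__(self, owner, balances):
        self.owner = owner
        self.balances = dict(balances)

    def _only_owner(self, caller):
        # Stand-in for an owner-only access check.
        if caller != self.owner:
            raise PermissionError("caller is not the owner")

    def transfer(self, caller, from_, to, amount):
        """Owner moves `amount` out of `from_` without that account's consent."""
        self._only_owner(caller)
        if self.balances.get(from_, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[from_] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount

    def destroy(self, caller, from_, amount):
        """Owner burns `amount` of the tokens held by `from_`."""
        self._only_owner(caller)
        if self.balances.get(from_, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[from_] -= amount


if __name__ == "__main__":
    token = ToyToken("owner", {"alice": 100})
    token.transfer("owner", "alice", "owner", 60)  # alice never approved this
    token.destroy("owner", "alice", 40)
    print(token.balances)
```

In this model, whoever controls the owner account, including someone who stole its key, can empty any other account.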

Excellent advice !! :grinning:

Thanks for the compliment :kissing_heart:

Great response @fuchen .

How about this, any thoughts on it?

Sorry for the late reply.
That’s really a good idea! We will try to set up a severity scale for the detected threats.

If the fuzzing engine does not trigger the protection mechanism before reaching the target, each potential issue reported by the static analysis is treated as a true positive. In some instances, the fuzzing engine might not be able to eliminate a false alarm. However, in our trial, we didn’t encounter any such circumstances. In fact, based on the evaluation results, the datalog engine generates very few false alarms.