Research Summary: Pied-Piper: Revealing the Backdoor Threats in Ethereum ERC Token Contracts

fuchen · September 20, 2022, 1:00pm

TLDR

With the development of decentralized networks, smart contracts, especially those for ERC tokens, are attracting more and more Dapp users to implement their applications. Some functions in ERC token contracts can only be invoked by a specific group of accounts. Among those functions, some can even influence other accounts or the whole system without prior notice or permission. These functions are referred to as contract backdoors. Once exploited by an attacker, they can cause property losses and harm users’ privacy.

In this work, we propose Pied-Piper, a hybrid analysis method that integrates datalog analysis and directed fuzzing to detect backdoor threats in Ethereum ERC token contracts. First, datalog analysis is applied to abstract the data structures and identification rules related to the threats for preliminary static detection. Then, directed fuzzing is applied to eliminate false positives caused by the static analysis.

We systematically investigated the 5 common types of backdoor problems in smart contracts. Then we implemented Pied-Piper and conducted several experiments to show its effectiveness. With Pied-Piper, we have found 189 previously unknown threats in 13484 real-world smart contracts, and 4 of them have been assigned CVE IDs.

Core Research Question

What are the typical backdoor threats in ERC token contracts, and how can they be detected?

Citation

Fuchen Ma, Meng Ren, Lerong Ouyang, Yuanliang Chen, Juan Zhu, Ting Chen, Yingli Zheng, Xiao Dai, Yu Jiang, and Jiaguang Sun. 2022. Pied-Piper: Revealing the Backdoor Threats in Ethereum ERC Token Contracts. ACM Trans. Softw. Eng. Methodol. Just Accepted (August 2022).

Background

ERC Token Contract: ERC-20 is the technical standard for fungible tokens created using the Ethereum blockchain.
Datalog Analysis: A Datalog analysis declares input/output program relations, each over one or more program domains, and provides rules (constraints) specifying how to compute the output relations from the input relations.
Fuzz testing: Fuzzing is a promising technique for vulnerability detection. It produces random inputs for the target programs and tries to trigger the program’s unexpected behaviors.
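
As a rough illustration of the Datalog idea described above, here is a minimal Python sketch that derives an output relation from an input relation by applying rules until no new facts appear. The block names and the reach relation are hypothetical and are not taken from the paper; a real Datalog engine evaluates rules far more efficiently.

```python
# Toy Datalog-style evaluation: apply rules to the known facts repeatedly
# until nothing new can be derived (a naive fixpoint).

# Input relation edge(src, dst) over hypothetical CFG block names.
edge = {
    ("entry", "check_owner"),
    ("check_owner", "update_balance"),
    ("update_balance", "exit"),
}


def reachable(edges):
    """Compute the output relation reach(a, b) using two rules:

    reach(a, b) :- edge(a, b).
    reach(a, c) :- reach(a, b), edge(b, c).
    """
    reach = set(edges)  # first rule: every edge is a reach fact
    while True:
        # second rule: extend each known reach fact by one more edge
        derived = {(a, c) for (a, b) in reach for (b2, c) in edges if b == b2}
        if derived <= reach:  # no new facts, so the fixpoint is reached
            return reach
        reach |= derived


reach = reachable(edge)
print(("entry", "exit") in reach)
```

The same pattern extends to other relations: each rule reads existing facts and may add new ones, and evaluation stops once a full pass adds nothing.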

Summary

The key insight of this paper is that some functions in ERC token contracts can influence other accounts or the whole system without prior notice or permission. Once exploited by an attacker, they can cause property losses and harm users’ privacy.
Smart contract binary code can be interpreted as an IR and constructed as a CFG, where we can extract facts for further datalog analysis.
We first summarize 5 common types of smart contract backdoors including Arbitrary Transfer, Generate Token After ICO, Destroy Token, Disable Transferring, and Freeze Account. We analyze the patterns of these backdoors and show how developers may avoid them.
Then we designed datalog rules to detect the backdoor threats in smart contracts based on the CFG facts. We listed all the rules in Section 5.1 in the paper. Specifically, there are some basic structures, basic relations, data structure identification rules, function type identification rules, and backdoor threats identification rules. These rules recognize the data flow of the smart contract and check whether there is a vulnerability.
In Section 5.2, we introduced a directed fuzzing engine to eliminate the false positives reported by the datalog analysis. The directed fuzzing first removes the onlyOwner modifier. Then it generates an initial seed for the first execution and executes the suspected function given by the datalog analysis engine. After each round of execution, Pied-Piper mutates the seed by reordering the function calls and changing the inputs randomly.
In Section 6, we give the details of the implementation of Pied-Piper and show how effective it is by answering two research questions: 1) Is Pied-Piper accurate in detecting backdoor problems, i.e., any false positives or false negatives? 2) Is Pied-Piper efficient in detecting backdoor problems in real-world smart contracts?
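
To make the five backdoor types named in the summary more concrete, here is a small Python model of a token. It is only a sketch: the class ToyToken and all of its method names are hypothetical, it is not Solidity, and it is not code from the paper. Each owner-only method mirrors one of the five types.

```python
class ToyToken:
    """Hypothetical in-memory token; owner-only methods mirror the five backdoor types."""

    def __init__(self, owner, supply):
        self.owner = owner
        self.balances = {owner: supply}
        self.total_supply = supply
        self.ico_finished = False
        self.transfers_enabled = True
        self.frozen = set()

    def _only_owner(self, caller):
        if caller != self.owner:
            raise PermissionError("caller is not the owner")

    def _move(self, src, dst, amount):
        if self.balances.get(src, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[src] -= amount
        self.balances[dst] = self.balances.get(dst, 0) + amount

    def transfer(self, caller, to, amount):
        # Ordinary user transfer, subject to the disable and freeze switches.
        if not self.transfers_enabled or caller in self.frozen:
            raise RuntimeError("transfer blocked")
        self._move(caller, to, amount)

    def admin_transfer(self, caller, src, dst, amount):
        # 1. Arbitrary Transfer: moves tokens out of src without src's consent.
        self._only_owner(caller)
        self._move(src, dst, amount)

    def mint(self, caller, to, amount):
        # 2. Generate Token After ICO: ico_finished is never checked here.
        self._only_owner(caller)
        self.balances[to] = self.balances.get(to, 0) + amount
        self.total_supply += amount

    def destroy(self, caller, victim, amount):
        # 3. Destroy Token: burns tokens held by another account.
        self._only_owner(caller)
        if self.balances.get(victim, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[victim] -= amount
        self.total_supply -= amount

    def set_transfers_enabled(self, caller, enabled):
        # 4. Disable Transferring: a global switch for all users.
        self._only_owner(caller)
        self.transfers_enabled = enabled

    def freeze(self, caller, account):
        # 5. Freeze Account: blocks a single account from transferring.
        self._only_owner(caller)
        self.frozen.add(account)


token = ToyToken("owner", 1000)
token.transfer("owner", "alice", 100)
token.admin_transfer("owner", "alice", "owner", 40)  # alice never approved this
token.destroy("owner", "alice", 10)
token.ico_finished = True
token.mint("owner", "owner", 500)  # still succeeds after the ICO
token.freeze("owner", "alice")

blocked = False
try:
    token.transfer("alice", "owner", 1)
except RuntimeError:
    blocked = True
print(token.balances, token.total_supply, blocked)
```

In every case the affected account has no say in the call, which is the property the summary uses to define a backdoor.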

Method

We summarized the common types of backdoor threats in smart contracts.
- As illustrated in Section 4, we collected and read more than 50 relevant news about ERC token contract backdoors in recent years.
- In addition, we consulted many industrial programmers engaged in smart contract development and collected many opinions about the definition of ERC token contract backdoors. We contacted 10 smart contract and blockchain developers during this study. We collected and analyzed these blogs and reports by checking the source code of the corresponding smart contracts with backdoor threats. Then we distributed our findings to the developers.
- The final list of threats is defined by merging all the opinions from the developers.
Conduct datalog analysis on smart contracts.
- Pied-Piper will first construct a CFG based on the contract’s source code and collect some basic data structures and relations of the CFG.
- Then, Pied-Piper defines identification rules for specific data structures related to backdoor functions. Pied-Piper identifies some function types, such as transfer and approve functions, based on these data structures. Finally, Pied-Piper detects a backdoor risk based on well-defined rules. The datalog analysis will give a preliminary report on the three types of backdoor problems.
However, the static analysis of the “Transfer In Tokens” type is not sound, so Pied-Piper uses a fuzzing engine to eliminate the false positives. The fuzzing engine compiles the contract and constructs a new CFG annotated with target labels and node distances according to the location of the potential threats reported by the datalog analysis. If the guided fuzzing engine can reach the target statements and trigger a protection mechanism, the reported function is not a real threat, and the false positive can be eliminated precisely.
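
The confirmation step described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Pied-Piper's engine: the two candidate functions, the allowance guard, and the ProtectionTriggered exception are hypothetical, fresh random inputs stand in for seed mutation, and the distance-guided scheduling over the CFG is omitted.

```python
import random


class ProtectionTriggered(Exception):
    """Raised when a guard inside a candidate function rejects the call."""


def transfer_from(state, src, dst, amount):
    # Hypothetical candidate flagged by the static stage: it moves tokens
    # out of src, but only if src has granted enough allowance.
    if state["allowance"].get(src, 0) < amount:
        raise ProtectionTriggered("allowance too low")
    state["balances"][src] -= amount
    state["balances"][dst] = state["balances"].get(dst, 0) + amount


def admin_transfer(state, src, dst, amount):
    # Hypothetical candidate with no guard left once the owner check is stripped.
    state["balances"][src] -= amount
    state["balances"][dst] = state["balances"].get(dst, 0) + amount


def fuzz_candidate(func, rounds=50, seed=0):
    """Call func with random inputs and classify the static report."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    accounts = ["alice", "bob", "carol"]
    for _ in range(rounds):
        state = {"balances": {a: 100 for a in accounts}, "allowance": {}}
        src, dst = rng.sample(accounts, 2)
        amount = rng.randint(1, 100)
        try:
            func(state, src, dst, amount)
        except ProtectionTriggered:
            # A protection mechanism fired at the target, so the report
            # is treated as a false positive.
            return "false positive"
    return "confirmed threat"


print(fuzz_candidate(transfer_from))
print(fuzz_candidate(admin_transfer))
```

A report is kept as a threat only when no execution triggers a guard, which is also why an unreached target leaves the static report standing.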

Results

Benchmarks: We prepared two datasets for the evaluation. The first is a manually created dataset of 200 smart contracts with certain types of backdoor problems. A backdoor function was manually embedded in each contract with the help of smart contract developers, and each type was embedded into 40 smart contracts. The second was collected from real-world smart contracts. We wrote a crawler script to download the source code of smart contracts from Etherscan. In total, we got 13484 real-world smart contracts to evaluate the effectiveness of Pied-Piper on real backdoor problem detection.
Measurement metrics: We evaluated the effectiveness as well as the efficiency of Pied-Piper in our experiment. Specifically, we calculated the false positive samples and the analysis overhead.
Results: With the combination of datalog analysis and directed fuzzing, Pied-Piper correctly reported all 200 cases without any false positives or false negatives. Across all 13484 real-world contracts, it found 189 real threats, and 3 mislabeled samples were corrected by the dynamic fuzzer. As for the time overhead, Pied-Piper takes 8.03 seconds on average to analyze a single contract.
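
For a sense of scale, the reported figures can be combined with some simple arithmetic. The sketch below only uses numbers stated above; the total running time assumes the contracts are analyzed one after another, which the summary does not state, and the threat rate is an average per 100 contracts rather than the share of affected contracts.

```python
contracts = 13484     # real-world contracts analyzed
avg_seconds = 8.03    # reported average analysis time per contract
threats = 189         # real threats found
backdoor_types = 5    # types in the manual dataset
per_type = 40         # contracts per type in the manual dataset

manual_total = backdoor_types * per_type
total_hours = contracts * avg_seconds / 3600      # assumes sequential analysis
threats_per_100 = 100 * threats / contracts

print(manual_total)                 # 200
print(round(total_hours, 1))        # 30.1
print(round(threats_per_100, 2))    # 1.4
```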

Discussion and Key Takeaways

Feature or Bug of Backdoor Threats: The threats we discussed in this paper may have legitimate uses, for example as a remedial measure when an attacker has stolen coins by some means. However, as we can see from the Soarcoin example, these threats can be abused and cause a significant loss to regular users. Besides, it is hard to tell whether it was the owner or a hacker who stole the private key that took advantage of these high-authority functions. We think the developers of smart contracts should try their best to secure the code rather than develop high-risk remedial measures. Furthermore, to avoid these problems, developers can standardize the development of smart contracts accordingly and control the group of privileged accounts.
Fairness of Manual Datasets: The first dataset used to evaluate the accuracy of Pied-Piper is embedded with arbitrary transfer problems manually. We built this dataset based on analyzing representative contracts on Ethereum and the empirical study of existing threat reports. It may not contain all the possible situations of the threats. We consulted many developers of smart contracts to inject those threats, and this manual dataset is our best effort.

Implications and Follow-Ups

More backdoor types: Pied-Piper is a framework for backdoor hunting. As new patterns are recognized, more rules could be added to detect other backdoor threats.
Automatic program repair: It would be nice to be able to automatically repair the threats in smart contracts found by Pied-Piper.

Applicability

Pied-Piper can be used to give suggestions to investors in Web3. If a token contract has backdoor risks, investors should be cautious about investing in it.

Chrisarch · September 21, 2022, 8:02am

Thank you for this amazing research summary @fuchen. It is quite educating and presents a clear idea of the topic in a concise way.

I can see that the fuzzing mechanism and datalog analysis adopted by Pied-Piper have a 100% success rate.

However, this leaves open the possibility that the fuzzing engine used by Pied-Piper may not reach a target as presented here.

What happens when the guided fuzzing engine does not reach its target and trigger the protection mechanism? Does it then report all the false positives as threats?

Lisayanky · September 21, 2022, 1:05pm

Thanks for the amazing research summary

fuchen · September 22, 2022, 2:21am

We believe that if the fuzz engine does not reach the target and trigger the protection mechanism, it means the potential bug given by the static analysis engine is a true positive. Indeed, cases where the false alarms cannot be eliminated by the fuzz engine may exist. However, we didn’t find such situations in our experiment. In fact, from the evaluation results, we found that the datalog engine gives out only a few false alarms. This may indicate that the cases I mentioned above may be really rare in wild token contracts.

fuchen · September 22, 2022, 2:22am

Hope you like it!!
And feel free to ask any questions!!

Ulysses · September 22, 2022, 9:26pm

@fuchen, I am impressed by the short analysis and output time of your new tool, Pied-Piper. Also, that’s a very clever name you chose.

On Backdoors, Smart Contracts, and Vulnerabilities

Smart contracts have been notorious for their vulnerabilities, hacks, and exploits, but they have been indispensable and have also recorded considerable advancement. According to Certik, as published on CoinDesk, about $1.3 billion was lost in DeFi through smart contract vulnerabilities in 2021. This underscores the importance of research like this, which proposes a solution to an existing problem.

Attacks on Smart Contracts, a research summary on the Forum, omitted Backdoor Attacks. The research paper does not contain an exhaustive list, so it is understandable. Nevertheless, it would have been a good addition.

Backdoor attacks are notorious for privacy breach and loss of assets, thus violating privacy techniques as outlined in this research summary.

Notable Points from the Summary

A backdoor threat is a vulnerability in Ethereum smart contracts that can lead to privacy breaches and loss of assets.
Contract backdoors are a necessary evil, as they can be useful in the right hands and manipulated in the wrong hands.
Backdoor attacks are perpetrated by attackers who exploit special accounts and special functions on Ethereum smart contracts.
Contract backdoors are like an emergency backdoor from which a thief can sneak in without permission and rob some select rooms or even a whole house.

Questions
A. Speaking about the accessibility to invoking functions in smart contracts, hypothetically, what kind of accounts or group of accounts can be granted this access?

B. Just as a sneak peek, can you please mention some of the smart contract functions of which some can trigger a backdoor attack?

Moving the Research Forward
Is it possible for Pied-Piper to develop some kind of severity scale to classify each detected backdoor threat? This way auditors can easily give the results adequate attention. Or, is there a system close to this that Pied-Piper implements?

Yeoriton56 · September 23, 2022, 5:36pm

@Ulysses , thanks for sharing this, it gives a new perspective into smart contract backdoor threats. As a relatively new person in web 3, it was valuable to me.

Ulysses · September 23, 2022, 6:23pm

Welcome to the community. Glad that you found this helpful. Since you are new to web 3 and to SCRF, you can check this resource; it could be helpful too.

SCRF Recommends

Best Resources to learn about Web3

In this thread, you will find some recommended papers, resources, and links on Web 3 to help you in your journey into Web 3.

The SCRF Recommends thread is a valuable resource for people who are new to web3 and people who have lots of knowledge and experience in web3. There is something for everyone! You can find new articles/papers to refresh your memory about a subject or even introduce you to something new. In this thread, you can share papers, articles, and resources you find interesting with the community and help others discover them.

How can you contribute to SCRF recommends?

You can contribute to SCRF Recommends by responding to this post. You may follow this format:

Hyperlinked title
Indicate category (Beginner or Advanced)
About {brief introduction to the recommended paper/article/course, duration, etc}
Name of author/organizer/owner

If you need help or have any questions, you can ask @zube.paul or @Tolulope

Why should you contribute?

The web3 community has brought together so many people globally. One thing that everyone has in common is the love for knowledge and the willingness to help each other. You should contribute to SCRF Recommends for the following reasons:

i. help people discover new topics and resources
ii. contribute to the growth of the web3 community
iii. You can earn some SourceCred by doing so
iv. You can also discover new things while doing so.

Beginners

Here, you will find links to online courses, white papers, and articles that provide an introductory overview of web3.

Bitcoin: A Peer-to-Peer Electronic Cash System
The original Bitcoin whitepaper. This is your first stop.
Satoshi Nakamoto

An Introduction To Terminologies And Layers In Web3
A deep dive into common words used in the web3 ecosystem. It serves as a glossary for newcomers into web3.
Ruchika Gupta, Geekyants.

Unit Master’s Program
A free six-week blockchain literacy program. Topics include Decentralization, Blockchain and incentive alignment, and Stakeholder capitalism and sustainability. This basic program requires no prior knowledge or learning.
Unit

https://www.web3.university/
A community-driven platform offering free programming-focused blockchain courses. Students learn to create smart contracts and build NFTs. It is simplified enough for a person without a technical background to learn.
Web3 University

https://www.learnweb3.io/
Level-based free courses for programmers in web3. Courses begin from basics of web3 to building dApps to Security and Hacking. The courses progress from basic to intermediate to difficult, building on the lessons for each level.
Learn web3

https://www.linkedin.com/learning/what-is-web3
Introduces web3 and the metaverse and blockchain and web3 basics. A simplified approach to web3.
LinkedIn

Ethereum Whitepaper | ethereum.org
The original Ethereum whitepaper that accelerated non-financial blockchain use cases.
Vitalik Buterin

MOOC: Introduction to Digital Currencies.
A free introductory course by the University of Nicosia. It introduces digital currencies and the need for them. No previous knowledge or learning is required to enroll in this course.
University of Nicosia

Bitcoin and Cryptocurrency Technologies

from Princeton University, lectured by Joseph Bonneau, Ed Felten, Arvind Narayanan, and Andrew Miller, has been useful in providing an overview. The course was published around 2016, but the foundations it helps build are timeless.

Vitalik Buterin’s website
A blog by Vitalik Buterin that consists of several topics relating to Blockchain.

Bitcoin Talk
A Bitcoin forum with discussions on Bitcoin and related issues.

Advanced

Here, you will find resources that are of a more technical nature. They mostly require a previous understanding of web3, blockchain, DeFi or any other relevant concept.

DeFi

MOOC: Introduction to Decentralized Finance (DeFi).
This course introduces concepts of DeFi and TradFi. A basic understanding of cryptocurrencies (like Bitcoin), Ethereum-based smart contracts and fundamental blockchain concepts is required.
University of Nicosia

Token Engineering Fundamentals
The course teaches how to design crypto-economic systems from scratch and how to enhance token utility. Previous knowledge is required.
TE Academy

Smart Contracts

Solidity Docs
Introduction to solidity, installing solidity compiler, and others. This is more suitable for people who understand smart contracts and how they work.

https://cryptozombies.io/
An interactive platform for learning basic concepts about smart contracts.
Cleverflare

Decentralized Autonomous Organisations (DAOs)

DAOs, DACs, DAs and More: An Incomplete Terminology Guide
This paper explains concepts such as smart contracts, autonomous agents, decentralized applications, and decentralized organisations, among others. Knowledge of Blockchain and Decentralization is required.
Vitalik Buterin

DAOs - The New Coordination Frontier.
A report curated by individuals from Gitcoin and BanklessDAO. It provides in-depth information and statistics about DAOs.
Gitcoin and BanklessDAO.

The DAO Landscape
This article breaks down DAOs and explores the relationship between social and financial capital.
Coopahtroopa

Cryptography

Intro to Cryptography
This course introduces cryptography and dives into discrete probability, stream ciphers, block ciphers, and message integrity among others. Watch on YouTube. Access the accompanying free textbook on applied cryptography here.
Dan Boneh (Stanford University).

Decentralized Thoughts
The webpage consists of useful resources on Blockchains and Distributed Computing as well as cryptography.

Useful Cryptographic resources
A website on Cryptographic Engineering managed by Matthew Green, a cryptographer and professor at Johns Hopkins University. Start with his curation of valuable cryptography resources.
Matthew Green

Oracles and Data

BlockScience
A firm that seeks to integrate academic-grade research with advanced mathematical and computational engineering. Check their website and navigate to the resources or blog page.

What Is a Blockchain Oracle?
This article explains oracles, the oracle problem, types of oracles, decentralized oracles, oracle reputation, and oracle use cases.
Chainlink

Privacy and Security

Secureum Bootcamp
A three-month bootcamp focusing on smart contract security and audits.
Check their Twitter for discord and next cohort.
Secureum

Web3 Security: Attack Types and Lessons Learned
This article presents common themes and projections in security software trends to help people and businesses better guard their wallets and undertakings.
Riyaz Faizullabhoy and Matt Gleason.

ZK Whiteboard Sessions

The first few lectures are also by Prof. Dan Boneh, and they have been fun and helpful. Some interest in math is required to appreciate the course.

Scaling

An Incomplete Guide to Rollups

This article explains how channels, plasmas and rollups work. It also covers tradeoffs between two flavours of rollups and some not-yet-fully-solved challenges in rollups.
Vitalik Buterin

The Complete Guide to Rollups

A very long blog post with a whopping estimated read time of 83 minutes.
Quote: Vitalik gave us the amazing Incomplete Guide to Rollups. I present to you The Complete Guide to Rollups. Ok it’s not actually complete, but it’s a great meme so I’m stealing it. This report only analyzes the design space of rollups on Ethereum and Celestia. I strongly recommend my recent Ethereum report for background.
– Jon Charbonneau

SCRF is also organising a web 3 writing cohort. You can register here

Henry · September 24, 2022, 9:50pm

Thank you @fuchen for your input in this research paper. Here is my own take on this paper, by way of advice.
To avoid the influences caused by backdoors, I have advice for both Dapp users and smart contract developers.
For the Dapp users: Dapp users should pay attention to the transfer, minting or destroying functions of the smart contracts corresponding with the Dapp. If these functions can be called by only a specific group of accounts and may have an influence on the other accounts’ balance, they may be leveraged to cause a huge loss.

Therefore, users should be careful about putting their digital assets into Dapps with such functions.

For the smart contract developers: Backdoor threats may affect the trustworthiness of the Dapp and, if leveraged by malicious developers, they will damage the ecosystem of your applications. I think it is essential to avoid such threats during the smart contract development process.

fuchen · September 25, 2022, 3:07am

Sorry for the late reply! Thank you for the comments; the web3 resource is really helpful.

Answers to the Questions

Here are my answers to the two questions: (Really good questions by the way! )

A. I believe that the accounts that have the privilege to access these functions are generally the owners or administrators. However, recently, we found that the composition of these accounts is complex in some cases. For example, in some distributed governance contracts, this group contains many accounts that have the right to vote on a proposal. Without well-designed models, this could be really dangerous.

B. Generally, the functions are related to the token transfer process. Some examples are transfer(address from, address to, uint256 amount) and destroy(address from, uint256 amount)

fuchen · September 25, 2022, 3:12am

Excellent advice !!

Henry · September 25, 2022, 6:06am

Thanks for the compliment

Ulysses · September 25, 2022, 5:10pm

Great response @fuchen .

How about this, any thoughts on it?

fuchen · September 28, 2022, 11:41am

Sorry for the late reply.
That’s really a good idea! We will try to set up a severity scale for the detected threats.

WaterLily · October 3, 2022, 11:38am

If the protection mechanism is still not triggered by the fuzzing engine before it reaches the target, each potential issue reported by the static analysis appears to be a true positive. In some instances, the fuzzing engine might not be able to eliminate the false alarms. However, in our trial, we didn’t encounter any such circumstances. In fact, based on the evaluation results, the researchers also found that the datalog engine produces very few false alarms.

Huncho · December 18, 2022, 10:47pm

@fuchen Excellent research; I went further to shed more light on the subject.

Backdoor threats in ERC token contracts are flaws that allow unauthorized access to a contract’s operations or data. These dangers can be used by hackers or hostile actors to alter the contract or steal important information.

The usage of malicious libraries is one typical sort of backdoor danger. Libraries are pre-written pieces of code that may be imported into a contract to handle specific tasks such as math operations or data storage. However, if a library is maliciously constructed, it can provide a backdoor for hackers to access and exploit the contract.

Another risk is the usage of proxy contracts. Contracts that function as middlemen between the main contract and the user are referred to as proxy contracts. They are frequently used to upgrade or change a contract without having to redeploy it. However, if a proxy contract is not properly secured, it might serve as a backdoor for hackers to access and abuse the main contract.

A third sort of backdoor threat is the use of fallback functions. When a contract receives unexpected input, fallback functions are automatically triggered. Hackers can take advantage of these functions to get access to and manipulate the contract.

Backdoor risks in ERC token contracts can be detected by several methods. Reviewing the code for any suspicious or harmful code is one way. This can be accomplished by personally reviewing the code or by scanning the code for vulnerabilities using automated techniques.

Another approach is to employ smart contract security testing software. These tools may scan the code for vulnerabilities and generate a report on any concerns that may arise. Mythril, Oyente, and Solidity-Coverage are all popular tools.

It is also essential to have the contract reviewed by a respected third-party audit firm. These companies employ professionals who have been educated to identify and mitigate security issues in smart contracts.

Furthermore, suitable access controls and security measures must be implemented to prevent unauthorized access to the contract. Implementing safe coding methods, encryption, and multi-factor authentication are all part of this.

It is also critical to keep an eye on the contract for any questionable conduct. This can be accomplished by tracking transactions and updates to the contract using tools such as block explorers or event logs.

Overall, it is important to be proactive in detecting and mitigating backdoor threats in ERC token contracts. By regularly reviewing and testing the code, implementing proper security measures, and continuously monitoring the contract, it is possible to reduce the risk of unauthorized access and exploitation.

I hope this fine piece suits your questions and your research

Idara_Effiong · December 23, 2022, 11:01pm

There are several types of backdoor threats that may be present in ERC token contracts, including:

Unauthorized access: This type of backdoor threat allows attackers to gain unauthorized access to the contract, either by exploiting vulnerabilities in the contract code or by obtaining the private keys of the contract owner.
Unauthorized transactions: This type of backdoor threat allows attackers to execute unauthorized transactions on the contract, either by manipulating the contract’s logic or by spoofing the identity of the contract owner.
Information leakage: This type of backdoor threat allows attackers to extract sensitive information from the contract, either by directly accessing the contract’s storage or by intercepting transactions that reveal this information.

To detect these types of backdoor threats, it is important to conduct a thorough security review of the contract code and to test the contract for vulnerabilities. This may involve using tools such as static analysis tools, fuzzing tools, and manual code review. It is also important to implement strong security measures, such as secure key management and secure contract deployment practices, to help prevent these types of threats from being exploited.
