TLDR
- The Ethereum Smart COntRacTs Vulnerability Detection (ESCORT) tool detects multiple vulnerability types and can be quickly updated to defend against new vulnerabilities.
- Its core innovation is a divide-and-conquer approach that splits vulnerability detection into two tasks: learning general features of smart contracts, and identifying specific vulnerabilities.
- It achieves an average F1 score of 95% and can check a contract for 8 vulnerabilities in parallel within 0.02 seconds.
Core Research Question
How can deep learning models detect multiple vulnerabilities while being updated to detect new threats as well?
Citation
Oliver Lutz, Huili Chen, Hossein Fereidooni, Christoph Sendner, Alexandra Dmitrienko, Ahmad-Reza Sadeghi, and Farinaz Koushanfar. 2021. ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning. arXiv preprint arXiv:2103.12607 (2021).
Background
- Smart Contracts: Computer programs that execute agreements. They are written in high-level languages such as Solidity, compiled to bytecode, and executed inside the Ethereum Virtual Machine (EVM).
- Bytecode Representation: The compiled form of a smart contract that the EVM executes. Each EVM operation maps one-to-one to a bytecode value, which makes it possible to analyze the flow of a contract at the bytecode level.
- Callstack Depth: A class of vulnerability for ESCORT to detect. The attacker exploits the EVM's call stack depth limit of 1024 to cause an error when a function is called.
- Reentrancy: A class of vulnerability for ESCORT to detect. An attacker calls the contract’s function recursively, draining the Ether in the contract.
- Multiple Sends: A class of vulnerability for ESCORT to detect. Denial of service (DoS) occurs when a transaction reaches its spendable gas limitation.
- DoS (Unbounded Operation): A class of vulnerability for ESCORT to detect. A DoS attack that triggers execution cost limits of a smart contract by an external caller.
- Accessible selfdestruct: A class of vulnerability for ESCORT to detect. A programming error that leads to the termination of a contract, sending the remaining funds to a predefined address.
- Tainted selfdestruct: A class of vulnerability for ESCORT to detect. An extension of accessible selfdestruct, but the attacker can set the address to send the remaining balances to.
- Money concurrency: A class of vulnerability for ESCORT to detect. Also known as Transaction Ordering Dependence (TOD), this vulnerability is caused by the miner’s ability to decide what transactions to execute, thus changing the order of transactions, opening the door to potential attacks.
- Assert violation: A class of vulnerability for ESCORT to detect. A programming error that leads to a constant error state in the smart contract.
- Multi-Output-Layer Deep Neural Network (MOL DNN): A deep neural network with several output layers, so that a single prediction produces multiple output values.
- Feature Extraction: The process of extracting feature representation from data to serve purposes like abnormality detection. This usually removes the need for a larger or more evenly distributed dataset.
- Transfer Learning: Training a model on one task and then reusing what it has learned for another use case, instead of starting from scratch.
- Precision and Recall: Precision is defined by \frac{\text{True Positives}}{\text{True Positives + False Positives}}, while recall is defined by \frac{\text{True Positives}}{\text{True Positives + False Negatives}}.
- F1 Score: Defined by the harmonic mean of precision and recall \frac{2\times Recall\times Precision}{Recall + Precision}.
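The two definitions above can be checked with a few lines of Python. This is a minimal sketch: the function name and the counts are made up for illustration and are not taken from the paper.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple:
    """Return (precision, recall, F1) from raw confusion counts."""
    # Guard against empty denominators, e.g. a class with no predictions.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * recall * precision / (recall + precision)
    return precision, recall, f1


# Hypothetical counts for a single vulnerability class.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.3f}")
```

With these counts precision is 0.90 and recall is 0.75, and the F1 score (about 0.818) sits closer to the lower of the two, which is why it is a stricter summary than a plain average.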
Summary
- Consider a malicious party that obtains its knowledge from the public data structure of a blockchain and can freely upload contracts to the Ethereum system. The attacker exploits one or more of the eight vulnerability classes described in the Background section.
- ESCORT helps the “defender”, that is, an Ethereum designer or end user, to ensure their program is not exploitable by malicious adversaries, either during development or before sending transactions.
- In addition, ESCORT provides a single model that can be extended to identify novel vulnerabilities. Building such a tool poses several challenges.
- Collecting a large enough dataset is difficult because not enough smart contracts are open-sourced. Although bytecodes are publicly available, they are too long to process under a reasonable memory size.
- Acquiring the desired sample size for each vulnerability is challenging because only a small portion of smart contracts fall into any given class, and deep neural network training tends to bias towards the majority class.
- Extracting feature representations is challenging because traditional software testing tools are not domain-specific enough, yet finding features manually is unrealistic.
- Identifying multiple vulnerabilities with a single model is difficult because different vulnerabilities exploit distinct loopholes in a contract.
- Empowering the model to detect unknown vulnerabilities while preserving its knowledge of existing ones was not fully considered in previous works, yet it is very important for avoiding the high cost of training a new model from scratch or fine-tuning a pre-trained model on a large dataset.
- To address these problems, the authors built a toolchain called ContractScraper for sourcing and processing data.
- They also proposed a divide-and-conquer approach. Instead of directly identifying vulnerabilities, they split the task into learning the semantic and syntactic information of smart contracts, and predicting the existence of different types of vulnerabilities.
- For the experiment, ESCORT was first trained to detect six vulnerabilities, and then extended to learn the remaining two.
Method
- The authors chose to work with bytecode-level data, and dealt with memory issues by reducing the input data length.
- They built ContractScraper, a toolchain to obtain bytecode files of smart contracts from the Ethereum platform, label them, and store the results in a database.
- Bytecode acquisition starts with downloading around 1.2 million smart contracts from the first 5 million Ethereum blockchain blocks.
- Raw data consists of hexadecimal digits that represent particular operation sequences and parameters. The data is then cleaned and processed to reduce input size and overcome memory constraints.
- To label the data, they use vulnerability detection tools built into ContractScraper. Each of these tools is specialized for detecting a specific set of vulnerability types.
- 15,000 samples were selected for each vulnerability class, plus one class of contracts with no vulnerabilities. Note that the total dataset is smaller than 15,000 × (8 + 1) samples because one smart contract can contain more than one type of vulnerability.
- A DNN was trained to learn the bytecode features of general contracts. The defender also specifies system parameters, including the vulnerabilities to detect.
- Inside the DNN, a feature extractor was created to learn the semantic and syntactic information from the contract’s bytecode. It is composed of a stack of layers.
- These layers are meant to address the accuracy problems that result from working with long hexadecimal bytecodes. The input data is first processed via a linear mapping that converts it into fractional numbers before it is learned, which improves efficiency.
- The feature extractor is then extended with multiple vulnerability branches. Each branch is a stack of layers designed to learn a specific type of vulnerability.
- Each branch outputs a probability for the specific type of vulnerability it aims to detect.
- When a new vulnerability is identified, the defender constructs a new dataset and adds a new vulnerability branch to the existing model.
- Existing branches are left intact, ensuring that old knowledge is preserved.
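The flow described in this section (shorten and rescale the bytecode, run it through a shared feature extractor, read one probability per branch, and later attach a new branch) can be sketched in plain Python. This is a toy illustration under loud assumptions, not the authors' architecture: the input length, feature width, single dense layer, random weights, branch names, and all function names are hypothetical stand-ins for the trained layer stacks in the paper.

```python
import math
import random

MAX_LEN = 16   # hypothetical input length; real bytecode is far longer
FEATURES = 4   # hypothetical width of the shared feature vector


def normalize_bytecode(hex_code: str, max_len: int = MAX_LEN) -> list:
    """Truncate or zero-pad the bytecode, then map each byte linearly to [0, 1]."""
    digits = hex_code[2:] if hex_code.startswith("0x") else hex_code
    raw = bytes.fromhex(digits)[:max_len]
    padded = raw + bytes(max_len - len(raw))
    return [b / 255 for b in padded]


def make_layer(n_in: int, n_out: int, rng: random.Random) -> list:
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]


def extract_features(x: list, weights: list) -> list:
    """Shared feature extractor, reduced here to one dense layer with tanh."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def branch_probability(features: list, weights: list) -> float:
    """One vulnerability branch, reduced here to a single logistic unit."""
    z = sum(w * f for w, f in zip(weights, features))
    return 1 / (1 + math.exp(-z))


rng = random.Random(0)
extractor = make_layer(MAX_LEN, FEATURES, rng)
# Branch names are illustrative picks from the Background list.
branches = {name: make_layer(FEATURES, 1, rng)[0]
            for name in ("reentrancy", "callstack_depth")}


def predict(hex_code: str) -> dict:
    """Run the shared extractor once, then every branch on the same features."""
    features = extract_features(normalize_bytecode(hex_code), extractor)
    return {name: branch_probability(features, w) for name, w in branches.items()}


before = predict("0x6080604052")
# Extending the model: add one branch; the extractor and old branches are untouched.
branches["assert_violation"] = make_layer(FEATURES, 1, rng)[0]
after = predict("0x6080604052")
print(before)
print(after)
```

Because the extractor and the existing branch weights are never modified when the new branch is attached, the outputs for the original classes are identical before and after. In a real setting the new branch would be trained on the new dataset rather than initialized randomly.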
Results
- Classification of the first six vulnerabilities (p.12, Table 3): During the first part of training, ESCORT achieved an average F1 score above 95% on both the training and validation sets.
- Classification results after two new vulnerabilities were added (p.12, Table 4): After transfer learning, the new classes achieved F1 scores of 92% and 93%. Their training time was reduced by around 43%.
Discussion and Key Takeaways
- Classification: ESCORT achieved high F1 scores across the different classes. The divide-and-conquer approach is a likely reason.
- Transfer Learning: There was no significant drop in performance after two new classes were added. This supports ESCORT’s ability to learn new vulnerabilities without unlearning previous knowledge.
- Training Time: The decrease in training time is consistent with new branches being add-ons to an already trained feature extractor rather than models trained from scratch.
Implications and Follow-ups
- This approach differs from previous works. Instead of packaging learning and detecting vulnerabilities in a single step, ESCORT shows that dividing the task into two can make for a more effective tool.
- The problem with learning new vulnerabilities while preserving old ones is resolved by having independent branches to address separate classes.
- As a result, model drift, the bias of a model towards particular classes, is easily solvable by updating the parameters of a vulnerability branch.
- Previous works predominantly didn’t distinguish between vulnerability types and just reported a security score. This made it harder to justify the F1 score, and practical implementation was harder since different vulnerabilities needed different solutions.
- Other works didn’t do a great job of representing smart contracts in a meaningful way. Previous works that used RGB color images of codes or nodes and edges of a graph representation didn’t achieve comparable results to ESCORT.
- ESCORT also relies on more accessible input data. Previous works used source code, which, as discussed, is hard to acquire, whereas bytecode is publicly available.
- The ability to process long contracts is also a breakthrough, further enhancing generalizability and applicability.
Applicability
- For developers, ESCORT has the potential to improve security by detecting risks within their code, allowing vulnerabilities to be caught before transactions are sent.
- For smart contract platforms such as Ethereum, ESCORT can provide a safer environment by running a check on applications before they are broadcast and executed in adversarial environments.
- ContractScraper is a useful tool for people to scrape contracts with and train existing models on new vulnerabilities.
- Although the vulnerabilities presented for ESCORT to train on may not be the best representation of the most important vulnerabilities in the wild, this framework would be useful for future works in smart contract vulnerability detection with machine learning.