Research Summary: An Exploratory Study on Solidity Guards and Ether Exchange Constructs

TLDR

  • A total of 26,799 verified Solidity smart contracts were collected from Etherscan.
  • An analysis is made on different language constructs used in calling another contract or exchanging ether.
  • The usage of guards to make code more secure against attacks such as reentrancy is also analyzed in this dataset.
  • 97% of all contracts use the require guard, with an average of around 23 uses per contract. Among contracts containing a call function, 99% use require.

Core Research Question

How are different language constructs used in practice and are they still commonly used in current smart contracts deployed in the Ethereum network?

Citation

Darin Verheijke and Henrique Rocha. 2022. An Exploratory Study on Solidity Guards and Ether Exchange Constructs. In 5th Workshop on Emerging Trends in Software Engineering for Blockchain (WETSEB’22), May 16, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3528226.3528372

Background

  • Solidity Guards: Guards are language constructs to prevent access or revert a transaction. In Solidity, there are three guards, namely Require, Assert, and Revert.

  • Reentrancy attack: The call function has some vulnerabilities. One of these is the reentrancy attack, which takes advantage of the transfer of execution control by making recursive calls back to the original contract, repeating executions and creating new transactions.

Summary

  • A common functionality of smart contracts is making calls to other contracts on the same blockchain platform. This has to be done with caution: every call transfers execution control to the called contract, so an untrusted contract can introduce errors and risks by executing malicious code and exploiting vulnerabilities. One of these dangers is the reentrancy attack.

  • The reentrancy attack is best known from the DAO attack of June 2016, in which around 3.6 million Ether was stolen.

  • Solidity, one of the main languages for coding smart contracts on the Ethereum platform, offers different language constructs to exchange Ether, namely call, send, and transfer.

  • The call function is a way to send a message to a contract as well as a way to send Ether to another address. The function transfers execution control to the called contract, and the caller can forward any amount of gas. This can introduce vulnerabilities such as reentrancy.

  • Two safe-by-design methods, transfer and send, were introduced to counteract these vulnerabilities. Transfer also transfers execution control to the called contract but has a gas limit to prevent abuse. Send is a lower-level counterpart of transfer; the major difference is that send returns false if it fails, delegating the error handling to the developer.

  • Initially, call was replaced by the safer functions transfer and send, but there has been a switch back to the call function since the introduction of EIP 1884. Developers now take other precautions instead, such as using safe code patterns and guards.

  • Guards are language constructs to prevent access or revert a transaction. Require and Assert check a condition and raise an exception if that condition is not met. Assert is used as a check for internal errors and bugs, while require is used to check conditions such as inputs or contract state. Revert is similar to a “throw new Exception”: it raises an exception and refunds the remaining gas.

  • A dataset of 26,799 verified smart contracts was collected from Etherscan and analyzed.

  • First, the lines of code of the contracts are analyzed, followed by the number of Ether exchange methods and guards used in the contracts.

  • Next, the contracts are analyzed per version, and then per function. We first look at contracts containing at least one call function, as this is the least safe function, then at contracts containing at least one transfer function, and finally at contracts containing at least one send function, the smallest part of the dataset (approximately 2%).

  • The analysis aims to see whether contracts have different characteristics depending on the functions and guards they use, especially contracts containing a call function, as call is considered the least safe function.
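
To make the reentrancy risk and the role of guards more concrete, here is a toy model in Python rather than Solidity. It is only a sketch under stated assumptions: the names Bank, Attacker, Reverted, require, and run are hypothetical and do not come from the paper, and a Python exception stands in for a reverted call. It suggests why checking a condition alone may not be enough when control is handed to the recipient before the balance is updated, and how updating state first (the checks-effects-interactions pattern) lets the guard stop the nested call.

```python
class Reverted(Exception):
    """Stands in for a reverted transaction in this toy model."""


def require(condition, message):
    # Toy analogue of Solidity's require guard: abort if the condition fails.
    if not condition:
        raise Reverted(message)


class Bank:
    """Hypothetical contract holding Ether for its depositors."""

    def __init__(self, effects_first):
        self.effects_first = effects_first
        self.balances = {}
        self.vault = 0  # total Ether held by the contract

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.vault += amount

    def withdraw(self, who):
        amount = self.balances.get(who, 0)
        require(amount > 0, "nothing to withdraw")       # check
        require(self.vault >= amount, "vault is empty")  # check
        if self.effects_first:
            self.balances[who] = 0     # effect: update state first
            self.vault -= amount
            who.receive(self, amount)  # interaction: hand over control last
        else:
            self.vault -= amount
            who.receive(self, amount)  # control leaves before the balance is cleared
            self.balances[who] = 0


class Attacker:
    """Recipient whose receive hook calls straight back into withdraw."""

    def __init__(self):
        self.stolen = 0

    def receive(self, bank, amount):
        self.stolen += amount
        try:
            bank.withdraw(self)  # re-enter while the outer call is still running
        except Reverted:
            pass  # a guard stopped the nested call; keep what was received


def run(effects_first):
    bank, attacker = Bank(effects_first), Attacker()
    bank.deposit("honest user", 90)
    bank.deposit(attacker, 10)
    bank.withdraw(attacker)
    return attacker.stolen, bank.vault


print(run(effects_first=False))  # guards alone: the vault is drained
print(run(effects_first=True))   # guards plus effects-first ordering
```

In the first run the attacker deposits 10 and walks away with all 100, because the balance guard still sees the old balance on every nested call. In the second run the same guard rejects the nested call, so only the attacker's own 10 leaves the vault.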

Method

  • Over a period of six months (2021-07-07 to 2022-01-06), verified smart contracts were manually collected from Etherscan. Etherscan does not give access to a complete dataset; it only offers an open-source database of the latest 5,000 smart contracts that were verified.

  • After this collection, the following pre-processing steps were applied:

    • All duplicated contracts were removed.
    • All contracts not written in Solidity were removed.
    • All contracts that could not be processed using cloc were removed.

    The dataset now contains a total of 26,799 unique verified Solidity smart contracts and is publicly available.

  • Next, the source code for each contract was retrieved using the Etherscan API, and the lines of code of each contract were counted using the cloc tool.

  • Apart from the LoC statistic, we went through every contract and counted the number of call, transfer, and send functions as well as the number of guards in each contract. For each of these methods, we show how many contracts have at least one of these methods, the overall count of the method, and the average and median number in all contracts.

  • Following this, similar statistics are reported for contracts per version and for contracts containing at least one call (or transfer or send) function.
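
The counting step can be sketched in Python as follows. This is not the authors' tooling, which the summary does not describe; it is a minimal regex heuristic, and the names PATTERNS, COMMENTS, count_constructs, summarize, WALLET, and PAYER are assumptions made for illustration. The two sample contracts are made up as well.

```python
import re
from statistics import mean, median

# Simplified, hypothetical patterns. A regex pass like this is only a heuristic:
# for example, ".transfer(" also matches ERC-20 token transfers, and "//" inside
# a string literal would be mistaken for a comment.
PATTERNS = {
    "call": re.compile(r"\.call\b"),
    "transfer": re.compile(r"\.transfer\s*\("),
    "send": re.compile(r"\.send\s*\("),
    "require": re.compile(r"\brequire\s*\("),
    "assert": re.compile(r"\bassert\s*\("),
    "revert": re.compile(r"\brevert\b\s*[(\w]"),
}
COMMENTS = re.compile(r"//.*?$|/\*.*?\*/", re.DOTALL | re.MULTILINE)


def count_constructs(source):
    """Count Ether exchange functions and guards in one contract's source."""
    code = COMMENTS.sub("", source)  # ignore constructs mentioned in comments
    return {name: len(pattern.findall(code)) for name, pattern in PATTERNS.items()}


def summarize(per_contract, name):
    """Contracts with at least one use, overall count, mean, and median."""
    values = [counts[name] for counts in per_contract]
    return {
        "contracts_with": sum(1 for v in values if v > 0),
        "total": sum(values),
        "mean": mean(values),
        "median": median(values),
    }


# Two made-up contracts, used only to exercise the functions above.
WALLET = '''
contract Wallet {
    mapping(address => uint) balances;
    function withdraw() public {
        uint amount = balances[msg.sender];
        require(amount > 0, "empty"); // require( in a comment is ignored
        balances[msg.sender] = 0;
        (bool ok, ) = msg.sender.call{value: amount}("");
        if (!ok) { revert("failed"); }
    }
}
'''
PAYER = "contract Payer { function pay(address payable a) public { a.transfer(1); a.transfer(2); assert(true); } }"

per_contract = [count_constructs(WALLET), count_constructs(PAYER)]
for construct in PATTERNS:
    print(construct, summarize(per_contract, construct))
```

A parser-based count would be more reliable than regular expressions, since it could tell an Ether transfer from a token transfer; the sketch only shows the shape of the per-contract counts and the summary statistics.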

Results

  • Looking at just the lines of code, we notice that the contracts are small, with a median of 256 LoC and an average of 356 LoC. Smart contract code tends to be smaller than software code in other domains.
  • Even though call is the least safe method for Ether exchange, it is used by 50% of the contracts in the dataset. Send and transfer, both considered safer methods, are used in 2% and 34% of all contracts respectively.
  • Looking at the different guard constructs we can see that require is used in the majority of all contracts (97%) with an average of around 23 uses of require in a contract. This usage may be to counteract the vulnerabilities of call. Revert is also used in more than half of the contracts in the dataset (51%).
  • If we look at the contracts categorized by version, most contracts are from the latest Solidity version, 0.8.x. The next thing we notice is that versions 0.6.x and 0.7.x show an increased usage of call, require, and revert in comparison with other versions, while in versions 0.5.x and 0.4.x the usage of call is a lot lower.
  • Almost all contracts containing at least one call function contain a require (99%), and most of them also contain a revert (89%), which is probably meant to counter the vulnerabilities of the call function.
  • Transfer, being the safer alternative, has some different characteristics. Contracts containing a transfer function have fewer LoC than contracts containing a call function, possibly because the latter need extra code to protect against vulnerabilities. There is still a require in almost every contract (98%), but the number of asserts and reverts is clearly lower than in the call contracts.
  • Lastly, the smallest part of the dataset consists of the contracts containing at least one send function. Noticeably, this function was paired with a call function in 89% of the cases, to which we can probably attribute the increased use of both require (100%) and revert (90%) guards.
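
Percentages such as "contracts with a call that also contain a require" are conditional shares over per-contract counts. A minimal Python sketch of that calculation follows; the function name share_with and the four toy contracts are assumptions for illustration and are not data from the study.

```python
def share_with(per_contract, base, other):
    """Percentage of contracts using `base` that also use `other`."""
    subset = [c for c in per_contract if c.get(base, 0) > 0]
    if not subset:
        return 0.0  # no contract uses `base`, so there is nothing to compare
    hits = sum(1 for c in subset if c.get(other, 0) > 0)
    return 100.0 * hits / len(subset)


# Made-up per-contract counts, only to show how the function is used.
toy = [
    {"call": 1, "require": 3, "revert": 1},
    {"call": 2, "require": 1},
    {"transfer": 1, "require": 2},
    {"send": 1, "call": 1, "require": 5, "revert": 2},
]

print(share_with(toy, "call", "require"))
print(share_with(toy, "call", "revert"))
print(share_with(toy, "send", "call"))
```

Note that the denominator is the subset of contracts using the base function, not the whole dataset, which is why these shares can differ from the dataset-wide figures.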

Discussion and Key Takeaways

  • While call is considered the least safe method, it is also the most popular (50% of contracts).
  • Guards are actively being used by developers to protect their code, the most prevalent being the require guard, with an average of 23 uses per contract.

Implications and Follow-Ups

  • As future work, using vulnerability detection tools to further investigate different characteristics of the dataset would be interesting.
  • Looking at different metrics besides Lines of Code would also be of interest. Further research could also dive deeper into the usage of guards and safety patterns around known vulnerabilities, to see whether added guards or safety patterns could have prevented them.
  • Lastly, it may be interesting to look at cross-chain vulnerabilities to see whether the same exploits and vulnerabilities are also prevalent on other blockchains or can be avoided by using them.

Applicability

  • We recommend that smart contract developers use audited libraries of smart contracts such as OpenZeppelin and audit their own contracts if possible.
  • We also recommend making use of guards and safety patterns and being aware of the different vulnerabilities that can occur when using specific Ether exchange functions.

Hi @Darin_Verheijke, I am impressed by the size of the dataset you and your team put together for this research paper.

I have a few questions for you:

  1. It seems as if the Call function is a necessary evil. If it is the least safe method for Ether exchange compared to the Send and Transfer functions, why does EIP 1884 still adopt it over the other methods? Is the preference introduced by EIP 1884 the reason that 50% of the contracts in the dataset employ it, or are there other reasons?

  2. I am just curious, why was it necessary to count the lines of code for each contract?

  3. Lastly, are verification and auditing enough to establish that a smart contract can be trusted, or are there superior or additional ways to establish it?

To conclude, maybe, other Ethereum Smart Contract languages such as Vyper, Yul, etc, can also be studied using similar methods as in Solidity, and the various results compared for optimum performance.

Hey @Ulysses, thanks for the questions.

  1. Financially, it is more interesting to use the call function, as the gas cost of both transfer and send increased after EIP 1884. Instead, developers rely on other ways to secure the call function, such as guards and safety patterns like the checks-effects-interactions pattern.

  2. It wasn’t necessary but it was one of the statistics we chose to look at, more specifically, we noticed an increase in lines of code when a call function was used. This together with the increase of guards in those contracts could indicate that developers are actually actively trying to protect those contracts containing a call function with said guards and safety patterns (thus increasing the LoC of the contract).

  3. In some additional research that we’ve done, we used Slither (https://doi.org/10.48550/arXiv.1908.09878), a static analysis framework that provides different information about Ethereum smart contracts. It also provides vulnerability detectors, such as a reentrancy detector, which you can use to check whether your contract contains a potential reentrancy. I recommend reading their paper and looking at the Slither GitHub repository.

Thanks for the quick and decisive responses.

So, if Ethereum finally achieves lower gas fees, there is nothing else stopping developers from choosing the Send and Transfer functions over the Call function, because I believe the use of guards is extra programming work for them.

Hey @Darin_Verheijke, thank you for this insightful summary.

It’s interesting to learn more about Solidity guards through your summary, but I’m curious to know: what are the main differences between Solidity and other programming languages like Python, Java, or C++?