Research Summary: An Exploratory Study on Solidity Guards and Ether Exchange Constructs

TLDR

  • A total of 26,799 verified Solidity smart contracts were collected from Etherscan.
  • The study analyzes the language constructs used to call another contract or to exchange Ether.
  • It also analyzes how guards are used in this dataset to make code more secure against attacks such as reentrancy.
  • 97% of all contracts use the require guard, with an average of about 23 uses per contract; among contracts containing a call function, the share is 99%.

Core Research Question

How are different language constructs used in practice and are they still commonly used in current smart contracts deployed in the Ethereum network?

Citation

Darin Verheijke and Henrique Rocha. 2022. An Exploratory Study on Solidity Guards and Ether Exchange Constructs. In 5th Workshop on Emerging Trends in Software Engineering for Blockchain (WETSEB’22), May 16, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 8 pages.

Background

  • Solidity guards: Guards are language constructs that prevent access or revert a transaction. Solidity has three guards: require, assert, and revert.

  • Reentrancy attack: The call function transfers execution control to the called contract, which makes it vulnerable. A reentrancy attack exploits this transfer of control by making recursive calls back into the original contract, so that parts of its code execute repeatedly before the first invocation has finished.
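
To make the guard behavior described above concrete, here is a minimal Python sketch. It is a toy model, not EVM code: the names Revert, require, ToyContract, and run_transaction are hypothetical and exist only for this illustration. The sketch assumes that a failed guard undoes every state change made during the same transaction.

```python
import copy


class Revert(Exception):
    """Stands in for a reverted transaction in this toy model."""


def require(condition, message=""):
    # Modeled on Solidity's require: raise when the condition is not met.
    if not condition:
        raise Revert(message)


class ToyContract:
    """Hypothetical contract with a single balance table."""

    def __init__(self):
        self.balances = {"alice": 10}

    def withdraw(self, who, amount):
        # The state is changed first, then the guard checks the result.
        self.balances[who] -= amount
        require(self.balances[who] >= 0, "insufficient balance")


def run_transaction(contract, fn, *args):
    """Run fn as one transaction; restore the old state if a guard fires."""
    snapshot = copy.deepcopy(contract.__dict__)
    try:
        fn(*args)
        return True
    except Revert:
        contract.__dict__.clear()
        contract.__dict__.update(snapshot)
        return False


contract = ToyContract()
run_transaction(contract, contract.withdraw, "alice", 4)    # succeeds
run_transaction(contract, contract.withdraw, "alice", 100)  # guard fires, state restored
```

In this model the second withdrawal leaves the balance exactly as it was after the first one, which is the "revert a transaction" behavior the guards provide.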

Summary

  • A common functionality of smart contracts is the ability to call other contracts on the same blockchain platform. This has to be done with caution: every call transfers execution control to the called contract, so an untrusted contract can introduce errors, execute malicious code, and exploit vulnerabilities. One of these dangers is the reentrancy attack.

  • The reentrancy attack is best known from the DAO attack of June 2016, in which around 3.6 million Ether was stolen.

  • Solidity, one of the main languages for coding smart contracts on the Ethereum platform, offers three language constructs to exchange Ether: call, send, and transfer.

  • The call function is a way to send a message to a contract as well as a way to send Ether to another address. The function transfers execution control to the called contract and the caller can forward any amount of gas. This does potentially introduce vulnerabilities such as reentrancy.

  • Two safe-by-design methods, transfer and send, were introduced to counteract these vulnerabilities. Transfer also transfers execution control to the called contract but has a gas limit to prevent abuse. Send is a lower-level implementation of transfer; the major difference is that send returns false if it fails, delegating the error handling to the developer.

  • Initially, call was replaced by the safer functions transfer and send, but developers have switched back to call since the introduction of EIP 1884. Other precautions are taken instead, such as safe code patterns and guards.

  • Guards are language constructs that prevent access or revert a transaction. Require and assert check a condition and raise an exception if it is not met. Assert is used to check for internal errors and bugs, while require is used to validate inputs and other conditions. Revert is similar to “throw new Exception”: it raises an exception and refunds the remaining gas.

  • A dataset of 26,799 verified smart contracts was collected from Etherscan and then analyzed.

  • First, the lines of code of the contracts are analyzed, followed by the number of Ether exchange methods and guards used in the contracts.

  • Next, the contracts are analyzed per version, and then per function. We first look at contracts containing a call function, as this is the least safe function, then at contracts containing at least one transfer function, and finally at contracts containing at least one send function, the smallest part of the dataset (approximately 2%).

  • The analysis aims to determine whether contracts have different characteristics depending on the functions and guards they contain, especially contracts containing a call function, as call is considered the least safe function.
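
The reentrancy risk of call and the "safe code patterns" mentioned above can be illustrated with a small simulation. This is a hedged sketch in Python, not Solidity: Bank, Attacker, Honest, and run are hypothetical names, and the `safe` flag switches between updating the balance after the external interaction (vulnerable) and before it (the ordering used by the checks-effects-interactions pattern).

```python
class Bank:
    """Toy stand-in for a contract that holds deposits (not real EVM code)."""

    def __init__(self, safe):
        self.safe = safe
        self.balances = {}
        self.funds = 0

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.funds += amount

    def withdraw(self, who):
        amount = self.balances.get(who, 0)
        # Check: a require-style guard on the caller's balance and the funds.
        if amount == 0 or self.funds < amount:
            return
        if self.safe:
            # Effect before interaction: zero the balance first.
            self.balances[who] = 0
        # Interaction: hands control to the receiver, like a low-level call.
        self.funds -= amount
        who.receive(self, amount)
        if not self.safe:
            # Effect after interaction: the receiver has already re-entered.
            self.balances[who] = 0


class Honest:
    def receive(self, bank, amount):
        pass  # accepts the payment and does nothing else


class Attacker:
    def __init__(self):
        self.received = 0

    def receive(self, bank, amount):
        self.received += amount
        bank.withdraw(self)  # re-enter before the first withdraw finishes


def run(safe):
    bank, honest, attacker = Bank(safe), Honest(), Attacker()
    bank.deposit(honest, 9)
    bank.deposit(attacker, 1)
    bank.withdraw(attacker)
    return attacker.received, bank.funds
```

With the vulnerable ordering, `run(False)` drains all 10 units even though the attacker deposited only 1. With the safe ordering, `run(True)` pays out the attacker's own deposit and leaves the other 9 units in place.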

Method

  • Over a period of six months (2021-07-07 to 2022-01-06), verified smart contracts were manually collected from Etherscan. Repeated collection was necessary because Etherscan does not give access to a complete dataset but only to an open-source database of the latest 5,000 verified smart contracts.

  • After collection, the following pre-processing steps were applied:

    • All duplicated contracts were removed.
    • All contracts not written in Solidity were removed.
    • All contracts that could not be processed using cloc were removed.

    The resulting dataset contains a total of 26,799 unique verified Solidity smart contracts and is publicly available.

  • Next, the source code of each contract was retrieved using the Etherscan API, and the lines of code of each contract were counted using the cloc tool.

  • Apart from the LoC statistic, we went through every contract and counted the number of call, transfer, and send functions as well as the number of guards. For each of these constructs, we show how many contracts contain it at least once, its overall count, and its average and median number of uses across all contracts.

  • Following this, similar statistics are shown for contracts per version and for contracts containing at least one call (or transfer or send) function.
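
The counting step described above could be approximated as follows. This is only a sketch under stated assumptions, not the authors' tooling: the regular expressions, `count_constructs`, and `summarize` are hypothetical, comments are stripped naively, and the `.transfer(` pattern cannot tell Ether transfers apart from token transfers that use the same method name.

```python
import re
from statistics import mean, median

# Naive comment stripping; it would also cut "//" inside string literals.
COMMENT = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)

# Assumed textual patterns for the constructs discussed in the study.
PATTERNS = {
    "call": re.compile(r"\.call\s*[({.]"),
    "send": re.compile(r"\.send\s*\("),
    "transfer": re.compile(r"\.transfer\s*\("),
    "require": re.compile(r"\brequire\s*\("),
    "assert": re.compile(r"\bassert\s*\("),
    "revert": re.compile(r"\brevert\b\s*[A-Za-z_(]"),
}


def count_constructs(source):
    """Count each construct in one contract's source code."""
    code = COMMENT.sub("", source)
    return {name: len(p.findall(code)) for name, p in PATTERNS.items()}


def summarize(per_contract, name):
    """Aggregate one construct over a list of per-contract counts."""
    values = [counts[name] for counts in per_contract]
    with_one = sum(1 for v in values if v > 0)
    return {
        "contracts": with_one,             # contracts with at least one use
        "share": with_one / len(values),   # fraction of the dataset
        "total": sum(values),              # overall count
        "average": mean(values),
        "median": median(values),
    }


# Hypothetical sample contract, used only to exercise the functions.
SAMPLE = '''
contract Wallet {
    function pay(address payable to) external {
        require(msg.sender == owner, "not owner"); // require(true)
        (bool ok, ) = to.call{value: 1 ether}("");
        if (!ok) { revert("failed"); }
        to.transfer(1);
    }
}
'''

per_contract = [count_constructs(SAMPLE), count_constructs("contract Empty {}")]
require_stats = summarize(per_contract, "require")
```

A text-based count like this is cheap to run over tens of thousands of contracts, but a parser-based approach would be needed to separate Ether exchange calls from look-alike method names.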

Results

  • Looking at just the lines of code, we notice that contracts are small, with a median of 256 LoC and an average of 356 LoC. Smart contract code tends to be smaller than software code in other domains.
  • Even though call is the least safe method for Ether exchange, it is used by 50% of the contracts in the dataset. Send and transfer, both considered safer methods, are used in 2% and 34% of all contracts respectively.
  • Looking at the different guard constructs we can see that require is used in the majority of all contracts (97%) with an average of around 23 uses of require in a contract. This usage may be to counteract the vulnerabilities of call. Revert is also used in more than half of the contracts in the dataset (51%).
  • Categorized by version, most contracts use the latest Solidity version, 0.8.x. We also notice that versions 0.6.x and 0.7.x show increased usage of call, require, and revert compared with other versions, while in versions 0.5 and 0.4 the usage of call is a lot lower.
  • Almost all contracts containing at least one call function contain a require (99%), and most of them also contain a revert (89%), which is probably an effort to counter the vulnerabilities of the call function.
  • Transfer, being the safer alternative, shows different characteristics. Contracts containing a transfer function have fewer lines of code than contracts containing a call function, possibly because the latter need extra code to protect against vulnerabilities. There is still a require in almost every contract (98%), but the number of asserts and reverts is clearly lower than in the call contracts.
  • Lastly, the smallest part of the dataset consists of the contracts containing at least one send function. Notably, this function was paired with a call function in 89% of the cases, to which we can probably attribute the increased use of both require (100%) and revert (90%) guards.

Discussion and Key Takeaways

  • While call is considered the least safe method, it is also the most popular (50% of contracts).
  • Developers actively use guards to protect their code. The most prevalent is the require guard, with an average of 23 uses per contract.

Implications and Follow-Ups

  • As future work, using vulnerability detection tools to further investigate different characteristics of the dataset would be interesting.
  • Looking at different metrics besides Lines of Code would also be of interest. Other research could also be to dive deeper into the usage of guards and safety patterns in case of a vulnerability and to see if added guards/safety patterns could have prevented these vulnerabilities.
  • Lastly, it may be interesting to look at cross-chain vulnerabilities to see if some exploits and vulnerabilities are also prevalent or if they can be avoided by using other blockchains.

Applicability

  • We recommend that smart contract developers use audited libraries of smart contracts such as OpenZeppelin and audit their contracts themselves if possible.
  • We also recommend making use of guards and safety patterns and being aware of the different vulnerabilities that can occur when using specific Ether exchange functions.

Hi @Darin_Verheijke, I am impressed by the size of the dataset you and your team put together for this research paper.

I have a few questions for you:

  1. It seems the call function is a necessary evil. If it is the least safe method for Ether exchange compared to send and transfer, why does EIP 1884 still adopt it over the other methods? Do 50% of the contracts in the dataset use it because EIP 1884 prefers it, or are there other reasons?

  2. I am just curious, why was it necessary to count the lines of code for each contract?

  3. Lastly, are verification and auditing enough to establish that a smart contract can be trusted, or are there better or additional ways to establish it?

To conclude, other Ethereum smart contract languages such as Vyper and Yul could perhaps be studied with similar methods, and the results compared with those for Solidity.

Hey @Ulysses, thanks for the questions.

  1. Financially, it is more attractive to use the call function, because after EIP 1884 the gas cost of both transfer and send increased. Instead, developers rely on other ways to secure the call function, such as guards and safety patterns like the checks-effects-interactions pattern.

  2. It wasn’t necessary, but it was one of the statistics we chose to look at. More specifically, we noticed an increase in lines of code when a call function was used. Together with the increase of guards in those contracts, this could indicate that developers are actively trying to protect contracts containing a call function with guards and safety patterns (thus increasing the LoC of the contract).

  3. In some additional research that we’ve done, we used Slither (https://doi.org/10.48550/arXiv.1908.09878), a static analysis framework that provides different information about Ethereum smart contracts. It also provides vulnerability detectors, such as a reentrancy detector, which you can use to check whether your contract contains a potential reentrancy. I recommend reading their paper and looking at the Slither repository on GitHub.

Thanks for the quick and decisive responses.

So, if Ethereum finally achieves lower gas fees, nothing else will stop developers from choosing the send and transfer functions over the call function, because I believe that using guards is extra programming work for them.

Hey @Darin_Verheijke thank you for this insightful summary.

It’s interesting to learn more about Solidity guards through your summary, but I’m curious to know: what are the main differences between Solidity and other programming languages like Python, Java, or C++?

Thank you @Darin_Verheijke for this insightful summary. Based on my understanding of this paper, Solidity is an object-oriented, high-level language for implementing smart contracts. Smart contracts are programs which govern the behaviour of accounts within the Ethereum state. Solidity is a curly-bracket language designed to target the Ethereum Virtual Machine (EVM) and is influenced by C++, Python, and JavaScript. With Solidity one can create contracts for uses such as voting, crowdfunding, blind auctions, and multi-signature wallets.

The growing number of smart contracts being deployed on the Ethereum blockchain platform has attracted the attention of media outlets, industries, and researchers.

I think the objective of this paper is not to build a theory that applies to all smart contracts on all blockchain platforms, but rather to make developers and researchers aware of key characteristics of smart contracts that are frequently used in Ethereum.

Recommendation

  • Restrict the amount of Ether (or other tokens) that can be stored in a smart contract. If your source code, the compiler or the platform has a bug, these funds may be lost. If you want to limit your loss, limit the amount of Ether.
  • In principle, a contract vulnerability is a programming error that enables an attacker to use a contract in a way that was not intended by the developer. I think that in order to detect vulnerabilities that do not fall into common types, developers must specify the intended behavior of a contract.

Hello @Idara_Effiong. Since you haven’t gotten a reply to your question for a while now, I thought I should provide some answers. I hope it helps.

Python, Java, and C++ are general-purpose programming languages.

Solidity, however, is a programming language designed specifically for developing smart contracts on the Ethereum blockchain, which makes it the most used programming language for smart contract development.

Apart from that, there are not many differences between Java, Python, C++, and Solidity. They can all be used for blockchain programming.

Aside from blockchain development, Java, Python, and C++ can be used for web development, desktop applications, and other related fields.

Hello @Darin_Verheijke, well done. I just wanted to share my understanding of this paper. Solidity is a relatively new programming language developed for Ethereum, the second-largest cryptocurrency by market capitalization.
What are the features of Solidity?

  1. It’s used to create smart contracts that implement business logic and generate a chain of transaction records in the blockchain system.

  2. It acts as a tool for creating machine-level code and compiling it on the Ethereum Virtual Machine (EVM).

  3. It has a lot of similarities with C and C++ and is pretty simple to learn and understand. For example, a “main” in C is equivalent to a “contract” in Solidity.

I feel that there is a high risk and a high cost of errors in Solidity code.

I think Solidity’s selfdestruct does two things.

  1. It renders the contract useless, effectively deleting the bytecode at that address.

  2. It sends all the contract’s funds to a target address.

@Yeoriton56
Thank you very much for your elaborate response. This has been helpful in addressing my question. I plan to start learning the Solidity language.

Hey, you mentioned the checks-effects-interactions pattern in point 1. Is there any way to detect these patterns in bytecode as well? The bytecode of call, transfer, and send is the same, so how can we differentiate between them at the bytecode level?

Hello @Darin_Verheijke Your work is interesting, and I must appreciate your efforts in making this insightful article.

In summary, different language constructs are used in practice to develop smart contracts deployed on the Ethereum network. These language constructs are used to specify the structure of a contract and are required for it to interact securely with other contracts and users.

The Ethereum Virtual Machine (EVM) is a stack-based virtual machine designed to run smart contracts. Solidity and Vyper are two languages that are often used to write Ethereum smart contracts.

Solidity is a high-level programming language influenced by JavaScript that was created to allow developers to swiftly create smart contracts that can be deployed on the Ethereum network. It is a statically typed language, which means variables must be declared with a type before they may be used.

This improves the security of Solidity contracts by preventing unexpected data-type behavior. Solidity has also been designed to be simple to read and understand, making it excellent for novices to the Ethereum network.
