Research Summary: Diablo: A Benchmark Suite for Blockchains

TLDR

  • We present the most extensive evaluation of blockchains to date, with decentralised applications in gaming, web services, exchanges, video sharing and mobility services.
  • We find that most modern blockchains (Algorand, Avalanche, Solana, Diem, Ethereum, Quorum) cannot meet real application demands, with some exceptions such as Red Belly Blockchain, which commits multiple blocks at once.
  • Our results are surprisingly far from some claimed performance results announced online, which confirms that a more scientific approach is needed to evaluate blockchains.

Core Research Question

The goal of our work is to help understand which blockchain proposal is best suited to a particular application.

Citation

V. Gramoli, R. Guerraoui, A. Lebedev, C. Natoli, G. Voron. Diablo: A Benchmark Suite for Blockchains. Proceedings of the 18th ACM European Conference on Computer Systems (EuroSys), 2023.

Background

  • Algorand is a proof-of-stake blockchain that elects a subset of nodes, through sortition, that can append the next block.
  • Avalanche is a blockchain offering probabilistic safety and the possibility to spawn subnets.
  • Diem, formerly known as the Libra blockchain, was initiated by Facebook.
  • Ethereum is the second largest blockchain in market capitalization.
  • Quorum is a blockchain initiated by J.P. Morgan and currently maintained by Consensys.
  • Solana is a recent blockchain that is highly optimized for special features (e.g., Intel instructions).
  • Red Belly Blockchain combines multiple proposed blocks into a superblock.

Summary

  • Although Algorand, Avalanche, Solana, Diem, Ethereum and Quorum are not yet ready to handle demanding workloads found in centralized services, our in-depth analysis identifies key factors of performance and shows that some blockchain promises are fulfilled.
  • It appears that a blockchain targeting eventual consistency scales easily to networks with many nodes. Two blockchains, Diem and Avalanche, fail in the more challenging configurations, most likely because they simply do not consider these configurations as a use case.
  • Two blockchains, namely Quorum and Diem, are the most impacted by constantly high workloads. This could be due to their leader-based BFT consensus protocol design, which is typically known to suffer from scalability limitations. The Red Belly Blockchain, which builds upon a leaderless deterministic BFT consensus protocol, was recently shown to perform extremely well under high workloads.
  • The Algorand, Avalanche and Solana blockchains, which offer probabilistic or eventually consistent guarantees, maintain a non-negligible throughput when stressed with high constant workloads.

Method

  • We designed the Diablo benchmark suite with a single primary, which coordinates the experiments and gathers results, and as many secondaries as needed to stress-test the blockchain network.
  • We implemented decentralised applications (DApps) in PyTeal v5, Move v4 and Solidity v0.7.5:
    • A Gaming DApp, which is highly demanding and executes a Dota 2 trace.
    • A Webservice DApp, which is highly contended and executes the FIFA website requests during the soccer world cup.
    • A Mobility Service DApp, which is compute-intensive and executes an Uber trace.
    • A Video Sharing DApp, which is very demanding and executes a YouTube trace.
    • An Exchange DApp, which experiences request bursts.
  • We evaluated the latency and throughput of each blockchain in different settings:
    • A Datacenter setting where few powerful machines are collocated.
    • A Testnet setting where few classic machines are collocated to reduce the costs of operating the network.
    • A Devnet setting where few classic machines are distributed over 5 continents.
    • A Community setting where hundreds of classic machines are distributed over 5 continents.
    • A Consortium setting where hundreds of modern machines are distributed over 5 continents.
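
To make the primary/secondary split concrete, here is a minimal Python sketch of how a primary might divide a per-second workload trace among its secondaries. The names `Secondary` and `split_workload` and the even-split policy are assumptions for illustration, not Diablo's actual API or scheduling strategy.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Secondary:
    """Hypothetical load generator; not Diablo's actual type."""
    name: str


def split_workload(tps_per_second: List[int],
                   secondaries: List[Secondary]) -> Dict[str, List[int]]:
    """Divide each second's transaction count as evenly as possible.

    Assumed policy: every secondary gets the same share, and any
    remainder goes to the first secondaries in the list.
    """
    if not secondaries:
        raise ValueError("at least one secondary is required")
    n = len(secondaries)
    plan: Dict[str, List[int]] = {s.name: [] for s in secondaries}
    for tps in tps_per_second:
        base, extra = divmod(tps, n)
        for i, s in enumerate(secondaries):
            # The first `extra` secondaries send one more transaction.
            plan[s.name].append(base + (1 if i < extra else 0))
    return plan


if __name__ == "__main__":
    workers = [Secondary("a"), Secondary("b"), Secondary("c")]
    print(split_workload([10, 7, 0], workers))
```

Adding secondaries lowers the load each one has to generate, which matches the "as many secondaries as needed" idea above.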

Results

  • For the Exchange DApp, which replays the Nasdaq trace and has the lowest average workload (only 168 TPS), Avalanche and Quorum commit more than 86% of the transactions, while all the other blockchains commit 47% or less. Although the fact that none of the evaluated blockchains could commit all transactions may seem quite pessimistic, note that recent experiments already demonstrated that the Red Belly Blockchain could commit all of them in the same setting.
  • For the most demanding workload, the YouTube workload, the proportion of commits is lower than 1% for Algorand, Avalanche, Solana, Diem, Ethereum and Quorum, indicating that they are not yet ready to handle demanding workloads found in centralized services.
  • For high workloads (like Dota 2), none of these blockchains maintains a throughput higher than 66 TPS, and for none of the DApps do they commit with a latency lower than 27 seconds.
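
The commit proportions, throughput and latency figures above can all be derived from per-transaction timestamps. The following Python sketch shows one plausible way to compute them. `TxRecord` and `summarize` are hypothetical names, and the metric definitions (committed transactions over experiment duration, mean submit-to-commit delay) are assumptions that may differ from those used in the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TxRecord:
    submitted_at: float            # seconds since the experiment started
    committed_at: Optional[float]  # None if the transaction never committed


@dataclass
class Summary:
    commit_ratio: float           # committed / submitted
    throughput_tps: float         # committed transactions per second
    avg_latency_s: Optional[float]  # None when nothing committed


def summarize(records: List[TxRecord], duration_s: float) -> Summary:
    """Aggregate assumed metrics over one experiment run."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    committed = [r for r in records if r.committed_at is not None]
    ratio = len(committed) / len(records) if records else 0.0
    latency = None
    if committed:
        total = sum(r.committed_at - r.submitted_at for r in committed)
        latency = total / len(committed)
    return Summary(ratio, len(committed) / duration_s, latency)


if __name__ == "__main__":
    run = [TxRecord(0.0, 2.0), TxRecord(1.0, 5.0),
           TxRecord(2.0, None), TxRecord(3.0, None)]
    print(summarize(run, duration_s=10.0))
```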

Discussion and Key Takeaways

  • Although Algorand, Avalanche, Solana, Diem, Ethereum and Quorum are not yet ready to handle demanding workloads found in centralized services, other blockchains like Red Belly Blockchain can handle some demanding workloads.
  • Some of the supported programming languages are too low-level to be written easily without the help of a higher-level programming language.
  • The programming languages can have limited support (like for floating point functions).
  • Real DApps may not even execute successfully as some of their functions would consume more than the maximum allowed computational steps expressed in gas units.
  • The blockchains based on the Go Ethereum (or geth) virtual machine seem to handle generic programs the best.
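
The gas limitation mentioned above can be illustrated with a toy metering model: each computational step is charged against a budget, and execution aborts once the budget is exhausted. This is a simplified Python sketch. `GasMeter`, `OutOfGas` and the one-unit-per-iteration cost are illustrative assumptions and do not reflect the gas schedule of any particular blockchain.

```python
class OutOfGas(Exception):
    """Raised when a computation exceeds its gas budget."""


class GasMeter:
    """Toy gas meter (hypothetical; real virtual machines price each opcode)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def charge(self, amount: int) -> None:
        if self.used + amount > self.limit:
            raise OutOfGas(f"needs more than the {self.limit} gas allowed")
        self.used += amount


def sum_of_squares(n: int, meter: GasMeter, cost_per_step: int = 1) -> int:
    """A compute-heavy function whose cost grows linearly with n."""
    total = 0
    for i in range(n):
        meter.charge(cost_per_step)  # pay before doing the work
        total += i * i
    return total


if __name__ == "__main__":
    print(sum_of_squares(10, GasMeter(limit=100)))  # fits in the budget
    try:
        sum_of_squares(10, GasMeter(limit=5))       # budget too small
    except OutOfGas as err:
        print("aborted:", err)
```

A function that works for small inputs can therefore fail outright on a realistic trace, which is one way a real DApp might not execute successfully.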

Implications and Follow-Ups

  • A more scientific methodology was needed to evaluate blockchain technologies and provide the details of experimental settings.

Applicability

  • Diablo is open source and available at https://diablobench.github.io and its artifact has been considered available and functional by the Artifact Evaluation Committee of EuroSys 2023.
  • We believe that Diablo will be instrumental in helping improve the current blockchain designs and evaluate blockchains in a more transparent manner.

This is an amazing summary @Vincent. Thank you for making it so informative in a concise way.

Using the Diablo benchmark suite for blockchains, you showed that Red Belly has better applicability and, unlike other blockchains, performs extremely well under high workloads because of its scalability.

However, since some DApps may not have executed successfully and some of the supported programming languages have limited support, don’t you think this may have affected the results of the other blockchains?

Not really, let me explain why. We found workarounds to implement smart contracts in the programming languages supported by the blockchains (with their respective limitations). Hence, our evaluation reports the performance a developer can expect from each blockchain when using it to implement their DApp.

Thank you very much @vincent for this wonderful summary.

Could you please shed more light on those key factors of performance?

Nice summary @vincent
Which blockchain technology, in your opinion, offers the most promise, and why?

The best performing blockchain, among all the blockchains we tested so far, remains Red Belly Blockchain [IEEE Security and Privacy 2021]. In its newer version, it can execute DApps written in Solidity faster than what we experienced with other blockchains. In particular, this is the only blockchain we know of that commits all transactions of the Nasdaq trace of the Diablo Exchange DApp. This is detailed in Fig. 7 of the following paper: https://arxiv.org/pdf/2207.05971.pdf

[replying to @Mansion] Sure. An important cause of performance variation is the experimental setting, which is unfortunately not well documented in general. Interestingly, the setup in which Avalanche and Solana perform best is « datacenter », with few big machines in the same availability zone, and the setup in which Algorand performs best is « testnet », with few smaller machines that are also in the same availability zone. They typically perform worse in the geo-distributed settings called « devnet », with few machines, « community », with many small machines, and « consortium », with many more powerful machines. This is described in the Stellar foundation keynote slides.
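
For readers who want to keep the five settings straight, they can be written down as data. This is only a Python sketch based on the qualitative descriptions in this thread. The `Setting` type and its field values are assumptions, and exact machine counts and instance types are deliberately left out because they are not given here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Setting:
    name: str
    scale: str             # "few" or "many" machines (exact counts not stated)
    machines: str          # "classic" or "powerful"
    geo_distributed: bool  # False means same availability zone


SETTINGS = [
    Setting("datacenter", "few", "powerful", False),
    Setting("testnet", "few", "classic", False),
    Setting("devnet", "few", "classic", True),
    Setting("community", "many", "classic", True),
    Setting("consortium", "many", "powerful", True),
]


def names(geo_distributed: bool) -> List[str]:
    """List the settings that are (or are not) geo-distributed."""
    return [s.name for s in SETTINGS if s.geo_distributed == geo_distributed]


if __name__ == "__main__":
    print("co-located:", names(False))
    print("geo-distributed:", names(True))
```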

Interesting question @Nicolasdamiens
The best uses of blockchain technology are in sectors where there are powerful, corrupt institutions that keep the sector inefficient.

The majority of them are already in use in the real world by businesses and people.

The list below is arranged from highest to lowest potential.

  1. The healthcare sector is one of the biggest, but it is also one of the least efficient and has no standards. In this case, I believe the blockchain has a lot of potential. Standardizing and decentralizing this system is simple.
    Blockchain technology has the potential to save lives and reduce the soaring cost of healthcare. (Medishares and Mediblocks).

  2. A completely decentralized internet in which ISPs are no longer required. Skycoin does this through Skywire. They will soon offer their custom-built 1Gbps antennas for $100, which have a range of 10 miles and provide high-speed internet to 7,000 people, and possibly 20,000 people with their mesh network on top. To cover the entire continent, only 2,000 antennas per European country are required, and the data is kept on Skyminers.

  3. Property ownership: Real estate, like the entertainment business, is massive, but it is also dominated by a few major players, and it is all about who you know. The blockchain can also help to democratize this. (IHT Real, Bitrent and Relex)

  4. Pension funds: This industry is perhaps the greatest of all and is incredibly opaque, causing trillions of dollars to disappear and land in the pockets of “facilitators” (Akropolis).

  5. Energy: The energy sector is massive and prone to cartelization. Decentralized purchasing of electricity removes a lot of inefficiencies and prevents fraud. (WePower, PowerLedger, etc.)

  6. Lending. Lending is the core purpose of banks, and there isn’t much of it in the cryptocurrency space yet, despite the fact that it has the potential to be one of the largest marketplaces. (Ripio, Salt, Ethlend)

  7. Passports are no longer required for security identification, as the blockchain now handles all aspects of security identification. (Civic, THEKEY)

  8. Decentralized Storage. This directly competes with Google Drive and Dropbox for users, but it also competes with all types of data servers. For instance, anything that runs on Amazon servers could be distributed across millions of devices rather than run in data centers. Siacoin, IOTA, Skycoin, Byteball, Storj, Maidsafecoin, and Bluzelle are just a few of the many rivals in this market.

  9. Dispersed Storage. This also applies to services like Dropbox: any application that uses Amazon servers, for instance, might be dispersed across millions of devices rather than run centrally in data centers. Just a handful of the numerous competitors in this industry include Siacoin, IOTA, Skycoin, Byteball, Storj, Maidsafecoin, and Bluzelle.

  10. Entertainment. As noted under property ownership, the entertainment business is massive and dominated by a few major players (Weinstein). All of this can be democratized and decentralized. (Theta, Singular, Tron)

So, this is my list of the strongest blockchain applications, ranked by potential and utility.

Interestingly, only smaller cryptocurrencies that have not yet gained considerable popularity (Mediblocs, Medishares, District, Counterpart, Civic, THEKEY) serve the top 3 industries. I believe the reason is that it will take a long time for cryptocurrencies to have an influence there, because the business is so vast, so thoroughly controlled by the affluent and powerful, and so political, whereas it will probably go much faster in the lower-rated ones (Gaming, Adult, Gambling, Ticketing).

It appears that the Top 7 have the greatest potential but will take the longest to realize, whilst the Bottom 7 have a smaller market but will have a far greater influence right now.

We currently see several of these blockchains in use, and we can anticipate some of them becoming the norm across many industries.

@Vincent thank you very much for your quick response.

@Gift82822546 thank you very much for this great answer
