Research Summary: Blockchain Oracle Design Patterns

TLDR

  • With blockchain technology and smart contracts, it is now possible to permanently record data securely and transparently without the need for a third party.
  • However, to extend the usability of blockchain technology, these systems need access to real-world data that is free from manipulation. They obtain it using blockchain oracles.
  • Through a review of 55 papers ranging from academic work to protocol whitepapers, this study reveals that although existing blockchain oracles are making efforts to serve as a reliable source for data, they are still vulnerable to attacks and lack authenticity and confidentiality.

Core Research Question

How are blockchain oracles designed to provide data outcomes to smart contracts?

Citation

Pasdar, A., Dong, Z., & Lee, Y. C. (2021). Blockchain Oracle Design Patterns. arXiv preprint arXiv:2106.09349. https://arxiv.org/pdf/2106.09349

Background

  • Oracles: Third-party services that fetch and verify external information before submitting it to smart contracts to trigger state changes in the blockchain. Also referred to as data feeds.
  • Query: A query is a request for information from a database. Query types include binary, non-binary, scalar, and categorical types.
  • Sybil Attacks: Named after Sybil, the woman with 16 different personalities, Sybil attacks occur when an attacker tries to gain control of a computer network by creating multiple identities.
  • Really Simple Syndication (RSS) Feeds: A web feed that lets readers or applications collect updates from different sources in a single place. A website’s RSS feed publishes updates to the site, such as new posts and comments.
  • Nash Equilibrium: A game-theory concept describing a state in which no player can improve their outcome by unilaterally changing strategy, so nobody has an incentive to deviate. Applied to blockchain mining or oracle voting, a Nash equilibrium exists when every miner or voter submits legitimate blocks or information because doing so earns the reward, whereas submitting malicious blocks or information would lead to a loss.
  • Verifier’s Dilemma: The Verifier’s Dilemma refers to the decision a miner must make about allocating resources between mining and verifying. Miners must weigh the security risks of not fully verifying against the profit gained by skipping verification and conserving resources.
  • Application Programming Interface (API): This is a software intermediary that enables communication between two applications.
  • Lazy equilibrium: A form of verifier’s dilemma where voters return the same answer to a question to secure profits without performing the work needed to verify correctness.
  • Freeloading: When an oracle cheats the system by obtaining and copying the response of another oracle without paying fees required to push queries on-chain.
  • Schelling Point: In game theory, a focal point that participants converge on without communicating. Vitalik Buterin proposed a Schelling Point scheme as a model for decentralised oracles, in which voters whose submitted values fall between the 25th and 75th percentiles are rewarded.
  • Information direction: The direction in which information handled by an oracle flows. It can be either outbound, transmitting data from a smart contract to the outside world, or inbound, transmitting data from the outside world into smart contracts.
  • Immediate-read oracles: These oracles deal with data used to make immediate decisions. They do not store large sets of information and keep only what is highly relevant to the use case.
  • Publish-subscribe oracles: These oracles provide continuous data upon subscription to their service, for example an oracle that provides price feeds.
  • Request-response oracles: These oracles use a client-server architecture in which the server processes a request after receiving it from the client. The oracle acts as the intermediary.
  • Micro-cheating: A problem for Schelling Point systems in which participants nudge their answers slightly in one direction to shift the median value towards a point of their choosing.
  • Multi-signature address: An address on the blockchain which is associated with more than one private key. Transactions initiated using such addresses need more than one private key to be authorised.
  • Blockchain agnostic: A platform is said to be blockchain agnostic when it allows interactions with multiple blockchains.
  • Scrapers: Scrapers import data from websites into local files or spreadsheets.
  • Authenticity Proof Mechanism: These are mechanisms used to verify the identity of a user, device, or other entity who wants to submit or vote on information in an oracle database.
  • Message Authentication Code (MAC): Powered by cryptography, MACs are short pieces of information used to authenticate a message.
  • Transport Layer Security (TLS): TLS is a cryptographic protocol which is used to add a layer of security to internet communications.
  • Grey Literature: Materials and research which have either never been published or have been published informally or non-commercially. Includes whitepapers, reports, infographics, and working papers.
  • Systematic Literature Review (SLR): A research technique which involves the critical selection, analysis, and repeated review of literature in order to answer the research questions key to a study.
  • Multivocal Literature Review (MLR): A form of Systematic Literature Review (SLR) in which grey literature is considered alongside published academic research to obtain research data.
  • Snowballing Technique: A research technique used to obtain a set of papers for a study by following the references in (backward) or citations of (forward) similar papers which share research pattern characteristics with the target study.
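
The Schelling Point and micro-cheating definitions above can be illustrated with a minimal sketch. It assumes a linearly interpolated percentile and a simple "inside the band or not" reward rule; the function names `percentile` and `schelling_round` and the sample values are hypothetical and are not taken from the paper or from any specific protocol.

```python
from statistics import median


def percentile(sorted_values, p):
    """Linearly interpolated percentile (p in 0..100) of an already sorted list."""
    if not sorted_values:
        raise ValueError("no values submitted")
    rank = (len(sorted_values) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    spread = sorted_values[upper] - sorted_values[lower]
    return sorted_values[lower] + spread * (rank - lower)


def schelling_round(submissions):
    """Return (median value, voters whose value lies in the 25th-75th percentile band)."""
    values = sorted(submissions.values())
    low, high = percentile(values, 25), percentile(values, 75)
    rewarded = {voter for voter, value in submissions.items() if low <= value <= high}
    return median(values), rewarded


# Hypothetical scalar answers from five voters; "e" is an outlier.
honest = {"a": 100.0, "b": 101.0, "c": 102.0, "d": 103.0, "e": 150.0}
print(schelling_round(honest))

# Micro-cheating: voter "c" nudges its answer up by 1 and the median follows.
nudged = dict(honest, c=103.0)
print(schelling_round(nudged))
```

In this toy round, voters b, c, and d are rewarded both times, yet the reported median moves from 102.0 to 103.0 after a single small nudge, which is the behaviour the micro-cheating entry describes.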

Summary

  • Oracles act as data feeds by responding to queries from smart contracts. They consult several sources to obtain the data and can be classified along four dimensions:
    • their source of information,
    • the direction through which they carry information,
    • whether they are centralised or decentralised,
    • and their design patterns.
  • To understand the current landscape of blockchain oracles, the authors conducted an MLR of current oracle solutions geared towards solving the problem of presenting correct data outcomes to smart contracts. Although oracles are well studied, prior work has not considered them from a technical perspective. This gap is the authors’ primary motivation for the study.
  • The authors reviewed 55 papers ranging from academic work to protocol whitepapers. These papers included practical oracle designs that display innovative techniques to solve the acquisition and transfer of data outcomes from oracles to smart contracts.
  • They found that oracle design is widely discussed. Existing design strategies aim to preserve the integrity of data obtained from external resources, and oracles are working towards providing truth back to the blockchain and smart contracts.
  • There remain a number of unaddressed research and technical questions in practice. The authors call for further research to improve oracle designs in terms of performance, fees, data security, and integrity.

Method

  • This research paper depends on insights from existing literature. To obtain this body of literature, the authors employed the MLR technique.
  • To cover the vast majority of related studies, the search keywords used were combinations of (“blockchain oracles”, “data feed”) AND/OR (“smart contracts”), (“design”), AND/OR (“pattern”).
  • Several digital libraries like IEEE Xplore, ScienceDirect, ACM Digital Library, DBLP: Computer Science Bibliography, and Google Scholar were used to obtain the set of papers which were reviewed.
  • The authors also employed the snowballing technique to identify further relevant studies.

Results

  • From analyzing the final set of literature obtained, the authors collated a set of 55 papers. The papers revealed that, with respect to monetary incentives, two categories of oracles exist: voting-based oracles and reputation-based oracles.
  • Voting-based oracle designs were classified based on query types and their vulnerabilities to Sybil attacks and verifier’s dilemma.
  • Reputation-based oracles were classified by how they retrieve data from internet sources and were analyzed based on their limitations alongside the authenticity and confidentiality of data.
  • Voting-based oracles employ voting-based strategies to aggregate and determine the validity of outcomes. Rewards are only distributed when certifier and voter data outcomes match.
  • Stake-based oracles, multisignature-based oracles, Schelling-based oracles, token-based oracles, and conventional oracles are examples of voting-based oracles.
  • Stake-based oracles generally involve making participants lay down a bond to submit, vote, or verify data outcomes on the oracle platform. The review revealed that these oracles (such as Oraichain and Razor) mostly process binary data queries and are prone to the verifier’s dilemma, but can help mitigate Sybil attacks.
  • In multisignature systems, a majority of parties must agree to a data outcome for it to be finalized as true. These oracles mostly process non-binary data queries and are partially prone to Sybil attacks and verifier’s dilemma.
  • Schelling Point systems use a median point to determine which participant is rewarded and which loses their staked funds. They mostly process scalar data query types, are not entirely immune to Sybil attacks, and are prone to micro-cheating. Examples include Maker Protocol and Oracul.
  • In token-based systems, users are required to hold or use a certain amount of the native token of the platform to participate in determining the correctness of data outcomes. These systems can process most types of data query, are immune to Sybil attacks due to monetary policy, but are partially prone to the verifier’s dilemma.
  • Conventional systems rely on a single or multiple data sources to verify data correctness and integrity. These sources are usually trusted entities whose votes are naturally reliable. They process a mix of scalar and non-binary data query types and manage Sybil attacks and the verifier’s dilemma moderately.
  • The second class of oracles is reputation-based oracles. They use reputation strategies to select the oracles or participants that supply the requested data, and authenticity proof mechanisms to manage data integrity. They include software-based proof oracles, hardware-based proof oracles, and proofless oracles.
  • Most oracles that need a secure HTTP connection to retrieve data from sources are powered by the TLS protocol. The TLS protocol is not foolproof. TLS-N and TLS Notary are solutions that propose, respectively, that parts of the data in a TLS session should be hidden and that third-party validators should be employed to validate TLS sessions.
  • Most software-based proof oracles studied are limited by their lack of resilience to data tampering. They also partially lack confidentiality of data but are largely authentic.
  • Hardware-based proof oracles are limited by their hardware requirements but largely offer authentic data with partial confidentiality. Town Crier, Chainlink, and Corda are some of the examples in this category.
  • Proofless oracles are limited by their lack of decentralization and have hardly any authenticity proof mechanism. They also offer little confidentiality of data.
  • The majority of voting-based oracles rarely employ authenticity proof mechanisms for data integrity and correctness.
  • The majority of the oracles are built on a single blockchain (Ethereum) and only a few are blockchain agnostic.
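
The stake-based voting pattern described in the results above can be sketched for a binary query. This is a minimal sketch under stated assumptions, not any reviewed protocol’s actual mechanism: the function `resolve_binary_query`, the `slash_fraction` parameter, and the rule that slashed bonds are redistributed pro rata to the winning side are all hypothetical.

```python
def resolve_binary_query(votes, slash_fraction=0.5):
    """Resolve a yes/no query by stake-weighted majority.

    votes maps voter -> (answer, bonded stake).
    Returns (outcome, stakes after rewards and slashing).
    """
    yes = sum(stake for answer, stake in votes.values() if answer)
    no = sum(stake for answer, stake in votes.values() if not answer)
    if yes == no:
        raise ValueError("tie: no stake-weighted majority")
    outcome = yes > no
    winning_stake = yes if outcome else no

    # Assumed rule: losers forfeit a fraction of their bond ...
    slashed_pool = sum(
        stake * slash_fraction
        for answer, stake in votes.values()
        if answer != outcome
    )

    new_stakes = {}
    for voter, (answer, stake) in votes.items():
        if answer == outcome:
            # ... and winners share the forfeited amount in proportion to stake.
            new_stakes[voter] = stake + slashed_pool * stake / winning_stake
        else:
            new_stakes[voter] = stake * (1 - slash_fraction)
    return outcome, new_stakes


# Hypothetical voters and bonds.
votes = {"alice": (True, 60.0), "bob": (True, 20.0), "carol": (False, 40.0)}
outcome, stakes = resolve_binary_query(votes)
print(outcome, stakes)
```

Because the payout depends only on agreeing with the majority, a voter who copies the expected majority answer earns the same reward as one who verified it. That is one way to see why the review flags stake-based designs as prone to the verifier’s dilemma.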

Discussion and Key Takeaways

  • Secure and high-performance blockchain oracles still need to be designed. Developers should take advantage of high-performance and low transaction fee blockchains for oracle design and deployment. Authenticity proof mechanisms for these oracles need to be fast and reliable as well.
  • Blockchain oracle adaptability needs to be studied further in order to pave the way for blockchain-agnostic solutions.
  • How can we identify legitimate blockchain oracles?

Implications and Follow-ups

  • An ideal oracle is still a farfetched idea. The area needs more research and development.
  • Oracles must be available at all times on a variety of networks and have confidential, reliable data.
  • Oracles are prone to security risks like hacking, instability, and malicious influence. Attaching high economic risks to breaching the integrity of oracle data helps prevent Sybil attacks. Proof of work and proof of stake systems can also be used for this purpose, provided no single entity ever controls more than 50% of the mining or staking power.
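
The point about economic risk and Sybil attacks can be made concrete with a small arithmetic sketch. It assumes two hypothetical voting rules, one vote per identity and votes weighted by bonded stake; the function name `attacker_share` and all numbers are illustrative only.

```python
def attacker_share(honest_stakes, attacker_stakes, stake_weighted):
    """Fraction of total voting power held by the attacker's identities."""
    if stake_weighted:
        attacker, honest = sum(attacker_stakes), sum(honest_stakes)
    else:
        attacker, honest = len(attacker_stakes), len(honest_stakes)
    return attacker / (attacker + honest)


# Ten honest voters bond 100 each; one attacker splits a budget of 100
# across 50 fake identities (hypothetical numbers).
honest = [100.0] * 10
sybils = [2.0] * 50

print(attacker_share(honest, sybils, stake_weighted=False))  # share of identities
print(attacker_share(honest, sybils, stake_weighted=True))   # share of stake
```

Under one-identity-one-vote the attacker holds 50 of the 60 votes, well above the 50% threshold, while under stake weighting the same budget buys 100 of 1,100 units of voting power. Creating extra identities only helps when identities are free, which is why attaching a cost to each vote limits Sybil attacks.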

Applicability

  • This study fills the research gap on the techniques, challenges, advantages, and disadvantages of existing blockchain oracle design patterns.
  • Researchers can use it as a launchpad to conduct more studies on the operating costs, processing speeds, security, and the improvement of oracles to handle different data types.
  • Developers can also use the study to improve future oracle design patterns.

Hi @Favvz, and thank you for an excellent contribution to the forum.

Your conclusion emphasizes the relative immaturity of oracle design, which leaves me curious about a few specific aspects of oracle technology:

  • What direction do you think this element of the industry is most likely to take?
  • What oracle designs are currently showing the most success?
  • And how can the reliability of oracle nodes be enhanced?

Thank you for sharing this research. I am wondering how you define an ideal oracle.

Thank you @Favvz for this insightful summary.
I think the primary problem in designing oracles is that if the oracle is compromised, the smart contract that depends on it is also compromised. This is a very big oracle problem: hackers can hijack the smart contract and inject malicious code into it.

@Favvz, do you feel the trust conflict between third-party oracles and the trustless execution of smart contracts can be solved in the future? This issue has remained mostly unsolved.
