Research Summary: Blockchain Oracle Design Patterns

TLDR

  • With blockchain technology and smart contracts, it is now possible to permanently record data securely and transparently without the need for a third party.
  • However, in order to extend the usability of blockchain technology, these systems need to access real-world data which is free from manipulation. They do this using blockchain oracles.
  • Through a review of 55 papers ranging from academic work to protocol whitepapers, this study reveals that although existing blockchain oracles are making efforts to serve as a reliable source for data, they are still vulnerable to attacks and lack authenticity and confidentiality.

Core Research Question

How are blockchain oracles designed to provide data outcomes to smart contracts?

Citation

Pasdar, A., Dong, Z., & Lee, Y. C. (2021). Blockchain Oracle Design Patterns. arXiv preprint arXiv:2106.09349. https://arxiv.org/pdf/2106.09349

Background

  • Oracles: These are third-party services that send and verify external information before submitting it to smart contracts to trigger state changes in the blockchain. Also referred to as data feeds.
  • Query: A query is a request for information from a database. Query types include binary, non-binary, scalar, and categorical types.
  • Sybil Attacks: Named after Sybil, the woman with 16 different personalities, Sybil attacks occur when an attacker tries to gain control of a computer network by creating multiple identities.
  • Really Simple Syndication (RSS) Feeds: This is a web feed that pulls data from different sources and consolidates it in a single place. A website’s RSS feed sends back updates to the website, which can come in the form of posts, comments, etc.
  • Nash Equilibrium: A concept in game theory describing a state in which no player in a network/game can improve their outcome by unilaterally changing strategy, so there is no incentive to deviate. Applied to blockchain technology, especially mining or voting on an oracle, a Nash equilibrium is said to exist where all miners/voters submit legitimate blocks/information because they know they will be rewarded when they do. There is no incentive to submit malicious blocks/information because it would, on the contrary, lead to a loss.
  • Verifier’s Dilemma: The Verifier’s Dilemma refers to the decisions a miner must make about the allocation of resources to mining and verifying. They are forced to weigh the security risks that may arise if they do not fully verify against the potential profitability of not verifying by keeping their resources.
  • Application Programming Interface (API): This is a software intermediary that enables communication between two applications.
  • Lazy equilibrium: A form of verifier’s dilemma where voters return the same answer to a question to secure profits without performing the work to verify correctness.
  • Freeloading: When an oracle cheats the system by obtaining and copying the response of another oracle without paying fees required to push queries on-chain.
  • Schelling Point: A game-theory concept, named after Thomas Schelling, that Vitalik Buterin proposed (as SchellingCoin) as a model for decentralised oracles. Here, voters in an oracle who submit values that fall within the 25th and 75th percentile are rewarded.
  • Information direction: The direction in which information flows through an oracle. It can be either outbound, transmitting data from a smart contract to the outside world, or inbound, transmitting data from the outside world into smart contracts.
  • Immediate-read oracles: These oracles deal with data used to make immediate decisions. They do not store large sets of information and only keep those highly relevant to the use case.
  • Publish-subscribe oracles: These oracles provide continuous data upon subscription to their service. For example, an oracle that gives price feeds.
  • Request-response oracles: These oracles use a client-server kind of architecture where the server processes a request after being received from the client. The oracle acts as the intermediary.
  • Micro cheating: This is a problem for Schelling Point systems where slight changes are added to the median value and participants slightly tweak their answers towards a particular direction to push the median value to their edited point.
  • Multi-signature address: An address on the blockchain which is associated with more than one private key. Transactions initiated using such addresses need more than one private key to be authorised.
  • Blockchain agnostic: A platform is said to be blockchain agnostic when it allows interactions with multiple blockchains.
  • Scrapers: Scrapers import data from websites into local files or spreadsheets.
  • Authenticity Proof Mechanism: These are mechanisms used to verify the identity of a user, device, or other entity who wants to submit or vote on information in an oracle database.
  • Message Authentication Code (MAC): Powered by cryptography, MACs are short pieces of information used to authenticate a message.
  • Transport Layer Security (TLS): TLS is a cryptographic protocol which is used to add a layer of security to internet communications.
  • Grey Literature: Materials and research which have either never been published or have been published informally or non-commercially. Includes whitepapers, reports, infographics, and working papers.
  • Systematic Literature Review (SLR): A research technique which involves the critical selection, analysis, and repeated review of literature in order to answer research questions key to the study.
  • Multivocal Literature Review (MLR): A form of Systematic Literature Review (SLR) in which grey literature is considered alongside published academic research to obtain research data.
  • Snowballing Technique: A research technique used to obtain a set of papers for a study through references in (backward) or citations of (forward) similar papers which share research pattern characteristics of the target study.
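
The Schelling Point reward rule defined above (values between the 25th and 75th percentile are rewarded) can be illustrated with a short sketch. This is a minimal Python example under stated assumptions, not the scheme used by any particular protocol: the function name `schelling_rewards`, the voter labels, and the choice of the "inclusive" quartile method are all hypothetical.

```python
from statistics import quantiles


def schelling_rewards(votes: dict[str, float]) -> dict[str, bool]:
    """Map each voter to True if their value lies between the 25th and 75th percentile.

    Assumption: quartiles are computed with the "inclusive" method; a real
    oracle would have to fix its own percentile definition.
    """
    values = sorted(votes.values())
    # With n=4, quantiles() returns the three quartile cut points (Q1, median, Q3).
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return {voter: q1 <= value <= q3 for voter, value in votes.items()}


# Hypothetical price reports; voter "e" is far from the consensus.
votes = {"a": 100.0, "b": 101.0, "c": 102.0, "d": 103.0, "e": 250.0}
rewards = schelling_rewards(votes)
```

In this sketch the outlier "e" falls outside the rewarded band, and so does "a", whose value sits just below the 25th percentile. That second case hints at why small nudges to submitted values (micro cheating) matter in such systems.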

Summary

  • Oracles perform as data feeds by responding to queries from smart contracts. They also consult with several sources to obtain the data and can be considered under four classifications:
    • their source of information,
    • the direction through which they carry information,
    • whether they are centralised or decentralised,
    • and their design patterns.
  • To understand the present scope around blockchain oracles, the authors conducted an MLR of current oracle solutions geared towards solving the problem of presenting correct data outcomes to smart contracts. Currently, though oracles are well studied, prior work has not considered them from a technical aspect. This is the primary motivation of the authors for the study.
  • The authors reviewed 55 papers ranging from academic work to protocol whitepapers. These papers included practical oracle designs that display innovative techniques to solve the acquisition and transfer of data outcomes from oracles to smart contracts.
  • They found extensive discussion of oracle design. Existing design strategies aim to preserve the integrity of data obtained from external resources, and oracles are working towards providing truth back to the blockchain and smart contracts.
  • There remain a number of unaddressed research and technical questions in practice. The authors call for further research to improve oracle designs in terms of performance, fees, data security, and integrity.

Method

  • This research paper depends on insights from existing literature. To obtain this body of literature, the authors employed the MLR technique.
  • The search keywords used to cover the vast majority of related studies were combinations of (“blockchain oracles”, “data feed”) AND/OR (“smart contracts”), (“design”), AND/OR (“pattern”).
  • Several digital libraries like IEEE Xplore, ScienceDirect, ACM Digital Library, DBLP: Computer Science Bibliography, and Google Scholar were used to obtain the set of papers which were reviewed.
  • The authors also employ the snowballing technique to identify further relevant studies.

Results

  • From the final set of literature obtained, the authors collate 55 papers. The papers revealed that, with respect to monetary incentives, two categories of oracles exist: voting-based oracles and reputation-based oracles.
  • Voting-based oracle designs were classified based on query types and their vulnerabilities to Sybil attacks and verifier’s dilemma.
  • Reputation-based oracles were classified by how they retrieve data from internet sources and were analyzed based on their limitations alongside the authenticity and confidentiality of data.
  • Voting-based oracles employ voting-based strategies to aggregate and determine the validity of outcomes. Rewards are only distributed when certifier and voter data outcomes match.
  • Stake-based oracles, multisignature-based oracles, Schelling-based oracles, token-based oracles, and conventional oracles are examples of voting-based oracles.
  • Stake-based oracles generally involve making participants lay down a bond to submit, vote, or verify data outcomes on the oracle platform. The review revealed that these oracles (such as Oraichain and Razor) mostly process binary data queries and are prone to the verifier’s dilemma, but can help mitigate against Sybil attacks.
  • In multisignature systems, a majority of parties must agree to a data outcome for it to be finalized as true. These oracles mostly process non-binary data queries and are partially prone to Sybil attacks and verifier’s dilemma.
  • Schelling Point systems use a median point to determine which participant is rewarded and which loses their staked funds. They mostly process scalar data query types, are not entirely immune to Sybil attacks, and are prone to micro-cheating. Examples include Maker Protocol and Oracul.
  • In token-based systems, users are required to hold or use a certain amount of the native token of the platform to participate in determining the correctness of data outcomes. These systems mostly process any type of data, are immune to Sybil attacks due to monetary policy, but are partially prone to the verifier’s dilemma.
  • Conventional systems implement only single or multiple data sources to verify data correctness and integrity. These sources are usually trusted entities whose votes are naturally reliable. They process a mix of scalar and non-binary data query types and manage Sybil attacks and the verifier’s dilemma moderately.
  • The second class of oracles is reputation-based oracles. They use reputation strategies to select the oracles or participants who bring forward requested data, and authenticity proof mechanisms to manage data integrity. They include software-based proof oracles, hardware-based proof oracles, and proofless oracles.
  • Most oracles that need a secure HTTP connection to retrieve data from sources are powered by the TLS protocol. The TLS protocol is not foolproof. TLS-N and TLS Notary are solutions that propose respectively that parts of the data in a TLS session should be hidden and third-party validators should be employed to validate TLS sessions.
  • Most software-based proof oracles studied are limited by their lack of resilience to data tampering. They also partially lack confidentiality of data but are largely authentic.
  • Hardware-based proof oracles are limited by their hardware requirements but largely offer authentic data with partial confidentiality. Town Crier, Chainlink, and Corda are some of the examples in this category.
  • Proofless oracles have the limitations of not being decentralized with hardly any authenticity proof mechanism. There is also little confidentiality of data.
  • The majority of voting-based oracles barely employed authenticity proof mechanisms for data integrity and correctness.
  • The majority of the oracles are based in one blockchain (Ethereum) and only a few are blockchain agnostic.
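
The majority rule described above for multisignature systems (an outcome is finalized only when a majority of parties agree) can be sketched as follows. This is a minimal Python illustration, not any protocol's actual implementation; the function name `finalize_by_majority`, the signer IDs, and the "strict majority of all registered signers" threshold are assumptions.

```python
from collections import Counter
from typing import Hashable, Optional


def finalize_by_majority(
    reports: dict[str, Hashable], total_signers: int
) -> Optional[Hashable]:
    """Return the outcome backed by a strict majority of all signers, else None.

    Assumption: signers who did not report count against the majority,
    so the threshold is measured against total_signers, not len(reports).
    """
    if not reports:
        return None
    # most_common(1) yields the single most frequently reported outcome.
    outcome, count = Counter(reports.values()).most_common(1)[0]
    return outcome if count * 2 > total_signers else None


# Hypothetical non-binary query: which team won?
reports = {"signer1": "TEAM_A", "signer2": "TEAM_A", "signer3": "TEAM_B"}
result = finalize_by_majority(reports, total_signers=3)
```

With three registered signers, two matching reports are enough to finalize. With five registered signers, the same two reports would leave the outcome unresolved.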

Discussion and Key Takeaways

  • Secure and high-performance blockchain oracles still need to be designed. Developers should take advantage of high-performance and low transaction fee blockchains for oracle design and deployment. Authenticity proof mechanisms for these oracles need to be fast and reliable as well.
  • Blockchain oracle adaptability needs to be studied further in order to pave way for agnostic solutions.
  • How can we identify legitimate blockchain oracles?

Implications and Follow-ups

  • An ideal oracle is still a far-fetched idea. The area needs more research and development.
  • Oracles must be available at all times on a variety of networks and have confidential, reliable data.
  • Oracles are prone to security risks like hacking, instability, and malicious influence. Attaching high economic risks to breaching the integrity of oracle data helps to prevent Sybil attacks. Proof of work and proof of stake systems can also be used for this purpose, but no single entity must ever control more than 50% of the power.

Applicability

  • This study fills the research gap on the techniques, challenges, advantages, and disadvantages of existing blockchain oracle design patterns.
  • Researchers can use it as a launchpad to conduct more studies on the operating costs, processing speeds, security, and the improvement of oracles to handle different data types.
  • Developers can also use the study to improve future oracle design patterns.

Hi @Favvz, and thank you for an excellent contribution to the forum.

Your conclusion emphasizes the relative immaturity of oracle design, which leaves me curious about a few specific aspects of oracle technology:

  • What direction do you think this element of the industry is most likely to take?
  • What oracle designs are currently showing the most success?
  • And how can the reliability of oracle nodes be enhanced?

Thank you for sharing this research. I am wondering how you define an ideal oracle.


Thank you @Favvz for this insightful summary.
I think that the primary problem in designing oracles is that if the oracle is compromised, the smart contract that depends on it is also compromised. This is a very big oracle problem: hackers can hijack the smart contract and inject malicious code into it.

@Favvz, do you think the trust conflict between third-party oracles and the trustless execution of smart contracts can be solved in the future? This issue has remained mostly unsolved.


Thank you for your question @rlombreglia. Apologies for the very late response. I’ll take them one after the other.
What direction do you think this element of the industry is most likely to take?
There will be more recognition and use of oracles. Mainstream blockchain adoption can only happen where there is synergy between on-chain and off-chain information and abilities. Oracles are going to be important in any web3 platform that requires data handling. There is already a shift from using oracles only for price feeds to all forms of data retrieval and event-based outcomes. Examples are Etherisc and Arbol, which use oracles to facilitate insurance and help farmers hedge against weather risks. Oracles will also be pivotal in connecting the legacy systems of institutions/governments to blockchains. With this increased use comes an increased consciousness of security, especially as regulators may become interested in matters concerning data privacy and legal liability. There may be hybrid versions of these design models in future, aimed at making a foolproof system.

While design models become hybrid, there is still a lot of room for more oracles to grow outside the Ethereum blockchain. This is especially because high gas fees and scalability issues may hinder optimal oracle performance as oracle use gets more frequent. Chainlink alone recorded up to $1 million in gas fees paid by users across 10 networks to carry out oracle transactions in the past month. Still, most top oracles today are built on Ethereum because the Ethereum network currently accounts for over 50% of the DeFi market and it is easier to provide services this way. This will change with time.

Finally, we all know that interoperability has become the watchword for the future of blockchain technology. Blockchain agnostic oracles will be at the center of making off-chain information transfer seamless between different blockchain networks. I agree with the authors of the paper that this is lacking now. However, I believe that there will be development in this area in the future, just as all hands are on deck now to make bridges better.

Without oracles, blockchains are like computers without an internet connection and, like all areas in web3, we are not there yet. There will be much iteration in the industry as blockchain technology becomes more integrated with traditional systems.


• What oracle designs are currently showing the most success at the present time?
Judging from the top 10 oracles by market cap, success is spread fairly evenly across the oracle designs. The leading oracle network is currently Chainlink; in 2021 alone, it recorded access to over 1B data points and up to $75 billion in secured value through its 1,000 project integrations with 700 oracle networks. Chainlink is a hardware-based proof oracle, which falls under the reputation-based oracle design group. API3, a software-based oracle on the list, also falls under the reputation-based oracle design group. Other oracles like UMA, Augur, Band Protocol, and Tellor employ token-based and stake-based methods, which place them under the voting-based oracle design group.


• And how can the reliability of oracle nodes be enhanced?
This is a very important question to which there is no single answer. In addition to other vulnerabilities within a project’s smart contract, one other weak link will remain the oracle feeding it data. Oracles in turn depend on nodes, which gather information from a variety of sources. Beyond any concerns about the data source, if nodes are malicious, the smart contract may act on wrong information, jeopardising the relying project/dApp in the process.

Several methods have to be applied to curb malicious activity. Chainlink, for one, with its strong monopoly of the oracle ecosystem, is known for handpicking its nodes. In the past month, there have only been 310 active nodes on the network. Putting these facts side by side with the enormous value within its control, it may not be as trustless or decentralised as it puts itself out to be. But we can see that it does try to minimise the need for trust. Its major concern is the quality of the information and not the level of decentralisation.

Positive and negative monetary measures have also shown success in curbing malicious behaviour. Rewarding an honest node can boost its confidence in the network and encourage good behaviour. On the other hand, penalising a node, by either reducing its token balance or taking away a previously staked bond, can discourage bad behaviour. Monetary measures can also reduce Sybil attacks, which occur when a node creates multiple identities in a network so it can influence outcomes. In a stake-based oracle, a malicious node that wants to go this route would have to stake multiple times, leading to a loss. The higher the value at stake, the more attractive the contract is to malicious persons. The oracle network must fix bonds and penalties with this in mind: the higher the value of the contract, the higher the bond needed to participate and the penalty for default.

As oracles become more widely adopted and integrated with traditional systems, the security bar for oracle nodes gets higher. The World Economic Forum suggests the use of service level agreements (SLA) to encourage high-quality node operators in its whitepaper titled Bridging the gap: Interoperability for blockchain and legacy systems. This SLA would be enforceable on-chain and would contain clear terms digitally signed by the data requester and the oracle providing the service. The paper also proposes the use of crypto-economic measures like the ones I identified above, and a reputation system. For the reputation system to work, the performance data of oracle nodes would be collected and fed into a database. With this, data requesters can assess the past performance of oracle nodes that indicate interest in performing jobs before making a choice.

In addition to reputation systems, the paper suggests the use of listing services for node operators to register their nodes and data sources and to display certifications. Listing services may require nodes to reveal their identities. API3 is one oracle where the on-chain identities of nodes are published through off-chain mediums to strengthen the authenticity of the data provided. Listing services can allow for proper KYC procedures to be carried out and will further ensure that only reliable oracle nodes are given jobs. Combining this with reputation mechanisms can also help to secure against Sybil attacks. Chainlink is already on to these methods.

In conclusion, node operation is no walk in the park and must be handled by skilled teams to ensure optimum delivery. The decentralisation factor must remain; never receive information from just one node and never accept nodes that retrieve data from only one source.
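
To make the bond arithmetic above concrete, here is a rough Python sketch of how a bond that scales with contract value raises the cost of a Sybil attack. It rests on simplifying assumptions that are not from the paper: one identity gets one vote, an attacker needs more identities than there are honest nodes, every attacker bond is forfeited, and the attacker's gain is capped at the contract value. The function names and the 10% `bond_ratio` are hypothetical.

```python
def required_bond(contract_value: float, bond_ratio: float = 0.1) -> float:
    """Bond per identity, scaled with the value the contract secures (assumed rule)."""
    return contract_value * bond_ratio


def sybil_attack_cost(honest_nodes: int, bond: float) -> float:
    """Capital an attacker must stake to outnumber the honest nodes."""
    identities_needed = honest_nodes + 1  # one more vote than the honest side
    return identities_needed * bond


def attack_is_unprofitable(
    contract_value: float, honest_nodes: int, bond_ratio: float = 0.1
) -> bool:
    """True when the forfeited bonds would exceed the most the attacker could gain."""
    bond = required_bond(contract_value, bond_ratio)
    return sybil_attack_cost(honest_nodes, bond) > contract_value


# Hypothetical contract securing 1,000,000 units of value.
safe_with_ten_nodes = attack_is_unprofitable(1_000_000, honest_nodes=10)
safe_with_five_nodes = attack_is_unprofitable(1_000_000, honest_nodes=5)
```

Under these assumptions, ten honest nodes make the attack a net loss, while five do not. That is one way to read the advice to raise bonds as contract value rises.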


An ideal oracle, or rather the perfect oracle, is one that is free from security breaches and third-party manipulation while ensuring that the data it handles is authentic and supplied in a confidential manner to the smart contract or service that requires it. In addition, an ideal oracle in this space should be decentralized, or it would contradict the purpose of the entire system.


Yes @Cashkid18, a smart contract runs the risk of being compromised if the oracle supplying data to it is faulty. It’s not a trust conflict in my opinion; it’s merely a matter of proper management. While we perform audits on the smart contract, we must ensure that the oracle has good data sources, reliable nodes, and good mechanisms. I have already discussed how we can choose reliable nodes in my previous response here: Research Summary: Blockchain Oracle Design Patterns - #8 by Favvz
Data used by an oracle must not be sourced from just one point. More than one perspective is needed. This may increase aggregation time, but data is more authentic where more than one node attests to its validity. Aside from the fact that data must not come from one source, it also must not come from a free source. Data from free sources is less likely to be valid and well grounded because of the lack of incentives. Data must also be accompanied by authenticity proofs that show that it has not been tampered with and has remained confidential. Oracles need to be transparent about their methods to allow developers to make informed decisions before integrating them with smart contracts. If these areas can be handled properly, we will see an overall improvement in security.
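
As a rough illustration of the two points above (more than one source, plus a check that data has not been tampered with), here is a minimal Python sketch that accepts a value only when enough sources supply a report with a valid Message Authentication Code, then takes the median. The shared keys, source IDs, `min_sources` threshold, and function names are hypothetical. A plain HMAC also only shows that a report was not altered by someone without the key; it does not provide confidentiality and is not a substitute for the TLS-based proofs discussed in the paper.

```python
import hashlib
import hmac
from statistics import median


def sign(key: bytes, value: float) -> bytes:
    """Compute an HMAC-SHA256 tag over the reported value (assumed wire format)."""
    return hmac.new(key, repr(value).encode(), hashlib.sha256).digest()


def aggregate(
    reports: list[tuple[str, float, bytes]],
    keys: dict[str, bytes],
    min_sources: int = 3,
) -> float:
    """Return the median of authenticated reports from at least min_sources sources."""
    accepted: dict[str, float] = {}
    for source_id, value, tag in reports:
        key = keys.get(source_id)
        if key is None:
            continue  # unknown source: ignore the report
        # compare_digest avoids leaking information through timing differences.
        if hmac.compare_digest(sign(key, value), tag):
            accepted[source_id] = value
    if len(accepted) < min_sources:
        raise ValueError("not enough authenticated sources")
    return median(accepted.values())


# Hypothetical sources and keys.
keys = {"s1": b"key-1", "s2": b"key-2", "s3": b"key-3"}
reports = [
    ("s1", 100.0, sign(keys["s1"], 100.0)),
    ("s2", 102.0, sign(keys["s2"], 102.0)),
    ("s3", 101.0, sign(keys["s3"], 101.0)),
]
price = aggregate(reports, keys)
```

If any report is altered after signing, its tag no longer matches and it is dropped. With only two valid reports left, the sketch refuses to return a value rather than trust too few sources.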


Thanks for this insightful summary. In their role as a bridge between blockchains and off-chain data, blockchain oracles have many exciting uses. As they become more common, they have the potential to change the way many blockchain-related industries are run, for instance DeFi, DApps, and NFTs.


Thanks @Favvz for this wonderful summary, I really took my time to go through it.

In general, when considering the use of an oracle, the trust model must be carefully considered. If you simply presume the oracle can be trusted, you may be sacrificing the smart contract’s security by exposing it to potentially incorrect inputs. However, if the security assumptions are carefully considered, oracles can be valuable.


@Favvz This is an excellent post, and I applaud your efforts.

Blockchains connect the physical world to computer systems and automate transactions in the smart economy. Because the developer philosophy is to build successful blockchain apps on top of existing blockchains, this structure includes a variety of standards and frameworks. Developers should bear in mind that creating a dapp necessitates particular skills such as graphic design. Fortunately, a set of technologies is available that makes it simple to develop efficient interfaces that are appealing to customers, point-of-sale systems, and other software suppliers.


I understand that oracles could be used in a variety of ways. Oracles might be implemented, for instance, as off-chain components or as a mix of on- and off-chain components. An oracle could enable other smart contracts to push or pull data from off-chain components, as shown in the sequence diagram that follows. In the first scenario, the external system uses a transaction to communicate the needed data to the oracle smart contract. The method that this transaction calls should accept the data as an input parameter.
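
The push scenario described above can be modelled in a few lines. The sketch below is an in-memory Python stand-in, not real smart contract code: the class `OracleContract`, its methods `submit_data` and `get_data`, the reporter address, and the "ETH/USD" key are all hypothetical names chosen for illustration.

```python
class OracleContract:
    """In-memory model of an on-chain oracle contract using the push pattern."""

    def __init__(self, authorized_reporter: str) -> None:
        self.authorized_reporter = authorized_reporter
        self._data: dict[str, float] = {}

    def submit_data(self, sender: str, key: str, value: float) -> None:
        """Entry point called by the external system's transaction.

        The data arrives as an input parameter, as in the scenario above.
        """
        if sender != self.authorized_reporter:
            raise PermissionError("sender is not the authorized reporter")
        self._data[key] = value

    def get_data(self, key: str) -> float:
        """Read path used by other contracts that depend on the oracle."""
        if key not in self._data:
            raise KeyError(f"no data pushed yet for {key!r}")
        return self._data[key]


# The off-chain component pushes a value; a consumer contract then reads it.
oracle = OracleContract(authorized_reporter="0xREPORTER")
oracle.submit_data(sender="0xREPORTER", key="ETH/USD", value=2000.0)
latest = oracle.get_data("ETH/USD")
```

The sender check is one simple way to keep arbitrary accounts from overwriting the feed. A real deployment would rely on the chain's own transaction signatures rather than a string comparison.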


Summary Discussion.

About Discussion.

  • With blockchain technology and smart contracts, it is now possible to permanently record data securely and transparently without the need for a third party.
  • However, to extend the usability of blockchain technology, these systems need to access real-world data which is free from manipulation. They do this using blockchain oracles.
  • Through a review of 55 papers ranging from academic work to protocol whitepapers, this study reveals that although existing blockchain oracles are making efforts to serve as a reliable source of data, they are still vulnerable to attacks and lack authenticity and confidentiality.

Tags.
Oracle, Data, Cryptography, Blockchain, and Designs.

Points of disagreement.

  • Trust conflict: @Cashkid18 raised “the trust conflict between third-party oracles and the trustless execution of smart contracts”.
  • It was pointed out in response that this is a matter of proper management rather than a trust conflict.

Offered solution.

  • Reliability of oracles
  1. Chainlink, with its monopoly on the oracle ecosystem, is recognized for handpicking its nodes. The network has only had 310 active nodes in the last month.
  2. Both positive and negative monetary approaches are effective in reducing malevolent behavior.
  3. As oracles become more widely used and incorporated into older systems, the security bar for oracle nodes rises.

To see the full post: Research Summary: Blockchain Oracle Design Patterns - #6 by Favvz

  • Potential of blockchain oracles.
  1. In their role as a bridge between blockchains and off-chain data, blockchain oracles have numerous fascinating applications. They have the potential to transform the way many blockchain-related transactions are conducted as they become more popular.

To view the full post: Research Summary: Blockchain Oracle Design Patterns - #11 by Raphking

  1. As demonstrated in the sequence diagram below, an oracle could allow other smart contracts to push or pull data from off-chain components.

To view the full post: Research Summary: Blockchain Oracle Design Patterns - #14 by Stallonaking

  • Security
  1. Simply assuming that the oracle can be trusted may jeopardize the smart contract’s security by exposing it to potentially incorrect inputs.

To view the full post: Research Summary: Blockchain Oracle Design Patterns - #12 by Mansion

  • Identification of consequences.

    • Wrong input.
  1. By exposing the smart contract to possibly inaccurate inputs, you may be jeopardizing its security.

To view the full post: Research Summary: Blockchain Oracle Design Patterns - #12 by Mansion

    • Malicious inputs of code.
  1. This is a major oracle issue; hackers can hijack the smart contract and inject malicious code into it.

To view the full post: Research Summary: Blockchain Oracle Design Patterns - #4 by Cashkid18

Question.
What are some of the key issues that can hinder optimal oracle performance?

Key Resources.
To have an in-depth knowledge of the subject matter, visit any of these websites:

  1. https://www3.weforum.org/docs/WEF_Interoperability_C4IR_Smart_Contracts_Project_2020.pdf
  2. https://arxiv.org/pdf/2106.09349

@Favvz nice summary….

Oracles in blockchain are not unlike the Oracle in the movie The Matrix.

Her role was to provide information, and oracles are systems that provide information to blockchains in blockchain systems.

Blockchain technology has many intriguing properties.

The immutability of data stored in blockchains is arguably one of the most important, interesting, and useful properties of how blockchains function.

Immutable data, on the other hand, is only useful if the data being stored is correct or genuine.

This is why Oracles are such a big issue in blockchain use cases: you can store immutable data but can’t be sure it’s genuine.
Take a look at a straightforward blockchain-based betting system that allows users to wager on which soccer team will win a game.

The smart contract will require knowledge about the winning team in order to send the winners their dues after the game.

This will most likely be obtained from an online resource.

The issue with this technique is that you have to have faith in the third party providing the data because it depends on a centralized system to generate and distribute the information.

The integration of trust in a trustless blockchain system is obviously undesirable.
This is why blockchain-based IoT applications and use cases are important: with the right architecture and security protocols, blockchain-based independent sensors and IoT systems will be able to send genuine data to blockchains.


Hello @Favvz, I appreciate your summary. I feel it is of great importance to comment specifically on why blockchains need oracles. Here are a few of my reasons.
Oracles themselves are not the source of real-world information, instead, they gather it from existing databases and communicate the data in a reliable way to the blockchain. The relationship between oracles and blockchains is reciprocal. Oracles can receive on-chain data to distribute to external applications like banking apps.

It provides potential new uses for enterprises, such as supply-chain tracking, which traces products from their source to consumers and bonds that rely on third-party interest rates. Existing systems can simply integrate with the decentralized network. Blockchains and their public ledgers can be compared to a computer without an internet connection; such a device would be unable to search for and take in real-world data. A computer with no internet can only access what is stored on its local hard drives.
Similarly, a blockchain can only access the transactions recorded on its distributed ledger, which limits the number of applications that can be used without real-world information. Oracles provide this internet connection. They enable blockchains to find and access outside information for on-chain smart contracts.

Blockchain, as a form of distributed ledger technology, has enabled data to be shared among nodes connected over the internet. In addition, the introduction of smart contracts has added programmability to this disruptive technology and has changed the software ecosystem by removing third parties for administration of (non)business purposes. Although promising, blockchain and smart contracts do not have access to the external world; hence, they need trusted services referred to as blockchain oracles for sending and verifying external information to smart contracts.


@GloriaOkoba There are several key issues that can hinder optimal oracle performance. Some of these include:

  1. Data quality: Oracle performance can be hindered if the data that is being used is of poor quality or is not properly formatted.

  2. Indexing: Poor indexing strategies can lead to slow query performance and hinder the overall performance of the oracle.

  3. Resource contention: If there are too many concurrent requests being made to the oracle, it can lead to resource contention and reduced performance.

  4. Hardware constraints: Oracle performance can be hindered if the hardware that it is running on is not sufficient to handle the workload.

  5. Lack of optimization: If the queries being run against the oracle are not optimized, it can lead to slow performance.

  6. Poor database design: A poorly designed database can lead to slow query performance and hinder the overall performance of the oracle.

  7. Lack of maintenance: Failing to properly maintain and tune the oracle can lead to degraded performance over time.

I hope this fits the question you are asking?
