Key Problems in Oracles and Data

CTA: This thread is an attempt at inventorying key areas of exploration with regard to blockchain oracles and data sources. This post is a living document and it is our hope that the community will contribute to this list of key considerations.

What are the key factors to consider when building a blockchain oracle?

Oracles are blockchain infrastructure that gives on-chain smart contracts access to off-chain data and events. Building an oracle that merely relays data on-chain is not especially difficult. The engineering becomes much more complex when the oracle must be as tamper-resistant, reliable, and decentralized as the underlying blockchain. Strong guarantees about the security and reliability of contract code are largely futile if the oracle mechanism triggering the contract is highly centralized and vulnerable to manipulation. Because input data directly determines how a smart contract executes, the familiar saying applies with particular force: garbage in, garbage out.

Different oracle mechanisms make different trade-offs among attributes such as cost, speed, security, reliability, and flexibility. Judging whether a particular oracle implementation is practical requires looking at several dimensions: the security of the oracle mechanism, data quality, financial incentives, core development, and network effects. Below are some key questions that should be answered before integrating a particular oracle solution into a production environment.

Security and Reliability of the Oracle Mechanism

  • How resistant is the oracle mechanism to data manipulation attacks or incompetence by individual nodes or groups of node operators?
  • What kind of standards are maintained in terms of the quality of nodes participating in the oracle network?
  • Can the oracle network provide protection against Sybil attacks (gaining control of a majority of the nodes through many pseudonymous identities) or mirroring attacks (nodes that submit data to an oracle network by simply copying it from another node instead of retrieving it themselves)?
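
To make the first question concrete, here is a minimal sketch of one common way to limit the influence of individual nodes: take the median of independent reports and refuse to answer without a quorum. The function name, node names, and quorum value are hypothetical and not taken from any particular oracle network. The sketch only tolerates a minority of faulty reporters; it does nothing against Sybil or mirroring attacks on its own.

```python
from statistics import median


def aggregate_reports(reports, min_responses):
    """Return the median of node reports, or raise if too few nodes answered.

    `reports` maps a (hypothetical) node id to the value that node submitted.
    """
    if len(reports) < min_responses:
        raise ValueError("not enough oracle responses to aggregate")
    return median(reports.values())


# Four honest nodes reporting roughly the same price.
honest = {"node_a": 100.0, "node_b": 100.2, "node_c": 99.9, "node_d": 100.1}

# One additional node submits a wildly wrong value.
tampered = dict(honest, node_e=1_000_000.0)

# The single outlier cannot drag the median far from the honest values.
print(aggregate_reports(tampered, min_responses=3))
```

A mean over the same reports would be pulled to roughly 200,080 by the one bad node, which is why median-style aggregation is often preferred when some reporters may be faulty.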

Data Quality Provided by the Oracle

  • How is the data being generated by the oracle mechanism? Is it sourced directly from premium authenticated APIs, free open APIs, or is it produced via crowdsourcing?
  • Are there requirements in place to ensure that the data meets certain quality standards, such as financial data with market coverage guarantees?
  • How frequently is data updated and delivered on-chain? Is the update frequency reliable even during times of blockchain network congestion?
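
The update-frequency question can be illustrated with a small sketch. It assumes a hypothetical design in which the oracle pushes a new value when either a maximum interval (a "heartbeat") has elapsed or the value has moved by more than a set fraction, and in which the consumer rejects data older than a maximum age. All names and thresholds are illustrative assumptions.

```python
def is_fresh(last_update_ts, now_ts, max_age_seconds):
    """Consumer-side check: True if the last update is recent enough to use."""
    age = now_ts - last_update_ts
    return 0 <= age <= max_age_seconds


def needs_update(last_value, new_value, last_update_ts, now_ts,
                 heartbeat_seconds, deviation_fraction):
    """Oracle-side check: push when the heartbeat elapsed or the value moved enough."""
    if now_ts - last_update_ts >= heartbeat_seconds:
        return True
    if last_value == 0:
        # Relative deviation is undefined at zero; treat any change as significant.
        return new_value != 0
    return abs(new_value - last_value) / abs(last_value) >= deviation_fraction


# A 1% move triggers an update even though the heartbeat has not elapsed.
print(needs_update(100.0, 101.0, 1000, 1010, 3600, 0.005))
# Data that is 100 seconds old is rejected by a consumer with a 60-second limit.
print(is_fresh(1000, 1100, 60))
```

During network congestion an update may be triggered on time but confirmed late, so the consumer-side freshness check is the part that actually protects a contract from acting on stale data.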

Financial Incentives to Run Oracles and Produce Data

  • What are the direct financial incentives for the node operators to provide data on-chain to smart contracts? What is the financial punishment for not doing so?
  • What are the financial incentives for data providers to produce and maintain high-quality data? How are they held accountable when data quality drops?
  • Is the financial incentive model used to compensate node operators and data providers scalable or sustainable into the future?

Network Effects and Development

  • How generalized is the oracle solution? Can it support multiple use cases, unique design patterns, different data types, many blockchains, and/or various off-chain computations?
  • Is the oracle node software open-source and is the oracle mechanism permissionless to build and iterate upon?
  • What kind of community support does it have? Does the oracle solution only serve the DeFi ecosystem or does it also support other communities such as enterprise, government, general enthusiasts, and/or non-Ethereum blockchains?

We need help identifying key problems in the space, so please contribute to this thread. Here is our idea about what a key problem looks like:

  • Provides direction for individual research efforts and projects
  • Is a broadly applicable question or problem statement
  • Requires many coordinated research efforts to answer or solve