- Breidenbach, L., Cachin, C., Coventry, A., Juels, A., & Miller, A. (2021, February 24). Chainlink Off-Chain Reporting Protocol. Retrieved from https://chain.link/ocrpaper
Core Research Question
- How does the Chainlink Off-Chain Reporting protocol work, what are its design goals, and what are the algorithms used for its implementation in the Chainlink Network?
- Oracles are off-chain agents that connect on-chain smart contracts to external resources such as data APIs that reside outside of a blockchain network. Such data is often required in the execution of automated smart contract applications.
- Price Feeds are on-chain reference contracts that are updated by oracles and give smart contracts access to financial market data on various assets, enabling the creation of decentralized finance (DeFi) applications.
- Gas is the unit that measures the amount of computational effort required to execute and validate transactions on the Ethereum blockchain. More complex transactions performing many operations consume more units of gas.
- Gas price is the amount of ETH paid to miners per unit of gas on the Ethereum network. Gas prices are denoted in Gwei, with one Gwei equal to 10⁻⁹ ETH. Higher network congestion results in a higher gas price and more expensive transactions.
- On-chain Aggregation is an oracle network model where each node fetches data from an external data source and posts it on-chain within separate transactions. Each transaction consumes gas and thus each node must pay an individual transaction fee determined by the current gas price.
- Off-chain Aggregation is an oracle network model where each node fetches data from an external data source and then collectively aggregates their responses off-chain into a single transaction containing a sorted list of values. Only a single node submits the transaction on-chain, reducing the total amount of gas consumed per update.
- Oracle Report is a collection of responses from nodes within an oracle network during one update period. A report includes each node’s individual observation and their associated signature.
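The gas definitions above reduce to simple arithmetic: fee = gas used × gas price. The sketch below computes a fee in ETH and compares the per-update cost of the two aggregation models. The three gas constants are made-up placeholders chosen for illustration, not measurements from the paper, and the function names are hypothetical.

```python
from decimal import Decimal

GWEI_IN_ETH = Decimal("1e-9")  # 1 Gwei = 10^-9 ETH

# Hypothetical gas figures, chosen only for illustration (not from the paper).
GAS_PER_NODE_TX = 100_000      # one node posting one value on-chain
GAS_BASE_REPORT_TX = 150_000   # fixed part of a single aggregated report transaction
GAS_PER_OBSERVATION = 5_000    # extra gas per observation carried in the report


def fee_in_eth(gas_used: int, gas_price_gwei: int) -> Decimal:
    """Transaction fee = gas used * gas price, converted from Gwei to ETH."""
    return Decimal(gas_used) * Decimal(gas_price_gwei) * GWEI_IN_ETH


def on_chain_aggregation_fee(n: int, gas_price_gwei: int) -> Decimal:
    """On-chain aggregation: n separate transactions, one per node."""
    return n * fee_in_eth(GAS_PER_NODE_TX, gas_price_gwei)


def off_chain_aggregation_fee(n: int, gas_price_gwei: int) -> Decimal:
    """Off-chain aggregation: one transaction carrying all n observations."""
    gas_used = GAS_BASE_REPORT_TX + n * GAS_PER_OBSERVATION
    return fee_in_eth(gas_used, gas_price_gwei)
```

Under these assumed constants, `on_chain_aggregation_fee(21, 100)` is 0.21 ETH while `off_chain_aggregation_fee(21, 100)` is 0.0255 ETH. The single report transaction still grows with the number of observations it carries; what drops to one is the number of transactions per update.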
The Off-Chain Reporting (OCR) Protocol is a scalability upgrade of the decentralized oracle network Chainlink which decreases the on-chain gas costs of generating updates for the Price Reference Feeds by moving the data aggregation process off-chain using a distributed peer-to-peer network. OCR reduces the number of on-chain transactions required per update in an oracle network with n nodes from O(n) to O(1).
The OCR protocol is described as being developed with four primary goals:
- Resilience: The protocol should be resilient to different kinds of failures involving Byzantine nodes and infrastructure crashes. An honest OCR node that is temporarily disrupted should be able to recover and rejoin the protocol quickly and without manual intervention.
- Simplicity: In order to quickly meet market demand for oracle scalability, the protocol is designed to favor a straightforward implementation.
- Low Transaction Fees: On-chain interactions should be minimized to lower the amount of fees nodes are required to pay. Ethereum transactions can carry a significant fee, particularly during periods of network congestion. OCR favors off-chain communication and computation wherever possible.
- Low Latency: The time between when an update is initiated off-chain and when the data is included on-chain should be minimized as DeFi smart contracts require fresh data. The protocol should be able to produce an off-chain report within a few seconds and have it confirmed on the blockchain as soon as possible.
The paper’s authors first formalized a model of the OCR protocol’s liveness and safety thresholds for a network of n > 3 nodes:
- Any f < n/3 oracles may exhibit Byzantine faults and behave arbitrarily as if controlled by an adversarial actor.
- The OCR protocol is expected to operate with n = 3f + 1 oracles, as this gives optimal resilience (Byzantine nodes make up less than ⅓ of the network).
- If f oracles are Byzantine-faulty (malicious nodes) and c oracles are benign-faulty (unresponsive honest nodes) with f < n/3 but f + c ≥ n/3, then the protocol may lose liveness but always satisfies the safety properties.
- Using the median value from at least λ = 2f + 1 observations (responses from more than ⅔ of the network) ensures the final report is plausible in the sense that faulty oracles cannot move the median outside the range submitted by correct oracles.
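The median argument can be checked with a small sketch. With at least 2f + 1 observations and at most f of them faulty, at least one honest value lies at or below the median position and at least one at or above it, so the median stays inside the honest range. The function names here are hypothetical, and picking the element at index `len // 2` is an assumption about how the median is chosen.

```python
def max_byzantine(n: int) -> int:
    """Largest f satisfying f < n/3, i.e. n >= 3f + 1."""
    return (n - 1) // 3


def report_median(observations: list[int], f: int) -> int:
    """Median of a report; requires at least 2f + 1 observations."""
    if len(observations) < 2 * f + 1:
        raise ValueError("need at least 2f + 1 observations")
    ordered = sorted(observations)
    return ordered[len(ordered) // 2]


def median_is_plausible(honest: list[int], byzantine: list[int], f: int) -> bool:
    """True if the median lies within the range reported by honest oracles."""
    if len(byzantine) > f:
        raise ValueError("model assumes at most f Byzantine observations")
    median = report_median(honest + byzantine, f)
    return min(honest) <= median <= max(honest)


# n = 7 oracles tolerate f = 2 Byzantine nodes; a report needs 2f + 1 = 5 values.
f = max_byzantine(7)
honest_values = [100, 101, 103]
for attack in ([0, 0], [0, 10**9], [10**9, 10**9]):
    assert median_is_plausible(honest_values, attack, f)
```

However extreme the two faulty values are, the median in this example is always one of 100, 101, or 103.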
The OCR protocol is described as being structured into three primary protocols run by each node in the network:
- The pacemaker protocol drives the report generation process, which is structured into epochs.
- Each epoch has a different leader node who coordinates the creation of a predetermined number of reports with observations provided from the follower nodes.
- The pacemaker protocol runs continuously and periodically initiates a new epoch and pseudo-randomly selects a new leader.
- Each follower node monitors the performance of the leader and if not enough progress is made within a specific timeframe, a new epoch is initiated and a new leader node is selected.
- The report generation protocol divides each epoch into numerous rounds, with each round corresponding to the creation of a new report.
- In each round, observations from each oracle node are collected and an aggregated report is generated that is signed by a threshold of oracles.
- To prevent unnecessary on-chain transactions, a report is only created and validated by oracles if the value from the off-chain data source deviates from the previous on-chain update by more than a specific threshold (e.g. 0.5%) or a specific time interval has passed since that update (e.g. 1 hour).
- The transmission protocol encapsulates how the report is submitted on-chain and does not require communication between nodes.
- The transmission protocol delays each oracle’s on-chain submission pseudo-randomly, creating a staged sending process so that not too many redundant copies are submitted on-chain.
- Once a report has been validated by miners and added to the blockchain ledger, the transmission protocol process ends for the current round.
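The report trigger described in the report generation protocol (a deviation threshold or a time-based interval) can be sketched as a single predicate. This is a minimal illustration, not the paper's algorithm: the function name, parameter names, and the handling of a zero on-chain value are assumptions, and the defaults simply mirror the example figures of 0.5% and 1 hour.

```python
def should_report(
    onchain_value: float,
    onchain_age_s: float,
    observed_value: float,
    deviation_threshold: float = 0.005,  # 0.5%, the example figure above
    heartbeat_s: float = 3600.0,         # 1 hour, the example figure above
) -> bool:
    """Decide whether a new report is warranted."""
    # Heartbeat: the on-chain value is too old, so update regardless of price.
    if onchain_age_s >= heartbeat_s:
        return True
    # Assumption: with a zero on-chain value, any non-zero observation counts.
    if onchain_value == 0:
        return observed_value != 0
    # Deviation: relative change against the last on-chain value.
    relative_change = abs(observed_value - onchain_value) / abs(onchain_value)
    return relative_change > deviation_threshold
```

For example, with an on-chain value of 2000 that is one minute old, an observation of 2005 (0.25% away) does not trigger a report, while 2011 (0.55% away) does.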
The specific steps taken to create a single report within a round are as follows:
- A new round starts and the leader of the current epoch requests an observation from all follower nodes in the network.
- Each follower node fetches data from a predefined data source API, signs the data using their private key, and sends the result back to the leader node.
- The leader node waits for at least 2f + 1 follower nodes to respond, plus a grace period, then sorts the responses by value, generates a report, and sends it to all follower nodes.
- Each follower node validates the report by checking that the values are sorted, that it contains observations from at least 2f + 1 follower nodes, that all signatures are valid, and that either the median value deviates from the previous on-chain update by more than the deviation threshold or a time-based heartbeat condition has occurred.
- If all conditions are met, each follower node generates a compressed report with just the node observations and oracle identities, signs it, and sends it back to the leader node.
- Once the leader node obtains signed reports from more than f follower nodes, the leader assembles a final report from the followers’ signed reports, which is then broadcasted to all follower nodes.
- When a follower node receives the final report for the first time, it rebroadcasts the report to every node, ensuring all nodes have received it.
- When a follower node receives this broadcast from more than f nodes, it starts the transmission protocol.
- The report generation protocol for a round is now complete, and the leader node waits a predefined amount of time before starting a new round. If the current epoch ends or the leader does not make progress within a specified amount of time, a new epoch begins with a new leader and a new round is initiated.
- In the transmission protocol, to prevent unnecessary gas costs, the report is put through a filter and passes if 1) there is no backlog of reports or 2) the median value in the new report deviates from the median value of the report in the backlog by more than a sufficient threshold.
- If the report passes the filter, a staging process begins where one or multiple nodes are pseudorandomly chosen to create a transaction to submit the report on-chain.
- If a transaction is not confirmed on-chain within a specific time delay, a round robin approach is started where additional nodes begin to make an on-chain transaction in a time-staggered manner.
- Once the report is added to the blockchain, an on-chain smart contract validates the signatures, stores the median value, pays nodes who contributed an observation and compensates the transmitter for its transaction gas costs.
- The transmission protocol for a round ends when the corresponding report has been accepted by the contract.
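Two of the checks in the steps above lend themselves to a short sketch: the follower's validation of the leader's report, and the transmission filter. This is an illustration under stated assumptions, not the paper's implementation. The `Observation` type and all function names are hypothetical, signature verification is replaced by a boolean stand-in, the duplicate-oracle check is an added assumption, and the deviation or heartbeat condition is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    oracle_id: int
    value: int
    signature_ok: bool  # stand-in for verifying a real signature


def validate_report(observations: list[Observation], f: int) -> bool:
    """Follower-side checks on the leader's report."""
    values = [o.value for o in observations]
    oracle_ids = {o.oracle_id for o in observations}
    return (
        values == sorted(values)                       # sorted by value
        and len(oracle_ids) == len(observations)       # assumed: one entry per oracle
        and len(observations) >= 2 * f + 1             # enough observations
        and all(o.signature_ok for o in observations)  # every signature valid
    )


def median_value(observations: list[Observation]) -> int:
    """Median of an already sorted report."""
    return observations[len(observations) // 2].value


def passes_transmission_filter(
    new_median: int, backlog_median: Optional[int], threshold: float
) -> bool:
    """Pass if there is no backlog, or the new median moved far enough from it."""
    if backlog_median is None:
        return True
    if backlog_median == 0:
        return new_median != 0
    return abs(new_median - backlog_median) / abs(backlog_median) > threshold


report = [
    Observation(oracle_id=2, value=100, signature_ok=True),
    Observation(oracle_id=0, value=101, signature_ok=True),
    Observation(oracle_id=3, value=105, signature_ok=True),
]
assert validate_report(report, f=1)
assert median_value(report) == 101
```

With a backlog median of 1000 and a 0.5% threshold, a new median of 1001 is filtered out, while a new median of 1010 passes and moves on to the staged submission.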
Through the creation of a technical specification for the Off-Chain Reporting Protocol and its sub-protocols, the authors of this paper have defined how the decentralized oracle network Chainlink reduced the on-chain gas costs of Price Feed updates by up to 90% through batching node observations into a single on-chain transaction.
The authors also modeled the security assumptions of the Off-Chain Reporting Protocol including the proportion of Byzantine nodes for optimal resilience and the security measures implemented to protect against malicious leaders, ensuring progress cannot be halted beyond a predetermined amount of time.
Follow-up and Future Work
- The paper describes multiple avenues for improving the Chainlink Off-Chain Reporting Protocol in the future.
- Currently, reports are generated on a static interval. It may be desirable to produce more frequent reports upon observing changes within the data feed itself, such as increased volatility. However, such a design could generate reports at a faster rate than the frequency with which Ethereum produces blocks.
- The OCR protocol could adopt an aggregated threshold signature scheme instead of the separate ECDSA signatures currently used. The constant on-chain gas cost of verifying a single threshold signature would allow for larger oracle sets and further lower on-chain gas costs.
- A new oracle list could be signed by the current quorum of oracles to change the off-chain oracle set without requiring the owner to intervene. This would allow for dynamic changing of the oracles used within an OCR-based oracle network.
The current application of the Chainlink Off-Chain Reporting Protocol is primarily focused on improving the Price Reference Feeds, which provide a secure source of high-quality financial market data for the Decentralized Finance (DeFi) ecosystem. By reducing the gas costs of updates, deviation thresholds can be lowered, more price feeds can be launched, and a greater number of nodes can be added to each network due to the more efficient use of Ethereum’s transactional bandwidth.
The Off-Chain Reporting Protocol can also be used for the creation of any type of on-chain reference feed required by smart contract applications, such as the current temperature of a specific real-world location, the amount of off-chain US dollars in a bank account backing an on-chain stablecoin, and numerous other datasets. Additionally, Chainlink OCR networks can be natively launched on other blockchains beyond Ethereum to extend the Chainlink Network’s cost-effective data computation model to the growing ecosystem of interacting chains.
- Will the gas improvements from OCR lead to a reduction in blockspace demand from oracles or will more oracle networks be launched to fill the gap?
- What is the lowest deviation threshold at which an OCR-based price feed could realistically update, particularly during times of extreme volatility and network congestion?
- How can the OCR Protocol be used for other oracle network models beyond on-chain reference feeds?