OP Integration Requirements
This document outlines the integration of Espresso’s fast confirmation layer with the OP Stack. It also introduces key components and design considerations.
The batcher is running in AWS Nitro Enclaves as a TEE.
It generates an attestation with a key pair to register on the Batch Authentication Contract.
It reads blocks from the sequencer.
It submits transactions to Espresso for fast confirmation.
The batcher periodically queries Espresso for finalized batches.
It checks batch consistency to verify transaction inclusion and ordering.
It signs the blob hashes.
The batcher signs the transactions and submits them, together with the signature, to L1.
The Batch Authentication Contract verifies the signature, and if it’s valid, the Batch Inbox Contract records the batch.
The OP batcher runs in a Trusted Execution Environment (TEE) using AWS Nitro Enclaves. The environment must be configured to support TEE execution, attestation generation, and verification using Nitro Validator.
Before processing any batches, the batcher generates a key pair inside AWS Nitro Enclaves, obtains an attestation proving that the key pair was generated in a TEE environment, and sends both to the Batch Authentication Contract for signer registration.
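As a rough illustration, the registration flow from the batcher's side could look like the following Go sketch. The attester and registrar interfaces stand in for the Nitro Enclaves attestation API and a binding of the Batch Authentication Contract; their names and signatures are assumptions for illustration, not the actual implementation.

package batcher

import (
    "crypto/ecdsa"
    "fmt"

    "github.com/ethereum/go-ethereum/common"
    "github.com/ethereum/go-ethereum/crypto"
)

// attester abstracts the Nitro Enclaves attestation API; the real batcher would
// ask the enclave's NSM device to produce an attestation document over the
// enclave-generated public key. (Hypothetical interface for illustration.)
type attester interface {
    Attest(publicKey []byte) ([]byte, error)
}

// registrar abstracts the Batch Authentication Contract's register() call,
// e.g. through an abigen binding. (Hypothetical interface for illustration.)
type registrar interface {
    Register(ephemeralKey common.Address, attestation []byte) error
}

// registerEphemeralKey generates the enclave-resident key pair, asks the
// enclave for an attestation over its public key, and registers the key with
// the Batch Authentication Contract.
func registerEphemeralKey(att attester, reg registrar) (*ecdsa.PrivateKey, error) {
    // Key pair generated inside the enclave; the private key never leaves it.
    key, err := crypto.GenerateKey()
    if err != nil {
        return nil, fmt.Errorf("generate ephemeral key: %w", err)
    }
    pub := crypto.FromECDSAPub(&key.PublicKey)

    // The attestation binds the public key to the enclave image measurements.
    doc, err := att.Attest(pub)
    if err != nil {
        return nil, fmt.Errorf("request attestation: %w", err)
    }

    // register(ephemeralKey, attestation) on the Batch Authentication Contract.
    addr := crypto.PubkeyToAddress(key.PublicKey)
    if err := reg.Register(addr, doc); err != nil {
        return nil, fmt.Errorf("register signer: %w", err)
    }
    return key, nil
}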
Once set up, the batcher reads user-submitted blocks from the OP sequencer (built by op-geth) and sends the transactions, together with the batcher signature, to Espresso. Note that Optimism has a sequencer throttling feature that prevents blocks larger than the throttling threshold (as implemented here). Currently, both Optimism’s original code and Celo’s fork set the threshold to 21k via this flag, while the maximum block size allowed by Espresso is 1 million bytes, which is greater than the throttling threshold, so the batcher does not need to check whether the size is too large. However, if the Espresso limit becomes smaller than the throttling threshold due to a settings update, the batcher should add additional size handling.
After submitting transactions to Espresso, the batcher waits for them to be finalized. It then validates the confirmation and constructs the batch derived from the Espresso data.
The batcher computes the blob or calldata hashes for the transactions and signs them using its ephemeral key. The signature (note that this differs from the batcher signature, which stays the same for every block) and the batch hash are sent to the Batch Authentication Contract, which validates the signature and records the hash as an acceptable batch to post.
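A minimal sketch of this step, assuming go-ethereum’s crypto package and a keccak256 commitment over the calldata frame (the blob path would hash the EIP-4844 versioned blob hashes instead); the function names are illustrative:

package batcher

import (
    "crypto/ecdsa"

    "github.com/ethereum/go-ethereum/crypto"
)

// batchHash commits to the exact batch payload the inbox contract will see.
// For the calldata path this is simply keccak256 of the frame data; the blob
// path would hash the versioned blob hashes instead. (Illustrative choice of
// commitment, not the canonical encoding.)
func batchHash(frameData []byte) [32]byte {
    var h [32]byte
    copy(h[:], crypto.Keccak256(frameData))
    return h
}

// signBatchHash signs the batch commitment with the enclave-resident ephemeral
// key. The resulting 65-byte signature is what authenticate() on the Batch
// Authentication Contract recovers and checks against the registered key set.
func signBatchHash(ephemeralKey *ecdsa.PrivateKey, hash [32]byte) ([]byte, error) {
    return crypto.Sign(hash[:], ephemeralKey)
}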
Immediately after authenticating with the Batch Authentication Contract, the batcher sends the raw batch data to the Batch Inbox Contract.
Within the OP sequencer, op-node reads deposit transactions, derives the L2 state, and sends the constructed payload attributes to op-geth. With the transactions submitted by users and the payload attributes from op-node, op-geth constructs blocks to be queried by the batcher.
The OP batcher maintains two sets of keys with different purposes:
The key registered in the rollup chain config as the centralized batcher key. This key holds the authority to add batches to L1, and thus the ultimate authority to determine the sequence of inputs processed by the rollup; it therefore also acts as a centralized sequencing key.
This key may exist outside of the TEE enclave running the batcher, although the private key will need to be passed into the enclave in order for it to function.
A key generated inside the enclave which never leaves it. Thus, signatures from this key must originate inside the enclave. This is a way of proving some data originated from or was endorsed by the code running in the enclave. This is similar to producing a TEE attestation, but these signatures are cheaper to verify than the full TEE attestation.
The batcher must have both sets of keys in order to successfully post a batch; the former proves to the derivation pipeline that a batch originates from the centralized sequencer, while the latter proves to the inbox contract that the batch originates from within the TEE enclave.
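For illustration, the batcher could hold the two keys side by side as in the following sketch; the field names are hypothetical:

package batcher

import "crypto/ecdsa"

// batcherKeys groups the two signing keys described above; field names are
// illustrative, not taken from the actual implementation.
type batcherKeys struct {
    // batcherKey is registered in the rollup chain config and signs the L1
    // transactions that post batches; it may be provisioned from outside the
    // enclave and passed in at startup.
    batcherKey *ecdsa.PrivateKey

    // ephemeralKey is generated inside the enclave and never leaves it; its
    // signatures prove a batch commitment was endorsed by the enclave code.
    ephemeralKey *ecdsa.PrivateKey
}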
Let’s take a look at the core changes to the batcher design needed to publish blocks to Espresso as soon as they arrive from the sequencer, while maintaining the comparatively slower pace at which the original batcher design publishes frames to L1.
For the purposes of this section, the OP batcher implementation can be conceptualized as two loops and a channel manager:
The block loading loop periodically loads unsafe blocks from the sequencer and pushes them to the channel manager. After each iteration, it sends signals to the frame publishing loop to publish new frames.
The frame publishing loop is responsible for querying the channel manager for new frames and publishing those to the L1 or DA layer.
The channel manager converts blocks into batches, which are then queued into channels. These channels are subsequently split into frames, which serve as the unit of data published to the L1 or DA layer. This process is done to achieve optimal compression independent of block size.
Our implementation ensures that blocks published to the L1 or DA layer are initially confirmed on Espresso. This replaces the block loading loop with two new loops:
The batch queuing loop periodically (every 100ms by default) polls the sequencer for unsafe blocks. However, instead of pushing them directly to the channel manager, it spawns a new goroutine that is responsible for submitting the block as a batch to Espresso and waiting for confirmation, re-submitting the block if submission fails or the batch is not confirmed within a certain time frame.
The batch loading loop periodically (every 6 seconds by default, matching the default polling interval of the unmodified OP batcher) polls an Espresso Streamer instance for new batches from Espresso blocks. The Streamer scans the Espresso chain for valid batches and buffers them to re-establish ordering in the event that batches are confirmed out of order. The batch loading loop then pushes new batches to the channel manager and emits a publishing signal.
When either loop detects a reorg on L1, L2, or Espresso, batcher state is reset to the last safe L2 block.
This way, we ensure that blocks are published to Espresso as soon as they arrive from the sequencer, while L1 transaction frequency remains unchanged compared to upstream. This design also minimizes the amount of state the batcher needs to manage, which makes handling batcher restarts or L1 reorgs easier.
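The following Go sketch shows one possible shape of the two loops; the espressoClient, streamer, and channelManager interfaces and their method names are assumptions for illustration, not the actual op-batcher types:

package batcher

import (
    "context"
    "time"
)

// espressoClient, streamer, and channelManager are stand-ins for the real
// components; the method names below are illustrative.
type espressoClient interface {
    // SubmitAndConfirm submits a block as a batch, waits for Espresso
    // confirmation, and re-submits on failure or timeout.
    SubmitAndConfirm(ctx context.Context, block []byte) error
}

type streamer interface {
    // NextBatches returns batches confirmed on Espresso, re-ordered as needed.
    NextBatches(ctx context.Context) ([][]byte, error)
}

type channelManager interface {
    AddBatch(batch []byte)
}

// batchQueuingLoop polls the sequencer for unsafe blocks (every 100ms by
// default) and spawns a goroutine per block to submit it to Espresso and wait
// for confirmation.
func batchQueuingLoop(ctx context.Context, loadUnsafeBlocks func() [][]byte, esp espressoClient) {
    ticker := time.NewTicker(100 * time.Millisecond)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            for _, blk := range loadUnsafeBlocks() {
                blk := blk
                go func() {
                    _ = esp.SubmitAndConfirm(ctx, blk)
                }()
            }
        }
    }
}

// batchLoadingLoop polls the Espresso streamer (every 6s by default), pushes
// confirmed batches to the channel manager, and signals the frame publishing
// loop.
func batchLoadingLoop(ctx context.Context, str streamer, cm channelManager, publishSignal chan<- struct{}) {
    ticker := time.NewTicker(6 * time.Second)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            batches, err := str.NextBatches(ctx)
            if err != nil {
                continue // transient query failures are retried next tick
            }
            for _, b := range batches {
                cm.AddBatch(b)
            }
            if len(batches) > 0 {
                select {
                case publishSignal <- struct{}{}:
                default: // a signal is already pending
                }
            }
        }
    }
}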
The base layer must verify that each batch received is coming from a modified OP batcher running in a TEE, or else prevent the rollup from processing said batch. This is what ensures that the rollup will only execute batches that have been confirmed by Espresso (since the batcher in the TEE is confirming everything with Espresso).
Our integration needs to extend the logic for filtering batches to exclude any which are not sent from a batcher running the correct code in a TEE. We will deploy a batch inbox contract at the designated inbox address, which holds the filtering logic and rejects any transaction which is not authorized to post a batch.
Note that the derivation pipeline also ignores transactions that revert, as specified at 30.3.3, so the contract and the pipeline are consistent.
A challenge in designing this batch inbox contract is that it must maintain binary compatibility with the existing derivation pipeline, where batches are sent as raw calldata (or blob data) with no additional information.
This motivates the deployment of an additional contract, which we call the batch authentication contract. This contract is responsible for receiving extra inputs to authenticate batches and recording which batches are eligible to be posted. The fallback function of the batch inbox contract then simply calls a method on this authentication contract to check whether the batch being sent to it is eligible.
The batch authentication contract has three jobs:
function register(address ephemeralKey, bytes calldata attestation) external;
function authenticate(bytes32 batchHash, bytes calldata signature) external;
function isValidBatch(address batcherKey, bytes32 batchHash) external view returns (bool);
To accomplish these tasks, it maintains a set of registered ephemeral keys, and a mapping of batcher keys to batch hashes. Each ephemeral key in the set has been verified to live within a valid TEE enclave. The batch hash stored with each batcher key is the next batch eligible to be posted by that batcher, having been endorsed by one of the registered ephemeral keys.
mapping(address => bool) private ephemeralKeys;
mapping(address => bytes32) private authenticatedBatches;
Each time register is called, it verifies the TEE attestation on the ephemeral key being registered and, if valid, adds it to the ephemeralKeys set. Thereafter, batches signed by this key will be accepted. authenticate updates the stored batch hash for a batcher key, after validating a signature on the hash and batcher key, recovering the signing address, and checking that this address is in the ephemeral key registry. The batcher key to update is assumed to be msg.sender; that is, authenticate must be called from the same account that later posts the batch itself. isValidBatch simply reads the stored batch hash for the given batcher and compares it to the given hash.
The reason for storing only one hash at a time for each batcher is to save on the high gas costs of persistent storage. The intention is for the batcher to always call authenticate and then immediately post the batch to the inbox contract, so there is never a need to remember more than one batch at a time for a given batcher. These two calls should be thought of as two parts of the same operation; they are separated into two transactions merely to comply with the calldata format expected of batch posting by the derivation pipeline.
This introduces new failure modes we must consider, as it is now possible for a failure to occur while batch posting is in an intermediate state (i.e., we have successfully authenticated a batch commitment with the authentication contract, but have not yet sent the batch contents to the inbox contract), or worse, for an L1 reorg to later revert us to such an intermediate state. Previously, batch posting happened in a single atomic L1 transaction, so such intermediate states could never occur.
However, note that if we ever find ourselves in such an intermediate state, it suffices to retry the whole operation from the start. Any state in the authentication contract that is associated with our key will be overwritten if we restart the operation, and then, barring another failure or reorg, we will succeed in sending a batch to the inbox contract. Thus, we can handle all cases by retrying the entire two-transaction operation until it succeeds, which will happen as soon as we manage to send both transactions to L1 without an RPC failure or reorg (both rare), and as long as we always send the correct commitment with a valid signature to the authentication contract just before sending a batch to the inbox contract.
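A sketch of this retry strategy, assuming a hypothetical l1Poster interface that wraps the two L1 transactions and waits for their inclusion:

package batcher

import (
    "context"
    "fmt"
    "time"
)

// l1Poster abstracts the two L1 transactions involved in posting a batch.
// Method names are illustrative.
type l1Poster interface {
    // Authenticate sends authenticate(batchHash, signature) to the Batch
    // Authentication Contract and waits for the transaction to be included.
    Authenticate(ctx context.Context, batchHash [32]byte, sig []byte) error
    // PostBatch sends the raw batch data to the Batch Inbox Contract and waits
    // for inclusion.
    PostBatch(ctx context.Context, batchData []byte) error
}

// postBatchWithRetry treats authenticate + post as a single logical operation:
// on any failure it restarts from the beginning. Re-running authenticate simply
// overwrites the stored hash for our batcher key, so retrying from the start is
// always safe.
func postBatchWithRetry(ctx context.Context, p l1Poster, batchHash [32]byte, sig, batchData []byte) error {
    for {
        if err := p.Authenticate(ctx, batchHash, sig); err == nil {
            if err := p.PostBatch(ctx, batchData); err == nil {
                return nil
            }
        }
        select {
        case <-ctx.Done():
            return fmt.Errorf("batch posting abandoned: %w", ctx.Err())
        case <-time.After(5 * time.Second): // back off before restarting the pair
        }
    }
}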
Note: If the following approach doesn’t work out, we can use Distributed Lab’s solution to ensure that the signature is signed by an AWS KMS key that can only be used inside a TEE. With this solution, we don’t need extra changes to filter batches. We don’t do this in the first place because:
It’s AWS-specific.
It adds a trust assumption that KMS is honest and implemented correctly.
It doesn’t work for replacing TEEs with SNARKs.
The OP stack code derives batches by looking at transactions sent to a designated inbox address, but it does not check whether those transactions succeed or revert on L1. Thus, even if the Batch Inbox Contract reverts for an unauthorized batch, the derivation pipeline will still parse its calldata and treat it as a valid batch on L2. This means that without updating the derivation pipeline, the Batch Authentication Contract and Batch Inbox Contract cannot fully enforce TEE-based submission on L1 or prevent a compromised batcher key from posting data that reverts on-chain but is accepted by the pipeline.
To update the derivation pipeline to ignore transactions that revert, we need to update the L1 retrieval code, where transactions in L1 blocks are scanned to extract potential batch data. L1 retrieval is implemented by one of several data sources, depending on the DA configuration. Specifically, we should update the calldata source and the blob data source. Both of them currently verify the batch transaction with isValidBatchTx, which does not check the transaction status, so we need to add the check.
The transaction status can be obtained from the receipts returned by the FetchReceipts function that is implemented here, so the next step is to retrieve the receipt status for each transaction, skip the transaction if it does not have a success receipt status (i.e., the status is zero), and build the L2 data from the subset of successful transactions. Note that the receipts are already fetched and cached in the derivation pipeline, so we are not adding complexity to fraud proofs or significantly impacting performance or cost.
Depending on the data source, we should either add the above status check to DataFromEVMTransactions or to dataAndHashesFromTxs, so that only transactions that are both valid and successful are kept by the derivation pipeline.
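The added check could look roughly like this (a sketch using go-ethereum types; successfulTxs is an illustrative helper, not the actual diff to DataFromEVMTransactions or dataAndHashesFromTxs):

package derive

import (
    "github.com/ethereum/go-ethereum/core/types"
)

// successfulTxs filters out any batch transaction whose receipt indicates a
// revert, so that only transactions that are both valid and successful are
// passed on to the derivation pipeline. Receipts are assumed to be the ones
// already fetched and cached by the pipeline, indexed by transaction position.
func successfulTxs(txs types.Transactions, receipts types.Receipts) types.Transactions {
    kept := make(types.Transactions, 0, len(txs))
    for i, tx := range txs {
        if i < len(receipts) && receipts[i].Status == types.ReceiptStatusSuccessful {
            kept = append(kept, tx)
        }
    }
    return kept
}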
The OP’s Espresso streamer is responsible for fetching messages needed by both the op-caff-node and the op-batcher from the Espresso query nodes. It does this by tracking which blocks have already reached quorum—i.e., when enough nodes have agreed on a block. This ensures that only finalized blocks are processed. Each message contains sequencer batches and the corresponding signature from the batcher.
OP’s current batch validity rule assumes batches arrive in the correct order. It processes transactions as they come, simply accepting or dropping them.
In contrast, our validation process is similar to OP’s previous batch validity rule: we assume batches may arrive out of order. We re-order them based on their batch numbers before processing.
Our validation includes the following checks:
The batch’s timestamp must match the expected timestamp.
The batch’s parent hash must match the L2 safe head block hash.
The batch’s L1 origin hash must be valid on a finalized L1 block.
If, after reordering, a batch is found to be invalid with respect to the L2 parent state, it is immediately discarded.
When NextBatch() is called, the following steps are executed:
If the L1 origin is not finalized, the function returns NotEnoughData and retains the batch for re-evaluation in the next iteration.
If the L1 origin is finalized, the function then verifies the state by comparing the batch’s L1 origin hash with the hash derived from the current finalized L1 chain.
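A simplified sketch of the checks above; the batch and chainState types are trimmed-down stand-ins for the real structures:

package streamer

import (
    "bytes"
    "errors"
)

// Minimal stand-in types for the fields the checks need; the real batch and
// chain-state structures carry more information.
type batch struct {
    Timestamp    uint64
    ParentHash   [32]byte
    L1OriginNum  uint64
    L1OriginHash [32]byte
}

type chainState struct {
    ExpectedTimestamp uint64
    SafeHeadHash      [32]byte
    FinalizedL1Num    uint64
    // L1HashAt returns the canonical hash at a finalized L1 height.
    L1HashAt func(num uint64) [32]byte
}

var errNotEnoughData = errors.New("NotEnoughData: L1 origin not yet finalized")

// checkBatch mirrors the validation performed in NextBatch(): the batch is
// retained (NotEnoughData) while its L1 origin is unfinalized, and dropped if
// any consistency check against the L2 parent state fails.
func checkBatch(b batch, s chainState) error {
    if b.L1OriginNum > s.FinalizedL1Num {
        return errNotEnoughData // keep the batch and re-evaluate next iteration
    }
    if b.Timestamp != s.ExpectedTimestamp {
        return errors.New("drop: timestamp mismatch")
    }
    if !bytes.Equal(b.ParentHash[:], s.SafeHeadHash[:]) {
        return errors.New("drop: parent hash does not match L2 safe head")
    }
    origin := s.L1HashAt(b.L1OriginNum)
    if !bytes.Equal(b.L1OriginHash[:], origin[:]) {
        return errors.New("drop: L1 origin hash not on finalized L1 chain")
    }
    return nil
}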
To describe the algorithm to pick HotShot block height, let us first name the entities involved:
Let O, o be the current safe L2 block and its height.
Let E, e be the L1 origin of O and its height.
Let h be the HotShot block height recorded as finalized on L1 by the Light Client Smart Contract as of block E.
Let H be the HotShot block at height h.
The streamer queries the de-caffeinated OP node for the safe L2 block O and determines its L1 origin block E. The streamer then queries the Light Client Contract on L1 for the finalized HotShot block height h at L1 height e. h is picked as the starting point for traversing HotShot for unsafe batches.
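In code, the height-selection rule amounts to the following sketch; the opNode and lightClient interfaces and their method names are assumptions for illustration:

package streamer

import "context"

// opNode and lightClient are illustrative interfaces over the OP node RPC and
// the Light Client Contract; the method names are not the real APIs.
type opNode interface {
    // SafeL2Block returns the current safe L2 block height o and the height e
    // of its L1 origin E.
    SafeL2Block(ctx context.Context) (l2Height uint64, l1OriginNum uint64, err error)
}

type lightClient interface {
    // FinalizedHotShotHeight returns the HotShot height recorded as finalized
    // in the Light Client Contract state as of the given L1 block number.
    FinalizedHotShotHeight(ctx context.Context, l1BlockNum uint64) (uint64, error)
}

// startingHotShotHeight implements the height-selection rule described above:
// read the safe L2 block O, find its L1 origin E at height e, and take the
// finalized HotShot height h recorded in the Light Client Contract at e.
func startingHotShotHeight(ctx context.Context, node opNode, lc lightClient) (uint64, error) {
    _, e, err := node.SafeL2Block(ctx)
    if err != nil {
        return 0, err
    }
    return lc.FinalizedHotShotHeight(ctx, e)
}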
Considering there is no requirement that batches appear on the HotShot chain in order, this approach requires proof that picking h will nonetheless result in no unsafe batches (i.e., batches with height ≥ o) being skipped.
We can prove this by contradiction. Assume the opposite: let h′ ≤ h be the height of a HotShot block H′ that contains a batch O′ with height o′ ≥ o. Let us also define Eu as the L1 block in which the transaction updating the Light Client Contract state was included. Consider the cross-dependencies for the chains involved:
E ⊃ Eu: For the finalized height to be set to h in the Light Client Contract state at height e, it needs to be set by a transaction in E or its ancestor, so E is either Eu itself or references it.
Eu ⊃ H: The transaction that updated the Light Client Contract’s finalized HotShot height to h included the hash of H.
H ⊃ H′: H has to reference H′ as its ancestor.
H′ ⊃ O′: H′ includes the transaction from the batcher containing O′.
O′ ⊃ O: O′ has to reference O as its ancestor.
O ⊃ E: O references E as its L1 origin by hash.
Thus we encounter a dependency cycle where E references itself through a chain of blocks that include each other, either directly by hash or through a chain of hashes, making the blocks involved impossible to construct. Our initial assumption is therefore incorrect, and we can guarantee that no batches will be skipped by the streamer when traversing the HotShot chain.
The multiple nodes client is initialized with a list of HotShot URLs of the query nodes, and the streamer uses this client to fetch messages from HotShot at a high frequency (the interval is 500 milliseconds).
It starts by calling FetchLatestBlockHeight on HotShot and begins fetching blocks from that height onward. In each iteration, it increments nextHotShotBlockNum and retrieves all messages from the query node for the specified namespace and block number.
The streamer then iterates through all the newly fetched messages, parses them, extracts the corresponding batcher’s signature, and verifies it against the batcher’s fixed key.
Verified data will be returned as sequencer batches for a required target L2 block number. This will give the Caff node all the information needed from Espresso to construct payload attributes.
When deriving the next batch for a given L2 chain state, make sure the needed L1 origin is already finalized.
If a batch is resubmitted to HotShot, the streamer might receive more than one batch with the same L2 block number; it keeps only the first and skips later batches for the same number.
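Putting the above together, the polling loop could be sketched as follows; apart from FetchLatestBlockHeight and nextHotShotBlockNum, which are named in the text, the types and method names are illustrative:

package streamer

import (
    "context"
    "time"
)

// queryClient is an illustrative multiple-nodes client over the Espresso query
// service.
type queryClient interface {
    FetchLatestBlockHeight(ctx context.Context) (uint64, error)
    FetchMessages(ctx context.Context, namespace uint64, blockNum uint64) ([][]byte, error)
}

type parsedBatch struct {
    L2BlockNumber uint64
    Payload       []byte
}

// pollHotShot fetches namespace messages block by block at a 500ms interval,
// verifies each batcher signature, and keeps only the first batch seen for any
// given L2 block number (later re-submissions are skipped).
func pollHotShot(ctx context.Context, qc queryClient, namespace uint64,
    parseAndVerify func(msg []byte) (parsedBatch, bool), out chan<- parsedBatch) error {

    next, err := qc.FetchLatestBlockHeight(ctx)
    if err != nil {
        return err
    }
    seen := make(map[uint64]bool)
    ticker := time.NewTicker(500 * time.Millisecond)
    defer ticker.Stop()

    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        case <-ticker.C:
            msgs, err := qc.FetchMessages(ctx, namespace, next)
            if err != nil {
                continue // the block may not exist yet; retry the same height
            }
            for _, m := range msgs {
                b, ok := parseAndVerify(m) // checks the batcher's fixed-key signature
                if !ok || seen[b.L2BlockNumber] {
                    continue
                }
                seen[b.L2BlockNumber] = true
                out <- b
            }
            next++ // nextHotShotBlockNum
        }
    }
}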
The OP node can derive the state from any stream of batches. Depending on how it is set up, it should be able to derive from the sequencer feed (preconfirmations), from the L1 inbox (the real finalized state), or, if it is a “Caffeinated node (Caff node)”, it should have an option to derive from Espresso.
This page documents the technical design of the Caff node component of the OP stack integration. This node can be used by anyone who wants to derive the finalized state of the OP rollup as soon as it is confirmed by Espresso (before L1), including solvers in intent-based bridges that listen for blocks finalized by HotShot. The page therefore covers the process of reading L2 derivation inputs from Espresso to derive the L2 chain. The Caff node uses an op-espresso-streamer to fetch finalized messages and then runs them through the OP state transition function to verify their validity.
L2 derivation inputs include:
sequencer batches from Espresso. A sequencer batch is a list of L2 transactions (that were submitted to a sequencer) tagged with an epoch number and an L2 block timestamp. Each batch represents the inputs needed to build one L2 block (given the existing L2 chain state).
Batcher’s signature on these sequencer batches from Espresso, so that we can make sure the message comes from the trusted sequencer by verifying the signature.
deposits from L1. Deposits are transactions deposited to L2 from L1; they form the first block of each epoch. These deposits can be read by checking the L1 origin given by the epoch number in the sequencer batches.
System configuration updates from L1. This refers to the collection of dynamically configurable rollup parameters maintained by the SystemConfig contract on L1 and read by the L2 derivation process, such as the recognized batch submitter account. This part needs more research, but for now we will just read it from the finalized L1 block.
For each L2 block to be created, we start from a sequencer batch matching the target L2 block number.
Derive the transaction list from the corresponding sequencer batch. If it is the first L2 block of the epoch, aggregate the deposit transactions by reading the finalized L1 origin’s receipts. If there are any special transactions (see Payload Attributes), add them before the regular transactions.
Build individual payload attributes from the transaction list. The payload attributes should be exactly the same as what op-node would derive from L1.
In the engine queue stage, the previously derived PayloadAttributes structures are buffered and sent to the execution engine to be executed and converted into a proper L2 block. This engine stage should be exactly the same as what op-node already has.
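A simplified sketch of the attribute-building step; payloadAttributes and sequencerBatch are trimmed-down stand-ins for the real OP structures:

package caffnode

// payloadAttributes is a trimmed-down stand-in for the OP engine-API
// PayloadAttributes; only the fields relevant to this sketch are shown.
type payloadAttributes struct {
    Timestamp    uint64
    Transactions [][]byte // opaque, encoded transactions
    NoTxPool     bool     // the Caff node, like op-node, fully determines the txs
}

type sequencerBatch struct {
    L2BlockNumber uint64
    EpochNumber   uint64
    Timestamp     uint64
    Transactions  [][]byte
}

// buildPayloadAttributes assembles the attributes for one L2 block: deposits
// (read from the finalized L1 origin's receipts) come first when this is the
// first block of the epoch, followed by the batch's regular transactions. The
// result must match what op-node would derive from L1 for the same inputs.
func buildPayloadAttributes(b sequencerBatch, firstOfEpoch bool,
    readDeposits func(epoch uint64) [][]byte) payloadAttributes {

    var txs [][]byte
    if firstOfEpoch {
        txs = append(txs, readDeposits(b.EpochNumber)...)
    }
    txs = append(txs, b.Transactions...)
    return payloadAttributes{
        Timestamp:    b.Timestamp,
        Transactions: txs,
        NoTxPool:     true,
    }
}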
It costs around 63 million gas to validate an attestation with no prior verified certificates. To mitigate costs, we should have a mechanism where the batcher generates a key at startup and sends an attestation confirming the key’s origin in an enclave. Once verified, the contract can add the public key to a valid set, reducing per-batch posting costs.
L1 reorgs can impact batch consistency and state integrity. To avoid issues due to reorgs, the sequencer should have the flexibility to derive the state from either the latest or the finalized block on L1. Depending on whether such flexibility is supported, the implementations differ as follows.
To support such flexibility (e.g., in Celo’s code), the following modifications to the original Optimism code are necessary.
Sequencer: Celo adds SequencerUseFinalizedL1Flag, which, if set to true, signals that the sequencer accepts only finalized L1 blocks. The sequencer tracks the finalized block via L1Finalized, which ensures we only build a new block that references a finalized L1 block.
OP batcher and Caff node: They should also make sure the derived L1 state in each block is finalized to prevent a malicious sequencer. This is important because the L2 batch references an L1 block hash that might not exist after an L1 reorg and could thus cause money to be lost if clients trust Espresso confirmations. These additional checks will not cause significant latency because an honest sequencer should already enforce a finalized state. This does mean that we need to turn on the SequencerUseFinalizedL1Flag flag for Celo integration.
However, for chains that derive the latest rather than the finalized state, i.e., without the sequencer update described above, adding the finality checks to the batcher and the Caff node will affect the performance. In this case, we may consider skipping the finality wait.
The light client is expensive, and because we do not run it frequently (around every 10 minutes), it adds lag between sequencing and L1 finality. It is better to verify the header directly from the query service, either by downloading a chain of QCs or a set of Schnorr signatures.
With the query service, we may start with the majority rule for simplicity, then switch to Merkle proof verification. The Arbitrum Nitro and OP integration teams may collaborate on this: one team prototypes it and shares the outcomes with the other.
Note this does not mean we cannot use the light client at all. We may still use it for operations such as fetching the finalized state from HotShot, which is more complicated to do through the query service.
In the initial version, we support a permissioned batcher. In the long term, as described in the rollup integration page, we should be able to run multiple batchers.
To support this, we need to update the derivation pipeline to include a sequencer signature check and replace the batcher signature sent to Espresso with the sequencer signature. Note that the sequencer signature is not the same as the batcher signature. Since we run the batcher inside a TEE, we cannot consider the sequencer and the batcher as one party.
We may remove the L1 finality checks from the Caff node and the OP batcher for chains that don’t enable the sequencer to derive the finalized state. See L1 Reorg Handling for more reasoning.
When a new L1 block arrives, OP node reads the system config and identifies any parameter change by comparing with its cached config. A config change contained in an executed L1 transaction means the upgrade proposal has been approved by governance, so the node will apply the upgrade. Therefore, in order to have a dynamic inbox address, we should update the system config by adding an inbox address parameter to SystemConfig, and update config handling to consider the inbox address change. Depending on the decentralization choice, each chain has its own governance setting. Once the upgrade is approved, the OP node will pick up the change in the next block. No separate approval signal is needed from the node’s perspective since it trusts the L1 state.
We may add an escape hatch mechanism to provide flexibility when HotShot is unable to provide finality. The batcher will call IsHotshotLive before posting batches, and if HotShot is unavailable, the batcher can be configured to either wait for HotShot to go live or bypass consistency checks.
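A sketch of the escape hatch gate, assuming a hypothetical hotshotLiveness interface wrapping the IsHotshotLive check:

package batcher

import (
    "context"
    "errors"
    "time"
)

// hotshotLiveness is an illustrative interface over the IsHotshotLive check
// (e.g. against the light client contract or the query service).
type hotshotLiveness interface {
    IsHotshotLive(ctx context.Context) (bool, error)
}

// waitOrBypass implements the configurable escape hatch: if HotShot is down,
// either wait for it to come back or, when bypass is enabled, skip the Espresso
// consistency checks and post directly.
func waitOrBypass(ctx context.Context, hs hotshotLiveness, bypassOnDown bool) (bool, error) {
    for {
        live, err := hs.IsHotshotLive(ctx)
        if err == nil && live {
            return false, nil // normal path: confirm with Espresso first
        }
        if bypassOnDown {
            return true, nil // escape hatch: post without Espresso confirmation
        }
        select {
        case <-ctx.Done():
            return false, errors.New("gave up waiting for HotShot liveness")
        case <-time.After(30 * time.Second): // re-check periodically
        }
    }
}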