Chapter 13
Query Service Requirements



Component query service


Stakeholders Jeb Bearer, Abdul Basit



# Consistency: For all the availability API endpoints, all successful responses to a given endpoint with fixed parameters are equivalent from all honest nodes. In simple terms, this requirement means that the availability API is deterministic; it only surfaces data which has agreement from consensus and will never change over time or vary between honest nodes. Honest nodes may only differ in what data they currently have available at all (which results in error responses).

Acceptance: In a running Espresso network with multiple query nodes, randomly generate queries, send them to each node, and check that the responses are the same.
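The consistency check can be sketched as a small comparison routine. This is only an illustration under stated assumptions: each node is represented by a hypothetical `fetch` callable that returns the decoded response for a query, or `None` when the node answers with an error, and the query strings are placeholders rather than real endpoint paths.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical fetcher: query -> decoded response, or None on an error response.
Fetch = Callable[[str], Optional[Any]]


def find_inconsistencies(nodes: Dict[str, Fetch], queries: List[str]) -> List[str]:
    """Return the queries for which two nodes gave different successful responses."""
    inconsistent = []
    for query in queries:
        # Error responses (None) are allowed to differ between honest nodes,
        # so only successful responses are compared.
        successes = [r for r in (fetch(query) for fetch in nodes.values()) if r is not None]
        if any(r != successes[0] for r in successes[1:]):
            inconsistent.append(query)
    return inconsistent
```

A node that simply lacks an object does not count as inconsistent; only two differing successful responses do.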

# Completeness: Every honest node eventually provides every consensus object and, with the exception of nodes that have explicitly opted into pruning, never fails to provide an object after it has provided it once. Note that this requirement applies only to the availability API and to a single endpoint of the node API (which provides VID shares). Other API modules provide access to non-consensus data with weaker availability requirements.

Acceptance: Run an Espresso network up to a certain block height h. Using the leaf, block, and VID streaming endpoints, check that it is possible to scan through all the data for every block up to height h on every individual node. Repeat in the presence of failures such as nodes being restarted or offline for periods of time.

Acceptance: Run an Espresso network up to a certain block height h. Start up a new node and register it to stake. Run to a later height h' > h. Check that the new node eventually provides all consensus data for all blocks up to h.
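A scan of this kind might be sketched as follows. The `fetch(kind, height)` callable and the kind labels are hypothetical stand-ins for the leaf, block, and VID endpoints, and the sketch assumes "up to height h" is inclusive.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical fetcher: (kind, height) -> object, or None if the node cannot provide it.
Fetch = Callable[[str, int], Optional[Any]]

# Placeholder labels for the object types named in the acceptance criterion.
KINDS = ("leaf", "block", "vid")


def completeness_gaps(nodes: Dict[str, Fetch], h: int) -> List[Tuple[str, str, int]]:
    """Return (node, kind, height) for every object a node fails to provide up to h."""
    gaps = []
    for name, fetch in nodes.items():
        for kind in KINDS:
            for height in range(h + 1):  # assumes heights 0..h inclusive
                if fetch(kind, height) is None:
                    gaps.append((name, kind, height))
    return gaps
```

Because completeness is an eventual property, a real test would presumably retry this scan until the list is empty or a timeout expires.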

# Lightweight: The minimal subset of the APIs required for a node to participate honestly and correctly in consensus requires finite and limited resources. In particular, this requirement motivates Pruning: the ability of a node to delete data older than a specified threshold or greater than a specified size. It also motivates the various data source implementations, which allow nodes to store more or less data depending on their use case, from the minimum required for consensus (recent leaves and VID shares only) to full archival data. In addition, the work required to answer any particular publicly exposed query should be sufficiently small that


# Monitoring: The status API provides sufficient metrics that automated monitoring can be set up for a node. The granularity of these metrics should be such that

Acceptance: Using the query service metrics, configure alerts such that no incidents are missed and the team doesn’t get grumpy from false alarms.


# Trust-Minimized: The availability API can provide sufficient data to clients that a client who does not trust the server can convince themselves that the responses from the server are correct (that is, in agreement with other honest consensus nodes). Any additional data included with a response by way of proof should be succinct, meaning it is not larger than the data being returned itself.

Acceptance: It should be possible to design a client side library with an interface mirroring that of the query service itself, but whose functions return errors in the case of any malicious response from the server.
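One possible shape for such a client library is a thin wrapper that refuses to return unverified data. The sketch below is an assumption-laden illustration: `fetch` and `verify` are hypothetical injected callables (the actual proof format and verification routine are not specified here), and the succinctness rule is checked as "proof no larger than data".

```python
from typing import Callable, Tuple


class UntrustedResponse(Exception):
    """Raised when a server response fails verification."""


class VerifyingClient:
    def __init__(
        self,
        fetch: Callable[[str], Tuple[bytes, bytes]],      # hypothetical: query -> (data, proof)
        verify: Callable[[str, bytes, bytes], bool],      # hypothetical proof checker
    ) -> None:
        self._fetch = fetch
        self._verify = verify

    def get(self, query: str) -> bytes:
        """Return the data for `query`, or raise if the response cannot be trusted."""
        data, proof = self._fetch(query)
        # Succinctness: the proof must not be larger than the data it accompanies.
        if len(proof) > len(data):
            raise UntrustedResponse(f"proof ({len(proof)} B) larger than data ({len(data)} B)")
        if not self._verify(query, data, proof):
            raise UntrustedResponse(f"proof verification failed for {query!r}")
        return data
```

The wrapper mirrors a plain "get" call, so callers only see verified data or an error.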


# Finality Notifications: The availability API can be used to get notifications when new blocks are confirmed, and then to download information about the confirmed blocks (such as headers and payloads).

Acceptance: Use the header, leaf, block, payload, and VID streams to check that new objects are confirmed every few seconds. Check that the different objects corresponding to each block are all consistent (e.g. the payload hash in the leaf matches the actual payload). Check that the single object queries for a given block are successful immediately after receiving that block on a stream.


# Namespace Queries: The availability API can be used to download a specified namespace within a given block. The amount of data transmitted is proportional to the size of the namespace, but not the size of the overall block.

Acceptance: Submit transactions such that a block is created with a small transaction in namespace A and a very large transaction in namespace B. Fetch the data for each namespace separately, and check that it matches the submitted transactions. Inspect the size in bytes of the HTTP response for namespace A and require that it is within a factor of 2 of the submitted transaction size.
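The two checks in this criterion (content match and response size) could be expressed roughly as below. The function name and arguments are hypothetical, and "within a factor of 2" is interpreted here as an upper bound on the raw HTTP body size.

```python
from typing import List, Sequence


def check_namespace_response(
    body: bytes,
    returned: Sequence[bytes],
    submitted: Sequence[bytes],
    factor: float = 2.0,
) -> List[str]:
    """Return a list of problems with a namespace query response (empty if none)."""
    errors = []
    if list(returned) != list(submitted):
        errors.append("returned transactions do not match submitted transactions")
    submitted_size = sum(len(tx) for tx in submitted)
    # The response may carry some overhead (encoding, proofs), but it should scale
    # with the namespace, not with the rest of the block.
    if len(body) > factor * submitted_size:
        errors.append(
            f"response is {len(body)} bytes, more than {factor}x the {submitted_size} bytes submitted"
        )
    return errors
```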


# State Queries: For each component of the ?? (block state, fee state, and reward state), it should be possible to query the state of individual entries in the tree at any current or past block height, and receive the correct entry value and a valid Merkle proof.

Acceptance: Run an Espresso network with a fee-paying builder. For randomly selected block heights h:



# Stake Table Queries: The node API provides the contents of current and past consensus stake tables.

Acceptance: Run an Espresso network until it has changed epochs several times. Fetch the stake table by epoch number for each epoch, and check that it matches the configuration of the L1 Stake Table Smart Contract at the time corresponding to that epoch. Fetch the current stake table and check that the result is the same as fetching the stake table by epoch number for the current epoch number.

Acceptance: Make edge case queries like fetching future epochs and very old epochs. They should quickly respond with an error.


# Iteration: The availability API should provide paginated methods for iterating over sequential objects (blocks, leaves, headers, payloads, VID common, and transactions) in forwards or backwards order. For transactions, it should be possible to filter the iteration on the server side by a particular namespace.

This feature is especially useful for indexer type applications like a block explorer.

Tech Debt: Currently, transaction iteration is not supported. Clients must iterate over blocks, manually expanding to transactions and filtering by namespace.

Acceptance: Run an Espresso network until it reaches a certain block height h. Iterate over all of the aforementioned object types, from 0 to h and from h to 0. At each step, check that the object has the expected height (i.e. the iteration proceeds in order without skipping any objects) and matches the result of querying for that object individually.

Acceptance: Repeat with various randomly selected page sizes.

Acceptance: Submit transactions for many namespaces. Iterate over transactions filtering by a particular namespace, and check that all transactions from that namespace, and no others, are returned in the appropriate order.
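The ordered-iteration checks above can be sketched with a generic paging loop. Everything here is illustrative: `fetch_page(from_height, limit, backwards)` and `fetch_one(height)` are hypothetical callables standing in for the paginated and single-object endpoints, and each object is assumed to expose a `"height"` field.

```python
from typing import Callable, Dict, Iterator, List

Page = List[Dict]
FetchPage = Callable[[int, int, bool], Page]  # (from_height, limit, backwards) -> page


def iterate(fetch_page: FetchPage, start: int, end: int, page_size: int) -> Iterator[Dict]:
    """Yield objects from start to end inclusive; backwards if start > end."""
    backwards = start > end
    step = -1 if backwards else 1
    cursor = start
    remaining = abs(end - start) + 1
    while remaining > 0:
        page = fetch_page(cursor, min(page_size, remaining), backwards)
        if not page:
            raise RuntimeError(f"empty page at height {cursor}")
        yield from page
        cursor += step * len(page)
        remaining -= len(page)


def check_iteration(fetch_page: FetchPage, fetch_one: Callable[[int], Dict],
                    start: int, end: int, page_size: int) -> List[str]:
    """Check heights are sequential and each object matches its individual query."""
    errors = []
    expected = start
    step = -1 if start > end else 1
    for obj in iterate(fetch_page, start, end, page_size):
        if obj["height"] != expected:
            errors.append(f"expected height {expected}, got {obj['height']}")
        elif obj != fetch_one(expected):
            errors.append(f"height {expected} differs from single-object query")
        expected += step
    return errors
```

Running `check_iteration` for both `(0, h)` and `(h, 0)` with several random `page_size` values covers the first two acceptance criteria; the namespace filter would need an additional, similarly structured check.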


# Aggregate Statistics: The node API should provide aggregate statistics for analyzing network usage, such as counts of transactions, counts of blocks, and total amount of data confirmed by the GCL. This data should be queryable:

Tech Debt: Currently aggregate statistics are not indexed by namespace, and can only be queried for all namespaces in aggregate.

Acceptance: Submit blocks of known sizes with various namespaces populated. Check that the total transaction count, total payload size, and per-namespace counts and sizes reflect the submitted data. Using ranged queries, ensure it is possible to exclude certain block ranges from the counts and sizes in the query results.
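To compare query results against the submitted data, a test needs the expected counts and sizes. The helper below is a hedged sketch of that bookkeeping only: it models a block as a hypothetical mapping from namespace ID to transaction payloads, uses half-open height ranges, and assumes payload size is the sum of transaction byte lengths (any encoding overhead in real payloads is ignored).

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model of a submitted block: namespace ID -> transaction payloads.
Block = Dict[int, List[bytes]]


def expected_stats(
    blocks: List[Block],
    start: int = 0,
    end: Optional[int] = None,
    namespace: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (transaction count, payload bytes) for heights in [start, end)."""
    count = 0
    size = 0
    for block in blocks[start:end]:
        for ns, txs in block.items():
            # With namespace=None, all namespaces are aggregated together.
            if namespace is None or ns == namespace:
                count += len(txs)
                size += sum(len(tx) for tx in txs)
    return count, size
```

For example, `expected_stats(blocks, start=1)` gives the totals with the first block excluded, which can then be compared against the corresponding ranged query.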


# VID Reconstruction: Together, the availability API and node API provide enough data that any block can be reconstructed from its VID shares.

Assumption: The block is younger than the retention period of honest nodes that have enabled pruning.

Acceptance: In a running Espresso network, use the availability API to download VID common data for a certain block, and use the node API to download a corresponding VID share from each individual node. Run the VID recovery algorithm from Jellyfish to reconstruct a complete payload. Fetch the corresponding payload from the availability API and check that it is the same as the recovered payload.