Introducing shadow-reth

Today, we’re releasing shadow-reth – an open source shadow node built on Reth.

shadow-reth contains a series of Reth modifications that enable you to generate shadow events via Execution Extensions, and retrieve them easily with a custom RPC Extension.

Why shadow events?

Shadow events are a new primitive for generating blockchain data. They give developers greater freedom and flexibility over how they work with onchain data from smart contracts.

Shadow events bring clear benefits:

  1. Deeper data coverage: Generate net-new events by accessing onchain data that was previously inaccessible (or very difficult to access), on any smart contract.
  2. Simplified data pipelines: Drastically reduce the complexity of data pipelines by writing transformation logic directly in smart contracts themselves.
  3. Faster iteration cycles: Quickly test, verify, and iterate on shadow events using tools that you’re already familiar with.
  4. Permissionless, gasless logging: Permissionlessly add as many events as you want, on any contract you want, without increasing the gas burden on end users.

Why shadow-reth?

Our hosted platform packages powerful functionality into a set of easy-to-use features that enable a single developer to write advanced data pipelines in minutes, without complex ETL pipelines or additional offchain infrastructure.

However, fully hosted platforms present trustlessness tradeoffs. While shadow forks do not hold user funds in the same way that public blockchains do, they do manage data, and data is incredibly valuable.

We believe it’s critical that anyone can generate shadow events in a decentralized and self-hosted manner, and verify that data generated by a hosted shadow fork matches what is expected from its shadow contract implementation. This is much the same way that anyone can verify the data they get from a hosted mainnet RPC provider by self-hosting their own mainnet node.

To that end, we've released shadow-reth: an open source shadow node built on Reth, licensed under Apache 2.0 and MIT.

Reth Execution Extensions introduced a framework to build customizable, performant nodes. In short, Execution Extensions (ExExes) are post-execution hooks for building offchain infrastructure on top of Reth.

shadow-reth leverages Reth ExExes to provide an open-source implementation of a shadow node that allows anyone to generate shadow events and retrieve them via JSON-RPC. Importantly, this can all be run within a single Reth node, without requiring additional offchain infrastructure, or maintaining a heavily modified client fork.

This makes the benefits of shadow events accessible to all, and increases trustlessness and verifiability as their usage continues to proliferate.

Next, let’s dive into how shadow-reth was built.

How it works

At a high level, shadow-reth has three main components:

  1. Execution extension: ShadowExEx replays transactions from the canonical chain, using the provided configuration of shadow bytecode overrides. This replay generates shadow logs and writes them to an index (SQLite).
  2. RPC: The custom shadow JSON-RPC endpoint (shadow_getLogs) that fetches shadow logs from the index.
  3. Shadow configuration: The shadow fork configuration (shadow.json). This contains a map of contract address to shadow contract bytecode.

Execution Extension

When certain events occur within a Reth node — such as blocks being committed to the chain, reorgs, and state reversions — Reth’s ExExManager emits a corresponding ExExNotification which allows an ExEx to hook into the chain state and execute arbitrary code when an event is witnessed. These notifications allow us to access the state of the node, the committed chain, and any other useful context information.

With post-execution hooks, we’ve implemented a shadow node that can re-execute the canonical chain with shadow contract overrides, allowing for the emission of custom shadow events. We do this with an Execution Extension, ShadowExEx, which listens for ExExNotification::ChainCommitted events and uses the provided state to construct a ShadowDatabase<DB: revm::Database>. This database implementation wraps a HistoricalStateProviderRef built from the notification’s read-only database provider, and replaces contract bytecode with the overrides provided in shadow.json.
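The override behavior described above can be illustrated with a minimal, std-only sketch. It is not the actual ShadowDatabase: the names CodeProvider, CanonicalState, and ShadowOverlay are hypothetical stand-ins, and the real implementation works against revm and Reth provider types rather than plain maps.

```rust
use std::collections::HashMap;

/// 20-byte contract address (hypothetical alias for this sketch).
type Address = [u8; 20];

/// Hypothetical stand-in for the node's read-only state provider.
trait CodeProvider {
    fn code_at(&self, address: &Address) -> Option<Vec<u8>>;
}

/// Canonical contract code backed by a plain map, for illustration only.
struct CanonicalState {
    code: HashMap<Address, Vec<u8>>,
}

impl CodeProvider for CanonicalState {
    fn code_at(&self, address: &Address) -> Option<Vec<u8>> {
        self.code.get(address).cloned()
    }
}

/// Wraps an inner provider and serves shadow bytecode wherever an
/// override exists; every other address falls through to the inner state.
struct ShadowOverlay<P: CodeProvider> {
    inner: P,
    overrides: HashMap<Address, Vec<u8>>,
}

impl<P: CodeProvider> CodeProvider for ShadowOverlay<P> {
    fn code_at(&self, address: &Address) -> Option<Vec<u8>> {
        match self.overrides.get(address) {
            Some(shadow_code) => Some(shadow_code.clone()),
            None => self.inner.code_at(address),
        }
    }
}
```

Only code lookups are intercepted in this sketch. Balances, storage, and nonces would still come from the wrapped canonical state, which is what lets shadow contracts run against real chain data.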

With our ShadowDatabase, we’re able to re-execute the committed chain using our ShadowExecutor, a simple block executor that can execute blocks with overridden bytecode. We use this custom executor rather than Reth’s BatchExecutor because executing modified contract bytecode leads to both gas and state root issues, which cause BatchExecutor to fail. ShadowExecutor, by contrast, does not perform state validation; it uses revm::Evm::transact_preverified() to execute blocks with shadow bytecode and commit their state to the ShadowDatabase. From the executed block’s ResultAndState, we’re able to fetch the shadow events emitted within the block and sink them to SQLite.

ShadowExEx also listens for ChainReverted notifications so that it can mark reorged shadow events as removed, in compliance with the Ethereum JSON-RPC specification.
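As a rough illustration of the reorg handling, the sketch below flags stored logs instead of deleting them, mirroring the `removed` field on log objects in the Ethereum JSON-RPC specification. ShadowLog, ShadowIndex, and mark_removed are hypothetical names, and an in-memory vector stands in for the SQLite index.

```rust
/// A stored shadow log, reduced to the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct ShadowLog {
    block_hash: [u8; 32],
    log_index: u64,
    removed: bool,
}

/// In-memory stand-in for the SQLite index (an assumption of this sketch).
#[derive(Default)]
struct ShadowIndex {
    logs: Vec<ShadowLog>,
}

impl ShadowIndex {
    /// Records a log emitted while re-executing a committed block.
    fn insert(&mut self, block_hash: [u8; 32], log_index: u64) {
        self.logs.push(ShadowLog { block_hash, log_index, removed: false });
    }

    /// Flags every not-yet-removed log belonging to a reverted block.
    /// Returns how many logs were newly flagged.
    fn mark_removed(&mut self, reverted: &[[u8; 32]]) -> usize {
        let mut count = 0;
        for log in self.logs.iter_mut() {
            if !log.removed && reverted.contains(&log.block_hash) {
                log.removed = true;
                count += 1;
            }
        }
        count
    }
}
```

Keeping the rows and flipping a flag means a client that already fetched a log can later learn that it was reorged out, rather than seeing it silently disappear.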

Note: While there are many other scalable database engines, we chose to use SQLite for this initial implementation because it’s supported by most systems out of the box, doesn’t require additional installation, and can be run on a single machine. We aspire to have shadow-reth support an array of data destinations, such as Postgres, that users can pick and choose from.

Custom RPC endpoint

The Shadow RPC extension allows you to create custom implementations of methods to return information about indexed shadow events. It's primarily driven by a ShadowRpc type and ShadowRpcApi trait.

The ShadowRpc type serves as a wrapper around a database manager and a generic blockchain provider, which allows for access to the state of the chain. You can change and augment these types as you see fit; we've chosen to use SQLite for shadow event storage due to ease of use and portability, but this type and its implementation can easily be changed to use a different database product or even to be agnostic across storage solutions.

The ShadowRpcApi trait defines the public interface for the custom RPC extension and should be implemented for the ShadowRpc struct. To illustrate the capabilities of this extension and the possibilities for interacting with shadow data, we've included a shadow_getLogs RPC call: a namespaced equivalent to the commonly used eth_getLogs RPC call. You can retrieve raw logs from shadowed contracts using the same request parameters that you would use in eth_getLogs.
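Since shadow_getLogs takes the same parameters as eth_getLogs, a request body can be assembled the same way. The helper below is a hypothetical sketch that builds the JSON-RPC 2.0 payload as a string using only the standard library. It covers just the address, fromBlock, and toBlock filter fields, and assumes its inputs are hex strings or block tags that need no JSON escaping.

```rust
/// Builds a JSON-RPC 2.0 request body for `shadow_getLogs`.
/// Hypothetical helper: inputs are assumed to be 0x-prefixed hex strings
/// or block tags such as "latest", so no JSON escaping is performed.
fn shadow_get_logs_request(id: u64, address: &str, from_block: &str, to_block: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"shadow_getLogs","params":[{{"address":"{}","fromBlock":"{}","toBlock":"{}"}}]}}"#,
        id, address, from_block, to_block
    )
}
```

The resulting string can be sent with any HTTP client as a POST to the node's JSON-RPC endpoint. Swapping the method name for eth_getLogs yields the equivalent query against canonical logs, which makes side-by-side comparisons straightforward.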

This interface can also be extended to implement other common RPC calls or even completely new ones that are unique to your shadow data. The provider field on the ShadowRpc type allows you to access many parts of the chain's state thanks to Reth's extensive collection of provider traits; simply include the necessary traits in the implementation of your custom RPC call and you'll be able to use your Reth node to retrieve information about the chain.

Shadow configuration

For convenience, you can create a free Shadow account, set up contracts that you want custom shadow events for, and download the shadow.json configuration to run on your own self-hosted Reth node.

Alternatively, you can generate shadow contract bytecode with any external tool, like Foundry. In the future, we’ll release an integration that makes this easy to do directly from your CLI.

Limitations

Because shadow-reth is designed to be used with a single self-hosted node, there are a few important functional limitations to be aware of, relative to the Shadow hosted platform.

  • Gas limits: For data consistency reasons, shadow-reth does not override gas limits when re-executing a block with ShadowExecutor. A transaction may fail if it runs out of gas during shadow re-execution, in which case no shadow events will be emitted for that transaction.
  • Backfilling: shadow-reth does not backfill shadow events. If you start running shadow-reth on a synced Reth node, shadow-reth will only generate shadow events for blocks that have been processed since shadow-reth was started. If you want historical shadow events, you’ll need to re-sync your Reth node from genesis. We’re working closely with the Reth team to improve this.
  • Decoding: shadow-reth is designed to be analogous to a regular node, which doesn’t include event decoding. If you want to decode shadow events, we recommend polling the shadow_getLogs endpoint in a separate process.
  • Websockets: Shadow events will not be published over eth_subscribe websocket subscriptions.
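To give a sense of what decoding in a separate process involves, here is a minimal sketch that decodes two common pieces of a raw log returned by shadow_getLogs. It assumes a hypothetical event with an indexed address topic and a single uint256 data word (the same shape as an ERC-20 Transfer); the actual layout depends on your shadow contract's ABI. Both function names are illustrative.

```rust
/// Extracts a 20-byte address from a 32-byte indexed topic given as
/// 0x-prefixed hex. Indexed `address` values are left-padded with zeros.
fn address_from_topic(topic: &str) -> Option<String> {
    let hex = topic.strip_prefix("0x")?;
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // The first 24 hex characters (12 bytes) are padding and must be zero.
    if !hex[..24].bytes().all(|b| b == b'0') {
        return None;
    }
    Some(format!("0x{}", &hex[24..]))
}

/// Decodes one 32-byte data word as a u128, returning None when the
/// value does not fit (a full uint256 needs a big-integer type).
fn u128_from_word(word: &str) -> Option<u128> {
    let hex = word.strip_prefix("0x")?;
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // The upper 16 bytes must be zero for the value to fit in 128 bits.
    if !hex[..32].bytes().all(|b| b == b'0') {
        return None;
    }
    u128::from_str_radix(&hex[32..], 16).ok()
}
```

A production decoder would typically rely on an ABI library and match topics[0] against the event signature hash before decoding; this sketch skips that step to stay dependency-free.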

What's next?

We’re excited to continue to make shadow events accessible and verifiable by all.

The benefits they bring (permissionless, gasless logging; deeper data coverage; and simplified indexing pipelines) are difficult to ignore. However, there’s still a lot of work to do to unlock the full potential and benefits that they promise to the broader crypto ecosystem.

In the future, we aim to make it possible to:

  • Discover and review shadow contracts written by others
  • Dynamically opt into the exact shadow events that you want
  • Store shadow event data wherever you'd like

Using shadow-reth, you’ll be able to do all of this without requiring additional offchain infrastructure or maintaining a heavily modified client fork. And if you prefer not to manage your own node, you’ll be able to do all of this easily through our hosted platform.

The code for shadow-reth is available for free on GitHub under the Apache 2.0 and MIT licenses for anyone to use with no strings attached. We encourage the community to fork it, contribute docs, issues, pull requests, and questions, or even try to break it. We're incredibly excited to see what you do with it.

Reach out to us at gm@shadow.xyz if you want to work together to drive forward the future of onchain data.

Acknowledgements

shadow-reth wouldn't be possible without the hard work of the following projects:

  • Reth: The foundation of shadow-reth, an Ethereum archive node implementation that is focused on being user-friendly, highly modular, fast, and efficient.
  • Revm: Revm is an EVM written in Rust that is focused on speed and simplicity. Revm is the backbone of shadow-reth’s ShadowExecutor, as well as Reth itself.