Building Helix: A MEV Arbitrage Detector with Rust and REVM

MEV (Maximal Extractable Value) is one of the most technically demanding areas in blockchain engineering. It sits at the intersection of EVM internals, DeFi protocol math, systems programming, and real-time optimization. I built Helix to explore these concepts through a working implementation: a cyclic DEX arbitrage detector that forks live Ethereum state and simulates swaps with bit-exact accuracy.

The Problem

Price dislocations between decentralized exchanges create arbitrage opportunities. If WETH/USDC trades at one price on Uniswap and a different price on Sushiswap, a trader can buy on one and sell on the other for profit. The challenge is detecting these opportunities accurately and computing the optimal trade size, all before the next block lands.

Architecture

Helix is built around one small trait:

pub trait SwapSimulator {
    fn simulate_path(&self, path: &SwapPath, amount_in: u128) -> Result<u128>;
    fn max_trade_amount(&self, _path: &SwapPath) -> u128 {
        30 * 10u128.pow(18) // 30 ETH default
    }
}

The ArbitrageDetector is generic over any SwapSimulator, which means the detection logic is completely decoupled from how swaps are actually priced. This made testing trivial: I can plug in a MockSimulator for unit tests and the real EvmSimulator for mainnet analysis.

SwapSimulator trait
    ├── EvmSimulator (REVM, production)
    └── MockSimulator (unit tests)
         ↓
ArbitrageDetector<S: SwapSimulator>
         ↓
TernarySearchOptimizer
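
To make the decoupling concrete, here is a minimal sketch of a mock plugged into a generic detector. It is not Helix's actual code: the SwapPath fields, the String-based Result alias, the basis-point rate table, and the profit method are all assumptions made for illustration.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins: Helix's real SwapPath and Result types are not shown here.
pub type Result<T> = std::result::Result<T, String>;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SwapPath {
    pub pools: Vec<&'static str>, // pool labels, e.g. ["uni", "sushi"]
}

pub trait SwapSimulator {
    fn simulate_path(&self, path: &SwapPath, amount_in: u128) -> Result<u128>;
    fn max_trade_amount(&self, _path: &SwapPath) -> u128 {
        30 * 10u128.pow(18) // 30 ETH default
    }
}

/// Test double: returns amount_in scaled by a fixed per-path rate in basis points.
pub struct MockSimulator {
    pub rates_bps: HashMap<SwapPath, u128>,
}

impl SwapSimulator for MockSimulator {
    fn simulate_path(&self, path: &SwapPath, amount_in: u128) -> Result<u128> {
        let bps = self
            .rates_bps
            .get(path)
            .ok_or_else(|| "unknown path".to_string())?;
        Ok(amount_in * bps / 10_000)
    }
}

/// Detector that only sees the trait, never the concrete simulator.
pub struct ArbitrageDetector<S: SwapSimulator> {
    pub simulator: S,
}

impl<S: SwapSimulator> ArbitrageDetector<S> {
    /// Profit of sending `amount_in` around `path`; None if the
    /// simulation fails or the round trip does not make money.
    pub fn profit(&self, path: &SwapPath, amount_in: u128) -> Option<u128> {
        let out = self.simulator.simulate_path(path, amount_in).ok()?;
        out.checked_sub(amount_in).filter(|p| *p > 0)
    }
}
```

A unit test can then build a MockSimulator with a rate of 10_100 bps for one path and check that the detector reports a profit for it and None for an unknown path, all without touching an RPC endpoint.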

REVM Integration: The Actor Pattern

The hardest engineering challenge was bridging REVM's synchronous Database trait with Alloy's async RPC provider, while supporting parallel path evaluation via rayon.

The solution: an actor-based shared backend.

  rayon thread 1              rayon thread 2
  (path A simulation)         (path B simulation)
       │                            │
       ▼                            ▼
  CacheDB<SharedBackend>      CacheDB<SharedBackend>
  (isolated writes)           (isolated writes)
       │                            │
       └──── mpsc channel ──────────┘
                    │
                    ▼
         ┌─────────────────────┐
         │  Background Task    │
         │  AlloyDB (RPC)      │
         │  + CacheDB (cache)  │
         └─────────────────────┘

A single tokio task owns the RPC connection and serves all state requests through an mpsc channel. Each simulation gets its own CacheDB overlay for isolated writes, while reads that miss the overlay go through the shared channel. The block_in_place bridge (tokio::task::block_in_place, which requires tokio's multi-threaded runtime) lets synchronous REVM calls wait on the async backend without deadlocking the tokio runtime.

This pattern eliminates the need for an Arc<Mutex<...>> around the database: the actor owns all mutable state, and the channel serializes access naturally.
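
The shape of this pattern can be sketched with the standard library alone. This is an analogy rather than Helix's implementation: std threads and std::sync::mpsc stand in for tokio and rayon, a HashMap of u64 slots stands in for AlloyDB and EVM storage, and the names ReadRequest, Overlay, spawn_backend, and run_two_simulations are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// A read request: the slot to fetch plus a one-shot channel for the reply.
struct ReadRequest {
    slot: u64,
    reply: Sender<u64>,
}

/// Spawns the backend actor. It alone owns `state` (standing in for the
/// RPC-backed database and its cache), so no lock is needed.
fn spawn_backend(state: HashMap<u64, u64>) -> Sender<ReadRequest> {
    let (tx, rx) = channel::<ReadRequest>();
    thread::spawn(move || {
        // The loop ends once every Sender clone has been dropped.
        for req in rx {
            let value = state.get(&req.slot).copied().unwrap_or(0);
            let _ = req.reply.send(value);
        }
    });
    tx
}

/// Per-simulation overlay: writes stay local, read misses go to the backend.
struct Overlay {
    backend: Sender<ReadRequest>,
    local: HashMap<u64, u64>,
}

impl Overlay {
    fn new(backend: Sender<ReadRequest>) -> Self {
        Overlay { backend, local: HashMap::new() }
    }

    fn read(&mut self, slot: u64) -> u64 {
        if let Some(v) = self.local.get(&slot) {
            return *v;
        }
        let (reply, rx) = channel();
        self.backend
            .send(ReadRequest { slot, reply })
            .expect("backend stopped");
        // Blocking wait, playing the role of the sync-to-async bridge.
        let v = rx.recv().expect("backend dropped the reply");
        self.local.insert(slot, v);
        v
    }

    fn write(&mut self, slot: u64, value: u64) {
        self.local.insert(slot, value);
    }
}

/// Runs two isolated simulations in parallel against one shared backend.
fn run_two_simulations() -> (u64, u64) {
    let backend = spawn_backend(HashMap::from([(1, 100)]));
    let handles: Vec<_> = vec![5u64, 7]
        .into_iter()
        .map(|delta| {
            let mut db = Overlay::new(backend.clone());
            thread::spawn(move || {
                let v = db.read(1); // fetched from the shared backend
                db.write(1, v + delta); // visible only in this overlay
                db.read(1)
            })
        })
        .collect();
    let mut results = handles.into_iter().map(|h| h.join().unwrap());
    (results.next().unwrap(), results.next().unwrap())
}
```

Each worker sees its own write but never the other's, because writes never reach the backend. That is the same isolation the per-simulation CacheDB overlay provides.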

Ternary Search for Optimal Trade Size

For constant-product pools, arbitrage profit is a unimodal function of input amount: it rises as you capture the price dislocation, then falls as slippage dominates. That shape makes ternary search a good fit, since it needs only function evaluations and no derivatives.

  profit
    ▲
    │      ╱╲
    │     ╱  ╲
    │    ╱    ╲     ← ternary search narrows to peak
    │   ╱      ╲
    └──────────────► amount_in

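The optimizer evaluates the profit function at two interior points and discards the outer third next to the less profitable one, so the search interval shrinks by a third on every iteration. It converges on the optimal trade size in O(log(hi - lo)) iterations, where [lo, hi] is the range of input amounts. Each iteration costs two full REVM simulations of the swap path, so efficiency matters.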

pub fn find_optimal<F>(&self, lo: u128, hi: u128, eval_fn: F) -> (u128, u128)
where
    F: Fn(u128) -> Option<u128>,
{
    // Ternary search over [lo, hi] maximizing eval_fn(x) - x
}

The closure-based API keeps the optimizer generic: it doesn't know or care whether the evaluation uses REVM, a mock, or any other backend.
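
For readers who want to see the search loop itself, here is a minimal standalone sketch. It differs from the Helix signature above in two labeled ways: it is a free function rather than a method, and it returns the profit as an i128 so that losing trades can be represented. The amount_out helper uses the Uniswap V2 constant-product formula with the 0.3% fee, and the pool reserves in round_trip are made-up numbers.

```rust
/// Uniswap V2 style constant-product output with the 0.3% fee, integer math.
fn amount_out(amount_in: u128, reserve_in: u128, reserve_out: u128) -> u128 {
    let in_with_fee = amount_in * 997;
    in_with_fee * reserve_out / (reserve_in * 1000 + in_with_fee)
}

/// Hypothetical two-pool cycle: X -> Y on pool A, then Y -> X on pool B.
/// Pool A prices X about 10% higher than pool B does.
fn round_trip(x: u128) -> Option<u128> {
    let y = amount_out(x, 1_000_000_000, 1_100_000_000);
    Some(amount_out(y, 1_000_000_000, 1_000_000_000))
}

/// Integer ternary search for the input in [lo, hi] (requires lo <= hi)
/// that maximizes eval_fn(x) - x, assuming that profit is unimodal.
/// Returns (best_amount_in, best_profit). A failed evaluation (None)
/// is treated as the worst possible profit.
fn find_optimal<F>(mut lo: u128, mut hi: u128, eval_fn: F) -> (u128, i128)
where
    F: Fn(u128) -> Option<u128>,
{
    let profit = |x: u128| match eval_fn(x) {
        Some(out) => out as i128 - x as i128,
        None => i128::MIN,
    };
    while hi - lo > 2 {
        let third = (hi - lo) / 3;
        let (m1, m2) = (lo + third, hi - third);
        if profit(m1) < profit(m2) {
            lo = m1; // the peak cannot lie to the left of m1
        } else {
            hi = m2; // the peak cannot lie to the right of m2
        }
    }
    // At most three candidates remain: compare them directly.
    (lo..=hi)
        .map(|x| (x, profit(x)))
        .max_by_key(|&(_, p)| p)
        .unwrap()
}
```

Calling find_optimal(1, 100_000_000, round_trip) searches the whole range in a few dozen iterations. Integer rounding in the pool math creates tiny plateaus, so the result should be read as close to the optimum rather than guaranteed exact to the last unit.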

Type-Safe ABI Encoding

Interacting with Uniswap V2 contracts requires precise ABI encoding. Helix uses the alloy_sol_types::sol! macro for compile-time ABI generation:

alloy_sol_types::sol! {
    function getReserves() external view returns (
        uint112 reserve0,
        uint112 reserve1,
        uint32 blockTimestampLast
    );
    function token0() external view returns (address);
}

This generates type-safe calldata encoding and response decoding at compile time: no runtime ABI parsing and no string-based selectors, which removes a whole class of hand-rolled encoding errors.
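
For contrast, this is roughly what the macro saves you from writing by hand. The sketch below is not Helix code and does not use alloy. It hardcodes the two 4-byte selectors (the first four bytes of keccak256 of each signature) and decodes a getReserves() return value manually. The Reserves struct and decode_get_reserves are hypothetical names.

```rust
/// Function selectors for the two calls declared in the sol! block.
const GET_RESERVES_SELECTOR: [u8; 4] = [0x09, 0x02, 0xf1, 0xac]; // getReserves()
const TOKEN0_SELECTOR: [u8; 4] = [0x0d, 0xfe, 0x16, 0x81]; // token0()

#[derive(Debug, PartialEq)]
struct Reserves {
    reserve0: u128, // uint112 fits in a u128
    reserve1: u128,
    block_timestamp_last: u32,
}

/// Decodes the 96-byte return data of getReserves(): three 32-byte
/// big-endian words, each value right-aligned within its word.
/// This sketch does not verify that the unused high bytes are zero.
fn decode_get_reserves(ret: &[u8]) -> Option<Reserves> {
    if ret.len() != 96 {
        return None;
    }
    // Reads the low 16 bytes of the 32-byte word at index `i`.
    let word_low = |i: usize| -> u128 {
        let mut buf = [0u8; 16];
        buf.copy_from_slice(&ret[i * 32 + 16..i * 32 + 32]);
        u128::from_be_bytes(buf)
    };
    Some(Reserves {
        reserve0: word_low(0),
        reserve1: word_low(1),
        block_timestamp_last: word_low(2) as u32,
    })
}
```

Every offset and width here is a chance for an off-by-one bug, and nothing ties the selector constants to the decoder. Deriving both from a single Solidity signature is what removes that risk.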

What I Learned

Building Helix taught me several things about systems-level blockchain engineering:

REVM is production-grade: bluealloy's REVM (the EVM interpreter that powers Paradigm's Reth client) handles mainnet state correctly and efficiently. The CacheDB layering model is elegantly designed for simulation workloads.
The actor pattern fits EVM simulation well: RPC rate limits already force state fetches to be throttled, so channeling requests through a single task is a natural design rather than a bottleneck.
Trait-based abstraction pays off immediately: separating the simulator trait from the detector let me write meaningful unit tests without any RPC calls. The same detection code runs identically against mocks and mainnet.
MEV is an infrastructure problem: detecting opportunities is maybe 15% of the work. The real challenge is execution: private transaction submission, block builder competition, and sub-millisecond latency. Helix is deliberately scoped to the detection phase.

Technical Stack

Rust: systems performance with memory safety
REVM: bit-exact EVM simulation (same interpreter used by Reth)
Alloy: modern Ethereum primitives and RPC
Tokio: async runtime for RPC backend
Rayon: parallel path evaluation across CPU cores
Criterion: statistical benchmarking

What's Next

Helix is an educational project that demonstrates the core concepts. A production MEV system would additionally need dynamic pool discovery (graph-based cycle enumeration), Uniswap V3 concentrated liquidity support, Flashbots bundle submission, and mempool monitoring. Each of these is a significant engineering effort on its own.

The source code is available at github.com/danieljrc888/helix.