On ERC-4337, Intents, and MEV

2024-04-17


By Ben Basche & Alex Watts - FastLane Labs

MEV, account abstraction and intents represent 3 of the most talked about trends of the cycle, and 3 important pieces of the crypto value proposition and UX puzzle. While each of these supertrends has its own seemingly independent path of evolution, we at FastLane see them converging in important ways, with important design considerations to be made in DeFi as a result.

  • Intents (operations where users express a desired preference or outcome instead of a defined transaction execution path) are, in a way, just the internalization of MEV into an explicit in-protocol expression of user and solver preferences. Intents are typically, but not always, fulfilled by entities called solvers, who construct a solution to the user’s intent either from onchain protocols or, in some designs, from their own liquidity.
  • Account abstraction (the abstraction of all core functions of an account to their programmable essence: authentication, authorization, replay protection, gas payment and execution) is either natively supported by the chain or bolted on at the contract layer (although not necessarily via smart accounts, see UniswapX and permit2). In the case of the EVM, where account abstraction is not natively supported yet, some alternative supply chain is required that allows users to express their intents and have them fulfilled by some automation, solver or bundler. The most widely discussed standard for this is ERC-4337. Note that 4337 bundlers are distinct from bundlers in the traditional MEV supply chain: a 4337 bundler is an entity that collects 4337 UserOperations and places them inside of a transaction, taking responsibility for either charging the user for gas in some form, being reimbursed by some sponsorship paymaster policy, or paying for inclusion gas itself. An MEV bundler, by contrast, is an entity that collects transactions and places them in a bundle to be executed by a block builder or validator (we call these entities searchers in the MEV world).
  • MEV (maximal extractable value from permissionless systems: everything from gas fees to arbitrage, sandwich and backrunning profits from ordering and inserting transactions) is in large part the flip side of best execution for a user’s intent - a deviation from or tax on it - and represents a core design challenge in building intent-centric and account-abstracted systems. The MEV supply chain is a hotly debated topic, from ways to mitigate it, to redistribution schemes, to attempts to decentralize key points of centralization like relays. A few pieces of the stack have been targeted for MEV mitigation: the validator layer, with MEV Boost from Flashbots on Ethereum mainnet and FastLane on Polygon (PFL); the RPC layer, with MEV blocker et al; and more recently, and the focus of this post, the application layer, with application-specific order flow auctions (OFAs).

Now that we’ve sufficiently confused you and blurred the 3 concepts together (or rather, shown you the high-level overlap), let’s take a deeper look at the account abstraction side and work our way backwards to intents and MEV. ERC-4337 - Account Abstraction Using Alt Mempool - was introduced in 2021, and has succeeded where many other EVM account abstraction efforts have stalled. It introduces a lightweight smart account implementation interface alongside a permissionless UserOperation mempool, accessed by applications via the 4337 canonical entrypoint contract. This design allows for "UserOperations" - EIP-712 signed messages declaring the user’s intent to perform a transaction rather than actually performing it - to be bundled permissionlessly, much like transactions in Ethereum’s public mempool are today, but without a hard fork on Ethereum mainnet.
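To make the UserOperation idea a little more concrete, here is a minimal Python sketch of the message a smart account signs. The field names mirror the v0.6 struct from the ERC-4337 spec (later versions pack some of these fields differently), while the example values and the `max_gas_cost_wei` helper are our own illustration. The helper assumes no paymaster is involved and simply sums the three gas limits, which is a simplification of what the entrypoint actually computes.

```python
from dataclasses import dataclass

GWEI = 10**9


@dataclass(frozen=True)
class UserOperation:
    """Field names follow the v0.6 ERC-4337 struct; values are illustrative."""
    sender: str                    # smart account address
    nonce: int                     # replay protection
    init_code: bytes               # deploys the account if it does not exist yet
    call_data: bytes               # what the account should execute
    call_gas_limit: int
    verification_gas_limit: int
    pre_verification_gas: int
    max_fee_per_gas: int           # wei
    max_priority_fee_per_gas: int  # wei
    paymaster_and_data: bytes      # empty when the account pays for itself
    signature: bytes               # checked by the account's own validation logic


def max_gas_cost_wei(op: UserOperation) -> int:
    """Simplified upper bound on the gas charge for one operation.

    Assumption: no paymaster, so the three gas limits are simply summed.
    """
    total_gas = (
        op.call_gas_limit + op.verification_gas_limit + op.pre_verification_gas
    )
    return total_gas * op.max_fee_per_gas


# Hypothetical values, chosen only to make the arithmetic easy to follow.
example_op = UserOperation(
    sender="0x" + "11" * 20,
    nonce=0,
    init_code=b"",
    call_data=b"",
    call_gas_limit=100_000,
    verification_gas_limit=150_000,
    pre_verification_gas=50_000,
    max_fee_per_gas=30 * GWEI,
    max_priority_fee_per_gas=2 * GWEI,
    paymaster_and_data=b"",
    signature=b"",
)

print(max_gas_cost_wei(example_op))  # 300,000 gas at 30 gwei
```

The point to notice is that nothing here is a transaction yet: it is a signed statement of what the user wants done and the most they will pay for it, which some bundler later wraps in a real transaction.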

While work is still underway on the final implementation of the permissionless mempool(s), we have hundreds of thousands of smart accounts deployed (largely on layer 2s), millions of UserOperations sent, and dozens of venture backed infrastructure startups building out the tooling, bundling and SDK layer. Applications are finally beginning to adopt smart accounts for improved onboarding experiences and social recovery, gas abstraction and paymaster flows, and increasingly now module extensions to smart accounts through efforts like ERC-7579. Smart accounts give developers and users great flexibility while maintaining trustlessness and self-custody, and they open the door to even more expressive user automations. The recent push to include EIP-3074's AUTH and AUTHCALL opcodes in the next Ethereum hard fork will bring some of the benefits of account abstraction to EOAs (something which we are extremely excited about and will publish more thoughts on), but smart accounts and ERC-4337 are still going to be critically important for account abstraction innovation.

Lessons from Polygon MEV on how 4337 and intents will interact

Permissionless inclusion of cheap, gas abstracted and programmable metatransactions (the old word for user operations/intents) is absolutely a minimum requirement for us to reach the next billion users across mass-market and real world use cases. But all designs have their tradeoffs, and we think that when it comes to MEV-rich UserOps like swaps, the 4337 permissionless mempool may not be the best solution for smart accounts to have users’ intents fulfilled.

ERC-4337’s metatransaction inclusion supply chain is focused specifically on maximizing two traits: permissionlessness and decentralization.  4337 Bundlers such as Biconomy and Alchemy will receive UserOperations - typically via a private relay for now until the permissionless 4337 mempool is live - and place these operations inside of a transaction.  The Bundlers are willing to pay the gas cost for these transactions because they are reimbursed by the UserOperation, oftentimes with a different currency such as wETH or USDC. When the permissionless 4337 mempool is live, users and apps should have a fully censorship resistant way to get operations included in an eventual Ethereum block, but that may very well come at a high cost for something like an intent to swap X tokens for at least Y tokens: the game theory of the public mempool will drive the user to almost always get the worst price they allow for in their intent. Protecting against this will very likely mean that private relays will continue to need to be involved in intent operations where the user is trying to maximize execution quality, undermining much of the case for 4337's design to begin with for those cases.
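The "worst price they allow" dynamic can be sketched in a few lines of Python. This is a toy model, not any protocol's settlement logic: `SwapIntent`, `extractable_slack` and `user_receives` are hypothetical names, and amounts are plain integers in the smallest token unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwapIntent:
    sell_amount: int     # X tokens the user gives up
    min_buy_amount: int  # Y, the worst outcome the user will accept


def extractable_slack(intent: SwapIntent, achievable_buy_amount: int) -> int:
    """Tokens a filler could keep while still satisfying the intent."""
    return max(0, achievable_buy_amount - intent.min_buy_amount)


def user_receives(
    intent: SwapIntent, achievable_buy_amount: int, filler_keeps_slack: bool
) -> int:
    """What lands in the user's account under each behaviour of the filler."""
    if achievable_buy_amount < intent.min_buy_amount:
        raise ValueError("intent cannot be satisfied at this price")
    if filler_keeps_slack:
        return intent.min_buy_amount
    return achievable_buy_amount


# Hypothetical numbers: the market would deliver 990, the user accepts 950.
intent = SwapIntent(sell_amount=1_000, min_buy_amount=950)
print(extractable_slack(intent, 990))                       # slack up for grabs
print(user_receives(intent, 990, filler_keeps_slack=True))  # user gets the floor
```

In an open mempool, whoever is willing to take the slack can outbid whoever would have returned it, which is why the floor tends to become the outcome.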

What’s behind this? To help us understand how the 4337 permissionless bundling will work on any arbitrary EVM chain, let’s look at a chain which FastLane Labs has plenty of experience with when it comes to MEV: Polygon. FastLane on Polygon (or PFL) is our opt-in, validator-centric MEV solution on Polygon, and is the means by which over 65% of Polygon’s validators receive and process MEV bundles. 

When we designed PFL as a way for Polygon validators to opt in to a block-level OFA that protects end users, we made a deliberate decision to avoid creating any sort of trusted private relay to validators. Although there are many benefits to our approach, such as relay decentralization and validator security, the primary rationale behind the choice was two-fold:

  1. We wanted to strongly disincentivize sandwich attacks against all of Polygon’s Users, including those who use the public mempool.
  2. We wanted validators to capture the revenue that would otherwise go to centralized private orderflow auctions (OFAs).

All transactions in all FastLane MEV bundles (which aren’t the same thing as AA bundles, and are more akin to traditional MEV bundles) are broadcast back into the public memory pool. This allows any searcher to include the transactions of other searchers in their MEV bundles. Consider the following sequential structure of a sandwich attack:

  1. The searcher’s frontrun transaction.
  2. The User’s transaction.
  3. The searcher’s backrun transaction.

Note that the first transaction from the searcher - the frontrun - is similar to a “loss leader.”  By itself, it will lose money. The searcher only realizes profit when their second transaction is executed.  The capacity for these transactions to execute in an “all or none” batch is called MEV bundle atomicity. 

But on Polygon with PFL, we intentionally don’t guarantee the atomicity of MEV bundles. All the transactions in an MEV bundle are broadcast to the public mempool, meaning that other Searchers can use them in the construction of their own MEV bundles.  Consider a new party, SearcherB, who also wants to make money. What would they do?

SearcherB will combine SearcherA’s frontrun transaction and the User’s transaction with their own backrun transaction.  This leads to three important conclusions:

  1. SearcherB’s MEV bundle will always be more profitable than SearcherA’s, because SearcherA will always have higher costs than SearcherB due to the swap fees and gas fees of the frontrunning transaction.
  2. SearcherB will always be able to bid higher in auction and is expected to win the auction over SearcherA.
  3. Because SearcherB’s MEV bundle includes a cost to SearcherA (the frontrunning transaction), and because SearcherA cannot expect to win the auction without direct and detectable Validator intervention, the rational action for SearcherA is to simply not attempt to sandwich the User in the first place. 
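The three conclusions above can be checked with a small numeric sketch. It assumes, for simplicity, that both searchers capture the same revenue from the backrun and that each can bid at most its expected profit; the class and function names and the numbers are ours, for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandwichEconomics:
    backrun_revenue: int  # value captured by whoever lands the backrun
    frontrun_cost: int    # swap fees + gas on the loss-leading frontrun
    backrun_cost: int     # swap fees + gas on the backrun


def max_bid_searcher_a(e: SandwichEconomics) -> int:
    """SearcherA pays for both legs, so both costs cap its bid."""
    return e.backrun_revenue - e.frontrun_cost - e.backrun_cost


def max_bid_searcher_b(e: SandwichEconomics) -> int:
    """SearcherB reuses A's broadcast frontrun and pays only for the backrun."""
    return e.backrun_revenue - e.backrun_cost


def searcher_a_payoff_if_outbid(e: SandwichEconomics) -> int:
    """A's frontrun still executes inside B's bundle, so its cost is sunk."""
    return -e.frontrun_cost


# Hypothetical numbers in arbitrary units.
example = SandwichEconomics(backrun_revenue=100, frontrun_cost=15, backrun_cost=5)
print(max_bid_searcher_a(example), max_bid_searcher_b(example))
print(searcher_a_payoff_if_outbid(example))
```

Whenever `frontrun_cost` is positive, SearcherB's ceiling sits above SearcherA's, and SearcherA's expected result from attempting the sandwich is a loss, compared with zero for not attempting it.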

This system has been live on Polygon PoS for roughly a year now, and we have yet to observe a single sandwich attack succeeding via the FastLane relay or smart contract. While it’s technically possible, and we expect to see it happen someday, sandwich attacks are strongly disincentivized as described above: they would be attributable over a period of time, and the validator that enabled them would be booted from the valuable PFL block OFA. We see this incentive at work plainly when comparing sandwich attack prevalence on Ethereum to that on Polygon:

[Figure: Ethereum sandwiches on the left, Polygon sandwiches on the right.]

This mechanism of “turning everything into a public auction” doesn’t just work for sandwiches - it also subjects all “private” orderflow to the same auction mechanism. In the case of PFL, the Validator is the beneficiary of that auction, and the end user receives protection from the most egregious forms of MEV like frontrunning and sandwich attacks. This auction game wasn’t designed for ERC-4337, but what we have been seeing lately with 4337-compliant UserOperations is a fascinating game that rhymes with the PFL auction mechanics:

  1. A Bundler (the AA type) puts a UserOperation inside of a transaction and sends it via the public mempool (because no private relay is provided by the FastLane MEV protocol). 
  2. A competing Bundler sees the transaction and calculates that the reimbursement from the UserOperation exceeds the gas cost of the transaction. 
  3. The competing Bundler creates their own transaction with a copy of the calldata of the UserOperation and then frontruns the original transaction.
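The competing Bundler's decision in steps 2 and 3 boils down to one comparison. The Python sketch below is an assumption-laden illustration: it treats the reimbursement as a fixed amount in wei, assumes the copied transaction uses the same gas as the original, and assumes outbidding means adding a fixed step (1 gwei here) to the original gas price.

```python
GWEI = 10**9


def bundling_profit_wei(
    reimbursement_wei: int, gas_used: int, gas_price_wei: int
) -> int:
    """What a bundler nets: reimbursement from the UserOperation minus gas paid."""
    return reimbursement_wei - gas_used * gas_price_wei


def should_copy_and_frontrun(
    reimbursement_wei: int,
    gas_used: int,
    original_gas_price_wei: int,
    outbid_step_wei: int = GWEI,
) -> bool:
    """True if copying the calldata and paying a higher gas price still profits."""
    outbid_price = original_gas_price_wei + outbid_step_wei
    return bundling_profit_wei(reimbursement_wei, gas_used, outbid_price) > 0


# Hypothetical case: 0.01 ETH reimbursement, 200k gas, original tx at 30 gwei.
print(should_copy_and_frontrun(10**16, 200_000, 30 * GWEI))
# Same operation, but the original bundler already bid near break-even.
print(should_copy_and_frontrun(10**16, 200_000, 50 * GWEI))
```

As long as the first call's condition holds for someone, the original transaction is exposed to being copied and frontrun.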

4337 and “best execution” of intents

But here begins our problem when it comes to MEV-rich 4337 userOps, or userOps which authorize a more expressive “intent” to do something complex in DeFi. Let’s examine this from the perspective of the validator, who we’ll assume likes money. When 4337 BundlerB frontruns 4337 BundlerA, they must use either a higher GasPrice or some other payment mechanism to incentivize the Validator to place BundlerB’s transaction in front of BundlerA’s. In addition to the higher fee collected from BundlerB, the Validator also receives the transaction fee from BundlerA’s transaction, which will now revert.  

As with PFL’s sandwich defense mechanism, two things become clear:

  1. BundlerA loses money - they still have to pay their transaction’s gas cost, but they don’t collect any funds back from the User.
  2. Validators are the primary ones who profit from this competition. The surplus disproportionately goes to them. 
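A rough accounting of who ends up with what, as a Python sketch. The assumptions are ours: the reverted transaction consumes some smaller amount of gas (`revert_gas_used`), all gas fees accrue to the validator (base fee burning is ignored), and amounts are in arbitrary integer units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payoffs:
    bundler_a: int
    bundler_b: int
    validator: int


def frontrun_payoffs(
    reimbursement: int,
    gas_used: int,
    gas_price_a: int,
    gas_price_b: int,
    revert_gas_used: int,
) -> Payoffs:
    """Payoffs when BundlerB lands first and BundlerA's transaction reverts."""
    fee_b = gas_used * gas_price_b         # B pays more to be ordered first
    fee_a = revert_gas_used * gas_price_a  # A still pays for the failed attempt
    return Payoffs(
        bundler_a=-fee_a,                  # no reimbursement is collected
        bundler_b=reimbursement - fee_b,
        validator=fee_a + fee_b,           # collects from both bundlers
    )


# Hypothetical numbers.
result = frontrun_payoffs(
    reimbursement=10_000,
    gas_used=100,
    gas_price_a=30,
    gas_price_b=60,
    revert_gas_used=40,
)
print(result)
```

In this model the three payoffs sum to the user's reimbursement, so every extra unit BundlerB bids for priority moves value from the bundlers to the validator.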

Even if BundlerA has a private relay set up between themselves and the user, they are still likely to lose money when they get frontrun in the public mempool.  Because of this, validators are incentivized not to build a private relay, as such a relay would preclude their ability to profit from competition between bundlers.  As a result, for BundlerA, the most likely winning move is simply not to play, which means that the UserOperation is less likely to get included in a block by a Validator.  This is a problem that ERC-4337 fixes with its permissionless mempool.

This 4337 mempool is crucial - it’s a place for the User (or, more specifically in most cases, the frontend that the User is interacting with to create their userOp) to send their UserOperation without having an additional cost associated with making that data available to bundlers.  Due to ERC-4337’s spec of permissionless bundling, an arbitrary set of Bundlers can then assess the UserOperation and compete over which one of them executes it.  This is great for liveness and censorship resistance, but - crucially - it’s not so great for MEV-rich operations, or for the user from an execution perspective, because one needs to ask the question: when being a Bundler is valuable, and when anyone can do it, who gets the privilege? Well, as we’re already seeing in realtime on Polygon PoS in the PFL construction, the answer is whoever pays the most to the validator.

Bundlers are rational actors. It can be assumed that they won’t offer to pay the validator more than the monetary value of the UserOperation.  Different bundlers will assign a different monetary value to each UserOperation depending on their own extraction capabilities, and depending on their own relationship with the users originating these operations. As the value a bundler can extract from the UserOperation increases, so too does that operation’s monetary value to the bundler and therefore that bundler’s bid to the validator. This leads to an unfortunate conclusion: each UserOperation will be bundled by whichever entity can extract the most value from it.
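That conclusion can be sketched as a simple auction in Python. The pricing rule is an assumption on our part (an ascending auction in which the winner pays one tick above the runner-up's valuation, never more than its own); the only properties taken from the argument above are that no bundler bids more than the operation is worth to it, and that the highest valuation wins. Bundler names and numbers are hypothetical.

```python
from typing import Dict, Tuple


def run_bundler_auction(
    value_by_bundler: Dict[str, int], tick: int = 1
) -> Tuple[str, int]:
    """Return (winning bundler, payment to the validator).

    Each bundler is willing to bid up to the value it can extract from the
    UserOperation, so the bundler with the highest valuation wins.
    """
    if not value_by_bundler:
        raise ValueError("need at least one bundler")
    ranked = sorted(value_by_bundler.items(), key=lambda kv: kv[1], reverse=True)
    winner, top_value = ranked[0]
    runner_up_value = ranked[1][1] if len(ranked) > 1 else 0
    payment = min(top_value, runner_up_value + tick)
    return winner, payment


# Hypothetical valuations of the same UserOperation.
valuations = {
    "refunds_surplus_to_user": 5,   # only values the gas reimbursement
    "partial_extractor": 25,
    "max_extractor": 40,
}
print(run_bundler_auction(valuations))
```

The bundler that would have returned surplus to the user values the operation least, so under this rule it can never win.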

Let’s now bring our attention back to intents. As discussed earlier, an intent is a desired outcome. A regular transaction might specify something along the lines of “Please sell N ElonEth420 tokens on Uniswap V2 and give me the resulting amount of DogTurd9000 tokens,” whereas an intent would be a request to “Please take N ElonEth420 tokens and give me the most DogTurd9000 tokens possible.” Note the emphasis on the word “most” in the example of the intent - the User wants to keep as much surplus as possible. After all, why wouldn’t they?  Maybe if the User’s intent was “Please take N ElonEth420 tokens and give me M DogTurd9000 tokens, but if you can get more than M DogTurd9000 tokens then definitely keep that surplus for yourself because I want you, the Bundler, to have them, because, gosh golly, you deserve them!” But that’s a bit absurd, isn’t it? The User expects that any value uncovered by the Bundler during the execution of their UserOperation will go back to the User. But as we’ve discussed, with ERC-4337 all value flows straight to the Validator - any value that can be extracted, will be extracted. This is why ERC-4337’s alt mempool and execution-maximizing intents are somewhat incompatible.

As soon as the UserOperation hits the 4337 mempool, it will be detected by multiple potential bundlers - many of whom presumably fill the role of block builder as well.  Even if there is an agreement in place between a bundler and a builder to not extract value from users, it won’t matter because the builders are also competing in an auction to extract the most value to bid to the proposer. As long as the 4337 mempool is used, any trusted builder with a non-extraction agreement in place will have a less valuable block than a builder who attempts to extract maximal value from the UserOperation.   

Unfortunately, the exact permissionless bundling which 4337 enables subjects its userOperations to the above game theory, resulting in maximal MEV extraction out of a user’s pocket and into the proposer/validator’s. Proposer-builder separation doesn’t change the outcome here. Having users install custom RPCs with private relays to builders could theoretically help (although even this point is debatable), but custom RPCs are an unwieldy MEV protection form factor for the least sophisticated users. They also essentially re-add trusted middlemen and relay infrastructure to the equation and value chain in order to enable MEV protection - the exact thing which the 4337 alt mempool was created to avoid to begin with.

Ok…Now what?

To be clear, 4337 is extremely important and serves a critical function for the adoption of smart accounts by billions of users: permissionless inclusion of userOperations with little to no MEV, which are very likely to constitute the vast majority of the transactions/pseudotransactions we need to take web3 to mass adoption. 4337’s specification is very well suited to maximize decentralization, accessibility, and censorship resistance for any and all UserOperations. If a User wants to do a token approval, add someone to a multisig, make a simple payment or transfer, or some other type of UserOperation that won’t generate MEV or that doesn’t involve an adversarial counterparty, 4337 is great and likely optimal for the user. In these interactions, the “intents” are for certain automations, transfers, payments and other non-adversarial uses of public blockspace, and the gas paid (by the end user, in whatever token, or somehow sponsored by the application, wallet or another party) represents the only “MEV” extracted. In the mass adoption endgame where crypto is embedded in all aspects of daily commerce and computing, this might represent 95%+ of the "operations" users send.

But some UserOperations - such as DeFi intents, swaps, oracle updates, liquidations and other potential sources of MEV where users need to maximize their execution over some price space - are ill-suited for ERC-4337’s alt mempool because of the value that they leak to validators. Luckily, we are still relatively early in the deployment of smart accounts in DeFi and can design systems which are MEV-aware. In order to enable the adoption of smart accounts in DeFi and all of the recovery, automation, gas abstraction and UX benefits they provide, we also need a solution for transaction inclusion that is compatible with intents: one capable of enabling OFAs that benefit the end user and the application that generate the MEV to begin with.

At FastLane, we’ve built such a solution. It’s called Atlas.