Failed Transaction Mitigation FAQ

Author

Justin Rice

On December 10, 2021, we released Stellar Core v18.2.0, which includes a feature aimed at mitigating the impact of failed transactions caused by arbitrage-seeking bots. The goal of this post is to explain how that feature works and why it's important, and to answer questions about its potential impact on the network.

Short version: Stellar Core v18.2.0 includes a potential fix for a persistent nuisance that has increased both network fees and the cost of running Stellar infrastructure. If you run a validator, you can download it and read the full release notes on the official releases page.

What is the new feature?

Here's how the commit to Stellar Core v18.2.0 describes the new feature:

"This adds a new statistical damping pass to the herder's transaction queueing/flooding machinery, to try to traffic-shape the flow of cyclical (arbitrage-attempt) path payments on a per-asset-pair basis. The network has been experiencing very high volumes of these txs (almost all of which fail); this gives validators a new tool for managing that volume. It has a controllable threshold and intensity (managed by a pair of config variables) and is set to a reasonable damping factor by default."

Essentially, validator operators can configure their nodes to identify a specific type of transaction — circular path payments — and gate many of those destined to fail while still allowing enough through to preserve healthy arbitrage. Currently, all of these transactions are considered for inclusion in the ledger, and most of them fail. Because they fail during application rather than during submission, they compete for ledger space, drive up network fees, and persist forever in the historical record. By filtering out some of them, validators may be able to reduce their impact on the network.
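As a rough illustration of what identifying a circular path payment could look like, the Python sketch below treats a path payment as circular when it starts and ends in the same asset. The `PathPayment` class and both helper functions are hypothetical names invented for this example. Stellar Core's actual detection logic is not described in this post and may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PathPayment:
    """Hypothetical, simplified model of a path payment operation."""
    send_asset: str
    dest_asset: str
    path: Tuple[str, ...] = ()  # intermediate assets, in order


def is_circular(op: PathPayment) -> bool:
    # Assumption: "circular" means the payment ends in the asset it started in.
    return op.send_asset == op.dest_asset


def asset_pairs(op: PathPayment) -> List[Tuple[str, str]]:
    # Every hop converts one asset into the next, so each hop is an asset pair.
    hops = [op.send_asset, *op.path, op.dest_asset]
    return list(zip(hops, hops[1:]))


arb_attempt = PathPayment("XLM", "XLM", path=("USD", "EUR"))
ordinary = PathPayment("USD", "ARS")
print(is_circular(arb_attempt), is_circular(ordinary))
print(asset_pairs(arb_attempt))
```

Listing the asset pairs matters because the commit message describes the damping as working on a per-asset-pair basis.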

What problem does the feature attempt to solve?

For the past couple of years, the Stellar network has processed a high number of failed transactions caused by a slew of trading bots attempting to take advantage of a limited number of arbitrage opportunities. The network has performed well under the load — throughput remains high; ledgers continue to close in around 5 seconds — but these transactions come in waves, and periodically, they account for an outsized portion of network activity. To get a sense of the scale, take a look at the swell of orange bars on the transaction success rate graph at the bottom of the Stellar Expert Network Activity page:



[Graph: transaction success rate, with failed transactions shown in orange. Courtesy of stellar.expert]

Those waves of orange failed transactions don’t do anyone any good — they don’t enrich developers; they don’t make markets more efficient — but, by claiming limited space in the ledger, they do cause problems:

  • They force the network into surge pricing mode, which increases network fees for everyone
  • They permanently bloat the ledger, which makes it more expensive to run a validator or Horizon node
  • They make replaying historical data a lot slower, which makes it more of a hassle for new nodes to join the network

To allow Stellar to continue to function as a fast, efficient, and decentralized system for payments, we want to make sure it's easy to run validators and Horizon nodes. We also want to make sure it's inexpensive for exchanges, wallet providers, cross-border payment service providers, and other developers and businesses to submit transactions to the network. By allowing validators to filter out some of the destined-to-fail transactions before they are applied to the ledger and committed to the permanent record, we hope to preserve the efficiency and usability of the network.

What causes all the failed transactions?

The failed transactions are caused by trading bots that take a brute-force approach to arbitrage.

Stellar has a unique set of operations called path payments, which allow the simultaneous sending and conversion of currency — I send USD; you receive ARS — and they make it incredibly easy to use the network for cross-border and cross-currency transactions.

Path payments convert currency by consuming orders in Stellar’s built-in order books or trading against liquidity pools, and sometimes an inefficiency gives rise to a slight pricing mismatch. We won’t get into the details here, but developers realized that — every once in a while — you can submit a circular path payment and end up with a tiny bit more money than you started with, and they built bots to look for those opportunities, and to try to capitalize on them.

A lot of people built arbitrage bots, and they all look for the same opportunities. When one comes up, it’s a race: the winning bot submits a transaction that claims the opportunity and succeeds; the remaining bots submit transactions conditioned on the existence of that opportunity, and since it’s no longer available, those transactions fail.

Because those transactions met the minimum fee requirement, they fail after they’re included in the ledger rather than before, so they end up in everyone else’s way. It’s like a passel of pigeons hovering around a park bench: you drop a crumb on the sidewalk, they all dive after it. One pigeon gets the crumb, the rest stay hungry, and while the losers sit there cooing and strutting and wishing for what might have been, they block the sidewalk so pedestrians can’t use it.

Is this feature the best way to mitigate the impact of failed arbitrage transactions?

We believe this feature provides a temporary surgical solution to the problem. It allows validators to identify and gate a very specific kind of transaction, which means they may be able to prevent failed transactions from clogging the ledger without affecting other network traffic. It's a hyper-focused approach, and we arrived at it based on feedback from the Stellar ecosystem. In the longer term, a suggested protocol change may allow general grouping of transactions and facilitate the creation of per-group policies. There's currently a discussion underway about how that might work, and anyone is free to join!

Before deciding on this approach, we considered other possibilities, and had public discussions about them. Specifically...

We talked about increasing the minimum fee

In 2020, we kicked off a mailing list discussion about raising the minimum fee with a blog post explaining how it might reduce failed arbitrage-bot transactions. After a lively discussion, it was clear there wasn't widespread consensus for a fee increase, and it appeared unlikely that validators would vote to approve it.

Increasing the minimum fee might have priced out the arbitrage bots, but many in the ecosystem pointed out that it also would have made the network more expensive for everyone. They argued that market makers, who provide liquidity vital to currency conversion and cross-border payments, would have been hit particularly hard because they update their DEX positions all day every day in order to keep prices consistent across markets. So rather than attempting to price out the arbitrage bots by increasing the minimum network fee...

We let surge pricing take care of the problem

Fees on Stellar are dynamic: when transaction submission exceeds the configurable limit of 1,000 operations/ledger, the network enters surge pricing mode. Fees serve as bids in a VCG auction, and transactions that specify higher fees are prioritized for inclusion in the ledger.
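To make the auction idea concrete, here is a minimal Python sketch of how one ledger's worth of transactions might be selected and charged. It is a simplification under stated assumptions, not Stellar Core's implementation: `close_ledger` is a hypothetical function, transactions are modeled as plain tuples, and the rule that everyone pays the lowest included bid during a surge is an assumption of this model.

```python
def close_ledger(txs, max_ops=1000, min_base_fee=100):
    """Pick transactions for one ledger and compute what each one pays.

    txs: list of (tx_id, num_ops, max_fee_in_stroops) tuples.
    Returns {tx_id: fee_charged_in_stroops} for the included transactions.
    """
    # Bids below the network minimum per operation are never eligible.
    eligible = [t for t in txs if t[2] // t[1] >= min_base_fee]

    # Surge pricing applies only when demand exceeds the ledger limit.
    surge = sum(t[1] for t in eligible) > max_ops

    # Highest bid per operation goes first.
    ranked = sorted(eligible, key=lambda t: t[2] / t[1], reverse=True)

    included, used = [], 0
    for tx in ranked:
        if used + tx[1] > max_ops:
            continue  # does not fit in the remaining space
        included.append(tx)
        used += tx[1]

    # Assumption: in a surge, everyone pays the lowest included bid per
    # operation; otherwise everyone pays the network minimum.
    if surge and included:
        base_fee = min(t[2] // t[1] for t in included)
    else:
        base_fee = min_base_fee
    return {t[0]: base_fee * t[1] for t in included}


bids = [("a", 1, 100), ("b", 1, 500), ("c", 1, 300)]
print(close_ledger(bids, max_ops=2))  # contention: only two slots
print(close_ledger(bids))             # no contention at 1,000 ops
```

In this model a high fee bid only costs more when there is real contention for space, which is why bidding high is a low-risk way to stay ahead of the bots.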

Surge pricing allows network users to outbid the arbitrage-seeking bots, and many products and services built on Stellar got the message after we posted an FAQ in February 2021, and came up with strategies to increase their fee bids. However, because the arbitrage-seeking bots submit transactions in waves — remember, they're all pursuing the same opportunities, and spring to life whenever one arises — they often cause erratic fee spikes, which are impossible to predict, and difficult to reason about. Many in the ecosystem continued to express concern about the frequency of surge pricing, and to request a technical solution to reduce failed transactions without increasing fees.

Could we just increase the ledger limit?

The operations/ledger limit is configurable, and validators vote on where to set it just like they vote to ratify transaction sets and add them to the ledger. Right now, they have opted for a 1,000 operations/ledger limit.

While they could increase that limit, the general consensus among validators is that doing so would not solve the problem created by brute-force arbitrage bots. In fact, it would just give them more headroom to submit transactions destined to fail. Ledger size would grow, there would be more data for node operators to manage, and the cost and complexity of running Stellar infrastructure would increase.

When validators set the ledger limit, they try to strike a balance: they want to allow sufficient throughput to support meaningful network activity while also ensuring that Stellar infrastructure doesn't require specialized hardware. Stellar is an open-participation network, and people all over the world should be able to spin up a node to contribute to it. Increasing the ledger limit to accommodate still more failed transactions runs counter to that goal.

Will this change affect my ability to submit transactions to Stellar?

If this change works, it will make it easier to submit successful transactions. By reducing the number of bot-submitted arbitrage-seeking circular path payments considered for inclusion in the ledger, this feature may reduce contention for ledger space, decrease the frequency of surge pricing, and make it easier for products, services, and Stellar users in general to successfully submit transactions to the network.

That said, we still recommend following best practices when submitting a transaction to the network. Specifically:

  • Set a timebound. That way, if the transaction doesn't succeed within the time specified, you know it's not hanging around waiting to be processed. You can try again knowing you won't accidentally, say, duplicate a payment.
  • Set the highest maximum fee you are willing to pay. You will actually pay the minimum amount necessary to make the ledger. Under normal circumstances, even with a higher max fee set, you will pay the network minimum — currently 100 stroops.
  • Implement a retry loop with increasing delay (e.g. 30s, 60s, 90s). It should only execute once you've exceeded the timebound you set.
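The three practices above can be sketched together in Python. This is an outline under stated assumptions rather than SDK code: `build_and_submit` is a hypothetical callable that you would implement with your Stellar SDK of choice, and it is assumed to build a transaction with the given maximum fee and expiry time, submit it, and return `True` on success. The `clock` and `sleep` parameters exist only so the loop can be tested without waiting.

```python
import time


def submit_with_retry(build_and_submit, max_fee, timebound_s=30,
                      delays=(30, 60, 90), clock=time.time, sleep=time.sleep):
    """Submit a transaction, retrying only after its timebound has expired.

    Returns the number of attempts used, or raises RuntimeError if all fail.
    """
    attempts = 0
    for delay in (0,) + tuple(delays):
        if delay:
            sleep(delay)  # increasing delay between attempts
        expires_at = clock() + timebound_s  # fresh timebound per attempt
        attempts += 1
        if build_and_submit(max_fee=max_fee, expires_at=expires_at):
            return attempts
        # Wait until the timebound has passed so the old transaction can no
        # longer be applied. Only then is it safe to try again.
        remaining = expires_at - clock()
        if remaining > 0:
            sleep(remaining)
    raise RuntimeError(f"transaction failed after {attempts} attempts")
```

Waiting out the timebound before retrying is what prevents a duplicate payment: once the expiry has passed, the earlier transaction can no longer make it into a ledger.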

If you are still getting errors when attempting to submit transactions, consult the Handling Errors Gracefully doc.

Will this change affect network fees?

If this change works, it will keep fees low by reducing instances of surge pricing.

It is worth noting that, even with the high volume of failed bot-submitted transactions, Stellar fees have remained incredibly low. As of this writing, the average fee/ledger for the past 30 days was 0.0004140 XLM. While we expect that number to go down because of this feature, I reiterate: your fee represents the maximum amount you will pay for a transaction; you will actually be charged the minimum necessary to make the ledger. Set that fee as high as you are willing to pay!
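For a sense of scale, the short Python sketch below converts between XLM and stroops, using the fact that one XLM is 10,000,000 stroops. The helper names are invented for this example.

```python
from decimal import Decimal

STROOPS_PER_XLM = Decimal(10_000_000)  # 1 XLM = 10,000,000 stroops


def xlm_to_stroops(xlm: Decimal) -> int:
    return int(xlm * STROOPS_PER_XLM)


def stroops_to_xlm(stroops: int) -> Decimal:
    return Decimal(stroops) / STROOPS_PER_XLM


average_fee = xlm_to_stroops(Decimal("0.0004140"))  # the 30-day average above
minimum_fee = 100                                   # network minimum, in stroops
print(average_fee, stroops_to_xlm(minimum_fee))
```

The average quoted above works out to 4,140 stroops, compared with the 100-stroop network minimum.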

Will this change prevent arbitrage?

No. A sufficient number of circular path payment transactions will still make the ledger to allow for healthy arbitrage.

When this feature is turned on, a validator will shape the traffic it suggests for inclusion in the ledger by grouping competing circular path payments, and randomly choosing which to keep and which to drop. It is configurable, and validator operators can adjust the number of transactions they allow through the gate. Functionally, the amount of successful arbitrage should stay the same as the status quo.
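The grouping and random selection described above could be sketched in Python roughly as follows. This is an illustrative model, not Stellar Core's code: the function name, the `base_allowance` and `damping_factor` parameters, their default values, and the geometric fall-off are all assumptions chosen to mirror the "threshold and intensity" wording in the commit message.

```python
import random
from collections import defaultdict


def damp(txs, base_allowance=5, damping_factor=0.8, rng=None):
    """Randomly thin out competing arbitrage attempts per asset pair.

    txs: iterable of (tx_id, asset_pair) in arrival order.
    base_allowance: attempts per pair that always pass (the "threshold").
    damping_factor: how sharply later attempts are cut (the "intensity").
    Returns the list of tx_ids that pass the gate.
    """
    rng = rng or random.Random()
    seen = defaultdict(int)  # attempts seen so far per pair, kept or dropped
    kept = []
    for tx_id, pair in txs:
        n = seen[pair]
        seen[pair] += 1
        if n < base_allowance:
            kept.append(tx_id)  # under the threshold: always allowed
            continue
        # Past the threshold, the chance of passing shrinks geometrically.
        p_keep = (1 - damping_factor) ** (n - base_allowance + 1)
        if rng.random() < p_keep:
            kept.append(tx_id)
    return kept


wave = [(i, ("XLM", "USD")) for i in range(100)]
survivors = damp(wave, rng=random.Random(42))
print(len(survivors), "of", len(wave), "attempts pass the gate")
```

Because the first few attempts on each pair always pass, a real opportunity can still be claimed, while the long tail of losing attempts is dropped before it reaches the ledger.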

However, rather than allowing every single brute-force arbitrage attempt to make it all the way to the application stage before failing, thereby increasing competition for ledger inclusion and bloating the permanent record, this feature allows validators to screen out destined-to-fail transactions earlier in the transaction life cycle.