A Note on Privacy in Constant Function Market Makers

TLDR:

Privacy is impossible with most usual implementations of CFMMs under reasonable conditions.
There are some strategies that can be used to mitigate this deanonymization and preserve privacy, but they have costs.

Core Research Question

  • In what ways are CFMMs privacy-preserving or can users’ transactions be deanonymized?

Citation

Background

Constant Function Market Maker (CFMM) - A smart contract that allows for the exchange of coins and derivative assets. It employs a convex function which maps asset quantities to implied prices abiding by the principle of supply and demand. Uniswap and Celo utilize similar CFMMs.

Decentralized Exchange (DEX) - A type of cryptocurrency exchange where traders can exchange assets with each other or with the protocol without the need for a trusted third party.

1-homogeneous functions - A function is homogeneous of degree k if, when each of its arguments is multiplied by any number t > 0, the value of the function is multiplied by t^k.

Hyperplane - A hyperplane is a set of potential solution spaces that are subject to a single equality constraint.

ε-close - In mathematics, ε represents an infinitesimally small quantity, whose limit is usually taken as ε->0. ε-closeness is the property of being a distance from a value that is infinitesimally small.

Constant Sum Market Makers - A market maker that satisfies the equation where the sum of the reserves are constant.

Summary

Transactions in DEXs are public and decentralized. This means it is not necessary to trust a centralized entity to transact. All details of any trade are transparent, meaning they can be directly attributed to a user’s address. This transparency has costs, e.g. front-running and deanonymization. CFMMs like Uniswap are subject to deanonymization via statistical analysis. Zero-knowledge systems, however, might be able to privately execute CFMM transactions. Purportedly privacy-preserving modifications to Uniswap are unlikely to preserve privacy as identity is leaked from the timing of trades.

The authors of this paper formalize the intuition that CFMM trades can be deanonymized. Combining information about a CFMM’s function and time-ordered trade data allows an attacker to reverse-calculate trade size. It has been shown in other works that CFMMs benefit from many useful properties of convexity. A downside to this is that trade sizes can be easily recovered.

Method & Results

The researchers 1) use basic convex analysis tools to show how to calculate the uniqueness of transactions and 2) use logical proofs to show how this applies to more general CFMMs and weaker attack models. They also 3) provide some mitigation mechanisms to improve privacy.

  • Impossibility of Privacy
    • The researchers define CFMMs using reserve quantities and a trading function.
    • They assume that trading functions are strictly concave and increasing (this does not apply to constant sum market makers).
  • Adversary definition and attack
    • The researchers assume a simple and general adversary model (Eve) attempting to see the quantity traded by an agent (Alice).
    • Action space: They assume Eve can query the marginal price before and after trades, whether a trade is valid, that Alice’s transaction took place. As such, this would not apply to protocols where users do not know when transactions take place.
    • Attack description: The researchers outline that Eve can always reconstruct the reserves from a marginal price and a single valid trade using basic nonlinear equations. This means that the adversary can compute the reserves before and after Alice’s trade, and thus the trade amounts. Now, Eve only needs to compute the uniqueness of reserve amounts that satisfy these equations. A Newton-type method can be used to get good results, but there are important special cases wthatare much simpler.
    • Reserve discovery in Uniswap: In the case of a CPM like Uniswap, this uniqueness problem reduces to a linear system of equations which simplifies to the following equation.
    • This produces the unique solution (R1, R2) for the reserves for all price-moving trades. This means that using marginal price only, Uniswap reserves can be recovered.
  • Uniqueness of solution
    • The researchers show that the reserve discovery equation is unique when the trade function is an increasing, nonnegative, strictly concave 1-homogeneous function. This includes CPM and CMM (eg. Uniswap and Balancer). The researchers suspect that this applies when the function is not necessarily 1-homogeneous but leave that for future work.
    • Reserves at a fixed price: The researchers define a set of reserves consistent with the above constraint. I.e they use a series of proofs to define the set of reserves that map to the marginal prices of trades.
    • Reserves consistent with a trade: they then use another series of proofs to show that the intersection of this set and the set of reserves for which a given trade is feasible is a singleton (i.e. a unique solution exists). The researchers also note that Curve is a counterexample since scaling reserves will change the marginal price.
  • Extensions
    • Nonzero fees: Where a function has nonzero fees, there is a different equation that the CFMM must satisfy. The feasibility condition is changed but the proof of uniqueness does not change.
    • General homogeneity: This paper assumes 1-homogeneity to make the proofs cleaner, but the proof should be similar for more general p-homogeneity (for p>0).
    • Unknown marginal price: When the marginal price is not known (Eve cannot access), it is not difficult for Eve to compute an arbitrarily good approximation by performing a linear amount of queries.
    • Non-strict concavity: The case where the trading function is not strictly concave is more difficult because some reserve amounts may map to the same marginal price. In some special cases (eg. constant sum market makers), Eve can keep increasing her trade size until a trade becomes feasible and then perform a binary search to get ε-close to true reserves. More general CFMMs whose concave functions are not strictly concave are harder but similar. In those cases, Eve can again query progressively larger trades until they become feasible. This requires a more involved argument that is not made by the researchers in this paper as most CFMMs are strictly concave.
  • Mitigating strategies
    • There are a number of conditions that might be used to modify CFMMs to guarantee privacy. The researchers provide examples, but do not prove these conditions in this paper.
    • Randomness in price: The CFMM could add some amount of randomness to the marginal price (as is done in differential privacy) to prevent reconstruction of reserve values. This randomness must be fixed at each block or else it would be too easy to approximate the true price. The researchers note that this can quickly become complex and might cause losses for liquidity providers.
    • Batching orders: Another option is to batch orders together. This does not work if all the other trades in a batch surrounding Alice’s belong to Eve. Also, the delay in batching orders can lead to a bad user experience and could negatively impact solvency with large price movements.

Discussion and Key Takeaways

More complex mechanisms are needed to preserve privacy than would be initially assumed.

Impossibility of privacy: The researchers primarily make use of two tricks to show their results.

First, because the set of possible trade functions is strictly convex (mathematically simple), any unique hyperplane (i.e. any cross-section of variables) maps to a unique reserve value. This provides a way of identifying reserves at a fixed liquidity. They note Curve as a potential counterexample that all convex trading functions fit the conditions outlined.

The second trick used by the researchers is that due to the monotonicity and strict concavity of the trading function, a given trade will either be too expensive for a given liquidity amount (a better trade exists) or cannot be executed (not enough liquidity).

Further thoughts: There is a tradeoff between privacy and the cost traders or liquidity providers will pay for it. This can be direct (eg. higher prices) or an indirect cost (e.g. risk of failure). It is not currently clear what price users are willing to pay for this. This raises the question of whether it is worth spending resources to achieve such privacy solutions.

Conclusion: As shown by the researchers, privacy in DeFi is difficult to achieve. The researchers suspect there are reasonable variations of CFMM which are privacy-preserving. However, solutions to this problem are either hard to implement in practice or suffer from worse user experience.

Implications and Follow-ups

The researchers identify two open questions: 1) does the attack outlined work for all strictly convex CFMMs (not just 1-homogenous) and 2) what does a privacy-preserving CFMM look like and what guarantees can be made.

Applicability

The work in this paper applies to all constant function DEXs, for example Uniswap. It also potentially applies to more generalizable automated market makers such as Curve. This information “leak” has broad implications for the DeFi design space.

4 Likes

It seems like any additional mechanism to obfuscate the timing of trades would negatively impact price efficiency, which makes this is an incredibly hard problem to solve

Could dark pools be a solution here though?

Rather than obfucating trade time, the layering of wallets could essentially enable entities to trade via pooled accounts, which could make tracing individual entities a much harder endeavor. This would be analogous to an OTC desk performing trades on behalf of clients using pooled accounts. Do the authors evaluate this as a potential solution?

4 Likes

How much does privacy even matter to the average DeFi user? Has there ever been much of an expectation of privacy trading online?

2 Likes

You would think privacy only matters to someone trying to hide gains from tax agencies unless there is some other logical reason? Maybe to prevent front-running?

2 Likes

What about hedge funds keeping their strategy to themselves or really anyone trying to keep trade secrets?

2 Likes

There is a host of reasons why someone would pursue private transactions. Privacy is a complex topic in the crypto space and a lot of conversation around “if you have nothing to hide” is a bit fallacious. In the traditional finance system, credit card purchases, amazon history, bank statements, etc. are all private. There is a large demand for having two sets of transactions in any payment system since the dawn of accounting: 1) public/business and 2) personal/private.

The key takeaway here is not that someone could discern who is transacting with the Uniswap router (as that information is public anyway), but rather that even if you were using a shielded or purportedly private transaction scheme, if you have implemented a CFMM, you would be leaking information anyway.

4 Likes

I definitely do not take the perspective of “if they have nothing to hide it should be public,” however in terms of dealing with someone interacting with Uniswap there is a very limited demographic that has the capital or the technical knowledge to execute a swap using a CFMM. While there is no reason to assume that those practices are exclusively limited to illicit transactions, it would also not be illogical to assume that at least SOME of the transactions on Uniswap are intended to be private for the purposes of skirting capital gains taxes. As you allude to, even the shielded transactions are leaking information so it’s a false sense of privacy but it stands to reason to at least discuss the motivations of someone attempting to shield transactions even if they are unsuccessful.

3 Likes

What might an exchange that was hardened against this look like?

1 Like

Proper AML/KYC is what is meant to protect against this exact type of nefarious activity.

The notion of a DEX does not preclude the possibility of creating an AML/KYC oracle that functions as a decentralized mechanism to prevent money-laundering. That would have to be deployed in a decentralized manner to prevent recentralization (open-source collaboration for example).

2 Likes

The best example to think about is ZEXE or any protocol that attempts to build a zk/black-box add-on to Uniswap or another CFMM. Due to the nature of the CFMM convexity, its easy to guess in the face of obfuscation.

There are plenty of products in the DeFi space that have touted privacy or anonymity capabilities. The main takeaway is that if they implement a CFMM anywhere in the stack, they are more than likely leaking info anyway.

4 Likes

One more follow-up question, related to the researchers’ open questions: this project looked at convex CFMMs, and potentially applies to more generalizable AMMs such as Curve; what are some of the other varieties of CFMM - why might identification be more difficult or impossible (or would it be) with a non 1-homogenous CFMM or is that what a generalizable AMM is?

2 Likes

That would be a form of a more generalized view of these AMMs. Basically convex CFMMs are more or less the simplest form of AMM that you can do this kind of statistical analysis on. The more complex and unpredictable the function, the harder it is to ‘guess’, which means the harder it is to ‘leak’ identity information. Again, primarily relevant where the AMM is purported to exist in a privacy-preserving environment.

2 Likes

One quick update: We were able to get a partial converse result by showing that you can achieve (\epsilon, \delta)-differential privacy if you use a VRF/VDF plus consensus-provided randomness:

In some ways, this answers @cipherix’s earlier question re: Dark Pools (this is a statistical analogue of them)

3 Likes

Welcome to SCRF @tarun, it’s awesome to see you here!

I was actually just about to dive into the new publication – it was featured in last week’s Research Pulse (a curation of the top papers posted that week).

Privacy in pseudonymous blockchains will always be a touchy topic given the wide range of attack vectors both at the tx graph layer as well as the P2P network layer, but I really like the proposed URE approach. As you point out, it likely won’t get much better than that.

URE also seems to be highly complementary to strategies that decrease the profitability of Sandwich Attacks, whereby a single swap is split into many using a relatively straigtforward algorithm. Is this something you have considered in URE’s design?

3 Likes

I was actually just about to dive into the new publication – it was featured in last week’s Research Pulse (a curation of the top papers posted that week).

Appreciate the mention, thank you.

URE also seems to be highly complementary to strategies that decrease the profitability of Sandwich Attacks, whereby a single swap is split into many using a relatively straigtforward algorithm. Is this something you have considered in URE’s design?

Yes, in fact, for the worst case input trades, such as the “threshold” batch (T, 1, \ldots, 1) \in \mathbb{R}^T, you have to split up trades to preserve any notion of privacy. The main improvement of this paper is that we construct a combinatorial object that coarsely measures “how much” you need to split the big tree to a) preserve privacy while b) not causing too much impact. This trade-off between cost of privacy vs. excess price impact (e.g. a loss felt by users) is a natural way to formulate MEV. The analysis in the paper provides a formal way to measure this cost for any CFMM that has non-zero curvature (e.g. isn’t constant sum).

Another thing to note is that we split trades in a different manner than the paper you cited. Instead of deterministically splitting them, we sample a splitting from a Dirichlet distribution. Recall, that a Dirichlet distribution of dimension d is a distribution on the probability simplex \mathcal{P}^d = \{ x \in \mathbb{R}^d : \sum_i x_i = 1\;\; \forall i \, x_i \geq 0\}. We take a trade of size T and split it into d trades of size T\pi_i for \pi \in \mathcal{P}^d. We choose the parameter of the Dirichlet distribution such that the sizes of the trades are close but (crucially) not predictable. Recent work has shown how to differentially privately sample Dirichlet distributions (e.g. so that an adversary can’t perfectly guess the parameter — which would lead to grinding attacks).

How would you implement this in practice? If you have a layer 1 that provides randomness from consensus (as a VRF or VDF) to smart contracts, one can sample a) a random permutation and b) a Dirichlet distribution to get a splitting and ordering that guarantee (\epsilon, \delta)-differential privacy

3 Likes