Research Summary: Studying Bitcoin privacy attacks and their Impact on Bitcoin-based Identity Methods



TLDR

  • Blockchain technology enables decentralized and self-sovereign identities, including new mechanisms for creating, resolving, and revoking them.
  • The public availability of data records has allowed attacks that combine sophisticated heuristics with auxiliary information to compromise users’ privacy and deanonymize their identities.
  • We review and categorize Bitcoin privacy attacks, investigate their impact on one Bitcoin-based identity method, did:btcr, and analyze and discuss its privacy properties.

Core Research Question

How can we categorize Bitcoin privacy attacks, and what privacy issues do they raise for did:btcr?

Citation

Ghesmati, S., Fdhila, W., & Weippl, E. (2021, September). Studying Bitcoin privacy attacks and their Impact on Bitcoin-based Identity Methods. In International Conference on Business Process Management (pp. 85-101). Springer, Cham.

Background

Entities (e.g., users and organizations) use globally unique identifiers such as telephone numbers, IDs, or URLs. However, these identifiers are often issued and managed by central authorities. Blockchain-based decentralized identifiers have been proposed as a way to prove ownership of an identifier without relying on a trusted entity.

  • Decentralized identifier (DID): A string that includes three main parts: the scheme, the DID method, and the DID method identifier, which should be unique within the DID method.
  • DID document: Contains information about the verification methods and the service endpoints required to interact with the DID subject.
  • DID subject: The entity that is identified by the DID, and can be a person, an object or an organization.
  • DID method: Defines how DIDs are created, resolved, updated, and revoked.
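
To make the three-part structure of a DID concrete, here is a minimal Python sketch that splits a DID string into its scheme, DID method, and method-specific identifier. The function name `parse_did`, the `ParsedDID` type, and the sample identifier are illustrative assumptions, not part of the paper; the check is deliberately loose and does not validate the did:btcr identifier encoding.

```python
from typing import NamedTuple


class ParsedDID(NamedTuple):
    scheme: str      # always "did"
    method: str      # e.g. "btcr"
    method_id: str   # unique within the DID method


def parse_did(did: str) -> ParsedDID:
    """Split a DID into its three main parts (hypothetical helper)."""
    # Split on the first two colons only; the method-specific
    # identifier may itself contain further colons.
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[1] or not parts[2]:
        raise ValueError(f"not a valid DID: {did!r}")
    return ParsedDID(*parts)


# Illustrative value only; the identifier is a placeholder.
parsed = parse_did("did:btcr:xz35-jznz-q6mr-7q6")
print(parsed.method, parsed.method_id)
```

A full resolver would go further and use the DID method's rules to locate the DID document; this sketch stops at the syntax.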

Summary

  • We review and categorize privacy attacks on the Bitcoin blockchain, which may reveal links between addresses and real-world identities and correlate different identities with one another.
  • We address Bitcoin privacy attacks’ impact on the DID method did:btcr.
  • We adopted the privacy terminology from RFC 6973.

Method

The relevant literature was collected and selected in four main steps:

  • research questions identification
  • literature search
  • literature selection
  • data extraction

Results

  • We categorized Bitcoin privacy attacks into four main categories: (i) heuristics, (ii) side-channel attacks, (iii) flow analysis, and (iv) auxiliary information.

  • We showed how analysis of Bitcoin's public records, combined with auxiliary information and sophisticated heuristics, can be exploited to reveal or correlate users' transactions, identities, or addresses.

  • This study has demonstrated that although BTCR provides some advantages such as integrity, access, a degree of decentralization, and protection against censorship, it still lacks methods to deal with the privacy issues identified in this paper.

Discussion and Key Takeaways

We investigate the privacy of the method did:btcr based on the criteria adopted from RFC 6973.

  • Surveillance: Any kind of observation and monitoring of users, whether or not they are aware of it, can affect their privacy.
    • Auxiliary information is obtained through the interactions with services using DIDs.
    • The blockchain is immutable, so there is no way to delete the history.
  • Correlation: The combination of different information, which relates to one user.
    • Using the same DID or DID document for interacting with different services helps to trace and correlate user activities.
    • Using the same public keys in different DID documents can reveal the link between the corresponding DIDs.
    • An entity's IP address can reveal that different DIDs are under common control, linking those DIDs to one another.
    • Timing analysis can correlate users’ activities using the same service endpoint in the DID documents.
  • Identification: Relating the information to a specific user.
    • If the Bitcoin address associated with a DID is later spent, this can link the address used for the DID to other addresses owned by the user.
    • The visibility of the DID document can leak metadata about the attributes and provide information about the service endpoints.
    • If the DID document is stored on a third-party server, that server may identify the real DID owner.
    • If the DID document is stored on the user's own server, the user's IP address can be correlated with the DID document.
  • Secondary Use: Collecting information about a user without their consent and using it for purposes other than those for which it was collected.
    • The read/resolve operation makes it possible to trace DID use when it goes through third-party services (e.g., a universal DID resolver).
    • The verifier can trace the transaction flow and check the history of the UTXOs.
    • The real identity behind a DID can be compromised if the DID is used in services that require information about users or their activities (e.g., social networks).
  • Disclosure: Exposure of information about a user which violates the confidentiality of the shared data.
    • Privacy may be lost in economic activities involving services authenticated by DIDs.
    • BTCR updates reveal the public key of the previous DID or a change in access control.
  • Misattribution: Occurs whenever a user's data or communications are attributed to another user, which can in turn affect the user's reputation.
    • Mixing techniques that make UTXOs indistinguishable can cause a user's UTXOs to be attributed to someone else.
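
The key-reuse point under Correlation can be illustrated with a small Python sketch. It assumes a heavily simplified model in which each DID document is reduced to a list of public keys; real DID documents are structured documents with verification-method entries, and the DIDs, key strings, and the function name `link_dids_by_key` below are hypothetical placeholders.

```python
from collections import defaultdict


def link_dids_by_key(did_documents):
    """Return every public key that appears under more than one DID.

    did_documents maps a DID string to the list of public keys found
    in its DID document (a simplification made for this sketch).
    """
    dids_by_key = defaultdict(set)
    for did, keys in did_documents.items():
        for key in keys:
            dids_by_key[key].add(did)
    # A key shared by several DIDs suggests a common controller.
    return {
        key: sorted(dids)
        for key, dids in dids_by_key.items()
        if len(dids) > 1
    }


# Placeholder data: two DIDs reuse "key1", a third uses a fresh key.
documents = {
    "did:btcr:example-1": ["key1"],
    "did:btcr:example-2": ["key1", "key2"],
    "did:btcr:example-3": ["key3"],
}
links = link_dids_by_key(documents)
print(links)
```

The same grouping idea would apply to other shared attributes named in the list, such as service endpoints, which is why using fresh keys and endpoints per DID limits what an observer can correlate.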

Implications and Follow-Ups

Future research will consist of elaborating and developing new methods, or using existing privacy-enhancing techniques (e.g., mixing techniques, zero-knowledge proofs) to address the aforementioned privacy issues.

Applicability

  • This work can help improve privacy countermeasures for did:btcr.
  • It can also serve as a comprehensive list of privacy attacks for privacy threat modeling.
  • Our future work includes privacy threat modeling based on LINDDUN. The paper will appear at the PTM Workshop under the name “User-Centric Public Blockchain Privacy Threats”.

Thank you @simin for the fascinating summary of your work. Welcome to the forum, and it’s so great to have you here.

There are at least two things that I find particularly interesting.

The first is the classification of Bitcoin privacy attacks. In your work, there are 4 categories, and under each section are several important threats. This seems like a careful design and is thus an important reference for the potential attacks that both developers and users need to stay aware of.

Is the categorization unique to Bitcoin (or UTXO blockchains)? That is, consider a non-UTXO coin, or even, a privacy coin, would all 4 categories remain, or how would it change?

The second relates to the discussion and key takeaways section, where you dive into how the threats map back to privacy criteria. It seems important since defining a problem is a prerequisite to identifying solutions.

For me, it is a great resource to have it spelled out specifically why users of the blockchain are at risk of privacy breaches. Having the criteria laid out side-by-side with the characteristics of the blockchain is fascinating.

With that said, I must ask: Being new to the space, I wonder what is the story behind using RFC6973 as a criterion for (sort of) defining and standardizing what are the particular characteristics of privacy for internet protocols? Are there other criteria that are widely discussed in the space, and if so, how was RFC6973 selected for the research?


Thank you for your message. I will copy the questions and put the answers after them.

Is the categorization unique to Bitcoin (or UTXO blockchains)? That is, consider a non-UTXO coin, or even, a privacy coin, would all 4 categories remain, or how would it change?

Indeed, not all privacy attack categories can be applied to other blockchains. Some of the attacks are specific to Bitcoin. For instance, the multi-input ownership heuristic applies to UTXO-based blockchains, but it does not carry over to privacy coins such as Zcash or Monero. Zcash and Monero address the problem with zero-knowledge proofs (ZKPs) and RingCT, respectively. Therefore, the attacker cannot assume that all the inputs belong to the same user. In fact, in Zcash's shielded pool, none of the inputs can be detected. However, the heuristic works perfectly well in Zcash's non-shielded pool, because that pool is similar to Bitcoin: everyone can see the inputs.
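
As a rough illustration of the multi-input ownership heuristic mentioned above, the following Python sketch clusters addresses that appear together as inputs of the same transaction, using a union-find structure. The input format (each transaction given as a list of its input addresses) and the function name `cluster_addresses` are assumptions made for the example; it is not the implementation of any particular analysis tool.

```python
def cluster_addresses(transactions):
    """Group addresses under the multi-input ownership heuristic.

    transactions is a list where each item is the list of input
    addresses of one transaction (simplified, hypothetical format).
    """
    parent = {}

    def find(addr):
        # Register unseen addresses as their own cluster root.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        if not inputs:
            continue
        find(inputs[0])
        # Heuristic: all inputs of one transaction share an owner.
        for addr in inputs[1:]:
            union(inputs[0], addr)

    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return sorted(sorted(members) for members in clusters.values())


# A and B are co-spent, then B and C, so all three are merged.
clusters = cluster_addresses([["A", "B"], ["B", "C"], ["D"]])
print(clusters)
```

The sketch depends on every input address being publicly visible, which is the case in Bitcoin and in Zcash's non-shielded pool. In a shielded pool there is nothing to feed into `cluster_addresses`, which is why the heuristic fails there.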

What is the story behind using RFC6973 as a criterion for (sort of) defining and standardizing what are the particular characteristics of privacy for internet protocols?

We used RFC6973 for this paper since Internet protocol documents are widely used in the DID community. We tried to map these privacy issues to the criteria that the community already considers.

Are there other criteria widely discussed in the space, and if so, how was RFC6973 selected for the research?

Indeed, in blockchain privacy, we have to refer to the related scientific papers; however, the terminology provided by Pfitzmann and Hansen in Anon-Terminology is an excellent source for privacy criteria. These criteria can be adopted and reformulated for the blockchain area.


A Decentralized Identifier (DID) is preferred over traditional identification methods for its promise of privacy and security by virtue of it living on a public blockchain. This seems theoretically true, for now.

Attacks on the public blockchains that host these DIDs have brought us back to a lack of privacy. In other words, we still face the privacy challenges that pushed us to use DID methods in the first place.

I enjoyed reading this summary as it progressed from the cause to the impact and then to the solutions to the challenge. However, I came across a research summary on this forum by @Tolulope which offers interesting solutions, such as off-chain transactions and CoinJoin, to the studied challenges.

Applying these solutions would go a long way to mitigating the challenges of did:btcr as a decentralized identifier.

@simin Do you know if there are DID methods on other blockchains such as Ethereum? Since this research focuses on the impact of privacy attacks on a DID method on the Bitcoin blockchain, it would be great to extend it to other blockchains to get a holistic understanding of the impact of privacy attacks on DLTs as a whole.

If DID methods similar to did:btcr exist on other blockchains, they could be studied to:

  1. compare with the results obtained from the Bitcoin Blockchain

  2. make conclusions on the best solutions to apply in solving these privacy issues, since different blockchains have different consensus mechanisms and algorithms.
