Discussion Post: Zero-Knowledge Proofs - An Ethics Perspective

In a previous post, I summarized a research paper which proposed a framework for blockchain ethics, focusing on the relevance of data ethics to smart contract and oracle design. Specifically, I focused on the PAPA framework (Mason, 1986), which provides a lens through which decentralized oracle networks (DONs), cryptographic protocols and hybrid smart contracts can be assessed.

In this series of discussion posts, I will use PAPA to evaluate a more abstract privacy-preserving cryptographic method known as zero knowledge proofs (Part I) and two applications of zero knowledge proofs relevant to hybrid smart contracts. These applications are DECO (Part II), a decentralized oracle protocol, and CanDID (Part III), a decentralized identity protocol. Both are discussed in the context of hybrid smart contracts in the Chainlink 2.0 whitepaper. Before diving into a discussion about ethics, it’s important to understand what zero knowledge proofs are and why they are so important to privacy-preserving oracles and hybrid smart contracts.

Zero Knowledge Proofs: Background

Zero knowledge proofs were first conceptualized in a 1989 paper by MIT and University of Toronto professors Shafi Goldwasser, Silvio Micali* and Charles Rackoff. Their work addressed a class of problems relevant to interactive proof systems in which a “prover” sends messages to a “verifier” to convince the verifier that a mathematical statement is true.


Figure 1: Zero knowledge proof diagram without moon math. Source.

Before the publication of Goldwasser et al. (1989), research on interactive proof systems was concerned with the quality, or “soundness,” of proofs, focusing on cases where a malicious prover tries to fool a verifier into believing that a false statement is true. Goldwasser et al. (1989) flipped this line of research on its head, focusing instead on the trustworthiness of the verifier rather than that of the prover.

What the authors were concerned with, specifically, is how to prove something to a verifier by providing them the absolute minimum amount of information necessary for them to verify the proof. Everyday practical implications of this are straightforward when applied to online transactions.

Take email, for example. When you log into an email application, you would ideally want to prove to the server that your password is valid without having the password stored on the server or exposed over the Internet, since either could leak your private information if the server is compromised.

Unfortunately, this is precisely how many payment systems currently operate, leaving many users vulnerable to hacking and other attacks. A zero knowledge protocol, on the other hand, would allow sensitive information like passwords to be validated without the verifier learning anything beyond the fact that “this statement is true.”

More specifically, a zero knowledge protocol must satisfy all three of the following properties:

  • Completeness: if a statement is true, an honest verifier will be convinced by an honest prover.
  • Soundness: if a statement is false, a dishonest prover can convince the verifier that it is true only with some very small probability.
  • Zero-knowledge: if a statement is true, a cheating verifier cannot learn anything beyond the fact that the statement is true.
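
To make the three properties more concrete, here is a minimal Python sketch of the Schnorr identification protocol, a textbook way to prove knowledge of a secret exponent. It is not taken from Goldwasser et al. (1989). The toy group parameters (P, Q, G) and all function names are assumptions chosen for illustration, and the group is far too small to be secure. Note also that Schnorr is only zero-knowledge against an honest verifier, which is weaker than the cheating-verifier property listed above.

```python
import secrets

# Toy parameters (assumed for illustration only, far too small for real use):
# P = 2Q + 1 with P and Q prime, and G generates the subgroup of order Q.
P, Q, G = 2039, 1019, 4

def keygen():
    """Prover's secret x (think: password) and public value y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def commit():
    """Prover, step 1: pick a random nonce r and send t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)

def challenge():
    """Verifier, step 2: send a random challenge c."""
    return secrets.randbelow(Q)

def respond(x, r, c):
    """Prover, step 3: send s = r + c*x mod Q. The random r masks x."""
    return (r + c * x) % Q

def verify(y, t, c, s):
    """Verifier, step 4: accept iff G^s == t * y^c (mod P)."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P

def simulate(y):
    """Build an accepting transcript WITHOUT knowing x.

    Because anyone can produce such transcripts from public data alone,
    a transcript cannot reveal anything about x (honest-verifier
    zero-knowledge). The trick is to pick c and s first, then solve for t.
    """
    c = secrets.randbelow(Q)
    s = secrets.randbelow(Q)
    t = (pow(G, s, P) * pow(y, Q - c, P)) % P  # y^(Q-c) equals y^(-c) here
    return t, c, s

# Completeness: an honest prover always convinces an honest verifier.
x, y = keygen()
r, t = commit()
c = challenge()
s = respond(x, r, c)
print("honest proof accepted:", verify(y, t, c, s))
```

In this sketch the server would store only y, never x. A prover who does not know x can pass a single round only by guessing the challenge in advance, which succeeds with probability 1/Q; this is the "very small probability" in the soundness property once Q is realistically large.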

Currently, a number of privacy-focused cryptocurrency and smart contract platforms either use or plan to use zero knowledge (zk) protocols such as zk-SNARKs or zk-STARKs. Both are used mostly by cryptocurrencies to hide transaction data (e.g. Zcash, which uses zk-SNARKs), but zk-STARKs are said to improve on zk-SNARKs through better scalability and transparency, meaning that they do not require a trusted setup.

A Data Ethics Perspective on Zero Knowledge Proofs

As I mentioned in a previous post, the PAPA ethics framework does not necessarily contain normative ethical prescriptions but rather suggests discussion points around which data ethics research and analysis should revolve. These are privacy, accuracy, property and accessibility. Privacy relates to individuals’ ability to decide what personal information to hide or share. Accuracy relates to who is responsible for the accuracy and authenticity of information and what retribution is due for false or erroneous information. Property is concerned mostly with intellectual property rights to data but also with the “conduits through which information passes.” Finally, accessibility deals with the authority to obtain and access information.

By applying each of these elements to a protocol, oracle or data exchange system, we can learn more about the extent to which that system fulfills ethical principles as well as who is accountable when those principles are violated. This is the analysis that I conduct below.

Table 1 contains an application of the PAPA framework to zero knowledge proofs. This analysis allows us to reach a number of interesting conclusions about the ethical implications of ZKPs. First, and most importantly, ZKPs ensure that the prover has total privacy and control over their personal data. In this sense, ZKPs can be thought of as “ethical algorithms” par excellence. Second, and related to this, responsibility for the integrity of the data also lies entirely in the hands of the prover.

Thus, if we created a rating system for “privacy” on a scale from 1-5, where 1 = Least Private and 5 = Most Private, and another for “responsibility,” where 1 = Decentralized Responsibility and 5 = Individual Responsibility, ZKPs would score a 5 on both counts since the prover has the utmost privacy and responsibility.

Endnotes

*Silvio Micali is an MIT professor and the founder of the Algorand cryptocurrency.

References

8 Likes

The chart is incredibly helpful in demonstrating how to apply the PAPA framework, thank you for adding it here!

Based on this analysis, it appears that maximum privacy and maximum individual responsibility are the ideal states. Does this hold true in all situations? It seems like some of these categories, such as accuracy, might benefit if both prover and verifier have some responsibility, which would mean some loss of privacy and individual responsibility. Perhaps that’s not the case in a purely financial transaction, but in other situations, such as verification of identity or governance scenarios, wouldn’t a higher premium be placed on accuracy than on pure privacy and individual responsibility?

2 Likes

In contrast, things operate in the complete opposite direction in our society.

Privacy

  • ZKPs: Reveal minimal data for access to services
  • NOW: Accept cookies to read a blog post

Accuracy

  • ZKPs: Provers take care of data quality
  • NOW: Companies hire data engineers to take care of data quality. They further profit from patterning behavior

Property & Accessibility

  • ZKPs: Verifiers are hands-off from the data
  • NOW: SaaS providers get to keep, change, and deny users access to their data
3 Likes

Yeah this is tricky…

Let me raise two scenarios where zkp is still pushed by good ppl and does not veer from the original good intention.

  • zk set-membership proof for authentication
    tl;dr: users register with identity data first, but access the service anonymously via set-membership proof later on.

Personally, my co-author and I applied the zkp set-membership proof to an IoT application (a pending paper). It’s to prevent personal info and access logs from leaking, whether intentionally by insiders or unintentionally through APT attacks.

From the authority/service provider’s perspective, users are now able to prove they are authorized, legit users of the system via a zk set-membership proof.

On ethics, even if the log’s leaked, one cannot distinguish one access log entry from another, cause that’s what zk gives you. That could prevent leaks such as Equifax (a more complex issue, cause you’ve got to have the ability to query and manipulate data), or ppl profiling you by collecting…say, your battery-powered car’s charging history and highway usage history from different charging station providers and toll service providers.

Such adversaries now cannot distinguish one access log entry from another, unless the implementation of the system accidentally leaves some other traces, e.g., physical location of access, time sequences that can effectively deanonymize users, etc.
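
For readers who want to see the register-then-access-anonymously idea in code, here is a rough Python sketch. It is not the construction from the pending paper mentioned above; it is a textbook OR-composition of Schnorr proofs made non-interactive with a hash (Fiat-Shamir), swapped in for illustration. The toy group (P, Q, G), the function names and the message are all assumptions, and the parameters are far too small to be secure.

```python
import hashlib
import secrets

# Toy group (assumed, insecure): P = 2Q + 1, G has prime order Q mod P.
P, Q, G = 2039, 1019, 4

def keygen():
    """Registration: the user keeps x, the service stores y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def _challenge(ys, ts, msg):
    """Hash of the public keys, commitments and message, reduced mod Q."""
    data = repr((ys, ts, msg)).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def _commitment(y, c, s):
    """Recompute t = G^s * y^(-c) mod P (y has order Q, so -c is taken mod Q)."""
    return (pow(G, s, P) * pow(y, (-c) % Q, P)) % P

def prove(ys, k, x, msg):
    """Prove knowledge of the secret behind SOME key in ys without revealing k."""
    n = len(ys)
    # Simulate every branch with random (c_i, s_i); branch k is fixed below.
    cs = [secrets.randbelow(Q) for _ in range(n)]
    ss = [secrets.randbelow(Q) for _ in range(n)]
    ts = [_commitment(ys[i], cs[i], ss[i]) for i in range(n)]
    r = secrets.randbelow(Q)
    ts[k] = pow(G, r, P)                       # real commitment for our own key
    c = _challenge(ys, ts, msg)
    cs[k] = (c - sum(cs[:k] + cs[k + 1:])) % Q  # make all challenges sum to c
    ss[k] = (r + cs[k] * x) % Q                 # real Schnorr response
    return cs, ss

def verify(ys, msg, proof):
    """Accept iff the per-key challenges sum to the hash of the transcript."""
    cs, ss = proof
    if len(cs) != len(ys) or len(ss) != len(ys):
        return False
    ts = [_commitment(y, c, s) for y, c, s in zip(ys, cs, ss)]
    return sum(cs) % Q == _challenge(ys, ts, msg)

# Four registered users; user 2 accesses the service anonymously.
keys = [keygen() for _ in range(4)]
ys = [y for _, y in keys]
proof = prove(ys, 2, keys[2][0], b"open door")
print("access granted:", verify(ys, b"open door", proof))
```

The proof that ends up in the access log is just two lists of numbers with the same distribution whichever member produced it, which is the indistinguishability described above. The proof size grows linearly with the number of registered users, so a real deployment would likely use a more compact construction.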

Sorry that I can’t tl;dr it accurately, but think of it like this:
Design Purpose: we don’t want Cloudflare’s and other providers’ cookies/tokens to be able to identify each person and their individual accesses.

Result in the cloudflare scenario:
ppl can now retrieve the anti-DDoS token by clicking & declaring that they are human, and use the received token to redeem access to other Cloudflare-powered sites, without their identity being linked between the two behaviors (issue and redeem).
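
The issue/redeem flow can be illustrated with a toy blind-signature sketch in Python. This is not the actual Privacy Pass construction, which is specified with different and much stronger cryptography; blind RSA with a tiny textbook modulus is swapped in here only because it fits in a few lines. All names and parameters are hypothetical, and the modular inverses via pow need Python 3.8 or later.

```python
import hashlib
import math
import secrets

# Issuer's toy RSA key (assumed textbook primes, completely insecure).
RSA_P, RSA_Q = 61, 53
N = RSA_P * RSA_Q
E = 17                                         # public exponent
D = pow(E, -1, (RSA_P - 1) * (RSA_Q - 1))      # private exponent (Python 3.8+)

def token_digest(nonce: bytes) -> int:
    """Map the client's random nonce to a number modulo N."""
    return int.from_bytes(hashlib.sha256(nonce).digest(), "big") % N

def blind(m: int):
    """Client: hide m behind a random factor r before asking for a signature."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            break
    return (m * pow(r, E, N)) % N, r

def issue(blinded: int) -> int:
    """Issuer: sign the blinded value after the 'I am human' check."""
    return pow(blinded, D, N)

def unblind(blind_sig: int, r: int) -> int:
    """Client: strip r, leaving an ordinary signature on m."""
    return (blind_sig * pow(r, -1, N)) % N

def redeem(nonce: bytes, sig: int) -> bool:
    """Any site holding (N, E): check the token without a link to issuance."""
    return pow(sig, E, N) == token_digest(nonce)

nonce = secrets.token_bytes(16)
blinded, r = blind(token_digest(nonce))        # issuer only ever sees `blinded`
sig = unblind(issue(blinded), r)
print("token redeemed:", redeem(nonce, sig))   # redeemer only sees (nonce, sig)
```

The issuer sees only the blinded value at issue time and only (nonce, sig) at redeem time. Because r is random and never shared, the two views cannot be matched up, which is the unlinkability between issue and redeem described above.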

Personally I think this is a great example of how one well-intentioned browser plugin and protocol, Privacy Pass, can become something more powerful once ppl (Chromium and Cloudflare) start to embrace it.

We should all keep our faith and keep pushing the (good) standards and protocols, and one day it could pave the path for ppl to implement them at a large scale, where everyone in the ecosystem benefits.

4 Likes