Research Summary: zPROBE: Zero Peek Robustness Checks for Federated Learning

Summarized by Nojan Sheybani, Zahra Ghodsi, Mojan Javaheripi, and Xinqiao Zhang.


  • Federated Learning (FL) enables the training of a central server’s model via updates collected by distributed clients. The process of collecting client updates in a privacy-preserving manner is called secure aggregation.
  • Secure aggregation does not disclose any information about a specific user’s update - it collects all user’s updates and discloses the average of these updates. This allows malicious clients to modify their updates and poison the central model without being caught.
  • We propose zPROBE, a novel framework that enables Byzantine resilient and secure federated learning by utilizing robustness and correctness checks implemented with interactive zero knowledge proofs.

Core Research Question

How can we defend a central federated learning model against malicious updates from Byzantine clients, while protecting the privacy of honest clients?


Ghodsi, Zahra, Javaheripi, Mojan, Sheybani, Nojan, Zhang, Xinqiao, Huang, Ke, & Koushanfar, Farinaz. (2022). zPROBE: Zero Peek Robustness Checks for Federated Learning.


  • Federated Learning (FL): FL enables training a central model on a dataset distributed amongst many parties, by sending model updates and without requiring the parties to share their data.
  • Byzantine attacks: A Byzantine attack is one in which malicious clients manipulate their local updates to degrade model performance in the FL scenario.
  • Shamir Secret Sharing: Shamir Secret Sharing is a secure multiparty computation (MPC) primitive in which a user can distribute shares of a secret s to n parties such that any t shares can be used to reconstruct the secret, but any set of t-1 or fewer shares reveals no information about the secret.
  • Secure FL Aggregation: (Bell et. al, 2020) introduces a secure aggregation scheme that includes a server and n clients each holding a private vector of model updates with l parameters \bm{u}. The server wishes to obtain the aggregate \sum^n_{i=0} \bm{u_i} without learning any of the individual client updates. To do this, each pair of clients (i,j) agree on random pairwise masks \bm{m_{i,j}}. To protect against dropout, each client also generates their own random mask \bm{r_i}. These masks are shared with all clients using Shamir Secret Sharing. Each user computes their masked input and shares it to the server using the equation:
    \bm{v_i} = \bm{u_i} + \bm{r_i} - \sum_{0<j<i} \bm{m_{i,j}} + \sum_{i<j\leq n} \bm{m_{i,j}}.
    Once the server receives all masked inputs, it asks for shares of pairwise masks for dropped users and shares of individual masks for surviving users (but never both) to reconstruct the aggregate values.
  • Zero-Knowledge Proofs (ZKPs): ZKPs are a two-party cryptographic primitive between a prover P and a verifier V. These enable P to verify to V that the computation on P's private inputs is correct, without any information leakage about P's private inputs. We utilize the interactive ZKP construction, entitled Wolverine, presented in (Weng et. al, 2020). Computation for this construction is represented as an arithmetic/boolean circuit. Wolverine allows for a prover to authenticate a value x to a verifier using information-theoretic message authentication codes (IT-MACs).


  • We present zPROBE, a novel framework that enables private and robust aggregation in the malicious threat model by incorporating zero-knowledge proofs. Our Byzantine-robust secure aggregation, for the first time, scales sub-linearly with respect to the number of clients.
  • Using a novel privacy-preserving robustness check based on rank-based statistics, zPROBE is robust against state-of-the-art Byzantine attacks with 0.5-2.8\% higher accuracy compared to prior work on private and robust FL.
  • To reduce zPROBE overhead, we leverage probabilistic optimizations without compromising security, resulting in orders of magnitude client runtime reduction compared to a naive implementation.
  • In this work we consider a semi-honest server that follows the protocol but tries to learn more information from the received data. zPROBE is the first single-server framework with malicious clients that is resilient against an extensive attack surface, supports client dropouts, and does not require a public clean dataset.
  • zPROBE can perform an aggregation round on ResNet20 over CIFAR-10 with sub-second client compute time. zPROBE also achieves a lower computational complexity compared to prior art for client and server.


  • The following image shows the high-level global flow of our proposed method:

  • zPROBE utilizes Algorithm 1 for robust secure aggregation, which includes correctness checks and robustness checks as outlined in Algorithms 2 and 3 respectively.

  • zPROBE Secure Aggregation:

    • In round 1, clients generate pairwise masks \bm{m_{i,j}} and an individual mask \bm{r_i}. Shamir shares of the seeds for the Pseudo-Random Number Generators (PRGs) are also computed and distributed to all other clients.
    • In round 2, clients authenticate their updates, \bm{u_i}, and compute a masked update \bm{v_i} using \bm{m_{i,j}} and \bm{r_i} which is sent to the server. The clients also prove in zero-knowledge that the correct mask values are used and the masked update is computed correctly, shown in algorithm 2.
    • In round 3, the plaintext aggregation of all the updates is computed by subtracting each surviving user’s individual masks and each dropped out user’s pairwise mask from the sum of all of the collected masked updates.
  • Establishing Robustness:

    • We start by clustering users into n groups randomly, and we assume that 25% of users in the greater group are malicious.
    • Using zPROBE secure aggregation, the server computes the mean \mu_i of each cluster, obtaining n means.
    • The server computes the median \lambda and standard deviation \sigma of the cluster means.
    • A robustness threshold is established: for a given update \bm{u_i}, |\bm{u_i} - \lambda| < \theta must be true, where \theta = \eta \sigma. The value of \eta can be tuned based on cluster size.
    • This robustness threshold is established in step 1 of our global flow. The robustness of each client’s updates are checked via ZKPs (Algorithm 3) in step 2, before moving on to the final round of secure aggregation on all benign users.
  • Probabilistic Optimizations:

    • Malicious clients can compromise their update, by sending updates with distance margins larger than the tolerable threshold \theta, or sending incorrect masked updates.
    • A naive approach to checking robustness and correctness would have the server check each parameter and mask from each client. Using our probabilistic optimizations, we are able to avoid this and reduce protocol runtime by several orders of magnitude.
    • Assume that the malicious party has exactly l_m model parameter updates, out of the total l, that have been compromised. The probability of detecting a malicious update is equivalent to finding at least one compromised parameter gradient:
      p=1-{l-l_m \choose q}\Big/{l \choose q} where q denotes the number of per-user checks on model updates.
    • Below, we show the probability of detecting malicious users vs. number of ZKP checks necessary to a guarantee a failure rate lower than \delta=0.005 for a model with l=60k parameters.
  • Final aggregation:

    • All users that have passed the robustness check go through one final round of unclustered zPROBE secure aggregation.
    • The server updates the central model using the updates computed from this round.


  • We compare the overhead of zPROBE to prior works BREA and EIFFeL.

  • We show the effectiveness of our robustness checks against three commonly observed Byzantine attack scenarios: Sign Flip, Scaling, and Non-Omniscient Attacks.

  • Defense Performance

    • For these experiments, we assume that 25% of clients are Byzantine and malicious clients modify all of their updates to maximize degradation of the central model’s accuracy. This is consistent with prior work.
    • The figure below shows the convergence behavior of the FL scheme in the presence of Byzantine users with and without zPROBE defense.
    • Compared to EIFFeL, zPROBE achieves 1.2%, 0.5%, and 2.8% higher accuracy when evaluated on the same attack applied to MNIST, FMNIST, and CIFAR-10, respectively.
  • Runtime Analysis

    • Below, we present the runtime of zPROBE for a round of secure aggregation over multiple benchmarks compared to the baseline approach presented in (Bell et. al, 2020) which does not have robustness guarantees.
    • Furthermore, we present the detailed breakdown of the runtime for various components of zPROBE below. Results are gathered on the CIFAR-10 dataset and ResNet20 architecture. We denote the steps as S1, S2, and S3, corresponding to the steps in the global flow and the rounds as R1 and R2, corresponding to the rounds in Algorithm 1.
    • Alongside maintaining sub-second client runtime, we also achieve low communication overhead, only requiring 2.1 MB and 4.4 MB of client and server communication respectively for a round of aggregation over F-MNIST.
  • Complexity Analysis

    • We are unable to directly compare zPROBE’s raw numbers for runtime with prior work, since their implementations are not publicly available. Instead, we present a complexity comparison between zPROBE and prior works with respect to number of clients n (where k=\text{log}n), model size l, and number of malicious clients m.
    • We highlight that the client runtime is quadratic and linear with number of clients in BREA and EIFFeL respectively, whereas logarithmic in zPROBE.

Discussion and Key Takeaways

  • Our protocol is the first single-server framework with malicious clients that is resilient against an extensive attack surface, supports client dropouts, and does not require a public clean dataset.
  • We ensure security of clients’ individual model updates through the use of secure aggregation. The integrity of the central model is ensured through the use of ZKPs to verify computation.
  • Although incurring slight overheads compared to plaintext federated learning techniques, zPROBE is robust against state-of-the-art Byzantine attacks, while maintaining high accuracy and efficiency, and low communication.

Implications and Follow-Ups

  • We plan on exploring the applicability of zPROBE to real-world recommender systems.
  • We intend to explore different constructions of ZKPs, such as zkSNARKs, to determine whether the communication and computational power incurred by the prover is feasible in this setting.
  • This work, like other works in this area, is susceptible to Denial-of-Service (DoS) attacks. A future direction of this work is to explore solutions to this.


  • Developing machine learning applications in a decentralized setting introduces privacy concerns which necessitates formal guarantees of privacy. zPROBE is a solution to address these privacy concerns.
  • Due to our lightweight protocols, zPROBE performance scales sub-linearly with respect to the number of clients. This makes it a feasible solution to robust federated learning.
  • Federated learning has been in employed in many industries, such as healthcare and finance. Using our protocol allows for safe implementation of federated learning techniques, without worrying about malicious users degrading the central model.

Thanks @zahra for posting this research paper, I enjoyed reading same, I believe that the author incorporated probabilistic optimizations in the design of zPROBE to minimize the overhead of our zero-knowledge checks, without compromising security. By co-designing the robustness defense and cryptographic components of zPROBE, we are able to provide a scalable and low overhead solution for private and robust FL. For better understanding of this paper, I think it is better to marshal out the main objective of this paper. They are as follows;

• Developing a novel privacy-preserving robustness check based on rank-based statistics. zPROBE is robust against various Byzantine attacks with 0.5−2.8% higher accuracy compared to prior work on private and robust FL.

• Enabling private and robust aggregation in the malicious threat model by incorporating zero-knowledge proofs. Our Byzantine-robust secure aggregation, for the first time, scales sub-linearly with respect to number of clients.

• Leveraging probabilistic optimizations to reduce zPROBE overhead without compromising security, resulting in orders of magnitude client runtime reduction.

I equally observed that zPROBE ensures correct behavior from clients, and performs robustness checks on model updates. With a combination of zero-knowledge proofs and secret sharing techniques, zPROBE provides robustness guarantees without compromising client privacy. zPROBE presents a paradigm shift from previous work by providing a private robustness defense relying on rank-based statistics with cost that grows sublinearly with respect to number of clients.

I further noticed that the author used ZKPs to identify malicious clients that send invalid updates, without compromising clients’ privacy. The author aimed to protect the central model against Byzantine attacks, i.e., when a malicious client sends invalid updates to degrade the model performance. Another important design parameter in federated learning is the model aggregation method used to update the global model using the updates made by the selected clients. But what can federated learning do for you that local training on one device cannot?

I think it is important to note that Byzantine attack can cause an accuracy drop of ∼ 88% when no robustness check is applied. However, zPROBE can largely recover the accuracy degradation, reducing the accuracy loss to 0.2%-9.8%.
For the purpose of clarity, it would be better to note that Federated learning embodies principles of focused data collection and minimization, and can mitigate many of the systemic privacy is important to be clear about the protections it does.

Lastly, I just have an open question for you that are of important practical value. Might federated learning give users a false sense of privacy? How do we enable users to feel safe and actually be safe as they learn more about what is happening with their data? Do users value different aspects of privacy differently? What about facts that people want to protect? Would knowing these things enable us to design better mechanism? Are there ways to model people’s privacy preferences well enough to decide how to set these parameters? Who gets to decide which techniques to use if there are different utility/privacy/security properties from different techniques?