Research Summary: Collaborative Learning for Cyberattack Detection in Blockchain Networks

TLDR

  • This paper introduces a novel intrusion detection dataset named BNaT (Blockchain Network Attack Traffic), created from a real blockchain network in the researchers’ laboratory, and proposes an effective decentralized collaborative machine learning framework to detect intrusions in the blockchain network.
  • The main idea of the proposed learning model is to enable blockchain nodes to actively collect data, learn knowledge from their own data, and then exchange that knowledge with other blockchain nodes in the network.
  • The researchers propose a novel collaborative learning model that can be deployed efficiently in the blockchain network to detect attacks.
  • The researchers also proposed using Random Forest and XGBoost to detect attacks in a blockchain-based IoT system.
  • Both intensive simulations and real-time experiments clearly show that the proposed collaborative learning-based intrusion detection framework can achieve an accuracy of up to 97.7% in detecting attacks.

Core Research Question

What is the most effective solution to detect cyberattacks in blockchain networks?

Citation

Tran Viet Khoa, Do Hai Son, Dinh Thai Hoang, Nguyen Linh Trung, Tran Thi Thuy Quynh, Diep N. Nguyen, Nguyen Viet Ha, and Eryk Dutkiewicz

Background

  • Denial of Service (DoS) attack: A common type of attack in blockchain networks because it can be easily performed against blockchain nodes. In this kind of attack, the attackers send a huge amount of traffic to a target blockchain node in a short period of time.
  • Brute Password (BP) attack: Derived from traditional cyberattacks; hackers perform it to steal blockchain users’ accounts. In this way, hackers can access the users’ wallets and steal their digital assets.
  • Flooding of Transactions (FoT) attack: Aims to delay the PoW blockchain network by spamming it with null or meaningless transactions.
  • WebSockets and JSON-RPC: Used when netstats gets information from Geth clients.
  • HTTP: Carries the requests and replies used to view the netstats interface and the results of cyberattack detection.
  • Deep Belief Network (DBN): The DBN is a type of deep neural network that is used as a generative model of both labeled and unlabeled data.
  • Bootnode: A lightweight application used for the Node Discovery Protocol. Bootnodes do not synchronize the blockchain ledger but help other Ethereum nodes discover peers to set up Peer-to-Peer (P2P) connections in the network.
  • Full Nodes: Nodes that take responsibility for storing the state of the blockchain ledger, participating in the mining process, and verifying all blocks and states. They can be used to serve the network and provide data on request.

Summary

  • In this paper, the researchers set up experiments in a lab to build a private blockchain network, with the aims of not only obtaining real blockchain datasets but also testing the proposed learning model in real time.
  • According to the authors, BNaT is the first dataset obtained from a laboratory for studying cyberattacks in blockchain networks, and it can therefore advance the development of ML-based intrusion detection solutions for blockchain networks in the near future.
  • The researchers built an effective tool named Blockchain Intrusion Detection (BC-ID) to collect data in the blockchain network. This tool can extract features from the collected network traffic data, filter attack samples in the network traffic, and label them accurately in real time.
  • The researchers propose a collaborative decentralized learning model that not only improves the accuracy of identifying attacks, but can also be deployed effectively in decentralized blockchain networks. This model enables nodes in the network to share their trained models to improve cyberattack detection efficiency without sharing their raw data.
  • In network security, Machine Learning (ML) has been considered the most effective solution for detecting cyberattacks with very high accuracy. ML can detect attacks that have never been detected and reported before.
  • ML solutions can detect many types of attacks at the same time with very high accuracy. For example, Deep Learning (DL) can detect cyberattacks in industrial automation and control systems with an accuracy of up to 97.5%.
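
To make the BC-ID bullet above more concrete, here is a minimal sketch of what "extract features, then label in real time" could look like. It is not the authors' tool: the packet tuple layout, the one-second window, and the idea of labeling by a known list of attacker addresses (plausible in a lab where the attacking hosts are controlled) are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical packet record: (timestamp in seconds, source IP, size in bytes).
Packet = Tuple[float, str, int]


def extract_features(packets: List[Packet], window: float = 1.0) -> List[Dict]:
    """Group packets into fixed time windows and compute simple per-window features."""
    windows: Dict[int, Dict] = {}
    for ts, src, size in packets:
        idx = int(ts // window)
        row = windows.setdefault(
            idx, {"window": idx, "packet_count": 0, "byte_count": 0, "sources": set()}
        )
        row["packet_count"] += 1
        row["byte_count"] += size
        row["sources"].add(src)
    # Return the windows in chronological order.
    return [windows[idx] for idx in sorted(windows)]


def label_windows(rows: List[Dict], attacker_ips: Set[str]) -> List[Dict]:
    """Label a window 'attack' if any packet in it came from a known attacker host."""
    for row in rows:
        row["label"] = "attack" if row["sources"] & attacker_ips else "normal"
    return rows
```

A real collector would compute many more features per window; the point here is only the pipeline shape of capture, windowed features, then labels.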

Method

  • The researchers propose that the learning model of each node be trained on the dataset from its local network and exchange learned knowledge with the models of other nodes in the blockchain network in an offline manner.
  • In a practical blockchain network with a large number of learning nodes, the nodes can be scheduled to exchange learned knowledge at appropriate times during the offline training phase to avoid network congestion.
  • Therefore, each node can effectively learn the knowledge from other nodes while avoiding the traffic congestion of the network.
  • After the training process, the trained models can be used to help nodes to detect attacks in a real-time manner.
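
The train-locally-then-exchange idea above can be sketched as follows. This is an illustration only: it assumes a tiny linear model and plain parameter averaging as the "knowledge exchange" step, which is a common choice in collaborative and federated learning but is not confirmed by this summary as the paper's exact aggregation rule. All function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Weights = List[float]
Sample = Tuple[Sequence[float], float]  # (feature vector, target value)


def local_update(weights: Weights, samples: List[Sample], lr: float = 0.1) -> Weights:
    """Train on one node's private data with per-sample gradient steps (linear model)."""
    w = list(weights)
    for features, target in samples:
        prediction = sum(wi * xi for wi, xi in zip(w, features))
        error = prediction - target
        w = [wi - lr * error * xi for wi, xi in zip(w, features)]
    return w


def average_models(models: List[Weights]) -> Weights:
    """Combine the nodes' models by averaging each parameter across nodes."""
    count = len(models)
    return [sum(values) / count for values in zip(*models)]


def collaborative_round(shared: Weights, node_data: Dict[str, List[Sample]]) -> Weights:
    """One round: every node trains locally, then only the model weights are exchanged."""
    local_models = [local_update(shared, data) for data in node_data.values()]
    return average_models(local_models)
```

Note that `collaborative_round` never moves a node's samples anywhere; only weight vectors cross the network, which is the privacy property the summary emphasizes.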

  • The system model includes K full nodes, which receive transactions, mine blocks, and keep a replica of the ledger.
  • These nodes continuously synchronize their ledgers with one another over the P2P protocol, with equal permissions and responsibilities for processing data.
  • The authors state that transactions can come from different blockchain applications such as cryptocurrency, smart cities, food supply chains, and IoT. When transactions are sent to full nodes, they are verified and packed into a block.

Results

  • These results show that the proposed collaborative learning model (CoL) can exchange knowledge with other learning nodes (LNs) to improve its detection ability, so it achieves better performance in classifying attacks in the blockchain network than the independent learning model (IL).
  • They also demonstrate that the learning model of IL should not be used to classify the data of other LNs. In addition, without sharing the LNs’ datasets with a central node for training (e.g., a cloud server), the proposed CoL achieves nearly the same accuracy as the centralized learning model (CeL) in all scenarios.

  • The experimental results for the proposed CoL and the CeL with various numbers of participating LNs show trends similar to those of the simulation results.
  • The results obtained with two and three learning models for both the proposed CoL and the CeL are slightly lower than those of the simulations, by around 0.2%. This is because each type of attack has a different distribution of attack samples over a given period of time.
  • Finally, this learning model allows nodes in the blockchain to be actively involved in the detection process by collecting data, learning knowledge from their data, and then exchanging knowledge together to improve the attack detection ability.

Discussion and Key Takeaways

  • In network security, Machine Learning (ML) has been considered the most effective solution for detecting cyberattacks with very high accuracy. ML can detect attacks that have never been detected and reported before.
  • The researchers implemented a private blockchain network in their laboratory. This blockchain network is used to generate data (both normal and attack data) to serve the proposed learning models and to validate the performance of the proposed learning framework in real-time experiments.
  • The authors propose a collaborative decentralized learning model that not only improves the accuracy of identifying attacks, but can also be deployed effectively in decentralized blockchain networks. This model enables nodes in the network to share their trained models to improve cyberattack detection efficiency without sharing their raw data.

Implications and Follow-ups

  • In the future, the authors intend to continue developing this dataset with other emerging types of attacks and to develop more effective methods to protect blockchain networks.
  • In a blockchain network, data can be accessed and processed simultaneously at multiple nodes, thus avoiding bottlenecks and single points of failure.
  • Attacks on blockchain networks can not only cause huge asset losses, but can also lead to serious issues related to human health and lives. Therefore, solutions to detect and prevent attacks in blockchain networks are becoming more urgent than ever.

Applicability

  • Blockchain technology has more and more applications in our lives, including finance, healthcare, logistics, and IoT systems.
  • Because of their rapid success across a wide range of applications, particularly money transfer and cryptocurrency, blockchain-based systems have become targets of many new-generation cyberattacks.

13 Likes

Thank you @Henry for coming up with this research summary. I have written a quick recap of your summary, with some questions to further the discussion on collaborative learning for cyberattack detection.

The many applications of blockchain technology, especially in money transfer and cryptocurrency, have made it a honeypot that attracts a wide range of cyberattacks. For instance, in September 2020 KuCoin lost over $281 million worth of coins, and in May 2019 Binance lost over 7,000 bitcoins worth over $40 million. Around January 2022, it was reported that North Korean hackers had performed seven attacks that resulted in the loss of $400 million in digital assets. This series of attacks prompted researchers to look for ways to detect and prevent cyberattacks on blockchains.

The research pointed out that machine learning has been considered the most effective solution for detecting cyberattacks with high accuracy, for three main reasons: (1) it can detect many types of attacks at the same time with very high accuracy, (2) it can detect attacks that have never been detected and reported before, and (3) it can be deployed effectively, quickly, and flexibly. However, machine learning has its own problems, chiefly the lack of synthetic data from laboratories for training the models. Most current works use conventional cybersecurity datasets (UNSW-NB15 and BoT-IoT) for training.

However, these datasets were not designed for blockchain networks, so they are not appropriate for intrusion detection systems in blockchain networks. Some authors tried to build their own datasets for blockchain networks by obtaining normal samples from the Bitcoin network, running simulation experiments to detect the LFA, and generating artificial attack samples with CGAN. These methods have several issues too. First, normal transaction samples from the Bitcoin network may include attacks from the public blockchain network, yet all collected data are classified and labeled as normal. Second, it is difficult to evaluate artificial attack samples to determine whether they actually simulate a real attack on a blockchain network.

Another challenge identified is that all current machine learning based intrusion detection solutions for blockchain networks rely on centralized learning models, i.e., all data is collected at a centralized node for training and detection. This approach is not suitable for blockchains because they are decentralized networks. To close these gaps, the research introduced a new intrusion detection dataset named BNaT (Blockchain Network Attack Traffic) and proposed an effective collaborative machine learning framework to detect intrusions in the blockchain network.

A collaborative decentralized learning model was further proposed to not only improve the accuracy of identifying attacks, but also effectively deploy in decentralized blockchain networks. This model enables nodes in the network to effectively share their trained models to improve cyberattack detection efficiency without sharing their raw data.
An important aspect of this work is that the proposed collaborative learning model not only improves the accuracy of detecting cyberattacks in blockchains but also eliminates the risk of exposing data over the network. This is in line with international best practices for ensuring the privacy and security of data at all times.

The research is interesting; however, I noticed that the researchers did not mention an incentive mechanism that would keep the nodes actively involved in the detection process by collecting data, learning knowledge from their data, and then exchanging knowledge to improve attack detection. Do you think the nodes are incentivized, and if so, by what means? If the nodes are not incentivized, do you think their detection process could be compromised?

7 Likes

Hi @Samuel94, thank you for your comment, nice observation. In responding to your question, let me first explain what a node is, what it does, and whether it can be compromised.
Without nodes, a blockchain’s data would not be accessible, so one could say that nodes are the blockchain. Nodes form the infrastructure of a blockchain. All nodes on a blockchain are connected to each other, and they constantly exchange the latest blockchain data so that all nodes stay up to date. They store, spread, and preserve the blockchain data, so in effect a blockchain exists on its nodes. A full node is basically a device (like a computer) that contains a full copy of the transaction history of the blockchain.
Here is what nodes do:
• Nodes check whether a block of transactions is valid and accept or reject it.
• Nodes save and store blocks of transactions (storing blockchain transaction history).
• Nodes broadcast this transaction history to other nodes that may need to synchronize with the blockchain (i.e., need to be updated on transaction history).
Therefore, nodes in blockchain networks may have different data to train on, and due to privacy concerns they may not want to share their raw data with a centralized node (or other nodes) for training. Moreover, sending a huge amount of data over the network will not only cause excessive network traffic, but also risk compromising the data integrity of blockchain networks.
I do not think nodes are the incentive mechanism, because node behavior is autonomous and has social attributes such as selfishness. If a node wants to forward information to another node, it is bound to be limited by its own resources such as cache, power, and energy. Therefore, during communication, some nodes do not help forward other nodes’ information because of their selfish behavior. This leads to incomplete cooperation, greatly reduces the success rate of message transmission, increases network delay, and affects overall network performance.
I believe there are mainly three incentive strategies for addressing node selfishness:

• TFT-BASED (TIT-FOR-TAT)
The TFT-based strategy is considered the simplest strategy for addressing the selfish behavior of nodes. The TFT mechanism guarantees the fairness of node transactions, and each node selects the messages it is interested in to exchange, which increases the willingness of nodes to participate in message forwarding.

• REPUTATION-BASED
The reputation-based strategy rewards and punishes nodes through a reputation value, thus encouraging selfish nodes to cooperate. The main idea is that each node maintains and updates a table recording the reputation of other nodes. When transmitting messages, the source node judges whether other nodes can be trusted from their behavior, and usually selects nodes with higher reputation to help forward its messages.

• CREDIT-BASED
In the credit-based strategy, virtual credit is used as a reward for nodes that forward data packets, to encourage selfish nodes to cooperate. The main idea is that a node earns credit by forwarding messages for other nodes. By introducing the concept of virtual currency, the mechanism treats message forwarding as a transaction.
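
As a rough illustration of the reputation-based strategy described above, the table each node keeps could be sketched like this. The class name, the starting score of 0.5, and the fixed reward/penalty step are all assumptions chosen for the example, not details from the paper or from any specific protocol.

```python
from typing import Dict, Iterable


class ReputationTable:
    """Per-node table of peer reputations, raised or lowered by observed behavior."""

    def __init__(self, initial: float = 0.5, step: float = 0.1) -> None:
        self.initial = initial  # score assumed for peers we have not observed yet
        self.step = step        # reward or penalty applied per observation
        self.scores: Dict[str, float] = {}

    def record(self, peer: str, forwarded: bool) -> None:
        """Reward a peer that forwarded our message; penalize one that dropped it."""
        current = self.scores.get(peer, self.initial)
        current += self.step if forwarded else -self.step
        # Keep the score within [0, 1].
        self.scores[peer] = min(1.0, max(0.0, current))

    def best_relay(self, candidates: Iterable[str]) -> str:
        """Pick the candidate with the highest reputation to forward the next message."""
        return max(candidates, key=lambda peer: self.scores.get(peer, self.initial))
```

A credit-based variant would look similar, except that the stored value would be spendable: a node would earn credit by forwarding and pay credit to have its own messages forwarded.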
Please let me know if I have answered your question.

3 Likes

Hi @Henry, I did not say that nodes are the incentive mechanism. I am asking how these nodes are incentivised or compensated for the work they do to detect and prevent cyberattacks. One good aspect of blockchain is that it incentivises nodes that solve a particular problem, to ensure trust. I hope this is clear now.

3 Likes

Great Summary work @Henry

Blockchain networks are not impervious to cyberattacks. They need help from human experts to keep them safe.

I would like to respond to the research question by saying that there is no most effective solution that can detect cyberattacks in blockchain networks. However, the future of blockchain seems promising in this area and many companies are working on it. Blockchain networks are continuously under attack from malicious actors, who seek to exploit the network for financial gain or cause disruption. As the potential targets of cyberattacks, blockchain networks must be guarded against attacks and abuses by adopting appropriate security solutions.

One of these solutions is Collaborative Learning for Cyberattack Detection in Blockchain Networks (CLCDBN). CLCDBN works by actively learning from both positive and negative feedback and using it to update the model parameters, which determine whether events should be flagged as anomalous. The key idea behind CLCDBN is to leverage existing knowledge of how computer systems, network systems, and blockchains work in order to detect anomalies that could indicate a cyberattack.

5 Likes

Hi @Samuel94, in my response above I did not say that you said nodes are the incentive mechanism. I only shared my thoughts for a better understanding.
At the moment I do not know how nodes are compensated. If you have an idea of, or experience with, how nodes are compensated, please share it with us. I am not above learning. My own guess is that nodes are compensated for the work they do to prevent cyberattacks.

2 Likes

Hi @Idara_Effiong, thank you for sharing your view and understanding of the subject matter. Your thinking is in line with what the researchers proposed: that there is an urgent need to develop more powerful techniques to safeguard blockchain networks.

3 Likes

Hello @Henry,

Nice work putting together this brief and succinct summary. The topic is very interesting: Machine Learning deployed on blockchains could be a game changer, especially as attackers keep developing new techniques for initiating attacks and overriding the protocols meant to prevent them.

From your summary, I understand that the research involved nodes exchanging and synchronizing data on a “private blockchain”. This should make the dataset thorough, as every event on the blockchain is most likely accounted for. So I have one little question:

In the course of the research, was there any statistic measuring the efficiency of the BNaT dataset on different kinds of blockchains, say, Layer 1s, Layer 0s, etc.?

6 Likes

Thank you @J_Fraizer for your observation. The paper's data collection, classification, and analysis are offered as evidence of the efficiency of the BNaT dataset on different kinds of blockchains, which is why the authors specifically set out the objectives of BNaT, as follows:

  • To collect BNaT in a laboratory environment to have “clean” data samples (i.e., to ensure that the obtained data is not corrupted, erroneous, and/or irrelevant).
  • To make BNaT easy to extend with new kinds of blockchain attacks, e.g., 51% or double-spending attacks.
  • To perform experiments with real attacks in the considered blockchain network, so that BNaT reflects the actual attack behavior of the network better than simulations do.
  • To collect the data at different blockchain nodes to have a complete view of the effects when attacks are performed in a decentralized manner.

In this paper, the authors proposed a model that can perform offline training and real-time detection to quickly and efficiently prevent attacks in decentralized blockchain networks.

The researchers further propose a deep neural network (DNN) based on a Deep Belief Network (DBN) to better learn knowledge from this data. The DBN is a type of deep neural network that is used as a generative model of both labeled and unlabeled data. Unlike supervised deep neural networks that train only on labeled data (e.g., convolutional neural networks), a DBN stacks multiple Restricted Boltzmann Machine (RBM) layers for latent representation. A DBN can therefore better represent the characteristics of the dataset, and thus classify normal behavior and different types of attacks with very high accuracy.
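
For readers who have not met RBMs before, the standard textbook formulation of a single binary RBM layer is given below. It is not taken from the paper; the symbols (visible units v, hidden units h, weights W, biases a and b, and the logistic function sigma) are introduced here purely for illustration.

```latex
% Energy of a joint configuration (v, h) of a binary RBM
E(v, h) = -\sum_i a_i v_i \;-\; \sum_j b_j h_j \;-\; \sum_{i,j} v_i W_{ij} h_j

% Conditional activation probabilities, with \sigma(x) = 1 / (1 + e^{-x})
p(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i W_{ij} v_i\Big)
\qquad
p(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_j W_{ij} h_j\Big)
```

A DBN stacks several such layers, feeding the hidden activations of one RBM to the next as its visible input, which is the "latent representation" referred to above.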

So, both the simulation and the real-time experimental results clearly show the efficiency of the proposed framework. Let me know if I answered your question.

7 Likes

Thanks for the reply, I’m very well clarified.

4 Likes

Very Glad to hear your feedback.

2 Likes

Thank you Henry for this wonderful summary. I just wanted to bring to your attention that in May 2019, Binance, one of the biggest cryptocurrency exchanges in the world, reported a major security incident. The hackers broke the exchange’s security system and withdrew over 7,000 bitcoins from digital wallets, causing a total loss of approximately $40 million for customers. You may also recall that on the 7th of October 2022, Binance suffered a major hack that led to a loss of $560 million. In this paper, Machine Learning (ML) is described as the most effective solution for detecting cyberattacks with very high accuracy. I am not convinced that ML can detect cyberattacks with very high accuracy, because if it could, the attack on Binance on the 7th of October 2022 would not have happened.

4 Likes

Hi @Maryjane_Okorie, thank you for your perspective. I am sure you will agree that in this paper the authors proposed a novel collaborative learning framework for a cyberattack detection system in a blockchain network. Why? Because there is a need for solutions to detect and prevent attacks in blockchain networks. This is why the authors first implemented a private blockchain network in their laboratory. Recall that this blockchain network is used to generate data (both normal and attack data) to serve the proposed learning models and to validate the performance of the proposed learning framework in real-time experiments. The authors further proposed a highly effective learning model that can be deployed in the blockchain network. This learning model allows nodes in the blockchain to be actively involved in the detection process by collecting data, learning knowledge from their data, and then exchanging knowledge to improve attack detection. I suggest that we need to continue developing this dataset with other emerging types of attacks and develop more effective methods to protect blockchain networks, so as to counter the frequent attacks on blockchains.

3 Likes

Hi @Henry, thank you for the nice summary. It is really interesting to learn about cyberattack detection in blockchain networks. I notice that all current ML-based intrusion detection solutions for blockchain networks are based on centralized learning models, i.e., all data is collected at a centralized node for training and detection. However, this solution is not suitable for blockchains as they are decentralized networks. Specifically, nodes in blockchain networks may have different data to train on, and due to privacy concerns they may not want to share their raw data. I think this is an issue. I am also curious whether it is possible to detect attacks in an IoT network before the data is stored in the blockchain network.

6 Likes

Hi @Lisayanky thank you for your comment. In responding to your question, yes, it is possible to detect attacks in an IoT network before the data can be stored in the blockchain network. Recall that the authors proposed an ML-based method, called bidirectional long short-term memory (BiLSTM) to detect attacks in an IoT network before the data is stored in the blockchain network. Although the results showed that they can detect different kinds of attacks with an accuracy of up to 99%, they were validated only on conventional network datasets such as UNSW-NB15 and BoT-IoT datasets. These datasets are collected in conventional computer networks and thus cannot reflect actual traffic in blockchain networks. In particular, these datasets have just general attacks in computer networks without specific attacks in blockchains, e.g., changes in blockchain transactions, incorrect consensus protocol or the break of the chain of blocks. Hope this answers your question?

4 Likes

Thanks for your explanation. Your ability to clearly explain the material was truly enlightening, really find it helpful.

4 Likes

Thanks @Henry for this amazing summary.

From the discussion, in network security Machine Learning (ML) has been considered the most effective solution for detecting cyberattacks with very high accuracy. My question is: are there any challenges to this?

5 Likes

Nice work @Henry
One of the most significant technology advancements of the last ten years is blockchain. Understanding blockchain is becoming necessary because it is a crucial component of online investing, cryptocurrencies, and cyber security.
Blockchain’s rising appeal in various online investing marketplaces is due to its ability to process transactional data swiftly and easily. It also provides some significant security safeguards. A blockchain makes it extremely difficult to change or modify any block, which helps prevent fraud. Additionally, each block has identification codes that make it possible to track transactional data.
Blockchain has become a desirable target for hackers and other cybercriminals because it is so widely used. With this gain in popularity, a number of cybersecurity problems with blockchain have emerged. Some of these issues include routing attacks, blockchain endpoint vulnerabilities, Sybil attacks, and phishing attacks.
In response to the research question in this summary: while blockchain may pose some security risks, there are several steps, apart from the ones mentioned in the research work, that cyber security experts can take to lessen them. IT experts will be in a good position to deploy blockchain as safely as possible if they have properly honed their analytical and technical skills.
Utilizing encryption is a crucial step for cyber security experts to take. They can help reduce some of the inherent risks by further encrypting the data that is transmitted over blockchain technology.
Professionals in cyber security can also use their communication skills to explain potential risks to clients properly. This might be as simple as advising a business to thoroughly investigate providers and voice any security concerns before adopting a new blockchain platform. A cyber security expert may also offer advice on common-sense information security practices, such as adopting aliases when transacting online.

3 Likes

Thanks for your comment @mansion. Yes, there are challenges, and I believe Samuel’s comment above answers your question. Please let me know if you are not satisfied with that. Thanks once again for your comment.

1 Like

Hello @Cashkid18, thank you for contributing further on this research paper. I am impressed by your recommendations and your conclusion, which may help protect blockchain networks if implemented.

2 Likes