Research Summary: Collaborative Learning for Cyberattack Detection in Blockchain Networks

Hello @Mansion, for the sake of clarity, and to help future readers: yes, I agree with you that in network security, machine learning (ML) is often considered the most effective solution for detecting cyberattacks with very high accuracy. However, we can observe two main challenges for ML-based intrusion detection in blockchain networks. The first challenge is the lack of synthetic data generated in laboratories for training ML models. Most current works use conventional cybersecurity datasets (e.g., UNSW-NB15 and BoT-IoT) to train their models. However, these datasets were not designed for blockchain networks, so they are not appropriate for intrusion detection systems in blockchain networks. Other works tried to build their own datasets for blockchain networks, e.g., by obtaining normal samples from the Bitcoin network, by running a simulation experiment to detect the link flooding attack (LFA), or by generating artificial attack samples with a conditional generative adversarial network (CGAN). However, these methods have several issues. First, the transaction samples collected from the Bitcoin network may include attacks, because it is a public blockchain network, yet all of the collected data are labeled as normal.

Second, the simulation experiment generated traceroute records only for the LFA, so it cannot be extended to other attacks. Furthermore, it is difficult to evaluate whether artificial attack samples faithfully simulate a real attack on a blockchain network. The second challenge is that all current ML-based intrusion detection solutions for blockchain networks are based on centralized learning models, i.e., all data is collected at a centralized node for training and detection. However, this approach is not suitable for blockchains because they are decentralized networks. Specifically, nodes in blockchain networks may hold different data to train on, and due to privacy concerns they may not want to share their raw data with a centralized node (or other nodes) for training. Moreover, sending a huge amount of data over the network will not only cause excessive network traffic, but also risk compromising the data integrity of blockchain networks.
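
To make the contrast with centralized learning concrete, here is a minimal sketch of one common collaborative scheme, federated averaging: each node trains on its own private data and shares only model weights, never raw samples. This is an illustration under my own assumptions, not the method from the research being summarized. The toy data generator, the logistic-regression model, and all function names (`local_train`, `federated_average`, `make_node_data`, `run_rounds`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def local_train(weights, samples, lr=0.1, epochs=5):
    """Run SGD for logistic regression on one node's private data."""
    w = list(weights)
    for _ in range(epochs):
        for features, label in samples:
            x = features + [1.0]  # append a constant bias input
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - label
            for i in range(len(w)):
                w[i] -= lr * err * x[i]
    return w


def federated_average(local_weights, sample_counts):
    """Average the nodes' weights, weighted by local dataset size."""
    total = sum(sample_counts)
    dim = len(local_weights[0])
    return [
        sum(w[i] * n for w, n in zip(local_weights, sample_counts)) / total
        for i in range(dim)
    ]


def make_node_data(rng, n, attack_ratio):
    """Hypothetical toy traffic: 2 features, label 1 = attack, 0 = normal."""
    data = []
    for _ in range(n):
        label = 1 if rng.random() < attack_ratio else 0
        center = 2.0 if label else -2.0
        data.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return data


def run_rounds(node_datasets, rounds=10):
    """Each round: nodes train locally, then only weights are aggregated."""
    global_w = [0.0, 0.0, 0.0]
    for _ in range(rounds):
        local = [local_train(global_w, d) for d in node_datasets]
        global_w = federated_average(local, [len(d) for d in node_datasets])
    return global_w


def accuracy(weights, samples):
    correct = 0
    for features, label in samples:
        x = features + [1.0]
        pred = 1 if sigmoid(sum(wi * xi for wi, xi in zip(weights, x))) >= 0.5 else 0
        correct += int(pred == label)
    return correct / len(samples)


rng = random.Random(0)
# Three nodes that see very different attack ratios (different local data).
nodes = [make_node_data(rng, 200, ratio) for ratio in (0.1, 0.5, 0.9)]
global_weights = run_rounds(nodes)
test_set = make_node_data(rng, 300, 0.5)
print(f"global model accuracy: {accuracy(global_weights, test_set):.2f}")
```

In this sketch the only thing that leaves a node per round is a three-element weight vector, which is how this kind of scheme addresses both the privacy concern and the traffic concern raised above. A real deployment would also need to handle unreliable or malicious nodes, which this toy example ignores.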
