Achieving privacy-preserving cross-silo anomaly detection using federated XGBoost
摘要
Privacy has raised considerable concerns recently, especially with the advent of information explosion and numerous data mining techniques to explore the information inside large volumes of data. These data are often collected and stored across different institutions (banks, hospitals, etc.), or termed crosssilo. In this context, cross-silo federated learning has become prominent to tackle the privacy issues, where only model updates will be transmitted from institutions to servers without revealing institutions' private information. In this paper, we propose a cross-silo federated XGBoost approach to solve the federated anomaly detection problem, which aims to identify abnormalities from extremely unbalanced datasets (e.g., credit card fraud detection) and can be considered a special classification problem. We design two privacy-preserving mechanisms that are tailored to the federated XGBoost: anonymity based data aggregation and local differential privacy. In the anonymity based data aggregation scenario, we cluster data into different groups and using a cluster-level data feature to train the model. In the local
