Summary
Federated Learning (FL) is a machine learning setting in which clients collaboratively train a global model without exchanging their data, thereby protecting user privacy. However, most existing FL methods still rely on single-modal data, partly because high-quality labeled multimodal data is hard to collect in the real world. In this paper, we consider the new problem of multimodal federated learning. Although multimodal data benefits from the complementarity of different modalities, the modality discrepancy makes the multimodal FL problem difficult to solve with traditional FL methods. We therefore propose a unified framework for it. Within this framework, a co-attention mechanism fuses the complementary information of the different modalities, and an enhanced FL algorithm learns useful global features across modalities to jointly train a common model for all clients. In addition, we use a personalization method based on Model-Agnostic Meta-Learning (MAML) to adapt the final model to each client. Extensive experiments on multimodal activity recognition tasks demonstrate the effectiveness of the proposed method.
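The abstract names two components, co-attention fusion and MAML-based personalization, without detail. Purely as an illustration, the PyTorch sketch below shows one plausible form of each; the names (CoAttentionFusion, maml_adapt), the bilinear affinity matrix, the mean pooling, and the inner learning rate are all assumptions made for this sketch, not the paper's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoAttentionFusion(nn.Module):
    """Hypothetical co-attention fusion of two modality feature sequences.

    An affinity matrix couples the two modalities so that each one
    attends over the other; the cross-attended features are pooled
    and concatenated into a single fused representation.
    """
    def __init__(self, dim: int):
        super().__init__()
        self.affinity = nn.Linear(dim, dim, bias=False)  # W in C = (x W) y^T

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # x: (B, Nx, d), y: (B, Ny, d) -> pairwise affinity (B, Nx, Ny)
        c = torch.bmm(self.affinity(x), y.transpose(1, 2))
        # Each modality attends over the other.
        x_ctx = torch.bmm(F.softmax(c, dim=-1), y)                  # (B, Nx, d)
        y_ctx = torch.bmm(F.softmax(c.transpose(1, 2), dim=-1), x)  # (B, Ny, d)
        # Mean-pool and concatenate the cross-attended features.
        return torch.cat([x_ctx.mean(dim=1), y_ctx.mean(dim=1)], dim=-1)  # (B, 2d)

def maml_adapt(model, loss_fn, support_x, support_y, inner_lr=0.01):
    """One MAML-style inner step: personalize a global model on one
    client's support data. Returns adapted parameters, usable e.g. via
    torch.func.functional_call(model, adapted, (x,)). Hyperparameters
    here are illustrative assumptions."""
    names, params = zip(*model.named_parameters())
    loss = loss_fn(model(support_x), support_y)
    grads = torch.autograd.grad(loss, params)
    return {n: p - inner_lr * g for n, p, g in zip(names, params, grads)}

if __name__ == "__main__":
    # Hypothetical activity-recognition example with two sensor modalities.
    fusion = CoAttentionFusion(dim=64)
    acc = torch.randn(8, 20, 64)   # e.g. accelerometer feature sequence
    gyro = torch.randn(8, 30, 64)  # e.g. gyroscope feature sequence
    fused = fusion(acc, gyro)
    print(fused.shape)             # torch.Size([8, 128])

    head = nn.Linear(128, 10)      # hypothetical classifier head, 10 activities
    sy = torch.randint(0, 10, (8,))
    adapted = maml_adapt(head, F.cross_entropy, fused.detach(), sy)
```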
Institution: Graduate University of Chinese Academy of Sciences; Zhengzhou University