Federated Domain Generalization for Image Recognition via Cross-Client Style Transfer

1Hong Kong University of Science and Technology, 2The Chinese University of Hong Kong

Abstract

Domain generalization (DG) has been a hot topic in image recognition, with the goal of training a general model that performs well on unseen domains. Recently, federated learning (FL), an emerging machine learning paradigm that trains a global model from multiple decentralized clients without compromising data privacy, has brought new challenges and possibilities to DG. In the FL scenario, many existing state-of-the-art (SOTA) DG methods become ineffective because they require the centralization of data from different domains during training. In this paper, we propose a novel domain generalization method for image recognition under federated learning through cross-client style transfer (CCST) without exchanging data samples. Our CCST method leads to more uniform distributions across source clients and makes each local model learn to fit the image styles of all clients, which avoids divergent model biases. We propose two types of style (single image style and overall domain style), each with a corresponding mechanism, to be chosen according to the scenario. Our style representation is exceptionally lightweight and can hardly be used to reconstruct the dataset. The level of diversity can also be controlled flexibly with a hyper-parameter. Our method outperforms recent SOTA DG methods on two DG benchmarks (PACS, Office-Home) and a large-scale medical image dataset (Camelyon17) in the FL setting. Last but not least, our method is orthogonal to many classic DG methods and achieves additive performance gains when combined with them.

Method


Figure 1: Overview of our framework with style transfer across clients using three different source styles on the PACS dataset. We augment each client's data with the styles of the other two source clients.

Cross-client Style Transfer


Figure 2: The re-organized AdaIN framework utilized for cross-client style transfer in federated learning. The VGG encoder is shared between the style extraction and image generation stages. Dashed lines separate the three stages of our method: 1. Local style computation; 2. Server-side style bank broadcasting; 3. Local style transfer.
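The three stages in Figure 2 can be sketched roughly as follows. This is a minimal illustration, not the released implementation: it assumes a style is the channel-wise mean and standard deviation of encoder features (the standard AdaIN statistics), that the overall domain style pools these statistics over all images of a client, and it uses random arrays in place of VGG features. The decoder is omitted, and the client names and the helpers `channel_stats`, `overall_domain_style`, and `adain` are hypothetical.

```python
import numpy as np

EPS = 1e-5  # assumed small constant for numerical stability


def channel_stats(feat):
    """Channel-wise mean and std of one (C, H, W) feature map (single image style)."""
    flat = feat.reshape(feat.shape[0], -1)
    return flat.mean(axis=1), np.sqrt(flat.var(axis=1) + EPS)


def overall_domain_style(feats):
    """Channel-wise mean and std pooled over all (N, C, H, W) features of a client.

    Pooling over every image is an assumption made for this sketch.
    """
    flat = feats.transpose(1, 0, 2, 3).reshape(feats.shape[1], -1)
    return flat.mean(axis=1), np.sqrt(flat.var(axis=1) + EPS)


def adain(content_feat, style_mean, style_std):
    """Re-normalize content features so their channel statistics match the style."""
    c_mean, c_std = channel_stats(content_feat)
    normalized = (content_feat - c_mean[:, None, None]) / c_std[:, None, None]
    return normalized * style_std[:, None, None] + style_mean[:, None, None]


rng = np.random.default_rng(0)

# Stand-in encoder features for three source clients (illustrative names).
client_feats = {
    name: rng.normal(loc=i, scale=1.0 + i, size=(8, 4, 6, 6))
    for i, name in enumerate(["photo", "art", "cartoon"])
}

# Stage 1: each client computes its style locally.
local_styles = {name: overall_domain_style(f) for name, f in client_feats.items()}

# Stage 2: the server collects the styles into a bank and broadcasts it.
style_bank = dict(local_styles)

# Stage 3: a client transfers its own features to the styles of the other clients.
content = client_feats["photo"][0]
stylized = {
    name: adain(content, *style)
    for name, style in style_bank.items()
    if name != "photo"
}
```

Only the per-channel statistics leave a client in this sketch, which is what keeps the exchanged representation small; in a full pipeline the stylized features would still need to be decoded back into images.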

Algorithm


Results

Table 1: Accuracy comparison of image recognition on the PACS and Office-Home datasets. Each single-letter column represents an unseen target client. Our CCST with the overall domain style (K=3) outperforms other methods. We use FedAvg as our base FL framework. JiGen, RSC, and MixStyle are applied within each client. The backbone networks utilized in PACS and Office-Home are ImageNet-pretrained ResNet50 and ResNet18, respectively.
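For readers unfamiliar with the base framework, the aggregation step of FedAvg can be sketched as a sample-size-weighted average of client parameters. The snippet below is a generic illustration under that assumption, not our training code; the function name `fedavg` and the dictionary layout of the parameters are hypothetical.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Average client parameter dicts, weighting each client by its sample count."""
    total = float(sum(client_sizes))
    return {
        key: sum(w[key] * (n / total) for w, n in zip(client_weights, client_sizes))
        for key in client_weights[0]
    }


# Two toy clients holding 1 and 3 samples respectively.
clients = [{"w": np.array([0.0, 0.0])}, {"w": np.array([4.0, 8.0])}]
global_weights = fedavg(clients, client_sizes=[1, 3])
```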


Figure 3: The distribution of source clients on the Camelyon17 dataset before and after our CCST when the target client is hospital 5. The x-axis shows greyscale values, and the y-axis shows the average pixel count per image. Note that we convert the RGB images to greyscale for distribution visualization. (a) Before CCST, the distributions of source clients are not uniform. (b) After CCST with either single image style or overall domain style, the distributions of source clients become much more uniform.
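A histogram of the kind plotted in Figure 3 could be computed roughly as below. This is a sketch under assumptions: the RGB-to-grey conversion uses the common ITU-R BT.601 luma weights, which may differ from the conversion used for the figure, and the function name `average_grey_histogram` is hypothetical.

```python
import numpy as np


def average_grey_histogram(images):
    """Average per-image pixel count for each of the 256 greyscale values.

    images: uint8 array of shape (N, H, W, 3) holding RGB images.
    """
    luma = np.array([0.299, 0.587, 0.114])  # assumed BT.601 conversion weights
    grey = np.rint(images.astype(np.float64) @ luma).astype(np.int64)
    grey = np.clip(grey, 0, 255)
    counts = np.bincount(grey.ravel(), minlength=256)
    return counts / images.shape[0]


# Toy example: two all-white 4x4 images place all 16 pixels per image in bin 255.
white = np.full((2, 4, 4, 3), 255, dtype=np.uint8)
hist = average_grey_histogram(white)
```

Computing one such histogram per source client, before and after style transfer, gives curves that can be overlaid to compare how similar the client distributions are.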

Table 2: (a) Performance of our approach using ResNet50 as the backbone with four different image style transfer settings, compared with the FedAvg baseline on the PACS benchmark. Each column represents a single unseen target client. (b) Performance of our approach with test-time adaptation (Tent) on the PACS benchmark using ResNet50.


Figure 4: (a) The results on the Camelyon17 dataset. Our method outperforms other DG methods when tested on hospitals 4 and 5. (b) Extra performance boost for other domain generalization methods when combined with our cross-client style transfer (CCST) using the overall domain style on the PACS dataset. Each x-tick represents the single unseen client in a leave-one-client-out experiment, and Avg. denotes the average accuracy.


BibTeX

@inproceedings{chen2023ccst,
      author    = {Chen, Junming and Jiang, Meirui and Dou, Qi and Chen, Qifeng},
      title     = {Federated Domain Generalization for Image Recognition via Cross-Client Style Transfer},
      booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
      year      = {2023},
}