Benchmarking Results for MPI-based federated learning

Please see the latest benchmark experimental results in the online report (https://app.wandb.ai/automl/fedml/reports/FedML-Benchmark-Experimental-Results--VmlldzoxODE2NTU). The FedML white paper (https://arxiv.org/pdf/2007.13518.pdf) also summarizes the dataset list and related benchmarks. We adopt hyperparameters from, and reproduce the results of, papers published at top-tier ML conferences. The source of each reference hyperparameter is listed below.

Linear Models

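| Data | Model | Alg | Partition | #C | #C_p | bs | c_opt | lr | e | #R | acc |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MNIST | LR | FedAvg | Power Law | 1000 | 10 | 10 | SGD | 0.03 | 1 | >100 | >75 |
| Federated EMNIST | LR | FedAvg | Power Law | 200 | 10 | 10 | SGD | 0.003 | 1 | >200 | 10~40 |
| Synthetic(α,β) | LR | FedAvg | Power Law | 30 | 10 | 10 | SGD | 0.01 | 1 | >200 | >60 |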

Note: #C stands for client_num_in_total; #C_p stands for client_num_per_round; bs = batch_size; c_opt = client optimizer; e = epoch; #R = number of rounds; acc = accuracy. For Synthetic(α,β), (α,β) is chosen from (0,0), (0.5,0.5), (1,1)
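
The abbreviations in the note map directly onto configuration fields. The following is a minimal sketch of how the three linear-model rows could be held in code. The `BenchmarkConfig` class, its field `comm_round_min`, and the helper `participation_fraction` are hypothetical names introduced for illustration; they are not part of the FedML API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkConfig:
    """One row of the linear-model table (hypothetical container, not a FedML class)."""
    dataset: str
    model: str
    algorithm: str
    partition: str
    client_num_in_total: int   # "#C" in the table
    client_num_per_round: int  # "#C_p" in the table
    batch_size: int            # "bs"
    client_optimizer: str      # "c_opt"
    lr: float
    epochs: int                # "e"
    comm_round_min: int        # "#R" is reported as a lower bound, e.g. ">100"


# Values copied from the table above.
LINEAR_BENCHMARKS = [
    BenchmarkConfig("MNIST", "LR", "FedAvg", "Power Law", 1000, 10, 10, "SGD", 0.03, 1, 100),
    BenchmarkConfig("Federated EMNIST", "LR", "FedAvg", "Power Law", 200, 10, 10, "SGD", 0.003, 1, 200),
    BenchmarkConfig("Synthetic", "LR", "FedAvg", "Power Law", 30, 10, 10, "SGD", 0.01, 1, 200),
]

# The (alpha, beta) settings listed in the note for the Synthetic dataset.
SYNTHETIC_ALPHA_BETA = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]


def participation_fraction(cfg: BenchmarkConfig) -> float:
    """Fraction of all clients that take part in a single round."""
    return cfg.client_num_per_round / cfg.client_num_in_total


if __name__ == "__main__":
    for cfg in LINEAR_BENCHMARKS:
        print(f"{cfg.dataset}: {participation_fraction(cfg):.1%} of clients per round")
```

Seen this way, the three rows differ mainly in how sparse participation is: with 10 clients per round, MNIST samples 1% of its clients, while Synthetic samples a third of them.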

  • MNIST – Logistic Regression – FedAvg
    • Partition Method: ‘Federated optimization in heterogeneous networks’, page 7, Section 5.1, ‘Real data’
    • client_num_in_total: ‘Federated optimization in heterogeneous networks’, page 7, Section 5.1, ‘Real data’
    • client_num_per_round: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
    • batch_size: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
    • client_optimizer: ‘Federated optimization in heterogeneous networks’, page 8, Section 5.1, ‘Implementation’
    • lr: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
    • epochs: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2 Figure 9 description
    • comm_round: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2 Figure 10
    • accuracy: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2 Figure 10
  • Federated EMNIST – Logistic Regression – FedAvg
    • Partition Method: ‘Federated optimization in heterogeneous networks’, page 7, Section 5.1, ‘Real data’
    • client_num_in_total: ‘Federated optimization in heterogeneous networks’, page 7, Section 5.1, ‘Real data’
    • client_num_per_round: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
    • batch_size: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
    • client_optimizer: ‘Federated optimization in heterogeneous networks’, page 8, Section 5.1, ‘Implementation’
    • lr: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
    • epochs: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2 Figure 9 description
    • comm_round: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2 Figure 10
    • accuracy: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2 Figure 10
  • Synthetic(α,β) – Logistic Regression – FedAvg
    • Partition Method: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.1, ‘Synthetic’
    • client_num_in_total: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.1, ‘Synthetic’
    • client_num_per_round: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
    • batch_size: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
    • client_optimizer: ‘Federated optimization in heterogeneous networks’, page 8, Section 5.1, ‘Implementation’
    • lr: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
    • epochs: ‘Federated optimization in heterogeneous networks’, page 8, Section 5.1, ‘Hyperparameters & evaluation metrics’
    • comm_round: ‘Federated optimization in heterogeneous networks’, page 19, Appendix C.3.2 Figure 6
    • accuracy: ‘Federated optimization in heterogeneous networks’, page 19, Appendix C.3.2 Figure 6

Lightweight and shallow neural network models

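| Task | Data Set | Model | Algorithm | Partition Method | Partition Alpha | client_num_in_total | client_num_per_round | batch_size | client_optimizer | lr | wd | epochs | comm_round | accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CV | Federated EMNIST | CNN (2 Conv + 2 FC) | FedAvg | Power Law | | 3400 | 10 | 20 | SGD | 0.1 | - | 1 | >1500 | 84.9 |
| CV | CIFAR-100 | ResNet-18 + group normalization | FedAvg | Pachinko Allocation | 100/500 (ex/cli) | 500 | 10 | 20 | SGD | 0.1 | - | 1 | >4000 | 44.7 |
| NLP | Shakespeare | RNN (2 LSTM + 1 FC) | FedAvg | realistic partition | | 715 | 10 | 4 | SGD | 1 | - | 1 | >1200 | 56.9 |
| NLP | StackOverflow | RNN (1 LSTM + 2 FC) | FedAvg | Pachinko Allocation | | 342477 | 50 | 16 | SGD | pow(10,-0.5) | - | 1 | >1500 | 19.5 |
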
  • Federated EMNIST – CNN – FedAvg (https://openreview.net/pdf?id=LkFG3lB13U5)
    • Partition Method: ‘Adaptive federated optimization’ (https://openreview.net/pdf?id=LkFG3lB13U5), page 23, Appendix C.2
    • client_num_in_total: ‘Adaptive federated optimization’, page 23, Appendix C Dataset & Models, Table 2
    • client_num_per_round: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
    • batch_size: ‘Adaptive federated optimization’, page 27, Appendix D Experiment Hyperparameters, Table 7
    • client_optimizer: ‘Adaptive federated optimization’, page 25, Appendix D.1, Paragraph 1
    • lr: ‘Adaptive federated optimization’, page 27, Appendix D.4, Table 8
    • wd (learning rate decay): ‘Adaptive federated optimization’, page 34, Appendix E.6, Paragraph 2
    • epochs: ‘Adaptive federated optimization’, page 34, Appendix E.6, Paragraph 1
    • comm_round: ‘Adaptive federated optimization’, page 28, Appendix E.1, Figure 3
    • accuracy: ‘Adaptive federated optimization’, page 7, Section 5, Table 1
  • CIFAR-100 – ResNet18 – FedAvg
    • Partition Method: ‘Adaptive federated optimization’, page 23, Appendix C.1, Paragraph 3
    • Partition_alpha: ‘Adaptive federated optimization’, page 23, Appendix C.1, Paragraph 2
    • client_num_in_total: ‘Adaptive federated optimization’, page 23, Appendix C Dataset & Models, Table 2
    • client_num_per_round: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
    • batch_size: ‘Adaptive federated optimization’, page 27, Appendix D Experiment Hyperparameters, Table 7
    • client_optimizer: ‘Adaptive federated optimization’, page 25, Appendix D.1, Paragraph 1
    • lr: ‘Adaptive federated optimization’, page 27, Appendix D.4, Table 8
    • epochs: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
    • comm_round: ‘Adaptive federated optimization’, page 7, Section 4, Figure 1
    • accuracy: ‘Adaptive federated optimization’, page 7, Section 5, Table 1
  • Shakespeare – RNN – FedAvg
    • Partition Method: ‘Adaptive federated optimization’, page 23, Appendix C.3
    • client_num_in_total: ‘Adaptive federated optimization’, page 23, Appendix C Dataset & Models, Table 2
    • client_num_per_round: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
    • batch_size: ‘Adaptive federated optimization’, page 27, Appendix D Experiment Hyperparameters, Table 7
    • client_optimizer: ‘Adaptive federated optimization’, page 25, Appendix D.1, Paragraph 1
    • lr: ‘Adaptive federated optimization’, page 27, Appendix D.4, Table 8
    • epochs: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
    • comm_round: ‘Adaptive federated optimization’, page 7, Section 4, Figure 1
    • accuracy: ‘Adaptive federated optimization’, page 7, Section 5, Table 1
  • StackOverflow – RNN – FedAvg
    • Partition Method: ‘Adaptive federated optimization’, page 23, Appendix C.4, Paragraph 2
    • client_num_in_total: ‘Adaptive federated optimization’, page 25, Appendix C.4, Paragraph 1
    • client_num_per_round: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
    • batch_size: ‘Adaptive federated optimization’, page 27, Appendix D Experiment Hyperparameters, Table 7
    • client_optimizer: ‘Adaptive federated optimization’, page 25, Appendix D.1, Paragraph 1
    • lr: ‘Adaptive federated optimization’, page 27, Appendix D.4, Table 8
    • epochs: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
    • comm_round: ‘Adaptive federated optimization’, page 7, Section 4, Figure 1
    • accuracy: ‘Adaptive federated optimization’, page 7, Section 5, Table 1
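
The settings client_num_in_total and client_num_per_round describe partial participation: each communication round trains on only a small subset of clients. The following is a minimal sketch of one way to draw that subset, assuming uniform sampling without replacement and a seed derived from the round index so that reruns select the same clients. The function `sample_clients` is a hypothetical helper for illustration and is not taken from FedML or from the cited paper. The last line evaluates the StackOverflow learning rate written as pow(10,-0.5) in the table.

```python
import random


def sample_clients(round_idx, client_num_in_total, client_num_per_round):
    """Pick the client indices that participate in one communication round.

    Hypothetical helper: uniform sampling without replacement, seeded by the
    round index so that the selection is reproducible across runs.
    """
    if client_num_per_round >= client_num_in_total:
        # Full participation: every client trains in every round.
        return list(range(client_num_in_total))
    rng = random.Random(round_idx)
    return sorted(rng.sample(range(client_num_in_total), client_num_per_round))


# StackOverflow row: 342477 clients in total, 50 sampled per round.
picked = sample_clients(round_idx=0, client_num_in_total=342477, client_num_per_round=50)

# The StackOverflow learning rate pow(10, -0.5) is roughly 0.316.
stackoverflow_lr = 10 ** -0.5

if __name__ == "__main__":
    print(len(picked), "clients selected in round 0")
    print(f"lr = {stackoverflow_lr:.4f}")
```

Seeding by round index is a design choice made here for reproducibility; any other deterministic seeding scheme would serve the same purpose.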

Benchmarking using modern DNNs

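| Data | Model | Alg | #C | #C_p | bs | c_opt | lr | wd | e | round | IID acc | non-IID acc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CIFAR10 | ResNet-56 | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 93.19 | 87.12 |
| CIFAR100 | ResNet-56 | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 68.91 | 64.70 |
| CINIC10 | ResNet-56 | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 82.57 | 73.49 |
| CIFAR10 | MobileNet | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 91.12 | 86.32 |
| CIFAR100 | MobileNet | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 55.12 | 53.54 |
| CINIC10 | MobileNet | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 79.95 | 71.23 |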

Note: The non-IID distribution is generated using LDA (Latent Dirichlet Allocation) with alpha = 0.5; #C stands for client_num_in_total; #C_p stands for client_num_per_round; bs = batch size; c_opt = client optimizer.
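
To make the non-IID setting concrete, the following is a minimal sketch of a Dirichlet-based label partition, assuming the note refers to the common scheme in which each class is split across clients in proportions drawn from a Dirichlet(alpha) distribution. The function `dirichlet_partition` and its parameters are hypothetical and use only the Python standard library; this is not FedML's own partitioning code.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients, alpha=0.5, seed=0):
    """Split sample indices across clients with Dirichlet-distributed class shares.

    Hypothetical sketch: for every class, draw client proportions from
    Dirichlet(alpha) and hand out that class's samples accordingly.
    Smaller alpha gives more skewed (more non-IID) client datasets.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    client_indices = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)

        # A Dirichlet(alpha) draw is a set of independent Gamma(alpha, 1)
        # draws normalized to sum to one.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas)

        # Turn cumulative proportions into cut points over this class's samples.
        start, cumulative = 0, 0.0
        for client in range(num_clients):
            cumulative += gammas[client] / total
            if client == num_clients - 1:
                end = len(idxs)  # the last client takes whatever remains
            else:
                end = min(len(idxs), max(start, int(round(cumulative * len(idxs)))))
            client_indices[client].extend(idxs[start:end])
            start = end
    return client_indices


if __name__ == "__main__":
    # Toy labels: 10 classes with 100 samples each, split over 10 clients.
    toy_labels = [i % 10 for i in range(1000)]
    parts = dirichlet_partition(toy_labels, num_clients=10, alpha=0.5)
    print([len(p) for p in parts])
```

Every sample is assigned to exactly one client, so the partition covers the dataset without overlap; only the class mix per client changes with alpha.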