2020 Fedml
ArXiv’20: FedML: A Research Library and Benchmark for Federated Learning
Overview
Problem: Existing Federated Learning libraries:
- Cannot adequately support diverse algorithm development;
- Lack diverse FL computing paradigms:
- TensorFlow Federated, PySyft, and LEAF only support centralized topology-based FL algorithms;
- FATE and PaddleFL do not support new algorithms;
- Lack diverse FL configurations:
- FL is diverse in network topology, exchanged information, and training procedures.
- This diversity is not supported in existing FL libraries.
- Have inconsistent dataset and model usage, which makes fair algorithm comparison challenging:
- Evidence: papers published at top ML conferences (NeurIPS, ICLR, ICML) in the past 2 years use inconsistent experimental settings.
- Several factors could affect results:
- non-I.I.D distribution characteristic of FL
- datasets
- models
- number of clients involved in each round
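To make the I.I.D vs. non-I.I.D factor concrete, here is a minimal sketch of two ways to split one labeled dataset across clients. It is not taken from the paper or from FedML: the function names, the toy dataset, and the label-sorted splitting scheme are all assumptions chosen for illustration (real benchmarks may use other partitioning schemes).

```python
import random
from collections import Counter


def partition_iid(labels, num_clients, seed=0):
    """Shuffle sample indices and deal them out evenly,
    so every client sees a similar label mix."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    return [indices[c::num_clients] for c in range(num_clients)]


def partition_label_sorted(labels, num_clients):
    """Sort indices by label and cut them into contiguous chunks,
    so each client sees only a few classes (label skew)."""
    indices = sorted(range(len(labels)), key=lambda i: labels[i])
    chunk = len(indices) // num_clients
    return [indices[c * chunk:(c + 1) * chunk] for c in range(num_clients)]


def label_histogram(labels, client_indices):
    """Count how many samples of each class a client holds."""
    return Counter(labels[i] for i in client_indices)


# Hypothetical toy dataset: 10 classes, 100 samples per class.
labels = [cls for cls in range(10) for _ in range(100)]

iid_parts = partition_iid(labels, num_clients=5)
skewed_parts = partition_label_sorted(labels, num_clients=5)

print("I.I.D client 0:    ", sorted(label_histogram(labels, iid_parts[0]).items()))
print("non-I.I.D client 0:", sorted(label_histogram(labels, skewed_parts[0]).items()))
```

With the label-sorted split, client 0 holds only classes 0 and 1, while the shuffled split gives each client a mix of classes. Two papers that differ only in this choice can report very different accuracy for the same algorithm.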
Solution: FedML, an open research library and benchmark to facilitate FL algorithm development and fair performance comparison:
- Supports three computing paradigms:
- On-device training for edge devices;
- Distributed computing;
- Single-machine simulation;
- Flexible and generic API
- Reference baseline implementations (optimizers, models, and datasets)
- Real-world hardware platforms:
- Mobile, Android
- IoT, Raspberry Pi 4 and NVIDIA Jetson Nano
Evaluation
- Trained two CNNs (ResNet-56 and MobileNet) using the standard FedAvg algorithm.
- Result: Accuracy in the non-I.I.D setting is lower than in the I.I.D setting, which is consistent with findings reported in prior work.
- Compared the training time of distributed computing with that of standalone simulation.
- Result: Standalone simulation is 8 times slower than distributed training with 10 parallel workers (i.e., an 8x speedup, or 80% parallel efficiency).
- Conclusion: FedML’s distributed paradigm is useful, but it is not available in existing FL libraries such as PySyft, LEAF, and TFF.
- Multiprocessing in a single GPU:
- Training ResNet on CIFAR-10, FedML can run 112 workers on a server with 8 GPUs (14 workers per GPU if spread evenly).
- (No performance data here??)
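For reference, the aggregation step of FedAvg used in the evaluation above can be sketched as a sample-count-weighted average of client parameters. This is a minimal illustration of the commonly described FedAvg rule, not FedML's implementation: the function name, the flat-list representation of model parameters, and the example numbers are assumptions.

```python
def fedavg_aggregate(client_weights, client_sizes):
    """Average client parameter vectors, weighting each client
    by its share of the total number of local training samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        for j in range(dim):
            global_weights[j] += (n / total) * weights[j]
    return global_weights


# Hypothetical round: three clients, each with a 2-parameter model.
client_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
client_sizes = [10, 30, 60]  # local sample counts

global_model = fedavg_aggregate(client_weights, client_sizes)
print(global_model)  # approximately [4.0, 5.0]
```

In a full round, the server would send the global model to the selected clients, each client would train locally for a few epochs, and the server would then apply this averaging step to the returned parameters. The number of clients selected per round is one of the factors listed above that affect results.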
Can we do the same thing?
Lele: No. Two challenges:
- An idea like this only comes with relatively rich experience in FL research. Frankly, I am still a newbie at FL, so right now I would not even know that FL research has a benchmark and library problem.
- Even if I were assigned the task of building a new standard library for FL, I am afraid I could not build it well because of my lack of experience. This kind of standardization work, especially for a newbie, could take a while (maybe years??). We would first need to study all the existing libraries and how they work, and then work out how to merge them into one that provides rich but consistent interfaces covering most existing functionality. This kind of work is really useful, but the result cannot be made excellent in a short time (half a year to one year). Instead, I believe the more time is spent, the better the result will be for this kind of work. (Like the standardization of the C language or of protocols in IEEE standards, which always seems to be a ‘slow’ process.)
More