2020 FedML

ArXiv’20: FedML: A Research Library and Benchmark for Federated Learning

Overview

Problem: Existing Federated Learning libraries:

  • Cannot adequately support diverse algorithm development:
    • Lack of diverse FL computing paradigms:
      • TensorFlow-Federated, PySyft, and LEAF only support FL algorithms with a centralized topology;
      • FATE and PaddleFL do not support new algorithms;
    • Lack of diverse FL configurations:
      • FL algorithms are diverse in network topology, exchanged information, and training procedure, and this diversity is not supported in existing FL libraries.
  • Have inconsistent dataset and model usage, which makes fair algorithm comparison challenging:
    • Surveyed papers from top ML conferences (NeurIPS, ICLR, ICML) in the past two years.
    • Several factors can affect results:
      • the non-I.I.D. distribution characteristic of FL;
      • datasets;
      • models;
      • number of clients involved in each round.
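The non-I.I.D. factor above can be made concrete. One common way to simulate label skew across clients in FL experiments (shown here as a generic illustration, not FedML's own partitioner; the function name `dirichlet_partition` is hypothetical) is a Dirichlet split over class labels, assuming NumPy:

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with Dirichlet(alpha) label skew.

    Small alpha -> highly non-I.I.D. (each client sees few classes);
    large alpha -> close to I.I.D.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = rng.permutation(np.where(labels == cls)[0])
        # Fraction of this class assigned to each client.
        props = rng.dirichlet([alpha] * num_clients)
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, shard in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(shard.tolist())
    return client_indices

# Example: 1000 samples over 10 classes, split across 4 clients.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, num_clients=4, alpha=0.5)
```

Varying `alpha` is what lets a benchmark sweep from near-I.I.D. to pathological splits while holding the dataset and model fixed.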

Solution: FedML, an open research library and benchmark to facilitate FL algorithm development and fair performance comparison:

  • Supports three computing paradigms:
    • On-device training for edge devices;
    • Distributed computing;
    • Single-machine simulation;
  • Flexible and generic API
  • Reference baseline implementations (optimizers, models, and datasets)
  • Real-world hardware platforms:
    • Mobile: Android
    • IoT: Raspberry Pi 4 and NVIDIA Jetson Nano

Evaluation

  • Trained two CNNs (ResNet-56 and MobileNet) with the standard FedAvg algorithm.
    • Result: accuracy in the non-I.I.D. setting is lower than in the I.I.D. setting, consistent with findings reported in prior work.
  • Compared the training time of distributed computing with that of standalone simulation.
    • Result: standalone simulation is about 8× slower than training with 10 parallel workers.
    • Conclusion: FedML’s distributed paradigm is useful, and is not available in existing FL libraries such as PySyft, LEAF, and TFF (TensorFlow-Federated).
  • Multiprocessing on a single GPU:
    • Training ResNet on CIFAR-10, FedML can run 112 workers on a server with 8 GPUs.
    • (No performance data here??)
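The FedAvg server step used in the evaluation above is just a sample-weighted average of client parameters. A minimal sketch of that aggregation, assuming NumPy and plain lists of per-layer arrays (an illustration of the algorithm, not FedML's actual API; `fedavg_aggregate` is a hypothetical name):

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """FedAvg server step: weighted average of client model parameters.

    client_weights: one list of per-layer np.ndarrays per client.
    client_sizes: number of local samples per client (aggregation weights).
    """
    total = float(sum(client_sizes))
    coeffs = [n / total for n in client_sizes]
    num_layers = len(client_weights[0])
    # Average each layer across clients, weighted by local dataset size.
    return [
        sum(c * w[layer] for c, w in zip(coeffs, client_weights))
        for layer in range(num_layers)
    ]

# Two clients, one "layer" each; client 0 has 3x the data of client 1.
w = fedavg_aggregate([[np.array([0.0])], [np.array([4.0])]], [3, 1])
# -> [array([1.])]  (0.75 * 0 + 0.25 * 4)
```

Weighting by local dataset size is what distinguishes FedAvg from a plain mean and matters precisely in the non-I.I.D. settings the evaluation studies.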

Can we do the same thing?

Lele: No. Two Challenges:

  • The idea could not have come to me unless I had relatively rich experience in FL research. But frankly, I am just a newbie at FL right now; I did not even know there could be a benchmark-and-library problem in FL research.
  • Even if I were assigned the task of building a new standard library for FL, I am afraid I could not build it well due to my lack of experience. This kind of standardization work, especially for a newbie, could take a long time (years, even?). We would first need to study all the existing libraries and how they work, then figure out how to merge them into one and provide a rich but consistent interface covering most existing functionality. This kind of work is really useful, but the result cannot be made excellent in a short time (half a year to a year). Instead, I believe the more time spent, the better the result for this kind of work. (Like the standardization of the C language or of other protocols by IEEE, it always seems to be a ‘slow’ process.)

More

Created Nov 22, 2020 // Last Updated Aug 31, 2021
