2018 Horovod

References:

Overview

Uber applies deep learning to self-driving research, trip forecasting, and fraud prevention.

Michelangelo[c3] is Uber's internal ML-as-a-service platform for deploying ML systems at scale.

Horovod is an open-source component of Michelangelo’s deep learning toolkit that makes it easier to start – and speed up – distributed deep learning projects with TensorFlow.

Motivation

As datasets grew, so did training times, with some runs taking a week or longer to complete. This created the need for a way to train on large amounts of data while keeping training times short.

No related work???

Created Oct 27, 2020 // Last Updated Aug 31, 2021
