Song Han

TinyTL

NeurIPS’20: Tiny Transfer Learning: Towards Memory-Efficient On-Device Learning

  • Three benchmark datasets: Cars, Flowers, Aircraft

    • Using ImageNet as the pre-training dataset.
    • Neural network architecture: MobileNetV2 (lightweight), ResNet-50.
  • Devices: Raspberry Pi 1 with 256MB of memory.

Once for all

ICLR’20: Once-for-all: Train one network and specialize it for efficient deployment.

  • ImageNet;
  • Devices: Samsung S7 Edge, Note10, Google Pixel1, Pixel2, LG G8, NVIDIA 1080 Ti, V100 GPUs, Jetson TX2, Intel Xeon CPU, Xilinx ZU9EG, and ZU3EG FPGAs.
  • Cloud Devices:
    • GPU: NVIDIA 1080Ti and V100 with PyTorch 1.0 + cuDNN.
    • CPU: Intel Xeon E5-2690 v4 + MKL-DNN, batch size 1.
  • Edge Devices:
    • Mobile phones: Samsung, Google and LG phones with TF-Lite, batch size 1;
    • Mobile GPU: Jetson TX2 with PyTorch 1.0 + cuDNN, batch size 16;
    • Embedded FPGA: Xilinx ZU9EG and ZU3EG FPGAs with Vitis AI, batch size 1 (inference acceleration).
    • Xilinx ZU9EG (ZCU102, $2495)
    • Xilinx ZU3EG ()

Deep Leakage

NeurIPS’19: Deep Leakage from Gradients.

Diff Aug GAN

NeurIPS’20: Differentiable Augmentation for Data-Efficient GAN Training

  • Model: BigGAN, CR-BigGAN, StyleGAN2
  • Datasets: ImageNet (128x128 resolution), FFHQ portrait dataset (256x256), images generated by few-shot learning.
  • Devices: not specified in the paper. (General platform?)
  • Code: Data-efficient-gans

More

  • 2019 Proxyless Nas
  • Q&A
    References:
    • ICLR’19: ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware. By Han Cai, Ligeng Zhu, and Song Han.
    • NeurIPS’20: MCUNet: Tiny Deep Learning on IoT Devices. By Ji Lin, Wei-Ming Chen, Yujun Lin, John Cohn, Chuang Gan, Song Han.

  • 2020 MCUNet
  • Q&A
    • What is the search space? What is the mobile search space? See [c42] Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, and Quoc V. Le. MnasNet: Platform-Aware Neural Architecture Search for Mobile. In CVPR, 2019.
    • What is a model? What is the system part and the model part in system-model co-design?
    • What is one-shot architecture search? See [c4] Gabriel Bender, Pieter-Jan Kindermans, Barret Zoph, Vijay Vasudevan, and Quoc Le.

  • Once-for-all: Train one network and specialize it for efficient deployment
  • Q&A
    • What is MACs? Resolution? Kernel? Depth? Width? Channel? L1 norm of channel’s weight?
    • Problem formalization: $\min_{W_o} \sum_{a_i} L_v(C(W_o, a_i))$
    • References: Once-for-all: Qualcomm News
    The future will be populated with many IoT devices that are AI-capable. AI will surround our lives at much lower cost, lower latency, and higher accuracy. There will be more powerful AI applications running on tiny edge devices, which requires extremely compact models and efficient chips.
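
As a rough answer to the MACs question in the Once-for-all Q&A above, here is a minimal sketch of how multiply-accumulate operations (MACs) scale with resolution, kernel size, depth, and width. The function names `conv2d_macs` and `stack_macs` are made up for illustration, and the network is assumed to be a plain stack of stride-1, same-padded convolutions, which is not the architecture used in the paper.

```python
def conv2d_macs(in_channels, out_channels, kernel_size, out_height, out_width, groups=1):
    """MACs of one 2D convolution, ignoring bias (hypothetical helper)."""
    # Each output element needs kernel_size^2 * (in_channels / groups) multiply-accumulates.
    macs_per_output = kernel_size * kernel_size * (in_channels // groups)
    # There are out_channels * out_height * out_width output elements.
    return macs_per_output * out_channels * out_height * out_width


def stack_macs(resolution, width, depth, kernel_size, in_channels=3):
    """MACs of an assumed stack of `depth` stride-1, same-padded conv layers.

    resolution: input height and width (kept constant through the stack)
    width: number of output channels of every layer
    depth: number of layers
    """
    total = 0
    channels = in_channels
    for _ in range(depth):
        total += conv2d_macs(channels, width, kernel_size, resolution, resolution)
        channels = width  # the next layer consumes this layer's output channels
    return total


if __name__ == "__main__":
    # Compare a small and a larger configuration of the same toy stack.
    for resolution, width, depth, kernel_size in [(128, 16, 2, 3), (224, 32, 4, 5)]:
        macs = stack_macs(resolution, width, depth, kernel_size)
        print(f"res={resolution} width={width} depth={depth} kernel={kernel_size}: {macs} MACs")
```

In this toy setting MACs grow quadratically with resolution and kernel size, roughly quadratically with width (both input and output channels scale), and linearly with depth. Setting `groups` equal to the channel count gives the depthwise case.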

Created Oct 22, 2020 // Last Updated Aug 31, 2021
