Optimizing Inference

  • 2019 JointDNN
  • Eshratifar, Amir Erfan, Mohammad Saeed Abrishami, and Massoud Pedram. “JointDNN: An efficient training and inference engine for intelligent mobile cloud computing services.” IEEE Transactions on Mobile Computing (2019).

  • 2019 EuroSys μLayer
  • Kim, Youngsok, Joonsung Kim, Dongju Chae, Daehyun Kim, and Jangwoo Kim. “μLayer: Low Latency On-Device Inference Using Cooperative Single-Layer Acceleration and Processor-Friendly Quantization.” In Proceedings of the Fourteenth EuroSys Conference 2019, pp. 1-15. 2019.

  • 2018 HotEdge Inference
  • Ogden, Samuel S., and Tian Guo. “MODI: Mobile Deep Inference Made Efficient by Edge Computing.” In USENIX Workshop on Hot Topics in Edge Computing (HotEdge 18). 2018.

  • 2018 DeepCache
  • Xu, Mengwei, Mengze Zhu, Yunxin Liu, Felix Xiaozhu Lin, and Xuanzhe Liu. “DeepCache: Principled cache for mobile deep vision.” In Proceedings of the 24th Annual International Conference on Mobile Computing and Networking. 2018.

  • 2017 EMDL MobiRNN
  • Cao, Qingqing, Niranjan Balasubramanian, and Aruna Balasubramanian. “MobiRNN: Efficient recurrent neural network execution on mobile GPU.” In Proceedings of the 1st International Workshop on Deep Learning for Mobile Systems and Applications, pp. 1-6. 2017.

  • 2018 MobiSys On-Demand Compression
  • Liu, Sicong, Yingyan Lin, Zimu Zhou, Kaiming Nan, Hui Liu, and Junzhao Du. “On-demand deep model compression for mobile devices: A usage-driven model selection framework.” In Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services, pp. 389-400. 2018.

  • 2018 LCTES Adaptive Selection
  • Taylor, Ben, Vicent Sanz Marco, Willy Wolff, Yehia Elkhatib, and Zheng Wang. “Adaptive deep learning model selection on embedded systems.” In Proceedings of the 19th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES). 2018.

  • 2017 MEC: Memory-efficient Convolution for Deep Neural Network
  • Cho, Minsik, and Daniel Brand. “MEC: Memory-efficient convolution for deep neural network.” In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pp. 815-824. JMLR.org, 2017.

  • 2018 MobiCom NestDNN
  • Fang, Biyi, Xiao Zeng, and Mi Zhang. “NestDNN: Resource-aware multi-tenant on-device deep learning for continuous mobile vision.” In Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, pp. 115-127. 2018.

  • 2020 ASPLOS PatDNN
  • Niu, Wei, Xiaolong Ma, Sheng Lin, Shihao Wang, Xuehai Qian, Xue Lin, Yanzhi Wang, and Bin Ren. “PatDNN: Achieving real-time DNN execution on mobile devices with pattern-based weight pruning.” In Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). 2020.

  • Mobiles
  • Optimization of neural network inference to run on IoT/mobile devices.

  • 2016 IPSN DeepX / 2016 DeepX Toolkit
  • Lane, Nicholas D., Sourav Bhattacharya, Petko Georgiev, Claudio Forlivesi, Lei Jiao, Lorena Qendro, and Fahim Kawsar. “DeepX: A software accelerator for low-power deep learning inference on mobile devices.” In Proceedings of the 15th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN). 2016.
  • Lane, Nicholas D., Sourav Bhattacharya, Akhil Mathur, Claudio Forlivesi, and Fahim Kawsar. “DXTK: Enabling Resource-efficient Deep Learning on Mobile and Embedded Devices with the DeepX Toolkit.” In MobiCASE, pp. 98-107. 2016.

  • 2016 ISCA EIE
  • Han, Song, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, and William J. Dally. “EIE: Efficient inference engine on compressed deep neural network.” In Proceedings of the 43rd Annual International Symposium on Computer Architecture (ISCA). 2016.

  • AMC: AutoML for Model Compression and Acceleration on Mobile Devices
  • AMC uses reinforcement learning to efficiently sample the compression design space. It reports a 4x FLOPs reduction with 2.7% better accuracy than hand-crafted model compression for VGG-16 on ImageNet, and measured speedups of 1.53x on a GPU (Titan Xp) and 1.95x on an Android phone (Google Pixel 1), with negligible loss of accuracy (a toy sketch of the search loop follows below). He, Yihui, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, and Song Han. “AMC: AutoML for model compression and acceleration on mobile devices.” In Proceedings of the European Conference on Computer Vision (ECCV), pp. 784-800. 2018.
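
The AMC entry above describes a reinforcement-learning search over per-layer compression ratios. The following is only a minimal, self-contained sketch of that idea: the per-layer FLOPs, the sensitivity numbers, the accuracy proxy, and the random sampling (standing in for AMC's actual DDPG agent and fine-tuned evaluation) are all assumptions made for illustration, not the paper's implementation.

```python
# Toy sketch of AMC-style compression search (hypothetical stand-in, not the
# authors' code): an agent proposes per-layer "keep" ratios, a proxy evaluator
# scores the compressed model, and the best policy found is kept.
import numpy as np

rng = np.random.default_rng(0)
layer_flops = np.array([120e6, 90e6, 60e6, 30e6])   # assumed per-layer FLOPs
sensitivity = np.array([0.9, 0.6, 0.4, 0.2])        # assumed accuracy sensitivity

def accuracy_proxy(keep_ratios):
    # Pretend accuracy drops in proportion to how much of each sensitive layer
    # is pruned; a real system would fine-tune and evaluate the actual model.
    return 0.75 - float(np.sum(sensitivity * (1.0 - keep_ratios))) * 0.05

def reward(keep_ratios, flops_budget=0.5):
    # Hard resource constraint, loosely mirroring AMC's budget-constrained search.
    flops_ratio = float(np.sum(layer_flops * keep_ratios) / np.sum(layer_flops))
    if flops_ratio > flops_budget:
        return -1.0
    return accuracy_proxy(keep_ratios)

best, best_r = None, -np.inf
for episode in range(200):
    # Random sampling here is a stand-in for the RL agent's proposed actions.
    keep = rng.uniform(0.2, 1.0, size=layer_flops.size)
    r = reward(keep)
    if r > best_r:
        best, best_r = keep, r

print("best per-layer keep ratios:", np.round(best, 2), "reward:", round(best_r, 3))
```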


Created May 24, 2020 // Last Updated Aug 31, 2021
