
Memory-aware and context-aware multi-DNN inference on the edge

Robert Birke
2022-01-01

Abstract

Deep neural networks (DNNs) are becoming core components of many applications running on edge devices, especially for real-time image-based analysis. Increasingly, multi-faceted knowledge is extracted by executing multiple DNN inference models, e.g., identifying objects, faces, and genders from images. It is of paramount importance to guarantee low response times for such multi-DNN executions, as they affect not only users' quality of experience but also safety. The challenge, largely unaddressed by the state of the art, is how to overcome the memory limitation of edge devices without altering the DNN models. In this paper, we design and implement MASA, a responsive memory-aware multi-DNN execution and scheduling framework which requires no modification of DNN models. The aim of MASA is to consistently ensure low average response times when deterministically and stochastically executing multiple DNN-based image analyses. The enabling features of MASA are (i) modeling inter- and intra-network dependencies, (ii) leveraging the complementary memory usage of each layer, and (iii) exploiting the context dependency of DNNs. We verify the correctness and scheduling optimality via mixed integer programming. We extensively evaluate two versions of MASA, context-oblivious and context-aware, on three configurations of Raspberry Pi and a large set of popular DNN models triggered by different image generation patterns. Our evaluation results show that MASA can achieve average response times lower by up to 90% on devices with small memory, i.e., 512 MB to 1 GB, compared to state-of-the-art multi-DNN scheduling solutions. (c) 2022 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license.
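The core idea sketched in the abstract, scheduling several DNNs at layer granularity so that their memory footprints stay within an edge device's budget, can be illustrated with a minimal sketch. This is not the MASA algorithm: the greedy smallest-footprint policy, the class names, and the per-layer memory and runtime figures below are illustrative assumptions only, and the actual framework additionally handles parallel execution, inter-network dependencies, stochastic arrivals, and context awareness.

# Minimal sketch (not the MASA implementation): greedy memory-aware
# interleaving of layer execution across multiple DNNs.
from dataclasses import dataclass
from typing import List

@dataclass
class Layer:
    name: str
    mem_mb: float       # assumed peak working memory of this layer (from offline profiling)
    runtime_ms: float   # assumed execution time of this layer

@dataclass
class Network:
    name: str
    layers: List[Layer]
    next_idx: int = 0   # intra-network dependency: layers must run in order

    def done(self) -> bool:
        return self.next_idx >= len(self.layers)

def schedule(networks: List[Network], mem_budget_mb: float) -> List[str]:
    """Repeatedly pick, among the next runnable layers of all networks, one that
    fits the memory budget, preferring the smallest footprint so that memory-heavy
    layers of different networks are not forced together."""
    order = []
    while any(not n.done() for n in networks):
        candidates = [(n, n.layers[n.next_idx]) for n in networks if not n.done()]
        feasible = [(n, l) for n, l in candidates if l.mem_mb <= mem_budget_mb]
        if not feasible:
            raise MemoryError("no runnable layer fits within the memory budget")
        net, layer = min(feasible, key=lambda nl: nl[1].mem_mb)
        order.append(f"{net.name}:{layer.name}")
        net.next_idx += 1
    return order

if __name__ == "__main__":
    # Hypothetical two-network workload, e.g., face and object analysis pipelines.
    nets = [
        Network("face", [Layer("conv1", 120, 30), Layer("fc", 40, 5)]),
        Network("object", [Layer("conv1", 300, 80), Layer("conv2", 150, 40)]),
    ]
    print(schedule(nets, mem_budget_mb=512))

The sketch only captures a sequential, per-layer memory check; the paper's contribution lies in exploiting complementary memory profiles across concurrently executing layers and in verifying the resulting schedules against a mixed-integer-programming optimum.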
2022, vol. 83, pp. 101594–101609
Multiple DNNs inference; Average response time; Edge devices; Memory-aware scheduling
Bart Cox; Robert Birke; Lydia Chen

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1887756
Citations: Scopus 6 · Web of Science 5