Abstract
Convolutional Neural Networks (CNNs) have been widely deployed, but traditional cloud-datacenter-based applications suffer from the network bandwidth and latency demands that arise in Industrial-Internet-of-Things (IIoT) settings. Migrating CNN inference to edge devices is therefore critical for both efficiency and security. However, deploying complex CNNs on resource-constrained IIoT edge devices is challenging due to their large number of parameters and intensive floating-point computations. In this paper, we propose ABM-SpConv-SIMD, an on-device inference optimization framework that accelerates network inference by fully utilizing low-cost, commonly available CPU resources. ABM-SpConv-SIMD first applies a model optimizer with pruning and quantization, which produces sparse convolutional models. Then, an Accumulation-Before-Multiplication (ABM) mechanism is proposed to reduce the number of multiplication operations. Additionally, SIMD instructions, which are commonly available on cost-effective edge devices, are employed to improve the performance of convolutions. We have implemented ABM-SpConv-SIMD on top of the Arm Compute Library and evaluated it on HiKey970 and Raspberry Pi devices with two representative models, AlexNet and ResNet50. The results show that ABM-SpConv-SIMD significantly improves performance, achieving average speedups of 1.96x and 1.73x, respectively, over the baseline implementation with negligible loss of accuracy.
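To make the Accumulation-Before-Multiplication idea concrete, the following C++ sketch illustrates it for a single dot product, under the assumption suggested by the abstract: after pruning and quantization, many surviving weights share the same discrete value, so activations can be accumulated per distinct weight value first and each value multiplied only once. This is a minimal, hypothetical illustration; the names `WeightGroup` and `abm_dot` are invented here and are not taken from the paper's Arm Compute Library implementation.

```cpp
// Minimal sketch of Accumulation-Before-Multiplication (ABM) for one
// dot product. In a pruned, quantized layer, zero weights are skipped
// entirely and non-zero weights are grouped by shared quantized value,
// so the number of multiplications drops from one per weight to one per
// distinct weight value.
#include <cstdint>
#include <iostream>
#include <vector>

// One group per distinct quantized weight value: the value itself and
// the positions of the activations it multiplies.
struct WeightGroup {
    int8_t value;                 // shared quantized weight value
    std::vector<size_t> indices;  // activation positions using this value
};

// Accumulate activations within each group first, then multiply once.
int32_t abm_dot(const std::vector<int32_t>& activations,
                const std::vector<WeightGroup>& groups) {
    int32_t result = 0;
    for (const auto& g : groups) {
        int32_t acc = 0;                       // accumulate first...
        for (size_t i : g.indices) acc += activations[i];
        result += acc * static_cast<int32_t>(g.value);  // ...multiply once
    }
    return result;
}

int main() {
    // Toy example: quantized weights {3, 0, 3, -2, 0, -2}. The zeros are
    // pruned away; 3 and -2 each form one group, so only 2 multiplications
    // are needed instead of 4.
    std::vector<int32_t> act = {1, 4, 2, 5, 6, 3};
    std::vector<WeightGroup> groups = {
        {3, {0, 2}},   //  3 * (act[0] + act[2])
        {-2, {3, 5}},  // -2 * (act[3] + act[5])
    };
    std::cout << abm_dot(act, groups) << "\n";  // 3*(1+2) + (-2)*(5+3) = -7
    return 0;
}
```

In a real kernel the per-group accumulations are the part that maps naturally onto SIMD lanes, which is consistent with the abstract's pairing of ABM with SIMD instructions.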
| Original language | English |
|---|---|
| Journal | IEEE Transactions on Network Science and Engineering |
| DOIs | |
| Publication status | Published - 25 Feb 2022 |
Bibliographical note
Publisher Copyright: IEEE
Keywords
- Computational modeling
- Convolutional Neural Networks
- Edge Devices
- Filtering algorithms
- Industrial Internet of Things
- Industrial Internet-of-Things Applications
- Matrix converters
- Performance evaluation
- Quantization (signal)
- Single Instruction Multiple Data
- Sparse Convolution