Abstract
Convolutional Neural Networks (CNNs) have been widely deployed, but traditional cloud-datacenter-based applications suffer from the network bandwidth and latency demands that arise in Industrial-Internet-of-Things (IIoT) settings. Migrating CNN inference to edge devices is therefore critical for both efficiency and security. However, deploying complex CNNs on resource-constrained IIoT edge devices is challenging due to their large number of parameters and intensive floating-point computations. In this paper, we propose ABM-SpConv-SIMD, an on-device inference optimization framework that accelerates network inference by fully utilizing low-cost, commonly available CPU resources. ABM-SpConv-SIMD first applies a model optimizer with pruning and quantization, which produces sparse convolutional models. Then, an Accumulation-Before-Multiplication (ABM) mechanism is proposed to reduce the number of multiplication operations. Additionally, SIMD instructions, which are commonly available on cost-effective edge devices, are employed to improve the performance of convolutions. We have implemented ABM-SpConv-SIMD on top of the Arm Compute Library and evaluated it on HiKey970 and Raspberry Pi devices with two representative models, AlexNet and ResNet50. The results show that ABM-SpConv-SIMD significantly improves performance, achieving average speedups of 1.96x and 1.73x, respectively, over the baseline implementation with negligible loss of accuracy.
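To make the Accumulation-Before-Multiplication idea concrete, the following C++ sketch illustrates it for a single dot product, under the assumption suggested by the abstract: after pruning and quantization, many surviving weights share the same discrete value, so activations can be accumulated per distinct weight value first and each value multiplied only once. This is a minimal, hypothetical illustration; the names `WeightGroup` and `abm_dot` are invented here and are not taken from the paper's Arm Compute Library implementation.

```cpp
// Minimal sketch of Accumulation-Before-Multiplication (ABM) for one
// dot product. In a pruned, quantized layer, zero weights are skipped
// entirely and non-zero weights are grouped by shared quantized value,
// so the number of multiplications drops from one per weight to one per
// distinct weight value.
#include <cstdint>
#include <iostream>
#include <vector>

// One group per distinct quantized weight value: the value itself and
// the positions of the activations it multiplies.
struct WeightGroup {
    int8_t value;                 // shared quantized weight value
    std::vector<size_t> indices;  // activation positions using this value
};

// Accumulate activations within each group first, then multiply once.
int32_t abm_dot(const std::vector<int32_t>& activations,
                const std::vector<WeightGroup>& groups) {
    int32_t result = 0;
    for (const auto& g : groups) {
        int32_t acc = 0;                       // accumulate first...
        for (size_t i : g.indices) acc += activations[i];
        result += acc * static_cast<int32_t>(g.value);  // ...multiply once
    }
    return result;
}

int main() {
    // Toy example: quantized weights {3, 0, 3, -2, 0, -2}. The zeros are
    // pruned away; 3 and -2 each form one group, so only 2 multiplications
    // are needed instead of 4.
    std::vector<int32_t> act = {1, 4, 2, 5, 6, 3};
    std::vector<WeightGroup> groups = {
        {3, {0, 2}},   //  3 * (act[0] + act[2])
        {-2, {3, 5}},  // -2 * (act[3] + act[5])
    };
    std::cout << abm_dot(act, groups) << "\n";  // 3*(1+2) + (-2)*(5+3) = -7
    return 0;
}
```

In a real kernel the per-group accumulations are the part that maps naturally onto SIMD lanes, which is consistent with the abstract's pairing of ABM with SIMD instructions.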
| Original language | English |
|---|---|
| Journal | IEEE Transactions on Network Science and Engineering |
| DOIs | |
| Publication status | Published - 25 Feb 2022 |
Bibliographical note
Publisher Copyright: IEEE
Keywords
- Computational modeling
- Convolutional Neural Networks
- Edge Devices
- Filtering algorithms
- Industrial Internet of Things
- Industrial Internet-of-Things Applications
- Matrix converters
- Performance evaluation
- Quantization (signal)
- Single Instruction Multiple Data
- Sparse Convolution