TY - JOUR
T1 - Detecting Android Malware with Convolutional Neural Networks and Hilbert Space-Filling Curves
AU - Mbungang, Benedict Ngaibe
AU - Wacka, Joan Beri Ali
AU - Tchakounte, Franklin
AU - Polatidis, Nikolaos
AU - Nlong II, Jean Michel
AU - Tieudjo, Daniel
PY - 2024/8/22
Y1 - 2024/8/22
AB - Computer vision techniques have advanced greatly in recent years through deep learning, achieving unprecedented performance. This has motivated applying deep learning to malware detection through image-based approaches to circumvent extensive feature engineering for diverse threats. However, existing work converting Android binaries to rectangular images neglects the intrinsic byte sequence structure, introducing spurious spatial relationships that weaken detection accuracy. To address this, space-filling curves have mapped binaries to images while preserving ordering. This paper proposes a novel method using Hilbert space-filling curves to visualize and classify Android apps. Bytecode is extracted from Dalvik Executable (DEX) files and transformed to grayscale images via Hilbert coding for model training. Additionally, a novel and balanced image dataset is proposed consisting of Hilbert transformations for 4995 benign and 4995 malicious Android apps randomly sampled from the AndroZoo repository. Experiments using this dataset evaluated pre-trained InceptionV3, VGG16, ResNet50 and EfficientNetB0 via transfer learning. A custom Convolutional Neural Network (CNN) was also trained from scratch. InceptionV3 achieved the highest performance at 97.99% accuracy, 98.50% precision, 97.50% recall and 97.99% F1-score. Comparative assessment with previous image-based malware detection research indicates our approach outperforms state-of-the-art approaches. By leveraging Hilbert space-filling curves to map binaries to images while preserving sequential relationships, detection accuracy is improved over methods introducing extraneous spatial representations.
U2 - 10.1007/s42979-024-03123-6
DO - 10.1007/s42979-024-03123-6
M3 - Article
SN - 2661-8907
VL - 5
JO - SN Computer Science
JF - SN Computer Science
M1 - 810
ER -