A lung sound classification model with a spatial and channel reconstruction convolutional module

doi:10.12122/j.issn.1673-4254.2024.09.12

Abstract

Objective To construct a model incorporating a spatial and channel reconstruction convolutional module for accurate identification and classification of lung sounds. Methods We propose a convolutional network architecture that incorporates the spatial and channel reconstruction convolution (SCConv) module. A lung sound feature extraction method combining the dual tunable Q-factor wavelet transform (DTQWT) with the triple Wigner-Ville transform (WVT) was used, and the model's ability to capture the key features of lung sounds was improved by adaptively focusing on the important channel and spatial features. The model's performance in classifying lung sounds as normal, crackles, wheezes, or crackles with wheezes was tested on the ICBHI2017 dataset. Results and Conclusion The proposed method achieved an accuracy of 85.68%, a sensitivity of 93.55%, a specificity of 86.79% and an F1 score of 90.51%, demonstrating good classification performance on the ICBHI2017 lung sound database, especially in distinguishing normal from abnormal lung sounds.
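
As background on the Q-factor named in the abstract, the sketch below uses the standard tunable Q-factor wavelet transform relations from Selesnick [27], in which the high-pass scaling is beta = 2 / (Q + 1) and the low-pass scaling is alpha = 1 - beta / r. It is only an illustration under stated assumptions: the redundancy r = 3 and the helper name `tqwt_scaling` are hypothetical, and the abstract does not say how the dual variant (DTQWT) sets these parameters.

```python
def tqwt_scaling(q, r=3.0):
    """Return (alpha, beta) scaling parameters for a given Q-factor.

    Uses the standard TQWT relations: beta = 2 / (Q + 1), alpha = 1 - beta / r.
    The default redundancy r = 3.0 is an assumption, not a value from the paper.
    """
    if q < 1:
        raise ValueError("Q-factor must be at least 1")
    if r <= 1:
        raise ValueError("redundancy must be greater than 1")
    beta = 2.0 / (q + 1.0)
    alpha = 1.0 - beta / r
    return alpha, beta


def q_from_beta(beta):
    """Invert the relation: Q = (2 - beta) / beta."""
    return (2.0 - beta) / beta


for q in range(2, 7):
    alpha, beta = tqwt_scaling(q)
    print(f"Q={q}: alpha={alpha:.4f}, beta={beta:.4f}")
```

A higher Q gives a smaller beta, that is, narrower and more oscillatory wavelets, which is the trade-off a Q-value sweep explores.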

Key words: lung sound classification, convolutional neural network, spatial and channel reconstruction convolution, dual tunable Q-factor wavelet transform, triple Wigner-Ville transform

Na YE, Chenwen WU, Jialin JIANG. A lung sound classification model with a spatial and channel reconstruction convolutional module[J]. Journal of Southern Medical University, 2024, 44(9): 1720-1728.

References

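1	Chen SF, Huang MY, Peng XR, et al. Lung sounds can serve as an indicator of the severity of chronic obstructive pulmonary disease at first diagnosis[J]. Journal of Southern Medical University, 2020, 40(2): 177-82. (in Chinese)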
2	Xie SN, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA. IEEE, 2017: 5987-95.
3	Zagoruyko S, Komodakis N. Wide Residual Networks[J]. BMVC 2016, 2016: 87.1-87.12.
4	Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-ResNet and the impact of residual connections on learning[J]. Proc AAAI Conf Artif Intell, 2017, 31(1): 4480-8.
5	Jaderberg M, Simonyan K, Zisserman A, et al. Spatial transformer networks[EB/OL]. 2015: arXiv: 1506.02025.
6	Woo S, Park J, Lee JY, et al. CBAM: convolutional block attention module[EB/OL]. 2018: arXiv: 1807.06521.
7	Mnih V, Heess N, Graves A, et al. Recurrent models of visual attention[EB/OL]. 2014: arXiv: 1406.6247.
8	Ba J, Mnih V, Kavukcuoglu K. Multiple object recognition with visual attention[J]. arXiv E Prints, 2014: arXiv: .
9	Gregor K, Danihelka I, Graves A, et al. DRAW: a recurrent neural network for image generation[EB/OL]. 2015: arXiv: 1502.04623.
10	Xu L, Cheng JH, Liu J, et al. ARSC-net: adventitious respiratory sound classification network using parallel paths with channel-spatial attention[C]//2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Houston, TX, USA. IEEE, 2021: 1125-30.
11	Li JF, Wen Y, He LH. SCConv: spatial and channel reconstruction convolution for feature redundancy[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, BC, Canada. IEEE, 2023: 6153-62.
12	Gu YC, Guo T, Li C, et al. Lung sound recognition based on LBP and Mixup data augmentation[J]. Computer & Digital Engineering, 2023, 51(1): 268-72. (in Chinese)
13	Chen H, Yuan XC, Pei ZY, et al. Triple-classification of respiratory sounds using optimized S-transform and deep residual networks[J]. IEEE Access, 2019, 7: 32845-52.
14	Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv E Prints, 2014: arXiv: .
15	Xu K, Ba J, Kiros R, et al. Show, attend and tell: neural image caption generation with visual attention[EB/OL]. 2015: arXiv: 1502.03044.
16	Hou QB, Lu CZ, Cheng MM, et al. Conv2Former: a simple transformer-style ConvNet for visual recognition[J]. IEEE Trans Pattern Anal Mach Intell, 2024, doi: 10.1109/TPAMI.2024.3401450. Online ahead of print.
17	Gulzar H, Li JY, Manzoor A, et al. Transfer Learning based Diagnosis and Analysis of Lung Sound Aberrations[J]. Int J Bioinform Biosci, 2023, 13(1): 29-40.
18	Rocha BM, Filos D, Mendes L, et al. A respiratory sound database for the development of automated classification[C]//International Conference on Biomedical and Health Informatics. Singapore: Springer, 2018: 33-37.
19	Bohadana A, Izbicki G, Kraman SS. Fundamentals of lung auscultation[J]. N Engl J Med, 2014, 370(8): 744-51.
20	Perez L, Wang J. The Effectiveness of Data Augmentation in Image Classification using Deep Learning[J]. arXiv E Prints, 2017: arXiv: 1712.04621.
21	Ma Y, Xu XZ, Yu Q, et al. LungBRN: a smart digital stethoscope for detecting respiratory disease using bi-ResNet deep learning algorithm[C]//2019 IEEE Biomedical Circuits and Systems Conference (BioCAS). Nara, Japan. IEEE, 2019: 1-4.
22	Wu CW, Ye N, Jiang JL. Classification and recognition of lung sounds based on improved Bi-ResNet model[J]. IEEE Access, 2024, 12: 73079-94.
23	Nguyen T, Pernkopf F. Lung sound classification using snapshot ensemble of convolutional neural networks[C]//2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). Montreal, QC, Canada. IEEE, 2020: 760-3.
24	Song WJ, Han JQ, Song HW. Contrastive embedding learning method for respiratory sound classification[C]//ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, ON, Canada. IEEE, 2021: 1275-9.
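25	Bao SS, Che B, Deng LH. Lung sound signal recognition based on dual-source-domain transfer learning[J]. Computer Engineering, 2023, 49(9): 295-302, 312. (in Chinese)
26	Tian SY. Research on lung sound signal classification based on CNN-Transformer[D]. Yinchuan: North Minzu University, 2023. (in Chinese)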
27	Selesnick IW. Wavelet transform with tunable Q-factor[J]. IEEE Trans Signal Process, 2011, 59(8): 3560-75.

Type of respiratory sound	Number of cycles before data augmentation	Number of cycles after data augmentation
Normal	3642	7265
Crackle	1864	6164
Wheeze	886	4025
Crackle and Wheeze	506	3305
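
To make the effect of the added cycles on class balance easier to read, the following minimal sketch recomputes totals, class shares and per-class growth from the two count columns of the table above. The counts are copied from the table; the variable and function names are illustrative only and are not taken from the paper.

```python
# Cycle counts copied from the table above: (before, after) for each class.
counts = {
    "Normal": (3642, 7265),
    "Crackle": (1864, 6164),
    "Wheeze": (886, 4025),
    "Crackle and Wheeze": (506, 3305),
}

total_before = sum(before for before, _ in counts.values())
total_after = sum(after for _, after in counts.values())


def class_shares(index):
    """Fraction of all cycles in each class (index 0 = before, 1 = after)."""
    total = sum(pair[index] for pair in counts.values())
    return {name: pair[index] / total for name, pair in counts.items()}


shares_before = class_shares(0)
shares_after = class_shares(1)

# Ratio of the largest class to the smallest class, before and after.
imbalance_before = max(b for b, _ in counts.values()) / min(b for b, _ in counts.values())
imbalance_after = max(a for _, a in counts.values()) / min(a for _, a in counts.values())

for name, (before, after) in counts.items():
    print(f"{name:<20} {before:>5} -> {after:>5}  "
          f"share {shares_before[name]:6.1%} -> {shares_after[name]:6.1%}  "
          f"growth x{after / before:.2f}")
print(f"Total cycles: {total_before} -> {total_after}")
print(f"Largest/smallest class ratio: {imbalance_before:.1f} -> {imbalance_after:.1f}")
```

On these counts the total grows from 6898 to 20759 cycles, and the ratio of the largest class to the smallest falls from about 7.2 to about 2.2.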

Q value	Accuracy (%)	Sensitivity (%)	Specificity (%)	AS (%)	F1 score (%)
Q=2	67.90	86.50	78.30	82.40	64.50
Q=3	55.67	83.75	63.08	73.42	79.29
Q=4	66.47	93.46	76.19	84.83	88.87
Q=5	85.68	93.55	86.79	90.17	90.51
Q=6	79.17	90.19	86.39	88.29	89.34
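
The AS column is not defined on this page. The values in the table above are consistent with AS being the mean of sensitivity and specificity, as in the ICBHI challenge score, and the short check below tests that reading. Treat the definition as an assumption; the function name `average_score` is illustrative.

```python
# (sensitivity, specificity, reported AS) per Q value, copied from the table above.
rows = {
    2: (86.50, 78.30, 82.40),
    3: (83.75, 63.08, 73.42),
    4: (93.46, 76.19, 84.83),
    5: (93.55, 86.79, 90.17),
    6: (90.19, 86.39, 88.29),
}


def average_score(sensitivity, specificity):
    """Assumed definition: AS = (sensitivity + specificity) / 2."""
    return (sensitivity + specificity) / 2.0


max_gap = 0.0
for q, (se, sp, reported) in rows.items():
    computed = average_score(se, sp)
    max_gap = max(max_gap, abs(computed - reported))
    print(f"Q={q}: computed AS = {computed:.3f}, reported AS = {reported:.2f}")

# Q value with the highest reported AS.
best_q = max(rows, key=lambda q: rows[q][2])
print(f"Largest gap between computed and reported AS: {max_gap:.3f}")
print(f"Best Q by reported AS: {best_q}")
```

Under this assumed definition every reported AS agrees with the recomputed value to within rounding, and Q=5 gives the highest AS.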

Feature extraction	Accuracy (%)	Sensitivity (%)	Specificity (%)	AS (%)	F1 score (%)
Wavelet + STFT	75.40	59.50	88.70	82.05	71.33
Wavelet + STFT^[21]	52.79	31.12	69.20	50.16	50.16
Wavelet + STFT^[22]	77.81	61.99	90.10	76.05	71.05
LPCC + MFCC (CNN-Transformer)^[26]	95.70	94.62	N/A	N/A	93.88
DTQWT + WVT, STFT (SCConv-Net)	85.68	93.55	86.79	90.17	90.51