Journal of Southern Medical University, 2024, Vol. 44, Issue (9): 1720-1728. doi: 10.12122/j.issn.1673-4254.2024.09.12
Na YE, Chenwen WU, Jialin JIANG
Received: 2024-04-20
Online: 2024-09-20
Published: 2024-09-30
Contact: Chenwen WU
E-mail: 731443570@qq.com; wuchenwen@mail.lzjtu.cn
About the author: Na YE, master's student, E-mail: 731443570@qq.com
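Abstract:
Objective To investigate the accurate recognition and classification of lung sound data. Methods We propose a convolutional network architecture that incorporates a spatial and channel reconstruction convolution (SCConv) module, together with a lung sound feature extraction method that combines the dual tunable Q-factor wavelet transform (DTQWT) with the triple Wigner-Ville transform (WVT). By adaptively focusing on important channel and spatial features, the model captures the key features of lung sounds more effectively. Using the ICBHI 2017 dataset, respiratory cycles were classified as normal, wheeze, crackle, or combined wheeze and crackle. Results The method achieved an accuracy of 85.68%, a sensitivity of 93.55%, a specificity of 86.79%, and an F1 score of 90.51%. Conclusion The proposed method performs well on the ICBHI 2017 lung sound database, particularly in distinguishing normal from abnormal lung sounds.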
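The four metrics named in the abstract can be illustrated from a 4-class confusion matrix. The sketch below is a minimal illustration, not the authors' evaluation code: it assumes that accuracy counts exact 4-class matches while sensitivity, specificity, and F1 are computed after collapsing the three adventitious classes into one "abnormal" class. This page does not state which definitions the paper uses; the function name, class order, and `demo` counts are hypothetical.

```python
# Assumed class order for this sketch: index 0 = normal, 1 = crackle,
# 2 = wheeze, 3 = crackle and wheeze.

def lung_sound_metrics(confusion):
    """confusion[i][j] = number of cycles of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))

    normal_total = sum(confusion[0])
    abnormal_total = total - normal_total

    true_neg = confusion[0][0]                 # normal predicted as normal
    false_pos = normal_total - true_neg        # normal predicted as any abnormal class
    # Abnormal cycles predicted as any abnormal class (assumed binary convention).
    true_pos = sum(sum(row[1:]) for row in confusion[1:])
    false_neg = abnormal_total - true_pos      # abnormal predicted as normal

    return {
        "accuracy": correct / total,           # exact 4-class agreement
        "sensitivity": true_pos / abnormal_total,
        "specificity": true_neg / normal_total,
        "f1": 2 * true_pos / (2 * true_pos + false_pos + false_neg),
    }


# Made-up counts, for illustration only (not data from the paper).
demo = [
    [90, 4, 3, 3],
    [5, 40, 3, 2],
    [4, 2, 30, 4],
    [3, 2, 5, 20],
]
metrics = lung_sound_metrics(demo)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Under this convention accuracy can fall below both sensitivity and specificity, because confusing one abnormal class with another lowers accuracy without lowering sensitivity.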
Na YE, Chenwen WU, Jialin JIANG. A lung sound classification model with a spatial and channel reconstruction convolutional module[J]. Journal of Southern Medical University, 2024, 44(9): 1720-1728.
Respiratory sound type | Cycles before data augmentation | Cycles after data augmentation |
---|---|---|
Normal | 3642 | 7265 |
Crackle | 1864 | 6164 |
Wheeze | 886 | 4025 |
Crackle and Wheeze | 506 | 3305 |
Tab.1 Number of respiratory cycles before and after data augmentation
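The effect on class balance can be read off Tab.1 with a little arithmetic. The sketch below only re-uses the cycle counts listed in the table; the helper names are hypothetical, and "imbalance ratio" (largest class divided by smallest class) is a summary chosen for this illustration rather than a quantity reported in the paper.

```python
# Cycle counts copied from Tab.1: (before, after).
counts = {
    "Normal": (3642, 7265),
    "Crackle": (1864, 6164),
    "Wheeze": (886, 4025),
    "Crackle and Wheeze": (506, 3305),
}


def class_shares(column):
    """Share (%) of each class in column 0 (before) or column 1 (after)."""
    total = sum(pair[column] for pair in counts.values())
    return {name: 100.0 * pair[column] / total for name, pair in counts.items()}


def imbalance_ratio(column):
    """Largest class count divided by smallest class count."""
    values = [pair[column] for pair in counts.values()]
    return max(values) / min(values)


before, after = class_shares(0), class_shares(1)
for name in counts:
    print(f"{name:20s} {before[name]:5.1f}% -> {after[name]:5.1f}%")
print(f"imbalance ratio: {imbalance_ratio(0):.2f} -> {imbalance_ratio(1):.2f}")
```

The minority classes are expanded far more than the Normal class, so the class distribution after processing is considerably flatter than before.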
Q value | Accuracy | Sensitivity | Specificity | AS | F1 Score |
---|---|---|---|---|---|
Q=2 | 67.90 | 86.50 | 78.30 | 82.40 | 64.50 |
Q=3 | 55.67 | 83.75 | 63.08 | 73.42 | 79.29 |
Q=4 | 66.47 | 93.46 | 76.19 | 84.83 | 88.87 |
Q=5 | 85.68 | 93.55 | 86.79 | 90.17 | 90.51 |
Q=6 | 79.17 | 90.19 | 86.39 | 88.29 | 89.34 |
Tab.2 Experimental results for different values of the high Q-factor (%)
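The AS column is not defined on this page. In Tab.2 its values match the arithmetic mean of sensitivity and specificity, which is the average score used in the ICBHI 2017 challenge. The sketch below checks that reading against the Tab.2 rows; treating AS as this mean is an inference from the numbers, and the function names are hypothetical.

```python
# (sensitivity, specificity, reported AS) per Q value, copied from Tab.2, in percent.
TAB2 = {
    2: (86.50, 78.30, 82.40),
    3: (83.75, 63.08, 73.42),
    4: (93.46, 76.19, 84.83),
    5: (93.55, 86.79, 90.17),
    6: (90.19, 86.39, 88.29),
}


def average_score(sensitivity, specificity):
    """Arithmetic mean of sensitivity and specificity (both in percent)."""
    return (sensitivity + specificity) / 2.0


def max_as_deviation(table):
    """Largest absolute gap between the recomputed and the reported AS."""
    return max(
        abs(average_score(se, sp) - reported)
        for se, sp, reported in table.values()
    )


# Q value whose reported AS is highest.
best_q = max(TAB2, key=lambda q: TAB2[q][2])

print("largest gap between recomputed and reported AS:", max_as_deviation(TAB2))
print("Q with the highest reported AS:", best_q)
```

Every recomputed value agrees with the reported AS to within rounding at two decimal places.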
Feature extraction | Accuracy | Sensitivity | Specificity | AS | F1 Score |
---|---|---|---|---|---|
Wavelet + STFT | 75.40 | 59.50 | 88.70 | 82.05 | 71.33 |
Wavelet + STFT[ | 52.79 | 31.12 | 69.20 | 50.16 | 50.16 |
Wavelet + STFT[ | 77.81 | 61.99 | 90.10 | 76.05 | 71.05 |
LPCC + MFCC (CNN-Transformer)[ | 95.70 | 94.62 | N/A | N/A | 93.88 |
DTCWT + WVT, STFT (SCConv-Net) | 85.68 | 93.55 | 86.79 | 90.17 | 90.51 |
Tab.3 Comparison of results for different feature extraction methods (%)
Model | Accuracy | Sensitivity | Specificity | AS | F1 Score |
---|---|---|---|---|---|
ARSC-Net[ | N/A | 46.38 | 67.13 | 56.76 | 56.76 |
CNN(Convolutional Neural Network)[ | 72.38 | 35.90 | 87.80 | 61.85 | 61.85 |
CNN-RSM[ | 76.01 | 43.07 | 89.70 | 66.39 | 66.38 |
VGG16[ | 95.00 | 88.00 | 86.00 | 87.00 | 81.00 |
CNN Snapshot Ensembles+DataAugment+LogMel[ | N/A | 69.40 | 87.30 | 78.35 | 78.35 |
Contrastive Embedding Learning+LogMel+DataAugment[ | N/A | 70.93 | 85.44 | 78.19 | 78.18 |
LSR-Net[ | 79.18 | 72.04 | 88.75 | 80.40 | 80.39 |
CNN-Transformer(LPCC+MFCC)[ | 95.70 | 94.62 | N/A | N/A | 93.88 |
SCConv-Net(DTCWT+ WVT) | 85.68 | 93.55 | 86.79 | 90.17 | 90.51 |
Tab.4 Comparison of the performance of different classification models (%)
Reconstruction convolution module type | Accuracy | Sensitivity | Specificity | AS | F1 Score |
---|---|---|---|---|---|
no-SCConv | 66.80 | 89.40 | 71.70 | 80.55 | 67.50 |
SRU | 72.50 | 87.60 | 82.90 | 85.25 | 84.30 |
CRU | 74.40 | 83.70 | 74.30 | 79.00 | 80.20 |
CRU-SRU | 82.57 | 87.48 | 76.79 | 82.14 | 84.90 |
SCConv(SRU-CRU) | 85.68 | 93.55 | 86.79 | 90.17 | 90.51 |
Tab.5 Experimental results for the reconstruction convolution modules (%)
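1 | Chen SF, Huang MY, Peng XR, et al. Lung sounds can be used as an indicator for judging the severity of chronic obstructive pulmonary disease at first diagnosis[J]. Journal of Southern Medical University, 2020, 40(2): 177-82. (in Chinese) |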
2 | Xie SN, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA. IEEE, 2017: 5987-95. |
3 | Zagoruyko S, Komodakis N. Wide Residual Networks[J]. BMVC 2016, 2016: 87.1-87.12. |
4 | Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-ResNet and the impact of residual connections on learning[J]. Proc AAAI Conf Artif Intell, 2017, 31(1): 4480-8. |
5 | Jaderberg M, Simonyan K, Zisserman A, et al. Spatial transformer networks[EB/OL]. 2015: arXiv: 1506.02025. |
6 | Woo S, Park J, Lee JY, et al. CBAM: convolutional block attention module[EB/OL]. 2018: arXiv: 1807.06521. |
7 | Mnih V, Heess N, Graves A, et al. Recurrent models of visual attention[EB/OL]. 2014: arXiv: 1406.6247. |
8 | Ba J, Mnih V, Kavukcuoglu K. Multiple object recognition with visual attention[J]. arXiv E Prints, 2014: arXiv: . |
9 | Gregor K, Danihelka I, Graves A, et al. DRAW: a recurrent neural network for image generation[EB/OL]. 2015: arXiv: 1502.04623. |
10 | Xu L, Cheng JH, Liu J, et al. ARSC-net: adventitious respiratory sound classification network using parallel paths with channel-spatial attention[C]//2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Houston, TX, USA. IEEE, 2021: 1125-30. |
11 | Li JF, Wen Y, He LH. SCConv: spatial and channel reconstruction convolution for feature redundancy[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, BC, Canada. IEEE, 2023: 6153-62. |
12 | Gu YC, Guo T, Li C, et al. Lung sound recognition based on LBP and Mixup data augmentation[J]. Computer & Digital Engineering, 2023, 51(1): 268-72. (in Chinese) |
13 | Chen H, Yuan XC, Pei ZY, et al. Triple-classification of respiratory sounds using optimized S-transform and deep residual networks[J]. IEEE Access, 2019, 7: 32845-52. |
14 | Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv E Prints, 2014: arXiv: . |
15 | Xu K, Ba J, Kiros R, et al. Show, attend and tell: neural image caption generation with visual attention[EB/OL]. 2015: arXiv: 1502.03044. |
16 | Hou QB, Lu CZ, Cheng MM, et al. Conv2Former: a simple transformer-style ConvNet for visual recognition[J]. IEEE Trans Pattern Anal Mach Intell, 2024, doi: 10.1109/TPAMI.2024.3401450 . Online ahead of print. |
17 | Gulzar H, Li JY, Manzoor A, et al. Transfer Learning based Diagnosis and Analysis of Lung Sound Aberrations[J]. Int J Bioinform Biosci, 2023, 13(1): 29-40. |
18 | Rocha BM, Filos D, Mendes L, et al. A respiratory sound database for the development of automated classification[C]//International Conference on Biomedical and Health Informatics. Singapore: Springer, 2018: 33-37. |
19 | Bohadana A, Izbicki G, Kraman SS. Fundamentals of lung auscultation[J]. N Engl J Med, 2014, 370(8): 744-51. |
20 | Perez L, Wang J. The Effectiveness of Data Augmentation in Image Classification using Deep Learning[J]. arXiv E Prints, 2017: arXiv: 1712.04621. |
21 | Ma Y, Xu XZ, Yu Q, et al. LungBRN: a smart digital stethoscope for detecting respiratory disease using bi-ResNet deep learning algorithm[C]//2019 IEEE Biomedical Circuits and Systems Conference (BioCAS). Nara, Japan. IEEE, 2019: 1-4. |
22 | Wu CW, Ye N, Jiang JL. Classification and recognition of lung sounds based on improved Bi-ResNet model[J]. IEEE Access, 2024, 12: 73079-94. |
23 | Nguyen T, Pernkopf F. Lung sound classification using snapshot ensemble of convolutional neural networks[C]//2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). Montreal, QC, Canada. IEEE, 2020: 760-3. |
24 | Song WJ, Han JQ, Song HW. Contrastive embedding learning method for respiratory sound classification[C]//ICASSP 2021‑2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, ON, Canada. IEEE, 2021: 1275-9. |
25 | Bao SS, Che B, Deng LH. Lung sound signal recognition based on dual-source-domain transfer learning[J]. Computer Engineering, 2023, 49(9): 295-302, 312. (in Chinese) |
26 | Tian SY. Research on lung sound signal classification based on CNN-Transformer[D]. Yinchuan: North Minzu University, 2023. (in Chinese) |
27 | Selesnick IW. Wavelet transform with tunable Q-factor[J]. IEEE Trans Signal Process, 2011, 59(8): 3560-75. |