Enhancing ResNet-18 with Squeeze-and-Excitation Attention for Improved Medical Image Diagnosis
DOI: https://doi.org/10.62381/ACS.FSSD2025.33
Author(s)
Xiaoyi Jiang*
Affiliation(s)
School of Computing and Data Science, Xiamen University Malaysia, Sepang, Selangor, 43900, Malaysia
*Corresponding Author
Abstract
This study examines the effectiveness of incorporating Squeeze-and-Excitation (SE) attention mechanisms into the ResNet-18 model for medical image classification. Three datasets with distinct classification tasks were selected: PneumoniaMNIST for pneumonia detection, RetinaMNIST for retinal disease classification, and OrganMNIST for multi-label organ recognition, and the model was evaluated on each task separately. During training, stochastic gradient descent (SGD) was used as the optimizer, with a task-specific loss function chosen to match each classification type. Experimental results show that the attention-enhanced model outperforms the baseline ResNet-18 in both classification accuracy and generalization across tasks, demonstrating improved stability and adaptability and highlighting its potential for practical use in clinical diagnostics.
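To make the architecture concrete, the following is a minimal PyTorch sketch of an attention-enhanced ResNet-18: a standard Squeeze-and-Excitation block applied to the residual branch of each torchvision BasicBlock, following the placement in Hu et al. The reduction ratio, insertion points, input handling, and training hyperparameters shown here are illustrative assumptions, since the abstract does not specify them.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18


class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention (Hu et al., 2018)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # squeeze: global average pooling
        self.fc = nn.Sequential(                 # excitation: bottleneck MLP
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                             # per-channel recalibration


def se_resnet18(num_classes: int, reduction: int = 16) -> nn.Module:
    """ResNet-18 with an SE block on every residual branch (illustrative placement)."""
    model = resnet18(num_classes=num_classes)
    for layer in (model.layer1, model.layer2, model.layer3, model.layer4):
        for block in layer:
            channels = block.bn2.num_features
            # Recalibrate the residual branch output (after bn2, before the
            # identity addition), as in the original SE-ResNet design.
            block.bn2 = nn.Sequential(block.bn2, SEBlock(channels, reduction))
    return model


if __name__ == "__main__":
    # Example setup for a binary task such as PneumoniaMNIST. A 3-channel input
    # is assumed; grayscale images would need channel replication or a 1-channel conv1.
    model = se_resnet18(num_classes=2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    criterion = nn.CrossEntropyLoss()            # e.g. BCEWithLogitsLoss for a multi-label task
    logits = model(torch.randn(4, 3, 224, 224))  # dummy batch
    loss = criterion(logits, torch.randint(0, 2, (4,)))
    loss.backward()
    optimizer.step()
</antml>```

The reduction ratio of 16 follows the default in Hu et al.; in practice the classification head, input channels, and loss function would be adapted per dataset, as the abstract's note on task-specific losses suggests.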
Keywords
Squeeze-and-Excitation; Attention Mechanism; ResNet-18; Medical Image Classification