A Multi-Object Tracking Algorithm for UAVs in Scenarios with Drastic Scale Variations and Dense Occlusions
DOI: https://doi.org/10.62381/I265304
Author(s)
Ziyang Qin1, Zongshang Yang1, Wanwan Wang2, Jiangang Zhang2
Affiliation(s)
1School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, Henan, China
2iFLYTEK Co., Ltd., Hefei, Anhui, China
Abstract
UAV-based aerial imaging plays a crucial role in intelligent inspection applications. However, in real-world high-altitude scenarios, drastic scale variations and dense occlusions often cause missed detections, false positives, and identity switches in multi-object tracking systems. To address these issues, this paper proposes a lightweight UAV multi-object tracking method. At the detection stage, a dynamic feature reconstruction module (DySample) is introduced to enhance small-object representation, and an efficient multi-scale attention mechanism (EMA) is incorporated to mitigate the background interference caused by feature amplification. At the association stage, a confidence-driven adaptive Kalman filter combined with a dual-threshold matching strategy is employed to improve trajectory stability under occlusion. Ablation experiments on the VisDrone2019-MOT dataset show that, compared to the baseline model, the proposed method improves MOTA by 3.5 percentage points (from 25.3% to 28.8%), reduces ID switches from 38 to 35, and decreases false positives from 4728 to 3527, demonstrating advantages in suppressing false detections and maintaining identity consistency. However, the number of false negatives (FN) increases from 8440 to 9017, indicating room for improvement in recall under strict noise suppression. This study provides a feasible solution for UAV multi-object tracking under constrained computational resources.
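The reported counts can be checked against each other with a short calculation. The sketch below assumes the paper uses the standard CLEAR MOT definition, MOTA = 1 - (FN + FP + IDSW) / GT, which the abstract does not state explicitly. The ground-truth object count GT is not given in the abstract, so it is left as a parameter; the function names `total_errors` and `mota` are illustrative only.

```python
def total_errors(fn: int, fp: int, idsw: int) -> int:
    """Sum of false negatives, false positives, and ID switches (the MOTA penalty numerator)."""
    return fn + fp + idsw


def mota(fn: int, fp: int, idsw: int, num_gt: int) -> float:
    """Standard CLEAR MOT accuracy: 1 - (FN + FP + IDSW) / GT.

    num_gt is the total number of ground-truth objects over all frames.
    It is NOT reported in the abstract and must be supplied by the caller.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - total_errors(fn, fp, idsw) / num_gt


# Counts quoted in the abstract.
baseline = total_errors(fn=8440, fp=4728, idsw=38)
proposed = total_errors(fn=9017, fp=3527, idsw=35)

print(baseline)             # 13206
print(proposed)             # 12579
print(baseline - proposed)  # 627
```

Under this definition, the drop of 1201 false positives and 3 ID switches outweighs the rise of 577 false negatives, so the total error count falls by 627. This is consistent with MOTA increasing even though recall decreases.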
Keywords
UAV Vision; Multi-Object Tracking; Dynamic Upsampling; Efficient Multi-Scale Attention; Adaptive Kalman Filtering.