Improving Small Object Detection in Remote Sensing with YOLO11-CBAM and Deep Learning
Nima Garshasebi
Abstract:
In this study, we present an object detection system designed for remote sensing images, built on the YOLO11 framework and enhanced with a Convolutional Block Attention Module (CBAM). Detecting objects in remote sensing imagery is challenging because of wide variation in object scale, complex backgrounds, densely packed objects, and arbitrary orientations. Traditional detection models often struggle under these conditions, particularly when identifying small objects. To address these limitations, we propose two improvements to YOLO11: (1) the integration of CBAM, which applies channel and spatial attention to emphasize informative regions of the feature maps and suppress irrelevant background information, and (2) a modified detection head with an additional detection layer optimized for small objects, which improves the model’s ability to handle multi-scale objects. We evaluated the proposed model on the DOTA dataset, a widely used benchmark for object detection in aerial images. The model achieves a mean Average Precision at an IoU threshold of 0.5 (mAP50) of 76.68%, outperforming both the baseline YOLO11 and several state-of-the-art models. Ablation studies confirm that CBAM and the enhanced detection head each contribute to the overall performance. These findings show that combining attention mechanisms with multi-scale feature learning is effective for object detection in remote sensing, with applications such as urban planning, environmental monitoring, and disaster management.
Keywords:
Attention layer, CBAM, Deep learning, Object detection, Satellite images.
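
The channel and spatial attention described in the abstract can be illustrated with a minimal PyTorch sketch of a CBAM block. It follows the commonly used CBAM formulation (channel attention from average- and max-pooled descriptors passed through a shared MLP, followed by spatial attention from a 7x7 convolution over channel-wise average and max maps). The class names, the reduction ratio of 16, and the kernel size of 7 are illustrative assumptions; the abstract does not state the hyperparameters used or where the module is placed in the detector.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Weights each channel using globally pooled descriptors (assumed reduction=16)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)
        # Shared two-layer MLP, written as 1x1 convolutions.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1, bias=False),
            nn.ReLU(),
            nn.Conv2d(hidden, channels, kernel_size=1, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return torch.sigmoid(avg + mx)  # shape (N, C, 1, 1)


class SpatialAttention(nn.Module):
    """Weights each spatial location using channel-wise statistics (assumed 7x7 kernel)."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = torch.mean(x, dim=1, keepdim=True)
        mx = torch.amax(x, dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # shape (N, 1, H, W)


class CBAM(nn.Module):
    """Applies channel attention, then spatial attention, to a feature map."""

    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        self.channel = ChannelAttention(channels, reduction)
        self.spatial = SpatialAttention(kernel_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel(x)     # reweight channels
        return x * self.spatial(x)  # reweight spatial locations
```

Because both attention maps are broadcast-multiplied onto the input, the output has the same shape as the input feature map, so a block like this can in principle be inserted after a backbone or neck stage without changing the surrounding layer dimensions.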