SaccadeMOT: Enhancing Object Detection and Tracking in Gigapixel Images via Scale-Aware Density Estimation

Abstract

Recent deep learning work on object detection and tracking has focused primarily on megapixel images, leaving a significant gap in the efficient processing of gigapixel images. These ultra-high-resolution images pose unique challenges because of their enormous size and computational requirements. To address this, we present SaccadeMOT, an architecture for gigapixel-level multi-object tracking inspired by the saccadic movements of the human eye. Its key feature is the ability to select and process only specific regions of the image, which drastically reduces the computational load. Detection is divided into two stages: the ‘saccade’ stage, which identifies regions of probable interest, and the ‘gaze’ stage, which refines the detection within these targeted areas. Based on the detection results, we track each object using a combination of head tracking and body tracking. Evaluated on the PANDA dataset, our approach achieves an 8x speed increase over state-of-the-art methods and also shows promise for gigapixel-level pathology analysis, particularly in Whole Slide Imaging applications.
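
The two-stage detection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the saccade stage scores fixed-size tiles of a coarse density map and keeps the densest ones, and that the gaze stage runs a detector on the matching full-resolution crops. The names saccade, gaze, toy_detector, tile, top_k, and scale are all hypothetical and chosen only for this example.

```python
import numpy as np


def saccade(density, tile, top_k, scale=1):
    """Pick up to top_k tiles with the highest summed density.

    density: 2D array, a coarse per-pixel object density map (assumed input).
    scale:   factor mapping density-map coordinates to full-image coordinates.
    Returns boxes (x0, y0, x1, y1) in full-image coordinates.
    """
    h, w = density.shape
    scored = []
    for y0 in range(0, h, tile):
        for x0 in range(0, w, tile):
            y1, x1 = min(y0 + tile, h), min(x0 + tile, w)
            score = float(density[y0:y1, x0:x1].sum())
            scored.append((score, (x0, y0, x1, y1)))
    # Highest density first; tiles with no density are skipped entirely.
    scored.sort(key=lambda s: s[0], reverse=True)
    return [tuple(v * scale for v in box) for score, box in scored[:top_k] if score > 0]


def gaze(image, regions, detector):
    """Run detector on each selected crop and shift its boxes to full-image coordinates."""
    detections = []
    for x0, y0, x1, y1 in regions:
        crop = image[y0:y1, x0:x1]
        for bx0, by0, bx1, by1 in detector(crop):
            detections.append((bx0 + x0, by0 + y0, bx1 + x0, by1 + y0))
    return detections


def toy_detector(crop):
    """Stand-in detector (hypothetical): reports one fixed box per crop."""
    return [(1, 1, 3, 3)]


# Toy data: an 8x8 density map standing in for an 80x80 image (scale = 10).
density = np.zeros((8, 8))
density[1, 1] = 3.0
density[6, 5] = 1.0
image = np.zeros((80, 80))

regions = saccade(density, tile=4, top_k=2, scale=10)
detections = gaze(image, regions, toy_detector)
print(regions)
print(detections)
```

Only two of the four tiles reach the detector in this toy case, which is where the saving comes from: the cost of the gaze stage grows with the number of selected regions rather than with the full image area.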

Publication
In “27th European Conference on Artificial Intelligence”