AnchorHead and add_gt_as_proposals option of the samplers are incompatible #12298

cmhm7 · 2025-01-24T09:02:29Z

Thanks for your error report and we appreciate it a lot.

Checklist

I have searched related issues but cannot get the expected help.
I have read the FAQ documentation but cannot get the expected help.
The bug has not been fixed in the latest version.

Describe the bug
When a sampler (such as RandomSampler) is used with add_gt_as_proposals=True in an AnchorHead (such as RPNHead), training sometimes crashes because of out-of-range index accesses.

Reproduction

What command or script did you run?

python tools/train.py config.py

with a FasterRCNN in the config, and a RandomSampler with add_gt_as_proposals=True in the train_cfg for the RPN:

model = dict(
    type='FasterRCNN',
...
    rpn_head=dict(
        type='RPNHead',
        in_channels=96,
        feat_channels=96,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[1]),
...
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                match_low_quality=True,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=10,
                pos_fraction=0.2,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),

Did you make any modifications on the code or config? Did you understand what you have modified?
Yes and yes
What dataset did you use?
Custom one

Environment

Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.

sys.platform: linux
Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
CUDA available: False
MUSA available: False
numpy_random_seed: 2147483648
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0
PyTorch compiling details: PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2021.2-Product Build 20210312 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
OpenMP 201511 (a.k.a. OpenMP 4.5)
NNPACK is enabled
CPU capability usage: AVX2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.10.0
OpenCV: 4.10.0
MMEngine: 0.10.6
MMDetection: 3.3.0+78cf2bc

You may add additional information that may be helpful for locating the problem, such as
- How you installed PyTorch [e.g., pip, conda, source]
- Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

I used the Dockerfile from MMDetection with

ARG PYTORCH="1.9.0"
ARG CUDA="11.1"
ARG CUDNN="8"

Error traceback
The error happens randomly during long training runs, and I did not record the exact message.

It reported an out-of-range array access in anchor_head.py at line 292.

Bug fix
The reason is that in AnchorHead, in the method _get_targets_single, we have

anchors = flat_anchors[inside_flags]
...

sampling_result = self.sampler.sample(assign_result, pred_instances,
                                      gt_instances)

The rest of the method indexes into the anchors computed above. But if add_gt_as_proposals=True in the sampler, the sampler adds the GT boxes to its internal list of priors, so that list becomes longer than anchors, and sampling_result.pos_inds and sampling_result.neg_inds can contain indices >= len(anchors).
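The mismatch can be illustrated with a toy model. This is only a sketch: plain Python lists stand in for tensors, and sample_with_gt is a hypothetical stand-in for the sampler, not MMDetection code. It assumes the sampler returns indices into its own combined list of GT boxes and anchors.

```python
# Toy model of the index mismatch; plain lists stand in for tensors.
anchors = ["a0", "a1", "a2", "a3"]   # anchors kept by inside_flags
gt_boxes = ["gt0", "gt1"]            # ground-truth boxes of the image


def sample_with_gt(priors, gts, add_gt_as_proposals):
    """Hypothetical sampler stand-in: returns indices into its own list."""
    # Assumption: with the option enabled, the sampler works on a list
    # that holds the GT boxes in addition to the priors it was given.
    combined = gts + priors if add_gt_as_proposals else list(priors)
    # Pretend every entry was sampled, to expose the full index range.
    return list(range(len(combined)))


inds = sample_with_gt(anchors, gt_boxes, add_gt_as_proposals=True)

# Any index >= len(anchors) is invalid for the caller's anchors list.
out_of_range = [i for i in inds if i >= len(anchors)]
print(out_of_range)
```

A real sampler picks indices at random, so an out-of-range index would only show up in some iterations, which would be consistent with the crash appearing randomly.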

Even if you fix that, there is still an issue at anchor_head.py line 415, because images_to_levels expects the lists to have the same length across all images.
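Until AnchorHead handles the extra entries, a possible workaround (an assumption, not a confirmed fix) is to disable the option for the RPN sampler, so that the sampled indices keep referring to anchors only. The fragment below reuses the sampler values from the config above; rpn_sampler is just an illustrative variable name.

```python
# Possible workaround sketch: same RPN sampler settings as in the report,
# but without adding GT boxes to the sampler's list of priors.
rpn_sampler = dict(
    type='RandomSampler',
    num=10,
    pos_fraction=0.2,
    neg_pos_ub=-1,
    add_gt_as_proposals=False)  # indices stay within len(anchors)
```

This dict would replace the sampler entry under train_cfg.rpn. It gives up the extra GT positives, so it avoids the crash rather than making the two features compatible.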
