[Fix] Fix dimension error when using slide inference with Mask2Former head #3752

Open
wants to merge 1 commit into
base: dev-1.x

Joris-Kuehl-TU-Berlin commented Aug 5, 2024

Motivation

The method predict in mmseg/models/decode_heads/mask2former_head.py is incompatible with the 'slide' inference mode.

To elaborate, on line 280 of slide_inference in mmseg/models/segmentors/encoder_decoder.py, the key 'img_shape' of batch_img_metas[0] is overwritten by the shape of the cropped image / sliding window.
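The overwrite described above can be illustrated with a simplified, pure-Python sketch. This is not the code from encoder_decoder.py: the helper `iter_windows`, the grid formula, and the metadata values are assumptions chosen for illustration. It only shows that 'img_shape' ends up holding the window size while 'pad_shape' keeps the full-image size.

```python
def iter_windows(img_hw, crop_hw, stride_hw):
    """Yield (y1, x1, y2, x2) for each sliding window over the image."""
    h_img, w_img = img_hw
    h_crop, w_crop = crop_hw
    h_stride, w_stride = stride_hw
    # Number of window positions needed to cover each axis (assumed formula).
    h_grids = max(h_img - h_crop + h_stride - 1, 0) // h_stride + 1
    w_grids = max(w_img - w_crop + w_stride - 1, 0) // w_stride + 1
    for h_idx in range(h_grids):
        for w_idx in range(w_grids):
            y2 = min(h_idx * h_stride + h_crop, h_img)
            x2 = min(w_idx * w_stride + w_crop, w_img)
            # Shift the last window back so it stays inside the image.
            y1 = max(y2 - h_crop, 0)
            x1 = max(x2 - w_crop, 0)
            yield y1, x1, y2, x2


# Hypothetical metadata for a 1024x1024 input; values are illustrative only.
img_meta = {"img_shape": (1024, 1024), "pad_shape": (1024, 1024)}
seen = []
for y1, x1, y2, x2 in iter_windows((1024, 1024), (512, 512), (512, 512)):
    # Only 'img_shape' is replaced by the window size; 'pad_shape' is untouched.
    img_meta["img_shape"] = (y2 - y1, x2 - x1)
    seen.append((img_meta["img_shape"], img_meta["pad_shape"]))
```

After the loop, every entry in `seen` pairs a window-sized 'img_shape' with the unchanged full-image 'pad_shape', which is the state a decode head sees during slide inference.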

On line 353 of mmseg/models/decode_heads/decode_head.py, 'img_shape' is the first key used to get the target size for upsampling. As such, crop_seg_logits will have the same spatial shape as crop_img.

In mmseg/models/decode_heads/mask2former_head.py, the first shape that is referenced as a target for upsampling is 'pad_shape' instead. Since slide_inference does not overwrite this key, each cropped image is upsampled to the size of the full image, and then further padded with zeros by slide_inference, leading to a dimension mismatch when trying to add up the padded crop_seg_logits. This leads to the issue described in #3666.
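The mismatch can be checked with plain shape arithmetic. The sketch below assumes that slide_inference pads each window's logits with zeros on all four sides so that a window-sized tensor lands at its position in the full image; `padded_shape` is a hypothetical helper, not a function from mmseg.

```python
def padded_shape(logit_hw, window, img_hw):
    """Spatial shape of the logits after zero-padding them into the full image."""
    y1, x1, y2, x2 = window
    h_img, w_img = img_hw
    h_logit, w_logit = logit_hw
    # Zero rows are added above (y1) and below (h_img - y2) the logits,
    # zero columns to the left (x1) and right (w_img - x2).
    return (h_logit + y1 + (h_img - y2), w_logit + x1 + (w_img - x2))


img_hw = (1024, 1024)
window = (512, 512, 1024, 1024)  # bottom-right 512x512 window

# Logits upsampled to the window size fit the accumulator exactly.
crop_sized = padded_shape((512, 512), window, img_hw)

# Logits upsampled to the full image size overshoot after padding.
full_sized = padded_shape((1024, 1024), window, img_hw)
```

Here `crop_sized` equals the full image size, so the padded logits can be added to the accumulator, while `full_sized` is larger than the image in both dimensions and the addition fails.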

Modification

I have simply adjusted the size selection in mmseg/models/decode_heads/mask2former_head.py to match that of mmseg/models/decode_heads/decode_head.py. As a result, a dimension mismatch no longer occurs when using slide inference with a Mask2Former head.
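The effect of the key order can be sketched as follows. Both functions are hypothetical simplifications written for this illustration; the actual selection logic in either file may contain additional conditions.

```python
def size_pad_first(img_meta):
    """Assumed pre-change behaviour: prefer 'pad_shape' when it is present."""
    if "pad_shape" in img_meta:
        return tuple(img_meta["pad_shape"][:2])
    return tuple(img_meta["img_shape"][:2])


def size_img_first(img_meta):
    """Assumed post-change behaviour: prefer 'img_shape' when it is present."""
    if "img_shape" in img_meta:
        return tuple(img_meta["img_shape"][:2])
    return tuple(img_meta["pad_shape"][:2])


# Metadata as seen inside one sliding window (illustrative values).
meta = {"img_shape": (512, 512), "pad_shape": (1024, 1024)}

before = size_pad_first(meta)  # full-image size
after = size_img_first(meta)   # window size
```

With the window metadata above, the 'pad_shape'-first order returns the full-image size and the 'img_shape'-first order returns the window size, which is what slide inference expects.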

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  3. If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

londumas commented Nov 8, 2024

Thank you for this PR, it works.
