Add Instance Segmentation Transforms #836
/review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b7caf1ca53
if isinstance(bboxes_out, list):
    bboxes_out = np.array(bboxes_out)
Preserve empty bbox tensors as (0, 4)
When an image has no instances (which the dataset allows by default) or an augmentation drops every box, albumentations can return bboxes as an empty list; np.array([]) produces shape (0,), so the dataset returns a box tensor with no second dimension. The downstream matchers/criteria expect each target's boxes to have shape [num_target_boxes, 4], so these samples can fail once an empty target is batched; reshape empty outputs to (0, 4) as the EoMT path does.
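A minimal sketch of the suggested fix, assuming each box is a plain 4-tuple of coordinates. The helper name `to_bbox_array` is hypothetical and is not taken from the PR; the point is only that the empty case needs an explicit `(0, 4)` shape.

```python
import numpy as np


def to_bbox_array(bboxes_out):
    """Return boxes as a float32 array of shape (N, 4), including N == 0.

    Hypothetical helper for illustration; assumes each box has exactly
    four coordinates.
    """
    arr = np.asarray(bboxes_out, dtype=np.float32)
    if arr.size == 0:
        # np.asarray([]) has shape (0,), which loses the box dimension.
        return np.zeros((0, 4), dtype=np.float32)
    # Non-empty input: enforce the (N, 4) layout explicitly.
    return arr.reshape(-1, 4)


empty = to_bbox_array([])
single = to_bbox_array([(10.0, 20.0, 30.0, 40.0)])
print(empty.shape, single.shape)
```

With this shape guaranteed, an empty target has the same number of dimensions as a populated one, so code that indexes the second axis does not need a special case.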
}


class LTDETRInstanceSegmentationCollateFunction(_LTDETRCollateFunction):
Wire the LTDETR collate into dataset selection
This collate is never selected in the standard training flow: train_task.py instantiates collates only via train_dataset.batch_collate_fn_cls, while InstanceSegmentationDataset still points that class variable at EoMTInstanceSegmentationCollateFunction even when an LTDETR transform is installed. As a result, LTDETR instance-segmentation training through the normal dataset path silently skips the default mixup and this class's step-aware reinitialization logic.
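One possible way to wire this up, sketched with stand-in classes rather than the repository's real ones. Only the `batch_collate_fn_cls` attribute name comes from the comment above; `collate_fn_cls`, `Dataset`, and the other class names here are hypothetical, and the actual fix may look different.

```python
class EoMTCollate:
    """Stand-in for the EoMT collate function class."""


class LTDETRCollate:
    """Stand-in for the LTDETR collate function class."""


class LTDETRTransform:
    # Hypothetical attribute: the transform names the collate it needs.
    collate_fn_cls = LTDETRCollate


class PlainTransform:
    """Stand-in transform that does not request a specific collate."""


class Dataset:
    # Class-level default, mirroring the behaviour described in the review.
    batch_collate_fn_cls = EoMTCollate

    def __init__(self, transform):
        self.transform = transform
        # Instance attribute shadows the class default when the transform
        # declares its own collate class.
        self.batch_collate_fn_cls = getattr(
            transform, "collate_fn_cls", type(self).batch_collate_fn_cls
        )


print(Dataset(LTDETRTransform()).batch_collate_fn_cls.__name__)
print(Dataset(PlainTransform()).batch_collate_fn_cls.__name__)
```

Resolving the collate class per instance keeps the existing default for EoMT while letting a training loop that reads `train_dataset.batch_collate_fn_cls` pick up the LTDETR collate without further changes.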
elif normalize == "none":
    self.normalize = NormalizeArgs()
Match default normalization stats to channel_drop
When users enable channel_drop with num_channels_keep other than 3 and leave normalize at the default auto, this branch still creates a 3-channel ImageNet NormalizeArgs() before num_channels is resolved from channel_drop. The sample pipeline then applies 3 mean/std values to a C!=3 image, which can fail or normalize the extra/dropped channels incorrectly; repeat/trim the stats or reject this combination as the EoMT transform compatibility pass does.
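A rough sketch of the "repeat/trim" option, assuming the defaults are the standard ImageNet statistics. The function `fit_stats` is hypothetical and not part of the PR; trimming also assumes the kept channels are the leading ones, which may not match how `channel_drop` selects channels, so rejecting the combination could be the safer choice.

```python
from itertools import cycle, islice

# Standard ImageNet per-channel statistics (RGB order).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def fit_stats(stats, num_channels):
    """Trim or cyclically repeat per-channel stats to num_channels values.

    Hypothetical helper for illustration only.
    """
    if num_channels < 1:
        raise ValueError("num_channels must be at least 1")
    # cycle() repeats the stats; islice() cuts at the requested length.
    return tuple(islice(cycle(stats), num_channels))


mean_2ch = fit_stats(IMAGENET_MEAN, 2)
std_5ch = fit_stats(IMAGENET_STD, 5)
print(mean_2ch, std_5ch)
```

Either way, the resolved mean and std should have exactly as many entries as the image has channels before normalization runs.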
What has changed and why?
This PR stacks on top of #835.
It adds LT-DETR instance segmentation transform support on top of the object detection transform refactor:
- ltdetr_transforms instance segmentation transform module
How has it been tested?
Unit tests.
Did you update CHANGELOG.md?
Did you update the documentation?