Add RandomResizedCrop layer #21917
Summary of Changes
Hello @MalyalaKarthik66, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Code Review
This pull request introduces the RandomResizedCrop layer, a valuable addition for image augmentation. The implementation is well-structured and includes comprehensive tests covering various scenarios and backends.
I've identified a few areas for improvement:
- The class docstring is missing a usage example and shape information, which is recommended by the Keras API design guidelines.
- There's a potential bug when an integer seed is passed directly to get_random_transformation, which could lead to non-random behavior.
- The resizing logic for images and segmentation masks has some code duplication that can be refactored for better maintainability.
Overall, this is a great contribution. Addressing these points will make the new layer even more robust and user-friendly.
def get_random_transformation(self, data, training=True, seed=None):
    """Returns a crop transformation `(h_start, w_start, crop_h, crop_w)`.

    The same crop parameters are applied to all images in a batch,
    which matches the behavior of other preprocessing layers.
    """
    if isinstance(data, dict):
        images = data.get("images", None)
        input_shape = backend.shape(images)
    else:
        input_shape = backend.shape(data)

    input_height = ops.cast(input_shape[self.height_axis], "float32")
    input_width = ops.cast(input_shape[self.width_axis], "float32")

    if training:
        h_start, w_start, crop_h, crop_w = self._get_random_crop_params(
            input_height, input_width, seed
        )
    else:
        h_start, w_start, crop_h, crop_w = self._get_center_crop_params(
            input_height, input_width
        )

    return (
        ops.cast(h_start, "int32"),
        ops.cast(w_start, "int32"),
        ops.cast(crop_h, "int32"),
        ops.cast(crop_w, "int32"),
    )
If an integer seed is passed directly to get_random_transformation, every call to backend.random.uniform within _get_random_crop_params receives the same seed and therefore produces the same random number. This is because an integer seed is stateless: nothing advances between calls.
To ensure different random numbers are generated for scale, ratio, and crop position, you should create a stateful SeedGenerator from the integer seed if one is provided.
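The difference between a stateless integer seed and a stateful generator can be sketched without Keras. The snippet below is only an analogy built on Python's standard `random` module: `stateless_uniform` and `ToySeedGenerator` are hypothetical stand-ins, not the Keras `SeedGenerator` or backend API.

```python
import random


def stateless_uniform(seed):
    """Draw one uniform sample from a fresh generator seeded with `seed`.

    Stand-in for a stateless random call: the same seed always yields
    the same value, no matter how many times it is called.
    """
    return random.Random(seed).random()


class ToySeedGenerator:
    """Hypothetical stateful seed source (NOT the Keras class)."""

    def __init__(self, seed):
        self._seed = seed
        self._counter = 0

    def next(self):
        # Advance internal state so every call returns a new sub-seed.
        self._counter += 1
        return self._seed * 1_000_003 + self._counter


# Reusing the raw integer: scale, ratio and offset would share one draw.
raw = [stateless_uniform(42) for _ in range(3)]

# Going through a stateful generator: each draw sees a different sub-seed.
gen = ToySeedGenerator(42)
stateful = [stateless_uniform(gen.next()) for _ in range(3)]

print(raw)
print(stateful)
```

In this sketch the three `raw` values are identical, while the three `stateful` values differ, which is the behavior the suggested change below is after.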
def get_random_transformation(self, data, training=True, seed=None):
    """Returns a crop transformation `(h_start, w_start, crop_h, crop_w)`.

    The same crop parameters are applied to all images in a batch,
    which matches the behavior of other preprocessing layers.
    """
    if isinstance(seed, int):
        seed = SeedGenerator(seed)

    if isinstance(data, dict):
        images = data.get("images", None)
        input_shape = backend.shape(images)
    else:
        input_shape = backend.shape(data)

    input_height = ops.cast(input_shape[self.height_axis], "float32")
    input_width = ops.cast(input_shape[self.width_axis], "float32")

    if training:
        h_start, w_start, crop_h, crop_w = self._get_random_crop_params(
            input_height, input_width, seed
        )
    else:
        h_start, w_start, crop_h, crop_w = self._get_center_crop_params(
            input_height, input_width
        )

    return (
        ops.cast(h_start, "int32"),
        ops.cast(w_start, "int32"),
        ops.cast(crop_h, "int32"),
        ops.cast(crop_w, "int32"),
    )

"""Randomly crops and resizes images to a target size.

This layer:
1. Samples a random relative area from `scale`.
2. Samples a random aspect ratio from `ratio`.
3. Derives a crop window (height, width) from these values.
4. Crops the image and resizes the crop to `(height, width)`.

Args:
    height: Integer. Target height of the output image.
    width: Integer. Target width of the output image.
    scale: Tuple of two floats `(min_scale, max_scale)`. The
        sampled relative area (crop_area / image_area) will lie
        in this range. Default `(0.08, 1.0)`.
    ratio: Tuple of two floats `(min_ratio, max_ratio)`. Aspect
        ratio (width / height) of the crop is sampled from this
        interval in log-space. Default `(0.75, 1.33)`.
    interpolation: String. Interpolation mode used in the resize
        step, e.g. `"bilinear"`. Default `"bilinear"`.
    seed: Optional integer. Random seed.
    data_format: Optional string, `"channels_last"` or
        `"channels_first"`. Follows global image data format by
        default.
    name: Optional string name.
    **kwargs: Additional layer keyword arguments.

Notes:
    * On inference (`training=False`), the layer performs a
      deterministic center crop that preserves the target
      aspect ratio, followed by resize to `(height, width)`.
    * On the OpenVINO backend, `backend.image.resize` is not
      implemented. In this case, the layer raises a
      `NotImplementedError` at runtime.
"""
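The four steps listed in the docstring can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the PR's `_get_random_crop_params`, whose body is not shown here: the helper name `sample_crop_window`, the rounding, and the clamping of the crop to the image bounds are all hypothetical choices.

```python
import math
import random


def sample_crop_window(
    in_h, in_w, scale=(0.08, 1.0), ratio=(0.75, 1.33), rng=None
):
    """Return `(h_start, w_start, crop_h, crop_w)` for one random crop."""
    rng = rng or random.Random()

    # 1. Sample the relative area and convert it to pixels.
    area = rng.uniform(scale[0], scale[1]) * in_h * in_w

    # 2. Sample the aspect ratio (width / height) uniformly in log-space.
    log_ratio = rng.uniform(math.log(ratio[0]), math.log(ratio[1]))
    aspect = math.exp(log_ratio)

    # 3. Solve crop_w * crop_h = area and crop_w / crop_h = aspect.
    crop_w = int(round(math.sqrt(area * aspect)))
    crop_h = int(round(math.sqrt(area / aspect)))

    # Assumption: clamp so the window always fits inside the image.
    crop_w = max(1, min(crop_w, in_w))
    crop_h = max(1, min(crop_h, in_h))

    # 4. Pick a top-left corner that keeps the window in bounds.
    h_start = rng.randint(0, in_h - crop_h)
    w_start = rng.randint(0, in_w - crop_w)
    return h_start, w_start, crop_h, crop_w


window = sample_crop_window(300, 300, rng=random.Random(0))
print(window)
```

Sampling the ratio in log-space makes a ratio `r` and its reciprocal `1 / r` equally likely when the interval is symmetric around 1, which a plain uniform draw would not do.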
The docstring is missing a code example, as well as Input shape and Output shape sections, which are recommended by the Keras API design guidelines. Adding these will improve usability and documentation quality.
Please add a simple example demonstrating the layer's usage, and document the expected input and output shapes.
"""Randomly crops and resizes images to a target size.

This layer:
1. Samples a random relative area from `scale`.
2. Samples a random aspect ratio from `ratio`.
3. Derives a crop window (height, width) from these values.
4. Crops the image and resizes the crop to `(height, width)`.

Args:
    height: Integer. Target height of the output image.
    width: Integer. Target width of the output image.
    scale: Tuple of two floats `(min_scale, max_scale)`. The
        sampled relative area (crop_area / image_area) will lie
        in this range. Default `(0.08, 1.0)`.
    ratio: Tuple of two floats `(min_ratio, max_ratio)`. Aspect
        ratio (width / height) of the crop is sampled from this
        interval in log-space. Default `(0.75, 1.33)`.
    interpolation: String. Interpolation mode used in the resize
        step, e.g. `"bilinear"`. Default `"bilinear"`.
    seed: Optional integer. Random seed.
    data_format: Optional string, `"channels_last"` or
        `"channels_first"`. Follows global image data format by
        default.
    name: Optional string name.
    **kwargs: Additional layer keyword arguments.

Input shape:
    A 3D (unbatched) or 4D (batched) tensor, with shape:
    `(..., height, width, channels)` if `data_format="channels_last"`,
    or `(..., channels, height, width)` if `data_format="channels_first"`.

Output shape:
    A 3D (unbatched) or 4D (batched) tensor, with shape:
    `(..., self.height, self.width, channels)` if
    `data_format="channels_last"`, or
    `(..., channels, self.height, self.width)` if
    `data_format="channels_first"`.

Example:

```python
input_shape = (2, 300, 300, 3)
# `keras.random.uniform` is the public random API; `keras.ops` has no
# `random` submodule.
input_data = keras.random.uniform(shape=input_shape)
layer = keras.layers.RandomResizedCrop(height=224, width=224)
output_data = layer(input_data)
print(output_data.shape)
# (2, 224, 224, 3)
```

Notes:
    * On inference (`training=False`), the layer performs a
      deterministic center crop that preserves the target
      aspect ratio, followed by resize to `(height, width)`.
    * On the OpenVINO backend, `backend.image.resize` is not
      implemented. In this case, the layer raises a
      `NotImplementedError` at runtime.
"""

References
- The Keras API design guidelines recommend that all docstrings include code examples (line 145) and follow a specific structure, including Input/Output shape sections for layers (line 157).
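The inference-time behavior described in the docstring notes (a deterministic center crop that preserves the target aspect ratio) can be sketched as follows. This is an assumed reading of what `_get_center_crop_params` computes, since its body is not shown in this conversation; `center_crop_window` is a hypothetical name.

```python
def center_crop_window(in_h, in_w, target_h, target_w):
    """Largest centered window whose aspect ratio matches the target."""
    target_aspect = target_w / target_h

    if in_w / in_h > target_aspect:
        # Input is wider than the target: keep full height, trim width.
        crop_h = in_h
        crop_w = max(1, min(in_w, round(in_h * target_aspect)))
    else:
        # Input is taller (or equal): keep full width, trim height.
        crop_w = in_w
        crop_h = max(1, min(in_h, round(in_w / target_aspect)))

    # Center the window inside the input.
    h_start = (in_h - crop_h) // 2
    w_start = (in_w - crop_w) // 2
    return h_start, w_start, crop_h, crop_w


# A 300x400 input with a square 224x224 target keeps the full height
# and trims 50 pixels from each side.
print(center_crop_window(300, 400, 224, 224))  # (0, 50, 300, 300)
```

Because no random draw is involved, repeated calls with the same shapes always return the same window.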
def _resize_images(self, images):
    """Resize images to `(height, width)` using backend API.

    For OpenVINO, this raises NotImplementedError because the
    backend does not yet provide `backend.image.resize`.
    """
    if backend.backend() == "openvino":
        raise NotImplementedError(
            "`RandomResizedCrop` is not yet supported on the "
            "OpenVINO backend because `backend.image.resize` is "
            "not implemented there. Please use `RandomCrop` or "
            "switch to a different backend until resize support "
            "is added."
        )

    return backend.image.resize(
        images,
        size=(self.height, self.width),
        interpolation=self.interpolation,
        antialias=False,
        crop_to_aspect_ratio=False,
        pad_to_aspect_ratio=False,
        fill_mode="constant",
        fill_value=0.0,
        data_format=self.data_format,
    )

def transform_images(self, images, transformation=None, training=True):
    """Apply random resized crop to a batch of images."""
    if transformation is None:
        transformation = self.get_random_transformation(
            images, training=training
        )
    h_start, w_start, crop_h, crop_w = transformation

    images = self._slice_images(images, h_start, w_start, crop_h, crop_w)
    images = self._resize_images(images)
    return images

def transform_segmentation_masks(
    self, masks, transformation, training=True
):
    """Apply the same crop + resize to segmentation masks."""
    h_start, w_start, crop_h, crop_w = transformation
    masks = self._slice_images(masks, h_start, w_start, crop_h, crop_w)

    if backend.backend() == "openvino":
        raise NotImplementedError(
            "Segmentation mask resizing for `RandomResizedCrop` is "
            "not yet supported on the OpenVINO backend."
        )

    return backend.image.resize(
        masks,
        size=(self.height, self.width),
        interpolation="nearest",
        antialias=False,
        crop_to_aspect_ratio=False,
        pad_to_aspect_ratio=False,
        fill_mode="constant",
        fill_value=0.0,
        data_format=self.data_format,
    )
There's some code duplication in _resize_images and transform_segmentation_masks for handling the OpenVINO backend and calling backend.image.resize. Also, the error message for OpenVINO in transform_segmentation_masks is less informative than the one in _resize_images.
You can refactor this into a single private helper method _resize to remove duplication and ensure consistent, helpful error messages. This improves maintainability.
def _resize(self, images, interpolation):
    """Resize images to `(height, width)` using backend API.

    For OpenVINO, this raises NotImplementedError because the
    backend does not yet provide `backend.image.resize`.
    """
    if backend.backend() == "openvino":
        raise NotImplementedError(
            "`RandomResizedCrop` is not yet supported on the "
            "OpenVINO backend because `backend.image.resize` is "
            "not implemented there. Please use `RandomCrop` or "
            "switch to a different backend until resize support "
            "is added."
        )

    return backend.image.resize(
        images,
        size=(self.height, self.width),
        interpolation=interpolation,
        antialias=False,
        crop_to_aspect_ratio=False,
        pad_to_aspect_ratio=False,
        fill_mode="constant",
        fill_value=0.0,
        data_format=self.data_format,
    )

def _resize_images(self, images):
    """Resize images to `(height, width)` using backend API."""
    return self._resize(images, self.interpolation)

def transform_images(self, images, transformation=None, training=True):
    """Apply random resized crop to a batch of images."""
    if transformation is None:
        transformation = self.get_random_transformation(
            images, training=training
        )
    h_start, w_start, crop_h, crop_w = transformation

    images = self._slice_images(images, h_start, w_start, crop_h, crop_w)
    images = self._resize_images(images)
    return images

def transform_segmentation_masks(
    self, masks, transformation, training=True
):
    """Apply the same crop + resize to segmentation masks."""
    h_start, w_start, crop_h, crop_w = transformation
    masks = self._slice_images(masks, h_start, w_start, crop_h, crop_w)
    return self._resize(masks, "nearest")
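Both the original and the refactored code hard-code `"nearest"` for segmentation masks while images use `self.interpolation`. A likely reason, sketched here on a single row of class labels, is that interpolating between labels invents values that are not valid classes. The two helpers are simplified 1D illustrations written for this note, not the backend's resize implementation.

```python
def resize_nearest(row, new_len):
    """Nearest-neighbor resize of a 1D row (pixel-center sampling)."""
    n = len(row)
    return [
        row[min(n - 1, int((i + 0.5) * n / new_len))]
        for i in range(new_len)
    ]


def resize_linear(row, new_len):
    """Linear-interpolation resize of a 1D row (pixel-center sampling)."""
    n = len(row)
    out = []
    for i in range(new_len):
        # Map the output pixel center back into input coordinates.
        x = (i + 0.5) * n / new_len - 0.5
        x = min(max(x, 0.0), n - 1.0)
        lo = int(x)
        hi = min(lo + 1, n - 1)
        t = x - lo
        out.append((1 - t) * row[lo] + t * row[hi])
    return out


labels = [0, 0, 3, 3]  # class ids along one mask row

print(resize_nearest(labels, 8))  # only the original ids 0 and 3 appear
print(resize_linear(labels, 8))   # contains in-between values near the edge
```

With linear interpolation the boundary between class 0 and class 3 produces fractional values that correspond to no class, so a mask resized that way could no longer be used directly as integer labels.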
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #21917 +/- ##
==========================================
+ Coverage 76.30% 76.89% +0.59%
==========================================
Files 580 581 +1
Lines 60031 60175 +144
Branches 9433 9443 +10
==========================================
+ Hits 45805 46271 +466
+ Misses 11750 11549 -201
+ Partials 2476 2355 -121
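The percentages in the table can be reproduced from its raw counts. The sketch below assumes that coverage is hits divided by total lines and that total lines equal hits plus misses plus partials, which is consistent with the numbers shown; `coverage_percent` is a helper written for this check.

```python
def coverage_percent(hits, misses, partials):
    """Coverage as hits over all tracked lines, in percent."""
    lines = hits + misses + partials
    return 100.0 * hits / lines


base = coverage_percent(45805, 11750, 2476)  # master
head = coverage_percent(46271, 11549, 2355)  # this pull request

print(f"{base:.2f}% -> {head:.2f}% ({head - base:+.2f})")
# 76.30% -> 76.89% (+0.59)
```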
Add keras.layers.RandomResizedCrop layer that:
scale=(0.08, 1.0) and aspect ratio from ratio=(0.75, 1.33)
backend.image.resize
Closes #21822