Conversation

harshaljanjani
Collaborator

@harshaljanjani harshaljanjani commented Jul 9, 2025

Description of the change

Welcome D-FINE to the KerasHub family of models!
D-FINE, a powerful real-time object detector, sets a new state-of-the-art benchmark for object detection on KerasHub. It achieves outstanding localization precision by redefining the bounding box regression task in DETR models. Additionally, it incorporates lightweight optimizations in computationally intensive modules and operations, striking a better balance between speed and accuracy. Specifically, D-FINE-L/X achieves 54.0%/55.8% AP on the COCO dataset at 124/78 FPS on an NVIDIA T4 GPU. When pretrained on Objects365, D-FINE-L/X attains 57.1%/59.3% AP, surpassing all existing real-time detectors.

Closes the second half of issue #2271, and with it the issue as a whole.

Results in Action of KerasHub's D-FINE

Model Predictions Numerics Matching (Visit the Colab notebook for the complete results)

Colab Notebook

The D-FINE Fine-Tuning, Prediction and Numerics Verification Notebook for KerasHub

Checklist

  • I have added all the necessary unit tests for my change.
  • I have verified that my change does not break existing code and works with all backends (TensorFlow, JAX, and PyTorch).
  • My PR is based on the latest changes of the main branch (if unsure, rebase the code).
  • I have followed the Keras Hub Model contribution guidelines in making these changes.
  • I have followed the Keras Hub API design guidelines in making these changes.
  • I have signed the Contributor License Agreement.

@harshaljanjani harshaljanjani self-assigned this Jul 9, 2025
@divyashreepathihalli
Collaborator

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces the D-FINE model to KerasHub, including its architecture, layers, tests, and a checkpoint conversion script. The implementation is comprehensive and well-structured. I've provided a few suggestions to improve code clarity, maintainability, and correctness. Overall, this is a solid contribution.

@harshaljanjani harshaljanjani marked this pull request as ready for review July 12, 2025 19:25
@harshaljanjani
Collaborator Author

harshaljanjani commented Jul 13, 2025

@divyashreepathihalli @mattdangerw D-FINE is ready for its first round of reviews!
As discussed, we're leaving out the task model from the scope of this PR, given the sheer volume of the code, since the task model is a 1000+ LOC effort in itself. I've not only covered the numerics check, but also the examples we'd add to the quickstart notebook on Kaggle once merged, in the Colab notebook linked in the PR description!

Member

@mattdangerw mattdangerw left a comment

Thanks! Nice work. Just some initial comments.

In general, now that this is up and working, let's see if we can find anywhere to cut complexity. Anything we can do to save lines of code (without playing code golf) will probably help keep this maintainable in the future.

@harshaljanjani
Collaborator Author

Thanks for the reviews @mattdangerw. Yeah, let's definitely cut down the complexity wherever possible for maintainability. I'll look into it!

@harshaljanjani
Collaborator Author

@mattdangerw Could you please check if all your comments have been addressed when you have the time, thanks a lot!

@sachinprasadhs sachinprasadhs moved this to In Progress in KerasHub Jul 16, 2025
@harshaljanjani
Collaborator Author

@mattdangerw @divyashreepathihalli
Good day, just a gentle reminder for a timely review of this PR, thank you!

Collaborator

@sachinprasadhs sachinprasadhs left a comment

Thanks for the PR!
I have added a few comments, mainly focusing on our standard design process.

@harshaljanjani
Collaborator Author

@sachinprasadhs
Thanks for taking the time to review. I've addressed the concerns to the best of my ability.

Collaborator

@sachinprasadhs sachinprasadhs left a comment

Thanks for addressing all the comments; this looks better now. There is just one place where you might have missed a change, and I've added a comment there.

@sachinprasadhs sachinprasadhs added the kokoro:force-run Runs Tests on GPU label Jul 25, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Jul 25, 2025
@harshaljanjani
Collaborator Author

@sachinprasadhs Resolved, thanks!

@harshaljanjani harshaljanjani added the kokoro:force-run Runs Tests on GPU label Jul 26, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 20, 2025
@harshaljanjani
Collaborator Author

@sachinprasadhs @divyashreepathihalli None of the test failures here seem to be related to D-FINE (they’re all random HTTPS requests.exceptions.ReadTimeout errors), even after the latest commits from keras-hub:master. For your reference, please check the latest and the penultimate CI run results; all I've done between them is merge. In the penultimate run, at least the CPU tests were passing, but after the keras-hub:master merge, CI is just throwing random errors.

I just wanted to bring to your attention that the functionality is working correctly on my end as I've extensively tested locally, and that none of the commits I’ve introduced are breaking in nature.

@sachinprasadhs
Collaborator

Some suggestions for improving the object detector:

Leverage Keras Logic: Let's optimize the code by replacing custom logic with existing Keras implementations where possible.
CIoU Loss: Wherever GIoU is used, check whether you can use Keras's built-in CIoU loss instead, which is an improved version of GIoU. You can find it here.
Standardize Bounding Box Functions: Review Keras's bounding box utilities. If any of your custom functions offer general utility, consider opening a PR to contribute them to Keras.
Isolate Custom Loss: To improve code organization, please move the D-FINE-specific loss function into a new file.
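
For readers comparing the two losses mentioned above, here is a minimal pure-Python sketch of how IoU, GIoU, and CIoU differ for a single pair of boxes. It is not the Keras implementation and not the code in this PR; the function name `iou_family` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration, and both boxes are assumed to have positive area.

```python
import math


def iou_family(box_a, box_b):
    """Return (iou, giou, ciou) for two boxes in (x1, y1, x2, y2) format.

    Hypothetical helper for illustration; assumes both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Width and height of the smallest box enclosing both inputs.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    enclose_area = cw * ch

    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    giou = iou - (enclose_area - union) / enclose_area

    # CIoU subtracts a normalized center-distance term and an aspect-ratio term.
    center_dist_sq = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + (
        (ay1 + ay2) / 2 - (by1 + by2) / 2
    ) ** 2
    diag_sq = cw**2 + ch**2
    v = (4 / math.pi**2) * (
        math.atan((bx2 - bx1) / (by2 - by1)) - math.atan((ax2 - ax1) / (ay2 - ay1))
    ) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0
    ciou = iou - center_dist_sq / diag_sq - alpha * v

    return iou, giou, ciou


# Two partially overlapping squares; the corresponding losses are 1 - value.
print(iou_family((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

Each loss is one minus the corresponding value, so GIoU and CIoU both still give a useful signal when the boxes do not overlap, while CIoU additionally penalizes center offset and aspect-ratio mismatch.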

@sachinprasadhs
Collaborator

Let's apply the same principles to our utility functions. We can improve them by:

Optimizing for performance and readability.

Prioritizing built-in Keras functions to reduce custom code.

For instance, I believe we can use some of the functions like

If you feel anything can be moved to core Keras functionality for other use cases, move it there.

Collaborator

@sachinprasadhs sachinprasadhs left a comment

I have made some suggestions to both the object_detector and utils files.

@harshaljanjani
Collaborator Author

harshaljanjani commented Aug 22, 2025

@sachinprasadhs Learned about the existing utilities here; thanks! I think you’ll find the latest abstractions much better. I’ve incorporated your reviews along with some additional changes that I could gather from your insights (except for decode_deltas_to_boxes and encode_box_to_deltas, since they're for anchor-based models and D-FINE is anchor-free).
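
To illustrate why delta encoding belongs to anchor-based detectors, here is a minimal sketch of the common anchor-relative parameterization. It is not the Keras utility itself: the function names below are hypothetical, the `(cx, cy, w, h)` format is an assumption, and the real `encode_box_to_deltas` / `decode_deltas_to_boxes` may differ in box format and scaling.

```python
import math


def encode_to_anchor_deltas(anchor, box):
    """Express `box` relative to `anchor`; both are (cx, cy, w, h). Hypothetical helper."""
    acx, acy, aw, ah = anchor
    bcx, bcy, bw, bh = box
    return (
        (bcx - acx) / aw,   # center offset in units of anchor width
        (bcy - acy) / ah,   # center offset in units of anchor height
        math.log(bw / aw),  # log-scale width ratio
        math.log(bh / ah),  # log-scale height ratio
    )


def decode_from_anchor_deltas(anchor, deltas):
    """Invert `encode_to_anchor_deltas`, recovering a (cx, cy, w, h) box."""
    acx, acy, aw, ah = anchor
    dx, dy, dw, dh = deltas
    return (
        acx + dx * aw,
        acy + dy * ah,
        aw * math.exp(dw),
        ah * math.exp(dh),
    )


anchor = (50.0, 50.0, 20.0, 40.0)
box = (55.0, 46.0, 30.0, 20.0)
deltas = encode_to_anchor_deltas(anchor, box)
print(decode_from_anchor_deltas(anchor, deltas))
```

Both directions need a predefined anchor box as a reference, which is why they have no natural place in an anchor-free model.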

Note: For now, I haven’t moved the loss computation to another file. I’ll do so after you review the diff; otherwise, the diff would just appear as a block of red and a block of green, making it difficult to spot the actual changes.

@divyashreepathihalli
Collaborator

/gemini review

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Aug 25, 2025

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces the D-FINE model, a state-of-the-art real-time object detector, to KerasHub. The contribution is comprehensive, including the backbone, object detector task, preprocessor, extensive tests, and a checkpoint conversion script. The implementation is of high quality, adhering well to backend-agnostic principles by using keras.ops. I've identified a few minor style guide violations related to incomplete docstrings and the use of type hints in function signatures across the new modules. Additionally, I've suggested a small improvement to the checkpoint conversion script to make it more robust. Overall, this is an excellent and substantial addition to the library.

@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 25, 2025
@keras-team keras-team deleted a comment from gemini-code-assist bot Aug 25, 2025
@keras-team keras-team deleted a comment from gemini-code-assist bot Aug 25, 2025
Collaborator

@sachinprasadhs sachinprasadhs left a comment

Just some nits.

@sachinprasadhs
Collaborator

After that, you can move the losses to a different file; everything else looks good to me. Thanks!

@harshaljanjani
Collaborator Author

@sachinprasadhs Will cover these, thanks!

@harshaljanjani
Collaborator Author

harshaljanjani commented Aug 26, 2025

@sachinprasadhs @divyashreepathihalli Covered everything (human reviews + whatever seemed right from Gemini). Will run GPU tests to keep it ready for once you're back 🙂

@harshaljanjani harshaljanjani added the kokoro:force-run Runs Tests on GPU label Aug 26, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 26, 2025
@harshaljanjani
Collaborator Author

Umm, interesting.
This looks like a CI issue; CI was green at merge time yesterday.
I wonder why it's throwing a ResourceExhaustedError for an "S" category preset with weights of only 645 MB, which is the smallest T5Gemma preset available (the preset I uploaded is here, and it matches the original google/t5gemma-s-s-ul2).

Please note the CI test failures are not related to the code of D-FINE or T5Gemma.

Build log for TF (ResourceExhaustedError on the smallest T5Gemma preset):
https://btx.cloud.google.com/invocations/3157d0c5-07c8-41ce-85bc-343150445a88/targets/keras_hub%2Fgithub%2Fubuntu%2Fgpu%2Ftensorflow%2Fpresubmit/log

@harshaljanjani harshaljanjani added the kokoro:force-run Runs Tests on GPU label Aug 26, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 26, 2025
@divyashreepathihalli divyashreepathihalli merged commit 0c04f88 into keras-team:master Aug 27, 2025
10 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in KerasHub Aug 27, 2025
@harshaljanjani harshaljanjani deleted the d-fine branch August 27, 2025 03:22

5 participants