Add D-FINE to KerasHub #2318
/gemini review
Code Review
This pull request introduces the D-FINE model to KerasHub, including its architecture, layers, tests, and a checkpoint conversion script. The implementation is comprehensive and well-structured. I've provided a few suggestions to improve code clarity, maintainability, and correctness. Overall, this is a solid contribution.
@divyashreepathihalli @mattdangerw D-FINE is ready for its first round of reviews!
Thanks! Nice work. Just some initial comments.
In general, now that this is up and working, let's see if we can find anywhere to cut complexity. Anything we can do to save lines of code (without playing code golf) will probably help keep this maintainable in the future.
Thanks for the reviews @mattdangerw. Yeah, let's definitely cut down the complexity wherever possible for maintainability. I'll look into it!
@mattdangerw Could you please check whether all your comments have been addressed when you have the time? Thanks a lot!
@mattdangerw @divyashreepathihalli
Thanks for the PR!
I have added a few comments, mainly focusing on our standard design process.
@sachinprasadhs
Thanks for addressing all the comments, this looks better now. There is just one place where you might have missed making the change; I have added a comment there.
@sachinprasadhs Resolved, thanks!
@sachinprasadhs @divyashreepathihalli None of the test failures here seem to be related to D-FINE (they’re all random HTTPS …). I just wanted to bring to your attention that the functionality works correctly on my end, as I've tested it extensively locally, and that none of the commits I’ve introduced are breaking in nature.
Some suggestions to improve the object detector: Leverage Keras logic. Let's optimize the code by replacing custom logic with existing Keras implementations where possible.
Let's apply the same principles to our utility functions. We can improve them by optimizing for performance and readability, and by prioritizing built-in Keras functions to reduce custom code. For instance, I believe we can use some of the functions like … If you feel anything can be moved to core Keras functionality for other use cases, move it there.
I have made some suggestions for both the object_detector and utils files.
@sachinprasadhs Learned about the existing utilities here; thanks! I think you’ll find the latest abstractions much better. I’ve incorporated your reviews along with some additional changes that I could gather from your insights (except for decode_deltas_to_boxes and encode_box_to_deltas, since they're for anchor-based models and D-FINE is anchor-free). Note: For now, I haven’t moved the loss computation to another file. I’ll do so after you review the diff; otherwise, the diff would just appear as a block of red and a block of green, making it difficult to spot the actual changes.
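For readers unfamiliar with the distinction drawn above, here is a minimal sketch of what anchor-based delta encoding typically looks like, using the common R-CNN-style parameterization. The function names and the (cx, cy, w, h) box layout are hypothetical and chosen for illustration; this is not the KerasHub implementation of decode_deltas_to_boxes or encode_box_to_deltas. The point is only that both directions need a reference anchor box, which an anchor-free model such as D-FINE does not have.

```python
import math


def encode_box_to_anchor_deltas(box, anchor):
    """Encode a box relative to an anchor (hypothetical helper).

    Both `box` and `anchor` are (cx, cy, w, h) tuples. The center offset is
    scaled by the anchor size, and the size ratio is taken in log space.
    """
    cx, cy, w, h = box
    acx, acy, aw, ah = anchor
    return (
        (cx - acx) / aw,
        (cy - acy) / ah,
        math.log(w / aw),
        math.log(h / ah),
    )


def decode_anchor_deltas_to_box(deltas, anchor):
    """Invert the encoding above: recover (cx, cy, w, h) from deltas."""
    dx, dy, dw, dh = deltas
    acx, acy, aw, ah = anchor
    return (
        acx + dx * aw,
        acy + dy * ah,
        aw * math.exp(dw),
        ah * math.exp(dh),
    )


if __name__ == "__main__":
    anchor = (50.0, 50.0, 20.0, 40.0)
    box = (55.0, 48.0, 30.0, 20.0)
    deltas = encode_box_to_anchor_deltas(box, anchor)
    print(deltas)
    print(decode_anchor_deltas_to_box(deltas, anchor))
```

Every call takes an `anchor` argument, so these utilities have nothing to operate on in a model that predicts boxes without predefined anchors.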
/gemini review
Code Review
This pull request introduces the D-FINE model, a state-of-the-art real-time object detector, to KerasHub. The contribution is comprehensive, including the backbone, object detector task, preprocessor, extensive tests, and a checkpoint conversion script. The implementation is of high quality, adhering well to backend-agnostic principles by using keras.ops. I've identified a few minor style guide violations related to incomplete docstrings and the use of type hints in function signatures across the new modules. Additionally, I've suggested a small improvement to the checkpoint conversion script to make it more robust. Overall, this is an excellent and substantial addition to the library.
Just some nits.
After that, you can move the losses to a different file. Everything else looks good to me. Thanks!
@sachinprasadhs Will cover these, thanks!
@sachinprasadhs @divyashreepathihalli Covered everything (human reviews + whatever seemed right from Gemini). Will run GPU tests to keep it ready for once you're back 🙂
Umm, interesting. Please note the CI test failures are not related to the code of D-FINE or T5Gemma. Build log for TF (
Description of the change
Welcome D-FINE to the KerasHub family of models!
D-FINE, a powerful real-time object detector, sets a new state-of-the-art benchmark for object detection on KerasHub. It achieves outstanding localization precision by redefining the bounding box regression task in DETR models. Additionally, it incorporates lightweight optimizations in computationally intensive modules and operations, striking a better balance between speed and accuracy. Specifically, D-FINE-L/X achieves 54.0%/55.8% AP on the COCO dataset at 124/78 FPS on an NVIDIA T4 GPU. When pretrained on Objects365, D-FINE-L/X attains 57.1%/59.3% AP, surpassing all existing real-time detectors.
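As a rough illustration of what treating box regression as a distribution can mean, the sketch below turns per-bin logits for one box edge into an expected offset. This is a simplified stand-in with uniformly spaced bins, not D-FINE's actual formulation or the code in this PR; the function names, bin layout, and `max_offset` parameter are assumptions made for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_edge_offset(logits, max_offset):
    """Expected offset of one box edge (hypothetical helper).

    The logits define a discrete distribution over `len(logits)` uniformly
    spaced candidate offsets in [-max_offset, max_offset]. Uniform spacing
    is an assumption of this sketch.
    """
    n = len(logits)
    probs = softmax(logits)
    bins = [-max_offset + 2.0 * max_offset * i / (n - 1) for i in range(n)]
    return sum(p * b for p, b in zip(probs, bins))


if __name__ == "__main__":
    # A flat distribution is symmetric, so the expected offset is zero.
    print(expected_edge_offset([0.0] * 5, max_offset=1.0))
    # A distribution peaked on the last bin pushes the edge outward.
    print(expected_edge_offset([0.0, 0.0, 0.0, 0.0, 8.0], max_offset=1.0))
```

Compared with predicting a single coordinate, a distribution like this also carries how uncertain the model is about each edge.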
Closes the second half of issue #2271, and with it the issue as a whole.
Results in Action of KerasHub's D-FINE
Colab Notebook
The D-FINE Fine-Tuning, Prediction and Numerics Verification Notebook for KerasHub
Checklist