Description
Issue Type
Documentation Bug
Source
source
Keras Version
Keras 3.9.2
Custom Code
Yes
OS Platform and Distribution
Linux Ubuntu 24.04
Python version
3.12
GPU model and memory
RTX 4090 with 24564 MiB
Current Behavior?
I believe the example Gradient Centralization for Better Training Performance (https://keras.io/examples/vision/gradient_centralization/#train-the-model-with-gc) contains an invalid comparison between the GC and no-GC models. In the original notebook, the GC model is the already-trained no-GC model recompiled with the GC optimizer, so it starts training from trained weights; this explains its ~71% accuracy in the very first epoch. I've prepared a modified version of the Colab notebook that wraps the model-architecture logic in a make_model() function and calls it twice: once before compiling the no-GC model and once before compiling the GC model. With this change, the GC model is trained independently from a fresh initialization, making the comparison fair.
My corrected Colab example is here: https://colab.research.google.com/drive/1HyTaaKevI1Izv6XsswO2NnktodO6_tuv?usp=sharing
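For illustration, the pattern in my modified notebook looks roughly like this. The architecture is simplified and a plain optimizer stands in for the tutorial's GC optimizer, so treat it as a sketch of the fix rather than the exact notebook code:

```python
import keras
from keras import layers


def make_model():
    """Build a fresh, untrained model on every call.

    A factory like this guarantees the GC and no-GC runs start from
    independent random initializations, rather than the GC run resuming
    from the no-GC model's trained weights.
    (Architecture simplified here; the tutorial uses a larger CNN.)
    """
    return keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(8, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(10, activation="softmax"),
    ])


# Baseline: fresh model, plain RMSprop.
model_no_gc = make_model()
model_no_gc.compile(loss="categorical_crossentropy", optimizer="rmsprop")
# ... model_no_gc.fit(...) ...

# GC run: another fresh model -- NOT the already-trained baseline
# recompiled with a new optimizer. The tutorial's GC optimizer would be
# substituted here; plain "rmsprop" keeps this sketch self-contained.
model_gc = make_model()
model_gc.compile(loss="categorical_crossentropy", optimizer="rmsprop")
# ... model_gc.fit(...) ...
```

The key point is that compiling a model does not reset its weights, so recompiling the trained no-GC model with a GC optimizer continues from its trained state; only rebuilding the model gives a fresh initialization.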
Standalone code to reproduce the issue or tutorial link
This is the original Colab, where the GC model's training starts from the state of the already-trained no-GC model: https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/vision/ipynb/gradient_centralization.ipynb