Stack 2/2: Adding early dropout to drop path feature using linear decay. #642

Open
alxmrs wants to merge 7 commits into main from u/alxmrs/early-sd

@alxmrs alxmrs commented Mar 10, 2026

Member

This paper[1] and its corresponding repo[2] suggest that dropout is even more effective when it is used only early in training. The paper reports that early dropout in ConvNeXt UNets works better than either using stochastic-depth (s.d.) dropout throughout training or using no dropout at all:

[Image: Screenshot 2026-03-10 at 3:48:51 PM]

Thus, this PR adds an enhancement to #641 that lets users define the number of epochs over which the dropout rate decays linearly until it reaches zero. This provides a simple way to specify early dropout in our experiments. According to [2], this gives just as much benefit as more complex early-dropout schemes while minimizing the number of things we need to configure.
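For readers who want the schedule spelled out, here is a minimal sketch of a linear decay to zero. The function name, its parameters, and the choice to treat a non-positive `decay_epochs` as "no decay" are assumptions for illustration; this is not the code in this PR.

```python
def linear_drop_path_rate(base_rate: float, epoch: int, decay_epochs: int) -> float:
    """Hypothetical helper: linearly decay `base_rate` to zero.

    `epoch` is assumed to be 0-indexed, so epoch 0 uses the full rate and
    every epoch >= `decay_epochs` uses a rate of zero.
    """
    if decay_epochs <= 0:
        # Assumption: a non-positive value disables the decay entirely.
        return base_rate
    # Fraction of the base rate that remains, clamped at zero.
    remaining = max(0.0, 1.0 - epoch / decay_epochs)
    return base_rate * remaining


if __name__ == "__main__":
    for epoch in range(0, 12, 2):
        print(epoch, linear_drop_path_rate(0.2, epoch, decay_epochs=10))
```

A single "number of epochs" knob keeps the configuration surface small, which matches the motivation above.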

@alxmrs alxmrs changed the title from "Stack 2/2 Adding early s.d. dropout to drop path feature using linear decay." to "Stack 2/2: Adding early s.d. dropout to drop path feature using linear decay." on Mar 10, 2026
@alxmrs alxmrs force-pushed the u/alxmrs/early-sd branch from 93e5c02 to 50005bc on March 10, 2026 22:41
Comment thread src/ocean_emulators/models/base.py Outdated

alxmrs commented Mar 11, 2026

Member Author

@codex May I have a review?

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d5f89d148f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/ocean_emulators/train.py Outdated

alxmrs commented Apr 1, 2026

Member Author

@codex may I have another review from you?

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 74a4b659f0


# Early stochastic depth: decay drop path rate over training.
if isinstance(self.model, BaseModel):
    self.model.set_epoch(epoch - 1)

P2: Align drop-path decay with configured epoch count

Calling set_epoch(epoch - 1) shifts the early stochastic-depth schedule by one epoch, so with drop_path_epochs = N the drop-path rate does not reach zero until epoch N+1 (e.g., at epoch 10 it is still nonzero when configured for 10). This conflicts with the new config/docs that say the rate decays to zero over the specified number of epochs and can skew experiment reproducibility when users tune drop_path_epochs expecting an exact cutoff.
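To make the shift described above concrete, here is a small sketch. It assumes a training loop that counts epochs from 1 and a linear schedule of the form `base_rate * max(0, 1 - e / N)`, where `e` is the value passed to `set_epoch`; both are assumptions, since the schedule implementation is not shown in this thread, and `decayed_rate` is a hypothetical helper.

```python
def decayed_rate(schedule_epoch: int, drop_path_epochs: int, base_rate: float = 0.1) -> float:
    """Hypothetical linear schedule: full rate at 0, zero at `drop_path_epochs`."""
    return base_rate * max(0.0, 1.0 - schedule_epoch / drop_path_epochs)


N = 10  # stands in for drop_path_epochs

if __name__ == "__main__":
    for loop_epoch in (1, N, N + 1):
        shifted = decayed_rate(loop_epoch - 1, N)  # mirrors set_epoch(epoch - 1)
        unshifted = decayed_rate(loop_epoch, N)    # mirrors set_epoch(epoch)
        print(f"epoch={loop_epoch} shifted={shifted:.4f} unshifted={unshifted:.4f}")
```

Under these assumptions, the shifted variant starts the first epoch at the full rate and first reaches zero at loop epoch N + 1, while the unshifted variant reaches zero at loop epoch N but never trains at the full rate.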

@alxmrs alxmrs force-pushed the u/alxmrs/early-sd branch 2 times, most recently from 8b3db0d to b0628f1 on April 22, 2026 18:11
@alxmrs alxmrs changed the title from "Stack 2/2: Adding early s.d. dropout to drop path feature using linear decay." to "Stack 2/2: Adding early dropout to drop path feature using linear decay." on Apr 22, 2026
@alxmrs alxmrs force-pushed the u/alxmrs/early-sd branch from b0628f1 to a5d3264 on April 24, 2026 03:14

alxmrs commented Apr 24, 2026

Member Author

@codex may I have your review?

@alxmrs alxmrs requested a review from jder April 24, 2026 03:15
@alxmrs alxmrs marked this pull request as ready for review April 24, 2026 03:15
Comment thread src/ocean_emulators/config.py Outdated
Base automatically changed from u/alxmrs/sd to main April 24, 2026 03:28

jder commented Apr 29, 2026

Member

@fomo-bot can you rebase this on the latest main given the changes that happened on the (previous) base PR please?

@fomo-bot fomo-bot force-pushed the u/alxmrs/early-sd branch from 4f0b75b to 07a0cea on April 29, 2026 17:55
@fomo-bot

Collaborator

Rebased u/alxmrs/early-sd onto the current origin/main tip 96407bf1 and force-pushed the rewritten branch to GitHub. The old stacked #641 base commits were removed from this branch history, and only the #642 commits were replayed on top of main.



I resolved the train.py conflicts by keeping main’s broader sampler set_epoch hook and preserving this PR’s final model.set_epoch(epoch - 1) behavior in src/ocean_emulators/train.py:443.
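As a rough illustration of that resolution, the sketch below gives any sampler exposing `set_epoch` the current epoch and gives the model `epoch - 1`. The stub classes and the `begin_epoch` helper are hypothetical stand-ins, not the code in `train.py`; only the `set_epoch` method name, the `BaseModel` check, and the `epoch - 1` argument come from this thread.

```python
class BaseModel:
    """Stand-in for the project's BaseModel; records the schedule epoch."""

    def __init__(self) -> None:
        self.schedule_epoch = None

    def set_epoch(self, epoch: int) -> None:
        self.schedule_epoch = epoch


class StubSampler:
    """Stand-in for any sampler that exposes set_epoch."""

    def __init__(self) -> None:
        self.epoch = None

    def set_epoch(self, epoch: int) -> None:
        self.epoch = epoch


def begin_epoch(epoch: int, sampler: object, model: object) -> None:
    """Hypothetical per-epoch hook combining both behaviors."""
    # Duck-typed sampler hook: call set_epoch only if the sampler provides it.
    set_sampler_epoch = getattr(sampler, "set_epoch", None)
    if callable(set_sampler_epoch):
        set_sampler_epoch(epoch)
    # The model's drop-path schedule receives the shifted epoch.
    if isinstance(model, BaseModel):
        model.set_epoch(epoch - 1)


if __name__ == "__main__":
    sampler, model = StubSampler(), BaseModel()
    begin_epoch(3, sampler, model)
    print(sampler.epoch, model.schedule_epoch)
```

Samplers without a `set_epoch` method are simply skipped, which is the point of the broader hook.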



Verified with uv run pytest tests/test_blocks.py and uv run pytest tests/test_samplers.py; both passed.

Codex work

Codex transcript

- I’m checking the branch state and the PR’s current base/head so I can rebase the right stack point and avoid rewriting the wrong history.
- I’ve confirmed the PR branch is `u/alxmrs/early-sd` and the PR now targets `main`. Next I’m fetching the current remote state and checking the branch ancestry so I can rebase only the intended commits.
- The remote fetch is done. I’m looking at the commit ancestry now to separate this PR’s own commits from the earlier stacked base, because a plain `rebase main` would keep the wrong commit range if the stack root moved.
- The branch still contains the old stacked base commits from `#641`, while `origin/main` already has that work as the squashed merge commit `0a633c8d`. I’m confirming the stack boundary against `#641`, then I’ll rebase only the `#642` commits on top of current `origin/main`.
- I’ve confirmed the boundary: commits through `ebedf89e` are the old `#641` base, and the actual `#642` work starts at `c7913a4f`. I’m saving a local backup ref, then rebasing `c7913a4f..HEAD` onto `origin/main`.
- The rebase is in progress and hit one conflict in `src/ocean_emulators/train.py` while replaying `Fix off by one error`. I’m checking the conflict against the current mainline version and the rebased patch so I can resolve it without dropping the earlier review fixes.
- Main has already generalized the sampler epoch hook to any sampler that exposes `set_epoch`, so the conflict is not just textual. I’m checking the sampler types to preserve that generalization while still carrying forward the off-by-one fix from this PR.
- I’m resolving the conflict in favor of current main’s broader sampler hook, and keeping the model epoch wiring in the pre-`662e81e8` state so the remaining commits can replay normally.
- The next commit, `oops, wrong epochs.`, conflicted in the same spot. I’m resolving this one to the intended final behavior from the PR: samplers stay on the current epoch, while the model’s dropout schedule uses `epoch - 1`.
- Git stopped for the commit message because this environment has no interactive editor. I’m resuming the rebase non-interactively with the existing message.
- The rebase completed. I’m verifying the rewritten history against `origin/main`, then I’ll run the targeted dropout/block tests and push the rebased branch with `--force-with-lease`.
- The rebase itself is clean. The first test attempt failed because `pytest` isn’t on this shell path, so I’m checking the project’s configured test runner before pushing.
- The project uses `uv`, so I’ve kicked off `uv run pytest tests/test_blocks.py`. It had to create the local virtualenv first; I’m waiting for the test result now.
- `tests/test_blocks.py` passed after the rebase. I’m running the sampler tests too, since the only conflict resolution touched the epoch hook around sampler scheduling in `train.py`.
- The rebased branch passes the relevant block and sampler tests. I’m doing a final state check, then I’ll force-push with lease to update the PR branch on GitHub.
- Rebased `u/alxmrs/early-sd` onto the current `origin/main` tip `96407bf1` and force-pushed the rewritten branch to GitHub. The old stacked `#641` base commits were removed from this branch history, and only the `#642` commits were replayed on top of main.

I resolved the `train.p...[truncated]

3 participants