Revert active-buffer-size-gb arg name. #2257

lmcafee-nvidia · 2025-11-14T17:37:44Z

What does this PR do?

Revert a recent name change. This PR switches --inference-dynamic-batching-active-buffer-size-gb back to --inference-dynamic-batching-buffer-size-gb. The total buffer size now also depends on the setting of --inference-dynamic-batching-unified-memory-level:

uvm level 0: total buffer size == buffer_size_gb
uvm level 1: total buffer size == 2 * buffer_size_gb
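
To make the relationship above concrete, here is a minimal Python sketch of the arithmetic. The function name `total_buffer_size_gb` and its parameter names are hypothetical and chosen for illustration; they are not taken from the Megatron-LM code. The sketch covers only the two unified memory levels listed in this PR description and treats any other level as an error, which is an assumption.

```python
def total_buffer_size_gb(buffer_size_gb: float, unified_memory_level: int) -> float:
    """Return the total buffer size implied by the two settings.

    Hypothetical helper: it mirrors the rule stated in the PR description,
    not the actual implementation.
    """
    if unified_memory_level == 0:
        # uvm level 0: the total equals the requested buffer size.
        return buffer_size_gb
    if unified_memory_level == 1:
        # uvm level 1: the total is twice the requested buffer size.
        return 2 * buffer_size_gb
    # Assumption: levels other than 0 and 1 are not described here.
    raise ValueError(f"unsupported unified memory level: {unified_memory_level}")


if __name__ == "__main__":
    for level in (0, 1):
        print(f"uvm level {level}: {total_buffer_size_gb(40.0, level)} GB total")
```

With a 40 GB setting, this gives 40 GB at uvm level 0 and 80 GB at uvm level 1, so the same flag value can imply different total memory use depending on the unified memory level.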

⚠️ For major changes (either in lines of code or in impact), please first share and discuss a design doc with the team.

Contribution process

flowchart LR
    A[Pre-checks] --> B[PR Tests]
    subgraph Code Review/Approval
        C1[Expert Review] --> C2[Final Review]
    end
    B --> C1
    C2 --> D[Merge]

Pre-checks

I want this PR in a versioned release and have added the appropriate Milestone (e.g., Core 0.8)
I have added relevant unit tests
I have added relevant functional tests
I have added proper typing to my code (see the Typing guidelines)
I have added relevant documentation
I have run the autoformatter.sh on my PR

Code review

The following process is enforced via the CODEOWNERS file for changes into megatron/core. For changes outside of megatron/core, it is up to the PR author whether or not to tag the Final Reviewer team.

For MRs into `main` branch

(Step 1): Add PR label `Expert Review`

(Step 2): Collect the expert reviewers' reviews

Attach the Expert Review label when your PR is ready for review.
GitHub auto-assigns expert reviewers based on your changes. They will get notified and pick up your PR soon.

⚠️ Only proceed to the next step once all reviewers have approved, merge conflicts are resolved, and the CI is passing.
Final Review might be declined if these requirements are not fulfilled.

(Step 3): Final Review

Add Final Review label
GitHub auto-assigns final reviewers based on your changes. They will get notified and pick up your PR soon.

(Optional Step 4): Cherry-pick into release branch

If this PR also needs to be merged into core_r* release branches, after this PR has been merged, select Cherry-pick to open a new PR into the release branch.

For MRs into `dev` branch

The proposed review process for `dev` branch is under active discussion.

MRs are mergeable after one approval by either [email protected] or [email protected].

Merging your PR

Any member of core-adlr and core-nemo will be able to merge your PR.

lmcafee-nvidia added 3 commits November 14, 2025 08:42

renamed active_buffer_size -> buffer_size, in most files.

0ec0350

system runs with new buffer_size_gb arg.

98869b9

clean up.

12cee4e

lmcafee-nvidia added this to the Core 0.15 milestone Nov 14, 2025

lmcafee-nvidia self-assigned this Nov 14, 2025

lmcafee-nvidia requested review from a team as code owners November 14, 2025 17:37

lmcafee-nvidia added the Expert Review label Nov 14, 2025

copy-pr-bot bot temporarily deployed to nemo-ci November 14, 2025 17:37 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci November 14, 2025 17:38 Failure

lmcafee-nvidia requested review from kanz-nv, sidsingh-nvidia and tdene and removed request for sidsingh-nvidia November 14, 2025 17:38

copy-pr-bot bot had a problem deploying to public November 14, 2025 17:41 Failure

copy-pr-bot bot temporarily deployed to public November 14, 2025 17:41 Inactive

format.

00330dd

copy-pr-bot bot temporarily deployed to nemo-ci November 14, 2025 17:44 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci November 14, 2025 17:44 Failure

copy-pr-bot bot temporarily deployed to nemo-ci November 14, 2025 17:44 Inactive

copy-pr-bot bot had a problem deploying to nemo-ci November 14, 2025 17:44 Failure

copy-pr-bot bot temporarily deployed to test November 14, 2025 17:44 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci November 14, 2025 17:47 Inactive

copy-pr-bot bot had a problem deploying to public November 14, 2025 17:47 Failure

copy-pr-bot bot temporarily deployed to public November 14, 2025 17:47 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci November 14, 2025 17:48 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci November 19, 2025 19:47 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci November 19, 2025 19:48 Inactive

copy-pr-bot bot temporarily deployed to test November 19, 2025 19:48 Inactive

copy-pr-bot bot temporarily deployed to public November 19, 2025 19:51 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci November 19, 2025 19:53 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci November 19, 2025 19:57 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci November 19, 2025 20:12 Inactive

lmcafee-nvidia added this pull request to the merge queue Nov 19, 2025

Merged via the queue into NVIDIA:main with commit 21968ea Nov 19, 2025
45 checks passed

lmcafee-nvidia deleted the lmcafee/revert-active-buffer-size-name branch November 19, 2025 20:23

6 participants