Conversation

@vignesh1507 (Contributor)

I fixed a runtime error and cleaned up minor issues in `pretrain_retro.py`:

What I changed

- Added missing import: `from importlib import import_module`. Reason: `import_module(...)` is called in `core_model_provider` when `args.spec` is set; without this import the code throws a `NameError` at runtime (see the sketch after this list).
- Removed unused import: `MegatronTokenizer` was imported but never used; removing it avoids noise and lint warnings.
- Improved log message: changed `print_rank_0('building GPT model ...')` to `print_rank_0('building Retro model ...')` to accurately reflect the model being constructed.
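
To make the failure mode concrete, here is a minimal, self-contained sketch of the bug class (the helper name and spec format are placeholders, not the actual `pretrain_retro.py` code; there the call sits in `core_model_provider` behind `args.spec`):

```python
from importlib import import_module  # the one-line fix this PR adds

def load_spec(spec_path):
    # Without the import above, this call raises
    #   NameError: name 'import_module' is not defined
    # the first time the spec-loading path is taken.
    return import_module(spec_path)

print(load_spec("json"))  # <module 'json' from '...'>
```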
Why

- The missing import is a definite runtime bug that breaks the experimental spec-loading path (`args.spec`).
- Removing unused imports keeps the codebase clean and avoids linter complaints.
- Updating the log message reduces confusion when reading logs: this script builds a Retro model, not a GPT model.
Testing / How I validated

- Static review to confirm the `import_module` call site and the missing import (a sketch of one way to script this check follows the list).
- Confirmed `MegatronTokenizer` is not referenced anywhere else in the file.
- Confirmed the corrected log message matches the behavior of `core_model_provider`.
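
For anyone who wants to repeat the static check, here is a small sketch using the standard-library `ast` module (run from the directory containing the file; the filename is the one under review, everything else is generic):

```python
# Sketch of the static validation: parse pretrain_retro.py and check that
# import_module is imported and that MegatronTokenizer is never referenced.
import ast

tree = ast.parse(open("pretrain_retro.py").read())

# Names actually referenced in the body (import statements create aliases,
# not ast.Name nodes, so unused imports do not show up here).
used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

# Names bound by import / from-import statements.
imported = set()
for node in ast.walk(tree):
    if isinstance(node, (ast.Import, ast.ImportFrom)):
        imported.update(alias.asname or alias.name.split(".")[0]
                        for alias in node.names)

print("import_module imported:", "import_module" in imported)
print("import_module used:    ", "import_module" in used)
print("MegatronTokenizer used:", "MegatronTokenizer" in used)
```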
Risk / Backward compatibility

- Low risk: the change only adds an import, removes an unused import, and updates a log string. No behavior or functionality changes to the model logic.
- If other modules relied on `MegatronTokenizer` being importable from this module (unlikely), they should import it explicitly from `megatron.core.tokenizers`, as shown below.
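
For completeness, the direct import any such dependent module would need (module path taken from the note above; I have not verified it against the current tree):

```python
# Replaces reliance on the transitive import through pretrain_retro.
from megatron.core.tokenizers import MegatronTokenizer
```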

@copy-pr-bot (bot) commented Nov 2, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@yashaswikarnati (Contributor)

/ok to test d019406

@github-actions (bot) commented Nov 2, 2025

Thank you for your contribution!

NVIDIA Megatron-LM is currently transitioning to development on Github. We will aim to review your PR after we complete our transition and stabilize our Github development process.

Thank you for your understanding.
