-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[TRTLLM-5061] chore: add tags to API reference #6123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThis update introduces structured API status tagging throughout the codebase and documentation. It adds status metadata (such as "beta", "prototype", "deprecated") to Pydantic model fields and API methods, updates Sphinx documentation to display these tags, and implements tests to ensure status consistency. Supporting utilities and CSS styles are included. Changes
Sequence Diagram(s)sequenceDiagram
participant Dev as Developer
participant Model as Pydantic Model
participant Utils as ApiParamTagger/ApiStatusRegistry
participant Docs as Sphinx
Dev->>Model: Define field with Field(..., status="beta")
Model->>Utils: Expose field metadata
Utils->>Docs: Tag fields/methods with :tag:`beta`
Docs->>Dev: Render documentation with status tags and custom CSS
sequenceDiagram
participant User as User
participant LLM as TorchLlmArgs
participant Logger as Logger
User->>LLM: Set field values (some with status="beta")
LLM->>LLM: warn_on_unstable_feature_usage()
alt Field is unstable
LLM->>Logger: Log warning about unstable feature usage
end
Estimated code review effort3 (~45 minutes) Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
280e995 to
9b38c27
Compare
|
/bot run |
|
PR_Github #12155 [ run ] triggered by Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (3)
tests/unittest/llmapi/test_utils.py (1)
4-24: Comprehensive test for ApiStatusRegistry functionality.The test thoroughly validates the status registry for both standalone functions and class methods. Consider using unique function names instead of reusing
_my_methodto improve test clarity and avoid potential confusion.@ApiStatusRegistry.set_api_status("beta") -def _my_method(self, *args, **kwargs): +def _my_method_beta(self, *args, **kwargs): pass -assert ApiStatusRegistry.get_api_status(_my_method) == "beta" +assert ApiStatusRegistry.get_api_status(_my_method_beta) == "beta" @ApiStatusRegistry.set_api_status("prototype") -def _my_method(self, *args, **kwargs): +def _my_method_prototype(self, *args, **kwargs): pass -assert ApiStatusRegistry.get_api_status(_my_method) == "prototype" +assert ApiStatusRegistry.get_api_status(_my_method_prototype) == "prototype"tensorrt_llm/llmapi/utils.py (1)
551-555: Fix docstring formattingThe docstring should follow PEP 257 conventions with proper summary and description separation.
class ApiParamTagger: - ''' A helper to tag the api doc according to the status of the fields. - The status is set in the json_schema_extra of the field. - ''' + '''Tag API documentation according to field status. + + The status is set in the json_schema_extra of the field. + '''tensorrt_llm/llmapi/llm_args.py (1)
1856-1856: Fix line length violations.Three lines exceed the 120-character limit. Please reformat these field definitions to comply with the style guidelines.
- "Note that each CUDA graph can use up to 200 MB of extra memory.", + "Note that each CUDA graph can use up to 200 MB of extra memory.", status="beta") - default=False, - description="Disable the overlap scheduler.", + default=False, + description="Disable the overlap scheduler.", status="beta") - "If true, will iterate over sampling_params of each request and use the corresponding sampling strategy, e.g. top-k, top-p, etc.", + "If true, will iterate over sampling_params of each request and use the " + "corresponding sampling strategy, e.g. top-k, top-p, etc.",Also applies to: 1862-1862, 1873-1873
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
docs/source/_static/custom.css(1 hunks)docs/source/conf.py(3 hunks)docs/source/helper.py(2 hunks)tensorrt_llm/llmapi/llm.py(8 hunks)tensorrt_llm/llmapi/llm_args.py(14 hunks)tensorrt_llm/llmapi/utils.py(2 hunks)tests/unittest/api_stability/api_stability_core.py(5 hunks)tests/unittest/api_stability/references/llm.yaml(3 hunks)tests/unittest/api_stability/references_committed/llm.yaml(1 hunks)tests/unittest/llmapi/test_utils.py(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/unittest/llmapi/test_utils.py (1)
tensorrt_llm/llmapi/utils.py (3)
ApiStatusRegistry(588-629)set_api_status(605-618)get_api_status(621-622)
tensorrt_llm/llmapi/utils.py (1)
tensorrt_llm/logger.py (1)
debug(143-144)
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/llm_args.py
1856-1856: Line too long (138 > 120)
(E501)
1862-1862: Line too long (142 > 120)
(E501)
1873-1873: Line too long (127 > 120)
(E501)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (17)
docs/source/_static/custom.css (1)
1-26: Well-structured CSS for API status tags.The CSS implementation is clean and follows good practices:
- Semantic class names for different API status types
- Good color contrast for accessibility
- Consistent styling approach with appropriate padding and sizing
tests/unittest/api_stability/references_committed/llm.yaml (1)
93-95: Properly added stream_interval parameter.The new
stream_intervalparameter follows the correct pattern with appropriate type annotation and default value.docs/source/helper.py (2)
289-299: Clean utility function for extracting Pydantic methods.The implementation correctly creates a dummy Pydantic model to extract inherited methods and properly excludes
__init__from the filtering.
316-327: Improved documentation generation with proper filtering.The updates to autodoc options and selective exclusion of Pydantic methods for specific classes will result in cleaner, more focused API documentation.
docs/source/conf.py (5)
15-15: Proper import for custom role implementation.The
nodesimport from docutils is necessary for implementing the customtagrole.
64-73: Well-configured autodoc_pydantic options.The configuration options provide good control over documentation generation with proper member ordering and selective visibility of validators and signatures.
153-155: Proper CSS file inclusion.The custom CSS file is correctly included to support the API status tag styling.
158-162: Standard implementation of custom Sphinx role.The
tag_rolefunction follows Sphinx conventions for custom role implementation and correctly creates literal nodes with appropriate CSS classes.
168-171: Proper integration of API status tagging system.The setup function correctly imports and calls
tag_llm_params()and registers the customtagrole, integrating the API status system into the documentation build process.tests/unittest/api_stability/api_stability_core.py (1)
407-407: LGTM!Creating a deep copy of the reference before merging is the correct approach to preserve the non-committed API state for validation.
tensorrt_llm/llmapi/utils.py (1)
496-497: LGTM!The change correctly updates the deprecation check to use the new unified
statusfield approach.tests/unittest/api_stability/references/llm.yaml (1)
1-194: LGTM!The status annotations are consistently applied and align with the API stability tracking system implementation.
tensorrt_llm/llmapi/llm.py (2)
43-43: LGTM!The
@set_api_status("beta")decorators are correctly applied to mark the stability status of various API methods and properties.Also applies to: 215-216, 425-426, 439-440, 453-454, 475-476, 672-673
1050-1050: LGTM!The documentation correctly directs users to
TorchLlmArgsfor detailed argument information.tensorrt_llm/llmapi/llm_args.py (3)
64-92: Well-designed custom Field wrapper implementation.The custom Field wrapper cleanly extends Pydantic's Field to support API status metadata. The implementation correctly handles existing json_schema_extra and provides clear documentation for the status levels.
1017-1017: Consistent and appropriate status annotations applied.The status annotations are consistently applied across fields and align well with the API stability framework. The categorization appears logical:
- "beta" for stable but newer features
- "prototype" for experimental features
- "deprecated" for fields with deprecation warnings
Also applies to: 1023-1023, 1034-1036, 1039-1040, 1053-1055, 1057-1062, 1067-1069, 1096-1097, 1101-1102, 1106-1106, 1109-1110, 1113-1115, 1137-1140, 1146-1147, 1151-1152, 1156-1157, 1179-1179, 1837-1839, 1846-1847, 1856-1857, 1861-1862, 1865-1866, 1869-1870, 1875-1876, 1881-1882, 1885-1887, 1892-1893, 1896-1897, 1900-1900, 1904-1905, 1908-1910, 1922-1922, 1936-1936, 1943-1944
2132-2150: Effective implementation of unstable feature warnings.The method properly identifies explicitly set fields with unstable status and provides clear warnings to users. The implementation correctly:
- Uses
exclude_unset=Trueto only check user-provided values- Extracts status from field metadata safely
- Warns only for unstable statuses ("beta", "prototype")
- Provides actionable warning messages
| class ParamSnapshot: | ||
| annotation: type | ||
| default: Any = None | ||
| status: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add status field handling in serialization methods
The new status field should be included in the to_dict() method to ensure it's properly serialized when saving snapshots.
Apply this diff to fix the serialization:
def to_dict(self):
d = {f.name: getattr(self, f.name) for f in fields(self)}
d['annotation'] = repr_annotation(d['annotation'])
if d['default'] == inspect._empty:
d['default'] = "inspect._empty"
+ if d['status'] is None:
+ del d['status']
return dAlso applies to: 115-121
🤖 Prompt for AI Agents
In tests/unittest/api_stability/api_stability_core.py at lines 76 and 115-121,
the new 'status' field is declared but not included in the serialization
methods. Update the to_dict() method to add the 'status' field so it is properly
serialized when saving snapshots. Ensure that any deserialization or related
methods also handle the 'status' field accordingly.
| class MethodSnapshot: | ||
| parameters: Dict[str, ParamSnapshot] | ||
| return_annotation: type | ||
| status: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add status field handling in serialization methods
The new status field should be included in the to_dict() method to ensure it's properly serialized when saving snapshots.
Apply this diff to fix the serialization:
def to_dict(self):
d = {f.name: getattr(self, f.name) for f in fields(self)}
d['parameters'] = {
name: param.to_dict()
for name, param in d['parameters'].items()
}
d['return_annotation'] = repr_annotation(d['return_annotation'])
+ if d['status'] is None:
+ del d['status']
return dAlso applies to: 170-177
🤖 Prompt for AI Agents
In tests/unittest/api_stability/api_stability_core.py at lines 133 and 170-177,
the new 'status' field is declared but not included in the serialization
methods. Update the to_dict() method to add the 'status' field so it is properly
serialized when saving snapshots. Ensure that any other serialization or
deserialization methods in these lines also handle the 'status' field
accordingly.
| if method_name == "__init__": | ||
| for param_name, param_data in method_data.get('parameters', | ||
| {}).items(): | ||
| print(f"param_name: {param_name}, param_data: {param_data}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove debug print statement
Debug print statements should not be committed to production test code.
- print(f"param_name: {param_name}, param_data: {param_data}")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| print(f"param_name: {param_name}, param_data: {param_data}") |
🤖 Prompt for AI Agents
In tests/unittest/api_stability/api_stability_core.py at line 529, remove the
debug print statement that outputs param_name and param_data, as debug prints
should not be present in committed test code.
| def my_method(self, *args, **kwargs): | ||
| pass | ||
| ''' | ||
| method_to_status = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thread safety concern with mutable class variable
Using a mutable class variable method_to_status could cause race conditions in multi-threaded environments.
Consider using a thread-safe approach:
import threading
class ApiStatusRegistry:
_lock = threading.Lock()
method_to_status = {}
@classmethod
def set_api_status(cls, status: str):
def decorator(func):
with cls._lock:
if func.__qualname__ in cls.method_to_status:
logger.debug(
f"Method {func.__qualname__} already has a status, skipping the decorator"
)
return func
cls.method_to_status[func.__qualname__] = status
func.__doc__ = cls.amend_api_doc_with_status_tags(func)
return func
return decorator🤖 Prompt for AI Agents
In tensorrt_llm/llmapi/utils.py at line 605, the mutable class variable
method_to_status is not thread-safe and may cause race conditions. To fix this,
introduce a threading.Lock as a class-level lock and wrap all accesses and
modifications to method_to_status within this lock to ensure thread safety.
Update the decorator to acquire the lock before checking or setting
method_to_status entries and release it afterward.
| set_api_status = ApiStatusRegistry().set_api_status | ||
| get_api_status = ApiStatusRegistry().get_api_status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect singleton instantiation
The class uses classmethods but creates an instance. Either use instance methods or remove the instantiation.
-set_api_status = ApiStatusRegistry().set_api_status
-get_api_status = ApiStatusRegistry().get_api_status
+set_api_status = ApiStatusRegistry.set_api_status
+get_api_status = ApiStatusRegistry.get_api_status📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| set_api_status = ApiStatusRegistry().set_api_status | |
| get_api_status = ApiStatusRegistry().get_api_status | |
| set_api_status = ApiStatusRegistry.set_api_status | |
| get_api_status = ApiStatusRegistry.get_api_status |
🤖 Prompt for AI Agents
In tensorrt_llm/llmapi/utils.py around lines 635 to 636, the code incorrectly
creates an instance of ApiStatusRegistry to call classmethods. Remove the
instantiation and call set_api_status and get_api_status directly on the
ApiStatusRegistry class without creating an object.
|
PR_Github #12155 [ run ] completed with state |
9b38c27 to
fdc6c35
Compare
|
/bot run --stage-list "A100X-PyTorch-1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
tensorrt_llm/llmapi/utils.py (2)
605-605: Thread safety concern with mutable class variableUsing a mutable class variable
method_to_statuscould cause race conditions in multi-threaded environments.
635-636: Incorrect singleton instantiationThe class uses classmethods but creates an instance. Either use instance methods or remove the instantiation.
🧹 Nitpick comments (4)
docs/source/helper.py (1)
289-299: Consider caching the Pydantic methods list for better performance.The function creates a dummy Pydantic BaseModel class every time it's called, which could be inefficient if called multiple times during documentation generation.
Consider caching the result:
+@cache def get_pydantic_methods() -> list[str]: from pydantic import BaseModel class Dummy(BaseModel): pass methods = set( [method for method in dir(Dummy) if not method.startswith('_')]) methods.discard("__init__") return list(methods)Alternatively, use a more direct approach:
+from functools import lru_cache + +@lru_cache(maxsize=1) def get_pydantic_methods() -> list[str]: from pydantic import BaseModel - - class Dummy(BaseModel): - pass - - methods = set( - [method for method in dir(Dummy) if not method.startswith('_')]) - methods.discard("__init__") - return list(methods) + + # Get methods directly from BaseModel without creating a dummy class + methods = set(method for method in dir(BaseModel) if not method.startswith('_')) + methods.discard("__init__") + return list(methods)tensorrt_llm/llmapi/utils.py (2)
551-589: The ApiParamTagger implementation looks solid.The class correctly processes Pydantic model fields and adds status tags to their descriptions. The implementation properly rebuilds the model after modifying field descriptions.
Consider adding some validation for the status values:
def amend_pydantic_field_description_with_tags(self, cls: Type[BaseModel], field_names: list[str], tag: str) -> None: + # Validate status tag + valid_statuses = {"beta", "prototype", "deprecated", "stable"} + if tag not in valid_statuses: + logger.warning(f"Unknown status tag: {tag}. Valid values: {valid_statuses}") + assert field_names for field_name in field_names:
591-633: Address static analysis hints for docstring formatting.The static analysis tools correctly identified docstring formatting issues.
Apply these fixes for better docstring formatting:
class ApiParamTagger: - ''' A helper to tag the api doc according to the status of the fields. + """A helper to tag the api doc according to the status of the fields. + The status is set in the json_schema_extra of the field. - ''' + """ def __call__(self, cls: Type[BaseModel]) -> None: - self.process_pydantic_model(cls) + """Process the Pydantic model to add tags to the fields.""" + self.process_pydantic_model(cls) def process_pydantic_model(self, cls: Type[BaseModel]) -> None: - """Process the Pydantic model to add tags to the fields. - """ + """Process the Pydantic model to add tags to the fields.""" def amend_pydantic_field_description_with_tags(self, cls: Type[BaseModel], field_names: list[str], tag: str) -> None: - """Amend the description of the fields with tags. + """Amend the description of the fields with tags. + e.g. :tag:`beta` or :tag:`prototype` + Args: cls: The Pydantic BaseModel class. field_names: The names of the fields to amend. tag: The tag to add to the fields. """tensorrt_llm/llmapi/llm_args.py (1)
1864-1881: Fix line length violations in field descriptions.Several field descriptions exceed the 120 character limit. Consider breaking these into multiple lines for better readability and code style consistency.
For example, line 1864 could be reformatted as:
description="CUDA graph config.If true, use CUDA graphs for decoding. \ CUDA graphs are only created for the batch sizes in cuda_graph_config.batch_sizes, \ and are enabled for batches that consist of decoding requests *only* \ (the reason is that it's hard to capture a single graph with prefill requests \ since the input shapes are a function of the sequence lengths).\ - Note that each CUDA graph can use up to 200 MB of extra memory.", + Note that each CUDA graph can use up to 200 MB of extra memory.",Similar adjustments can be made to lines 1870 and 1881.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
docs/source/_static/custom.css(1 hunks)docs/source/conf.py(3 hunks)docs/source/helper.py(2 hunks)tensorrt_llm/llmapi/llm.py(8 hunks)tensorrt_llm/llmapi/llm_args.py(14 hunks)tensorrt_llm/llmapi/utils.py(2 hunks)tests/unittest/api_stability/api_stability_core.py(5 hunks)tests/unittest/api_stability/references/llm.yaml(3 hunks)tests/unittest/api_stability/references_committed/llm.yaml(1 hunks)tests/unittest/llmapi/test_utils.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tensorrt_llm/llmapi/llm.py
🚧 Files skipped from review as they are similar to previous changes (6)
- docs/source/_static/custom.css
- tests/unittest/api_stability/references_committed/llm.yaml
- tests/unittest/llmapi/test_utils.py
- tests/unittest/api_stability/api_stability_core.py
- tests/unittest/api_stability/references/llm.yaml
- docs/source/conf.py
🧰 Additional context used
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/llm_args.py
1864-1864: Line too long (138 > 120)
(E501)
1870-1870: Line too long (142 > 120)
(E501)
1881-1881: Line too long (127 > 120)
(E501)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (8)
docs/source/helper.py (1)
316-327: The autodoc options refactoring looks good.The changes properly standardize the autodoc options and add filtering for Pydantic methods on specific classes. This should improve the documentation clarity by hiding irrelevant inherited methods.
tensorrt_llm/llmapi/utils.py (1)
496-496: LGTM! Field name change aligns with the new status system.The change from checking
"deprecated"to"status" == "deprecated"is consistent with the new unified status tagging system introduced in this PR.tensorrt_llm/llmapi/llm_args.py (6)
64-92: Well-designed custom Field wrapper for API status tagging.The implementation correctly extends Pydantic's Field with status metadata support. The function properly handles existing json_schema_extra values and provides clear documentation for the status levels.
1025-1187: Comprehensive API status tagging enhances user guidance.The status annotations on BaseLlmArgs fields provide clear indicators of API stability:
betastatus for features likegpus_per_nodeandenable_attention_dpprototypestatus for experimental features likecp_configand various stats optionsdeprecatedstatus for fields being phased out likemax_lora_rankandbackendThis systematic approach helps users make informed decisions about which features to use in production.
1731-1735: New fields properly implemented.The addition of
batching_typeandnormalize_log_probsfields toTrtLlmArgsfollows the established patterns with appropriate type annotations, default values, and descriptions.
1846-1952: Appropriate status annotations for PyTorch backend features.The status annotations on TorchLlmArgs fields properly reflect the experimental nature of many PyTorch backend features:
betastatus for features likecuda_graph_configandmoe_configprototypestatus for experimental features liketorch_compile_configdeprecatedstatus forbuild_configbeing phased outThis helps users understand the maturity level of PyTorch backend features.
2140-2158: Well-implemented warning system for unstable features.The
warn_on_unstable_feature_usagemethod provides valuable user feedback:
- Only warns about explicitly set fields using
exclude_unset=True- Properly checks for status metadata in
json_schema_extra- Provides clear warning messages for beta and prototype features
- Helps users make informed decisions about feature stability
64-2158: Excellent implementation of API stability tagging system.This comprehensive update introduces a well-designed API stability framework that:
- Extends Pydantic seamlessly - The custom Field wrapper integrates naturally with existing code
- Provides clear stability indicators - Status levels (prototype, beta, deprecated) are well-defined
- Offers user guidance - The warning system helps users understand feature maturity
- Maintains consistency - Status annotations are applied systematically across all argument classes
The implementation demonstrates thoughtful design that enhances the user experience while maintaining backward compatibility.
|
PR_Github #12212 [ run ] triggered by Bot |
|
PR_Github #12212 [ run ] completed with state |
fdc6c35 to
9d0ea69
Compare
|
/bot run --disable-fail-fast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (5)
tests/unittest/api_stability/api_stability_core.py (3)
76-76: Status field serialization needs to be handled.The new
statusfield should be included in theto_dict()method to ensure it's properly serialized when saving snapshots.
133-133: Status field serialization needs to be handled.The new
statusfield should be included in theto_dict()method to ensure it's properly serialized when saving snapshots.
529-529: Remove debug print statement.Debug print statements should not be committed to production test code.
tensorrt_llm/llmapi/utils.py (2)
605-605: Thread safety concern with mutable class variable.Using a mutable class variable
method_to_statuscould cause race conditions in multi-threaded environments. Consider using a thread-safe approach with locks.
635-636: Incorrect singleton instantiation.The code incorrectly creates an instance of
ApiStatusRegistryto call classmethods. The methods should be called directly on the class.
🧹 Nitpick comments (2)
tensorrt_llm/llmapi/utils.py (1)
551-588: Well-designed API parameter tagging functionality.The
ApiParamTaggerclass provides a clean way to add status tags to Pydantic model fields. The implementation correctly:
- Processes model fields with status metadata
- Amends field descriptions with appropriate tags
- Rebuilds the model schema
Consider minor docstring formatting improvements as suggested by static analysis:
- ''' A helper to tag the api doc according to the status of the fields. - The status is set in the json_schema_extra of the field. - ''' + """A helper to tag the api doc according to the status of the fields. + + The status is set in the json_schema_extra of the field. + """tensorrt_llm/llmapi/llm_args.py (1)
1864-1864: Address line length violations for code style consistency.The static analysis tool identified three lines that exceed the 120-character limit. Consider reformatting these field descriptions to comply with the length limit.
For line 1864, consider:
- "Note that each CUDA graph can use up to 200 MB of extra memory.", + "Note that each CUDA graph can use up to 200 MB of extra memory.",For line 1870, break the description:
- description="Disable the overlap scheduler.", + description="Disable the overlap scheduler.",For line 1881, break the description:
- "If true, will iterate over sampling_params of each request and use the corresponding sampling strategy, e.g. top-k, top-p, etc.", + "If true, will iterate over sampling_params of each request and use the " + "corresponding sampling strategy, e.g. top-k, top-p, etc.",Also applies to: 1870-1870, 1881-1881
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
docs/source/_static/custom.css(1 hunks)docs/source/conf.py(3 hunks)docs/source/helper.py(2 hunks)tensorrt_llm/llmapi/llm.py(8 hunks)tensorrt_llm/llmapi/llm_args.py(14 hunks)tensorrt_llm/llmapi/utils.py(2 hunks)tests/unittest/api_stability/api_stability_core.py(5 hunks)tests/unittest/api_stability/references/llm.yaml(3 hunks)tests/unittest/api_stability/references_committed/llm.yaml(1 hunks)tests/unittest/llmapi/test_utils.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
- docs/source/_static/custom.css
- tests/unittest/api_stability/references_committed/llm.yaml
- tests/unittest/llmapi/test_utils.py
- docs/source/helper.py
- docs/source/conf.py
- tests/unittest/api_stability/references/llm.yaml
- tensorrt_llm/llmapi/llm.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
tensorrt_llm/llmapi/utils.py (1)
tensorrt_llm/logger.py (1)
debug(143-144)
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/llm_args.py
1864-1864: Line too long (138 > 120)
(E501)
1870-1870: Line too long (142 > 120)
(E501)
1881-1881: Line too long (127 > 120)
(E501)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (14)
tests/unittest/api_stability/api_stability_core.py (3)
7-7: Import added for debugging - remove if debug prints are cleaned up.The
pprintimport appears to be used for debugging purposes. If debug print statements are removed from the code, this import should also be removed.
407-407: Good addition for isolating non-committed API data.Creating a deep copy of the reference ensures proper isolation when working with non-committed API data for status validation.
452-532: Well-structured API status validation test.The test method effectively validates API status consistency between live code and reference YAML. The logic properly handles:
- Filtering for llm.yaml reference file only
- Checking non-committed APIs
- Validating both method and parameter statuses
- Providing clear error messages for mismatches
The implementation is comprehensive and serves its purpose well.
tensorrt_llm/llmapi/utils.py (3)
496-497: Correct update to status field handling.The change from checking a boolean
deprecatedfield to checkingstatus == "deprecated"is consistent with the new unified status field approach.
591-633: Good API status management functionality.The
ApiStatusRegistryclass provides a clean interface for managing API method statuses with:
- Decorator-based status assignment
- Automatic docstring amendment
- Clear retrieval methods
The functionality is well-designed for its intended purpose.
586-588: Clean utility function for parameter tagging.The
tag_llm_params()function provides a simple and clean interface for applying status tagging to LLM parameters.tensorrt_llm/llmapi/llm_args.py (8)
64-92: Well-implemented custom Field wrapper with clear status documentation.The custom Field function correctly extends Pydantic's Field to add status metadata to the JSON schema. The implementation properly handles existing json_schema_extra values and provides clear documentation of the status levels.
1025-1077: Consistent and appropriate status assignments for existing fields.The status annotations properly reflect the maturity levels of the fields:
- Beta features are correctly marked for features that are recommended but not fully stable
- Prototype features are marked for experimental functionality
- Deprecated fields include both status annotation and deprecation messages
1104-1124: Appropriate prototype status assignments for experimental features.The prototype status is correctly applied to experimental features like iter stats, PEFT cache, scheduler config, and cache transceiver config, which aligns with their experimental nature.
1187-1187: Correct deprecation status for backend field.The deprecated status is appropriate for the backend field, which is already excluded from JSON schema and appears to be legacy functionality.
1731-1735: Well-implemented new fields with appropriate types and defaults.The new
batching_typeandnormalize_log_probsfields are correctly typed with sensible defaults and clear descriptions. The absence of status annotations indicates they're considered stable features.
1845-1952: Comprehensive and appropriate status assignments for PyTorch backend features.The status annotations correctly reflect the maturity levels of PyTorch backend features:
- Beta status for features that are recommended but still maturing
- Prototype status for experimental features like autotuner and torch compile config
- Deprecated status for build_config which is being phased out
2140-2158: Excellent runtime warning implementation for unstable features.The
warn_on_unstable_feature_usagemethod provides valuable user feedback by:
- Only checking explicitly set fields (using
exclude_unset=True)- Properly accessing status metadata from field info
- Providing clear warnings for beta and prototype features
- Helping users understand the stability implications of their configuration
64-2158: Comprehensive and well-executed API status tagging implementation.This implementation effectively establishes a unified mechanism for API status annotation:
- Custom Field wrapper seamlessly extends Pydantic functionality
- Status levels are consistently applied across all field types
- Runtime warnings help users understand stability implications
- The approach integrates well with the broader API status tagging effort described in the AI summary
The implementation demonstrates good architectural design and attention to user experience.
|
PR_Github #12220 [ run ] triggered by Bot |
|
PR_Github #12220 [ run ] completed with state |
9d0ea69 to
53103d7
Compare
|
/bot run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
tests/unittest/api_stability/api_stability_core.py (1)
452-532: Remove debug print statement and consider thread safetyThe test implementation is comprehensive and correctly validates API status consistency. However, there's a debug print statement that should be removed.
The debug print statement on line 529 should be removed as noted in previous reviews.
Additionally, consider the thread safety implications when accessing the
ApiStatusRegistry.method_to_statusdictionary, as noted in theutils.pyreview comments.- print(f"param_name: {param_name}, param_data: {param_data}")tensorrt_llm/llmapi/utils.py (1)
591-636: Fix thread safety and singleton instantiation issuesThe
ApiStatusRegistryimplementation has two critical issues identified in previous reviews that need to be addressed:
- Thread safety concern with the mutable class variable
- Incorrect singleton instantiation pattern
These issues were previously identified but still need to be fixed:
+import threading + class ApiStatusRegistry: - ''' A registry to store the status of the api. + """A registry to store the status of the api. usage: @ApiStatusRegistry.set_api_status("beta") def my_method(self, *args, **kwargs): pass class App: @ApiStatusRegistry.set_api_status("beta") def my_method(self, *args, **kwargs): pass - ''' + """ + _lock = threading.Lock() method_to_status = {} @classmethod def set_api_status(cls, status: str): def decorator(func): + with cls._lock: # Use qualified name to support class methods if func.__qualname__ in cls.method_to_status: logger.debug( f"Method {func.__qualname__} already has a status, skipping the decorator" ) return func cls.method_to_status[func.__qualname__] = status func.__doc__ = cls.amend_api_doc_with_status_tags(func) return func return decorator @classmethod def get_api_status(cls, method: Callable) -> Optional[str]: + with cls._lock: return cls.method_to_status.get(method.__qualname__, None) -set_api_status = ApiStatusRegistry().set_api_status -get_api_status = ApiStatusRegistry().get_api_status +set_api_status = ApiStatusRegistry.set_api_status +get_api_status = ApiStatusRegistry.get_api_status
🧹 Nitpick comments (3)
tests/unittest/api_stability/api_stability_core.py (1)
7-7: Unused import should be removedThe
pprintimport is not used anywhere in the code and should be removed to keep the imports clean.-from pprint import pprinttensorrt_llm/llmapi/utils.py (1)
551-588: ApiParamTagger implementation needs docstring formatting fixesThe
ApiParamTaggerclass implementation is functionally correct and provides a clean way to add status tags to Pydantic model fields. However, there are some docstring formatting issues that should be addressed.Apply these fixes for better docstring formatting:
class ApiParamTagger: - ''' A helper to tag the api doc according to the status of the fields. + """A helper to tag the api doc according to the status of the fields. + The status is set in the json_schema_extra of the field. - ''' + """ def __call__(self, cls: Type[BaseModel]) -> None: self.process_pydantic_model(cls) def process_pydantic_model(self, cls: Type[BaseModel]) -> None: """Process the Pydantic model to add tags to the fields. + """tensorrt_llm/llmapi/llm_args.py (1)
1864-1864: Fix line length violations.Several lines exceed the 120-character limit. Consider breaking these long description strings for better readability.
For line 1864:
- description="CUDA graph config.If true, use CUDA graphs for decoding. \ - CUDA graphs are only created for the batch sizes in cuda_graph_config.batch_sizes, \ - and are enabled for batches that consist of decoding requests *only* \ - (the reason is that it's hard to capture a single graph with prefill requests \ - since the input shapes are a function of the sequence lengths).\ - Note that each CUDA graph can use up to 200 MB of extra memory.", + description="CUDA graph config. If true, use CUDA graphs for decoding. " + "CUDA graphs are only created for the batch sizes in cuda_graph_config.batch_sizes, " + "and are enabled for batches that consist of decoding requests *only* " + "(the reason is that it's hard to capture a single graph with prefill requests " + "since the input shapes are a function of the sequence lengths). " + "Note that each CUDA graph can use up to 200 MB of extra memory.",Similar formatting should be applied to lines 1870 and 1881.
Also applies to: 1870-1870, 1881-1881
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
docs/source/_static/custom.css(1 hunks)docs/source/conf.py(3 hunks)docs/source/helper.py(2 hunks)tensorrt_llm/llmapi/llm.py(8 hunks)tensorrt_llm/llmapi/llm_args.py(14 hunks)tensorrt_llm/llmapi/utils.py(2 hunks)tests/unittest/api_stability/api_stability_core.py(5 hunks)tests/unittest/api_stability/references/llm.yaml(3 hunks)tests/unittest/api_stability/references_committed/llm.yaml(1 hunks)tests/unittest/llmapi/test_utils.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tensorrt_llm/llmapi/llm.py
🚧 Files skipped from review as they are similar to previous changes (6)
- docs/source/_static/custom.css
- tests/unittest/llmapi/test_utils.py
- tests/unittest/api_stability/references_committed/llm.yaml
- docs/source/helper.py
- tests/unittest/api_stability/references/llm.yaml
- docs/source/conf.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
tensorrt_llm/llmapi/utils.py (1)
tensorrt_llm/logger.py (1)
debug(143-144)
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/llm_args.py
1864-1864: Line too long (138 > 120)
(E501)
1870-1870: Line too long (142 > 120)
(E501)
1881-1881: Line too long (127 > 120)
(E501)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (8)
tests/unittest/api_stability/api_stability_core.py (3)
76-76: Status field addition looks goodThe addition of the optional
statusfield to track API status metadata is well-implemented and aligns with the API status tagging system.
133-133: Status field addition looks goodThe addition of the optional
statusfield toMethodSnapshotis consistent with theParamSnapshotimplementation and properly supports the API status tracking functionality.
407-407: Deep copy for non-committed reference is appropriateCreating a deep copy of the reference as
non_committed_referenceis the correct approach to isolate non-committed API data from committed APIs, enabling proper status validation.tensorrt_llm/llmapi/utils.py (1)
496-496: Deprecated field check update is correctThe change to check for deprecated fields using
field_info.get("status", None) == "deprecated"is appropriate and consistent with the new status tagging system.tensorrt_llm/llmapi/llm_args.py (4)
64-91: Well-implemented custom Field wrapper for API status tagging.The custom
Fieldfunction effectively extends Pydantic's Field to support status metadata. The implementation correctly handles both existing and newjson_schema_extradictionaries, and the type annotations properly constrain the status values to valid options.
1025-1025: Consistent and logical API status annotations applied.The status annotations throughout the file are well-chosen and consistent:
- Deprecated: Legacy fields being phased out (max_lora_rank, backend, build_config)
- Beta: Stable newer features recommended for use (gpus_per_node, enable_attention_dp)
- Prototype: Experimental features not recommended for production (cp_config, various stats configurations)
The annotations align well with the broader API stability framework and will help users understand feature maturity levels.
Also applies to: 1031-1031, 1043-1044, 1047-1048, 1061-1063, 1065-1070, 1075-1077, 1104-1105, 1109-1110, 1114-1114, 1117-1118, 1121-1123, 1146-1148, 1154-1155, 1159-1160, 1164-1165, 1187-1187, 1845-1847, 1854-1855, 1864-1865, 1868-1870, 1873-1874, 1877-1878, 1883-1884, 1889-1890, 1894-1895, 1900-1901, 1904-1905, 1908-1908, 1912-1913, 1916-1918, 1930-1930, 1944-1944, 1951-1952
1731-1735: New fields properly implemented.The new
batching_typeandnormalize_log_probsfields are well-implemented with appropriate types, defaults, and descriptions. Since they lack status annotations, they are correctly treated as stable features.
2140-2158: Effective implementation of unstable feature warning system.The
warn_on_unstable_feature_usagemethod properly identifies explicitly set fields and warns users about beta/prototype features. The implementation correctly usesmodel_dump(exclude_unset=True)to focus only on user-configured fields and provides clear warnings about feature stability.
|
PR_Github #12312 [ run ] triggered by Bot |
|
PR_Github #12312 [ run ] completed with state |
b67933b to
e03f790
Compare
|
/bot run |
|
PR_Github #12518 [ run ] triggered by Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (5)
tests/unittest/api_stability/api_stability_core.py (3)
76-76: Addstatusfield handling in serialization methodsThe new
statusfield should be included in theto_dict()method to ensure it's properly serialized when saving snapshots.
133-133: Addstatusfield handling in serialization methodsThe new
statusfield should be included in theto_dict()method to ensure it's properly serialized when saving snapshots.
529-529: Remove debug print statementDebug print statements should not be committed to production test code.
tensorrt_llm/llmapi/utils.py (2)
605-605: Thread safety concern with mutable class variableUsing a mutable class variable
method_to_statuscould cause race conditions in multi-threaded environments.
635-636: Incorrect singleton instantiationThe class uses classmethods but creates an instance. Either use instance methods or remove the instantiation.
🧹 Nitpick comments (3)
tests/unittest/api_stability/api_stability_core.py (1)
7-7: Remove unused importThe
pprintimport doesn't appear to be used anywhere in the file. Consider removing it to keep the imports clean.-from pprint import pprinttensorrt_llm/llmapi/utils.py (1)
551-584: Well-implemented API parameter tagging functionalityThe
ApiParamTaggerclass effectively handles status tagging for Pydantic model fields. The logic correctly reads status fromjson_schema_extrametadata and amends field descriptions with appropriate tags.Consider fixing the docstring formatting issues flagged by static analysis:
- ''' A helper to tag the api doc according to the status of the fields. - The status is set in the json_schema_extra of the field. - ''' + """A helper to tag the api doc according to the status of the fields. + + The status is set in the json_schema_extra of the field. + """tensorrt_llm/llmapi/llm_args.py (1)
1868-1870: Good status annotations for PyTorch backend fields, but fix line length violations.The status annotations properly categorize PyTorch-specific fields by their maturity levels. However, there are line length violations that need to be addressed.
Apply these fixes for the line length violations:
- "Lower values trigger more frequent garbage collection.", - status="beta") + "Lower values trigger more frequent garbage collection.", + status="beta") - "CUDA graph config.If true, use CUDA graphs for decoding. \ - CUDA graphs are only created for the batch sizes in cuda_graph_config.batch_sizes, \ - and are enabled for batches that consist of decoding requests *only* \ - (the reason is that it's hard to capture a single graph with prefill requests \ - since the input shapes are a function of the sequence lengths).\ - Note that each CUDA graph can use up to 200 MB of extra memory.", + "CUDA graph config. If true, use CUDA graphs for decoding. " + "CUDA graphs are only created for the batch sizes in cuda_graph_config.batch_sizes, " + "and are enabled for batches that consist of decoding requests *only* " + "(the reason is that it's hard to capture a single graph with prefill requests " + "since the input shapes are a function of the sequence lengths). " + "Note that each CUDA graph can use up to 200 MB of extra memory.", - "If true, will iterate over sampling_params of each request and use the corresponding sampling strategy, e.g. top-k, top-p, etc.", + "If true, will iterate over sampling_params of each request and use the " + "corresponding sampling strategy, e.g. top-k, top-p, etc.", - "If true, will use the TRTLLM sampler instead of the PyTorch sampler. The TRTLLM sampler has a wide coverage of sampling strategies.", + "If true, will use the TRTLLM sampler instead of the PyTorch sampler. " + "The TRTLLM sampler has a wide coverage of sampling strategies.", - "If true, enables per request stats per iteration. Must also set enable_iter_perf_stats to true to get request stats.", + "If true, enables per request stats per iteration. " + "Must also set enable_iter_perf_stats to true to get request stats.",Also applies to: 1877-1878, 1887-1888, 1891-1893, 1896-1897, 1900-1901, 1906-1907, 1912-1913, 1916-1918, 1923-1924, 1927-1928, 1931-1931, 1935-1936, 1939-1941, 1953-1953, 1967-1967, 1974-1975
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
docs/source/_static/custom.css(1 hunks)docs/source/conf.py(3 hunks)docs/source/helper.py(2 hunks)tensorrt_llm/llmapi/llm.py(9 hunks)tensorrt_llm/llmapi/llm_args.py(14 hunks)tensorrt_llm/llmapi/utils.py(2 hunks)tests/unittest/api_stability/api_stability_core.py(5 hunks)tests/unittest/api_stability/references/llm.yaml(3 hunks)tests/unittest/api_stability/references_committed/llm.yaml(1 hunks)tests/unittest/llmapi/test_utils.py(1 hunks)
🧬 Code Graph Analysis (2)
tensorrt_llm/llmapi/llm_args.py (1)
tensorrt_llm/builder.py (1)
default(50-58)
tensorrt_llm/llmapi/utils.py (3)
tensorrt_llm/_torch/auto_deploy/llm_args.py (1)
LlmArgs(24-238)tensorrt_llm/llmapi/llm_args.py (2)
decorator(508-522)decorator(538-548)tensorrt_llm/logger.py (1)
debug(143-144)
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/llm_args.py
1887-1887: Line too long (138 > 120)
(E501)
1893-1893: Line too long (142 > 120)
(E501)
1904-1904: Line too long (127 > 120)
(E501)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
✅ Files skipped from review due to trivial changes (1)
- tensorrt_llm/llmapi/llm.py
🚧 Files skipped from review as they are similar to previous changes (6)
- tests/unittest/llmapi/test_utils.py
- tests/unittest/api_stability/references_committed/llm.yaml
- docs/source/helper.py
- docs/source/_static/custom.css
- docs/source/conf.py
- tests/unittest/api_stability/references/llm.yaml
🧰 Additional context used
🧬 Code Graph Analysis (2)
tensorrt_llm/llmapi/llm_args.py (1)
tensorrt_llm/builder.py (1)
default(50-58)
tensorrt_llm/llmapi/utils.py (3)
tensorrt_llm/_torch/auto_deploy/llm_args.py (1)
LlmArgs(24-238)tensorrt_llm/llmapi/llm_args.py (2)
decorator(508-522)decorator(538-548)tensorrt_llm/logger.py (1)
debug(143-144)
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/llm_args.py
1887-1887: Line too long (138 > 120)
(E501)
1893-1893: Line too long (142 > 120)
(E501)
1904-1904: Line too long (127 > 120)
(E501)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
🔇 Additional comments (8)
tests/unittest/api_stability/api_stability_core.py (2)
407-407: Good separation of committed and non-committed referencesCreating a deep copy to isolate non-committed API data from committed APIs is a clean approach that supports the new status validation functionality effectively.
452-533: Well-implemented API status validationThe
test_api_statusmethod provides comprehensive validation of API status consistency between live code and reference files. The logic correctly:
- Separates committed vs non-committed API validation
- Validates method status annotations using the API registry
- Checks parameter status consistency for
__init__methods- Provides clear error messages for missing or mismatched status tags
This is a solid implementation that ensures API status metadata remains synchronized across the codebase.
tensorrt_llm/llmapi/utils.py (2)
496-497: Good update to align with new status field approachThe condition change from checking a
"deprecated"key to checking"status" == "deprecated"properly aligns with the new unified status tagging system.
586-588: Clean utility function for LLM parameter taggingThe
tag_llm_paramsfunction provides a clean interface for applying status tags to the mainLlmArgsmodel. Good encapsulation.tensorrt_llm/llmapi/llm_args.py (4)
64-92: Well-implemented custom Field wrapper with proper fallback handling.The custom
Fieldfunction correctly wraps Pydantic'sFieldand adds status metadata tojson_schema_extra. The implementation handles both dictionary and non-dictionary cases forjson_schema_extra, ensuring robustness.
1025-1026: Appropriate status annotations for API stability tracking.The status annotations correctly identify the maturity level of various fields:
"deprecated"for legacy fields likemax_lora_rank,max_loras,max_cpu_loras, andbackend"beta"for stable but evolving features likegpus_per_node,moe_cluster_parallel_size,enable_attention_dp"prototype"for experimental features likecp_config, iterator stats fields, and various cache configsThis aligns well with the API stability tagging system and helps users understand feature maturity.
Also applies to: 1030-1031, 1042-1044, 1047-1048, 1061-1063, 1065-1070, 1075-1077, 1104-1105, 1109-1110, 1114-1114, 1117-1118, 1121-1123, 1146-1148, 1154-1155, 1159-1160, 1164-1165, 1187-1187
1740-1744: New fields added with appropriate defaults.The addition of
batching_typeandnormalize_log_probsfields toTrtLlmArgslooks correct:
batching_typeuses the properBatchingTypeenum from the relevant code snippets- Both fields have sensible defaults (
NoneandFalserespectively)- Clear descriptions are provided
2163-2181: Well-implemented warning method for unstable feature usage.The
warn_on_unstable_feature_usagemethod correctly:
- Uses
model_dump(exclude_unset=True)to only check explicitly set fields- Accesses field metadata from
model_fieldsandjson_schema_extra- Warns only for "beta" and "prototype" status values
- Provides clear messaging about the feature's stability level
This provides valuable feedback to users about unstable features they're using.
|
PR_Github #12518 [ run ] completed with state |
e03f790 to
9e8977c
Compare
9e8977c to
1a9909b
Compare
|
/bot run |
|
PR_Github #12594 [ run ] triggered by Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
tensorrt_llm/llmapi/utils.py (2)
605-605: Thread safety concern with mutable class variable remains unaddressed.The past review comment about thread safety with the mutable class variable
method_to_statushas not been addressed. This could still cause race conditions in multi-threaded environments.
635-636: Incorrect singleton instantiation pattern remains unaddressed.The past review comment about incorrect instantiation of
ApiStatusRegistryfor classmethods has not been addressed. The code still creates unnecessary instances to call classmethods.
🧹 Nitpick comments (1)
tensorrt_llm/llmapi/llm_args.py (1)
1887-1887: Address line length issue in CUDA graph description.The line exceeds the 120-character limit. Consider breaking the description into multiple lines.
Apply this diff to improve readability:
- "CUDA graphs are only created for the batch sizes in cuda_graph_config.batch_sizes, \ - and are enabled for batches that consist of decoding requests *only* \ - (the reason is that it's hard to capture a single graph with prefill requests \ - since the input shapes are a function of the sequence lengths).\ - Note that each CUDA graph can use up to 200 MB of extra memory.", + "CUDA graphs are only created for the batch sizes in cuda_graph_config.batch_sizes, " + "and are enabled for batches that consist of decoding requests *only* " + "(the reason is that it's hard to capture a single graph with prefill requests " + "since the input shapes are a function of the sequence lengths). " + "Note that each CUDA graph can use up to 200 MB of extra memory.",
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
docs/source/_static/custom.css(1 hunks)docs/source/conf.py(3 hunks)docs/source/helper.py(2 hunks)tensorrt_llm/bench/benchmark/throughput.py(1 hunks)tensorrt_llm/llmapi/llm.py(9 hunks)tensorrt_llm/llmapi/llm_args.py(14 hunks)tensorrt_llm/llmapi/utils.py(2 hunks)tests/unittest/api_stability/api_stability_core.py(5 hunks)tests/unittest/api_stability/references/llm.yaml(3 hunks)tests/unittest/api_stability/references_committed/llm.yaml(1 hunks)tests/unittest/llmapi/test_utils.py(1 hunks)
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/llm_args.py
1887-1887: Line too long (138 > 120)
(E501)
1893-1893: Line too long (142 > 120)
(E501)
1904-1904: Line too long (127 > 120)
(E501)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
✅ Files skipped from review due to trivial changes (1)
- tensorrt_llm/bench/benchmark/throughput.py
🚧 Files skipped from review as they are similar to previous changes (8)
- docs/source/_static/custom.css
- tests/unittest/api_stability/references_committed/llm.yaml
- tests/unittest/llmapi/test_utils.py
- docs/source/helper.py
- tests/unittest/api_stability/api_stability_core.py
- docs/source/conf.py
- tests/unittest/api_stability/references/llm.yaml
- tensorrt_llm/llmapi/llm.py
🧰 Additional context used
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/llm_args.py
1887-1887: Line too long (138 > 120)
(E501)
1893-1893: Line too long (142 > 120)
(E501)
1904-1904: Line too long (127 > 120)
(E501)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
🔇 Additional comments (11)
tensorrt_llm/llmapi/utils.py (2)
496-496: LGTM! Consistent with new status metadata approach.The change from checking the
"deprecated"key tofield_info.get("status", None) == "deprecated"correctly aligns with the new status metadata structure introduced in theFieldwrapper.
551-589: LGTM! Well-designed API parameter tagging utility.The
ApiParamTaggerclass provides a clean approach to process Pydantic models and amend field descriptions with status tags based onjson_schema_extrametadata. The implementation correctly rebuilds the model schema after modifications.tensorrt_llm/llmapi/llm_args.py (9)
64-92: Well-designed custom Field wrapper for API stability tracking.The custom
Fieldwrapper effectively extends Pydantic'sFieldto support status metadata while maintaining backward compatibility. The implementation correctly handles thejson_schema_extradictionary merging and provides clear documentation for the status levels.
1025-1025: Consistent status annotation application.The addition of
status="beta"to thegpus_per_nodefield is appropriate and follows the established pattern for marking unstable API parameters.
1030-1031: Good status annotation for MoE cluster parameter.Marking
moe_cluster_parallel_sizeas beta status is appropriate given its specialized nature for MoE models.
1042-1044: Appropriate beta status for attention data parallel.The beta status for
enable_attention_dpcorrectly reflects its experimental nature.
1047-1048: Correct prototype status for context parallel config.Marking
cp_configas prototype status is appropriate given its advanced and potentially unstable nature.
1061-1063: Proper deprecation marking for max_lora_rank.The combination of
deprecatedparameter andstatus="deprecated"correctly indicates this field should no longer be used.
1740-1744: New fields properly documented and typed.The addition of
batching_typeandnormalize_log_probsfields follows good practices with clear descriptions and appropriate types.
1868-1870: Appropriate deprecation of build_config in TorchLlmArgs.Marking the
build_configfield as deprecated inTorchLlmArgsis appropriate since it's mentioned as "just a dummy BuildConfig" in the comment.
2163-2181: Excellent feature for warning about unstable API usage.The
warn_on_unstable_feature_usagemethod provides valuable feedback to users about potentially unstable features they're using. The implementation correctly checks field metadata and provides informative warnings.
|
PR_Github #12594 [ run ] completed with state |
1a9909b to
7d34d45
Compare
|
/bot run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (4)
tests/unittest/api_stability/api_stability_core.py (2)
76-76: Status fields properly declared but serialization incompleteThe new
statusfields are properly declared in both dataclasses. However, the serialization methods still need to handle these fields properly as noted in the past review comments.Also applies to: 133-133
452-533: Well-structured API status validation with one debug print issueThe
test_api_statusmethod provides comprehensive validation of API status consistency between live code and reference YAML. The implementation properly:
- Limits scope to
llm.yamlreference file- Imports required dependencies within the method scope
- Validates both method and parameter status annotations
- Provides detailed error messages for mismatches
However, there's a debug print statement that should be removed.
The debug print at line 529 should be removed as noted in past review comments:
- print(f"param_name: {param_name}, param_data: {param_data}")tensorrt_llm/llmapi/utils.py (2)
591-632: API status registry needs thread safety improvementsThe
ApiStatusRegistryclass provides a good foundation for managing API method statuses, but has thread safety concerns as noted in past reviews.The implementation correctly uses
__qualname__for method identification and provides proper status retrieval and docstring amendment functionality.
635-636: Fix incorrect singleton instantiationThe code incorrectly creates an instance to call classmethods, as noted in past review comments.
-set_api_status = ApiStatusRegistry().set_api_status -get_api_status = ApiStatusRegistry().get_api_status +set_api_status = ApiStatusRegistry.set_api_status +get_api_status = ApiStatusRegistry.get_api_status
🧹 Nitpick comments (2)
tests/unittest/api_stability/api_stability_core.py (1)
7-7: Unused import detectedThe
pprintimport is added but not used in the visible code. Consider removing it to keep imports clean.-from pprint import pprinttensorrt_llm/llmapi/llm_args.py (1)
1887-1887: Address line length violations for code style consistency.Three lines exceed the 120-character limit. Consider breaking these long description strings for better readability:
- "Note that each CUDA graph can use up to 200 MB of extra memory.", + "Note that each CUDA graph can use up to 200 MB of extra memory.", - description="Disable the overlap scheduler.", + description="Disable the overlap scheduler.", - "If true, will iterate over sampling_params of each request and use the corresponding sampling strategy, e.g. top-k, top-p, etc.", + "If true, will iterate over sampling_params of each request and use " + "the corresponding sampling strategy, e.g. top-k, top-p, etc.",Also applies to: 1893-1893, 1904-1904
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
docs/source/_static/custom.css(1 hunks)docs/source/conf.py(3 hunks)docs/source/helper.py(2 hunks)tensorrt_llm/bench/benchmark/throughput.py(1 hunks)tensorrt_llm/llmapi/llm.py(9 hunks)tensorrt_llm/llmapi/llm_args.py(14 hunks)tensorrt_llm/llmapi/utils.py(2 hunks)tests/unittest/api_stability/api_stability_core.py(5 hunks)tests/unittest/api_stability/references/llm.yaml(3 hunks)tests/unittest/api_stability/references_committed/llm.yaml(1 hunks)tests/unittest/llmapi/test_utils.py(1 hunks)
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
tensorrt_llm/llmapi/llm_args.py
1887-1887: Line too long (138 > 120)
(E501)
1893-1893: Line too long (142 > 120)
(E501)
1904-1904: Line too long (127 > 120)
(E501)
🚧 Files skipped from review as they are similar to previous changes (8)
- tests/unittest/api_stability/references_committed/llm.yaml
- docs/source/_static/custom.css
- tests/unittest/llmapi/test_utils.py
- tensorrt_llm/bench/benchmark/throughput.py
- docs/source/helper.py
- docs/source/conf.py
- tensorrt_llm/llmapi/llm.py
- tests/unittest/api_stability/references/llm.yaml
🧰 Additional context used
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
tensorrt_llm/llmapi/llm_args.py
1887-1887: Line too long (138 > 120)
(E501)
1893-1893: Line too long (142 > 120)
(E501)
1904-1904: Line too long (127 > 120)
(E501)
🔇 Additional comments (8)
tests/unittest/api_stability/api_stability_core.py (1)
407-407: Good implementation of reference isolationCreating a deep copy of the reference as
non_committed_referenceis a solid approach to isolate non-committed API elements for separate validation.tensorrt_llm/llmapi/utils.py (2)
496-496: Correct alignment with new status metadata structureThe change from checking
"deprecated"tofield_info.get("status", None) == "deprecated"properly aligns with the new unified status metadata approach introduced in this PR.
551-588: Well-implemented parameter tagging utilityThe
ApiParamTaggerclass provides a clean mechanism for adding status tags to Pydantic model fields. The implementation properly:
- Processes model fields with status metadata
- Amends field descriptions with appropriate tags
- Forces model rebuild to reflect changes
- Provides a convenient
tag_llm_paramsfunction for easy usageThe approach of modifying field descriptions and rebuilding the model schema is appropriate for documentation integration.
Note: The static analysis hints about docstring formatting are cosmetic and don't affect functionality.
tensorrt_llm/llmapi/llm_args.py (5)
64-92: LGTM! Well-structured API status tagging implementation.The custom
Fieldwrapper effectively extends Pydantic's functionality to support status metadata. The implementation properly handles edge cases and maintains backward compatibility.Minor suggestion: Consider clarifying the docstring for the
defaultparameter since...is Pydantic's sentinel for required fields, not a literal default value.
1025-1025: Well-categorized API status annotations.The status annotations are appropriately assigned:
- Deprecated fields correctly identify legacy parameters with modern replacements (e.g.,
max_lora_rank→lora_config.max_lora_rank)- Beta fields mark stable but potentially evolving features
- Prototype fields identify experimental or debugging functionality
This systematic approach to API stability labeling will help users make informed decisions about feature adoption.
Also applies to: 1031-1031, 1044-1044, 1048-1048, 1062-1063, 1069-1070, 1076-1077, 1105-1105, 1110-1110, 1114-1114, 1118-1118, 1123-1123, 1148-1148, 1155-1155, 1160-1160, 1165-1165, 1187-1187, 1869-1870, 1878-1878, 1888-1888, 1893-1893, 1897-1897, 1901-1901, 1907-1907, 1913-1913, 1918-1918, 1924-1924, 1928-1928, 1931-1931, 1936-1936, 1941-1941, 1953-1953, 1967-1967, 1975-1975
2163-2181: Excellent runtime warning implementation for unstable features.The method effectively provides user feedback about beta and prototype feature usage by:
- Only checking explicitly set fields (avoiding false positives)
- Providing clear, actionable warning messages
- Distinguishing between beta and prototype stability levels
This helps users make informed decisions about feature adoption in production environments.
2740-2744: Well-structured new field additions.The new
batching_typeandnormalize_log_probsfields are properly implemented:
- Appropriate types and sensible defaults
- Clear, descriptive documentation
- Consistent with existing field patterns in the class
64-2181: Excellent implementation of API stability framework.This comprehensive update successfully introduces structured API status tagging throughout the LLM arguments system. Key strengths:
- Clean Architecture: The custom
Fieldwrapper elegantly extends Pydantic without breaking existing functionality- Systematic Categorization: Status annotations are thoughtfully applied across deprecated, beta, and prototype features
- Runtime Feedback: The warning mechanism helps developers make informed decisions about feature usage
- Future-Proof Design: The framework is extensible and integrates well with the broader API stability system
This will significantly improve developer experience by providing clear guidance on feature maturity and stability.
7d34d45 to
8fb25e1
Compare
Signed-off-by: Superjomn <[email protected]> Signed-off-by: nv-guomingz <[email protected]>
8fb25e1 to
9cd6594
Compare
|
/bot run |
|
PR_Github #12704 [ run ] triggered by Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
tensorrt_llm/llmapi/utils.py (2)
605-605: Thread safety concern with mutable class variableUsing a mutable class variable
method_to_statuscould cause race conditions in multi-threaded environments.
635-636: Incorrect singleton instantiationThe class uses classmethods but creates an instance. Either use instance methods or remove the instantiation.
🧹 Nitpick comments (2)
tensorrt_llm/llmapi/utils.py (1)
551-589: Fix docstring formatting issues but functionality looks good.The
ApiParamTaggerclass implementation is sound and correctly processes Pydantic models to add status tags. However, there are docstring formatting issues that should be addressed.Apply these formatting fixes:
class ApiParamTagger: - ''' A helper to tag the api doc according to the status of the fields. - The status is set in the json_schema_extra of the field. - ''' + """A helper to tag the api doc according to the status of the fields. + + The status is set in the json_schema_extra of the field. + """ def __call__(self, cls: Type[BaseModel]) -> None: - self.process_pydantic_model(cls) + """Process the given Pydantic model.""" + self.process_pydantic_model(cls) def process_pydantic_model(self, cls: Type[BaseModel]) -> None: - """Process the Pydantic model to add tags to the fields. - """ + """Process the Pydantic model to add tags to the fields.""" def amend_pydantic_field_description_with_tags(self, cls: Type[BaseModel], field_names: list[str], tag: str) -> None: - """Amend the description of the fields with tags. - e.g. :tag:`beta` or :tag:`prototype` + """Amend the description of the fields with tags. + + e.g. :tag:`beta` or :tag:`prototype` + Args: cls: The Pydantic BaseModel class. field_names: The names of the fields to amend. tag: The tag to add to the fields. """tensorrt_llm/llmapi/llm_args.py (1)
1867-1867: Fix line length violations in field descriptions.The static analysis correctly identifies lines exceeding 120 characters. These can be fixed by breaking the descriptions into multiple lines.
Apply these fixes for better code formatting:
- Note that each CUDA graph can use up to 200 MB of extra memory.", + Note that each CUDA graph can use up to 200 MB of extra memory." + "", - "If true, will iterate over sampling_params of each request and use the corresponding sampling strategy, e.g. top-k, top-p, etc.", + "If true, will iterate over sampling_params of each request and use the " + "corresponding sampling strategy, e.g. top-k, top-p, etc.",For line 1873, the description appears to be just "Disable the overlap scheduler." which should already be under the limit, so please double-check that specific line.
Also applies to: 1873-1873, 1884-1884
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
docs/source/_static/custom.css(1 hunks)docs/source/conf.py(3 hunks)docs/source/helper.py(2 hunks)tensorrt_llm/bench/benchmark/throughput.py(2 hunks)tensorrt_llm/llmapi/llm.py(9 hunks)tensorrt_llm/llmapi/llm_args.py(14 hunks)tensorrt_llm/llmapi/utils.py(2 hunks)tests/unittest/api_stability/api_stability_core.py(5 hunks)tests/unittest/api_stability/references/llm.yaml(3 hunks)tests/unittest/api_stability/references_committed/llm.yaml(1 hunks)tests/unittest/llmapi/test_utils.py(1 hunks)
🧬 Code Graph Analysis (1)
tensorrt_llm/llmapi/utils.py (2)
tensorrt_llm/_torch/auto_deploy/llm_args.py (1)
LlmArgs(224-316)tensorrt_llm/logger.py (1)
debug(143-144)
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/llm_args.py
1867-1867: Line too long (138 > 120)
(E501)
1873-1873: Line too long (142 > 120)
(E501)
1884-1884: Line too long (127 > 120)
(E501)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
🚧 Files skipped from review as they are similar to previous changes (9)
- tests/unittest/api_stability/references_committed/llm.yaml
- tests/unittest/llmapi/test_utils.py
- docs/source/helper.py
- docs/source/_static/custom.css
- tensorrt_llm/bench/benchmark/throughput.py
- tests/unittest/api_stability/api_stability_core.py
- docs/source/conf.py
- tensorrt_llm/llmapi/llm.py
- tests/unittest/api_stability/references/llm.yaml
🧰 Additional context used
🧬 Code Graph Analysis (1)
tensorrt_llm/llmapi/utils.py (2)
tensorrt_llm/_torch/auto_deploy/llm_args.py (1)
LlmArgs(224-316)tensorrt_llm/logger.py (1)
debug(143-144)
🪛 Ruff (0.12.2)
tensorrt_llm/llmapi/llm_args.py
1867-1867: Line too long (138 > 120)
(E501)
1873-1873: Line too long (142 > 120)
(E501)
1884-1884: Line too long (127 > 120)
(E501)
tensorrt_llm/llmapi/utils.py
549-551: 1 blank line required between summary line and description
(D205)
557-558: One-line docstring should fit on one line
Reformat to one line
(D200)
568-574: 1 blank line required between summary line and description
(D205)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (6)
tensorrt_llm/llmapi/utils.py (1)
496-496: LGTM! Consistent with the new status tagging system.The change from checking
"deprecated"key tofield_info.get("status", None) == "deprecated"aligns with the unified status metadata approach being introduced throughout the codebase.tensorrt_llm/llmapi/llm_args.py (5)
64-92: Excellent implementation of the custom Field wrapper.The implementation properly handles all edge cases for json_schema_extra and provides a clean API for adding status metadata to Pydantic fields. The documentation is comprehensive and the allowed status values follow a logical hierarchy.
1005-1005: Well-structured status annotations for BaseLlmArgs fields.The status assignments are logical and consistent:
"beta"for features that are usable but not fully stable"prototype"for experimental features"deprecated"for fields being phased out with clear migration guidanceThis will help users make informed decisions about feature usage.
Also applies to: 1011-1011, 1024-1024, 1028-1028, 1042-1043, 1049-1050, 1056-1057, 1085-1085, 1090-1090, 1094-1094, 1098-1098, 1103-1103, 1128-1128, 1135-1135, 1140-1140, 1145-1145, 1167-1167
1719-1724: Clean addition of new stable fields to TrtLlmArgs.The new
batching_typeandnormalize_log_probsfields are properly typed with appropriate defaults and descriptions. The absence of status annotations indicates these are stable APIs.
1847-1849: Comprehensive and well-categorized status annotations for TorchLlmArgs.The status assignments are logical:
build_configappropriately marked as deprecated- Performance/debugging features marked as
"prototype"- More mature but evolving features marked as
"beta"This provides clear guidance on the maturity level of PyTorch backend features.
Also applies to: 1857-1857, 1867-1867, 1872-1872, 1876-1876, 1880-1880, 1886-1886, 1892-1892, 1897-1897, 1903-1903, 1907-1907, 1910-1910, 1915-1915, 1920-1920, 1932-1932, 1946-1946, 1954-1954
2142-2160: Excellent implementation of unstable feature warning system.The method effectively leverages the status metadata to warn users about potentially unstable features they're using. Key strengths:
- Only warns about explicitly set fields
- Clear, actionable warning messages
- Proper method chaining with return self
- Good integration with the status annotation system
|
PR_Github #12704 [ run ] completed with state |
It's a mirror PR of #5707 since @Superjomn is on vacation.
Summary by CodeRabbit