Conversation

Collaborator

@ggordaniRed ggordaniRed commented Nov 19, 2025

Creating workflow for NNO dashboard

Summary by CodeRabbit

Release Notes

  • New Features
    • Added Network Operator CI Dashboard alongside the existing GPU Operator Dashboard for comprehensive test reporting across operator and OpenShift versions
    • Implemented per-operator change detection in workflows to independently generate dashboards when respective data changes
    • Enhanced dashboards with per-test-flavor categorization and reporting for more granular test result visibility

Guy Gordani and others added 7 commits November 5, 2025 12:34
Adding NNO dashboard module to generate an HTML page for the NVIDIA Network Operator
- Extract test flavors from job names (DOCA4, Bare Metal, etc.)
- Structure data by actual OCP version instead of infrastructure type
- Update templates to display test flavors as subsections
- Fix merge logic to preserve test_flavors field

Fixes GitHub Actions workflow failure on PR 67673
- Make test_flavor an optional field in TestResult dataclass
- Include test_flavor in to_dict() only when set
- Fixes TypeError when merging NNO results with test_flavor field

Resolves: TypeError: TestResult.__init__() got an unexpected keyword argument 'test_flavor'
- Properly distinguish RDMA Legacy SR-IOV vs RDMA Shared Device
- Detect GPU-enabled tests (with GPU / without GPU)
- Support Hosted, Bare Metal, and DOCA4 infrastructure types
- Better formatting: 'DOCA4 - RDMA Legacy SR-IOV with GPU' instead of separate parts

Test flavors now supported:
- RDMA Legacy SR-IOV (with/without GPU)
- RDMA Shared Device (with/without GPU)
- Bare Metal E2E
- Hosted configurations
- DOCA4 configurations
- Add is_valid_ocp_version() to validate OpenShift version keys
- Filter out infrastructure types: doca4, bare-metal, hosted, Unknown
- Only show actual OCP versions (4.17.16, 4.16, etc.) in TOC and sections
- Log filtered keys for debugging

This fixes the issue where 'doca4', 'bare-metal', and 'Unknown'
appeared as OpenShift versions instead of being properly organized
under the actual OCP version.
- When OCP version cannot be determined (infrastructure types without valid version), skip the result instead of creating 'Unknown' entries
- Add validation to ensure OCP version starts with digit and contains dots
- Improve logging to show resolution path for infrastructure types
- This prevents 'doca4', 'bare-metal', 'hosted', and 'Unknown' from appearing as OCP versions in the data

The dashboard now only processes actual OpenShift versions like '4.17.16'.

openshift-ci bot commented Nov 19, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ggordaniRed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ggordaniRed ggordaniRed requested review from TomerNewman and removed request for vtruhpon and wabouhamad November 19, 2025 14:17
Contributor

coderabbitai bot commented Nov 19, 2025

Walkthrough

This PR extends the GPU Operator dashboard infrastructure to support Network Operator (NNO) dashboards. It adds NNO-specific CI data fetching, HTML dashboard generation, HTML templates, workflow orchestration with change detection, and corresponding documentation, while extending the GPU operator's data model to support per-flavor test results.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Workflow orchestration**<br>`.github/workflows/generate_matrix_page.yaml` | Adds Network Operator dashboard paths (JSON and HTML outputs), installs NNO dependencies, fetches NNO CI data, detects changes in NNO JSON files, and conditionally generates both GPU and NNO dashboards based on respective change flags. |
| **GPU operator data model extension**<br>`workflows/gpu_operator_dashboard/fetch_ci_data.py` | Extends TestResult dataclass with optional test_flavor field; augments merge_ocp_version_results to initialize and merge per-flavor test results and job history links into test_flavors map. |
| **Network Operator documentation**<br>`workflows/nno_dashboard/README.md` | Introduces comprehensive documentation covering dashboard purpose, architecture (reusing GPU operator core logic), supported test patterns, usage instructions for fetch_ci_data and generate_ci_dashboard CLIs, testing, GitHub Actions integration, troubleshooting, and example data structures. |
| **NNO CI data fetching**<br>`workflows/nno_dashboard/fetch_ci_data.py` | Introduces new module with NNO-specific regex patterns, PR/build processing logic (process_closed_prs, fetch_filtered_files, extract_build_components, build_files_lookup, process_single_build, fetch_pr_files, filter_network_finished_files), test flavor extraction from job names, per-PR orchestration (process_tests_for_pr), and CLI entry point (main). |
| **NNO dashboard generation**<br>`workflows/nno_dashboard/generate_ci_dashboard.py` | Introduces new module for HTML dashboard generation; adds OCP version validation (is_valid_ocp_version), test flavor HTML section building (build_test_flavors_sections), full dashboard assembly (generate_test_matrix), and CLI entry point (main), reusing GPU operator dashboard core helpers. |
| **NNO dependencies**<br>`workflows/nno_dashboard/requirements.txt` | Specifies 12 pinned Python dependencies including pydantic, requests, semver, typing extensions, and urllib3. |
| **NNO HTML templates**<br>`workflows/nno_dashboard/templates/{footer,header,main_table,test_flavor_section}.html` | Adds HTML template fragments: header with document structure and sortTable JavaScript function, main_table with per-OCP version blocks and placeholders, test_flavor_section with flavor-keyed table structure, and footer with last-updated timestamp and document closure. |

Sequence Diagram

```mermaid
sequenceDiagram
    actor WF as Workflow
    participant UD as Change Detection
    participant GF as GPU fetch_ci_data
    participant NF as NNO fetch_ci_data
    participant GG as GPU generate_dashboard
    participant NG as NNO generate_dashboard

    WF->>UD: Detect GPU JSON changes
    WF->>UD: Detect NNO JSON changes
    alt GPU_CHANGED
        WF->>GF: Fetch GPU CI data
        GF-->>WF: GPU JSON output
        WF->>GG: Generate GPU dashboard
        GG-->>WF: GPU HTML output
    end
    alt NNO_CHANGED
        WF->>NF: Fetch NNO CI data
        NF-->>WF: NNO JSON output
        WF->>NG: Generate NNO dashboard
        NG-->>WF: NNO HTML output
    end
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Dense new modules: workflows/nno_dashboard/fetch_ci_data.py contains 9 new functions with intricate build processing, file filtering, and data transformation logic; generate_ci_dashboard.py introduces 4 new functions with template rendering and OCP version validation
  • Data structure extensions: GPU operator model extended with test_flavor field and per-flavor merge logic in merge_ocp_version_results, requiring verification of merge correctness
  • Multi-file scope: Changes span workflow, data models, new NNO-specific modules, templates, and dependencies
  • Heterogeneous logic: PR/build extraction, dual-build handling, test flavor derivation, and HTML generation require separate reasoning for each component

Areas requiring extra attention:

  • PR history processing logic in process_closed_prs and edge cases in fetch_filtered_files (pagination, error handling)
  • Regex patterns (TEST_RESULT_PATH_REGEX, OCP_FULL_VERSION) correctness for NNO artifact paths
  • Dual-build mapping logic in build_files_lookup and process_single_build to prevent data loss or duplication
  • Test flavor extraction heuristics in extract_test_flavor_from_job_name for accuracy (a rough, illustrative sketch follows this list)
  • Integration point between GPU operator data model and NNO-specific usage (test_flavor field, per-flavor merge)
  • Template placeholder injection and HTML rendering in generate_test_matrix
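
A rough, illustrative sketch of what a flavor-extraction heuristic could look like, following up on the test-flavor bullet above. The token patterns and labels below are invented for illustration only; the real extract_test_flavor_from_job_name in workflows/nno_dashboard/fetch_ci_data.py may tokenize job names differently.

```python
import re
from typing import Optional

# Hypothetical token-to-label mapping; the actual implementation may use different tokens.
_FLAVOR_TOKENS = [
    (re.compile(r"rdma-legacy-sriov", re.IGNORECASE), "RDMA Legacy SR-IOV"),
    (re.compile(r"rdma-shared", re.IGNORECASE), "RDMA Shared Device"),
    (re.compile(r"bare-?metal", re.IGNORECASE), "Bare Metal E2E"),
    (re.compile(r"hosted", re.IGNORECASE), "Hosted"),
]


def extract_test_flavor(job_name: str) -> Optional[str]:
    """Derive a display label like 'DOCA4 - RDMA Legacy SR-IOV with GPU' from a job name."""
    prefix = "DOCA4 - " if "doca4" in job_name.lower() else ""
    suffix = " with GPU" if "gpu" in job_name.lower() else ""
    for pattern, label in _FLAVOR_TOKENS:
        if pattern.search(job_name):
            return f"{prefix}{label}{suffix}"
    # Fall back to a bare DOCA4 label when no flavor token matches.
    return prefix.rstrip(" -") or None
```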

Possibly related PRs

Suggested labels

needs-ok-to-test

Suggested reviewers

  • empovit
  • fabiendupont
  • wabouhamad

Poem

🐰 A Network Operator joins the stage,

CI dashboards turn a golden page,

Test flavors bloom in HTML light,

Side by side with GPU might,

Per-PR builds dance in parallel flight! 🚀

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title 'Adding nno dashboard' is vague and uses non-descriptive terms that don't clearly convey the scope or main purpose of the changeset. | Improve the title to be more specific about what the NNO dashboard does or accomplishes, such as 'Add Network Operator dashboard generation workflow' or 'Implement NNO CI dashboard with fetch and generation steps'. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 82.35% which is sufficient. The required threshold is 80.00%. |


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (7)
workflows/nno_dashboard/README.md (1)

37-39: Tighten README accuracy and address minor markdown/doc nits

A few small adjustments would make this README more robust:

  • Merged vs closed PRs (Lines 61–63, 89–90): The implementation in process_closed_prs uses the GitHub API with state="closed" and does not filter on merged_at, so it will process all closed PRs (merged and unmerged). Either:

    • Update the wording to “closed PRs / closed pull requests”, or
    • Adjust the code to explicitly filter for merged PRs and then keep the current wording.
      Keeping docs aligned with behavior will avoid confusion when debugging runs (a minimal merged-PR filter sketch follows these notes).
  • Fenced code block language (Lines 37–39): The example GCS URL block is just:

    https://gcsweb-...

    Adding a language (e.g. text) will satisfy markdownlint and improve consistency:

    ```text
    https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/...
    ```

  • Brittle regex line reference (Line 122): Referring to “regex in fetch_ci_data.py line 36–40” is likely to drift as the file evolves. Suggest rephrasing to something like “see TEST_RESULT_PATH_REGEX in fetch_ci_data.py” without line numbers.

  • Hyphenation nit (Line 76): If you care about LanguageTool’s style suggestion, you could change “Network Operator specific templates” to “Network‑Operator‑specific templates”, but that’s purely stylistic.

Also, using a gcsweb example URL instead of the Prow UI matches how the rest of this repo prefers GCS navigation. Based on learnings.

Also applies to: 61-63, 76-77, 89-90, 121-123
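
If the code route from the first nitpick above is preferred, a minimal filter could look like the sketch below. It assumes standard GitHub REST pull-request objects, where merged PRs carry a non-null merged_at timestamp; how process_closed_prs pages through results is not shown here.

```python
from typing import Any, Dict, Iterable, List


def keep_only_merged(closed_prs: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Reduce /pulls?state=closed results to merged PRs only.

    GitHub returns both merged and unmerged closed PRs; merged ones have a
    non-null "merged_at" field, so filtering on it avoids an extra API call per PR.
    """
    return [pr for pr in closed_prs if pr.get("merged_at")]
```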

workflows/gpu_operator_dashboard/fetch_ci_data.py (1)

87-108: test_flavor and test_flavors merging look correct; consider generalizing comments and tightening docs

The new test_flavor field on TestResult and the test_flavors merging in merge_ocp_version_results are implemented in a backward-compatible way:

  • to_dict() only emits "test_flavor" when set, so older JSON remains valid and TestResult(**item) still works for legacy entries.
  • The merge_ocp_version_results initializer plus update(existing_version_data) preserves any preexisting fields while ensuring notes, bundle_tests, release_tests, job_history_links, and test_flavors always exist.
  • The flavor merge correctly reuses merge_release_tests and normalizes job_history_links from set/list to a sorted list per flavor.

Two small polish suggestions:

  • Line 95/105: the inline comments call this “NNO-specific”, but now the GPU dashboard’s core model supports flavors generically. Renaming the comment to something like “optional test configuration flavor” would better reflect its shared nature.
  • Lines 573–578 and 614–637: the docstring and type hints for merge_ocp_version_results still describe only bundle/release tests and don’t mention test_flavors or that job_history_links may be sets or lists. Updating those descriptions will help future readers understand the full shape of new_version_data / existing_version_data.

Also applies to: 573-639
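
For reference, a heavily condensed sketch of the two properties called out above (conditional emission of test_flavor and normalization of per-flavor job_history_links). The class and helper are simplified stand-ins, not the actual TestResult or merge_ocp_version_results.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass
class TestResultSketch:
    """Simplified stand-in for TestResult with the optional test_flavor field."""
    status: str
    test_flavor: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        data: Dict[str, Any] = {"status": self.status}
        if self.test_flavor:  # emitted only when set, so legacy JSON round-trips unchanged
            data["test_flavor"] = self.test_flavor
        return data


def merge_flavor_links(flavor_entry: Dict[str, Any], new_links: Iterable[str]) -> None:
    """Normalize a flavor's job_history_links from set/list input to a sorted list."""
    links = set(flavor_entry.get("job_history_links", []))
    links.update(new_links)
    flavor_entry["job_history_links"] = sorted(links)
```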

workflows/nno_dashboard/fetch_ci_data.py (2)

61-106: Avoid catching bare Exception in fetch_filtered_files to prevent masking real errors

In fetch_filtered_files (Lines 61–106), the while True loop wraps the GCS call with:

try:
    response_data = http_get_json(...)
    ...
except Exception as e:
    logger.debug(f"PR #{pr_number} not found in {repo} or error occurred: {e}")
    break

This will silently treat any unexpected error (authentication issues, JSON decoding bugs, API changes, etc.) as “PR not found”, which can make real failures hard to diagnose.

Consider tightening this logic, for example:

  • Catch only the specific exception(s) that http_get_json is expected to raise when a PR’s artifacts truly don’t exist, and keep the message at debug in that case.
  • For other exceptions, either let them propagate or log at warning/error so issues surface quickly.

This keeps the “try both repos” behavior for missing artifacts while avoiding accidental suppression of genuine problems.
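
One way the narrower handling could look, sketched under the assumption that a missing PR surfaces as an HTTP 404 raised as requests.HTTPError; the real http_get_json helper may wrap errors differently, so the except clause would need to match whatever it actually raises.

```python
import logging
from typing import Any, Dict, Optional

import requests

logger = logging.getLogger(__name__)


def fetch_page_or_none(url: str, params: Dict[str, Any]) -> Optional[Any]:
    """Treat a 404 as 'PR artifacts not found'; let other failures surface loudly."""
    try:
        response = requests.get(url, params=params, timeout=30)
        response.raise_for_status()
        return response.json()
    except requests.HTTPError as e:
        if e.response is not None and e.response.status_code == 404:
            logger.debug("No artifacts at %s: %s", url, e)
            return None
        logger.warning("Unexpected HTTP error for %s: %s", url, e)
        raise
```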


109-138: Small lint/clarity cleanups (unused variable, no-op f-string) in NNO fetcher

There are a couple of low-impact issues that are easy to tidy up:

  • Unused repo unpacking (Lines 132–135 and 293–295): extract_build_components and filter_network_finished_files unpack repo but never use it. To satisfy linters and clarify intent, you can prefix it with _ or drop it from the unpacked tuple if you don’t need it:

    _repo, pr_number, job_name, build_id = extract_build_components(path)
  • No-op f-string (Line 289): logger.debug(f" Skipping - not nested or top-level") doesn’t interpolate anything. You can safely remove the f prefix:

    logger.debug("  Skipping - not nested or top-level")

These are purely cosmetic but will keep Ruff and other tools quiet.

Also applies to: 263-315

workflows/nno_dashboard/templates/test_flavor_section.html (1)

1-14: Template structure looks good; ensure {ocp_key} and {flavor_id} are safe for use in HTML IDs

The section markup is clear and matches the per-flavor data model (label + table of OCP vs Network Operator versions). One thing to watch for is that table-{ocp_key}-{flavor_id} produces valid and unique id attributes:

  • If ocp_key or flavor_id can contain spaces or special characters, consider normalizing them (e.g., lowercasing and replacing non-alphanumerics with -) before substitution.
  • This will also make it easier to target these tables from any JavaScript or CSS you add later.

Otherwise, the template looks ready to integrate with the NNO dashboard generator.
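
If normalization turns out to be needed, a small helper on the generator side could handle it. This is only a sketch with an invented name (html_id_slug); nothing like it is required by the current templates.

```python
import re


def html_id_slug(value: str) -> str:
    """Normalize a label such as 'DOCA4 - RDMA Legacy SR-IOV with GPU' into a safe HTML id fragment."""
    slug = re.sub(r"[^a-zA-Z0-9]+", "-", value).strip("-").lower()
    return slug or "unknown"


# e.g. f"table-{html_id_slug(ocp_key)}-{html_id_slug(flavor_id)}"
# -> "table-4-17-16-doca4-rdma-legacy-sr-iov-with-gpu"
```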

workflows/nno_dashboard/templates/header.html (1)

1-36: Sorting script works but could be hardened and cleaned up

This should work for your current tables, but a couple of optional tweaks would make it more robust and user‑friendly:

  • Clear asc/desc classes from all header cells before toggling on the clicked one, so only one column is visually marked as sorted at a time.
  • For numeric columns, consider explicitly coercing to numbers (e.g. var a = Number(cellA); var b = Number(cellB); and checking Number.isNaN) instead of relying on isNaN + implicit subtraction, which can behave oddly with formatted values.

These are non‑blocking polish items.

.github/workflows/generate_matrix_page.yaml (1)

35-41: JSON-only change detection may skip publishing template/generator updates

The new GPU_CHANGED / NNO_CHANGED flags are based solely on git diff of the JSON files. On pull_request_target runs this means:

  • If a PR only changes HTML templates, CSS, or the Python generator code (but the JSON content stays identical), the dashboards will not be regenerated, so gh‑pages keeps the old HTML until a later run where the JSON actually changes.
  • workflow_dispatch still forces regeneration, but that requires someone to remember to trigger it after non‑data changes.

If you want template / code changes to be published automatically on merge, consider one of:

  • Always regenerating HTML on pull_request_target and relying on git diff in the gh‑pages checkout to avoid redundant commits, or
  • Expanding the change detection to also look at relevant source paths (e.g. workflows/**/generate_ci_dashboard.py, workflows/**/templates/**, styles.css) instead of only the JSON.

Functionally the NNO wiring (env vars, fetch, and generate steps) looks correct; this is mainly about long‑term behavior of the gating logic.

Also applies to: 71-75, 79-88, 91-129

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6917f2e and cfd3e99.

📒 Files selected for processing (10)
  • .github/workflows/generate_matrix_page.yaml (2 hunks)
  • workflows/gpu_operator_dashboard/fetch_ci_data.py (3 hunks)
  • workflows/nno_dashboard/README.md (1 hunks)
  • workflows/nno_dashboard/fetch_ci_data.py (1 hunks)
  • workflows/nno_dashboard/generate_ci_dashboard.py (1 hunks)
  • workflows/nno_dashboard/requirements.txt (1 hunks)
  • workflows/nno_dashboard/templates/footer.html (1 hunks)
  • workflows/nno_dashboard/templates/header.html (1 hunks)
  • workflows/nno_dashboard/templates/main_table.html (1 hunks)
  • workflows/nno_dashboard/templates/test_flavor_section.html (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-08-27T18:08:59.287Z
Learnt from: empovit
Repo: rh-ecosystem-edge/nvidia-ci PR: 291
File: workflows/gpu-operator-versions/nvidia_gpu_operator.py:58-58
Timestamp: 2025-08-27T18:08:59.287Z
Learning: In the nvidia-ci repository, the GH_AUTH_TOKEN environment variable is set using base64 encoding of the GitHub token: `export GH_AUTH_TOKEN=$(echo ${{ secrets.GITHUB_TOKEN }} | base64)` and this works successfully with GitHub Container Registry (GHCR) Bearer authentication in workflows/gpu-operator-versions/nvidia_gpu_operator.py.

Applied to files:

  • .github/workflows/generate_matrix_page.yaml
📚 Learning: 2025-11-09T13:12:43.631Z
Learnt from: empovit
Repo: rh-ecosystem-edge/nvidia-ci PR: 359
File: workflows/gpu_operator_versions/update_versions.py:100-106
Timestamp: 2025-11-09T13:12:43.631Z
Learning: In the nvidia-ci repository's `workflows/gpu_operator_versions/update_versions.py`, the `gpu_releases` variable is intentionally limited to the latest 2 GPU operator versions (via `get_latest_versions(list(gpu_versions.keys()), 2)` in main()). For maintenance OCP versions with pinned GPU operators, if the pinned version is not among the latest 2 releases, the test is intentionally skipped with a warning rather than falling back to test with any available version. This is because testing with outdated GPU operators would fail in their infrastructure, and they prefer to notice missing tests rather than deal with failed test runs, especially since GitHub organization repositories don't send notification emails about failed workflows.

Applied to files:

  • workflows/gpu_operator_dashboard/fetch_ci_data.py
  • workflows/nno_dashboard/README.md
📚 Learning: 2025-08-31T15:44:50.325Z
Learnt from: empovit
Repo: rh-ecosystem-edge/nvidia-ci PR: 295
File: workflows/gpu_operator_dashboard/fetch_ci_data.py:61-64
Timestamp: 2025-08-31T15:44:50.325Z
Learning: In the nvidia-ci project, gcsweb URLs are preferred over prow.ci.openshift.org URLs for the GPU operator dashboard because gcsweb allows easier navigation up the directory structure, while Prow UI requires switching to "Artifacts" first.

Applied to files:

  • workflows/nno_dashboard/README.md
🧬 Code graph analysis (2)
workflows/nno_dashboard/fetch_ci_data.py (1)
workflows/gpu_operator_dashboard/fetch_ci_data.py (12)
  • http_get_json (48-52)
  • fetch_gcs_file_content (55-64)
  • build_prow_job_url (67-69)
  • TestResultKey (75-84)
  • TestResult (88-136)
  • merge_bundle_tests (479-509)
  • merge_release_tests (517-570)
  • merge_ocp_version_results (573-639)
  • merge_and_save_results (642-666)
  • int_or_none (674-680)
  • build_key (121-124)
  • has_exact_versions (126-136)
workflows/nno_dashboard/generate_ci_dashboard.py (2)
workflows/common/templates.py (1)
  • load_template (9-37)
workflows/gpu_operator_dashboard/generate_ci_dashboard.py (5)
  • has_valid_semantic_versions (15-43)
  • build_catalog_table_rows (89-169)
  • build_notes (172-187)
  • build_toc (190-201)
  • build_bundle_info (204-244)
🪛 LanguageTool
workflows/nno_dashboard/README.md

[grammar] ~76-~76: Use a hyphen to join words.
Context: ...te_ci_dashboard` - Uses Network Operator specific templates (in `templates/` dire...

(QB_NEW_EN_HYPHEN)

🪛 markdownlint-cli2 (0.18.1)
workflows/nno_dashboard/README.md

37-37: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.5)
workflows/nno_dashboard/fetch_ci_data.py

1-1: Shebang is present but file is not executable

(EXE001)


101-101: Do not catch blind exception: Exception

(BLE001)


168-168: Unpacked variable repo is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


289-289: f-string without any placeholders

Remove extraneous f prefix

(F541)


293-293: Unpacked variable repo is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

workflows/nno_dashboard/generate_ci_dashboard.py

1-1: Shebang is present but file is not executable

(EXE001)


64-64: Consider moving this statement to an else block

(TRY300)

🔇 Additional comments (4)
workflows/nno_dashboard/fetch_ci_data.py (1)

387-461: Confirm strict OCP version gating via has_exact_versions() is intentional

In process_tests_for_pr (Lines 387–461, 478–495):

  • You derive result from process_single_build, then immediately do:

    actual_ocp_version = result.ocp_full_version if result.has_exact_versions() else None
    if not actual_ocp_version:
        continue

    and later further validate that actual_ocp_version starts with a digit and contains a dot.

  • Compared to the GPU workflow, this is stricter: GPU bundle tests are still keyed by the OCP version parsed from the path even when has_exact_versions() is false, whereas here any build that fails has_exact_versions() is skipped entirely (both bundle and release), unless it falls into the special-case correction for legacy “doca4/bare-metal/hosted” tokens.

This is reasonable if you explicitly want to drop all runs where either OCP or operator version strings are not “exact” and parseable, but it does mean:

  • Any future changes in version formatting that break has_exact_versions() will quietly exclude those runs from the NNO dashboard.
  • Bundle-only jobs with missing/bad ocp.version / operator.version files will be skipped even though the path-derived OCP version may still be usable.

If this stricter behavior is deliberate, consider adding a short comment above this block to document that design choice. Otherwise, you might want to:

  • Use ocp_version from the regex as a fallback key for bundle tests, or
  • Log a warning when has_exact_versions() fails so it’s visible when data is being dropped.

Also applies to: 478-495
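
If the warning/fallback option is preferred, the resolution could be factored into a small helper along these lines. The names result, has_exact_versions, ocp_full_version, and the path-derived version come from the discussion above; the helper itself is only a sketch, not code from the PR.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def resolve_ocp_version(result, path_ocp_version: Optional[str], build_id: str) -> Optional[str]:
    """Prefer the exact OCP version; otherwise fall back to the path-derived one.

    Returning None tells the caller to skip the build, and the warnings make
    dropped data visible in the logs.
    """
    if result.has_exact_versions():
        return result.ocp_full_version
    if path_ocp_version:
        logger.warning("Build %s: inexact versions, falling back to path-derived OCP %s",
                       build_id, path_ocp_version)
        return path_ocp_version
    logger.warning("Build %s: no usable OCP version, skipping", build_id)
    return None
```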

workflows/nno_dashboard/requirements.txt (1)

1-11: Dependencies are already well-aligned across workflows; no changes needed.

Verification confirms that the NNO dashboard's pinned versions match perfectly with the GPU operator dashboard and are consistent with other workflow dashboards. All shared dependencies (requests, semver, urllib3) use identical versions across all workflow files, and pydantic versions (2.11.1) are identical where both files use them. No alignment issues found.

workflows/nno_dashboard/templates/main_table.html (1)

1-9: Template structure and placeholders look consistent with generator usage

The anchor id, container markup, and {notes}, {test_flavors_sections}, {bundle_info} placeholders line up with how generate_test_matrix fills this template, so this fragment should integrate cleanly with the TOC and per‑OCP blocks.

workflows/nno_dashboard/templates/footer.html (1)

1-6: Footer wiring for last-updated timestamp looks good

{LAST_UPDATED} is correctly isolated in a dedicated container and matches the replacement logic in generate_test_matrix, with proper closing tags.

Comment on lines +39 to +67
```python
def is_valid_ocp_version(version_key: str) -> bool:
    """
    Check if a version key is a valid OpenShift version.
    Valid: "4.17.16", "4.16", "4.15.0"
    Invalid: "doca4", "bare-metal", "hosted", "Unknown"
    """
    # Filter out known infrastructure types
    invalid_keys = ["doca4", "bare-metal", "hosted", "unknown"]
    if version_key.lower() in invalid_keys:
        return False

    # Valid OCP versions start with a digit and contain dots
    if not version_key or not version_key[0].isdigit():
        return False

    # Check if it looks like a semantic version (X.Y or X.Y.Z)
    parts = version_key.split('.')
    if len(parts) < 2:
        return False

    try:
        # Try to parse first two parts as numbers
        int(parts[0])
        int(parts[1])
        return True
    except (ValueError, IndexError):
        return False
```

Contributor


⚠️ Potential issue | 🟠 Major

Fix OCP version ordering and consider small lint cleanups

  1. OCP versions are sorted lexicographically, not numerically

Currently:

valid_ocp_keys = [key for key in ocp_data.keys() if is_valid_ocp_version(key)]
sorted_ocp_keys = sorted(valid_ocp_keys, reverse=True)

Because this is a plain string sort, versions like "4.9" will sort after "4.10" in an unintuitive order. To get a proper numerical/semantic ordering without extra dependencies, you can sort by the numeric components:

-    valid_ocp_keys = [key for key in ocp_data.keys() if is_valid_ocp_version(key)]
-    sorted_ocp_keys = sorted(valid_ocp_keys, reverse=True)
+    valid_ocp_keys = [key for key in ocp_data.keys() if is_valid_ocp_version(key)]
+    sorted_ocp_keys = sorted(
+        valid_ocp_keys,
+        key=lambda v: tuple(int(p) for p in v.split(".")[:3]),
+        reverse=True,
+    )

This keeps behavior aligned with how OCP versions are usually presented in dashboards.

  2. Optional: address Ruff hints (non-functional)
  • If you don’t intend to run this file directly, you can drop the shebang line to silence EXE001, since the workflow uses python -m workflows.nno_dashboard.generate_ci_dashboard anyway.
  • The return True inside the try block in is_valid_ocp_version is fine as-is; if you want to satisfy TRY300, you could move it into an else: after the except, but that’s purely stylistic.

Also applies to: 115-133
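
As a quick sanity check of the suggested numeric key (assuming every dot-separated component that survives is_valid_ocp_version is numeric):

```python
valid_ocp_keys = ["4.9", "4.10", "4.17.16", "4.16"]

sorted_ocp_keys = sorted(
    valid_ocp_keys,
    key=lambda v: tuple(int(p) for p in v.split(".")[:3]),
    reverse=True,
)

print(sorted_ocp_keys)  # ['4.17.16', '4.16', '4.10', '4.9']
# A plain reverse string sort would instead give ['4.9', '4.17.16', '4.16', '4.10'].
```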

🧰 Tools
🪛 Ruff (0.14.5)

64-64: Consider moving this statement to an else block

(TRY300)

🤖 Prompt for AI Agents
In workflows/nno_dashboard/generate_ci_dashboard.py around lines 39-67 (and
similarly 115-133), is_valid_ocp_version and subsequent sorting use string-based
ordering which misorders versions like "4.9" vs "4.10"; update the sort to sort
by numeric version components instead of plain strings by splitting the version
on '.', converting segments to ints (pad shorter versions with zeros or sort by
tuple of ints) and use that tuple as the sort key with reverse=True for
descending order; optionally remove the shebang on top if the file is not
executed directly to silence EXE001 and, if desired for style only, move the
successful return out of the try into an else to satisfy TRY300.

# The functions just reference the field name, they don't care about the actual operator type.


def is_valid_ocp_version(version_key: str) -> bool:
Member


I think you can use the code we already have because we are doing the same thing, and instead of gpu_operator_dashboard we can have something more generic like dashboard or something similar.

Member


There seems to be a lot of code duplication. Let's try to either make the dashboard generation code flexible enough to accommodate both the NNO and the GPU operator, or refactor the common code, reuse it for NNO/GPU, and implement just the customization.

# GPU Operator dashboard paths
echo "DASHBOARD_DATA_FILEPATH=${DASHBOARD_OUTPUT_DIR}/gpu_operator_matrix.json" >> "$GITHUB_ENV"
echo "DASHBOARD_HTML_FILEPATH=${DASHBOARD_OUTPUT_DIR}/gpu_operator_matrix.html" >> "$GITHUB_ENV"
# Network Operator dashboard paths
Member


Why do we want to have the NNO dashboard updated in the same workflow as the GPU operator? But even then, I think it would be better to separate the NNO steps from the GPU ones, e.g. "Set GPU operator env vars", "Set NNO env vars", "Fetch GPU operator CI results", "Fetch NNO CI results", etc.

"bundle_tests": [],
"release_tests": [],
"job_history_links": [],
"test_flavors": {}
Member


What is the meaning of test flavors? It looks like you want to incorporate them into "bundle_tests"/"release_tests" instead of having a separate section. E.g.

| OpenShift | GPU Operator | Network Operator |
| --- | --- | --- |
| 4.19.17 | 25.10.1 | 25.3 (eth), 25.3 (infiniband) |

Let's think what would be the best way to organize the data.
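
Purely as a strawman for this discussion, the per-OCP entry could fold flavors into release_tests rather than keeping a parallel test_flavors map. All keys and values below are invented for illustration and mirror the example row above.

```python
# Strawman only: field names are illustrative, not taken from the PR.
ocp_entry_example = {
    "4.19.17": {
        "notes": [],
        "bundle_tests": [],
        "release_tests": [
            {"operator": "gpu-operator", "version": "25.10.1", "status": "SUCCESS"},
            {"operator": "network-operator", "version": "25.3", "flavor": "eth", "status": "SUCCESS"},
            {"operator": "network-operator", "version": "25.3", "flavor": "infiniband", "status": "SUCCESS"},
        ],
        "job_history_links": [],
    }
}
```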

Member


Also, this doesn't seem to belong in "gpu_operator_dashboard". We'll need to change the directory structure. Maybe keep the shared code separately, and the operator-specific code in the respective directories.

@@ -0,0 +1,531 @@
#!/usr/bin/env python
Member


I'm not sure why we need to change both gpu_operator_dashboard/fetch_ci_data.py and nno_dashboard/fetch_ci_data.py. Either one script can handle both operators, or they are completely independent and the GPU flow remains intact. It looks like there's a lot of shared code, and we should try to avoid duplication. Essentially, fetching test results should not be very different for NNO than for GPU.

