Skip to content

Conversation

@dmitsh
Copy link
Collaborator

@dmitsh dmitsh commented Nov 4, 2025

No description provided.

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This PR adds NVLink domain discovery support to the NetQ provider by introducing a new NMX API integration.

Key Changes

  • New nmx.go file implements compute node fetching from NMX API endpoint using Basic Auth
  • Added getComputeNodes() to retrieve domain-to-host mappings via /nmx/v1/compute-nodes
  • Updated NetQ topology URL from relative path to absolute path with api/netq/ prefix
  • Improved test assertion in netq_test.go by adding nil check before error validation

Issues Found

  • Critical: Test file references missing tests/output/netq/computeNodes.json - tests will fail
  • Critical: Potential nil pointer dereference in error handling when httpreq.DoRequest fails (existing pattern in codebase)

Architecture

The new functionality follows established patterns from other providers (infiniband, DRA) for domain mapping, using DomainMap.AddHost() to associate domain UUIDs with compute node names.

Confidence Score: 2/5

  • This PR has critical issues that will cause test failures and potential runtime panics
  • Score of 2 reflects two critical issues: (1) missing test data file will cause immediate test failure, and (2) nil pointer dereference bug in error handling could cause runtime panics. While the nil pointer issue exists in the existing codebase, it's being propagated in new code
  • Pay close attention to pkg/providers/netq/nmx_test.go (missing test data) and pkg/providers/netq/nmx.go (nil pointer issue in error handling)

Important Files Changed

File Analysis

Filename Score Overview
pkg/providers/netq/nmx.go 3/5 New file adding NVLink domain discovery via NMX API; missing test data file and potential nil pointer dereference
pkg/providers/netq/nmx_test.go 2/5 New test file with missing test data file tests/output/netq/computeNodes.json will cause test failures

Sequence Diagram

sequenceDiagram
    participant Client
    participant Provider
    participant NetQ as NetQ API
    participant NMX as NMX API
    
    Note over Client,NMX: NVLink Domain Discovery Flow
    
    Client->>Provider: Request topology with NVLink domains
    
    rect rgb(240, 240, 255)
        Note over Provider,NMX: New: NMX Compute Nodes Discovery
        Provider->>Provider: getComputeUrl()
        Provider->>Provider: Create Basic Auth header
        Provider->>NMX: GET /nmx/v1/compute-nodes
        NMX-->>Provider: ComputeNode[] (ID, Name, DomainUUID)
        Provider->>Provider: parseComputeNodes()
        Provider->>Provider: Build DomainMap
    end
    
    rect rgb(255, 240, 240)
        Note over Provider,NetQ: Existing: Network Topology Discovery
        Provider->>NetQ: POST /auth/v1/login
        NetQ-->>Provider: Access Token
        Provider->>NetQ: GET /auth/v1/select/opid/{opid}
        NetQ-->>Provider: Updated Token
        Provider->>NetQ: POST /api/netq/telemetry/v1/object/topologygraph/fetch-topology
        NetQ-->>Provider: Topology (nodes, links)
        Provider->>Provider: parseNetq()
    end
    
    Provider-->>Client: Topology with NVLink domains
Loading

4 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

@codecov
Copy link

codecov bot commented Nov 4, 2025

Codecov Report

❌ Patch coverage is 36.58537% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.99%. Comparing base (4ed41ea) to head (ec5eefe).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pkg/providers/netq/provider.go 0.00% 13 Missing ⚠️
pkg/providers/netq/nmx.go 53.84% 11 Missing and 1 partial ⚠️
pkg/providers/netq/netq.go 50.00% 1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (36.58%) is below the target coverage (50.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #180      +/-   ##
==========================================
- Coverage   65.26%   64.99%   -0.28%     
==========================================
  Files          77       78       +1     
  Lines        4221     4257      +36     
==========================================
+ Hits         2755     2767      +12     
- Misses       1359     1382      +23     
- Partials      107      108       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This PR adds NVLink domain discovery support to the NetQ provider by introducing a new NMX API integration. The implementation fetches compute nodes from the /nmx/v1/compute-nodes endpoint and groups them by DomainUUID to build a topology domain map.

Key Changes:

  • Added nmx.go with getComputeNodes() function to query NMX API using Basic Authentication
  • Updated NetQ API endpoint URL to include api/netq/ prefix in the path
  • Added comprehensive unit tests for URL generation and JSON parsing
  • Included realistic test fixture data from NMX API

Critical Issue Found:

  • pkg/providers/netq/nmx.go:40-42 - nil pointer dereference when httpreq.DoRequest fails (see existing comment)

Note: The same nil pointer pattern exists in the pre-existing netq.go file at lines 69, 94, and 121, but these are outside the scope of this PR.

Confidence Score: 2/5

  • This PR introduces a critical nil pointer dereference bug that will cause runtime panics
  • Score of 2 reflects one critical bug that must be fixed: the nil pointer dereference at nmx.go:42 will cause a panic when HTTP requests fail. The rest of the implementation follows existing patterns, has good test coverage, and the API endpoint correction appears valid.
  • Pay close attention to pkg/providers/netq/nmx.go - it contains a critical nil pointer dereference bug that needs immediate fixing

Important Files Changed

File Analysis

Filename Score Overview
pkg/providers/netq/nmx.go 2/5 New file implementing NVLink domain discovery via NMX API - critical nil pointer dereference bug at line 42
pkg/providers/netq/netq.go 4/5 Updated TopologyURL constant to include api/netq/ prefix - minor API endpoint correction
pkg/providers/netq/nmx_test.go 5/5 Comprehensive test coverage for URL generation and compute nodes parsing

Sequence Diagram

sequenceDiagram
    participant Client
    participant Provider
    participant NMX as NMX API
    participant Parser

    Client->>Provider: getComputeNodes(ctx)
    Provider->>Provider: getComputeUrl()
    Provider->>Provider: Build URL & Basic Auth headers
    Provider-->>Provider: Return URL, headers
    
    Provider->>NMX: GET /nmx/v1/compute-nodes
    Note over NMX: Authenticate via Basic Auth
    
    alt Success
        NMX-->>Provider: 200 OK + JSON data
        Provider->>Parser: parseComputeNodes(data)
        Parser->>Parser: Unmarshal JSON to []ComputeNode
        Parser->>Parser: Create DomainMap
        loop For each ComputeNode
            Parser->>Parser: AddHost(DomainUUID, Name, Name)
        end
        Parser-->>Provider: DomainMap
        Provider-->>Client: DomainMap, nil
    else Error
        NMX-->>Provider: Error response
        Provider-->>Client: nil, httperr.Error
    end
Loading

5 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@dmitsh dmitsh marked this pull request as draft November 4, 2025 21:31
@dmitsh dmitsh force-pushed the ds-netq branch 3 times, most recently from 0a1481a to 99e62d4 Compare November 6, 2025 23:24
@dmitsh dmitsh marked this pull request as ready for review November 6, 2025 23:24
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This PR adds NVLink domain discovery support to the NetQ provider. The implementation fetches compute node information from a new NMX API endpoint (/nmx/v1/compute-nodes) and extracts domain UUIDs to build a topology block structure alongside the existing network tree topology.

Key changes:

  • Added nmx.go with getNvlDomains() function that queries the NMX API using Basic authentication
  • Refactored provider.go to fetch both NVLink domains and network tree, then combine them into a unified topology structure
  • Renamed generateTopologyConfig() to getNetworkTree() and simplified its return value by removing an unnecessary vertex wrapper layer
  • Updated topology URL path from telemetry/v1/... to api/netq/telemetry/v1/...
  • Added comprehensive test coverage including URL generation and parsing logic

The implementation follows the same patterns used in other providers (DRA, Infiniband) for domain mapping, properly handles errors, and includes appropriate test fixtures.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The code is well-structured, follows existing patterns in the codebase, includes proper error handling, and has comprehensive test coverage. No critical issues were found.
  • No files require special attention

Important Files Changed

File Analysis

Filename Score Overview
pkg/providers/netq/nmx.go 5/5 Added new file implementing NVLink domain discovery via NMX API endpoint using Basic auth
pkg/providers/netq/provider.go 5/5 Integrated NVLink domain discovery into topology generation by fetching domains and combining with network tree
pkg/providers/netq/netq.go 5/5 Renamed function and simplified return value to remove unnecessary wrapper layer

Sequence Diagram

sequenceDiagram
    participant Client
    participant Provider
    participant NetQ as NetQ API (getNetworkTree)
    participant NMX as NMX API (getNvlDomains)
    
    Client->>Provider: GenerateTopologyConfig(ctx, instances)
    
    Provider->>NMX: getNvlDomains(ctx)
    NMX->>NMX: getComputeUrl() - Build URL with Basic Auth
    NMX->>NMX: DoRequest GET /nmx/v1/compute-nodes
    NMX->>NMX: parseComputeNodes(data)
    NMX->>NMX: Create DomainMap from compute nodes
    NMX-->>Provider: Return DomainMap (domain UUID -> nodes)
    
    Provider->>NetQ: getNetworkTree(ctx, instances)
    NetQ->>NetQ: POST /auth/v1/login - Get access token
    NetQ->>NetQ: GET /auth/v1/select/opid - Set OpID
    NetQ->>NetQ: POST /api/netq/telemetry/v1/.../fetch-topology
    NetQ->>NetQ: parseNetq() - Build network tree
    NetQ-->>Provider: Return treeRoot (network topology)
    
    Provider->>Provider: Combine into root Vertex
    Provider->>Provider: root.Vertices[TopologyTree] = treeRoot
    Provider->>Provider: root.Vertices[TopologyBlock] = domains.ToBlocks()
    Provider-->>Client: Return complete topology with NVLink domains
Loading

6 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@greptile-apps
Copy link

greptile-apps bot commented Nov 14, 2025

Greptile Summary

  • Adds NVLink domain discovery by fetching compute nodes from NMX API endpoint and mapping them to topology blocks
  • Refactors existing NetQ provider to separate network tree generation from topology assembly logic

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • Clean implementation with proper error handling, comprehensive tests, graceful degradation on NVL domain fetch failures, and well-structured refactoring that maintains backward compatibility
  • No files require special attention

Important Files Changed

Filename Overview
pkg/providers/netq/nmx.go New file implementing NVLink domain discovery via NMX API with proper error handling and authentication
pkg/providers/netq/provider.go Integrates NVLink domain discovery with graceful error handling, combining network tree and domain blocks
pkg/providers/netq/netq.go Refactored to extract network tree logic and updated API URL path

Sequence Diagram

sequenceDiagram
    participant User
    participant Provider
    participant NMX as "NMX API"
    participant NetQ as "NetQ API"
    
    User->>Provider: "GenerateTopologyConfig()"
    Provider->>NMX: "GET /nmx/v1/compute-nodes"
    Note over Provider,NMX: "Basic Auth (user:password)"
    NMX-->>Provider: "Compute nodes with DomainUUIDs"
    Provider->>Provider: "parseComputeNodes()"
    Provider->>Provider: "Build DomainMap"
    Provider->>NetQ: "POST /auth/v1/login"
    NetQ-->>Provider: "Access token"
    Provider->>NetQ: "GET /auth/v1/select/opid"
    NetQ-->>Provider: "Operation ID confirmed"
    Provider->>NetQ: "POST /api/netq/telemetry/v1/object/topologygraph/fetch-topology"
    NetQ-->>Provider: "Network topology data"
    Provider->>Provider: "parseNetq()"
    Provider->>Provider: "Combine tree root + domain blocks"
    Provider-->>User: "Complete topology with NVLink domains"
Loading

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 files reviewed, no comments

Edit Code Review Agent Settings | Greptile
React with 👍 or 👎 to share your feedback on this new summary format

@dmitsh dmitsh merged commit 7f345bf into main Nov 19, 2025
5 of 6 checks passed
@dmitsh dmitsh deleted the ds-netq branch November 19, 2025 18:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants