@jgehrcke jgehrcke commented Oct 6, 2025

Small but valuable log message improvements that I found myself wanting while investigating issues over the last few days.

  1. Explicit progress report on regular shutdown of the CD daemon
  2. Log claim name + namespace on prep for debuggability

About (1): occasionally, I wondered whether the global wait group's wg.Wait() had actually returned and, if it had not, which of its tasks had not finished yet.

About (2): the standard `namespace/name:UID` format is already used elsewhere, for example in unprepare:

unprepared devices for claim 'default/nvbandwidth-test-2-worker-0-compute-domain-channel-mnkrw:67417068-92da-4bbd-803c-ddcbbf0f067f'

In prepare, we have so far logged only the claim UID:

prepared devices for claim '67417068-92da-4bbd-803c-ddcbbf0f067f': ...

That makes things harder to debug: grepping the plugin logs for the claim name should find the prepare messages too.

Log claim name on prep for debuggability

Standard format is for example used in unprepare:

    unprepared devices for claim \
    'default/nvbandwidth-test-2-worker-0-compute-domain-channel-mnkrw:67417068-92da-4bbd-803c-ddcbbf0f067f'

In prepare, we so far only log by claim UID:

    prepared devices for claim '67417068-92da-4bbd-803c-ddcbbf0f067f': ...

That makes things harder to debug (grepping plugin logs for
claim name should work).

Also, improve progress report on regular shutdown of the CD daemon.

Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]>
@jgehrcke jgehrcke force-pushed the jp/misc-logging-improvements branch from 56c00ff to 0207dab Compare October 7, 2025 07:24
  if err != nil {
      res := kubeletplugin.PrepareResult{
-         Err: fmt.Errorf("error preparing devices for claim %v: %w", claim.UID, err),
+         Err: fmt.Errorf("error preparing devices for claim %s/%s:%s: %w", claim.Namespace, claim.Name, claim.UID, err),

Maybe we can have a convenience method for this normalized string representation, just like we have for claimref.

@klueska klueska added the debuggability issue/pr related to the ability to debug the system label Oct 7, 2025
@klueska klueska added this to the v25.8.0 milestone Oct 7, 2025
@jgehrcke jgehrcke merged commit 2b7e899 into NVIDIA:main Oct 7, 2025
14 checks passed