rfc44: support jobid and possibly other fields in drain events

There are several issues related to reporting errors to users by looking at non-job-specific data:

https://github.com/flux-framework/flux-core/issues/7135
https://github.com/flux-framework/flux-core/issues/7136

We would like to parse `resource.eventlog` to associate a drain event with a specific job.  However, drain events currently carry no extra information, which makes this inconvenient at best and likely very hard once a number of corner cases are thrown in.

For example, let's say that housekeeping fails and drains a node.  A user or admin may want to associate the housekeeping failure with the job that ran just before it.  In order to do that we would have to 1) see what time the node was drained and 2) find the last job that finished on that node before that timestamp.
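To make the two steps concrete, here is a rough sketch of that lookup.  It is not Flux code: `FinishedJob`, its fields, and `last_job_before_drain` are hypothetical names, and it assumes we have already collected per-node job finish times from somewhere.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FinishedJob:
    """Hypothetical record of a job that finished on a given broker rank."""
    jobid: str
    rank: int
    t_finish: float  # finish time, seconds since the epoch


def last_job_before_drain(
    jobs: Iterable[FinishedJob], rank: int, t_drain: float
) -> Optional[FinishedJob]:
    """Return the job that finished most recently on `rank` at or before
    `t_drain`, or None if no job finished on that rank before the drain."""
    candidates = [j for j in jobs if j.rank == rank and j.t_finish <= t_drain]
    return max(candidates, key=lambda j: j.t_finish, default=None)


# Made-up data: two jobs on rank 3, one on rank 4, drain of rank 3 at t=250.
jobs = [
    FinishedJob("job-a", 3, 100.0),
    FinishedJob("job-b", 3, 200.0),
    FinishedJob("job-c", 4, 240.0),
]
match = last_job_before_drain(jobs, rank=3, t_drain=250.0)
```

Note that this is a guess based only on timestamps, which is exactly the weakness: nothing in the data confirms the drain had anything to do with the job it finds.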

But this is the algorithm only if we know a priori that housekeeping drained the node.  The algorithm for associating a drain event from epilog is different.  Also, nothing indicates where the drain event came from.  If an admin drains a node for maintenance right after a job runs, the drain could appear to have come from housekeeping.

I think we should have additional information in the drain event when that information is available.

- jobid - optional field that could tie a drain event to a specific job.  This could be set automatically by `flux resource drain` if the FLUX_JOB_ID environment variable exists.

- reporter - could be housekeeping, epilog, or "user" (command line).  For example, if an admin drains a node for some very specific purpose, we'll know it wasn't related to a job per se.
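As a strawman, an extended drain event might be built something like the sketch below.  The `jobid` and `reporter` keys are the proposal here and do not exist today; the `idset` and `reason` keys, the `make_drain_event` helper, and the sample values are assumptions for illustration only, not the current RFC text or an existing API.

```python
import json
import os
import time
from typing import Mapping, Optional


def make_drain_event(
    idset: str,
    reason: str,
    reporter: str = "user",
    environ: Optional[Mapping[str, str]] = None,
    timestamp: Optional[float] = None,
) -> dict:
    """Build a drain eventlog entry (timestamp/name/context) carrying the
    proposed optional fields."""
    env = os.environ if environ is None else environ
    context = {"idset": idset, "reason": reason, "reporter": reporter}
    # Proposed behavior: attach jobid only when FLUX_JOB_ID is set, so that
    # drains issued outside of a job carry no job association at all.
    jobid = env.get("FLUX_JOB_ID")
    if jobid:
        context["jobid"] = jobid
    return {
        "timestamp": time.time() if timestamp is None else timestamp,
        "name": "drain",
        "context": context,
    }


# Drain issued from housekeeping, where FLUX_JOB_ID is present (made-up id).
from_job = make_drain_event(
    "3", "housekeeping failed", reporter="housekeeping",
    environ={"FLUX_JOB_ID": "f2WxyzExample"}, timestamp=250.0,
)
# Drain issued by an admin on the command line, with no job context.
from_admin = make_drain_event(
    "3", "maintenance", environ={}, timestamp=260.0,
)
print(json.dumps(from_job))
print(json.dumps(from_admin))
```

With something like this, a consumer could match a drain to a job by reading `jobid` directly, and fall back to timestamp guessing only when the field is absent.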

Maybe some other info could be useful.  These are the only bits I've thought of so far.
