Integration of early Dropped-Jobs, RNG Seeding and Workload generator & alternative "prices" script. #3
Conversation
📝 Walkthrough

Pre-merge checks and finishing touches:
- ❌ Failed checks (1 warning)
- ✅ Passed checks (2 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
environment.py (1)
81-101: Centralized job-slot clearing helpers are good, but next_empty debug reference is stale

The new `_clear_job_slot` and `_validate_next_empty` helpers centralize queue maintenance and are a good idea. One small issue:

- In the debug block inside `assign_jobs_to_available_nodes`, you reference `self.next_empty_slot_ref[0]`, but there is no `self.next_empty_slot_ref` attribute; only the local `next_empty_slot_ref` parameter exists.

Since the whole block is currently guarded by `if False`, this won't execute, but it's confusing and will break if you ever flip the flag. Consider updating it to:

```diff
-        if False:
-            if getattr(self, "strict_checks", False) and (self.current_step % 100 == 0):
-                self._validate_next_empty(job_queue_2d, self.next_empty_slot_ref[0])
+        if False:
+            if getattr(self, "strict_checks", False) and (self.current_step % 100 == 0) and next_empty_slot_ref is not None:
+                self._validate_next_empty(job_queue_2d, next_empty_slot_ref[0])
```

to keep the debug helper self-consistent.
Also applies to: 575-596
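The invariant the debug helper guards can be sketched as follows. This is a minimal stand-in, not the actual `_validate_next_empty` from environment.py; the function name, signature, and queue layout here are assumptions for illustration.

```python
import numpy as np

def validate_next_empty(job_queue_2d, next_empty_slot):
    # Sketch of the invariant the debug helper checks: next_empty_slot must
    # point at the first all-zero row of the queue.
    empties = np.where(~job_queue_2d.any(axis=1))[0]
    first_empty = int(empties[0]) if len(empties) else len(job_queue_2d)
    assert next_empty_slot == first_empty, (
        f"next_empty_slot={next_empty_slot}, expected {first_empty}"
    )

queue = np.array([[3, 0, 2, 8], [0, 0, 0, 0], [5, 1, 1, 4]], dtype=np.int32)
validate_next_empty(queue, 1)  # passes: row 1 is the first empty row
```

Routing all slot mutations through one helper and validating against this invariant is what keeps the local parameter and any stale attribute reference from drifting apart.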
🧹 Nitpick comments (3)
deter_wlgen.py (1)
5-71: Sanity harness is solid; consider minor message tweak for uniform mode

The determinism, bounds, and distribution smoke checks are a nice complement to the simpler tests. In `check_distribution_smoke`, the assertion message `"flat mean looks off"` under `if cfg.arrivals == "uniform":` is slightly confusing now that "uniform" is a distinct mode; consider updating the message to mention "uniform" instead.

environment.py (1)
304-331: Drop accounting and penalties integrate well with episode metrics

The new tracking for dropped jobs and queue-full rejections (agent vs baseline) plus the drop-based penalty in `calculate_reward` and the extra episode metrics in `record_episode_completion` are coherently wired:

- `assign_jobs_to_available_nodes` increments `jobs_dropped` / `baseline_jobs_dropped` and per-episode counters when jobs exceed `MAX_JOB_AGE`.
- `reset_state` and `record_episode_completion` correctly reset and report `jobs_dropped`, `drop_rate`, and queue-full rejections for both agent and baseline.
- The excess-dropped penalty compares agent vs baseline (`self.new_excess`) and feeds a (non-positive) `drop_penalty` into the reward, avoiding penalizing equally-bad baselines.

The only nit is that `if drop_penalty > 0: drop_penalty = 0` is redundant, since `PENALTY_DROPPED_JOB` is negative and `self.new_excess` is non-negative, so `drop_penalty` will never be positive. You can safely drop that branch to simplify the code.

Also applies to: 480-506, 762-806, 909-954
workloadgen.py (1)
27-54: Workload generator core logic is sound; document reserved hour_idx parameter

`WorkloadGenConfig` and `WorkloadGenerator._sample_job_count` / `sample` implement the flat/poisson/uniform modes with appropriate caps and bounds, and `JobSpec` cleanly encapsulates job attributes. The only minor point is that `hour_idx` is currently unused in `sample`; adding a short comment (as you did) or a `# noqa` / similar hint will make it clear this is reserved for future daily patterns and silence static analysis noise.

Also applies to: 56-110
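The reserved-argument pattern looks like the sketch below. This is a tiny stand-in for the real `WorkloadGenerator`, with made-up defaults; only the `# noqa: ARG002` idea comes from the review.

```python
import numpy as np

class TinyWorkloadGen:
    """Minimal stand-in for the generator, illustrating the reserved argument."""

    def __init__(self, lam=5.0, max_jobs_per_hour=1500):
        self.lam = lam
        self.max_jobs = max_jobs_per_hour

    def sample(self, rng, hour_idx):  # noqa: ARG002 - reserved for daily patterns
        # hour_idx is intentionally unused for now; keeping it in the signature
        # means daily-pattern modes can be added later without an API break.
        return min(int(rng.poisson(self.lam)), self.max_jobs)
```

Callers pass `hour_idx` today even though it has no effect, so switching on daily patterns later changes behavior without changing any call site.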
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- WIP_prices.py (1 hunks)
- deter_wlgen.py (1 hunks)
- environment.py (20 hunks)
- sampler_hourly.py (2 hunks)
- sanity_workloadgen.py (1 hunks)
- train.py (4 hunks)
- workloadgen.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
sampler_hourly.py
📄 CodeRabbit inference engine (CLAUDE.md)
Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Files:
sampler_hourly.py
train.py
📄 CodeRabbit inference engine (CLAUDE.md)
train.py: Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Files:
train.py
{train.py,weights.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Files:
train.py
environment.py
📄 CodeRabbit inference engine (CLAUDE.md)
environment.py: Use Gymnasium-compatible environment design for RL simulations in the Environment module
Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Files:
environment.py
🧠 Learnings (7)
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
workloadgen.py, sampler_hourly.py, environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_duration.py : Sample job durations from simple duration logs in the Duration Sampler
Applied to files:
sampler_hourly.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
train.py, environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_jobs.py : Use pattern-based replay of historical job batches with full characteristics in the Jobs Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
environment.py
🧬 Code graph analysis (5)
deter_wlgen.py (1)
- workloadgen.py (3): WorkloadGenerator (56-110), WorkloadGenConfig (35-53), sample (99-110)

train.py (2)
- workloadgen.py (2): WorkloadGenerator (56-110), WorkloadGenConfig (35-53)
- environment.py (2): ComputeClusterEnv (67-439), render (72-73)

WIP_prices.py (1)
- environment.py (1): reset (254-284)

sanity_workloadgen.py (1)
- workloadgen.py (3): WorkloadGenerator (56-110), WorkloadGenConfig (35-53), sample (99-110)

environment.py (7)
- WIP_prices.py (2): Prices (4-141), advance_and_get_predicted_prices (86-90)
- prices.py (1): Prices (4-121)
- weights.py (1): Weights (5-15)
- plot.py (1): plot (6-77)
- sampler_jobs.py (3): DurationSampler (7-444), parse_jobs (18-115), sample (123-168)
- sampler_hourly.py (2): parse_jobs (13-101), sample (103-143)
- workloadgen.py (1): sample (99-110)
🪛 Ruff (0.14.7)
workloadgen.py
- 60: Avoid specifying long messages outside the exception class (TRY003)
- 89: Avoid specifying long messages outside the exception class (TRY003)
- 99: Unused method argument: hour_idx (ARG002)
environment.py

Ruff reports a long cascade of invalid-syntax errors, all stemming from the same malformed construct repeated at several sites:

- "Expected a statement" at lines 440, 442, 453, 631, 634, 638, 665, 674, 676, 712, 714, 718, 729, 731, 743 (several occurrences each)
- "Unexpected indentation" at lines 441, 444, 632, 635, 640, 677, 693, 713, 715, 730, 734
- At lines 453, 638, 676, 718, and 743, an identical parse cascade at each line: "Invalid decimal integer literal", repeated "Expected `,`, found name", "Unparenthesized generator expression cannot be used here" (twice), "Invalid assignment target", "Expected `in`, found name" (twice), "Expected `,`, found string", "Expected an identifier"
- Line 691: "unindent does not match any outer indentation level"
🔇 Additional comments (4)
sampler_hourly.py (1)
103-143: Hourly sampler RNG integration and max_jobs cap look correct

Using an external RNG plus `max_jobs` to cap arrivals preserves the independent sampling of job_count/duration/nodes/cores-per-node per hour and fits the intended usage with deterministic generators. Just ensure callers always pass a numpy `Generator`-like rng.

sanity_workloadgen.py (1)
5-43: Determinism, bounds, and Poisson sanity tests are well-structured

These sanity checks exercise the main WorkloadGenerator behaviors (determinism, per-field bounds, and Poisson mean) in a simple, reproducible way and align with the generator's API.
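A determinism smoke check of the kind described can be sketched with a stand-in sampler; the function below is not the repository's generator, just a minimal Poisson-arrivals model with an assumed per-hour cap of 1500:

```python
import numpy as np

def sample_counts(seed, hours=24, lam=5.0, cap=1500):
    # Stand-in for the generator under test: Poisson arrivals, capped per hour.
    rng = np.random.default_rng(seed)
    return [min(int(rng.poisson(lam)), cap) for _ in range(hours)]

trace_a = sample_counts(42)
trace_b = sample_counts(42)  # same seed -> identical trace
trace_c = sample_counts(43)  # different seed -> (almost surely) different trace
```

The same three properties (seed determinism, per-field bounds, distribution mean) are what the sanity scripts assert against the real generator.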
train.py (1)
10-18: Workload generator CLI wiring and env construction look consistent with constraints

The new `--workload-gen` options, `WorkloadGenConfig` construction, and passing `workload_gen` into `ComputeClusterEnv` are all consistent with the environment's limits (duration ≤ 170h, 1-16 nodes, 1-96 cores, ≤ 1500 jobs/hour). The `norm_path` helper also correctly normalizes empty-string paths to `None` while preserving the in-memory `prices` list.

Also applies to: 49-52, 91-107, 110-141
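The CLI wiring and path normalization can be sketched as below. The flag names other than `--workload-gen` and the defaults are assumptions for illustration, not the PR's exact options; `norm_path` follows the behavior the review describes:

```python
import argparse

def build_parser():
    # Hypothetical sketch of the --workload-gen wiring; choices and defaults
    # here are assumptions, not the actual train.py options.
    p = argparse.ArgumentParser()
    p.add_argument("--workload-gen", choices=["flat", "poisson", "uniform"], default=None)
    p.add_argument("--workload-rate", type=float, default=5.0)
    p.add_argument("--seed", type=int, default=None)
    return p

def norm_path(path):
    # Normalize empty-string paths to None, as the review describes.
    return path or None

args = build_parser().parse_args(["--workload-gen", "poisson", "--seed", "1"])
```

Normalizing `""` to `None` lets the environment distinguish "no price file given" from a real path while an in-memory `prices` list can still be passed separately.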
environment.py (1)
159-163: np_random is never seeded or initialized before use

`self.np_random` is set to `None` in `__init__` but is used as a numpy RNG in `step()` through `hourly_sampler.sample()`, `self.workload_gen.sample()`, and `self.np_random.integers()` calls. Since `reset()` does not currently seed or initialize `self.np_random`, the first call to any of these methods will raise an `AttributeError`.

To fix this, initialize the RNG in `reset()` using Gymnasium-compatible seeding:

```diff
 def reset(self, seed = None, options = None):
+    if seed is not None or self.np_random is None:
+        self.np_random, self._seed = seeding.np_random(seed)
+
     self.reset_state()
     self.prices.reset()
```

This ensures deterministic behavior for a given seed and avoids reliance on global RNGs, aligning with the Gymnasium-compatible environment design requirement.
Also applies to: 254-283, 362-417
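The lazy, seed-aware initialization pattern can be shown in isolation. Note the stand-in: `np.random.default_rng` replaces `gymnasium.utils.seeding.np_random` here only to keep the sketch self-contained; the class and method bodies are illustrative, not the real environment:

```python
import numpy as np

class EnvRngMixin:
    # Sketch of lazy, seed-aware RNG setup for a Gymnasium-style env.
    # gymnasium.utils.seeding.np_random(seed) would normally supply the
    # Generator; np.random.default_rng stands in to stay self-contained.
    def __init__(self):
        self.np_random = None  # mirrors the __init__ described in the comment

    def reset(self, seed=None, options=None):
        # Seed on explicit request, or lazily on first reset; later unseeded
        # resets keep the existing stream so runs stay reproducible.
        if seed is not None or self.np_random is None:
            self.np_random = np.random.default_rng(seed)
        # ... reset_state() and prices.reset() would follow here ...
        return int(self.np_random.integers(0, 1000))
```

Two environments reset with the same seed then produce identical draws, while an unseeded `reset()` never discards an already-initialized stream.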
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
environment.py (2)
440-441: Critical: assign_jobs_to_available_nodes returns wrong values.

The call expects `(num_launched_jobs, next_empty_slot)`, but the function at lines 618-691 has critical bugs:

- Initializes `num_processed_jobs` instead of `num_launched_jobs`
- Tries to increment undefined variables `num_launched` and `num_dropped` (lines 661, 674)
- Returns `num_processed_jobs`, which is never incremented

This will cause a `NameError` at runtime. See the detailed fix in the review comment for lines 618-691.
618-691: Critical: Undefined variables cause NameError.

The function has multiple critical bugs:

- Line 619: Initializes `num_processed_jobs = 0`, but this variable is never incremented and doesn't match the return expectations.
- Lines 661, 674: Attempt to increment undefined variables `num_launched` and `num_dropped`, causing a `NameError` at runtime.
- Line 691: Returns `(num_processed_jobs, next_empty_slot)`, but callers expect the first value to be the number of launched jobs.

The function should track launched jobs and return the count along with the updated `next_empty_slot`. Dropped jobs are already tracked via `self.jobs_dropped` and don't need to be returned.

Apply this fix:

```diff
 def assign_jobs_to_available_nodes(self, job_queue_2d, nodes, cores_available, running_jobs, next_empty_slot, is_baseline=False):
-    num_processed_jobs = 0
+    num_launched = 0

     for job_idx, job in enumerate(job_queue_2d):
         job_duration, job_age, job_nodes, job_cores_per_node = job
         if job_duration <= 0:
             continue

         # Candidates: node is on and has enough free cores
         mask = (nodes >= 0) & (cores_available >= job_cores_per_node)
         candidate_nodes = np.where(mask)[0]

         if len(candidate_nodes) >= job_nodes:
             # Assign job to first job_nodes candidates
             job_allocation = []
             for i in range(job_nodes):
                 node_idx = int(candidate_nodes[i])
                 cores_available[node_idx] -= int(job_cores_per_node)
                 nodes[node_idx] = max(int(nodes[node_idx]), int(job_duration))
                 job_allocation.append((node_idx, int(job_cores_per_node)))

             running_jobs[self.next_job_id] = {
                 "duration": int(job_duration),
                 "allocation": job_allocation,
             }
             self.next_job_id += 1

             # Clear job from queue
             job_queue_2d[job_idx] = [0, 0, 0, 0]

             # Update next_empty_slot if we cleared a slot before it
             if job_idx < next_empty_slot:
                 next_empty_slot = job_idx

             # Track job completion and wait time
             if is_baseline:
                 self.baseline_jobs_completed += 1
                 self.baseline_total_job_wait_time += int(job_age)
             else:
                 self.jobs_completed += 1
                 self.total_job_wait_time += int(job_age)

             num_launched += 1
             continue

         # Not enough resources -> job waits and ages (or gets dropped)
         new_age = int(job_age) + 1
         if new_age > MAX_JOB_AGE:
             # Clear job from queue
             job_queue_2d[job_idx] = [0, 0, 0, 0]

             # Update next_empty_slot if we cleared a slot before it
             if job_idx < next_empty_slot:
                 next_empty_slot = job_idx

-            num_dropped += 1
             if is_baseline:
                 self.baseline_jobs_dropped += 1
                 self.baseline_dropped_this_episode += 1
             else:
                 self.jobs_dropped += 1
                 self.dropped_this_episode += 1
         else:
             job_queue_2d[job_idx][1] = new_age

     # DEBUG CHECK for next_empty -> Add a Flag to be called
     if False:
         if getattr(self, "strict_checks", False) and (self.current_step % 100 == 0):
             self._validate_next_empty(job_queue_2d, self.next_empty_slot_ref[0])

-    return num_processed_jobs, next_empty_slot
+    return num_launched, next_empty_slot
```

Note: The `num_dropped` variable is not needed as a return value since dropped jobs are tracked via instance variables (`self.jobs_dropped`, etc.).
🧹 Nitpick comments (1)
environment.py (1)
627-638: Consider removing redundant int() conversions.

The extensive use of `int()` conversions (lines 635-638, 641-642, 656, 659) is defensive but likely unnecessary since `job_queue_2d` is defined with `dtype=np.int32`. These conversions add visual noise without providing value.

If you want to keep them for extra safety, consider adding a comment explaining why they're needed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
environment.py(19 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
environment.py
📄 CodeRabbit inference engine (CLAUDE.md)
environment.py: Use Gymnasium-compatible environment design for RL simulations in the Environment module
Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Files:
environment.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_jobs.py : Use pattern-based replay of historical job batches with full characteristics in the Jobs Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
environment.py
🧬 Code graph analysis (1)
environment.py (5)
- WIP_prices.py (1): advance_and_get_predicted_prices (86-90)
- weights.py (1): Weights (5-15)
- sampler_jobs.py (4): parse_jobs (18-115), precalculate_hourly_jobs (364-383), sample_one_hourly (416-444), sample (123-168)
- sampler_hourly.py (2): parse_jobs (13-101), sample (103-143)
- workloadgen.py (1): sample (99-110)
🪛 Ruff (0.14.7)
environment.py
- 661: Undefined name `num_launched` (F821)
- 661: Local variable `num_launched` is assigned to but never used (F841)
- 674: Undefined name `num_dropped` (F821)
- 674: Local variable `num_dropped` is assigned to but never used (F841)
🔇 Additional comments (8)
environment.py (8)
87-100: LGTM! Good refactoring for job slot management.

The helper methods `_clear_job_slot` and `_validate_next_empty` centralize job queue management logic and add defensive checks. This improves maintainability and makes the invariant (`next_empty_slot` points to the first empty slot) explicit.
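The clear-and-maintain-invariant behavior can be sketched as follows; this is a free-function stand-in, and the real `_clear_job_slot` in environment.py may mutate a list-wrapped reference instead of returning the new index:

```python
import numpy as np

def clear_job_slot(job_queue_2d, job_idx, next_empty_slot):
    # Sketch of a centralized slot-clearing helper: zero the row, then keep
    # the invariant that next_empty_slot never points past the first empty slot.
    job_queue_2d[job_idx] = 0
    return min(next_empty_slot, job_idx)

queue = np.array([[3, 0, 2, 8], [5, 1, 1, 4], [0, 0, 0, 0]], dtype=np.int32)
next_empty = clear_job_slot(queue, 0, 2)
```

Centralizing the `min(...)` update in one place is exactly what prevents the manual clear-sites from letting `next_empty_slot` drift past a freed slot.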
125-126: LGTM! Workload generator integration supports deterministic job generation.

The `workload_gen` parameter enables external workload generation via the WorkloadGenerator class, supporting the PR objective for deterministic and configurable job generation.
254-265: LGTM! Price state integration is correct.

The `reset()` method correctly initializes price state via `self.prices.reset()` and uses `predicted_prices` from the prices object. This aligns with the deterministic price evolution from WIP_prices.
344-344: LGTM! Centralized price advancement.

Using `advance_and_get_predicted_prices()` from the Prices class centralizes the price evolution logic, improving maintainability.
513-514: LGTM! Correct Gymnasium episode termination semantics.

Setting `truncated=True` and `terminated=False` when the time limit is reached correctly follows Gymnasium's API convention. This distinguishes between time-based episode endings (truncated) and task-based endings (terminated).
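The distinction reduces to a few lines; this sketch assumes the 336-hour episode limit from the environment constraints and a helper name chosen for illustration:

```python
MAX_STEPS = 336  # 2-week episode, one step per hour (from the env constraints)

def episode_flags(current_step):
    # Gymnasium semantics: time-limit endings set truncated, not terminated.
    truncated = current_step >= MAX_STEPS
    terminated = False  # this env has no task-success terminal state
    return terminated, truncated
```

Keeping `terminated` false for time limits matters downstream: value-bootstrapping algorithms like PPO treat truncated and terminated episode ends differently.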
17-18: The review comment cannot be verified due to repository access issues. Manual verification is required to confirm the RNG initialization concerns and the appropriateness of the suggested fix.
372-417: Verify hourly_sampler.sample() API signature and assess legacy code cleanup.

Line 384 passes `max_jobs=max_to_generate` to `hourly_sampler.sample()`. Confirm this parameter is supported by the sampler's implementation.

Lines 404-416 contain commented-out legacy code mixed with the new implementation using `self.np_random`. If this represents completed migration to non-global RNG, remove the commented lines to improve code clarity.
8-9: Verify the WIP_prices import is intentional for this integration.

The import has been switched to `WIP_prices` with a "Test re-worked prices script" comment. Ensure this temporary change is intentional for the integration PR and won't cause issues in production or if the WIP module is incomplete.
Actionable comments posted: 0
🧹 Nitpick comments (2)
environment.py (2)
87-101: Leverage _clear_job_slot helper to centralize queue slot management (optional)

You've added `_clear_job_slot` and `_validate_next_empty` (Lines 87-101), but still clear slots and adjust `next_empty_slot` manually in `assign_jobs_to_available_nodes` and the drop branch (Lines 646-652, 668-673). For maintainability and to avoid subtle inconsistencies, you could route these through `_clear_job_slot` instead.

Example for the two clear-and-update sites:

```diff
-            # Clear job from queue
-            job_queue_2d[job_idx] = [0, 0, 0, 0]
-
-            # Update next_empty_slot if we cleared a slot before it
-            if job_idx < next_empty_slot:
-                next_empty_slot = job_idx
+            # Clear job from queue and keep next_empty_slot consistent
+            if next_empty_slot is not None:
+                ref = [next_empty_slot]
+                self._clear_job_slot(job_queue_2d, job_idx, ref)
+                next_empty_slot = ref[0]
+            else:
+                self._clear_job_slot(job_queue_2d, job_idx)
```

Same pattern can be used in the MAX_JOB_AGE drop path. This reduces duplication and ensures your debug validator stays in sync with all slot mutations.

Also applies to: 646-652, 668-673
467-470: Drop penalty integration looks good; consider updating debug print to include it

The new excess-drop tracking (`new_excess` / `prev_excess_dropped`, Lines 467-470) and `drop_penalty = min(0, PENALTY_DROPPED_JOB * self.new_excess)` (Lines 754-755) correctly ensure only additional excess drops are penalized and that the penalty is never positive.

The debug summary at Line 766, however, omits `drop_penalty` from the printed breakdown even though it's part of `reward`. For more transparent debugging:

```diff
-        self.env_print(f" > $$$TOTAL: {reward:.4f} = {efficiency_reward_weighted:.4f} + {price_reward_weighted:.4f} + {idle_penalty_weighted:.4f} + {job_age_penalty_weighted:.4f}")
+        self.env_print(
+            f" > $$$TOTAL: {reward:.4f} = "
+            f"{efficiency_reward_weighted:.4f} + {price_reward_weighted:.4f} "
+            f"+ {idle_penalty_weighted:.4f} + {job_age_penalty_weighted:.4f} "
+            f"+ {drop_penalty:.4f}"
+        )
```
Also applies to: 754-756, 766-766
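The clamped drop penalty described above can be sketched in isolation. The constant value here is illustrative; the real `PENALTY_DROPPED_JOB` and counters live in `environment.py`.

```python
# Sketch of the excess-drop penalty: only newly dropped jobs beyond the
# previously counted excess are penalized, and the term is never positive.
# PENALTY_DROPPED_JOB is assumed negative, as in the reviewed code.
PENALTY_DROPPED_JOB = -0.5

def drop_penalty(total_excess_dropped, prev_excess_dropped):
    new_excess = total_excess_dropped - prev_excess_dropped
    # min(0, ...) guards against a positive "penalty" if new_excess
    # were ever negative (e.g. after a counter reset).
    return min(0, PENALTY_DROPPED_JOB * new_excess)

print(drop_penalty(4, 1))   # three new excess drops -> -1.5
print(drop_penalty(2, 2))   # no new drops -> 0
```

Because the penalty is computed from the delta against `prev_excess_dropped`, the same dropped job is never penalized twice across steps.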
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- environment.py (19 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
environment.py
📄 CodeRabbit inference engine (CLAUDE.md)
environment.py: Use Gymnasium-compatible environment design for RL simulations in the Environment module
Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Files:
environment.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_jobs.py : Use pattern-based replay of historical job batches with full characteristics in the Jobs Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
environment.py
🧬 Code graph analysis (1)
environment.py (4)
WIP_prices.py (1)
- advance_and_get_predicted_prices (86-90)

sampler_jobs.py (2)
- DurationSampler (7-444)
- sample (123-168)

workloadgen.py (1)
- sample (99-110)

sampler_hourly.py (1)
- sample (103-143)
🪛 Ruff (0.14.7)
environment.py
661-661: Undefined name num_launched
(F821)
661-661: Local variable num_launched is assigned to but never used
(F841)
674-674: Undefined name num_dropped
(F821)
674-674: Local variable num_dropped is assigned to but never used
(F841)
🔇 Additional comments (4)
environment.py (4)
22-43: Environment constraints, penalties, and episode metrics align well with design goals
- Cluster and episode constants (Lines 22–43, 45) match the documented configuration: 335 nodes, 96 cores/node, up to 16 nodes/job, 2-week (336-hour) episodes, queue and job bounds, and appropriate power costs.
- `reset()` now pulls predicted prices from `Prices` (Lines 254–265), which keeps initial observations consistent with the price model.
- Reward composition (Lines 721–769) and `record_episode_completion` metrics (Lines 867–912) now factor in drop behavior, queue-full rejections, and baseline comparisons in a clean, inspectable way.

No changes requested here; this structure looks solid for RL training and analysis.
Also applies to: 45-45, 254-265, 721-769, 867-912
159-163: Initialize `self.np_random` in `reset()` before first use

`self.np_random` is set to `None` in `__init__` and never initialized, but `step()` uses it in the hourly sampler path, workload generator path, and legacy randomizer (Lines 372–416). This will raise an `AttributeError` on the first `step()` call.

You're already importing `seeding` and accepting a `seed` argument in `reset()`. A minimal fix is to initialize (or re-seed) the RNG there:

```diff
-    def reset(self, seed = None, options = None):
-        self.reset_state()
-        self.prices.reset()
+    def reset(self, seed=None, options=None):
+        # Initialize or re-seed deterministic RNG
+        if seed is not None or self.np_random is None:
+            self.np_random, self._seed = seeding.np_random(seed)
+
+        self.reset_state()
+        self.prices.reset()
```

This keeps Gymnasium-compatible seeding while ensuring all RNG users have a valid generator.
Also applies to: 254-265, 372-416
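Gymnasium's `seeding.np_random` is a thin wrapper around NumPy's `Generator`, so the lazy, re-seedable reset pattern above can be sketched with NumPy alone. The `Env` class here is a stand-in for illustration, not the real environment.

```python
import numpy as np

class Env:
    """Stand-in showing the lazy, re-seedable RNG pattern from the review."""
    def __init__(self):
        self.np_random = None  # deliberately uninitialized, as in __init__

    def reset(self, seed=None):
        # Seed on first use, and re-seed whenever an explicit seed arrives;
        # reset(seed=None) after that keeps the existing generator.
        if seed is not None or self.np_random is None:
            self.np_random = np.random.default_rng(seed)

    def step_sample(self):
        return int(self.np_random.integers(0, 100))

a, b = Env(), Env()
a.reset(seed=42)
b.reset(seed=42)
print(a.step_sample() == b.step_sample())  # identical seeds -> identical draws
```

Two environments reset with the same seed then produce identical draw sequences, which is the determinism property the training setup relies on.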
618-691: Fix undefined variable references in `assign_jobs_to_available_nodes`

The method initializes `num_processed_jobs = 0` but increments undefined variables `num_launched` (line 661) and `num_dropped` (line 674), causing `NameError` at runtime. The variable `num_processed_jobs` is never incremented, so the return value is always 0.

Increment `num_processed_jobs` instead of the undefined variables:

- Line 661: Change `num_launched += 1` to `num_processed_jobs += 1`
- Line 674: Remove `num_dropped += 1` (already tracked via `self.jobs_dropped` and `self.baseline_jobs_dropped`)
168-177: Fix `self.jobs_sampler` usage when `external_jobs` is provided

In `__init__` you configure the global `jobs_sampler` (Lines 168–176), but in `step()` you reference `self.jobs_sampler.sample_one_hourly(...)` (Line 364). If `self.jobs_sampler` is never assigned, environments with `external_jobs` will hit an `AttributeError`.

Either revert to using the module-level `jobs_sampler` in `step()`, or attach it to the instance in `__init__`:

```diff
 if self.external_jobs:
     print(f"Loading jobs from {self.external_jobs}")
     jobs_sampler.parse_jobs(self.external_jobs, 60)
@@
     print(f"Parsed hourly jobs for {len(jobs_sampler.hourly_jobs)} hours")
+    self.jobs_sampler = jobs_sampler
```

Then `step()` can safely use `self.jobs_sampler.sample_one_hourly(wrap=True)["hourly_jobs"]`.

Also applies to: 362-365
…step and baseline_step. Assign_jobs now delivers "launched" and "dropped" numbers. Added penalty in reward function (not normalized or weighted yet). Centralized "remove job from list". Changed from "terminated" to "truncated", to account for "time-limit" end. Fix: Untracked mergers now up to date Minor Fixes: CodeRabbit's suggestions
…RNG is used in any function that uses any type of RNG. Conserves determinism for training. Adjusted "sampler_hourly" to allow for this logic, and the way it's read in -> Now it's a "self." function.
Force-pushed from 8cfa7b9 to 0ce559a
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
environment.py (1)
618-691: Critical: `num_launched` and `num_dropped` are undefined.

The function initializes `num_processed_jobs = 0` on line 619, but then increments `num_launched` (line 661) and `num_dropped` (line 674), which are never defined. This will raise `NameError` at runtime.

Either rename `num_processed_jobs` or add the missing initializations:

```diff
 def assign_jobs_to_available_nodes(self, job_queue_2d, nodes, cores_available, running_jobs, next_empty_slot, is_baseline=False):
-    num_processed_jobs = 0
+    num_launched = 0
+    num_dropped = 0
     for job_idx, job in enumerate(job_queue_2d):
         ...
-        num_launched += 1
+        num_launched += 1  # now defined
         continue
         ...
-        num_dropped += 1
+        num_dropped += 1  # now defined
     ...
-    return num_processed_jobs, next_empty_slot
+    return num_launched, next_empty_slot
```

Note: `num_dropped` is tracked but never returned. If drop count is needed by callers, update the return signature.
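The counting bug above can be made concrete with a stripped-down scheduler loop. The queue format and return values are simplified for illustration; this is not the real method in `environment.py`.

```python
# Toy version of the assignment loop: jobs are (nodes_needed, age) tuples.
# Every counter the loop increments is initialized up front, so no
# NameError can occur, and the launched count is returned explicitly.
MAX_JOB_AGE = 3

def assign_jobs(queue, free_nodes):
    num_launched = 0
    num_dropped = 0
    remaining = []
    for nodes_needed, age in queue:
        if age >= MAX_JOB_AGE:
            num_dropped += 1          # too old: drop instead of launching
        elif nodes_needed <= free_nodes:
            free_nodes -= nodes_needed
            num_launched += 1         # launched successfully
        else:
            remaining.append((nodes_needed, age))
    return num_launched, num_dropped, remaining

launched, dropped, left = assign_jobs([(2, 0), (4, 5), (8, 1)], free_nodes=6)
print(launched, dropped, left)
```

Returning the drop count alongside the launch count also resolves the review's note about `num_dropped` being tracked but never surfaced to callers.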
♻️ Duplicate comments (3)
WIP_prices.py (1)
20-30: Guard against empty `external_prices` arrays.

If an empty sequence is passed, `np.min` and `np.percentile` will raise errors. Add an explicit check:

```diff
 if self.original_prices is not None:
     prices = np.asarray(self.original_prices, dtype=np.float32)
+    if prices.size == 0:
+        raise ValueError("external_prices must be a non-empty sequence")
     min_price = float(np.min(prices))
```

environment.py (2)
363-364: Inconsistent sampler reference.

Line 364 references `self.jobs_sampler`, but this instance attribute is only conditionally created in `reset_state()` (lines 332-335). If `reset_state()` is not called or the condition is false, `self.jobs_sampler` won't exist, yet `__init__` uses the singleton `jobs_sampler` (lines 170-176).

This creates confusion between the singleton `jobs_sampler` and instance `self.jobs_sampler`. Standardize on one approach.
332-335: Redundant sampler re-initialization still present.

This re-initialization was flagged in a previous review as redundant (the sampler is already initialized in `__init__` lines 168-176) and marked as addressed, but the code is still present. This causes unnecessary file I/O on every episode reset.

```diff
-        if self.external_jobs and not self.workload_gen:
-            self.jobs_sampler = DurationSampler()
-            self.jobs_sampler.parse_jobs(self.external_jobs, 60)
-            self.jobs_sampler.precalculate_hourly_jobs(CORES_PER_NODE, MAX_NODES_PER_JOB)
```
🧹 Nitpick comments (9)
sampler_hourly.py (2)
4-4: Unused `random` import.

The `random` module is imported but no longer used after switching to `rng.choice`. Consider removing it to keep imports clean:

```diff
-import random
```
103-103: Type hint `int | None` requires Python 3.10+.

The union syntax `int | None` is only available in Python 3.10+. For broader compatibility, use `Optional[int]` from the `typing` module:

```diff
+from typing import Optional
+
-    def sample(self, hour_of_day: int, rng, max_jobs: int | None = None):
+    def sample(self, hour_of_day: int, rng, max_jobs: Optional[int] = None):
```

WIP_prices.py (2)
75-77: Consider using `collections.deque` for price history.

Using `list.pop(0)` is O(n). For a fixed-size rolling window, `collections.deque(maxlen=HISTORY_WINDOW)` provides O(1) operations.

```diff
+from collections import deque
+
 class Prices:
     ...
     def reset(self):
-        self.price_history = []
+        self.price_history = deque(maxlen=self.HISTORY_WINDOW)
     ...
     def get_next_price(self):
         ...
         self.price_history.append(new_price)
-        if len(self.price_history) > self.HISTORY_WINDOW:
-            self.price_history.pop(0)
+        # deque automatically removes oldest when maxlen exceeded
```
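The rolling-window behavior of the suggested `deque` is easy to verify in isolation. The window size and prices below are demo values; the real `HISTORY_WINDOW` is defined in the `Prices` class.

```python
from collections import deque

HISTORY_WINDOW = 3  # demo value; the real window size lives in Prices

history = deque(maxlen=HISTORY_WINDOW)
for price in [10.0, 12.5, 11.0, 13.0, 9.5]:
    history.append(price)  # O(1); oldest entry is evicted automatically

print(list(history))  # only the last HISTORY_WINDOW prices remain
```

Unlike `list.pop(0)`, which shifts every remaining element, the `deque` eviction is constant-time regardless of window size.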
137-141: Figure not closed on early return paths.

If `save_path` is provided, `plt.savefig` is called, but if an exception occurs between `plt.figure()` and `plt.close()`, the figure leaks. Consider using a try/finally or context manager pattern.

train.py (2)
119-126: Remove commented-out code.

The old parameter lines are commented out rather than removed. Dead code should be deleted to maintain clarity:

```diff
-        # external_prices=prices,
-        # external_durations=job_durations_file_path,
-        # external_jobs=jobs_file_path,
-        # external_hourly_jobs=hourly_jobs_file_path,
         external_prices=norm_path(prices),
```
112-113: `norm_path` could be simplified.

The helper works but is slightly verbose. Consider:

```diff
-    def norm_path(x):
-        return None if (x is None or str(x).strip() == "") else x
+    def norm_path(x):
+        return x if x else None
```

This works because empty strings are falsy in Python. However, if you need to preserve whitespace-only strings as `None`, keep the current implementation.

sanity_workloadgen.py (1)
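The truthiness shortcut for `norm_path` and its whitespace caveat, as sketched in the review, can be compared side by side (both helpers are illustrative stand-ins for the one in `train.py`):

```python
# Two variants of the path normalizer: the verbose original treats
# whitespace-only strings as missing; the truthiness shortcut does not.

def norm_path_strict(x):
    return None if (x is None or str(x).strip() == "") else x

def norm_path_short(x):
    return x if x else None

print(norm_path_short(""))        # empty string is falsy -> None
print(norm_path_short("a.csv"))   # non-empty -> passed through
print(norm_path_strict("   "))    # whitespace stripped -> None
print(norm_path_short("   "))     # non-empty, hence truthy -> kept
```

The two helpers agree on `None`, empty strings, and real paths; they only diverge on whitespace-only input, which is the caveat noted above.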
22-29: `test_constraints` uses default `flat_jobs_per_hour` which may differ from `max_new_jobs_per_hour`.

The config sets `max_new_jobs_per_hour=200` but doesn't set `flat_jobs_per_hour`. Per `workloadgen.py` line 47, `flat_jobs_per_hour` defaults to 200, which happens to match here. For clarity, explicitly set it:

```diff
-    cfg = WorkloadGenConfig(arrivals="flat", max_new_jobs_per_hour=200)
+    cfg = WorkloadGenConfig(arrivals="flat", flat_jobs_per_hour=200, max_new_jobs_per_hour=200)
```
57-61: Consider using `dataclasses.replace()` for cleaner config reconstruction.

The current approach reconstructs the config manually using a dict comprehension. Since `WorkloadGenConfig` is a dataclass, you can use `dataclasses.replace()` for a more idiomatic solution:

```diff
+from dataclasses import dataclass, replace
-from dataclasses import dataclass
 from typing import List, Optional
 import numpy as np

 class WorkloadGenerator:
     def __init__(self, cfg: WorkloadGenConfig):
         arrivals = cfg.arrivals.lower().strip()
         if arrivals not in ("flat", "poisson", "uniform"):
             raise ValueError(f"arrivals must be 'flat', 'uniform' or 'poisson', got: {cfg.arrivals}")
-        self.cfg = WorkloadGenConfig(arrivals=arrivals, **{k: v for k, v in cfg.__dict__.items() if k != "arrivals"})
+        self.cfg = replace(cfg, arrivals=arrivals)
```
73-74: Optional: Simplify attribute access.

The `getattr` calls with defaults are defensive but unnecessary since these are dataclass fields with defaults already defined in `WorkloadGenConfig`:

```diff
-        target = int(getattr(self.cfg, "flat_jobs_per_hour", self.cfg.max_new_jobs_per_hour))
-        jitter = int(getattr(self.cfg, "flat_jitter", 0))
+        target = int(self.cfg.flat_jobs_per_hour)
+        jitter = int(self.cfg.flat_jitter)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- WIP_prices.py (1 hunks)
- deter_wlgen.py (1 hunks)
- environment.py (19 hunks)
- sampler_hourly.py (2 hunks)
- sanity_workloadgen.py (1 hunks)
- train.py (4 hunks)
- workloadgen.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- deter_wlgen.py
🧰 Additional context used
📓 Path-based instructions (4)
sampler_hourly.py
📄 CodeRabbit inference engine (CLAUDE.md)
Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Files:
sampler_hourly.py
train.py
📄 CodeRabbit inference engine (CLAUDE.md)
train.py: Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Files:
train.py
{train.py,weights.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Files:
train.py
environment.py
📄 CodeRabbit inference engine (CLAUDE.md)
environment.py: Use Gymnasium-compatible environment design for RL simulations in the Environment module
Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Files:
environment.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
sampler_hourly.py, environment.py, workloadgen.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to sampler_jobs.py : Use pattern-based replay of historical job batches with full characteristics in the Jobs Sampler
Applied to files:
sampler_hourly.py, environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
train.py, environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.769Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.769Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
environment.py
🧬 Code graph analysis (2)
sanity_workloadgen.py (1)
workloadgen.py (3)
- WorkloadGenerator (56-110)
- WorkloadGenConfig (35-53)
- sample (99-110)
WIP_prices.py (1)
environment.py (1)
reset(254-284)
🪛 Ruff (0.14.8)
environment.py
661-661: Undefined name num_launched
(F821)
661-661: Local variable num_launched is assigned to but never used
(F841)
674-674: Undefined name num_dropped
(F821)
674-674: Local variable num_dropped is assigned to but never used
(F841)
workloadgen.py
60-60: Avoid specifying long messages outside the exception class
(TRY003)
89-89: Avoid specifying long messages outside the exception class
(TRY003)
99-99: Unused method argument: hour_idx
(ARG002)
🔇 Additional comments (9)
sampler_hourly.py (1)
117-143: LGTM! Sampling logic correctly uses external RNG.

The implementation correctly:

- Wraps `hour_of_day` with modulo 24
- Uses external `rng` for all random choices
- Handles `max_jobs` cap with proper bounds checking
- Samples job characteristics (duration, nodes, cores_per_node) independently
This aligns with the coding guidelines for the Hourly Sampler.
WIP_prices.py (1)
86-90: Forecast window semantics may be surprising.

`advance_and_get_predicted_prices` rolls the array left and places the current price (just fetched) at the end. This means `predicted_prices[0]` becomes the price from 1 hour ago, not the current price. Verify this aligns with how `environment.py` uses `predicted_prices[0]` as `current_price` on line 345.

environment.py (2)
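The roll-left-and-append behavior flagged in the forecast comment above, and the one-hour staleness it implies for index 0, can be checked directly with NumPy (array contents are illustrative, not real price data):

```python
import numpy as np

# predicted holds a sliding window of prices; rolling left and writing
# the freshly fetched price at the end mimics the described behavior of
# advance_and_get_predicted_prices.
predicted = np.array([10.0, 11.0, 12.0, 13.0], dtype=np.float32)
current_price = 14.0

predicted = np.roll(predicted, -1)   # shift everything one slot left
predicted[-1] = current_price        # newest price goes to the end

# Index 0 now holds 11.0 — the price from one step ago, not 14.0.
print(predicted.tolist())
```

If the consumer expects `predicted[0]` to be the current price, the window should instead drop the oldest entry and keep the newest at index 0, so the semantics are worth pinning down explicitly.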
88-101: LGTM! Clean helper methods for job slot management.
`_clear_job_slot` and `_validate_next_empty` provide good abstractions for slot management and debugging validation. The validation logic correctly checks that `next_empty` points to an empty slot and there are no holes before it.

753-755: LGTM! Drop penalty logic simplified.

The drop penalty calculation is now cleaner with `min(0, PENALTY_DROPPED_JOB * self.new_excess)`, ensuring the penalty is always non-positive. This addresses the previous review feedback.

train.py (1)
91-106: LGTM! Workload generator integration.

The workload generator is properly configured using environment constants (`MAX_JOB_DURATION`, `MIN_NODES_PER_JOB`, etc.) ensuring consistency. The conditional instantiation based on `args.workload_gen` is clean.

sanity_workloadgen.py (1)
1-43: LGTM! Solid sanity tests for workload generator.

The tests effectively verify:
- Determinism: Same seed produces identical job sequences
- Constraints: All generated jobs respect configured bounds
- Statistical properties: Poisson mean is within expected range
Good coverage for catching regressions in the workload generation logic.
workloadgen.py (3)
1-25: LGTM! Clear requirements and appropriate imports.

The module documentation clearly articulates the design goals (deterministic, controllable, composable, future-proof) and the imports are minimal and appropriate.
27-53: LGTM! Well-designed immutable configuration.

The frozen dataclasses ensure immutability, which is appropriate for configuration and job specification objects. Default values are sensible and the optional `hard_cap_jobs` provides a useful safety mechanism.
99-110: LGTM! Efficient vectorized sampling with clear future-proofing.

The vectorized sampling approach is efficient and the `hour_idx` parameter is intentionally unused (as documented) to enable future daily patterns without breaking the API. The `+1` adjustments on upper bounds correctly account for `rng.integers` being exclusive on the high end.
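The exclusive upper bound of `Generator.integers`, and why inclusive job-size ranges need the `+1`, can be shown directly (the bounds here are demo values in the spirit of the cluster config):

```python
import numpy as np

rng = np.random.default_rng(123)

MIN_NODES, MAX_NODES = 1, 16  # demo bounds for nodes per job

# rng.integers samples from [low, high) — high itself is never drawn,
# so an inclusive range [MIN_NODES, MAX_NODES] needs high = MAX_NODES + 1.
nodes = rng.integers(MIN_NODES, MAX_NODES + 1, size=10_000)

print(nodes.min() >= MIN_NODES, nodes.max() <= MAX_NODES)
```

Forgetting the `+1` would silently make 16-node jobs impossible, which is exactly the kind of off-by-one the sanity tests guard against.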
```python
        else:
            raise ValueError(f"Unknown arrivals mode: {mode}")
```
Unreachable error handling.
This `ValueError` can never be raised because `__init__` already validates that `mode` is one of the three allowed values. While defensive programming is valuable, this creates dead code that can confuse maintainers.

Consider removing the unreachable else clause:

```diff
     elif mode == "uniform":
         # This is the old "flat".
         k = int(rng.integers(0, self.cfg.max_new_jobs_per_hour + 1))
-
-    else:
-        raise ValueError(f"Unknown arrivals mode: {mode}")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-        else:
-            raise ValueError(f"Unknown arrivals mode: {mode}")
         elif mode == "uniform":
             # This is the old "flat".
             k = int(rng.integers(0, self.cfg.max_new_jobs_per_hour + 1))
```
🧰 Tools
🪛 Ruff (0.14.8)
89-89: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In workloadgen.py around lines 88-89, the final else branch that raises
ValueError is unreachable because mode is validated in __init__; remove that
else clause (delete the unreachable raise) so only the valid-mode branches
remain, or if you prefer an explicit sanity check keep a single assert False /
AssertionError with a clear message instead of the current redundant ValueError.
…ces, it would iterate too much. Agent and env became out of synch. "get_predicted_prices" now "Advance_and_get..." Reset in env, now also resets prices state. CodeRabbit Fixes: Corrected seeding. Removed double initialization of jobs sampler. Corrected num_processed jobs. Renamed to prices_deterministic. Cleaned random in sampler. Added dataclass replace in workloadgen. Supposed to provide O(1) instead of O(n). Extra Fixes: Renamed unclear script names. Removed unused functions in environment. And minor adjustments.
Force-pushed from 0ce559a to ed14541
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
environment.py (1)
353-362: Potential reference error: `self.jobs_sampler` may not exist.

Line 355 references `self.jobs_sampler.sample_one_hourly()`, but the code in `__init__` (lines 161-169) uses the module-level `jobs_sampler` (imported from `sampler_jobs`), not an instance variable `self.jobs_sampler`.

Either:

- Change line 355 to use the module-level `jobs_sampler.sample_one_hourly()`, or
- Initialize `self.jobs_sampler` as an instance variable in `__init__`

Apply this diff if option 1 is intended:

```diff
-        jobs = self.jobs_sampler.sample_one_hourly(wrap=True)["hourly_jobs"]
+        jobs = jobs_sampler.sample_one_hourly(wrap=True)["hourly_jobs"]
```

Or apply this diff in `__init__` if option 2 is intended:

```diff
 if self.external_jobs:
     print(f"Loading jobs from {self.external_jobs}")
-    jobs_sampler.parse_jobs(self.external_jobs, 60)
+    self.jobs_sampler = jobs_sampler
+    self.jobs_sampler.parse_jobs(self.external_jobs, 60)
```
♻️ Duplicate comments (1)
workloadgen.py (1)
88-89: Unreachable error handling.

This has already been flagged in a previous review.
🧹 Nitpick comments (8)
workloadgen.py (1)
72-79: Unnecessary `getattr` calls for always-present fields.

`flat_jobs_per_hour` and `flat_jitter` are defined with defaults in `WorkloadGenConfig`, so `getattr` is unnecessary overhead:

```diff
 if mode == "flat":
-    target = int(getattr(self.cfg, "flat_jobs_per_hour", self.cfg.max_new_jobs_per_hour))
-    jitter = int(getattr(self.cfg, "flat_jitter", 0))
+    target = int(self.cfg.flat_jobs_per_hour)
+    jitter = int(self.cfg.flat_jitter)
     if jitter <= 0:
         k = target
     else:
         k = int(rng.integers(target - jitter, target + jitter + 1))
```

prices_deterministic.py (1)
78-80: Remove commented-out code.

Dead code with a "remove during next PR" note should ideally be cleaned up now rather than carried forward:

```diff
         self.price_history.append(new_price)
-        #if len(self.price_history) > self.HISTORY_WINDOW:
-        #    self.price_history.pop(0)
-        # deque automatically removes oldest when maxlen exceeded | Keep the old line for now. Remove during next PR.
         return new_price
```

test_sanity_workloadgen.py (2)
31-37: Consider tightening the tolerance band.

With `poisson_lambda=50.0` over 2000 samples, the standard error of the mean is approximately `sqrt(50/2000) ≈ 0.16`, so the mean should be within ~0.5 of 50 with high probability. The current band of [45, 55] is very loose and may miss subtle bugs.

A tighter band like [48, 52] would still pass reliably while catching more issues:

```diff
-    assert 45.0 < mean < 55.0, mean  # loose band, just to catch obvious bugs
+    assert 48.0 < mean < 52.0, mean  # tighter band, still reliable for 2000 samples
```
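The standard-error argument above can be checked empirically with a seeded generator, using the same λ and sample count as the test:

```python
import math
import numpy as np

LAM, N = 50.0, 2000

rng = np.random.default_rng(0)
samples = rng.poisson(LAM, size=N)
mean = samples.mean()

# Standard error of the mean for Poisson(LAM) over N i.i.d. samples:
# sqrt(Var/N) = sqrt(LAM/N) ~ 0.158, so [48, 52] is a band of well over
# 12 standard errors on each side.
se = math.sqrt(LAM / N)

print(round(se, 3), 48.0 < mean < 52.0)
```

At that width, a failure of the tightened assertion would indicate a genuine distribution bug rather than sampling noise.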
1-43: Consider consolidating with `test_determinism_workloadgen.py`.

This file and `test_determinism_workloadgen.py` have significant overlap in functionality (determinism checks, bounds validation). Consider consolidating into a single test module to reduce maintenance burden.

train.py (1)
124-127: Potential issue: `prices` is a list, not a path.

`norm_path` is designed for path strings, but `prices` (line 124) is already a list of price values (or `None`). Passing a list through `norm_path` will always return the list unchanged, but semantically this is confusing:

```diff
-        external_prices=norm_path(prices),
+        external_prices=prices,
         external_durations=norm_path(job_durations_file_path),
         external_jobs=norm_path(jobs_file_path),
         external_hourly_jobs=norm_path(hourly_jobs_file_path),
```

test_determinism_workloadgen.py (1)
61-69: Consider adding a "uniform" config to test all modes.

Currently only "flat" and "poisson" configs are tested, but `check_distribution_smoke` has (or should have) validation for "uniform" mode as well:

```diff
 def main():
     # Example configs to test
     flat_cfg = WorkloadGenConfig(arrivals="flat", max_new_jobs_per_hour=1000)
     pois_cfg = WorkloadGenConfig(arrivals="poisson", poisson_lambda=200.0, max_new_jobs_per_hour=1000)
+    unif_cfg = WorkloadGenConfig(arrivals="uniform", max_new_jobs_per_hour=1000)

-    for cfg in (flat_cfg, pois_cfg):
+    for cfg in (flat_cfg, pois_cfg, unif_cfg):
         check_determinism(cfg)
         check_bounds(cfg)
         check_distribution_smoke(cfg)
```

environment.py (2)
619-635: Consider removing excessive `int()` casts.

The code contains numerous `int()` casts (lines 627-630, 633-635) when assigning jobs. If `job_duration`, `job_cores_per_node`, and other job attributes are already integers from the job queue, these casts are redundant and add noise.

Verify the data types:

- If `job_queue_2d` is `dtype=np.int32`, the values are already integers and casts are unnecessary.
- If casts are needed due to upstream type issues, consider fixing the root cause instead.

```shell
#!/bin/bash
# Check the dtype of job_queue initialization
rg -n "job_queue.*dtype" environment.py
```
752-752: Consider including `drop_penalty` in debug output.

The debug print at line 752 shows the main reward components but omits `drop_penalty`. For consistency and debugging purposes, consider adding it:

```diff
-        self.env_print(f" > $$$TOTAL: {reward:.4f} = {efficiency_reward_weighted:.4f} + {price_reward_weighted:.4f} + {idle_penalty_weighted:.4f} + {job_age_penalty_weighted:.4f}")
+        self.env_print(f" > $$$TOTAL: {reward:.4f} = {efficiency_reward_weighted:.4f} + {price_reward_weighted:.4f} + {idle_penalty_weighted:.4f} + {job_age_penalty_weighted:.4f} + {drop_penalty:.4f}")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- deter_wlgen.py (1 hunks)
- environment.py (19 hunks)
- prices_deterministic.py (1 hunks)
- sampler_hourly.py (2 hunks)
- sanity_workloadgen.py (1 hunks)
- test_determinism_workloadgen.py (1 hunks)
- test_sanity_workloadgen.py (1 hunks)
- train.py (4 hunks)
- workloadgen.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- sampler_hourly.py
- deter_wlgen.py
- sanity_workloadgen.py
🧰 Additional context used
📓 Path-based instructions (3)
train.py
📄 CodeRabbit inference engine (CLAUDE.md)
train.py: Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Files:
train.py
{train.py,weights.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Files:
train.py
environment.py
📄 CodeRabbit inference engine (CLAUDE.md)
environment.py: Use Gymnasium-compatible environment design for RL simulations in the Environment module
Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Files:
environment.py
🧠 Learnings (7)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
train.py
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
workloadgen.py
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Use Gymnasium-compatible environment design for RL simulations in the Environment module
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_jobs.py : Use pattern-based replay of historical job batches with full characteristics in the Jobs Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
environment.py
🧬 Code graph analysis (5)
- train.py (2)
  - workloadgen.py (2): WorkloadGenerator (56-110), WorkloadGenConfig (35-53)
  - environment.py (2): ComputeClusterEnv (67-899), render (72-73)
- test_determinism_workloadgen.py (2)
  - workloadgen.py (3): WorkloadGenerator (56-110), WorkloadGenConfig (35-53), sample (99-110)
  - sampler_hourly.py (1): sample (102-142)
- workloadgen.py (1)
  - sampler_hourly.py (1): sample (102-142)
- prices_deterministic.py (1)
  - environment.py (1): reset (247-280)
- test_sanity_workloadgen.py (2)
  - workloadgen.py (3): WorkloadGenerator (56-110), WorkloadGenConfig (35-53), sample (99-110)
  - sampler_hourly.py (1): sample (102-142)
🪛 Ruff (0.14.8)
train.py
117-117: Redefinition of unused norm_path from line 93
(F811)
workloadgen.py
60-60: Avoid specifying long messages outside the exception class
(TRY003)
89-89: Avoid specifying long messages outside the exception class
(TRY003)
99-99: Unused method argument: hour_idx
(ARG002)
prices_deterministic.py
23-23: Avoid specifying long messages outside the exception class
(TRY003)
54-54: Undefined name deque
(F821)
🔇 Additional comments (22)
workloadgen.py (3)
27-31: LGTM! Immutable JobSpec dataclass is well-designed for deterministic workload representation.
34-53: LGTM! Configuration is well-structured with sensible defaults. The frozen dataclass ensures immutability.
99-110: LGTM! The sample method is well-implemented. The unused hour_idx parameter is documented as a placeholder for future daily-pattern support, which is a reasonable design choice for API stability.
prices_deterministic.py (3)
20-23: LGTM! The empty array guard addresses the concern from the previous review, ensuring np.min / np.percentile are never called on empty arrays.
89-93: advance_and_get_predicted_prices returns a stale "next" price. After get_next_price() advances the index and returns the current step's price, you roll the prediction window and insert that same price at the end. However, the last slot should contain the newly predicted price for price_index + PREDICTION_WINDOW - 1, not the price that was just consumed. This may cause the prediction window to lag by one step. Please verify whether the predicted-prices window is intended to represent future prices or a rolling history that includes the current price.
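The off-by-one described above can be shown in isolation. This is a minimal sketch with hypothetical roll_window_* helpers, not the actual Prices implementation: re-appending the consumed price leaves a stale value in the horizon, while appending the newly predicted price keeps the window strictly in the future.

```python
from collections import deque

def roll_window_stale(window, consumed_price, next_future_price):
    # Buggy variant: re-appends the price that was just consumed,
    # so the last slot holds a stale value instead of the newest prediction.
    w = deque(window)
    w.popleft()
    w.append(consumed_price)
    return list(w)

def roll_window_fresh(window, consumed_price, next_future_price):
    # Fixed variant: appends the price entering the horizon
    # (price_index + PREDICTION_WINDOW - 1 in the reviewed code).
    w = deque(window)
    w.popleft()
    w.append(next_future_price)
    return list(w)

window = [11, 12, 13]  # predicted prices for the next three steps
stale = roll_window_stale(window, consumed_price=11, next_future_price=14)  # [12, 13, 11]
fresh = roll_window_fresh(window, consumed_price=11, next_future_price=14)  # [12, 13, 14]
```

The stale variant reproduces the lag the comment describes: the consumed price reappears at the end of the horizon.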
119-144: LGTM! The plotting utility is well-implemented with appropriate percentile markers and flexible save/display options.
test_sanity_workloadgen.py (2)
5-8: LGTM! Good helper function for validating job constraints.
10-20: LGTM! Determinism test correctly validates that identical seeds produce identical outputs.
train.py (3)
41-44: LGTM! Weight defaults align with coding guidelines: efficiency-weight (0.7), price-weight (0.2), idle-weight (0.1), job-age-weight (0.0). Based on coding guidelines.
49-52: LGTM! CLI arguments for the workload generator are well-designed with sensible defaults.
96-111: LGTM! Workload generator configuration correctly uses environment constants for resource bounds, ensuring consistency between the generator and environment constraints.
test_determinism_workloadgen.py (2)
5-6: LGTM! Clean helper for converting jobs to comparable tuples.
8-19: LGTM! Determinism check is well-implemented with clear error messaging.
environment.py (9)
119-119: LGTM! Workload generator and RNG scaffolding integrated correctly. The addition of the workload_gen parameter and the initialization of deterministic RNG state (self.np_random, self._seed) align with the PR objectives for deterministic workload generation. The RNG will be properly seeded in the reset() method.
Also applies to: 152-156
248-252: LGTM! RNG seeding properly implemented. The addition of super().reset(seed=seed) and self.np_random, self._seed = seeding.np_random(seed) correctly implements deterministic RNG seeding. This resolves the previous critical issue where self.np_random was None and would cause an AttributeError at runtime.
Based on learnings, the environment should use a Gymnasium-compatible design for RL simulations.
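The reproducibility property this gives can be sketched with plain NumPy. gymnasium's seeding.np_random is essentially a thin wrapper that returns a seeded np.random.Generator plus the seed used; the make_rng name below is illustrative, not the real API.

```python
import numpy as np

def make_rng(seed):
    # Stand-in for gymnasium.utils.seeding.np_random(seed),
    # which hands back a numpy Generator seeded the same way.
    return np.random.default_rng(seed)

# Same seed => identical sample streams, so episodes replay exactly.
a = make_rng(42).integers(0, 100, size=5)
b = make_rng(42).integers(0, 100, size=5)
```

Two environments reset with the same seed will therefore draw identical job counts, durations, and node requests.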
363-376: Good defensive programming for queue capacity. The addition of queue capacity checks (queue_free, max_to_generate) before calling hourly_sampler.sample() prevents generating more jobs than can be enqueued. This reduces wasted work and improves efficiency.
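The gating reduces to simple arithmetic over the documented environment constraints (queue of 1000 jobs, at most 1500 new jobs per hour); the function name below is illustrative:

```python
MAX_QUEUE_SIZE = 1000         # max jobs in queue (environment constraint)
MAX_NEW_JOBS_PER_HOUR = 1500  # max new jobs per hour (environment constraint)

def max_jobs_to_generate(queue_used):
    # Never ask the sampler for more jobs than the queue can accept.
    queue_free = max(0, MAX_QUEUE_SIZE - queue_used)
    return min(queue_free, MAX_NEW_JOBS_PER_HOUR)
```

With a nearly full queue the sampler is asked for only the remaining slots, and a full queue skips sampling entirely.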
656-676: LGTM! Drop tracking logic is correct. The drop logic correctly:
- Increments job age when resources are unavailable
- Removes jobs exceeding MAX_JOB_AGE
- Updates drop counters for both agent and baseline
- Updates next_empty_slot when clearing jobs
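The aging/drop pass can be sketched as a pure function. The queue-of-dicts representation and the MAX_JOB_AGE value here are illustrative; the environment uses a 2D array and its own constant.

```python
MAX_JOB_AGE = 3  # illustrative cap, not the environment's actual value

def age_and_drop(queue):
    """Age each waiting job by one hour and drop those past the cap.
    Returns (kept_jobs, num_dropped)."""
    kept, dropped = [], 0
    for job in queue:
        job = dict(job, age=job["age"] + 1)  # job waited another hour
        if job["age"] > MAX_JOB_AGE:
            dropped += 1                     # counted toward jobs_dropped
        else:
            kept.append(job)
    return kept, dropped
```

Running this over a queue with ages 0, 2, and 3 keeps the first two (now 1 and 3) and drops the third (now 4, past the cap).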
458-460: LGTM! Drop penalty logic is clear and correct. The drop penalty logic has been simplified:
- excess tracks how many more jobs the agent dropped vs. the baseline in the episode
- new_excess tracks the incremental increase in excess drops this step
- drop_penalty = min(0, PENALTY_DROPPED_JOB * self.new_excess) ensures the penalty is always non-positive
- The penalty is correctly added to the total reward
This addresses the previous review comment about confusing conditional logic.
Also applies to: 740-741, 749-749
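Numerically, the penalty behaves as follows. This is a standalone sketch of the excess/new_excess bookkeeping; prev_excess stands in for the value the environment carries between steps.

```python
PENALTY_DROPPED_JOB = -5.0  # from the environment constants

def step_drop_penalty(agent_drops, baseline_drops, prev_excess):
    # excess: episode-level surplus of agent drops over baseline drops
    excess = max(0, agent_drops - baseline_drops)
    # new_excess: only the increase since the previous step is penalized
    new_excess = max(0, excess - prev_excess)
    # min(0, ...) keeps the penalty non-positive even if constants change
    return min(0.0, PENALTY_DROPPED_JOB * new_excess), excess
```

An agent that has dropped 4 jobs against a baseline of 1, having already been penalized for 2 excess drops, pays only for the single new one (-5.0); dropping fewer jobs than the baseline yields no penalty.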
863-864: LGTM! Drop metrics properly integrated into episode recording. The episode recording correctly:
- Calculates drop rates with zero-division protection
- Includes comprehensive drop metrics (jobs_dropped, drop_rate, jobs_rejected_queue_full) for both agent and baseline
- Maintains consistency with metrics tracked throughout the episode
Also applies to: 891-897
14-14: Verify the necessity of the DurationSampler import. The import of DurationSampler from sampler_jobs is not visibly used in the current code. If it is not required, it should be removed to avoid clutter. Run the following script to verify whether DurationSampler is used anywhere in the file:
```shell
#!/bin/bash
# Check if DurationSampler is used in environment.py
rg -n '\bDurationSampler\b' environment.py | grep -v '^14:'
```
385-408: Workload generator integration and RNG usage verified. The WorkloadGenerator API contract is correctly implemented and used:
- The WorkloadGenerator class (workloadgen.py) has a sample(hour_idx: int, rng: np.random.Generator) method that returns List[JobSpec]
- The JobSpec dataclass (workloadgen.py) contains exactly the three attributes accessed in the code: duration, nodes, and cores_per_node
- The code correctly calls workload_gen.sample(self.current_hour - 1, self.np_random) and accesses job attributes safely
- RNG usage with self.np_random.integers() is consistent throughout
252-252: Prices API is correctly implemented. The prices_deterministic.Prices class provides both the reset() method (line 51) and the predicted_prices attribute (initialized at line 18, populated in reset() at lines 57-63). The calls at environment.py lines 252-260 (self.prices.reset() and self.prices.predicted_prices.copy()) are valid and use the correct API.
ed14541 to 13a3b28
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
environment.py (1)
359-366: Inconsistent sampler usage: self.jobs_sampler vs module-level jobs_sampler. Line 360 uses self.jobs_sampler.sample_one_hourly(), but __init__ (lines 161-169) initializes the module-level jobs_sampler (imported from sampler_jobs), not self.jobs_sampler. If the redundant initialization in reset_state (lines 329-332) is removed, this line will fail with AttributeError: 'ComputeClusterEnv' object has no attribute 'jobs_sampler'. Either consistently use the module-level jobs_sampler or properly initialize self.jobs_sampler once in __init__.
```diff
 if self.external_jobs:
-    jobs = self.jobs_sampler.sample_one_hourly(wrap=True)["hourly_jobs"]
+    jobs = jobs_sampler.sample_one_hourly(wrap=True)["hourly_jobs"]
```
♻️ Duplicate comments (1)
environment.py (1)
328-332: Redundant jobs_sampler re-initialization in reset_state. This re-initializes self.jobs_sampler and re-parses the jobs file on every episode reset. The sampler is already initialized in __init__ (lines 161-169) and should be reused; the re-initialization causes unnecessary file I/O and processing overhead each episode. A previous review marked this as addressed, but the code still appears to contain the redundant initialization.
```diff
-# Define jobs_sampler as a self. object, to ensure determinism
-if self.external_jobs and not self.workload_gen:
-    self.jobs_sampler = DurationSampler()
-    self.jobs_sampler.parse_jobs(self.external_jobs, 60)
-    self.jobs_sampler.precalculate_hourly_jobs(CORES_PER_NODE, MAX_NODES_PER_JOB)
```
🧹 Nitpick comments (4)
test_sanity_workloadgen.py (2)
31-37: Consider widening the assertion bounds for statistical stability. The assertion checks 48.0 < mean < 52.0, but the comment mentions "45.0–55.0". For a Poisson(50) distribution over 2000 samples, the standard error is approximately sqrt(50/2000) ≈ 0.16, so the 95% CI is roughly [49.7, 50.3]. The current bounds (±4%) are reasonable but could still fail occasionally due to random variance. Consider using a slightly wider band or documenting the expected flakiness rate.
```diff
-assert 48.0 < mean < 52.0, mean  # tighter band, still reliable for 2000 samples
+assert 45.0 < mean < 55.0, f"Poisson mean {mean} outside expected range"  # ~10% band, very unlikely to flake
```
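The standard-error figure quoted above can be checked directly (a small sketch; 1.96 is the usual 95% normal quantile, and the function name is illustrative):

```python
import math

def poisson_mean_ci(lam, n, z=1.96):
    # The sample mean of n i.i.d. Poisson(lam) draws has variance lam / n.
    se = math.sqrt(lam / n)
    return lam - z * se, lam + z * se

lo, hi = poisson_mean_ci(50, 2000)  # roughly (49.69, 50.31)
```

So the ±2 band is about 6 standard errors wide, while the suggested ±5 band is about 16, making flakes effectively impossible.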
39-43: Consider using pytest or unittest for consistent test discovery. Running tests via if __name__ == "__main__" works but won't be discovered by standard test runners. If other tests in the project use pytest or unittest, consider aligning this module.
deter_wlgen.py (1)
61-72: Consider consolidating with test_sanity_workloadgen.py. This module (deter_wlgen.py) and test_sanity_workloadgen.py have overlapping functionality (determinism, bounds, distribution checks). Consider consolidating them into a single test module to avoid maintenance burden.
prices_deterministic.py (1)
118-143: Consider closing figures explicitly after save. The plot_price_histogram method calls plt.close() at the end, which is good. However, if an exception occurs between plt.figure() and plt.close(), the figure won't be closed. Consider using a try/finally or context manager pattern.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- deter_wlgen.py (1 hunks)
- environment.py (19 hunks)
- prices_deterministic.py (1 hunks)
- sanity_workloadgen.py (1 hunks)
- test_determinism_workloadgen.py (1 hunks)
- test_sanity_workloadgen.py (1 hunks)
- train.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- sanity_workloadgen.py
- test_determinism_workloadgen.py
🧰 Additional context used
📓 Path-based instructions (3)
train.py
📄 CodeRabbit inference engine (CLAUDE.md)
train.py: Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Files:
train.py
{train.py,weights.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Files:
train.py
environment.py
📄 CodeRabbit inference engine (CLAUDE.md)
environment.py: Use Gymnasium-compatible environment design for RL simulations in the Environment module
Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Files:
environment.py
🧠 Learnings (6)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
train.py
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_jobs.py : Use pattern-based replay of historical job batches with full characteristics in the Jobs Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
environment.py
🧬 Code graph analysis (4)
- train.py (2)
  - workloadgen.py (2): WorkloadGenerator (56-110), WorkloadGenConfig (35-53)
  - environment.py (2): ComputeClusterEnv (67-904), render (72-73)
- prices_deterministic.py (1)
  - environment.py (1): reset (247-280)
- deter_wlgen.py (1)
  - workloadgen.py (2): WorkloadGenerator (56-110), WorkloadGenConfig (35-53)
- test_sanity_workloadgen.py (2)
  - workloadgen.py (3): WorkloadGenerator (56-110), WorkloadGenConfig (35-53), sample (99-110)
  - sampler_hourly.py (1): sample (102-142)
🪛 Ruff (0.14.8)
prices_deterministic.py
24-24: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (25)
train.py (4)
10-18: LGTM - Clean import organization for new functionality. The imports properly bring in WorkloadGenerator and WorkloadGenConfig from the new workloadgen module, along with the environment constants needed for config initialization.
49-52: LGTM - CLI arguments align with WorkloadGenConfig. The new arguments (--workload-gen, --wg-poisson-lambda, --wg-max-jobs-hour) properly expose the key workload generator configuration options with sensible defaults matching the WorkloadGenConfig defaults.
90-111: LGTM - Path normalization and workload generator setup. The norm_path helper correctly normalizes empty strings to None, which addresses the inconsistent truthy/falsy evaluation mentioned in the comment. The WorkloadGenConfig is properly constructed using environment constants for resource bounds.
118-134: LGTM - Environment initialization with normalized paths and workload generator. The norm_path function is correctly applied to the external data paths, and workload_gen is properly passed to the environment. This completes the integration wiring.
test_sanity_workloadgen.py (3)
1-9: LGTM - Well-structured test helper. The assert_job_valid helper correctly validates job attributes against config bounds, promoting code reuse across tests.
10-20: LGTM - Determinism test correctly verifies reproducibility. The test properly seeds two independent RNGs with the same seed and verifies that identical seeds produce identical job sequences across multiple hours.
22-29: LGTM - Constraint validation test is thorough. The test correctly validates that all generated jobs respect the configured bounds over 200 hours of sampling.
deter_wlgen.py (3)
1-7: LGTM - Helper function for tuple conversion. The jobs_to_tuples helper correctly extracts the comparable attributes for determinism checks.
8-19: LGTM - Determinism check is robust. The function correctly creates two independent generators and RNGs with identical seeds, then verifies output equality across hours.
21-36: LGTM - Bounds checking is thorough. The function validates both the job count caps and individual job attribute bounds.
prices_deterministic.py (5)
1-4: LGTM - Imports correctly include deque. The deque import has been added as flagged in previous reviews.
12-41: LGTM - Robust initialization with empty array guard. The empty array check at lines 23-24 addresses the previous review concern. The price shift logic correctly ensures all prices are >= 1.
44-47: LGTM - Deterministic synthetic price generation. The sinusoidal price model provides a deterministic daily pattern with a guaranteed lower bound of 1.0.
52-67: LGTM - Reset correctly initializes deterministic state. The reset method properly initializes price_index to PREDICTION_WINDOW and populates predicted_prices from either external data or synthetic generation.
69-92: LGTM - Stateful stepping methods are well-designed. get_next_price correctly advances the index and maintains history, while advance_and_get_predicted_prices properly rolls the prediction window forward.
environment.py (10)
8-18: LGTM - Clean imports for deterministic components. The imports correctly bring in the new Prices class from prices_deterministic, DurationSampler, and Gymnasium's seeding utility for deterministic RNG.
45-47: LGTM - Explicit drop penalty constant. The PENALTY_DROPPED_JOB = -5.0 constant is clearly defined and appropriately negative for a penalty.
86-94: LGTM - Useful debugging validator. The _validate_next_empty helper provides useful assertions for debugging queue integrity. The guard at line 89 correctly handles the case where next_empty equals the queue length.
118-155: LGTM - Constructor correctly extended for workload_gen and RNG. The workload_gen=None parameter is properly added to the signature and stored. The RNG placeholders (np_random, _seed) are correctly initialized to None and will be seeded in reset().
247-252: LGTM - RNG correctly seeded in reset. The seeding.np_random(seed) call properly initializes the deterministic RNG, addressing the previous review concern about an AttributeError when accessing self.np_random.
368-413: LGTM - Deterministic job generation with workload_gen and non-global RNG. The code correctly prioritizes workload_gen when available, falls back to the legacy randomizer with self.np_random instead of the global np.random, and properly handles the hourly sampler path with queue space checks.
614-682: LGTM - Drop tracking integrated into job assignment. The assign_jobs_to_available_nodes method now correctly tracks dropped jobs when they exceed MAX_JOB_AGE, updates the appropriate counters for both agent and baseline, and properly manages next_empty_slot when jobs are dropped.
744-754: LGTM - Drop penalty correctly integrated into reward. The drop penalty is computed as min(0, PENALTY_DROPPED_JOB * self.new_excess), ensuring it is always non-positive. This addresses the previous review concern about dead code in the conditional check.
509-510: Clarify terminated vs truncated semantics. The code sets truncated = True and terminated = False when the episode ends due to reaching EPISODE_HOURS. The comment says "Changed distinction, so we can identify if ended due to time-limit", which matches Gymnasium semantics (truncated = time limit, terminated = natural end). This is good practice.
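Gymnasium's convention can be captured in a two-line helper (EPISODE_HOURS = 336 per the documented 2-week episodes; the function name is illustrative):

```python
EPISODE_HOURS = 336  # 2-week episodes (336 hours)

def episode_flags(current_hour):
    # Gymnasium semantics: hitting the time limit is a truncation,
    # not a natural termination of the underlying MDP.
    truncated = current_hour >= EPISODE_HOURS
    terminated = False  # this environment has no natural terminal state
    return terminated, truncated
```

Keeping the two flags separate lets value-based algorithms bootstrap correctly on truncation instead of treating the cutoff as a terminal state.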
867-903: LGTM - Episode completion recording extended with drop metrics. The record_episode_completion method now correctly includes drop-rate calculations and all drop-related metrics for both agent and baseline.
13a3b28 to c977c61
Actionable comments posted: 0
🧹 Nitpick comments (1)
environment.py (1)
385-409: Consider removing commented legacy code. The workload generator integration looks correct with proper deterministic RNG usage. However, the commented-out legacy code (lines 396, 401, 405-406) could be removed to reduce clutter since the new implementation is established.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- environment.py (19 hunks)
- prices_deterministic.py (1 hunks)
- test_determinism_workloadgen.py (1 hunks)
- test_sanity_workloadgen.py (1 hunks)
- train.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- test_sanity_workloadgen.py
🧰 Additional context used
📓 Path-based instructions (3)
train.py
📄 CodeRabbit inference engine (CLAUDE.md)
train.py: Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Files:
train.py
{train.py,weights.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Files:
train.py
environment.py
📄 CodeRabbit inference engine (CLAUDE.md)
environment.py: Use Gymnasium-compatible environment design for RL simulations in the Environment module
Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Files:
environment.py
🧠 Learnings (6)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
train.py
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_jobs.py : Use pattern-based replay of historical job batches with full characteristics in the Jobs Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
environment.py
🧬 Code graph analysis (4)
- test_determinism_workloadgen.py (2)
  - workloadgen.py (2): WorkloadGenConfig (35-53), sample (99-110)
  - sampler_hourly.py (1): sample (102-142)
- train.py (2)
  - workloadgen.py (2): WorkloadGenerator (56-110), WorkloadGenConfig (35-53)
  - environment.py (1): render (72-73)
- prices_deterministic.py (1)
  - environment.py (1): reset (248-281)
- environment.py (4)
  - prices_deterministic.py (3): Prices (5-143), reset (52-67), advance_and_get_predicted_prices (88-92)
  - prices.py (1): Prices (4-121)
  - sampler_jobs.py (5): DurationSampler (7-444), parse_jobs (18-115), precalculate_hourly_jobs (364-383), sample_one_hourly (416-444), sample (123-168)
  - workloadgen.py (1): sample (99-110)
🪛 Ruff (0.14.8)
prices_deterministic.py
24-24: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (31)
test_determinism_workloadgen.py (5)
1-6: LGTM! Clean imports and a concise helper function for converting job objects to comparable tuples.
8-19: LGTM! The determinism test correctly verifies that two generators with identical seeds produce identical outputs.
21-36: LGTM! Comprehensive bounds checking against all config limits, including the optional hard_cap_jobs.
38-62: LGTM! Distribution smoke tests correctly validate mean expectations for all three arrival modes with appropriate tolerances.
64-76: LGTM! Main function provides good coverage by testing all three arrival modes.
prices_deterministic.py (6)
1-11: LGTM! Clean imports and well-defined class constants for the price model.
12-41: LGTM! Proper initialization with empty array guard and price normalization to ensure positive values.
43-50: LGTM! Pure deterministic price functions with a proper floor at 1.0.
52-67: LGTM! Reset correctly initializes all state for deterministic episode starts with proper array copying.
69-92: LGTM! Stateful price stepping methods correctly advance state and return copies to prevent external mutation.
94-143: LGTM! Non-mutating utility methods for statistics and visualization with proper resource cleanup via plt.close().
train.py (4)
10-18: LGTM! Clean imports for workload generator and environment constants needed for config alignment.
49-52: LGTM! CLI arguments for the workload generator with sensible defaults matching WorkloadGenConfig.
90-111: LGTM! Proper path normalization and workload generator initialization with config aligned to environment constants.
113-134: LGTM! Environment initialization correctly applies path normalization and passes the workload generator.
environment.py (16)
8-18: LGTM! Clean switch to the deterministic Prices and proper import of gymnasium seeding utilities.
45-47: LGTM! Explicit penalty constant for dropped jobs with a clear negative value.
86-94: LGTM! Useful debug validator for the next_empty_slot invariant.
117-170: LGTM! Proper initialization of the workload generator with conditional sampler setup and deferred RNG seeding following gymnasium patterns.
184-191: LGTM! Symmetric job drop tracking for both agent and baseline paths.
248-253: LGTM! Correct gymnasium seeding pattern ensuring deterministic episode resets.
312-328: LGTM! Comprehensive reset of all drop-related tracking counters.
337-338: LGTM! Correct usage of the renamed deterministic prices API.
363-383: LGTM! Smart queue-space gating prevents generating jobs that can't be enqueued, with proper deterministic RNG usage.
413-417: LGTM! Proper tracking of jobs rejected due to queue capacity limits.
459-461: LGTM! Correct incremental calculation of excess drops compared to baseline for fair penalty application.
505-506: LGTM! Correct gymnasium semantics: an episode timeout is a truncation, not a termination.
610-678: LGTM! Robust job assignment with proper aging and drop tracking, including explicit type conversions for dictionary values.
685-689: LGTM! Symmetric rejected-job tracking for the baseline path.
740-750: LGTM! Clean drop penalty calculation using the min(0, ...) pattern for clarity.
863-899: LGTM! Comprehensive drop metrics added to episode recording with proper zero-division guards.
c977c61 to e53b83a
Actionable comments posted: 0
🧹 Nitpick comments (4)
prices_deterministic.py (1)
23-24: Consider defining a custom exception class for reusability. The inline error message triggers a style warning (TRY003). For better maintainability, consider defining a custom exception class if this validation pattern is reused elsewhere, or acknowledge this as an acceptable style deviation for a one-off validation.
environment.py (3)
365-376: Consider applying the queue capacity check to the workload generator path. The hourly sampler path pre-checks available queue space to avoid generating jobs that won't fit. The workload generator path (lines 385-393) doesn't have this optimization and may generate jobs that get rejected. While add_new_jobs handles overflow correctly, applying the same pre-check would improve consistency and efficiency. Apply this pattern to the workload generator path:
```diff
 else:
+    # Check available queue space
+    queue_used = int(np.count_nonzero(job_queue_2d[:, 0] > 0))
+    queue_free = max(0, MAX_QUEUE_SIZE - queue_used)
+    max_to_generate = min(queue_free, MAX_NEW_JOBS_PER_HOUR)
+
     #----------------------Use Workload Generator for Randomizer------------------------------------------------------------------------------------
     if self.workload_gen is not None:
-        jobs = self.workload_gen.sample(self.current_hour - 1, self.np_random)
+        if max_to_generate == 0:
+            jobs = []
+        else:
+            # Note: workload generator doesn't support max_jobs parameter yet
+            # Consider adding this parameter to WorkloadGenConfig for consistency
+            jobs = self.workload_gen.sample(self.current_hour - 1, self.np_random)
+            jobs = jobs[:max_to_generate]  # Truncate if needed
         new_jobs_count = len(jobs)
```
610-678: Consider returning drop count for consistency. The function tracks num_dropped (line 612, incremented at line 667) but doesn't return it. Drop tracking happens through side effects (self.jobs_dropped, self.dropped_this_episode). While this works, returning (num_processed_jobs, num_dropped, next_empty_slot) would make the function's behavior more explicit and testable. If you prefer explicit returns:
- def assign_jobs_to_available_nodes(self, job_queue_2d, nodes, cores_available, running_jobs, next_empty_slot, is_baseline=False): - num_processed_jobs = 0 - num_dropped = 0 + def assign_jobs_to_available_nodes(self, job_queue_2d, nodes, cores_available, running_jobs, next_empty_slot, is_baseline=False): + num_processed_jobs = 0 + num_dropped = 0 # ... existing logic ... - return num_processed_jobs, next_empty_slot + return num_processed_jobs, num_dropped, next_empty_slotThen update callers at lines 432 and 689 to unpack three values.
620-677: Excessive int() casts in job assignment loop. Lines 628-631, 634-635, 649, 652, and 658 contain int() casts for values that are already numpy integers. These casts are unnecessary and add clutter. NumPy integer types work correctly in these contexts (indexing, arithmetic, dictionary keys).
Consider removing the unnecessary casts:
for i in range(job_nodes): - node_idx = int(candidate_nodes[i]) - cores_available[node_idx] -= int(job_cores_per_node) - nodes[node_idx] = max(int(nodes[node_idx]), int(job_duration)) - job_allocation.append((node_idx, int(job_cores_per_node))) + node_idx = candidate_nodes[i] + cores_available[node_idx] -= job_cores_per_node + nodes[node_idx] = max(nodes[node_idx], job_duration) + job_allocation.append((node_idx, job_cores_per_node)) running_jobs[self.next_job_id] = { - "duration": int(job_duration), + "duration": job_duration, "allocation": job_allocation, }And similar changes at lines 649, 652, 658.
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- environment.py (19 hunks)
- prices_deterministic.py (1 hunks)
- test_determinism_workloadgen.py (1 hunks)
- test_sanity_workloadgen.py (1 hunks)
- train.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- test_determinism_workloadgen.py
🧰 Additional context used
📓 Path-based instructions (3)
environment.py
📄 CodeRabbit inference engine (CLAUDE.md)
environment.py: Use Gymnasium-compatible environment design for RL simulations in the Environment module
Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Files:
environment.py
train.py
📄 CodeRabbit inference engine (CLAUDE.md)
train.py: Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Files:
train.py
{train.py,weights.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Files:
train.py
🧠 Learnings (6)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
environment.py
train.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_jobs.py : Use pattern-based replay of historical job batches with full characteristics in the Jobs Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
environment.py
🧬 Code graph analysis (2)
prices_deterministic.py (1)
environment.py (1)
reset(248-281)
test_sanity_workloadgen.py (1)
workloadgen.py (3)
WorkloadGenerator (56-110), WorkloadGenConfig (35-53), sample (99-110)
🪛 Ruff (0.14.8)
prices_deterministic.py
24-24: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (5)
test_sanity_workloadgen.py (1)
1-43: LGTM! Well-structured test suite. The test suite effectively validates the workload generator's key properties:
- Determinism via identical RNG seeds
- Constraint adherence for job attributes
- Statistical sanity for Poisson arrivals
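Properties like these can be checked with a few lines of self-contained test code. The sampler below is a hypothetical stand-in for illustration, not the project's actual WorkloadGenerator API (which takes an externally managed RNG):

```python
import numpy as np

def sample_counts(seed, hours=24, lam=5.0):
    # Hypothetical stand-in for Poisson job arrivals
    rng = np.random.default_rng(seed)
    return [int(rng.poisson(lam)) for _ in range(hours)]

# Determinism: identical seeds must yield identical job streams
assert sample_counts(123) == sample_counts(123)

# Statistical sanity: empirical mean close to lambda over many samples
mean = np.mean(sample_counts(0, hours=10_000))
assert abs(mean - 5.0) < 0.2
```

The same three-part structure (determinism, constraints, distribution shape) scales to any pluggable generator.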
prices_deterministic.py (1)
43-143: LGTM! Clean deterministic pricing implementation. The Prices class provides a well-structured deterministic pricing model with clear state management, proper reset behavior, and helpful statistics/visualization utilities.
train.py (1)
49-51: LGTM! Clean workload generator integration. The workload generator integration follows best practices:
- CLI arguments provide flexible configuration
- Bounds are sourced from environment constants (DRY principle)
- Conditional instantiation based on user selection
- Properly wired into environment initialization
Also applies to: 96-111, 133-134
environment.py (2)
151-155: LGTM! Proper deterministic RNG initialization. The RNG is correctly initialized using Gymnasium's seeding utility, ensuring deterministic behavior when a seed is provided. This addresses the previous AttributeError issue.
Also applies to: 248-250
459-461: LGTM! Drop penalty logic is correct. The drop penalty correctly penalizes incremental excess drops each step:
- Calculates cumulative excess drops vs baseline
- Computes new excess drops this step (delta)
- Applies penalty proportional to new excess
The min(0, ...) clamp ensures the penalty is always non-positive. Also applies to: 740-742
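The clamp is easy to see in isolation. A minimal sketch with an illustrative penalty value (the constant's actual magnitude lives in environment.py):

```python
PENALTY_DROPPED_JOB = -10.0  # illustrative value; negative by convention

def drop_penalty(new_excess_drops):
    # min(0, ...) keeps the penalty non-positive even if the agent
    # drops fewer jobs than the baseline (negative excess)
    return min(0.0, PENALTY_DROPPED_JOB * new_excess_drops)

assert drop_penalty(3) == -30.0   # 3 new excess drops -> penalty
assert drop_penalty(0) == 0.0     # no excess -> no penalty
assert drop_penalty(-2) == 0.0    # clamped: no positive reward
```

The clamp matters because a reward for beating the baseline would otherwise leak in through a negative excess-drop delta.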
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
environment.py (1)
354-361: Missing queue capacity check for external_jobs path. The external_hourly_jobs path (lines 364-375) and workload_gen path (lines 386-399) both enforce queue capacity limits before generating jobs. However, the external_jobs path does not apply any such limit, which could lead to generating more jobs than the queue can accept. Consider adding a similar capacity check:
if self.external_jobs: - jobs = self.jobs_sampler.sample_one_hourly(wrap=True)["hourly_jobs"] + # How much room is left right now? + queue_used = int(np.count_nonzero(job_queue_2d[:, 0] > 0)) + queue_free = max(0, MAX_QUEUE_SIZE - queue_used) + max_to_generate = min(queue_free, MAX_NEW_JOBS_PER_HOUR) + + if max_to_generate == 0: + jobs = [] + else: + jobs = self.jobs_sampler.sample_one_hourly(wrap=True)["hourly_jobs"] + if len(jobs) > max_to_generate: + jobs = jobs[:max_to_generate] if len(jobs) > 0:Based on past review discussion where this issue was acknowledged.
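The capacity pre-check suggested in the diff boils down to a few lines. A standalone sketch using the documented constants (1000-job queue, 1500 new jobs per hour), assuming the queue layout stores job duration in column 0:

```python
import numpy as np

MAX_QUEUE_SIZE = 1000
MAX_NEW_JOBS_PER_HOUR = 1500

def max_jobs_to_generate(job_queue_2d):
    # Occupied slots carry a positive duration in column 0
    queue_used = int(np.count_nonzero(job_queue_2d[:, 0] > 0))
    queue_free = max(0, MAX_QUEUE_SIZE - queue_used)
    return min(queue_free, MAX_NEW_JOBS_PER_HOUR)

queue = np.zeros((MAX_QUEUE_SIZE, 3))
queue[:990, 0] = 1          # 990 slots occupied
assert max_jobs_to_generate(queue) == 10
queue[:, 0] = 1             # queue full
assert max_jobs_to_generate(queue) == 0
```

Applying this one helper in all three sampling paths would keep the overflow behavior uniform.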
🧹 Nitpick comments (2)
train.py (1)
66-72: Consider using norm_path for prices_file_path earlier. Per earlier discussion, norm_path is used here for the condition check but prices_file_path itself is passed to pd.read_csv without normalization. The current usage works correctly since the condition guards the read, but applying norm_path to prices_file_path at line 61 would be more consistent with the pattern used for other external paths.
environment.py (1)
311-327: Consolidate redundant counter initializations.
dropped_this_episode and baseline_dropped_this_episode are initialized twice: first at lines 185-189 in __init__ and again at lines 324-325 in reset_state(). The __init__ initialization is redundant since reset_state() is called from __init__ at line 192. Consider removing the duplicate initialization in __init__ (lines 183-189) since reset_state() handles all counter resets:
- # Job tracking metrics for agent
- self.jobs_dropped = 0
- self.dropped_this_episode = 0
-
- # Job tracking metrics for baseline
- self.baseline_jobs_dropped = 0
- self.baseline_dropped_this_episode = 0
-
  self.reset_state()
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- environment.py (19 hunks)
- train.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
train.py
📄 CodeRabbit inference engine (CLAUDE.md)
train.py: Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Files:
train.py
{train.py,weights.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Files:
train.py
environment.py
📄 CodeRabbit inference engine (CLAUDE.md)
environment.py: Use Gymnasium-compatible environment design for RL simulations in the Environment module
Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Files:
environment.py
🧠 Learnings (6)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_jobs.py : Use pattern-based replay of historical job batches with full characteristics in the Jobs Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
environment.py
🧬 Code graph analysis (2)
train.py (2)
workloadgen.py (2)
WorkloadGenerator (56-110), WorkloadGenConfig (35-53)
environment.py (2)
ComputeClusterEnv (66-912), render (71-72)
environment.py (6)
prices_deterministic.py (3)
Prices (5-143), reset (52-67), advance_and_get_predicted_prices (88-92)
prices.py (1)
Prices (4-121)
plot.py (1)
plot (6-77)
sampler_jobs.py (5)
DurationSampler (7-444), parse_jobs (18-115), precalculate_hourly_jobs (364-383), sample_one_hourly (416-444), sample (123-168)
sampler_hourly.py (2)
parse_jobs (12-100), sample (102-142)
workloadgen.py (1)
sample(99-110)
🔇 Additional comments (16)
train.py (4)
10-18: LGTM! Clean imports for the new workload generator integration and environment constants needed for configuration.
21-24: LGTM! Good defensive helper to normalize empty strings to None for consistent falsy handling.
55-57: LGTM! CLI arguments for workload generator configuration are well-designed with sensible defaults matching WorkloadGenConfig.
148-169: LGTM! Environment construction correctly passes workload_gen and uses norm_path for external data paths. The evaluation_mode and workload_gen parameters align with the updated ComputeClusterEnv signature.
environment.py (12)
44-46: LGTM! Clear penalty constant for dropped jobs with an explicit negative value. This aligns with the reward system convention.
84-93: LGTM! Useful debug validator for next_empty_slot invariants. The assertions check both that the slot is empty and that no holes exist before it.
117-118: LGTM! Constructor properly accepts optional workload_gen and initializes RNG scaffolding (np_random, _seed) for deterministic seeding in reset(). Also applies to: 150-154
160-169: LGTM! Jobs sampler initialization is correctly conditioned on not self.workload_gen, preventing redundant parsing when the workload generator is active.
247-252: LGTM! RNG is now properly seeded in reset() using Gymnasium's seeding.np_random(seed). This addresses the previous critical issue where self.np_random was None.
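The seeding pattern can be reduced to a dependency-free sketch; gymnasium.utils.seeding.np_random is, at its core, a thin wrapper around numpy's Generator. Class and method shapes below are illustrative, not the project's code:

```python
import numpy as np

class TinyEnv:
    """Minimal sketch of the Gymnasium-style seeding pattern."""
    def reset(self, seed=None):
        # Equivalent in spirit to: self.np_random, _ = seeding.np_random(seed)
        self.np_random = np.random.default_rng(seed)
        first_obs = int(self.np_random.integers(0, 10))
        return first_obs, {}

env = TinyEnv()
a, _ = env.reset(seed=42)
b, _ = env.reset(seed=42)
assert a == b  # same seed -> same first observation
```

Routing every random draw through self.np_random, as the review notes, is what makes whole episodes replayable from a single seed.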
384-421: LGTM! Workload generator integration is well-structured:
- Queue capacity is checked before sampling
- Jobs are capped to available queue space
- Legacy path correctly uses self.np_random instead of global RNG
471-473: LGTM! The excess drop calculation correctly computes the incremental penalty only for new excess drops beyond baseline.
517-518: Clarify truncation vs termination semantics. Setting truncated=True and terminated=False at episode end is a valid Gymnasium pattern for time-limit truncations. This correctly signals that the episode ended due to reaching the maximum steps rather than a terminal state, which is important for value bootstrapping in RL algorithms.
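The distinction can be captured in a toy helper (names here are illustrative, not from the codebase): truncation marks a pure time limit, termination marks a real terminal state, and RL libraries bootstrap the value estimate only on truncation.

```python
def episode_end_flags(step, max_steps, reached_terminal_state):
    # Real terminal state -> terminated; pure time limit -> truncated
    terminated = reached_terminal_state
    truncated = (step >= max_steps) and not terminated
    return terminated, truncated

# 336-hour episode hitting its time limit: truncated, not terminated
assert episode_end_flags(336, 336, False) == (False, True)
# A genuine terminal state before the limit: terminated only
assert episode_end_flags(100, 336, True) == (True, False)
```

For a fixed-horizon scheduler episode there is no natural terminal state, so the truncated-only signal is the right choice.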
622-690: LGTM! The assign_jobs_to_available_nodes method now correctly:
- Tracks dropped jobs when they exceed MAX_JOB_AGE
- Updates per-episode and cumulative drop counters
- Properly updates next_empty_slot when clearing dropped jobs
- Returns both num_processed_jobs and maintains the slot tracking
752-762: LGTM! Drop penalty logic is now clean: min(0, PENALTY_DROPPED_JOB * self.new_excess) ensures the penalty is always non-positive. This addresses the previous review comment about dead code.
875-911: LGTM! Episode completion data now includes comprehensive drop metrics for both agent and baseline, enabling proper analysis of job handling performance.
8-9: prices_deterministic module is properly integrated with compatible interface. The prices_deterministic module exists and provides all required methods matching the environment.py usage: reset(), get_price_context(), advance_and_get_predicted_prices(), get_price_stats(), and the attributes MAX_PRICE and MIN_PRICE. The interface is fully compatible with the environment implementation.
…tes timely distributions of incoming jobs. Direct input into agent not yet finished, but pipelines mostly set. train.py has added parsers to call generator. Fixed: Missing deque import, correct "flat" distribution check, removed duplicate lines. Commit description FairRootGroup#2: Workload generator: Added new script, including sanity checks. Additional fixes: Remove unused import, add queue capacity also for the job randomizer, norm_path check now also for prices. Removed non-essential type casts. Hotfix for a triplet that occurred during rebase.
9f873b4 to 4c9e465 Compare
I've added 3 commits:
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
environment.py (1)
354-361: Apply queue capacity checks to the external_jobs path. The external_hourly_jobs and workload_gen paths include queue capacity checks to prevent generating more jobs than can be enqueued (lines 365-375, 387-399). The external_jobs path should have the same protection to maintain consistency and prevent queue overflow. This was previously noted by reviewer rbx.
🔎 Apply this diff to add queue capacity checks:
if self.external_jobs: + # Check queue capacity before sampling + queue_used = int(np.count_nonzero(job_queue_2d[:, 0] > 0)) + queue_free = max(0, MAX_QUEUE_SIZE - queue_used) + max_to_generate = min(queue_free, MAX_NEW_JOBS_PER_HOUR) + + if max_to_generate == 0: + jobs = [] + else: - jobs = self.jobs_sampler.sample_one_hourly(wrap=True)["hourly_jobs"] + jobs = self.jobs_sampler.sample_one_hourly(wrap=True)["hourly_jobs"] + if len(jobs) > max_to_generate: + jobs = jobs[:max_to_generate] + if len(jobs) > 0: for job in jobs:
🧹 Nitpick comments (1)
environment.py (1)
8-9: Remove commented-out import. The old import statement should be removed rather than commented out, as the code has been refactored to use the deterministic prices module.
🔎 Apply this diff to clean up the commented code:
-#from prices import Prices from prices_deterministic import Prices # Test re-worked prices script
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- environment.py (20 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
environment.py
📄 CodeRabbit inference engine (CLAUDE.md)
environment.py: Use Gymnasium-compatible environment design for RL simulations in the Environment module
Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Files:
environment.py
🧠 Learnings (6)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_jobs.py : Use pattern-based replay of historical job batches with full characteristics in the Jobs Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
environment.py
🧬 Code graph analysis (1)
environment.py (6)
prices_deterministic.py (3)
Prices (5-143), reset (52-67), advance_and_get_predicted_prices (88-92)
weights.py (1)
Weights (5-16)
sampler_jobs.py (3)
DurationSampler (7-444), parse_jobs (18-115), sample (123-168)
sampler_duration.py (1)
init (42-54)
sampler_hourly.py (2)
parse_jobs (12-100), sample (102-142)
workloadgen.py (1)
sample(99-110)
🔇 Additional comments (19)
environment.py (19)
44-45: LGTM! The explicit penalty constant for dropped jobs is well-defined and documented. The negative value correctly represents a penalty in the reward system.
85-93: LGTM! The validation helper is well-structured for debugging queue state consistency. The assertions appropriately verify that next_empty points to an actual empty slot and that there are no holes before it.
117-118: LGTM! The new parameters evaluation_mode and workload_gen extend the environment's capabilities to support deterministic workload generation and evaluation scenarios.
151-155: LGTM! The deterministic RNG scaffolding is correctly initialized. The np_random is set to None here and properly seeded in the reset() method (line 249), which follows Gymnasium's seeding pattern.
160-169: LGTM! The conditional loading of external jobs data correctly excludes the jobs_sampler when workload_gen is provided, preventing redundant initialization and ensuring the pluggable generator takes precedence.
183-189: LGTM! Job drop tracking metrics are properly initialized for both agent and baseline, enabling comparison of drop behavior between the two policies.
247-261: LGTM! The reset method properly implements Gymnasium's seeding pattern:
- Calls
super().reset(seed=seed)to handle parent class seeding- Seeds the deterministic RNG using
seeding.np_random(seed)- Initializes predicted prices from the reset prices object
This ensures reproducibility across episodes.
311-327: LGTM! The additional tracking metrics for job drops and rejections are properly initialized in reset_state(). These metrics enable comprehensive monitoring of job handling across episodes for both agent and baseline policies.
336-336: LGTM! The use of advance_and_get_predicted_prices() aligns with the deterministic price handling refactor and ensures predicted prices are updated consistently each step.
365-375: LGTM! The queue capacity checks before sampling prevent generating jobs that cannot be enqueued, improving robustness. The logic correctly:
- Calculates available queue space
- Caps generation at the minimum of queue space and MAX_NEW_JOBS_PER_HOUR
- Handles the zero-capacity case by setting jobs to an empty list
385-421: LGTM! The workload generator integration is well-implemented:
- Queue capacity checks prevent overflow (lines 387-399)
- Jobs are capped at max_to_generate to respect queue limits
- Legacy path uses self.np_random for deterministic generation (lines 409, 414, 419-420)
- Maintains backward compatibility with external durations
The deterministic RNG replacement across all random calls ensures reproducibility.
428-428: LGTM! Tracking jobs_rejected_queue_full provides visibility into when the queue reaches capacity, which is useful for monitoring and tuning queue size limits.
444-444: Verify signature consistency in baseline_step. The assign_jobs_to_available_nodes method now returns three values (num_launched_jobs, next_empty_slot, num_dropped). Ensure the baseline_step call (line 699) handles this correctly. Based on line 699, the baseline call correctly unpacks three values with _ for the drop count, so this is consistent.
471-471: LGTM! Storing num_dropped_this_step enables per-step drop penalties in the reward calculation.
515-516: LGTM! Setting truncated=True and terminated=False correctly signals that the episode ended due to a time limit rather than a terminal state, which aligns with Gymnasium's truncation semantics for fixed-horizon episodes.
620-688: LGTM! The job assignment logic now properly handles job drops:
- Jobs exceeding MAX_JOB_AGE are cleared from the queue (lines 670-676)
- Drop counters are correctly maintained for agent and baseline (lines 679-684)
- The next_empty_slot is updated when jobs are dropped (lines 675-676)
- Jobs that can't be assigned continue to age (line 686)
The three-value return signature (num_processed_jobs, next_empty_slot, num_dropped) is consistent with the call sites.
- Unpacks the three-value return from
assign_jobs_to_available_nodes(line 699)The drop count is intentionally discarded with
_since baseline drop tracking is handled internally withinassign_jobs_to_available_nodes.
751-764: LGTM!The drop penalty implementation is clean and correct:
- Uses
min(0, PENALTY_DROPPED_JOB * self.num_dropped_this_step)to ensure the penalty is non-positive (line 752)- Weighted by
drop_weightfor configurability (line 753)- Penalizes only the agent's drops without dependency on baseline, addressing the previous concern about hidden baseline dependency
- Integrated into the total reward calculation (line 761)
Based on learnings, this aligns with the weighted reward component architecture.
875-910: LGTM!The episode completion recording properly captures drop metrics:
- Drop rates are correctly calculated with zero-division protection (lines 875-876)
- All relevant drop metrics are included in the episode data for both agent and baseline (lines 902-909)
- Enables comprehensive analysis of job handling performance across episodes
Based on learnings, this extends the episode metrics structure appropriately.
Track episode/time counter in environment, pass it to prices function, and read the current price.
Actionable comments posted: 2
🧹 Nitpick comments (4)
prices_deterministic.py (1)
23-24: Consider extracting exception message to a constant. Ruff suggests avoiding long exception messages directly in the raise statement (TRY003). While the current message is clear and relatively short, you could optionally extract it to a module-level constant for consistency with linting guidelines.
🔎 Optional refactor:
At module level (after imports):
_EMPTY_PRICES_ERROR = "external_prices must be a non-empty sequence"
Then update the raise statement:
if prices.size == 0: - raise ValueError("external_prices must be a non-empty sequence") + raise ValueError(_EMPTY_PRICES_ERROR)environment.py (3)
8-9: Remove commented-out import. The old prices import is commented out while prices_deterministic is marked as a test. Either finalize the transition and remove the commented line, or clarify the intent with a TODO comment. 🔎 Apply this diff to clean up the import:
-#from prices import Prices -from prices_deterministic import Prices # Test re-worked prices script +from prices_deterministic import Prices
375-383: Consider adding queue capacity check to external_jobs path. The external_hourly_jobs and workload_gen paths both include queue capacity checks to prevent generating more jobs than can be enqueued (lines 386-396, 407-420). The external_jobs path should also implement this check for consistency. 🔎 Apply this diff to add queue capacity check:
if self.external_jobs: - jobs = self.jobs_sampler.sample_one_hourly(wrap=True)["hourly_jobs"] + queue_used = int(np.count_nonzero(job_queue_2d[:, 0] > 0)) + queue_free = max(0, MAX_QUEUE_SIZE - queue_used) + max_to_generate = min(queue_free, MAX_NEW_JOBS_PER_HOUR) + + if max_to_generate == 0: + jobs = [] + else: + jobs = self.jobs_sampler.sample_one_hourly(wrap=True)["hourly_jobs"] + if len(jobs) > max_to_generate: + jobs = jobs[:max_to_generate] + if len(jobs) > 0: for job in jobs:
428-442: Clean up commented legacy code. The legacy np.random calls are commented out but retained. Since the migration to self.np_random is complete and functional, consider removing the commented lines to reduce clutter. 🔎 Apply this diff to remove commented code:
else: - # new_jobs_count = np.random.randint(0, MAX_NEW_JOBS_PER_HOUR + 1) # Keep legacy code for now new_jobs_count = self.np_random.integers(0, MAX_NEW_JOBS_PER_HOUR + 1) if self.external_durations: new_jobs_durations = durations_sampler.sample(new_jobs_count) else: - # new_jobs_durations = np.random.randint(1, MAX_JOB_DURATION + 1, size=new_jobs_count) new_jobs_durations = self.np_random.integers(1, MAX_JOB_DURATION + 1, size=new_jobs_count) # Generate random node and core requirements for _ in range(new_jobs_count): - # new_jobs_nodes.append(np.random.randint(MIN_NODES_PER_JOB, MAX_NODES_PER_JOB + 1)) - # new_jobs_cores.append(np.random.randint(MIN_CORES_PER_JOB, CORES_PER_NODE + 1)) new_jobs_nodes.append(self.np_random.integers(MIN_NODES_PER_JOB, MAX_NODES_PER_JOB + 1)) new_jobs_cores.append(self.np_random.integers(MIN_CORES_PER_JOB, CORES_PER_NODE + 1)) -#----------------------------------------------------------------------------------------------------------------------------------------------
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- environment.py (20 hunks)
- prices_deterministic.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
environment.py
📄 CodeRabbit inference engine (CLAUDE.md)
environment.py: Use Gymnasium-compatible environment design for RL simulations in the Environment module
Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Files:
environment.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_jobs.py : Use pattern-based replay of historical job batches with full characteristics in the Jobs Sampler
Applied to files:
environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
environment.py
🧬 Code graph analysis (2)
environment.py (4)
prices_deterministic.py (3)
- `Prices` (5-152)
- `reset` (52-76)
- `advance_and_get_predicted_prices` (97-101)

sampler_jobs.py (4)
- `DurationSampler` (7-444)
- `parse_jobs` (18-115)
- `sample_one_hourly` (416-444)
- `sample` (123-168)

sampler_hourly.py (2)
- `parse_jobs` (12-100)
- `sample` (102-142)

workloadgen.py (1)
- `sample` (99-110)
prices_deterministic.py (1)
environment.py (1)
- `reset` (250-301)
🪛 Ruff (0.14.8)
prices_deterministic.py
24-24: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (8)
environment.py (8)
85-93: LGTM! The debug validator is well-structured and provides useful assertions for maintaining queue invariants during development.

151-169: LGTM! The RNG initialization and workload generator integration are correctly structured. The conditional logic properly ensures only one job generation method is active.

250-281: LGTM! The deterministic RNG seeding and episode-based price window selection are well-implemented, ensuring reproducible training runs.

465-465: LGTM! The drop tracking integration and proper use of `truncated`/`terminated` flags align with Gymnasium conventions and provide accurate metrics. Also applies to: 492-492, 536-537.

641-709: LGTM! The drop tracking logic is well-implemented. Jobs that exceed `MAX_JOB_AGE` are properly cleared from the queue, and separate counters for agent and baseline are correctly maintained.

718-720: LGTM! The baseline tracking correctly mirrors the agent path, and the updated `assign_jobs` signature is properly handled.

896-930: LGTM! The episode recording comprehensively captures drop metrics for both agent and baseline, enabling detailed analysis of job handling performance.

772-774: Verify `drop_weight` exists in Weights class. The code references `self.weights.drop_weight`, which is properly defined in the `Weights` dataclass with a default value of 0.0.
…ary import and initialization in env
Actionable comments posted: 1
📒 Files selected for processing (2)
environment.py (20 hunks), prices_deterministic.py (1 hunks)
🧬 Code graph analysis (2)
prices_deterministic.py (1)
environment.py (1)
- `reset` (249-300)
environment.py (4)
prices_deterministic.py (3)
- `Prices` (5-151)
- `reset` (52-76)
- `advance_and_get_predicted_prices` (96-100)

prices.py (1)
- `Prices` (4-121)

sampler_jobs.py (5)
- `DurationSampler` (7-444)
- `parse_jobs` (18-115)
- `precalculate_hourly_jobs` (364-383)
- `sample_one_hourly` (416-444)
- `sample` (123-168)

workloadgen.py (1)
- `sample` (99-110)
🔇 Additional comments (27)
prices_deterministic.py (7)
1-10: LGTM! The imports are complete (including `deque` from `collections`, which addresses a previous concern), and the class constants are well-defined with sensible values for electricity price modeling.

12-41: Initialization logic is sound. The empty array guard (line 23-24) and deque initialization (line 18) address previous review concerns. The price shifting logic correctly handles negative prices, and percentile-based bounds are properly computed.

The static analysis hint about TRY003 (line 24) is a style suggestion to define a custom exception class for long messages, but the current approach is acceptable for this context.

44-50: LGTM! The synthetic price generation is deterministic and uses a sensible daily sinusoid pattern with a lower bound. The `get_real_price` method correctly reverses the price shift applied during initialization.

52-77: LGTM! The reset method properly reinitializes the deque (line 58) and handles both external and synthetic price modes correctly. The external mode uses modulo arithmetic for wrapping and correctly positions the `price_index` after the prediction window.

79-100: LGTM! The stateful stepping methods are well-designed. The `get_next_price` method correctly uses the bounded deque (which auto-removes the oldest entry when full), `get_price_context` safely handles empty history, and `advance_and_get_predicted_prices` returns a copy to prevent external mutation.

103-124: LGTM! The non-mutating utilities correctly separate stats computation from internal state. The `_generated_prices_for_stats` method generates a full year of synthetic prices without side effects, and `get_price_stats` provides comprehensive statistics with proper handling of all three price modes.

126-151: LGTM! The histogram plotting method correctly handles all price modes, provides clear visual markers for key statistics, and properly closes the figure after use. The conditional logic for determining the price type label is correct.
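The bounded-deque behavior noted for `get_next_price` above comes from `collections.deque` with `maxlen`: appending to a full deque silently evicts the oldest entry. A minimal standalone sketch (variable names are illustrative, not the module's actual API):

```python
from collections import deque

# A deque with maxlen=3 keeps only the three most recent prices;
# appending to a full deque evicts the oldest entry automatically.
history = deque(maxlen=3)
for price in [10.0, 12.5, 9.0, 11.0]:
    history.append(price)

print(list(history))  # the first price (10.0) has been evicted
```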
environment.py (20)
8-8: LGTM! The import changes correctly switch to the deterministic variants (`prices_deterministic.Prices` and `sampler_jobs.DurationSampler`) and add the necessary `seeding` utility for deterministic RNG integration, aligning with the PR's objectives. Also applies to: 12-12, 15-16.

43-44: LGTM! The `PENALTY_DROPPED_JOB` constant is correctly defined as a negative value to penalize dropped jobs in the reward calculation.

84-91: LGTM! The `_validate_next_empty` debugging utility correctly validates the invariants for the `next_empty` slot pointer: that it points to an empty slot and that there are no holes before it.

117-117: LGTM! The addition of the `workload_gen` parameter and initialization of deterministic RNG fields (`np_random`, `_seed`) align with the PR's goal of supporting deterministic workload generation. Also applies to: 150-153.

159-168: LGTM! The conditional initialization correctly avoids redundant sampler setup when `workload_gen` is provided. The sampler is only created when external jobs are used and no workload generator is present, which is the correct logic.

182-191: LGTM! The drop tracking metrics for both agent and baseline are properly initialized. The `episode_idx = -1` initialization is a clean approach to ensure the first reset sets it to 0.

249-272: LGTM! The reset method correctly seeds the deterministic RNG (line 251), tracks episode indices, and implements episode-based price windowing. The calculation `start_index = (self.episode_idx * episode_span) % n_prices` ensures each episode starts at a different point in the external price series with proper wrapping.
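The wrapping of that start-index formula is easy to check in isolation. The sketch below uses the 336-hour episode span from this environment, but the series length of 1000 prices is an invented value for illustration:

```python
# Hypothetical numbers: a 336-hour episode window over a 1000-entry price series.
episode_span = 336
n_prices = 1000

def window_start(episode_idx):
    # Each episode begins one span further into the series, wrapping at the end.
    return (episode_idx * episode_span) % n_prices

starts = [window_start(i) for i in range(4)]
print(starts)  # the fourth episode wraps past the end of the series
```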
280-280: LGTM! Using `self.prices.predicted_prices.copy()` correctly initializes the state with the deterministic price prediction window.

355-355: LGTM! The change to `advance_and_get_predicted_prices()` correctly uses the deterministic prices module's method for advancing the price timeline.

374-374: LGTM! The switch to `sample_one_hourly(wrap=True)` correctly uses the DurationSampler's API for retrieving hourly jobs.

383-394: LGTM! The queue capacity check correctly prevents over-generation when the queue is full by computing available space and capping the maximum jobs to generate. Passing `rng=self.np_random` and `max_jobs=max_to_generate` to the sampler enables deterministic, bounded job generation.

404-440: LGTM! The workload generator integration is well-designed:
- Queue capacity checks prevent overflow
- The generator is called with deterministic RNG (`self.np_random`)
- Truncation handles cases where the generator produces more jobs than can be enqueued
- Legacy path correctly replaces global `np.random` with `self.np_random` for deterministic behavior

447-447: LGTM! The calculation correctly tracks jobs rejected due to a full queue by comparing the number of jobs generated to the number actually added.

463-463: LGTM! The updated `assign_jobs_to_available_nodes` call correctly unpacks the new third return value (`num_dropped_this_step`), and storing it in `self.num_dropped_this_step` makes it available for the reward calculation. Also applies to: 490-490.

534-535: LGTM! The change to `truncated=True` and `terminated=False` correctly follows Gymnasium semantics: episodes ending due to time limits should set `truncated=True`, while task completion or failure conditions set `terminated=True`.
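The Gymnasium convention referenced here reduces to a toy flag function. `EPISODE_HOURS` below mirrors this environment's 336-hour episodes; everything else is illustrative, not the project's actual `step()`:

```python
# Minimal illustration of Gymnasium's terminated/truncated convention:
# a pure time-limit episode never "terminates"; it is "truncated" at the horizon.
EPISODE_HOURS = 336  # 2-week episode, as in this environment

def step_flags(current_step):
    terminated = False                        # no goal-based end condition here
    truncated = current_step >= EPISODE_HOURS # time limit reached
    return terminated, truncated

print(step_flags(100))  # mid-episode: neither flag set
print(step_flags(336))  # at the horizon: truncated only
```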
639-707: LGTM! The updated `assign_jobs_to_available_nodes` method correctly implements job dropping when age exceeds `MAX_JOB_AGE`. The logic:
- Successfully assigned jobs are processed as before (lines 649-684)
- Jobs that can't be assigned have their age incremented (line 687)
- Jobs exceeding `MAX_JOB_AGE` are dropped, cleared from the queue, and tracked separately for agent vs baseline (lines 689-703)
- Jobs not exceeding the age limit are aged in place (line 705)

The return signature change (line 707) correctly adds `num_dropped` as the third return value.
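The age-then-drop behavior can be sketched on a toy queue. The `(duration, age)` slot layout and the `MAX_JOB_AGE` value below are illustrative stand-ins, not the real queue format:

```python
# Toy queue: each slot is (duration, age); duration 0 marks an empty slot.
MAX_JOB_AGE = 3  # illustrative; the real limit differs

def age_and_drop(queue):
    """Increment the age of waiting jobs; drop those exceeding MAX_JOB_AGE."""
    dropped = 0
    for i, (duration, age) in enumerate(queue):
        if duration == 0:
            continue  # empty slot, nothing to age
        age += 1
        if age > MAX_JOB_AGE:
            queue[i] = (0, 0)  # clear the slot: job dropped
            dropped += 1
        else:
            queue[i] = (duration, age)  # age the job in place
    return dropped

queue = [(5, 3), (2, 1), (0, 0)]
dropped = age_and_drop(queue)
print(dropped, queue)  # first job crosses the age limit and is dropped
```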
716-716: LGTM! The baseline tracking correctly mirrors the agent's logic:
- Line 716 tracks rejected jobs due to full queue
- Line 718 correctly unpacks the updated return signature and ignores the launched/dropped counts (which are tracked separately via instance variables updated inside `assign_jobs_to_available_nodes`)

Also applies to: 718-718.

780-780: LGTM! The drop penalty is correctly added to the total reward (line 780), and the debug print statement (line 783) is updated to include the weighted drop penalty in the reward breakdown. Also applies to: 783-783.

770-772: The drop penalty calculation is correct, using `min(0, PENALTY_DROPPED_JOB * self.num_dropped_this_step)` to ensure the penalty is never positive. The `drop_weight` attribute is properly defined in the `Weights` class (as a float attribute) and is correctly referenced at line 772. Likely an incorrect or invalid review comment.
Actionable comments posted: 2
📒 Files selected for processing (1)
environment.py (20 hunks)
🧬 Code graph analysis (1)
environment.py (4)
prices_deterministic.py (3)
- `Prices` (5-151)
- `reset` (52-76)
- `advance_and_get_predicted_prices` (96-100)

weights.py (1)
- `Weights` (5-16)

sampler_jobs.py (4)
- `DurationSampler` (7-444)
- `parse_jobs` (18-115)
- `precalculate_hourly_jobs` (364-383)
- `sample` (123-168)

sampler_hourly.py (2)
- `parse_jobs` (12-100)
- `sample` (102-142)
🔇 Additional comments (7)
environment.py (7)
1-44: Imports and constants look good. The addition of deterministic RNG seeding and the `PENALTY_DROPPED_JOB` constant align well with the PR objectives.

249-300: RNG seeding and episode-aware price initialization implemented correctly. The deterministic RNG is properly seeded using Gymnasium's `seeding.np_random()`, and the price series correctly wraps around based on episode index. This ensures reproducible episodes.
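The reproducibility this buys can be demonstrated with a dependency-free stand-in. The project itself uses `gymnasium.utils.seeding.np_random` (which wraps a NumPy `Generator`); the sketch below substitutes stdlib `random.Random` only to keep the example self-contained:

```python
import random

# Stand-in for the env's seeded RNG: same seed -> identical job streams.
def sample_job_counts(seed, n=5, max_jobs=10):
    rng = random.Random(seed)  # fresh, seeded generator per "episode"
    return [rng.randint(0, max_jobs) for _ in range(n)]

run_a = sample_job_counts(seed=42)
run_b = sample_job_counts(seed=42)

print(run_a == run_b)  # prints True: same seed reproduces the stream exactly
```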
380-423: Capacity checks and deterministic RNG usage look good. Both the `hourly_sampler` and `workload_gen` pathways correctly:
- Calculate available queue capacity before generating jobs
- Cap generation to prevent exceeding `MAX_QUEUE_SIZE` or `MAX_NEW_JOBS_PER_HOUR`
- Use the instance-level `self.np_random` for deterministic sampling

This prevents queue overflow and ensures reproducibility.
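Both pathways reduce to the same capacity cap, sketched here with the documented cluster constraints (1000-job queue, 1500 new jobs per hour); the function name is ours:

```python
MAX_QUEUE_SIZE = 1000        # max jobs in queue, per environment constraints
MAX_NEW_JOBS_PER_HOUR = 1500 # max new jobs per hour

def max_to_generate(queue_used):
    """Cap job generation so the queue can never overflow."""
    queue_free = max(0, MAX_QUEUE_SIZE - queue_used)
    return min(queue_free, MAX_NEW_JOBS_PER_HOUR)

print(max_to_generate(0))     # empty queue: limited by queue size (1000)
print(max_to_generate(990))   # nearly full: only 10 slots left
print(max_to_generate(1000))  # full queue: generate nothing
```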
533-534: Correct truncation semantics for episode termination. Setting `truncated=True` and `terminated=False` at the episode time limit is the correct Gymnasium pattern for distinguishing time-based episode endings from goal-based termination.

638-706: Job assignment and drop logic implemented correctly. The function properly:
- Ages waiting jobs by 1 hour per step
- Drops jobs exceeding `MAX_JOB_AGE` and tracks the count
- Updates `next_empty_slot` for efficient queue management
- Returns three values (launched, next_slot, dropped) as expected by callers
769-782: Drop penalty calculation is correct and addresses previous feedback. The penalty correctly:
- Uses only the agent's dropped jobs (not baseline) as requested by rbx
- Applies `PENALTY_DROPPED_JOB` (-5.0) per dropped job
- Uses `min(0, ...)` as defensive coding to ensure a non-positive penalty
- Is weighted by `drop_weight` (default 0, making it experimental)

Based on learnings, the default `drop_weight` is 0, which aligns with the PR description stating this feature is experimental and can be disabled.
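Taken together, the penalty term sketches as follows; the function name is ours, while the constant matches the -5.0 noted above:

```python
PENALTY_DROPPED_JOB = -5.0  # per-job penalty, as in environment.py

def weighted_drop_penalty(num_dropped, drop_weight=0.0):
    """Non-positive drop penalty; drop_weight=0 disables it (experimental default)."""
    penalty = min(0, PENALTY_DROPPED_JOB * num_dropped)  # clamp to <= 0
    return drop_weight * penalty

print(weighted_drop_penalty(3, drop_weight=0.1))  # 3 drops, weight 0.1
print(weighted_drop_penalty(3))                   # default weight 0: no effect
```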
893-928: Episode metrics tracking for dropped jobs implemented correctly. The drop rate calculations use safe division, and all relevant metrics (dropped, rejected, rates) are recorded for both agent and baseline, enabling comprehensive episode analysis.
Actionable comments posted: 0
🧹 Nitpick comments (1)
environment.py (1)
256-259: Remove redundant `hasattr` check.

The `episode_idx` attribute is already initialized in `__init__` (line 192), making the `hasattr` check unnecessary.

🔎 Apply this diff to simplify the logic:

```diff
-# Track which episode this env instance is on
-if not hasattr(self, "episode_idx"):
-    self.episode_idx = 0
-else:
-    self.episode_idx += 1
+# Track which episode this env instance is on
+self.episode_idx += 1
```
📒 Files selected for processing (1)
environment.py (20 hunks)
🧬 Code graph analysis (1)
environment.py (4)
prices_deterministic.py (3)
- `Prices` (5-151)
- `reset` (52-76)
- `advance_and_get_predicted_prices` (96-100)

prices.py (1)
- `Prices` (4-121)

sampler_jobs.py (5)
- `DurationSampler` (7-444)
- `parse_jobs` (18-115)
- `precalculate_hourly_jobs` (364-383)
- `sample_one_hourly` (416-444)
- `sample` (123-168)

workloadgen.py (1)
- `sample` (99-110)
🔇 Additional comments (19)
environment.py (19)
8-16: LGTM! The imports correctly integrate deterministic pricing and RNG seeding utilities, aligning with the PR's deterministic workflow objectives.

43-44: LGTM! The explicit drop penalty constant is well-defined and aligns with the experimental feature described in the PR comments.

84-92: LGTM! The validation helper correctly checks queue invariants for debugging purposes.

116-117: LGTM! The new constructor parameters properly support evaluation mode and workload generator integration.

150-154: LGTM! The RNG initialization is correct; `np_random` is properly seeded in the `reset()` method (lines 252-253).

159-168: LGTM! The `jobs_sampler` initialization condition now correctly matches its usage in `step()` (line 374).

182-193: LGTM! Job tracking metrics are properly initialized in `__init__`, addressing previous concerns about attribute availability.

282-282: LGTM! The predicted prices initialization correctly uses the deterministic Prices object.

333-347: LGTM! Job tracking metrics are properly reset without redundant initializations.

356-356: LGTM! The predicted prices are correctly updated using the deterministic price advancement method.

374-441: LGTM! Job generation properly implements queue capacity checks across all paths and uses deterministic RNG throughout.

448-448: LGTM! The queue rejection tracking correctly captures jobs that couldn't be enqueued.

464-464: LGTM! The call correctly unpacks the updated return values from `assign_jobs_to_available_nodes`.

491-491: LGTM! The per-step drop count is correctly tracked for reward calculation.

535-536: LGTM! The truncated/terminated distinction correctly follows Gymnasium conventions for time-limited episodes.

640-708: LGTM! The job assignment and dropping logic is well-structured, correctly tracking drops for both agent and baseline.

717-719: LGTM! Baseline step correctly handles the updated method signature.

771-784: LGTM! The drop penalty calculation is clear and correctly ensures a non-positive penalty value.

895-929: LGTM! Episode completion recording properly captures all drop-related metrics for both agent and baseline.
Integrate my early features to prevent divergence.