FIX: reduce memory in flatfield by evaluating tilts only at slit pixels#2098
Conversation
Replace full-frame meshgrid tilt evaluation with per-slit-pixel evaluation using PypeItFit.eval directly. For spectrographs with many slits (e.g., fiber-fed IFUs with hundreds of fibers), the previous approach allocated a full-frame tilts array per slit, causing excessive memory usage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
kbwestfall
left a comment
There was a problem hiding this comment.
I'm approving this as it is, given that it reduces memory so substantially. But I had some questions that I hope we can think through before we merge this.
| # Build a full-frame tilts image placeholder with only slit pixels filled | ||
| tilts = np.zeros(rawflat.shape, dtype=float) | ||
| tilts[onslit_padded] = _tilts_slit | ||
| del _tilts_slit, _spec, _spat |
There was a problem hiding this comment.
3 questions:
- Instead of adding this here, could we instead pass
onslit_paddedtotracewave.fit2tiltsand essentially get the same thing? - Do we need to explicitly delete the "work" arrays, or can we lean on garbage collection?
- I'm wondering if there's a way we could minimize the number of times we need to create the
tiltsarray, and/or try to use the same memory block.
There was a problem hiding this comment.
3 questions:
- Instead of adding this here, could we instead pass
onslit_paddedtotracewave.fit2tiltsand essentially get the same thing?
maybe. i'll take a deeper look.
- Do we need to explicitly delete the "work" arrays, or can we lean on garbage collection?
i don't see any downside to just manually deleting them if we know they're not going to be used. especially given what look like GC issues with python 3.14.
- I'm wondering if there's a way we could minimize the number of times we need to create the
tiltsarray, and/or try to use the same memory block.
given the scale of the bug being fixed here, i am sure there are plenty of places in the code that can be streamlined and made more efficient.
There was a problem hiding this comment.
asked claude and got this response:
Could we pass onslit_padded to fit2tilts instead? — Yes, this is a clean approach. fit2tilts currently creates a full nspec x nspat meshgrid (line 893), which is the memory problem. Adding an optional mask parameter would let it evaluate only at masked pixels and return either a sparse full-frame array or just the 1D values. There are only 2 other callers (wavetilts.py:154 and wavetilts.py:817), so backward compatibility is easy with a default mask=None. This would also benefit those callers if they ever need it.
Explicit del vs garbage collection — The del statements are a belt-and-suspenders measure. In a tight loop over hundreds of slits, it ensures the previous iteration's arrays are freed before allocating the next. GC would eventually collect them, but in CPython the reference counting means del triggers immediate deallocation. Given the memory-sensitive context, it's cheap insurance.
Reuse the tilts array across iterations — Currently tilts = np.zeros(...) allocates a new array each iteration. You could allocate once before the loop and tilts[:] = 0 each iteration to reuse the memory block. However, spec_coo is derived from tilts on the next line, so both would need coordinated handling. Modest win for standard spectrographs, bigger win for many-slit cases.
claude's point about cpython is a good one since the objects in question are numpy arrays.
Add an optional `slit_mask` parameter to `tracewave.fit2tilts` so that tilt evaluation at only the relevant slit pixels is handled inside the function rather than being inlined at each call site. This addresses review feedback on the memory optimization: the logic now lives in the canonical location and both callers in `flatfield.py` and `wavetilts.py` benefit from reduced memory usage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
went ahead and implemented @kbwestfall's idea of doing the fix at the |
| # extrapolation of fits which can break wavelength solution fitting | ||
| tilts = np.fmax(np.fmin(tilts, 1.2), -0.2) | ||
|
|
||
| return tilts |
There was a problem hiding this comment.
At the risk of micro-managing, can we do this instead:
if slit_mask is None:
spat_pix, spec_pix = map(
lambda x : x.ravel(), np.meshgrid(np.arange(nspat), np.arange(nspec))
)
else:
spec_pix, spat_pix = np.where(slit_mask)
tilts_vals = pypeitFit.eval(spec_pix / xnspecmin1, x2=(spat_pix - _spat_shift) / xnspatmin1)
tilts_vals = np.fmax(np.fmin(tilts_vals, 1.2), -0.2)
tilts = np.zeros(shape, dtype=float)
tilts[(spec_pix,spat_pix)] = tilts_vals
del tilts_vals, spec_pix, spat_pix
return tilts| # Tilts are created with the size of the original slitmask, | ||
| # which corresonds to the same binning as the science | ||
| # images, trace images, and pixelflats etc. | ||
| self.tilts = tracewave.fit2tilts(self.slitmask_science.shape, coeff_out, |
There was a problem hiding this comment.
Should the slit_mask parameter be added here too?
There was a problem hiding this comment.
i think so, but we'll need to fix the following check to limit it to the masked region. otherwise nanmin will always come up as 0. i'll go ahead and make that change.
Pass slit_mask=thismask_science to fit2tilts for per-slit-pixel evaluation to match the memory-saving pattern used elsewhere. Scope the subsequent tilt-range quality check to slit pixels so it is not diluted by the zeros that fit2tilts now returns outside the slit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Dev-suite running. |
|
There are a number of test failures: All the script test failures have to do with the afterburn scripts, and the vet tests may be associated with that. @tepickering , I'll post the full report in Slack. |
kbwestfall
left a comment
There was a problem hiding this comment.
Blocking merging until we can get the tests fixed.
|
it looks like this change is the culprit: the old way was doing a full-frame evaluation of the tilts so it was always fine. the 3 failures in the test run all trip on the new slit-scoped test. one option would be to set that to something smaller like 0.1 or 0.2. or add more code to set a smarter threshold based on |
… was evaluating against the whole detector while we're now only evaluating within each slit/order
|
we should do another full dev suite run to test the change i just made. |
|
I've restarted the tests! |
|
Tests pass! The unit test failure is the known LDT one. |
Replace full-frame meshgrid tilt evaluation with per-slit-pixel evaluation using PypeItFit.eval directly. For spectrographs with many slits (e.g., fiber-fed IFUs with hundreds of fibers), the previous approach allocated a full-frame tilts array per slit, causing excessive memory usage.
This was originally implemented in #2080 and was required to get the original implementation there to work. However, it's a massive improvement for any multi-slit mode.