Fix heap OOB read in RNN operator via sequence_lens=0#28052
Conversation
Pull request overview
Note: Copilot was unable to run its full agentic suite in this review.
Fixes a heap out-of-bounds read in the CPU RNN operator when sequence_lens contains 0, preventing unintended heap data exposure via Y_h.
Changes:
- Add a bounds check in `Assign_Y_h()` to avoid negative/invalid time-step offsets, and zero-fill the affected `Y_h` entries.
- Add an early return in `Compute()` when `max_sequence_length == 0`, zero-filling outputs `Y` and `Y_h` (aligns with LSTM/GRU behavior).
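The two changes above can be sketched in isolation. This is a simplified, hypothetical version (plain `std::vector` buffers and a `[seq_length, batch_size, hidden_size]` layout for `Y`; the real ONNX Runtime code operates on tensor spans), not the actual patch:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch of the Assign_Y_h bounds check (hypothetical, simplified signature).
// For each batch entry, copy the hidden state at the last valid time step into
// Y_h; if the sequence length is 0, last_step is negative, so zero-fill the
// Y_h entry instead of computing a negative offset into Y.
void AssignYh(const std::vector<float>& Y, std::vector<float>& Y_h,
              const std::vector<int>& sequence_lens,
              int batch_size, int hidden_size) {
  for (int b = 0; b < batch_size; ++b) {
    const int last_step = sequence_lens[b] - 1;
    float* dst = Y_h.data() + static_cast<size_t>(b) * hidden_size;
    if (last_step < 0) {
      // sequence_lens[b] == 0: no valid time step exists, so zero-fill.
      std::fill(dst, dst + hidden_size, 0.0f);
    } else {
      // Y layout (simplified): [seq_length, batch_size, hidden_size]
      const float* src = Y.data() +
          (static_cast<size_t>(last_step) * batch_size + b) * hidden_size;
      std::copy(src, src + hidden_size, dst);
    }
  }
}
```

With `sequence_lens = {0, 2}`, batch 0 of `Y_h` is zero-filled while batch 1 receives the hidden state from its last time step, which is the behavior the fix and the LSTM/GRU kernels share.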
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
In rnn_helpers.cc:84-88
The validation lambda
`[seq_length](int len) { return len < 0 || len > seq_length; }`
allows `len == 0` and `len == seq_length`, but the error message is misleading: it says values must be `> 0` and `< seq_length`. The PR relies on 0 being allowed through validation, which is correct for the fix, but the error message should be updated to match the actual constraint: `>= 0` and `<= seq_length`. This is a pre-existing bug but relevant since the PR depends on this behavior.
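A corrected version of this validation could look like the following sketch (hypothetical helper name and message wording; the real check lives inside `rnn_helpers.cc`). The point is that the error message states the constraint the predicate actually enforces:

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Sketch: validate sequence_lens against the actual constraint
// 0 <= len <= seq_length, with an error message that matches it.
// Returns an empty string on success, an error message otherwise.
std::string ValidateSequenceLens(const std::vector<int>& sequence_lens,
                                 int seq_length) {
  const bool invalid = std::any_of(
      sequence_lens.begin(), sequence_lens.end(),
      [seq_length](int len) { return len < 0 || len > seq_length; });
  if (invalid) {
    std::ostringstream msg;
    msg << "Invalid value in sequence_lens: values must be >= 0 and <= "
        << seq_length;
    return msg.str();
  }
  return "";
}
```

Note that `{0, seq_length}` passes, which the zero-length fix depends on, while negative values and values above `seq_length` are still rejected.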
No regression test: nothing verifies the fix. A test should set `sequence_lens = {0}` (an all-zero batch) and verify that `Y` and `Y_h` are all zeros.
No positivity check is performed on `hidden_size_`. If `hidden_size_ <= 0`, `Compute()` proceeds to allocate and compute with a non-positive hidden size. `ValidateCommonRnnInputs` checks `W_shape[1] != hidden_size * WRB_dim_1_multipler`, which would catch mismatches, but if a crafted model has `hidden_size = 0` and matching zero-dimension tensors, the behavior is undefined (e.g., `SafeInt<size_t>(sizeof(float)) * seq_length * batch_size * hidden_size_` evaluates to 0, allocating zero bytes that are then written to).
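A defensive check along the lines the reviewer suggests could look like this sketch (hypothetical function name; the real kernel computes the allocation size inline with `SafeInt`):

```cpp
#include <cstddef>
#include <stdexcept>

// Sketch: reject a non-positive hidden_size before any allocation sized by it.
// With hidden_size == 0, an expression like
//   sizeof(float) * seq_length * batch_size * hidden_size
// evaluates to 0, and writing into a zero-byte allocation is undefined
// behavior.
size_t ComputeYBytes(int seq_length, int batch_size, int hidden_size) {
  if (hidden_size <= 0) {
    throw std::invalid_argument("hidden_size must be positive");
  }
  return sizeof(float) * static_cast<size_t>(seq_length) *
         static_cast<size_t>(batch_size) * static_cast<size_t>(hidden_size);
}
```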
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addressed all review feedback in the latest force-push (rebased onto main). Here's a summary:
- Copilot review comments
- @yuslepukhin review comments
- Merge conflict: resolved by rebasing onto main. PR #28003 had already landed a partial fix.
Minor: the `RNN_invalid_sequence_lens` test uses vector `{0, 5}` and expects failure. It now fails because `5 > seq_length`, not because of the 0, and the comment was updated to say "5 exceeds seq_length (3)". This test therefore no longer validates that 0 alone is accepted. The new regression tests (`RNN_forward_sequence_lens_with_zero`, etc.) implicitly cover this: they succeed with `seq_lens = 0`, proving 0 passes validation. Adequate.
Description
In the CPU RNN operator's `Assign_Y_h` function, when `sequence_lens` contains a value of 0, the computation `sequence_lens[batch] - 1 = -1` produces a negative offset into the `Y` output buffer. `CopyVector` then reads `hidden_size` floats from heap memory before the buffer, leaking heap data into the `Y_h` output tensor.
LSTM and GRU already handle zero-length sequences correctly (early return + zero-fill in compute path), but the basic RNN operator had neither protection.
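The faulty index arithmetic can be sketched as follows (hypothetical helper, simplified `[seq_length, batch_size, hidden_size]` layout for `Y`; no out-of-bounds access is actually performed here, only the offset computation is shown):

```cpp
#include <cstdint>

// Sketch of the offset computation described above. With
// sequence_lens[batch] == 0, last_step becomes -1 and the resulting element
// offset is negative, so a subsequent CopyVector-style read would pull
// hidden_size floats from heap memory *before* the start of Y.
int64_t YhSourceOffset(int seq_len_for_batch, int batch, int batch_size,
                       int hidden_size) {
  const int last_step = seq_len_for_batch - 1;  // -1 when the length is 0
  return (static_cast<int64_t>(last_step) * batch_size + batch) * hidden_size;
}
```

For example, with a sequence length of 0, batch index 0, `batch_size = 2`, and `hidden_size = 4`, the offset is -8: the read would start 8 floats before the buffer.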
Changes