
Fix heap OOB read in RNN operator via sequence_lens=0 #28052

Merged
vraspar merged 1 commit into main from vraspar/fix-rnn-sequence-lens-zero-oob on Apr 27, 2026

Contributor

@vraspar vraspar commented Apr 13, 2026

Description

In the CPU RNN operator's `Assign_Y_h` function, when `sequence_lens` contains a value of 0, the computation `sequence_lens[batch] - 1 = -1` produces a negative offset into the Y output buffer. `CopyVector` then reads `hidden_size` floats from heap memory before the buffer, leaking heap data into the `Y_h` output tensor.

LSTM and GRU already handle zero-length sequences correctly (early return + zero-fill in compute path), but the basic RNN operator had neither protection.
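To make the arithmetic concrete, here is a minimal sketch of how the offset goes negative. It is not the ONNX Runtime source: `LastStepOffset` is a hypothetical helper, and the single-direction `[seq_length, batch_size, hidden_size]` layout for Y is a simplifying assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not the rnn.cc code): element offset of the last
// valid time step for one batch entry, assuming Y is laid out as
// [seq_length, batch_size, hidden_size] with a single direction.
std::ptrdiff_t LastStepOffset(std::int32_t seq_len, std::ptrdiff_t batch,
                              std::ptrdiff_t batch_size,
                              std::ptrdiff_t hidden_size) {
  // With seq_len == 0 this is -1, which is what makes the offset negative.
  const std::ptrdiff_t last_time_step =
      static_cast<std::ptrdiff_t>(seq_len) - 1;
  return last_time_step * batch_size * hidden_size + batch * hidden_size;
}
```

With `batch_size = 2` and `hidden_size = 4`, a sequence length of 0 for batch entry 0 gives an offset of -8 elements, so a copy starting there would read from before the start of the buffer.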

Changes

  • rnn.cc `Compute()`: Add early return when `max_sequence_length == 0`; zero-fills Y and Y_h outputs and returns immediately (matches existing LSTM/GRU pattern)
  • rnn.cc `Assign_Y_h()`: Add bounds check on `last_time_step` before computing buffer offset; guards against both negative index (`seq_lens=0`) and index >= seq_length, and zero-fills Y_h for invalid entries
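The bounds check described above could look roughly like the following. This is an illustrative stand-in rather than the merged code: `AssignLastHidden`, the use of `std::vector`, and the single-direction layout are all assumptions made to keep the sketch self-contained.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the guarded copy (not the rnn.cc code).
// y:   [seq_length, batch_size, hidden_size], single direction (assumed)
// y_h: [batch_size, hidden_size]
void AssignLastHidden(const std::vector<float>& y,
                      const std::vector<std::int32_t>& sequence_lens,
                      std::size_t seq_length, std::size_t batch_size,
                      std::size_t hidden_size, std::vector<float>& y_h) {
  // Start from zeros so invalid entries never expose stale memory.
  y_h.assign(batch_size * hidden_size, 0.0f);
  for (std::size_t b = 0; b < batch_size; ++b) {
    const std::int64_t last_time_step =
        static_cast<std::int64_t>(sequence_lens[b]) - 1;
    // Reject both a negative index (seq_len == 0) and index >= seq_length.
    if (last_time_step < 0 ||
        last_time_step >= static_cast<std::int64_t>(seq_length)) {
      continue;  // leave this batch entry zero-filled
    }
    const std::size_t src =
        (static_cast<std::size_t>(last_time_step) * batch_size + b) *
        hidden_size;
    std::copy_n(y.begin() + static_cast<std::ptrdiff_t>(src), hidden_size,
                y_h.begin() + static_cast<std::ptrdiff_t>(b * hidden_size));
  }
}
```

Zero-filling first and skipping invalid entries keeps the valid and invalid paths in one loop, which is one way to guarantee that every element of Y_h is written.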

@vraspar vraspar requested a review from Copilot April 13, 2026 20:38
Contributor

Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes a heap out-of-bounds read in the CPU RNN operator when sequence_lens contains 0, preventing unintended heap data exposure via Y_h.

Changes:

  • Add bounds check in Assign_Y_h() to avoid negative/invalid time-step offsets and zero-fill affected Y_h entries.
  • Add early return in Compute() when max_sequence_length == 0, zero-filling outputs Y and Y_h (aligns with LSTM/GRU behavior).


Comment thread onnxruntime/core/providers/cpu/rnn/rnn.cc
Comment thread onnxruntime/core/providers/cpu/rnn/rnn.cc Outdated
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.



Comment thread onnxruntime/core/providers/cpu/rnn/rnn.cc
Comment thread onnxruntime/core/providers/cpu/rnn/rnn.cc Outdated
Comment thread onnxruntime/core/providers/cpu/rnn/rnn.cc Outdated
Member

@yuslepukhin yuslepukhin left a comment


In rnn_helpers.cc:84-88, the validation predicate is:

[seq_length](int len) { return len < 0 || len > seq_length; }

It allows len == 0, but the error message is misleading: it says > 0 while the code permits 0, and it says < seq_length while the code permits len == seq_length.

The PR relies on 0 being allowed through validation, which is correct for the fix. But the error message should be updated to match the actual constraint: [>= 0 and <= seq_length]. This is a pre-existing bug but relevant since the PR depends on this behavior.
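As a rough illustration of a message that matches the predicate, consider the sketch below. `ValidateSequenceLens` and the message wording are hypothetical; the real helper in rnn_helpers.cc reports errors through ONNX Runtime's own status mechanism, which is not reproduced here.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical validator (not the rnn_helpers.cc code). Returns an empty
// string on success, otherwise a message stating the real constraint.
std::string ValidateSequenceLens(const std::vector<std::int32_t>& sequence_lens,
                                 std::int32_t seq_length) {
  // Same predicate as quoted in the review: 0 and seq_length are both allowed.
  const bool invalid = std::any_of(
      sequence_lens.begin(), sequence_lens.end(),
      [seq_length](std::int32_t len) { return len < 0 || len > seq_length; });
  if (invalid) {
    return "Invalid value in sequence_lens: all values must be >= 0 and "
           "<= seq_length (" + std::to_string(seq_length) + ")";
  }
  return {};
}
```

Under this sketch, `{0, 3}` with `seq_length = 3` passes, while `{0, 5}` fails because of the 5, not the 0.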

@yuslepukhin
Member

No regression test. Nothing verifies the fix. Please add tests that:

  • Set sequence_lens = {0} (all-zero batch) and verify Y and Y_h are all zeros.
  • Set sequence_lens = {0, 3} (mixed batch) and verify the 0-length batch entry in Y_h is zero while the other is correct.
  • Cover both forward and reverse directions.

@yuslepukhin
Member

hidden_size_ is read from the attribute, but no positivity check is performed. If hidden_size_ <= 0, the Compute() function proceeds to allocate and compute with a non-positive hidden size. ValidateCommonRnnInputs checks W_shape[1] != hidden_size * WRB_dim_1_multipler, which would catch mismatches, but if a crafted model has hidden_size=0 AND matching zero-dimension tensors, the behavior is undefined (e.g., SafeInt<size_t>(sizeof(float)) * seq_length * batch_size * hidden_size_ evaluates to 0, so zero bytes are allocated and then written to).
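One possible shape for such a check is sketched here. `RequirePositiveHiddenSize` is a hypothetical helper that throws a standard exception; ONNX Runtime would more likely use its own enforcement macros at kernel construction, and this PR does not add the check.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical guard (not part of this PR): reject a non-positive
// hidden_size before any buffer sizes are derived from it.
std::int64_t RequirePositiveHiddenSize(std::int64_t hidden_size) {
  if (hidden_size <= 0) {
    throw std::invalid_argument("hidden_size must be positive, got " +
                                std::to_string(hidden_size));
  }
  return hidden_size;
}
```

Validating once, where the attribute is read, would mean the later size computations never see a zero or negative factor.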

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@vraspar vraspar force-pushed the vraspar/fix-rnn-sequence-lens-zero-oob branch from 41eb12c to 7989648 Compare April 24, 2026 22:19
@vraspar
Contributor Author

vraspar commented Apr 24, 2026

Addressed all review feedback in the latest force-push (rebased onto main). Here's a summary:


Copilot review comments

| # | Comment | Status |
|---|---------|--------|
| 1 | max_element empty range UB when batch_size==0 | ✅ Added Shape().Size() > 0 guard |
| 2 | Data<int>() vs Data<int32_t>() inconsistency | ✅ Fixed across all functions (Assign_Y_h, ApplyActivationToBatches, ClearMissingFrames, rnn_helpers.cc) |
| 3 | Add regression test for sequence_lens with 0 | ✅ Added 2 reverse-direction tests; forward tests already exist from #28003 |
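For context on item 1: `std::max_element` returns the end iterator for an empty range, and dereferencing that is undefined behavior. A minimal sketch of the guard follows, using a hypothetical `MaxSequenceLength` helper over a `std::vector` instead of the tensor's `Shape().Size()` check.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the rnn.cc code): largest entry of sequence_lens,
// or 0 when there are no entries (e.g. batch_size == 0).
std::int32_t MaxSequenceLength(const std::vector<std::int32_t>& sequence_lens) {
  if (sequence_lens.empty()) {
    return 0;  // avoid dereferencing end() on an empty range
  }
  return *std::max_element(sequence_lens.begin(), sequence_lens.end());
}
```

Returning 0 for the empty case lets the caller fall through to the same early-return path used when every sequence length is 0.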

@yuslepukhin review comments

| # | Comment | Status |
|---|---------|--------|
| 1 | Reverse direction with sequence_lens[batch]==0 returns initial_h instead of zero (inconsistent with forward) | ✅ Fixed: unified seq_len == 0 check in Assign_Y_h now handles both forward and reverse directions consistently with zero-fill |
| 2 | Data<int>() should be Data<int32_t>() | ✅ Fixed: also changed DataAsSpan<int> to DataAsSpan<int32_t> and the lambda param in rnn_helpers.cc |
| 3 | No regression test (requested: all-zero batch, mixed batch, forward + reverse) | ✅ Added RNN_reverse_sequence_lens_all_zero (with non-zero initial_h) and RNN_reverse_sequence_lens_mixed_zero (with non-zero initial_h). Forward cases (RNN_seq_length_zero, RNN_forward_sequence_lens_with_zero) already on main from #28003 |
| 4 | hidden_size_ <= 0 validation gap | Deferred: pre-existing issue across RNN/LSTM/GRU, will address in a separate hardening PR |
| 5 | Misleading error message in rnn_helpers.cc validation | ✅ Fixed: updated to >= 0 and <= seq_length to match the actual constraint. Also updated the existing RNN_invalid_sequence_lens test to match |
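The unified check from item 1 can be pictured as a single decision made before any offset is computed. The sketch below is an assumption-laden illustration, not the merged code: `FinalTimeStep` and `Direction` are hypothetical names, and it assumes the reverse direction's final state sits at time step 0 of Y.

```cpp
#include <cstdint>
#include <optional>

enum class Direction { kForward, kReverse };

// Hypothetical helper (not the rnn.cc code): which time step of Y holds the
// final hidden state for one batch entry, or nullopt when Y_h should be
// zero-filled instead.
std::optional<std::int64_t> FinalTimeStep(std::int32_t seq_len,
                                          std::int64_t seq_length,
                                          Direction direction) {
  // One check covers both directions: an empty or out-of-range sequence
  // has no final state to copy.
  if (seq_len <= 0 || seq_len > seq_length) {
    return std::nullopt;
  }
  // Assumption: reverse processing finishes at time step 0.
  return direction == Direction::kReverse
             ? std::int64_t{0}
             : static_cast<std::int64_t>(seq_len) - 1;
}
```

Because the empty-sequence case is decided before the direction is consulted, forward and reverse cannot diverge on it, which is the inconsistency the review pointed out.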

Merge conflict

Resolved by rebasing onto main. PR #28003 had already landed a partial fix for Assign_Y_h (forward-only). This push layers the reverse-direction fix, int32_t cleanup, and regression tests on top of that.

@vraspar vraspar requested a review from yuslepukhin April 24, 2026 22:31
@yuslepukhin
Member

Minor: The RNN_invalid_sequence_lens test uses vector {0, 5} and expects failure. It now fails on 5 > seq_length, not on 0. The comment was updated to say "5 exceeds seq_length (3)". However, this test no longer validates that 0 alone is accepted. The new regression tests (RNN_forward_sequence_lens_with_zero, etc.) implicitly cover this — they succeed with seq_lens=0, proving 0 passes validation. Adequate.

@vraspar vraspar merged commit f6eb50f into main Apr 27, 2026
88 of 89 checks passed
@vraspar vraspar deleted the vraspar/fix-rnn-sequence-lens-zero-oob branch April 27, 2026 18:57
3 participants