-
Notifications
You must be signed in to change notification settings - Fork 1.8k
multiline: ensure context is registered for REGEX type #11231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughThe multiline parser now registers stream-group context earlier for first-line maps in multiple multiline modes when the group's buffer is empty, and metadata packing is skipped for truncated content. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used🧬 Code graph analysis (1)tests/internal/multiline.c (5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
🔇 Additional comments (4)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
This pull request should address the issue #10576 For the fluent-bit configuration example and steps how to reproduce the issue, navigate to #10576 Output after the fix: |
|
@mirko-lazarevic maybe tweak the commit slightly as having Can you add some unit tests as well? I really like to see those as next time the code is refactored/updated it will prevent a similar problem. |
|
The CIFuzz failure is down to something else so can be ignored: #11227 |
a398968 to
23a2e18
Compare
I saw exact the same commit message from one of the maintainers, that's why I did the same. Anyway, I removed I'll see if I can add some unit tests, although my knowledge in this area is limited. |
@patrick-stephens Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/internal/multiline.c(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-without-cxx (3.31.6)
- GitHub Check: pr-compile-centos-7
- GitHub Check: PR - fuzzing test
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
🔇 Additional comments (6)
tests/internal/multiline.c (6)
1646-1663: LGTM! Well-structured test helper.The
metadata_resultstructure is well-designed for tracking metadata preservation across multiple records in the regression tests.
1671-1751: LGTM! Robust metadata verification callback.The callback correctly unpacks msgpack records, verifies field presence, and tracks metadata completeness with proper bounds checking on string copies.
1756-1815: LGTM! Clean helper function.The function correctly constructs a msgpack record with timestamp and metadata fields (stream, log, file), then sends it to the multiline processor. Memory management and error handling are appropriate for test code.
1827-1924: LGTM! Comprehensive regression test.The test effectively validates the fix for issue 10576 by simulating slow log arrival (flushing after each line) and verifying that all flushed records retain full metadata (stream, log, file fields) rather than losing metadata on continuation lines.
1942-2067: LGTM! Excellent truncation boundary test.This test validates a critical edge case: ensuring that after buffer truncation, subsequent records maintain their own metadata rather than inheriting from the truncated group. The test correctly sets a small buffer limit, triggers truncation, and verifies metadata isolation between records.
2089-2090: LGTM! Test entries properly registered.Both new test cases are correctly added to the TEST_LIST and appropriately placed in the "Issues reported on Github" section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/internal/multiline.c (1)
1671-1695: Align unpack error handling with existing flush callback for clearer test failuresIn
flush_callback_metadata_check, unpack failures currently return-1without aTEST_CHECK, whereas the existingflush_callbackasserts on the unpack result. For consistency and clearer diagnostics when tests fail, it would be better to assert as well:msgpack_unpacked_init(&result); - ret = msgpack_unpack_next(&result, buf_data, buf_size, &off); - if (ret != MSGPACK_UNPACK_SUCCESS) { - msgpack_unpacked_destroy(&result); - return -1; - } + ret = msgpack_unpack_next(&result, buf_data, buf_size, &off); + TEST_CHECK(ret == MSGPACK_UNPACK_SUCCESS); + if (ret != MSGPACK_UNPACK_SUCCESS) { + msgpack_unpacked_destroy(&result); + return -1; + }This keeps the callback behavior aligned with the rest of the file and ensures failures are visible in the test output even if the return code is ignored upstream.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/internal/multiline.c(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/internal/multiline.c (2)
src/multiline/flb_ml.c (1)
flb_ml_append_object(764-863)src/multiline/flb_ml_stream.c (1)
flb_ml_stream_create(223-276)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: pr-compile-centos-7
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-without-cxx (3.31.6)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: PR - fuzzing test
🔇 Additional comments (4)
tests/internal/multiline.c (4)
1753-1815: Helper for packing metadata + log into a map looks correct and robust
append_log_with_metadatafollows the existing msgpack packing/unpacking pattern, correctly builds[timestamp, {stream, log, file}], and propagatesflb_ml_append_object’s return code back to callers so tests can assert onFLB_MULTILINE_OKvsFLB_MULTILINE_TRUNCATED. Cleanup paths on failure and success are balanced (destroying bothmsgpack_unpackedandmsgpack_sbuffer), so there are no obvious leaks here.
1827-1924: Issue 10576 non-truncation scenario test is well structured and matches the problem description
test_issue_10576accurately simulates the “slow arrival” case from #10576: it defines a regex-based multiline parser, flushes after each line, and asserts that every emitted record has both stream and file metadata plus a non-zero timestamp. The rule definitions andkey_contentwiring mirror existing tests, and the final checks onrecords_missing_metadata,records_with_full_metadata, andcurrent_record == num_linesgive good coverage without overcomplicating the setup.
1942-2067: Truncation regression test correctly validates independent metadata after buffer-limit overflow
test_issue_truncation_10576configuresmultiline_buffer_limit, drives a start/cont/start sequence that forcesFLB_MULTILINE_TRUNCATEDon the second append, and then asserts:
- the second append returns
FLB_MULTILINE_TRUNCATED,- exactly two records are flushed, and
- each record’s
stream/filepair matches its own group (stdout/app1.log vs stderr/app2.log).This neatly exercises the new behavior where context must not be (re)registered from the truncated group and avoids metadata leakage across groups while still requiring full metadata on both records. The cleanup (destroying
mland exiting the config) is consistent with the rest of the file.
2089-2090: New tests are correctly wired intoTEST_LISTThe entries for
"issue_10576"and"issue_truncation_10576"match the new function names and keep them grouped under the “Issues reported on Github” section, ensuring the regressions are exercised by the standard test runner.
|
Hey, the patch is good but there is only one commit does not fit our policy of commit messages:
|
This fix ensures that when the buffer is flushed, the record will have proper timestamp and metadata instead of just the "log" field. Signed-off-by: Mirko Lazarevic <[email protected]>
Signed-off-by: Mirko Lazarevic <[email protected]>
Signed-off-by: Mirko Lazarevic <[email protected]>
Addresses PR comments and adds correspoinding unit tests Signed-off-by: Mirko Lazarevic <[email protected]>
Signed-off-by: Mirko Lazarevic <[email protected]>
Signed-off-by: Mirko Lazarevic <[email protected]>
e5cdce8 to
916c20a
Compare
This fix ensures that when the buffer is
flushed, the record will have proper timestamp
and metadata instead of just the "log" field.
Enter
[N/A]in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-testlabel to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.