Conversation

@mirko-lazarevic
Contributor

@mirko-lazarevic mirko-lazarevic commented Dec 1, 2025

This fix ensures that when the buffer is
flushed, the record will have proper timestamp
and metadata instead of just the "log" field.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes

    • Multiline processing now registers first-line context earlier for regex, ends-with, and equality modes so metadata is available before concatenation. Metadata packing is skipped for records whose content was truncated, preventing loss or corruption of metadata.
  • Tests

    • Added regression tests that verify full metadata (time, stream, file, log) is preserved across multiline flushes, including truncation and per-record metadata scenarios.

@coderabbitai

coderabbitai bot commented Dec 1, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

The multiline parser now registers stream-group context earlier for first-line maps in multiple multiline modes when the group's buffer is empty, and metadata packing is skipped for truncated content.

Changes

Cohort / File(s) | Change Summary
Multiline context initialization: src/multiline/flb_ml.c | Added conditional calls to flb_ml_register_context(stream_group, tm, full_map) when stream_group->mp_sbuf.size == 0 in the FLB_ML_REGEX, FLB_ML_ENDSWITH, and FLB_ML_EQ branches (REGEX registration placed after truncation handling). Changed metadata packing guard from processed && metadata != NULL to !truncated && processed && metadata != NULL.
Multiline metadata regression tests: tests/internal/multiline.c | Added tests for issue 10576: metadata_result, flush_callback_metadata_check, append_log_with_metadata, test_issue_10576, test_issue_truncation_10576, and registered them in TEST_LIST to validate metadata preservation and truncation behavior.
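
The two changes summarized above can be pictured with a small toy model. This is only a sketch and not the code in src/multiline/flb_ml.c: `struct toy_group`, `toy_register_if_empty`, `toy_append`, and `toy_should_pack_metadata` are hypothetical names. Only the two conditions (register when the buffer size is 0, and pack metadata when `!truncated && processed && metadata != NULL`) are taken from the summary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a multiline stream group (not the real struct). */
struct toy_group {
    size_t buf_len;      /* plays the role of the "buffer size == 0" check */
    char   buf[256];     /* concatenated content waiting to be flushed */
    int    has_context;  /* 1 once first-line context was registered */
    long   ts;           /* timestamp captured from the first line */
    char   stream[16];   /* metadata captured from the first line */
};

/* Register first-line context only while nothing is buffered yet. */
void toy_register_if_empty(struct toy_group *g, long ts, const char *stream)
{
    assert(g != NULL && stream != NULL);
    if (g->buf_len == 0) {
        g->ts = ts;
        strncpy(g->stream, stream, sizeof(g->stream) - 1);
        g->stream[sizeof(g->stream) - 1] = '\0';
        g->has_context = 1;
    }
}

/* Append content to the group buffer, clamping to the space available. */
void toy_append(struct toy_group *g, const char *line)
{
    size_t n = strlen(line);
    size_t room = sizeof(g->buf) - 1 - g->buf_len;

    if (n > room) {
        n = room;
    }
    memcpy(g->buf + g->buf_len, line, n);
    g->buf_len += n;
    g->buf[g->buf_len] = '\0';
}

/* Guard described in the summary: skip metadata for truncated content. */
int toy_should_pack_metadata(int truncated, int processed, const void *metadata)
{
    return !truncated && processed && metadata != NULL;
}
```

In this model a second call to `toy_register_if_empty` after `toy_append` leaves the first line's timestamp and stream untouched, so a flush sees the context of the line that started the group.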

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Verify placement and conditional guards for flb_ml_register_context() across ML modes.
  • Confirm REGEX truncation handling ordering relative to registration.
  • Ensure metadata packing change does not drop metadata in other paths.
  • Review new tests for determinism and coverage.

Poem

🐇 I hop through lines and tuck a map inside,
I stitch the start when buffers lie wide,
If edges bite or bytes are shorn,
I keep each tag from being torn,
A rabbit guards your metadata tide.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Title check | ⚠️ Warning | The PR title 'multiline: ensure context is registered for REGEX type' is related to the changeset but is overly narrow and not fully representative of the main changes. | Revise the title to reflect the broader scope: the changes address metadata preservation across multiple multiline modes (ENDSWITH, etc.) and introduce early context registration, not just REGEX type.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage | ✅ Passed | Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.
📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e5cdce8 and 916c20a.

📒 Files selected for processing (2)
  • src/multiline/flb_ml.c (2 hunks)
  • tests/internal/multiline.c (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/multiline/flb_ml.c
🧰 Additional context used
🧬 Code graph analysis (1)
tests/internal/multiline.c (5)
src/multiline/flb_ml.c (3)
  • flb_ml_append_object (764-863)
  • flb_ml_create (876-928)
  • flb_ml_destroy (989-1014)
src/flb_config.c (2)
  • flb_config_init (233-486)
  • flb_config_exit (488-672)
src/multiline/flb_ml_parser.c (4)
  • flb_ml_parser_create (200-224)
  • flb_ml_parser_init (131-141)
  • flb_ml_parser_instance_create (261-312)
  • flb_ml_parser_instance_set (315-340)
src/multiline/flb_ml_rule.c (1)
  • flb_ml_rule_create (48-115)
src/multiline/flb_ml_stream.c (1)
  • flb_ml_stream_create (223-276)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-without-cxx (3.31.6)
  • GitHub Check: PR - fuzzing test
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
🔇 Additional comments (4)
tests/internal/multiline.c (4)

1655-1815: Metadata tracking helper and flush callback are sound

The metadata_result struct, flush_callback_metadata_check, and append_log_with_metadata are consistent with existing patterns in this file: msgpack packing/unpacking is balanced, timestamp and metadata are validated, and the stream/file values are safely bounded and nul‑terminated. No correctness or memory‑management issues spotted here.
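
The "bounded and nul-terminated" point can be illustrated with a small helper. This is a sketch under assumptions: `copy_bounded` is a hypothetical name, not a function from the test file, and it assumes the source string arrives as a pointer plus length without a trailing NUL, as msgpack string objects do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dst_cap - 1 bytes from a length-delimited source string
 * and always terminate the destination. Returns the number of bytes copied.
 */
size_t copy_bounded(char *dst, size_t dst_cap, const char *src, size_t src_len)
{
    size_t n;

    assert(dst != NULL && src != NULL && dst_cap > 0);

    n = src_len < dst_cap - 1 ? src_len : dst_cap - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';   /* terminate even when the source was clamped */
    return n;
}
```

A value that does not fit is clamped rather than overflowing the destination, which is the property the comment above relies on.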


1827-1924: test_issue_10576 cleanly exercises the slow-arrival metadata case

This test accurately models “flush after each line” with a REGEX multiline parser and verifies, via flush_callback_metadata_check, that every flushed record carries both stream and file plus a non‑zero timestamp. The expectations and assertions (including counts of records with/without metadata) align well with the bug being fixed.


1942-2067: test_issue_truncation_10576 gives good coverage of truncation semantics

The truncation scenario is well constructed: small multiline_buffer_limit, explicit check for FLB_MULTILINE_TRUNCATED on the long continuation, then a new start_state with different metadata. The test’s assertions on record counts and per‑record stream/file values reliably confirm that context is not incorrectly reused across truncated groups.


2089-2091: New tests are correctly registered in TEST_LIST

Both issue_10576 and issue_truncation_10576 are added to TEST_LIST in line with existing naming and ordering, ensuring they will run under the standard test harness.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@mirko-lazarevic
Contributor Author

This pull request should address issue #10576.

For the Fluent Bit configuration example and the steps to reproduce the issue, see #10576.

Output after the fix:

Fluent Bit v4.2.1
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF graduated project under the Fluent organization
* https://fluentbit.io

______ _                  _    ______ _ _             ___   _____
|  ___| |                | |   | ___ (_) |           /   | / __  \
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `' / /'
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |   / /
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |_./ /___
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)_____/

             Fluent Bit v4.2 – Direct Routes Ahead
         Celebrating 10 Years of Open, Fluent Innovation!

[2025/12/01 12:30:43.528267000] [ info] [fluent bit] version=4.2.1, commit=10ebd3a354, pid=6123
[2025/12/01 12:30:43.528771000] [ info] [storage] ver=1.5.4, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/12/01 12:30:43.528993000] [ info] [simd    ] disabled
[2025/12/01 12:30:43.528998000] [ info] [cmetrics] version=1.0.5
[2025/12/01 12:30:43.529349000] [ info] [ctraces ] version=0.6.6
[2025/12/01 12:30:43.529578000] [ info] [input:tail:tail.0] initializing
[2025/12/01 12:30:43.529585000] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2025/12/01 12:30:43.530012000] [ info] [input:tail:tail.0] multiline core started
[2025/12/01 12:30:43.530308000] [ info] [input:tail:tail.0] thread instance initialized
[2025/12/01 12:30:43.530546000] [ info] [filter:multiline:ml-detect] created emitter: emitter_for_ml-detect
[2025/12/01 12:30:43.530591000] [ info] [input:emitter:emitter_for_ml-detect] initializing
[2025/12/01 12:30:43.530596000] [ info] [input:emitter:emitter_for_ml-detect] storage_strategy='memory' (memory only)
[2025/12/01 12:30:43.530916000] [ info] [output:stdout:stdout.0] worker #0 started
[2025/12/01 12:30:43.531683000] [ info] [http_server] listen iface=0.0.0.0 tcp_port=8081
[2025/12/01 12:30:43.531917000] [ info] [sp] stream processor started
[2025/12/01 12:30:43.532206000] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2


[2025/12/01 12:30:49.787352000] [ info] [filter:multiline:ml-detect] created new multiline stream for tail.0_kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588649.296478000, {}], {"time"=>"2025-12-01T12:30:49.296478+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:30:49 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588654.298018000, {}], {"time"=>"2025-12-01T12:30:54.298018+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:30:54 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588659.299245000, {}], {"time"=>"2025-12-01T12:30:59.299245+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"2025-12-01T11:30:59+00:00 should be ok", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588667.512873000, {}], {"time"=>"2025-12-01T12:31:07.512873+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:31:07 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588672.513383999, {}], {"time"=>"2025-12-01T12:31:12.513384+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:31:12 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
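
One quick sanity check on this output is that the record timestamp (the first element of each record) agrees with the `time` key in the body. The sketch below converts the whole-second part of the first record's timestamp to UTC using the standard `gmtime` call. `struct utc_fields` and `epoch_to_utc` are hypothetical helper names introduced for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Broken-down UTC time, using calendar values (month 1-12). */
struct utc_fields {
    int year, month, day, hour, min, sec;
};

/* Convert epoch seconds to UTC fields. Returns 0 on success, -1 on failure. */
int epoch_to_utc(long long epoch_sec, struct utc_fields *out)
{
    time_t t = (time_t) epoch_sec;
    struct tm *tm_utc;

    assert(out != NULL);

    tm_utc = gmtime(&t);
    if (tm_utc == NULL) {
        return -1;
    }
    out->year  = tm_utc->tm_year + 1900;  /* tm_year counts from 1900 */
    out->month = tm_utc->tm_mon + 1;      /* tm_mon is 0-based */
    out->day   = tm_utc->tm_mday;
    out->hour  = tm_utc->tm_hour;
    out->min   = tm_utc->tm_min;
    out->sec   = tm_utc->tm_sec;
    return 0;
}
```

For 1764588649 this gives 2025-12-01 11:30:49 UTC, which matches both the `time` value 12:30:49+01:00 and the "11:30:49 UTC" text in the first record's `log` field.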

@patrick-stephens
Collaborator

@mirko-lazarevic maybe tweak the commit slightly as having ml: in there is redundant and confusing.

Can you add some unit tests as well? I really like to see those as next time the code is refactored/updated it will prevent a similar problem.

@patrick-stephens
Collaborator

The CIFuzz failure is down to something else so can be ignored: #11227

@mirko-lazarevic
Contributor Author

@patrick-stephens

@mirko-lazarevic maybe tweak the commit slightly as having ml: in there is redundant and confusing.

I saw exactly the same commit message from one of the maintainers, which is why I did the same. Anyway, I removed ml:.

I'll see if I can add some unit tests, although my knowledge in this area is limited.

@mirko-lazarevic
Contributor Author

mirko-lazarevic commented Dec 2, 2025

@mirko-lazarevic maybe tweak the commit slightly as having ml: in there is redundant and confusing.

Can you add some unit tests as well? I really like to see those as next time the code is refactored/updated it will prevent a similar problem.

@patrick-stephens Done

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ec6417e and db62242.

📒 Files selected for processing (1)
  • tests/internal/multiline.c (2 hunks)
🔇 Additional comments (6)
tests/internal/multiline.c (6)

1646-1663: LGTM! Well-structured test helper.

The metadata_result structure is well-designed for tracking metadata preservation across multiple records in the regression tests.


1671-1751: LGTM! Robust metadata verification callback.

The callback correctly unpacks msgpack records, verifies field presence, and tracks metadata completeness with proper bounds checking on string copies.


1756-1815: LGTM! Clean helper function.

The function correctly constructs a msgpack record with timestamp and metadata fields (stream, log, file), then sends it to the multiline processor. Memory management and error handling are appropriate for test code.


1827-1924: LGTM! Comprehensive regression test.

The test effectively validates the fix for issue 10576 by simulating slow log arrival (flushing after each line) and verifying that all flushed records retain full metadata (stream, log, file fields) rather than losing metadata on continuation lines.


1942-2067: LGTM! Excellent truncation boundary test.

This test validates a critical edge case: ensuring that after buffer truncation, subsequent records maintain their own metadata rather than inheriting from the truncated group. The test correctly sets a small buffer limit, triggers truncation, and verifies metadata isolation between records.


2089-2090: LGTM! Test entries properly registered.

Both new test cases are correctly added to the TEST_LIST and appropriately placed in the "Issues reported on Github" section.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/internal/multiline.c (1)

1671-1695: Align unpack error handling with existing flush callback for clearer test failures

In flush_callback_metadata_check, unpack failures currently return -1 without a TEST_CHECK, whereas the existing flush_callback asserts on the unpack result. For consistency and clearer diagnostics when tests fail, it would be better to assert as well:

     msgpack_unpacked_init(&result);
-    ret = msgpack_unpack_next(&result, buf_data, buf_size, &off);
-    if (ret != MSGPACK_UNPACK_SUCCESS) {
-        msgpack_unpacked_destroy(&result);
-        return -1;
-    }
+    ret = msgpack_unpack_next(&result, buf_data, buf_size, &off);
+    TEST_CHECK(ret == MSGPACK_UNPACK_SUCCESS);
+    if (ret != MSGPACK_UNPACK_SUCCESS) {
+        msgpack_unpacked_destroy(&result);
+        return -1;
+    }

This keeps the callback behavior aligned with the rest of the file and ensures failures are visible in the test output even if the return code is ignored upstream.
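
The point about ignored return codes can be shown in isolation. The sketch below does not use the real test framework: `TOY_CHECK`, `callback_silent`, `callback_checked`, `drive`, and `toy_failed_checks` are hypothetical names, and the macro is only a stand-in for a TEST_CHECK-style assertion that records a failure without aborting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static int failed_checks = 0;  /* hypothetical per-run failure counter */

/* Stand-in for a TEST_CHECK-style macro: record and report, do not abort. */
#define TOY_CHECK(cond)                                           \
    do {                                                          \
        if (!(cond)) {                                            \
            failed_checks++;                                      \
            fprintf(stderr, "check failed: %s (%s:%d)\n",         \
                    #cond, __FILE__, __LINE__);                   \
        }                                                         \
    } while (0)

/* Variant that only returns -1: invisible if the caller drops the result. */
int callback_silent(int unpack_ok)
{
    if (!unpack_ok) {
        return -1;
    }
    return 0;
}

/* Variant that asserts first, then returns -1. */
int callback_checked(int unpack_ok)
{
    TOY_CHECK(unpack_ok);
    if (!unpack_ok) {
        return -1;
    }
    return 0;
}

/* Caller that ignores the callback's return code. */
void drive(int (*cb)(int), int unpack_ok)
{
    assert(cb != NULL);
    (void) cb(unpack_ok);
}

int toy_failed_checks(void)
{
    return failed_checks;
}
```

With `drive(callback_silent, 0)` the failure leaves no trace, while `drive(callback_checked, 0)` increments the counter and prints a message even though the return value is discarded.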

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between db62242 and e5cdce8.

📒 Files selected for processing (1)
  • tests/internal/multiline.c (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/internal/multiline.c (2)
src/multiline/flb_ml.c (1)
  • flb_ml_append_object (764-863)
src/multiline/flb_ml_stream.c (1)
  • flb_ml_stream_create (223-276)
🔇 Additional comments (4)
tests/internal/multiline.c (4)

1753-1815: Helper for packing metadata + log into a map looks correct and robust

append_log_with_metadata follows the existing msgpack packing/unpacking pattern, correctly builds [timestamp, {stream, log, file}], and propagates flb_ml_append_object’s return code back to callers so tests can assert on FLB_MULTILINE_OK vs FLB_MULTILINE_TRUNCATED. Cleanup paths on failure and success are balanced (destroying both msgpack_unpacked and msgpack_sbuffer), so there are no obvious leaks here.


1827-1924: Issue 10576 non-truncation scenario test is well structured and matches the problem description

test_issue_10576 accurately simulates the “slow arrival” case from #10576: it defines a regex-based multiline parser, flushes after each line, and asserts that every emitted record has both stream and file metadata plus a non-zero timestamp. The rule definitions and key_content wiring mirror existing tests, and the final checks on records_missing_metadata, records_with_full_metadata, and current_record == num_lines give good coverage without overcomplicating the setup.
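
The final checks described above boil down to a per-record tally. The sketch below is a simplified model, not the helper from tests/internal/multiline.c: `struct flushed_record`, `struct tally`, and `tally_record` are hypothetical, and a missing key is represented by a NULL pointer instead of being looked up in a msgpack map.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flattened view of one flushed record. */
struct flushed_record {
    double      timestamp;  /* 0 when the record carried no timestamp */
    const char *stream;     /* NULL when the key is absent */
    const char *file;       /* NULL when the key is absent */
    const char *log;        /* NULL when the key is absent */
};

/* Counters in the spirit of the ones named in the review comment. */
struct tally {
    int current_record;
    int records_with_full_metadata;
    int records_missing_metadata;
};

/* Classify one flushed record and advance the counters. */
void tally_record(struct tally *t, const struct flushed_record *r)
{
    int full;

    assert(t != NULL && r != NULL);

    full = r->timestamp > 0.0 &&
           r->stream != NULL && r->file != NULL && r->log != NULL;
    if (full) {
        t->records_with_full_metadata++;
    }
    else {
        t->records_missing_metadata++;
    }
    t->current_record++;
}
```

A record that carries only `log`, which is the symptom reported in #10576, lands in `records_missing_metadata`, so a test can require that counter to stay at zero.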


1942-2067: Truncation regression test correctly validates independent metadata after buffer-limit overflow

test_issue_truncation_10576 configures multiline_buffer_limit, drives a start/cont/start sequence that forces FLB_MULTILINE_TRUNCATED on the second append, and then asserts:

  • the second append returns FLB_MULTILINE_TRUNCATED,
  • exactly two records are flushed, and
  • each record’s stream/file pair matches its own group (stdout/app1.log vs stderr/app2.log).

This neatly exercises the new behavior where context must not be (re)registered from the truncated group and avoids metadata leakage across groups while still requiring full metadata on both records. The cleanup (destroying ml and exiting the config) is consistent with the rest of the file.
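
The start/cont/start sequence can be modeled without Fluent Bit at all. The following is a toy model under stated assumptions: every name (`struct toy_ml`, `toy_start`, `toy_cont`, `toy_flush`, `TOY_OK`, `TOY_TRUNCATED`) is hypothetical, the limit is a plain byte count, and metadata is captured only when a group starts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum { TOY_OK = 0, TOY_TRUNCATED = 1, TOY_MAX_OUT = 4 };

struct toy_record {
    char stream[16];
    char file[32];
    char log[64];
};

struct toy_ml {
    size_t limit;                        /* hypothetical buffer limit in bytes */
    size_t cur_len;                      /* bytes buffered for the open group */
    int active;                          /* 1 while a group is open */
    int out_count;                       /* number of flushed records */
    struct toy_record cur;               /* group being assembled */
    struct toy_record out[TOY_MAX_OUT];  /* flushed records */
};

/* Append text to the open group, clamping at the limit. */
static int toy_append(struct toy_ml *ml, const char *text)
{
    size_t n = strlen(text);
    size_t room = ml->limit > ml->cur_len ? ml->limit - ml->cur_len : 0;
    int truncated = n > room;

    assert(ml->limit < sizeof(ml->cur.log));
    if (truncated) {
        n = room;
    }
    memcpy(ml->cur.log + ml->cur_len, text, n);
    ml->cur_len += n;
    ml->cur.log[ml->cur_len] = '\0';
    return truncated ? TOY_TRUNCATED : TOY_OK;
}

/* Emit the open group (if any) with the metadata it started with. */
void toy_flush(struct toy_ml *ml)
{
    if (ml->active && ml->out_count < TOY_MAX_OUT) {
        ml->out[ml->out_count++] = ml->cur;
    }
    ml->active = 0;
    ml->cur_len = 0;
    ml->cur.log[0] = '\0';
}

/* A start line flushes the previous group and captures its own metadata. */
int toy_start(struct toy_ml *ml, const char *stream, const char *file,
              const char *text)
{
    toy_flush(ml);
    snprintf(ml->cur.stream, sizeof(ml->cur.stream), "%s", stream);
    snprintf(ml->cur.file, sizeof(ml->cur.file), "%s", file);
    ml->active = 1;
    return toy_append(ml, text);
}

/* A continuation line never touches the group's metadata. */
int toy_cont(struct toy_ml *ml, const char *text)
{
    return toy_append(ml, text);
}
```

With a limit of 16 bytes, a start line from stdout/app1.log, an over-long continuation, and a second start line from stderr/app2.log, the model reports truncation on the second append and flushes two records, each with its own stream/file pair.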


2089-2090: New tests are correctly wired into TEST_LIST

The entries for "issue_10576" and "issue_truncation_10576" match the new function names and keep them grouped under the “Issues reported on Github” section, ensuring the regressions are exercised by the standard test runner.

@cosmo0920
Contributor

Hey, the patch is good, but one commit does not fit our commit message policy:

❌ Commit 23a2e18 failed:
Subject prefix 'multiline:' does not match files changed.
Expected one of: ml:
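
The shape of this check can be sketched as a plain prefix comparison. This is an illustration only: `subject_has_prefix` is a hypothetical helper, and how the real linter derives the expected prefix from the changed files is not shown in this thread, so the expected value "ml" is simply taken from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 when subject starts with expected followed directly by ':'. */
int subject_has_prefix(const char *subject, const char *expected)
{
    size_t n;

    assert(subject != NULL && expected != NULL);

    n = strlen(expected);
    /* subject[n] is only read when the first n bytes already matched. */
    return strncmp(subject, expected, n) == 0 && subject[n] == ':';
}
```

Under this sketch a subject beginning with "multiline:" is rejected for an expected prefix of "ml", while the same subject beginning with "ml:" passes.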

This fix ensures that when the buffer is
flushed, the record will have proper timestamp
and metadata instead of just the "log" field.

Signed-off-by: Mirko Lazarevic <[email protected]>
Signed-off-by: Mirko Lazarevic <[email protected]>
Signed-off-by: Mirko Lazarevic <[email protected]>
Addresses PR comments and adds corresponding unit tests

Signed-off-by: Mirko Lazarevic <[email protected]>
Signed-off-by: Mirko Lazarevic <[email protected]>
Signed-off-by: Mirko Lazarevic <[email protected]>