multiline: ensure context is registered for REGEX type #11231

mirko-lazarevic · 2025-12-01T11:19:43Z

This fix ensures that when the buffer is
flushed, the record will have proper timestamp
and metadata instead of just the "log" field.

Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

Example configuration file for the change
Debug log output from testing the change

Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

Run local packaging test showing all targets (including any new ones) build.
Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

Documentation required for this feature

Backporting

Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

Bug Fixes
- Multiline processing now registers first-line context earlier for regex, ends-with, and equality modes so metadata is available before concatenation. Metadata packing is skipped for records whose content was truncated, preventing loss or corruption of metadata.
Tests
- Added regression tests that verify full metadata (time, stream, file, log) is preserved across multiline flushes, including truncation and per-record metadata scenarios.


coderabbitai · 2025-12-01T11:20:04Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

The multiline parser now registers stream-group context earlier for first-line maps in multiple multiline modes when the group's buffer is empty, and metadata packing is skipped for truncated content.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Multiline context initialization: `src/multiline/flb_ml.c` | Added conditional calls to `flb_ml_register_context(stream_group, tm, full_map)` when `stream_group->mp_sbuf.size == 0` in the `FLB_ML_REGEX`, `FLB_ML_ENDSWITH`, and `FLB_ML_EQ` branches (REGEX registration placed after truncation handling). Changed metadata packing guard from `processed && metadata != NULL` to `!truncated && processed && metadata != NULL`. |
| Multiline metadata regression tests: `tests/internal/multiline.c` | Added tests for issue 10576: `metadata_result`, `flush_callback_metadata_check`, `append_log_with_metadata`, `test_issue_10576`, `test_issue_truncation_10576`, and registered them in `TEST_LIST` to validate metadata preservation and truncation behavior. |
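The two changes summarized in the table can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `group_model`, `register_context`, `on_first_line`, and `should_pack_metadata` are hypothetical names invented for illustration and are not the Fluent Bit structures or functions. Only the `size == 0` condition and the boolean shape of the guard are taken from the summary above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a stream group; NOT the real Fluent Bit struct. */
struct group_model {
    size_t buffered;       /* plays the role of mp_sbuf.size */
    bool   has_context;    /* set once first-line context is stored */
    long   ctx_time;       /* timestamp captured from the first line */
    char   ctx_stream[16]; /* one metadata field, for illustration */
};

/* Plays the role of flb_ml_register_context(): store first-line context. */
static void register_context(struct group_model *g, long tm, const char *stream)
{
    assert(g != NULL && stream != NULL);
    g->ctx_time = tm;
    strncpy(g->ctx_stream, stream, sizeof(g->ctx_stream) - 1);
    g->ctx_stream[sizeof(g->ctx_stream) - 1] = '\0';
    g->has_context = true;
}

/* Register context only while the group's buffer is still empty. */
static void on_first_line(struct group_model *g, long tm,
                          const char *stream, size_t len)
{
    if (g->buffered == 0) {
        register_context(g, tm, stream);
    }
    g->buffered += len;
}

/* Same boolean shape as the updated guard described in the table. */
static bool should_pack_metadata(bool truncated, bool processed,
                                 const void *metadata)
{
    return !truncated && processed && metadata != NULL;
}
```

In this model, a second first-line match that arrives while data is still buffered leaves the stored context untouched, and a truncated record never has metadata packed even when it was processed.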

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Verify placement and conditional guards for flb_ml_register_context() across ML modes.
Confirm REGEX truncation handling ordering relative to registration.
Ensure metadata packing change does not drop metadata in other paths.
Review new tests for determinism and coverage.

Poem

🐇 I hop through lines and tuck a map inside,
I stitch the start when buffers lie wide,
If edges bite or bytes are shorn,
I keep each tag from being torn,
A rabbit guards your metadata tide.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The PR title 'multiline: ensure context is registered for REGEX type' is related to the changeset but is overly narrow and not fully representative of the main changes. | Revise the title to reflect the broader scope: the changes address metadata preservation across multiple multiline modes (ENDSWITH, etc.) and introduce early context registration, not just REGEX type. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%. |


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e5cdce8 and 916c20a.

📒 Files selected for processing (2)

src/multiline/flb_ml.c (2 hunks)
tests/internal/multiline.c (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/multiline/flb_ml.c

🧰 Additional context used

🧬 Code graph analysis (1)

tests/internal/multiline.c (5)

src/multiline/flb_ml.c (3)

flb_ml_append_object (764-863)

flb_ml_create (876-928)

flb_ml_destroy (989-1014)

src/flb_config.c (2)

flb_config_init (233-486)

flb_config_exit (488-672)

src/multiline/flb_ml_parser.c (4)

flb_ml_parser_create (200-224)

flb_ml_parser_init (131-141)

flb_ml_parser_instance_create (261-312)

flb_ml_parser_instance_set (315-340)

src/multiline/flb_ml_rule.c (1)

flb_ml_rule_create (48-115)

src/multiline/flb_ml_stream.c (1)

flb_ml_stream_create (223-276)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)

GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-without-cxx (3.31.6)
GitHub Check: PR - fuzzing test
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
GitHub Check: pr-compile-centos-7
GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)

🔇 Additional comments (4)

tests/internal/multiline.c (4)

1655-1815: Metadata tracking helper and flush callback are sound

The metadata_result struct, flush_callback_metadata_check, and append_log_with_metadata are consistent with existing patterns in this file: msgpack packing/unpacking is balanced, timestamp and metadata are validated, and the stream/file values are safely bounded and nul‑terminated. No correctness or memory‑management issues spotted here.

1827-1924: test_issue_10576 cleanly exercises the slow-arrival metadata case

This test accurately models “flush after each line” with a REGEX multiline parser and verifies, via flush_callback_metadata_check, that every flushed record carries both stream and file plus a non‑zero timestamp. The expectations and assertions (including counts of records with/without metadata) align well with the bug being fixed.

1942-2067: test_issue_truncation_10576 gives good coverage of truncation semantics

The truncation scenario is well constructed: small multiline_buffer_limit, explicit check for FLB_MULTILINE_TRUNCATED on the long continuation, then a new start_state with different metadata. The test’s assertions on record counts and per‑record stream/file values reliably confirm that context is not incorrectly reused across truncated groups.

2089-2091: New tests are correctly registered in TEST_LIST

Both issue_10576 and issue_truncation_10576 are added to TEST_LIST in line with existing naming and ordering, ensuring they will run under the standard test harness.


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.


src/multiline/flb_ml.c

mirko-lazarevic · 2025-12-01T11:31:48Z

This pull request should address issue #10576.

For the Fluent Bit configuration example and the steps to reproduce the issue, see #10576.

Output after the fix:

Fluent Bit v4.2.1
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF graduated project under the Fluent organization
* https://fluentbit.io

______ _                  _    ______ _ _             ___   _____
|  ___| |                | |   | ___ (_) |           /   | / __  \
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `' / /'
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |   / /
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |_./ /___
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)_____/

             Fluent Bit v4.2 – Direct Routes Ahead
         Celebrating 10 Years of Open, Fluent Innovation!

[2025/12/01 12:30:43.528267000] [ info] [fluent bit] version=4.2.1, commit=10ebd3a354, pid=6123
[2025/12/01 12:30:43.528771000] [ info] [storage] ver=1.5.4, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/12/01 12:30:43.528993000] [ info] [simd    ] disabled
[2025/12/01 12:30:43.528998000] [ info] [cmetrics] version=1.0.5
[2025/12/01 12:30:43.529349000] [ info] [ctraces ] version=0.6.6
[2025/12/01 12:30:43.529578000] [ info] [input:tail:tail.0] initializing
[2025/12/01 12:30:43.529585000] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2025/12/01 12:30:43.530012000] [ info] [input:tail:tail.0] multiline core started
[2025/12/01 12:30:43.530308000] [ info] [input:tail:tail.0] thread instance initialized
[2025/12/01 12:30:43.530546000] [ info] [filter:multiline:ml-detect] created emitter: emitter_for_ml-detect
[2025/12/01 12:30:43.530591000] [ info] [input:emitter:emitter_for_ml-detect] initializing
[2025/12/01 12:30:43.530596000] [ info] [input:emitter:emitter_for_ml-detect] storage_strategy='memory' (memory only)
[2025/12/01 12:30:43.530916000] [ info] [output:stdout:stdout.0] worker #0 started
[2025/12/01 12:30:43.531683000] [ info] [http_server] listen iface=0.0.0.0 tcp_port=8081
[2025/12/01 12:30:43.531917000] [ info] [sp] stream processor started
[2025/12/01 12:30:43.532206000] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2


[2025/12/01 12:30:49.787352000] [ info] [filter:multiline:ml-detect] created new multiline stream for tail.0_kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588649.296478000, {}], {"time"=>"2025-12-01T12:30:49.296478+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:30:49 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588654.298018000, {}], {"time"=>"2025-12-01T12:30:54.298018+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:30:54 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588659.299245000, {}], {"time"=>"2025-12-01T12:30:59.299245+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"2025-12-01T11:30:59+00:00 should be ok", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588667.512873000, {}], {"time"=>"2025-12-01T12:31:07.512873+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:31:07 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588672.513383999, {}], {"time"=>"2025-12-01T12:31:12.513384+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:31:12 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]

patrick-stephens · 2025-12-01T11:41:54Z

@mirko-lazarevic maybe tweak the commit slightly as having ml: in there is redundant and confusing.

Can you add some unit tests as well? I really like to see those as next time the code is refactored/updated it will prevent a similar problem.

patrick-stephens · 2025-12-01T11:42:29Z

The CIFuzz failure is down to something else so can be ignored: #11227

mirko-lazarevic · 2025-12-01T12:00:03Z

@patrick-stephens

> @mirko-lazarevic maybe tweak the commit slightly as having ml: in there is redundant and confusing.

I saw exactly the same commit message from one of the maintainers, which is why I did the same. Anyway, I removed ml:.

I'll see if I can add some unit tests, although my knowledge in this area is limited.

mirko-lazarevic · 2025-12-02T22:26:16Z

> @mirko-lazarevic maybe tweak the commit slightly as having ml: in there is redundant and confusing.
>
> Can you add some unit tests as well? I really like to see those as next time the code is refactored/updated it will prevent a similar problem.

@patrick-stephens Done

coderabbitai

Actionable comments posted: 1

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between ec6417e and db62242.

📒 Files selected for processing (1)

tests/internal/multiline.c (2 hunks)


🔇 Additional comments (6)

tests/internal/multiline.c (6)

1646-1663: LGTM! Well-structured test helper.

The metadata_result structure is well-designed for tracking metadata preservation across multiple records in the regression tests.

1671-1751: LGTM! Robust metadata verification callback.

The callback correctly unpacks msgpack records, verifies field presence, and tracks metadata completeness with proper bounds checking on string copies.

1756-1815: LGTM! Clean helper function.

The function correctly constructs a msgpack record with timestamp and metadata fields (stream, log, file), then sends it to the multiline processor. Memory management and error handling are appropriate for test code.

1827-1924: LGTM! Comprehensive regression test.

The test effectively validates the fix for issue 10576 by simulating slow log arrival (flushing after each line) and verifying that all flushed records retain full metadata (stream, log, file fields) rather than losing metadata on continuation lines.

1942-2067: LGTM! Excellent truncation boundary test.

This test validates a critical edge case: ensuring that after buffer truncation, subsequent records maintain their own metadata rather than inheriting from the truncated group. The test correctly sets a small buffer limit, triggers truncation, and verifies metadata isolation between records.

2089-2090: LGTM! Test entries properly registered.

Both new test cases are correctly added to the TEST_LIST and appropriately placed in the "Issues reported on Github" section.

tests/internal/multiline.c

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/internal/multiline.c (1)
1671-1695: Align unpack error handling with existing flush callback for clearer test failures

In flush_callback_metadata_check, unpack failures currently return -1 without a TEST_CHECK, whereas the existing flush_callback asserts on the unpack result. For consistency and clearer diagnostics when tests fail, it would be better to assert as well:
```diff
     msgpack_unpacked_init(&result);
-    ret = msgpack_unpack_next(&result, buf_data, buf_size, &off);
-    if (ret != MSGPACK_UNPACK_SUCCESS) {
-        msgpack_unpacked_destroy(&result);
-        return -1;
-    }
+    ret = msgpack_unpack_next(&result, buf_data, buf_size, &off);
+    TEST_CHECK(ret == MSGPACK_UNPACK_SUCCESS);
+    if (ret != MSGPACK_UNPACK_SUCCESS) {
+        msgpack_unpacked_destroy(&result);
+        return -1;
+    }
```
This keeps the callback behavior aligned with the rest of the file and ensures failures are visible in the test output even if the return code is ignored upstream.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between db62242 and e5cdce8.

📒 Files selected for processing (1)

tests/internal/multiline.c (2 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

tests/internal/multiline.c (2)

src/multiline/flb_ml.c (1)

flb_ml_append_object (764-863)

src/multiline/flb_ml_stream.c (1)

flb_ml_stream_create (223-276)


🔇 Additional comments (4)

tests/internal/multiline.c (4)

1753-1815: Helper for packing metadata + log into a map looks correct and robust

append_log_with_metadata follows the existing msgpack packing/unpacking pattern, correctly builds [timestamp, {stream, log, file}], and propagates flb_ml_append_object’s return code back to callers so tests can assert on FLB_MULTILINE_OK vs FLB_MULTILINE_TRUNCATED. Cleanup paths on failure and success are balanced (destroying both msgpack_unpacked and msgpack_sbuffer), so there are no obvious leaks here.

1827-1924: Issue 10576 non-truncation scenario test is well structured and matches the problem description

test_issue_10576 accurately simulates the “slow arrival” case from #10576: it defines a regex-based multiline parser, flushes after each line, and asserts that every emitted record has both stream and file metadata plus a non-zero timestamp. The rule definitions and key_content wiring mirror existing tests, and the final checks on records_missing_metadata, records_with_full_metadata, and current_record == num_lines give good coverage without overcomplicating the setup.
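The per-record check described above can be sketched as a standalone tally. This is an illustrative model only: `flushed_record`, `metadata_tally`, `has_full_metadata`, and `tally_records` are hypothetical, and the real test inspects msgpack records inside a flush callback rather than a plain array. The counter names are borrowed from the review text, and the file path in the usage example is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flattened view of one flushed record (not a Fluent Bit type). */
struct flushed_record {
    long        timestamp; /* 0 means the timestamp was lost */
    const char *stream;    /* NULL means the field is absent */
    const char *file;      /* NULL means the field is absent */
    const char *log;       /* message content */
};

/* Counters named after the ones mentioned in the review comment. */
struct metadata_tally {
    int records_with_full_metadata;
    int records_missing_metadata;
    int current_record;
};

/* A record is complete when it has a non-zero timestamp, stream, and file. */
static int has_full_metadata(const struct flushed_record *r)
{
    assert(r != NULL);
    return r->timestamp != 0 && r->stream != NULL && r->file != NULL;
}

/* Walk every flushed record and classify it. */
static struct metadata_tally tally_records(const struct flushed_record *records,
                                           size_t count)
{
    struct metadata_tally t = {0, 0, 0};
    size_t i;

    for (i = 0; i < count; i++) {
        if (has_full_metadata(&records[i])) {
            t.records_with_full_metadata++;
        }
        else {
            t.records_missing_metadata++;
        }
        t.current_record++;
    }
    return t;
}
```

A record that carries only `log`, which is the symptom reported in #10576, would be counted under `records_missing_metadata`. A passing run would leave that counter at zero with `current_record` equal to the number of appended lines.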

1942-2067: Truncation regression test correctly validates independent metadata after buffer-limit overflow

test_issue_truncation_10576 configures multiline_buffer_limit, drives a start/cont/start sequence that forces FLB_MULTILINE_TRUNCATED on the second append, and then asserts:

the second append returns FLB_MULTILINE_TRUNCATED,

exactly two records are flushed, and

each record’s stream/file pair matches its own group (stdout/app1.log vs stderr/app2.log).

This neatly exercises the new behavior where context must not be (re)registered from the truncated group and avoids metadata leakage across groups while still requiring full metadata on both records. The cleanup (destroying ml and exiting the config) is consistent with the rest of the file.
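The start/cont/start sequence above can be modelled with a small buffer-limited group. This is a sketch under assumptions: `limited_group`, `start_group`, `append_continuation`, `flush_group`, and the `APPEND_*` values are hypothetical stand-ins and do not reproduce Fluent Bit's buffering or its `FLB_MULTILINE_*` return codes. The point is only that context belongs to the start line, so a truncated continuation cannot leak metadata into the next group.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; the real API returns FLB_MULTILINE_* values. */
enum append_status { APPEND_OK = 0, APPEND_TRUNCATED = 1 };

/* Hypothetical group with a byte limit and the context of its start line. */
struct limited_group {
    size_t      limit;  /* plays the role of multiline_buffer_limit */
    size_t      used;   /* bytes currently buffered, never above limit */
    const char *stream; /* metadata captured from the start line */
    const char *file;
};

/* What a flush would emit in this model. */
struct flushed_meta {
    const char *stream;
    const char *file;
    size_t      bytes;
};

/* Add bytes, keeping only what fits and reporting truncation. */
static enum append_status clamp_add(struct limited_group *g, size_t len)
{
    if (len > g->limit - g->used) {
        g->used = g->limit;
        return APPEND_TRUNCATED;
    }
    g->used += len;
    return APPEND_OK;
}

/* A start line opens a fresh group and brings its own metadata. */
static enum append_status start_group(struct limited_group *g,
                                      const char *stream, const char *file,
                                      size_t len)
{
    assert(g != NULL);
    g->used = 0;
    g->stream = stream;
    g->file = file;
    return clamp_add(g, len);
}

/* Continuation lines never touch the stored context. */
static enum append_status append_continuation(struct limited_group *g,
                                              size_t len)
{
    assert(g != NULL);
    return clamp_add(g, len);
}

/* Emit the group's own context and reset it for the next start line. */
static struct flushed_meta flush_group(struct limited_group *g)
{
    struct flushed_meta out;

    assert(g != NULL);
    out.stream = g->stream;
    out.file = g->file;
    out.bytes = g->used;
    g->used = 0;
    g->stream = NULL;
    g->file = NULL;
    return out;
}
```

With a limit of 16 bytes, starting a group with `stdout`/`app1.log`, appending an oversized continuation, flushing, and then starting a second group with `stderr`/`app2.log` yields two flushed records whose stream/file pairs match their own start lines.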

2089-2090: New tests are correctly wired into TEST_LIST

The entries for "issue_10576" and "issue_truncation_10576" match the new function names and keep them grouped under the “Issues reported on Github” section, ensuring the regressions are exercised by the standard test runner.

tests/internal/multiline.c

cosmo0920 · 2025-12-05T08:03:39Z

Hey, the patch is good, but one commit does not follow our commit message policy:

❌ Commit 23a2e18 failed:
Subject prefix 'multiline:' does not match files changed.
Expected one of: ml:
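The kind of check that produces a message like this can be sketched as a prefix lookup. Everything here is an assumption made for illustration: the project's actual linter and its rule set are not shown in this thread, and `prefix_rule`, `expected_prefix`, and `subject_matches` are hypothetical names. The single rule entry is inferred from the bot output above (files under `src/multiline/` expect `ml:`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: a path prefix and the subject prefix expected for it. */
struct prefix_rule {
    const char *path_prefix;
    const char *subject_prefix;
};

/* One illustrative entry inferred from the bot message; the real rule set
 * is not shown in this thread. */
static const struct prefix_rule rules[] = {
    { "src/multiline/", "ml:" },
};

/* Return the expected subject prefix for a changed file, or NULL if unknown. */
static const char *expected_prefix(const char *path)
{
    size_t i;

    assert(path != NULL);
    for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
        size_t n = strlen(rules[i].path_prefix);
        if (strncmp(path, rules[i].path_prefix, n) == 0) {
            return rules[i].subject_prefix;
        }
    }
    return NULL;
}

/* Non-zero when the commit subject starts with the prefix expected for path. */
static int subject_matches(const char *subject, const char *path)
{
    const char *want = expected_prefix(path);

    assert(subject != NULL);
    if (want == NULL) {
        return 1; /* no rule known in this sketch: do not reject */
    }
    return strncmp(subject, want, strlen(want)) == 0;
}
```

Under this model, the subject `multiline: ensure context is registered for REGEX type` is rejected for `src/multiline/flb_ml.c`, while the same subject with the `ml:` prefix is accepted.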


mirko-lazarevic requested review from cosmo0920 and edsiper as code owners December 1, 2025 11:19

github-actions bot added the docs-required label Dec 1, 2025


chatgpt-codex-connector bot reviewed Dec 1, 2025


mirko-lazarevic mentioned this pull request Dec 1, 2025

Multiline filter issue, the timestamp and metadata of some logs is missing #10576

Open


patrick-stephens changed the title ~~multiline: ml: ensure context is registered for REGEX type~~ multiline: ensure context is registered for REGEX type Dec 1, 2025

mirko-lazarevic force-pushed the master branch from a398968 to 23a2e18 Compare December 1, 2025 11:49


coderabbitai bot reviewed Dec 4, 2025


coderabbitai bot reviewed Dec 4, 2025


mirko-lazarevic requested a review from cosmo0920 December 4, 2025 22:00

mirko-lazarevic added 6 commits December 5, 2025 12:09

ml: ensure context is registered for REGEX type

f83b768

This fix ensures that when the buffer is flushed, the record will have proper timestamp and metadata instead of just the "log" field.

Signed-off-by: Mirko Lazarevic <[email protected]>

ml: Add unit tests

48fb066

Signed-off-by: Mirko Lazarevic <[email protected]>

ml: improve unit tests

f1a9242

Signed-off-by: Mirko Lazarevic <[email protected]>

ml: handle TRUNCATED return case

c7649e9

Addresses PR comments and adds corresponding unit tests.

Signed-off-by: Mirko Lazarevic <[email protected]>

ml: conding style fix

25412a0

Signed-off-by: Mirko Lazarevic <[email protected]>

ml: fix typo

916c20a

Signed-off-by: Mirko Lazarevic <[email protected]>

mirko-lazarevic force-pushed the master branch from e5cdce8 to 916c20a Compare December 5, 2025 11:10

