Forward could not accept new connection/tls unexpected EOF errors #7434

@rossbishop

Bug Report

Describe the bug
Fluent-bit logs a large number of TLS/connection errors when TLS is enabled on the forward input plugin.

The use case is one fluent-bit instance running in EC2 that sends logs over TLS to a receiver fluent-bit instance inside a kube cluster, which forwards the messages into graylog.

Observations:

  • Turning TLS verification off: the errors still occur
  • Running in debug mode produces further messages, e.g. [debug] [downstream] connection #51 failed and [debug] [socket] could not validate socket status for #52 (don't worry)
  • Turning TLS off resolves the errors (predictably)
  • Most messages still reach the receiver/server fluent-bit as far as I can see; I haven't yet determined whether any messages are being lost

To Reproduce
Example log messages:

[2023/05/17 14:21:40] [error] [input:forward:forward.0] could not accept new connection
[2023/05/17 14:21:40] [error] [tls] error: unexpected EOF
[2023/05/17 14:21:40] [error] [input:forward:forward.0] could not accept new connection
[2023/05/17 14:21:41] [error] [tls] error: unexpected EOF
[2023/05/17 14:21:41] [error] [input:forward:forward.0] could not accept new connection
[2023/05/17 14:21:41] [error] [tls] error: unexpected EOF

Occasionally:

[2023/05/17 14:21:42] [error] [/src/fluent-bit/src/tls/openssl.c:433 errno=0] Success
[2023/05/17 14:21:42] [error] [tls] syscall error: error:00000005:lib(0):func(0):DH lib
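
To put a number on how often these errors occur, the receiver's log can be tallied per second and per message. The following is a minimal sketch, assuming the log lines follow the `[timestamp] [level] message` layout shown above; `tally_errors` and `LINE_RE` are hypothetical helper names, not part of fluent-bit.

```python
import re
from collections import Counter

# Matches the layout seen in the sample output above:
# [YYYY/MM/DD HH:MM:SS] [level] message
# The level may be left-padded with spaces, hence the \s*.
LINE_RE = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\] \[\s*(\w+)\] (.*)$"
)


def tally_errors(lines):
    """Count error-level messages per (timestamp, message) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the layout
        timestamp, level, message = match.groups()
        if level == "error":
            counts[(timestamp, message)] += 1
    return counts


# The six sample lines from this report.
sample = [
    "[2023/05/17 14:21:40] [error] [input:forward:forward.0] could not accept new connection",
    "[2023/05/17 14:21:40] [error] [tls] error: unexpected EOF",
    "[2023/05/17 14:21:40] [error] [input:forward:forward.0] could not accept new connection",
    "[2023/05/17 14:21:41] [error] [tls] error: unexpected EOF",
    "[2023/05/17 14:21:41] [error] [input:forward:forward.0] could not accept new connection",
    "[2023/05/17 14:21:41] [error] [tls] error: unexpected EOF",
]

for (timestamp, message), count in sorted(tally_errors(sample).items()):
    print(timestamp, count, message)
```

Running the same function over a longer capture of the receiver's log would show whether the error rate is steady or tied to bursts from the sender.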

  • Steps to reproduce the problem:
  1. Run the receiving fluent-bit in kube
  2. Issue the pod certificate with the cert-manager vault issuer
  3. Run a second, external fluent-bit (e.g. in EC2) that sends messages to the receiving fluent-bit, with the vault ca_chain in its config

Expected behavior
Fluent-bit doesn't spew TLS errors

Your Environment

  • Version used: 2.1 (also 2.0.6)
  • Configuration:
    Fluent-bit helm chart running in kube
    fluent-bit.conf (server/receiver side - filters removed for simplicity):

[SERVICE]
    Daemon off
    Flush 10
    Log_Level debug
    parsers_file custom_parsers.conf
    HTTP_Server On
    HTTP_Listen 0.0.0.0
    HTTP_Port 2020
    Health_Check On
    storage.path /var/logs/flb-logs
    storage.sync full

[INPUT]
    name forward
    listen 0.0.0.0
    port 24224
    tls on
    tls.debug 4
    tls.verify on
    tls.crt_file /etc/tls/fluent-bit-ingress-tls/tls.crt
    tls.key_file /etc/tls/fluent-bit-ingress-tls/tls.key
    storage.type filesystem

[OUTPUT]
    Name                    gelf
    Match                   *
    Host                    ~URL omitted~
    Port                    12212
    Mode                    tls
    tls                     On
    tls.verify              Off
    tls.ca_file             /fluent-bit/etc/ca.crt
    tls.vhost               ~URL omitted~
    Gelf_Short_Message_Key  message
    Gelf_Host_Key           container_name
    storage.total_limit_size 256MB

fluent-bit.conf (client/sender side - filters removed for simplicity):

[SERVICE]
    parsers_file /fluent-bit/etc/parsers.conf

[INPUT]
    name forward
    listen 0.0.0.0
    port 24224

[OUTPUT]
    Name stdout
    Format json_lines
    Match OUTPUT

[OUTPUT]
    Name forward
    Match OUTPUT
    Host ~URL omitted~
    Port 24224
    tls on
    tls.verify on
    tls.ca_file /etc/fluent-bit/ca.crt
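
To check independently of fluent-bit whether a clean TLS handshake against the receiver succeeds with the same CA file, something like the following could be run from the sender host. This is a sketch under assumptions: `build_client_context` and `check_handshake` are hypothetical helper names, the receiver host is a placeholder because the real one is omitted above, and Python also checks the certificate hostname, which may be stricter than what fluent-bit does with `tls.verify on`.

```python
import socket
import ssl


def build_client_context(ca_file=None):
    """Build a verifying TLS client context.

    With ca_file set, only that CA bundle is trusted (comparable in
    spirit to tls.ca_file); with None, the system trust store is used.
    """
    return ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)


def check_handshake(host, port, ca_file=None, timeout=5.0):
    """Open a TCP connection, complete a TLS handshake, then close.

    Returns (protocol version, cipher name) on success and raises
    ssl.SSLError or OSError if the connection or handshake fails.
    """
    context = build_client_context(ca_file)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]


# Example call (placeholder host; path and port taken from the config above):
# print(check_handshake("receiver.example.internal", 24224,
#                       "/etc/fluent-bit/ca.crt"))
```

If this handshake succeeds while the receiver still logs errors, certificate validation is less likely to be the cause, which would match the observation that turning TLS verification off does not help.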

  • Environment name and version (e.g. Kubernetes? What version?):
    Kubernetes 1.23 (EKS)
  • Server type and version:
  • Operating System and version:
    Fedora CoreOS EC2 instance
  • Filters and plugins:
    Forward input/outputs
    gelf output

Additional context
