@Paramadon Paramadon commented Dec 1, 2025

Issue #, if available

Currently, customers cannot use our default configuration to have the agent and Fluent Bit use dualstack endpoints. We want to add a top-level field that enables dualstack endpoints.

Passing tests: https://github.com/aws/amazon-cloudwatch-agent/actions/runs/19836119811

Description of changes:

If the customer sets the new top-level useDualstackEndpoint field to true, we set use_dualstack_endpoint to true in the agent configuration so that the agent uses dualstack endpoints (this is the existing way to enable dualstack endpoints in the CloudWatch agent).

For Fluent Bit, we add endpoint and sts_endpoint to the Fluent Bit configuration (under the [OUTPUT] section) so that it uses dualstack endpoints, and we have Fluent Bit prefer IPv6 DNS results by setting net.dns.prefer_ipv6 true. Note that setting useDualstackEndpoint to true on an IPv4 instance does NOT break the cluster: dualstack endpoints also work on IPv4 instances, and net.dns.prefer_ipv6 falls back to IPv4. (We applied the Helm chart on an IPv4 instance with useDualstackEndpoint set, and it works.)

We also do not replace endpoints if the configuration already has an endpoint field (for example, in ADC regions). ADC regions are out of scope for this change.
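For readers who want to reason about the Fluent Bit behavior outside of Helm, here is a minimal Python sketch of the rule described above. It is not the chart's implementation: the function name `add_dualstack_endpoints` is hypothetical, and it works line by line rather than through the template's string replacement. It assumes that "already has an endpoint field" means the text contains the substring `endpoint` anywhere.

```python
def add_dualstack_endpoints(output_section: str, use_dualstack: bool) -> str:
    """Return the [OUTPUT] text with dualstack endpoints added after the region line."""
    # Assumption: any occurrence of "endpoint" counts as a user-supplied endpoint.
    if not use_dualstack or "endpoint" in output_section:
        return output_section
    result = []
    for line in output_section.splitlines():
        result.append(line)
        if line.split()[:1] == ["region"]:
            # Reuse the region line's indentation for the inserted keys.
            indent = line[: len(line) - len(line.lstrip())]
            result.append(f"{indent}endpoint            logs.${{AWS_REGION}}.api.aws")
            result.append(f"{indent}sts_endpoint        sts.${{AWS_REGION}}.api.aws")
    return "\n".join(result)
```

With the flag off, or with an endpoint already present, the input comes back unchanged, which is the property the ADC note above relies on.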

Testing

All testing was done on an IPv6 EKS cluster. We also tested that dualstack endpoints work on K8s, so we can make this Helm chart change.

Below are logs from fluent bit and the agent configuration showcasing what happens if we set the top level field useDualstackEndpoint to true or false.

Setting useDualstackEndpoint to true

Helm upgrade command:

helm upgrade --install amazon-cloudwatch-observability \
  charts/amazon-cloudwatch-observability \
  -n amazon-cloudwatch \
  -f charts/amazon-cloudwatch-observability/values.yaml \
  --set clusterName=<cluster-name> \
  --set region=us-west-2 \
  --set useDualstackEndpoint=true

Resulting agent configuration

 config: '{"agent":{"region":"us-west-2","use_dualstack_endpoint":true},"logs":{"metrics_collected":{"application_signals":{"hosted_in":"Ipv6-auto-mode-6n"},"kubernetes":{"cluster_name":"Ipv6-auto-mode-6n","enhanced_container_insights":true}}},"traces":{"traces_collected":{"application_signals":{}}}}

As you can see, setting the useDualstackEndpoint field to true sets the agent configuration to use dualstack endpoints.

Resulting fluent bit config

[SERVICE]
 net.dns.prefer_ipv6       true
 Flush                     5
 Grace                     30
 Log_Level                 error
 Daemon                    off
 Parsers_File              parsers.conf
 storage.path              /var/fluent-bit/state/flb-storage/
 storage.sync              normal
 storage.checksum          off
 storage.backlog.mem_limit 5M
 
[INPUT]
Name tail
Path /var/log/containers/*.log
Parser docker

[OUTPUT]
 Name                cloudwatch_logs
 Match               application.*
 region              ${AWS_REGION}
 endpoint            logs.${AWS_REGION}.api.aws
 sts_endpoint        sts.${AWS_REGION}.api.aws
 log_group_name      /aws/containerinsights/${CLUSTER_NAME}/application
 log_stream_prefix   ${HOST_NAME}-
 auto_create_group   true
 extra_user_agent    container-insights


Resulting fluent bit logs

For Fluent Bit, here is some log output showing logs being sent successfully using dualstack endpoints:

Screenshot 2025-12-01 at 3 34 40 PM

Setting useDualstackEndpoint to false

Helm upgrade command:

helm upgrade --install amazon-cloudwatch-observability \
  charts/amazon-cloudwatch-observability \
  -n amazon-cloudwatch \
  -f charts/amazon-cloudwatch-observability/values.yaml \
  --set clusterName=<cluster-name> \
  --set region=us-west-2 \
  --set useDualstackEndpoint=false

Resulting agent configuration

config: '{"agent":{"region":"us-west-2"},"logs":{"metrics_collected":{"application_signals":{"hosted_in":"Ipv6-auto-mode-6n"},"kubernetes":{"cluster_name":"Ipv6-auto-mode-6n","enhanced_container_insights":true}}},"traces":{"traces_collected":{"application_signals":{}}}}

As you can see, setting useDualstackEndpoint to false neither sets use_dualstack_endpoint to true nor adds it to the agent configuration.

Resulting fluent bit config

[SERVICE]
 Flush                     5
 Grace                     30
 Log_Level                 error
 Daemon                    off
 Parsers_File              parsers.conf
 storage.path              /var/fluent-bit/state/flb-storage/
 storage.sync              normal
 storage.checksum          off
 storage.backlog.mem_limit 5M
 
[INPUT]
Name tail
Path /var/log/containers/*.log
Parser docker


[OUTPUT]
 Name                cloudwatch_logs
 Match               application.*
 region              ${AWS_REGION}
 log_group_name      /aws/containerinsights/${CLUSTER_NAME}/application
 log_stream_prefix   ${HOST_NAME}-
 auto_create_group   true
 extra_user_agent    container-insights

Resulting fluent bit logs

For Fluent Bit, here is some log output showing logs being sent successfully using regular IPv4 endpoints.

Screenshot 2025-12-01 at 3 41 44 PM

Test to ensure we don't override endpoint if it is already given

In the Fluent Bit configuration I added endpoint logs.us-east-1.api.aws to see whether the endpoint is overwritten when useDualstackEndpoint is true. It isn't: the endpoint from the Fluent Bit configuration takes precedence, which ensures that we don't override ADC region endpoints.

Screenshot 2025-12-02 at 3 08 03 PM

Added scenario tests

Added minikube-based integration tests to validate the useDualstackEndpoint configuration:

What each test validates:

default_test.go: Confirms FluentBit configs don't contain dualstack endpoints or IPv6 preference when disabled

dualstack_endpoint_enabled_test.go: Confirms FluentBit OUTPUT sections have endpoint: logs.${AWS_REGION}.api.aws and sts_endpoint: sts.${AWS_REGION}.api.aws, SERVICE section has net.dns.prefer_ipv6 true, and CloudWatch Agent has use_dualstack_endpoint: true

dualstack_endpoint_with_custom_endpoint_test.go: Confirms custom endpoints in application-log.conf are NOT overwritten, while other configs still get dualstack endpoints
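The checks listed above can be illustrated with a small Python sketch. The scenario tests themselves are Go files run against minikube; this is only a simplified stand-in that parses classic-format Fluent Bit text and applies the same expectations. `parse_sections` and `dualstack_enabled` are hypothetical names, and the parser ignores details such as @INCLUDE handling.

```python
def parse_sections(conf: str) -> list[tuple[str, dict[str, str]]]:
    """Parse classic-format Fluent Bit text into (section, {key: value}) pairs."""
    sections = []
    for raw in conf.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif sections:
            # First token is the key; the rest of the line is the value.
            parts = line.split(None, 1)
            sections[-1][1][parts[0]] = parts[1] if len(parts) > 1 else ""
    return sections


def dualstack_enabled(conf: str) -> bool:
    """True when SERVICE prefers IPv6 and every OUTPUT uses the dualstack endpoints."""
    sections = parse_sections(conf)
    service_ok = any(
        name == "SERVICE" and kv.get("net.dns.prefer_ipv6") == "true"
        for name, kv in sections
    )
    outputs = [kv for name, kv in sections if name == "OUTPUT"]
    outputs_ok = all(
        kv.get("endpoint") == "logs.${AWS_REGION}.api.aws"
        and kv.get("sts_endpoint") == "sts.${AWS_REGION}.api.aws"
        for kv in outputs
    )
    return service_ok and bool(outputs) and outputs_ok
```

Comparing parsed keys rather than raw strings keeps the check independent of column alignment in the rendered config.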

*/}}
{{- define "fluent-bit.add-dualstack-endpoints" -}}
{{- $config := .config -}}
{{- if and .Values.useDualstackEndpoint (not (contains "endpoint" $config)) -}}
Collaborator Author

This makes sure we do not replace the endpoint if it already exists in the config (for example, for ADC regions). ADC is out of scope for this change.

Member

do we have a test that validates this logic?

Collaborator Author

Added to the PR description: the endpoint is not overwritten if it already exists in the [OUTPUT] section of the Fluent Bit configuration, so an ADC region endpoint override would be respected.

Contributor

it would be helpful to add before/after FB config with ipv6 enabled similar to how you show agent config in the description

{{- $agent := set $configCopy "agent" $agentRegion }}
{{- end }}

{{- if .Values.useDualstackEndpoint }}
Collaborator Author

We add the agent section if it does not exist and then set "use_dualstack_endpoint" to true in that config.
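A rough Python equivalent of that step, for illustration only (the chart does this in a Helm template; `enable_dualstack` is a hypothetical name):

```python
import json


def enable_dualstack(agent_config_json: str, use_dualstack: bool) -> str:
    """Set agent.use_dualstack_endpoint when the top-level flag is on."""
    config = json.loads(agent_config_json)
    if use_dualstack:
        # setdefault creates the "agent" section only when it is missing.
        config.setdefault("agent", {})["use_dualstack_endpoint"] = True
    return json.dumps(config, separators=(",", ":"))
```

When the flag is false the key is never written, which matches the agent configuration shown in the description for useDualstackEndpoint=false.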

k8s-app: fluent-bit
data:
fluent-bit.conf: |
{{- if .Values.useDualstackEndpoint }}
Collaborator Author

We set net.dns.prefer_ipv6 to true even if the environment is IPv4. It should still work because it falls back to IPv4, as shown by this passing test:
https://github.com/aws/amazon-cloudwatch-agent/actions/runs/19839229481/job/56845933521

I ran the test with the dualstack endpoint enabled on both IPv4 and IPv6, and both pass.

Member

@the-mann the-mann left a comment

Please add a manual test to the description that covers the case where the endpoint is manually set and useDualstackEndpoint is enabled. The test should validate that the custom endpoint is used instead of the dualstack endpoint. #254 (comment)

the-mann previously approved these changes Dec 2, 2025

@Paramadon Paramadon force-pushed the paramadon/IPv6EKSTest branch 2 times, most recently from 579fc89 to 811fdac Compare December 3, 2025 19:18
{{- define "fluent-bit.add-dualstack-endpoints" -}}
{{- $config := .config -}}
{{- if and .Values.useDualstackEndpoint (not (contains "endpoint" $config)) -}}
{{- $config = replace "region ${AWS_REGION}" (printf "region ${AWS_REGION}\n endpoint logs.${AWS_REGION}.api.aws\n sts_endpoint sts.${AWS_REGION}.api.aws") $config -}}
Contributor

Hmm, this seems like it could break at any point depending on the spacing. I think we could use a regexReplaceAll or something

Collaborator Author

yup that makes sense, using regex now instead.
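The spacing concern can be reproduced with a short Python sketch. This uses Python's `re` module rather than the Sprig functions in the template, and the sample text is copied from the column-aligned default config shown in the description.

```python
import re

ALIGNED = (
    " region              ${AWS_REGION}\n"
    " log_group_name      /aws/containerinsights/${CLUSTER_NAME}/application\n"
)
EXTRA = "\n endpoint logs.${AWS_REGION}.api.aws\n sts_endpoint sts.${AWS_REGION}.api.aws"

# Literal replacement only matches a single space between key and value,
# so the column-aligned config is left untouched.
literal = ALIGNED.replace("region ${AWS_REGION}", "region ${AWS_REGION}" + EXTRA)

# \s+ tolerates any run of whitespace; the lambda re-emits the matched text verbatim.
pattern = re.compile(r"(region\s+\$\{AWS_REGION\})")
with_regex = pattern.sub(lambda m: m.group(1) + EXTRA, ALIGNED)
```

Here `literal` equals the input because the search string never matches, while `with_regex` contains both endpoint lines right after the region line.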

@Paramadon Paramadon force-pushed the paramadon/IPv6EKSTest branch 2 times, most recently from 7fe19c1 to fe58e20 Compare December 3, 2025 20:02
@Paramadon Paramadon force-pushed the paramadon/IPv6EKSTest branch from fe58e20 to 8d305bb Compare December 3, 2025 20:14
{{- range $key, $val := .Values.containerLogs.fluentBit.configWindows.extraFiles }}
@INCLUDE {{ $key }}
{{- end }}
parsers.conf: |
Contributor

Has this also been tested on a Windows instance?

Collaborator Author

@Paramadon Paramadon Dec 4, 2025

Removing this altogether, as Windows nodes don't support IPv6 yet, so there is no need for dualstack endpoints: https://docs.aws.amazon.com/eks/latest/userguide/windows-support.html

{{- define "fluent-bit.add-dualstack-endpoints" -}}
{{- $config := .config -}}
{{- if and .Values.useDualstackEndpoint (not (contains "endpoint" $config)) -}}
{{- $config = mustRegexReplaceAll "(region\\s+\\$\\{AWS_REGION\\})" $config "$1\n endpoint logs.$${AWS_REGION}.api.aws\n sts_endpoint sts.$${AWS_REGION}.api.aws" -}}
Contributor

I think the custom endpoint logic also applies to sts_endpoint? If so, please add or update a minikube test case to include sts_endpoint.
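A quick way to see why the guard would also trigger on sts_endpoint: a plain substring test for "endpoint" matches any key that contains that word. The Python sketch below mirrors that kind of test; `has_custom_endpoint` is a hypothetical name and the STS hostname is only an illustrative value.

```python
def has_custom_endpoint(config: str) -> bool:
    # Plain substring test, like the `contains "endpoint"` guard in the template.
    return "endpoint" in config


only_sts = " region       ${AWS_REGION}\n sts_endpoint sts.us-east-1.amazonaws.com\n"
plain = " region       ${AWS_REGION}\n log_group_name /aws/containerinsights/demo\n"
```

Under this check `only_sts` counts as already customized even though it sets no logs endpoint, so a test case with only sts_endpoint set would exercise a distinct path.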

{{- end }}
parsers.conf: |
{{- .Values.containerLogs.fluentBit.config.customParsers | nindent 4 }}
{{- if hasPrefix "us-iso-" .Values.region }}
Contributor

I'm not an expert with FB config, but what happens if a custom Fluent Bit config doesn't have Parsers_File for some reason? Would it be better to just append right after [SERVICE]?

Collaborator Author

Custom Fluent Bit configs are out of scope, which is also what we are putting in the public docs, but it does make sense to append right after [SERVICE]. Will change it to this instead.

Contributor

What do you mean by custom config being out of scope? The public doc should have some instructions on how to set up IPv6 with custom configs, right? The new function fluent-bit.add-ipv6-preference will skip if a custom config already has dualstack configured.

Collaborator Author

Yup, we are giving customers who use a custom configuration a way to update the Fluent Bit configuration. I meant to say that customers using a custom config would need to update their Fluent Bit config manually. This top-level field is only for customers using the default config.
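To make the "append right after [SERVICE]" idea from this thread concrete, here is a hedged Python sketch. It is not the Helm helper itself: `add_ipv6_preference` is a hypothetical Python function, and the skip condition (the key already appears somewhere in the text) is an assumption about what "already configured" means.

```python
import re


def add_ipv6_preference(conf: str, use_dualstack: bool) -> str:
    """Insert net.dns.prefer_ipv6 directly under [SERVICE] unless already present."""
    if not use_dualstack or "net.dns.prefer_ipv6" in conf:
        return conf
    # count=1 touches only the first [SERVICE] header; MULTILINE anchors ^ and $ per line.
    return re.sub(
        r"^([ \t]*\[SERVICE\][^\n]*)$",
        lambda m: m.group(1) + "\n net.dns.prefer_ipv6       true",
        conf,
        count=1,
        flags=re.MULTILINE,
    )
```

Anchoring on the section header means the insertion does not depend on Parsers_File or any other key being present, and running it twice leaves the text unchanged.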

@Paramadon Paramadon requested a review from movence December 4, 2025 13:37
@Paramadon Paramadon requested a review from duhminick December 5, 2025 16:02
@Paramadon Paramadon force-pushed the paramadon/IPv6EKSTest branch from 9456e18 to c152d57 Compare December 5, 2025 16:16
@Paramadon Paramadon force-pushed the paramadon/IPv6EKSTest branch from c152d57 to 96a6fcb Compare December 5, 2025 16:27
@Paramadon Paramadon merged commit 3b66fd4 into main Dec 5, 2025
19 checks passed
@Paramadon Paramadon deleted the paramadon/IPv6EKSTest branch December 5, 2025 20:06