K8SPG-768 skip log creation #1217

Open, wants to merge 9 commits into base: main

Contributor

@nmarukovich nmarukovich commented Jul 9, 2025

CHANGE DESCRIPTION

Problem:
The replication-cert-copy container in the Postgres database pod generates excessive logs. The container checks disk usage and, when more than 75% of the volume is used, it calls the Kubernetes API to update the pod annotation with a suggested volume size (e.g., suggested-pgdata-pvc-size: 15345Mi).

Cause:
Short explanation of the root cause of the issue if applicable.

Solution:
Skip the pod annotation update unless the AutoGrowVolumes feature gate is enabled.

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?
  • Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • Is an E2E test/test case added for the new feature/change?
  • Are unit tests added where appropriate?

Config/Logging/Testability

  • Are all needed new/changed options added to default YAML files?
  • Are all needed new/changed options added to the Helm Chart?
  • Did we add proper logging messages for operator actions?
  • Did we ensure compatibility with the previous version or cluster upgrade process?
  • Does the change support oldest and newest supported PG version?
  • Does the change support oldest and newest supported Kubernetes version?

Contributor

@gkech gkech left a comment

overall everything is fine, tested a little with fallocate and couldn't see the logs that we have in main (per jira)

Comment on lines -236 to -250

# Manage autogrow annotation.
# Return size in Mebibytes.
size=$(df --human-readable --block-size=M /pgdata | awk 'FNR == 2 {print $2}')
use=$(df --human-readable /pgdata | awk 'FNR == 2 {print $5}')
sizeInt="${size//M/}"
# Use the sed punctuation class, because the shell will not accept the percent sign in an expansion.
useInt=$(echo $use | sed 's/[[:punct:]]//g')
triggerExpansion="$((useInt > 75))"
if [ $triggerExpansion -eq 1 ]; then
  newSize="$(((sizeInt / 2)+sizeInt))"
  newSizeMi="${newSize}Mi"
  d='[{"op": "add", "path": "/metadata/annotations/suggested-pgdata-pvc-size", "value": "'"$newSizeMi"'"}]'
  curl --cacert ${CACERT} --header "Authorization: Bearer ${TOKEN}" -XPATCH "${APISERVER}/api/v1/namespaces/${NAMESPACE}/pods/${HOSTNAME}?fieldManager=kubectl-annotate" -H "Content-Type: application/json-patch+json" --data "$d"
fi
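
For readers following the arithmetic in the quoted script, here is a minimal Go sketch of the same rule: when usage is strictly above 75%, the suggested size is the current size plus half of it, using integer division. The function name `suggestSizeMi` and its signature are hypothetical and do not come from the operator code.

```go
package main

import "fmt"

// suggestSizeMi mirrors the shell arithmetic quoted above. It is a
// hypothetical helper, not part of the operator. When usePercent is
// strictly greater than 75, it returns the current size plus half of it
// (integer division), formatted in mebibytes.
func suggestSizeMi(sizeMi, usePercent int) (string, bool) {
	if usePercent <= 75 {
		// Below or at the threshold: no suggestion, no API call.
		return "", false
	}
	newSize := sizeMi/2 + sizeMi
	return fmt.Sprintf("%dMi", newSize), true
}

func main() {
	if s, ok := suggestSizeMi(10230, 80); ok {
		fmt.Println("suggested-pgdata-pvc-size:", s)
	}
}
```

With these inputs, a 10230 MiB volume at 80% usage yields 15345Mi, which matches the example value given in the PR description.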
Contributor

I think it would be nice to add a unit test with the feature gate (FG) enabled so that we can assert that this part exists in the script. WDYT?

Contributor Author

Yep, I think it's a good point. I will fix it.
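
The kind of assertion discussed in this thread could look roughly like the sketch below. It assumes a simplified stand-in, `buildScript`, in place of the real `reloadCommand`; the stand-in, the `hasAutogrow` helper, and the `autogrowMarker` constant are all hypothetical. A real test would call `reloadCommand` and would likely live in a `_test.go` file.

```go
package main

import (
	"fmt"
	"strings"
)

// autogrowMarker is the annotation name that appears only in the autogrow
// part of the script quoted in this review.
const autogrowMarker = "suggested-pgdata-pvc-size"

// buildScript is a hypothetical stand-in for the script assembled by
// reloadCommand. It is not the operator's implementation; it only shows
// the autogrow part being included when the flag is set.
func buildScript(autoGrowVolumes bool) string {
	var b strings.Builder
	b.WriteString("# copy certificates and reload PostgreSQL (elided)\n")
	if autoGrowVolumes {
		b.WriteString("# Manage autogrow annotation.\n")
		b.WriteString("# ... patch " + autogrowMarker + " ...\n")
	}
	return b.String()
}

// hasAutogrow reports whether the autogrow part is present in a script.
func hasAutogrow(script string) bool {
	return strings.Contains(script, autogrowMarker)
}

func main() {
	// Check both states of the feature gate.
	for _, enabled := range []bool{true, false} {
		fmt.Printf("gate=%t autogrow=%t\n", enabled, hasAutogrow(buildScript(enabled)))
	}
}
```

Checking both states of the gate covers the case the reviewer asked for (the part exists when the gate is on) and the behavior the PR introduces (the part is absent when the gate is off).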

@nmarukovich nmarukovich requested a review from gkech July 14, 2025 17:02
@@ -170,7 +170,7 @@ func Environment(cluster *v1beta1.PostgresCluster) []corev1.EnvVar {
 // reloadCommand returns an entrypoint that convinces PostgreSQL to reload
 // certificate files when they change. The process will appear as name in `ps`
 // and `top`.
-func reloadCommand(name string, post250 bool) []string {
+func reloadCommand(name string, post250 bool, AutoGrowVolumes bool) []string {
Contributor

I think we need to introduce a version check for changing the script in this function given that it is triggered here:

	reloader := corev1.Container{
		Name: naming.ContainerClientCertCopy,

		Command: reloadCommand(naming.ContainerClientCertCopy, inCluster.CompareVersion("2.5.0") >= 0, feature.Enabled(ctx, feature.AutoGrowVolumes)),

		Image:           container.Image,
		ImagePullPolicy: container.ImagePullPolicy,
		SecurityContext: initialize.RestrictedSecurityContext(inCluster.CompareVersion("2.5.0") >= 0),

		VolumeMounts: []corev1.VolumeMount{certVolumeMount},
	}

You can verify that by checking the failures on the upgrade-minor e2e test
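
One possible reading of this suggestion, sketched under assumptions: clusters below some version keep the previous script unchanged, and only newer clusters have the autogrow part tied to the feature gate. The discussion does not name a version, so `gateFrom` is a placeholder parameter. `compareVersion` is a simplified, hypothetical stand-in for the `CompareVersion` method used in the snippet above, and `includeAutogrow` is likewise hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// compareVersion is a simplified, hypothetical stand-in for CompareVersion.
// It compares dotted numeric versions and returns -1, 0, or 1. Parts that
// are missing or not numeric are treated as 0.
func compareVersion(a, b string) int {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(as) || i < len(bs); i++ {
		var x, y int
		if i < len(as) {
			x, _ = strconv.Atoi(as[i])
		}
		if i < len(bs) {
			y, _ = strconv.Atoi(bs[i])
		}
		if x < y {
			return -1
		}
		if x > y {
			return 1
		}
	}
	return 0
}

// includeAutogrow decides whether the autogrow part stays in the script.
// Clusters older than gateFrom keep the previous script unchanged; from
// gateFrom on, the part is included only when the feature gate is enabled.
// gateFrom is a placeholder: the PR discussion does not name a version.
func includeAutogrow(crVersion, gateFrom string, autoGrowVolumes bool) bool {
	if compareVersion(crVersion, gateFrom) < 0 {
		return true
	}
	return autoGrowVolumes
}

func main() {
	// Placeholder versions, chosen only for illustration.
	fmt.Println(includeAutogrow("1.0.0", "2.0.0", false))
	fmt.Println(includeAutogrow("2.0.0", "2.0.0", false))
}
```

Keeping the old script for older versions means the generated container command does not change for existing clusters until their version is raised, which is one way an upgrade test could be kept stable.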

@nmarukovich nmarukovich requested a review from gkech July 15, 2025 13:56