
Conversation


@MagicFun1241 MagicFun1241 commented Jul 21, 2025

This PR adds the ability to deploy weed in "filer.backup" mode.

What changed?

  • bumped Go version to 1.24
  • bumped the SeaweedFS version
  • added access to the secrets and persistentvolumeclaims resources in manager-role
  • added support for reading credentials from Secrets, with optional field mapping
  • the sink option supports all providers and fields that were in the scaffold
  • refactored to propagate context in more places

Example:

apiVersion: seaweed.seaweedfs.com/v1
kind: Seaweed
metadata:
  name: seaweed
spec:
  image: chrislusf/seaweedfs:latest
  volumeServerDiskCount: 1
  hostSuffix: seaweed.abcdefg.com
  master:
    replicas: 1
    volumeSizeLimitMB: 1024
  volume:
    replicas: 1
    requests:
      storage: 2Gi
  filerBackup:
    replicas: 1
    sink:
      s3:
        enabled: true
        awsCredentialsSecretRef:
          name: backup-credentials
          mapping:
            awsSecretAccessKey: secretKey
            awsAccessKeyID: accessKey
  filer:
    replicas: 1
    config: |
      [leveldb2]
      enabled = true
      dir = "/data/filerldb2"
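
For reference, the backup-credentials Secret referenced above could look like the sketch below; the accessKey/secretKey key names are illustrative and only need to match the mapping in the spec:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: backup-credentials
type: Opaque
stringData:
  # Arbitrary key names; the mapping in the Seaweed spec translates them
  # to awsAccessKeyID / awsSecretAccessKey for the operator.
  accessKey: AKIAEXAMPLEKEYID
  secretKey: example-secret-access-key
```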

Tests

Tested the sink.local backup in a local cluster; also tested fetching the Secret used to build the ConfigMap.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @MagicFun1241, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the SeaweedFS Operator by introducing a dedicated Filer Backup component. This new feature allows users to configure and manage asynchronous backups of their filer data to a variety of destinations, including local storage, other filers, and popular cloud storage services. The implementation includes robust credential management via Kubernetes secrets and integrates seamlessly with the existing operator framework.

Highlights

  • New Filer Backup Component: Introduced a new FilerBackup component within the Seaweed custom resource, enabling asynchronous backup capabilities for SeaweedFS filer data.
  • Flexible Backup Sink Configurations: Added comprehensive support for various backup destinations, including local file systems, filer-to-filer replication, and major cloud storage providers like AWS S3, Google Cloud Storage, Azure Blob Storage, and Backblaze B2.
  • Secure Credential Management: Implemented a mechanism to securely fetch cloud storage credentials from Kubernetes secrets, allowing for flexible key mapping to adapt to different secret structures.
  • Kubernetes Operator Integration: Extended the operator's reconciliation logic, RBAC roles, and Helm chart to properly manage the lifecycle of the new Filer Backup component, including its ConfigMap and StatefulSet.
  • Dependency and Toolchain Updates: Upgraded the Go toolchain to version 1.24 and updated numerous Go module dependencies, including the core seaweedfs library and various cloud SDKs, ensuring compatibility and leveraging newer features.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for asynchronous filer backups to various storage sinks. The changes include new API types for backup configuration, RBAC updates for accessing secrets and PVCs, and the controller logic to manage backup statefulsets and configmaps.

My review focuses on improving robustness, maintainability, and configuration correctness.

Comment on lines +23 to +32
func buildStubProbeHandler() corev1.ProbeHandler {
	return corev1.ProbeHandler{
		Exec: &corev1.ExecAction{
			Command: []string{
				"/bin/sh",
				"-c",
				"echo OK",
			},
		},
	}
}
Contributor

medium

The buildStubProbeHandler function creates readiness and liveness probes that use a stub command (echo OK). This means the pod will always be considered healthy even if the weed filer.backup process has crashed or is not functioning correctly.

Please consider implementing a more meaningful health check.

Author

@MagicFun1241 MagicFun1241 Jul 21, 2025

As far as I noticed, filer.backup doesn't provide anywhere to point a health check at. It doesn't listen on an HTTP, gRPC, or metrics port.

@MagicFun1241
Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces asynchronous backup capabilities for the SeaweedFS filer. The changes include new API types for configuring backups to various sinks, RBAC updates, and controller logic. The review focuses on improving security by addressing credential storage, robustness by enhancing secret handling, and correctness by fixing configuration issues and improving health probes.

Comment on lines +284 to +287
AWSAccessKeyID string `json:"awsAccessKeyID,omitempty"`

// AWS secret access key (if empty, loads from shared credentials file)
AWSSecretAccessKey string `json:"awsSecretAccessKey,omitempty"`
Contributor

high

Storing credentials like AWSAccessKeyID and AWSSecretAccessKey directly in the CRD poses a security risk, as they are stored in etcd in plain text. It's safer to use awsCredentialsSecretRef to fetch them from a Kubernetes Secret. Consider deprecating these fields and adding a warning in the field documentation.

Suggested change
-AWSAccessKeyID string `json:"awsAccessKeyID,omitempty"`
-// AWS secret access key (if empty, loads from shared credentials file)
-AWSSecretAccessKey string `json:"awsSecretAccessKey,omitempty"`
+// AWS access key ID (DEPRECATED: Use awsCredentialsSecretRef instead for better security)
+AWSAccessKeyID string `json:"awsAccessKeyID,omitempty"`
+// AWS secret access key (DEPRECATED: Use awsCredentialsSecretRef instead for better security)
+AWSSecretAccessKey string `json:"awsSecretAccessKey,omitempty"`

Comment on lines 113 to 129
ReadinessProbe: &corev1.Probe{
	ProbeHandler:        buildStubProbeHandler(),
	InitialDelaySeconds: 10,
	TimeoutSeconds:      3,
	PeriodSeconds:       15,
	SuccessThreshold:    1,
	FailureThreshold:    100,
},
LivenessProbe: &corev1.Probe{
	ProbeHandler:        buildStubProbeHandler(),
	InitialDelaySeconds: 20,
	TimeoutSeconds:      3,
	PeriodSeconds:       30,
	SuccessThreshold:    1,
	FailureThreshold:    6,
},
}}
Contributor

medium

The readiness and liveness probes use echo OK, verifying only the container's shell. Implement more meaningful probes using a health check endpoint (e.g., an HTTP endpoint or a metrics port) exposed by the filer.backup process to improve deployment reliability.

@MagicFun1241
Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces the ability to deploy weed in "filer.backup" mode, adding support for asynchronous backups to various cloud storage providers. The API changes are well-structured.

Comment on lines 113 to 128
ReadinessProbe: &corev1.Probe{
	ProbeHandler:        buildStubProbeHandler(),
	InitialDelaySeconds: 10,
	TimeoutSeconds:      3,
	PeriodSeconds:       15,
	SuccessThreshold:    1,
	FailureThreshold:    100,
},
LivenessProbe: &corev1.Probe{
	ProbeHandler:        buildStubProbeHandler(),
	InitialDelaySeconds: 20,
	TimeoutSeconds:      3,
	PeriodSeconds:       30,
	SuccessThreshold:    1,
	FailureThreshold:    6,
},
Contributor

medium

The readiness and liveness probes use a stub command (echo OK), confirming only the pod's shell responsiveness, not the weed filer.backup process's health. A long-running backup job could be stuck or failed without detection.

Investigate a more robust health check mechanism, like an HTTP endpoint or a command to check its status.

@MagicFun1241
Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces asynchronous backup support for SeaweedFS filers. I've provided feedback focusing on improving security by removing plaintext credentials from the CRD and tightening RBAC permissions. I also suggest enhancing the robustness of the backup pods with more meaningful health probes and fixing a bug in the credential extraction logic. Overall, this is a great feature, and with these changes, it will be even more secure and reliable.

Comment on lines +283 to +287
// AWS access key ID (if empty, loads from shared credentials file)
AWSAccessKeyID string `json:"awsAccessKeyID,omitempty"`

// AWS secret access key (if empty, loads from shared credentials file)
AWSSecretAccessKey string `json:"awsSecretAccessKey,omitempty"`
Contributor

high

Storing credentials like AWSAccessKeyID and AWSSecretAccessKey directly in the CRD spec poses a security risk, as they are stored in plaintext in etcd. It's recommended to only support sourcing these from Kubernetes secrets. This concern applies to other sink configurations that handle sensitive credentials, such as AzureSinkConfig and BackblazeSinkConfig. Consider removing these fields and relying solely on the ...CredentialsSecretRef fields.

Comment on lines 54 to 56
if masterKey, exists := secret[mapping.B2MasterApplicationKey]; exists {
	b2MasterApplicationKey = masterKey
}
Contributor

high

There are a couple of issues here with fetching the Backblaze master application key:

  • Bug: If mapping.B2MasterApplicationKey is empty, the code will try to look up an empty string key in the secret data (secret[""]), which is likely not the intended behavior. A default key name should be used, similar to how b2AccountIDKey is handled.
  • Inconsistent Logging: There is no log message if a custom-mapped B2MasterApplicationKey is not found in the secret, which is inconsistent with the handling of b2AccountID and other credential extraction logic in this PR.
b2MasterApplicationKeyName := mapping.B2MasterApplicationKey
if b2MasterApplicationKeyName == "" {
	b2MasterApplicationKeyName = "b2MasterApplicationKey"
}
if masterKey, exists := secret[b2MasterApplicationKeyName]; exists {
	b2MasterApplicationKey = masterKey
} else if mapping.B2MasterApplicationKey != "" {
	log.Info("Secret key not found in secret", "secret", backblazeConfig.BackblazeCredentialsSecretRef.Name, "mapping", b2MasterApplicationKeyName)
}
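
If this default-plus-mapping lookup recurs for every credential field, it could be factored into one helper shared by all sinks. A minimal sketch (the helper name resolveSecretKey and the standalone harness are hypothetical, not code from this PR):

```go
package main

import "fmt"

// resolveSecretKey returns the credential stored under the custom-mapped key,
// falling back to defaultKey when no mapping was provided. missingMapped is
// true only when a custom mapping was given but is absent from the secret,
// so callers can log that case consistently across sinks.
func resolveSecretKey(secret map[string]string, mapped, defaultKey string) (value string, missingMapped bool) {
	key := mapped
	if key == "" {
		key = defaultKey
	}
	if v, ok := secret[key]; ok {
		return v, false
	}
	return "", mapped != ""
}

func main() {
	secret := map[string]string{"b2MasterApplicationKey": "k-123"}

	v, _ := resolveSecretKey(secret, "", "b2MasterApplicationKey")
	fmt.Println(v) // falls back to the default key name → k-123

	_, missing := resolveSecretKey(secret, "customKey", "b2MasterApplicationKey")
	fmt.Println(missing) // custom mapping given but absent → true
}
```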

Comment on lines +23 to +32
func buildStubProbeHandler() corev1.ProbeHandler {
	return corev1.ProbeHandler{
		Exec: &corev1.ExecAction{
			Command: []string{
				"/bin/sh",
				"-c",
				"echo OK",
			},
		},
	}
}
Contributor

high

The readiness and liveness probes are using a stub echo OK command. This is not a meaningful health check. It only verifies that the shell is available in the container, not that the filer.backup process is running and healthy. This could lead to a situation where the backup process has crashed or is stuck, but the pod is still considered healthy, causing silent backup failures.

Please implement a more robust health check. If weed filer.backup does not provide a health check endpoint, consider alternatives like:

  • Checking if the weed process is running (pgrep weed).
  • Having the process touch a file periodically and checking the file's modification time.
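
The pgrep idea could be expressed as an exec probe along these lines (a sketch; whether pgrep ships in the seaweedfs image has not been verified here):

```yaml
livenessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - pgrep weed   # non-zero exit (probe failure) when no weed process exists
  initialDelaySeconds: 20
  periodSeconds: 30
  failureThreshold: 6
```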

// +kubebuilder:rbac:groups=extensions,resources=ingresses,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=networking.k8s.io,resources=ingresses,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;
// +kubebuilder:rbac:groups=core,resources=secrets,verbs=get;list;
Contributor

high

The ClusterRole is granted list permission on secrets cluster-wide. This is a security risk as it violates the principle of least privilege. The controller logic in getSecret only ever fetches a specific secret by name and namespace using r.Get(), so get permission is sufficient. The list permission does not appear to be used and should be removed to reduce the operator's potential attack surface.

Suggested change
-// +kubebuilder:rbac:groups=core,resources=secrets,verbs=get;list;
+// +kubebuilder:rbac:groups=core,resources=secrets,verbs=get;

Comment on lines 50 to 52
} else {
	log.Info("Secret key not found in secret", "secret", azureConfig.AzureCredentialsSecretRef.Name, "mapping", accountNameKey)
}
Contributor

medium

The logging for a missing secret key is inconsistent with other sinks like S3. Here, it logs an info message even if no custom mapping was provided and the default key is not found. Other sinks only log if a custom mapping was provided but the key was not found in the secret. For consistency, this should be changed.

Suggested change
-} else {
-	log.Info("Secret key not found in secret", "secret", azureConfig.AzureCredentialsSecretRef.Name, "mapping", accountNameKey)
-}
+} else if mapping.AccountName != "" {
+	log.Info("Secret key not found in secret", "secret", azureConfig.AzureCredentialsSecretRef.Name, "mapping", accountNameKey)
+}

Comment on lines 62 to 64
} else {
	log.Info("Secret key not found in secret", "secret", azureConfig.AzureCredentialsSecretRef.Name, "mapping", accountKeyKey)
}
Contributor

medium

Similar to the AccountName field, the logging for a missing AccountKey is inconsistent with other sinks. It should only log if a custom mapping was provided but the key was not found.

Suggested change
-} else {
-	log.Info("Secret key not found in secret", "secret", azureConfig.AzureCredentialsSecretRef.Name, "mapping", accountKeyKey)
-}
+} else if mapping.AccountKey != "" {
+	log.Info("Secret key not found in secret", "secret", azureConfig.AzureCredentialsSecretRef.Name, "mapping", accountKeyKey)
+}

TimeoutSeconds: 3,
PeriodSeconds: 15,
SuccessThreshold: 1,
FailureThreshold: 100,
Contributor

medium

The FailureThreshold for the readiness probe is set to 100. With a PeriodSeconds of 15, this means the pod can fail readiness checks for up to 25 minutes (100 * 15s) before being marked as Unready. This seems excessively high and could mask startup problems for a long time. A lower value, e.g., 20, would be more appropriate to detect issues faster.

Suggested change
-FailureThreshold: 100,
+FailureThreshold: 20,

@MagicFun1241
Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces asynchronous backup support. The changes include adding access to secrets and persistentvolumeclaims resources, supporting credentials from secrets, and enabling sink options for various providers. The code is well-structured, but there are some areas for improvement, including health probes, context propagation, and dependency management.

Comment on lines +113 to +128
ReadinessProbe: &corev1.Probe{
	ProbeHandler:        buildStubProbeHandler(),
	InitialDelaySeconds: 10,
	TimeoutSeconds:      3,
	PeriodSeconds:       15,
	SuccessThreshold:    1,
	FailureThreshold:    20,
},
LivenessProbe: &corev1.Probe{
	ProbeHandler:        buildStubProbeHandler(),
	InitialDelaySeconds: 20,
	TimeoutSeconds:      3,
	PeriodSeconds:       30,
	SuccessThreshold:    1,
	FailureThreshold:    6,
},
Contributor

critical

The readiness and liveness probes for the filer-backup container are using a stub echo OK command. This is a critical correctness issue, as it means the pod will always be reported as healthy to Kubernetes, even if the filer.backup process has crashed, hung, or is failing to perform backups. This can lead to a non-functional backup system that appears to be working correctly.

A proper health check must be implemented. Please investigate if the weed filer.backup command provides a health check endpoint or mechanism that can be used. If not, a more robust check should be devised, for example:

  • Checking for the existence and recent modification of a status file.
  • Executing a command that verifies the weed process is running and responsive.

Comment on lines 31 to 42
- apiGroups:
  - ""
  resources:
  - persistentvolumeclaims
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
Contributor

medium

These permissions grant access to create, delete, get, list, patch, update, and watch persistentvolumeclaims. While necessary for managing the backup process, ensure that these permissions are scoped appropriately to prevent unintended modifications to other PVCs in the cluster. Consider using more restrictive resource names or labels if possible.

go.mod (Outdated)
	github.com/spf13/pflag v1.0.5 // indirect
	github.com/spf13/viper v1.13.0 // indirect
	github.com/subosito/gotenv v1.4.1 // indirect
	github.com/seaweedfs/seaweedfs v0.0.0-20250625060439-95261a712ee2
Contributor

medium

Consider depending on a tagged release of github.com/seaweedfs/seaweedfs instead of a commit-based pseudo-version, to make the dependency easier to audit and upgrade and to avoid picking up breaking changes from untagged commits.

…ons in seaweed_webhook.go, and refactor admin service and stateful set management in controller_admin.go
…address configuration for service monitoring
…installation and usage instructions for SeaweedFS Operator
…ols version to v0.18.0, and enhance CRD definitions with bucketclaims support and RBAC permissions
…configurations, update Dockerfile and go.mod for improved build process
…ithQuota to utilize CreateS3BucketWithObjectLock for improved bucket creation logic
@chrislusf
Collaborator

thanks for the big effort. Is it done and ready for review?

…figuration in deployment and service templates for seaweedfs-operator
…ted webhook templates for seaweedfs-operator