feat: add task for in-cluster load test #4007
base: master
e53c1ad to 8c66465
Taskfile.yml
Outdated
@@ -244,6 +244,10 @@ tasks:
cmds:
- cmd: go run ./tests/load/c/main --runtime=kube --kube-use-exclusive-scheduling {{.CLI_ARGS}}

test-load-kind-cluster:
Maybe choose a name more reflective of the test running inside the cluster? 'kind cluster' isn't very specific.
Done: d9fdb87
@@ -10,6 +10,7 @@ set -euo pipefail
# DOCKER_IMAGE=avaplatform/avalanchego ./scripts/build_image.sh # Build and push multi-arch image to docker hub
# DOCKER_IMAGE=localhost:5001/avalanchego ./scripts/build_image.sh # Build and push multi-arch image to private registry
# DOCKER_IMAGE=localhost:5001/avalanchego FORCE_TAG_LATEST=1 ./scripts/build_image.sh # Build and push image to private registry with tag `latest`
# DOCKERFILE="./Dockerfile" ./scripts/build_image.sh # Build image with a custom Dockerfile
(No action required) Why is it desirable to customize this build script instead of following the example of scripts/build_bootstrap_monitor_image.sh?
Note that a compiled binary is suggested rather than using 'go run' at runtime.
scripts/tests.load.kind.sh
Outdated
fi

# Start kind cluster
./scripts/start_kind_cluster.sh
Suggest passing arguments as per the example of other kind-using scripts.
Done: 387e4ef
metadata:
  name: load-test
  namespace: tmpnet
rules:
(No action required) How did you arrive at these permissions?
@@ -101,5 +101,9 @@ func (s *MetricsServer) GenerateMonitoringConfig(monitoringLabels map[string]str
return "", err
}

if err := os.MkdirAll(filepath.Dir(collectorFilePath), 0o755); err != nil {
While this addition might avoid an error, the fact that the path does not exist is a symptom of a larger problem: collection is not being configured in the pod. Given the requirement to label test workload metrics with the network UUID, which isn't known at the time of pod deployment, I'd suggest deploying a local prometheus collector so that tmpnet can configure it. That would mean supplying the collector credentials to the pod (easy enough) but also ensuring that a compatible version of prometheus is available so that tmpnet can start it.
Maybe coordinate with Elvis to see what the timeline is for getting ARC online? Other than as a learning exercise, I'm less convinced of the wisdom of supporting pod-based workloads if doing so requires not just publishing an image but also that image being complex to build. CI-launched tests won't need to publish images and won't need extra work to support workload monitoring. Local iteration would likely be easier to support by enabling external access to nodes via a proxy instead of forwarding.
This PR has become stale because it has been open for 30 days with no activity. Adding the |
Why this should be merged
This PR adds a task allowing for running load tests within a local kind cluster.
How this works
The task does the following:
How this was tested
The load test was run with
task test-load-kind-cluster
and passed.
Need to be documented in RELEASES.md?
N/A