Partial manual revert of changes from commit 0f8cd8af0a46e3e860c38f18… #229
Conversation
Walkthrough: Adds environment-based endpoint discovery scaffolding to the cache utilities: a new helper, cacheDefaultEndpoint(), resolves the KFP API endpoint from environment variables, falling back to the default endpoint when they are absent.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant NewClient
    participant cacheDefaultEndpoint
    participant Env as Environment
    participant Fallback as defaultKfpApiEndpoint
    NewClient->>cacheDefaultEndpoint: request endpoint for namespace
    cacheDefaultEndpoint->>Env: read ML_PIPELINE_SERVICE_HOST, ML_PIPELINE_SERVICE_PORT_GRPC
    alt env vars present
        Env-->>cacheDefaultEndpoint: host + port
        cacheDefaultEndpoint-->>NewClient: resolved host:port
    else env vars missing
        Fallback-->>cacheDefaultEndpoint: defaultKfpApiEndpoint
        cacheDefaultEndpoint-->>NewClient: fallback endpoint
    end
    NewClient->>NewClient: compute cacheEndPoint, log resolved endpoint
```
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks: ❌ 2 warnings, ✅ 1 passed.
A set of new images has been built to help with testing out this PR:
An OCP cluster where you are logged in as cluster admin is required. The Data Science Pipelines team recommends testing this using the Data Science Pipelines Operator (DSPO). Check here for more information on using the DSPO.

To use and deploy a DSP stack with these images (assuming the DSPO is deployed), first save the following YAML to a file named dspa.pr-229.yaml:

```yaml
apiVersion: datasciencepipelinesapplications.opendatahub.io/v1
kind: DataSciencePipelinesApplication
metadata:
  name: pr-229
spec:
  dspVersion: v2
  apiServer:
    image: "quay.io/opendatahub/ds-pipelines-api-server:pr-229"
    argoDriverImage: "quay.io/opendatahub/ds-pipelines-driver:pr-229"
    argoLauncherImage: "quay.io/opendatahub/ds-pipelines-launcher:pr-229"
  persistenceAgent:
    image: "quay.io/opendatahub/ds-pipelines-persistenceagent:pr-229"
  scheduledWorkflow:
    image: "quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-229"
  mlmd:
    deploy: true  # Optional component
    grpc:
      image: "quay.io/opendatahub/mlmd-grpc-server:latest"
    envoy:
      image: "registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2"
  mlpipelineUI:
    deploy: true  # Optional component
    image: "quay.io/opendatahub/ds-pipelines-frontend:pr-229"
  objectStorage:
    minio:
      deploy: true
      image: 'quay.io/opendatahub/minio:RELEASE.2019-08-14T20-37-41Z-license-compliance'
```

Then run the following:

```shell
cd $(mktemp -d)
git clone git@github.com:opendatahub-io/data-science-pipelines.git
cd data-science-pipelines/
git fetch origin pull/229/head
git checkout -b pullrequest 5cb54cdf0036ca5c8a1a8306426f13c82a37b3c0
oc apply -f dspa.pr-229.yaml
```

More instructions here on how to deploy and test a Data Science Pipelines Application.
…Endpoint() in backend/src/v2/cacheutils/cache.go

Signed-off-by: alyssacgoins <[email protected]>

Force-pushed: 5cb54cd to b8dbb48 (Compare)
Change to PR detected. A new PR build was completed.

(2 similar comments)
Signed-off-by: alyssacgoins <[email protected]>

Force-pushed: b533134 to 4b58574 (Compare)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/src/v2/cacheutils/cache.go (1)
98-100: Correct the misleading error message.

The error message references "metadata.NewClient()", but this code is actually in the cache client's `NewClient` function. This appears to be a copy-paste artifact from similar code in the metadata client. Apply this diff to fix the error message:

```diff
 if err != nil {
-	return nil, fmt.Errorf("metadata.NewClient() failed: %w", err)
+	return nil, fmt.Errorf("cache.NewClient() failed: %w", err)
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)

- backend/src/v2/cacheutils/cache.go (3 hunks)
- manifests/kustomize/env/cert-manager/base-tls-certs/kfp-api-cert.yaml (1 hunk)
- manifests/kustomize/env/cert-manager/platform-agnostic-standalone-tls/patches/ml-pipeline-apiserver-deployment.yaml (1 hunk)
🧰 Additional context used
📓 Path-based instructions (1)
backend/**/*.go
📄 CodeRabbit inference engine (AGENTS.md)
Run golangci-lint on Go backend code to enforce Go linting rules
Files:
backend/src/v2/cacheutils/cache.go
🧬 Code graph analysis (1)
backend/src/v2/cacheutils/cache.go (1)
backend/src/v2/metadata/client.go (1)
- NewClient (119-144)
🔇 Additional comments (4)
manifests/kustomize/env/cert-manager/base-tls-certs/kfp-api-cert.yaml (1)
11-11: LGTM! DNS name addition supports FQDN service access. Adding the fully-qualified domain name ensures TLS certificate validation works correctly when the service is accessed using its complete DNS name within the cluster.
backend/src/v2/cacheutils/cache.go (2)
10-10: LGTM! Import addition is necessary. The `os` import is required for the environment variable discovery logic in `cacheDefaultEndpoint()`.

90-93: LGTM! Dynamic endpoint resolution implemented correctly. The client now discovers the cache endpoint dynamically and logs the connection target, which improves observability and flexibility for different deployment configurations.
manifests/kustomize/env/cert-manager/platform-agnostic-standalone-tls/patches/ml-pipeline-apiserver-deployment.yaml (1)
23-26: No verification issues found; the configuration is correct.

The environment variables properly enable endpoint discovery for the cache client. Both DNS names resolve correctly to the same service:

- Any pod in the cluster can resolve `ml-pipeline.kubeflow` or the full FQDN and obtain that service's ClusterIP.
- The certificate in `kfp-api-cert.yaml` already includes all three DNS name formats (short name, namespace-qualified, and FQDN), so using "ml-pipeline.kubeflow" is supported.
- Port 8887 is correct and verified across the codebase for gRPC.

No hostname format consistency issue exists. The deployment correctly uses one of the certificate's supported DNS names.
```go
func cacheDefaultEndpoint() string {
	// Discover ml-pipeline in the same namespace by env var.
	// https://kubernetes.io/docs/concepts/services-networking/service/#environment-variables
	cacheHost := os.Getenv("ML_PIPELINE_SERVICE_HOST")
	glog.Infof("cacheHost: %s", cacheHost)
	cachePort := os.Getenv("ML_PIPELINE_SERVICE_PORT_GRPC")
	glog.Infof("cachePort: %s", cachePort)
	if cacheHost != "" && cachePort != "" {
		// If there is a ml-pipeline Kubernetes service in the same namespace,
		// ML_PIPELINE_SERVICE_HOST and ML_PIPELINE_SERVICE_PORT env vars should
		// exist by default, so we use it as default.
		return cacheHost + ":" + cachePort
	}
	// If the env vars do not exist, use default ml-pipeline grpc endpoint `ml-pipeline.kubeflow:8887`.
	glog.Infof("Cannot detect ml-pipeline in the same namespace, default to %s as KFP endpoint.", defaultKfpApiEndpoint)
	return defaultKfpApiEndpoint
}
```
🧩 Analysis chain
Verify the implementation and consider minor refinements.

The endpoint discovery logic is functionally correct. However, consider the following:

- The comment on lines 115-117 mentions "in the same namespace," but these environment variables are set explicitly in the deployment (not automatically by Kubernetes). The phrasing might mislead future maintainers.
- The logging is quite verbose, with separate Info logs for host, port, and fallback. While helpful for debugging, you might consider consolidating these logs or using a debug level for the individual host/port logs.

Run the following verification to confirm the gRPC port value and environment variable usage pattern:
🏁 Script executed:

```shell
#!/bin/bash
# Verify the gRPC port configuration across the codebase
echo "=== Checking gRPC port references ==="
rg -n "8887" --type yaml --type go -C 2
echo ""
echo "=== Checking service definitions for ml-pipeline ==="
fd -e yaml -e yml --exec rg -l "kind: Service" | xargs rg -A 10 "name: ml-pipeline"
echo ""
echo "=== Checking for other ML_PIPELINE_SERVICE environment variables ==="
rg -n "ML_PIPELINE_SERVICE" --type yaml -C 2
```

Length of output: 74906
Clarify the endpoint discovery comment to reflect explicit environment variable injection.
The gRPC port value (8887) is correctly configured across the codebase. However, the comment on lines 115-117 is misleading—it suggests these environment variables are automatically discovered by Kubernetes service discovery, but the deployment manifests show ML_PIPELINE_SERVICE_HOST and ML_PIPELINE_SERVICE_PORT_GRPC are explicitly injected as environment variables, not auto-populated based on in-cluster service discovery.
Update the comment to clarify that these variables are explicitly configured in the deployment, not automatically set by Kubernetes.
Regarding logging: the three separate Info log statements (host, port, fallback) are verbose but may be intentional for debugging. If consolidation is preferred, consider reducing to a single structured log entry.
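If consolidation is the route taken, a single structured entry could replace the three separate calls. This is a minimal standalone sketch only, using the stdlib `log` package rather than glog; `resolveAndLog` and its parameters are hypothetical names, not code from cache.go:

```go
package main

import "log"

// resolveAndLog folds the three separate Info logs (host, port, fallback)
// into one entry recording host, port, the chosen source, and the result.
func resolveAndLog(host, port, fallback string) string {
	endpoint, source := fallback, "fallback"
	if host != "" && port != "" {
		endpoint, source = host+":"+port, "env"
	}
	// One log line instead of three.
	log.Printf("cache endpoint resolved: host=%q port=%q source=%s endpoint=%s",
		host, port, source, endpoint)
	return endpoint
}

func main() {
	resolveAndLog("", "", "ml-pipeline.kubeflow:8887")
	resolveAndLog("ml-pipeline.kubeflow", "8887", "ml-pipeline.kubeflow:8887")
}
```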
🤖 Prompt for AI Agents

```
In backend/src/v2/cacheutils/cache.go around lines 107 to 123, update the
misleading comment that says the ML_PIPELINE_SERVICE_HOST and
ML_PIPELINE_SERVICE_PORT_GRPC env vars are discovered automatically by
Kubernetes; replace it with a brief note that these variables are explicitly
injected via the deployment manifests (not auto-populated by in-cluster service
discovery). Also tidy the logging: either keep a single debug/info log that
reports the discovered host and port and the fallback to defaultKfpApiEndpoint,
or consolidate the three separate glog.Infof calls into one structured log entry
showing host, port and which endpoint will be used.
```
Change to PR detected. A new PR build was completed.
```yaml
dnsNames:
  - ml-pipeline
  - ml-pipeline.kubeflow
  - ml-pipeline.kubeflow.svc.cluster.local
```
Can we actually use kustomize replacements to substitute the actual namespace for the hardcoded one? Example here:
https://github.com/nsingla/kfp_pipelines/blob/master/.github/resources/manifests/standalone/default/kustomization.yaml#L39
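As a rough, untested sketch of that suggestion (resource names and field paths here are illustrative, adapted from kustomize's `replacements` transformer, not taken from these manifests), the namespace segment of the hard-coded value could be swapped in like this:

```yaml
# Illustrative only: copy the rendered namespace into the second
# dot-separated segment of "ml-pipeline.kubeflow".
replacements:
  - source:
      kind: Deployment
      name: ml-pipeline
      fieldPath: metadata.namespace
    targets:
      - select:
          kind: Deployment
          name: ml-pipeline
        fieldPaths:
          - spec.template.spec.containers.[name=ml-pipeline-api-server].env.[name=ML_PIPELINE_SERVICE_HOST].value
        options:
          delimiter: "."
          index: 1
```

The `delimiter`/`index` options let the replacement target only the namespace segment of the dotted hostname rather than the whole value.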
```diff
   value: "true"
 - name: ML_PIPELINE_SERVICE_HOST
-  value: "ml-pipeline.kubeflow.svc.cluster.local"
+  value: "ml-pipeline.kubeflow"
```
Same here: can we avoid hard-coding the namespace?
[APPROVALNOTIFIER] This PR is NOT APPROVED.

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files.

Approvers can indicate their approval by writing
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/src/v2/cacheutils/cache.go (1)
108-124: Unused function: endpoint discovery logic is dead code.

The `cacheDefaultEndpoint()` function is defined but never called. Line 91 constructs the endpoint manually instead of using this function, making this entire endpoint discovery implementation dead code.

Once the critical issue on line 91 is fixed to call this function, the concerns from the previous review will apply:

- The comment on lines 116-118 is misleading about environment variables being "discovered automatically by Kubernetes" when they're explicitly injected via deployment manifests (as noted in past reviews).
- The logging is verbose, with three separate `glog.Infof` calls (lines 112, 114, 122).

For now, fixing line 91 to call this function will activate this code. Then consider addressing the comment clarity and log consolidation as suggested in the previous review.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)

- backend/src/v2/cacheutils/cache.go (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
backend/**/*.go
📄 CodeRabbit inference engine (AGENTS.md)
Run golangci-lint on Go backend code to enforce Go linting rules
Files:
backend/src/v2/cacheutils/cache.go
🧬 Code graph analysis (1)
backend/src/v2/cacheutils/cache.go (1)
backend/src/apiserver/common/config.go (1)
- GetPodNamespace (114-116)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (44)
- GitHub Check: build / image-build-with-cache (apiserver, backend/Dockerfile, .)
- GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
- GitHub Check: build / image-build-with-cache (driver, backend/Dockerfile.driver, .)
- GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
- GitHub Check: build / image-build-with-cache (launcher, backend/Dockerfile.launcher, .)
- GitHub Check: build / image-build-with-cache (scheduledworkflow, backend/Dockerfile.scheduledworkflow, .)

(each of the six checks above appears seven times across the build matrix)
- GitHub Check: backend-tests
- GitHub Check: check_ci_status
backend/src/v2/cacheutils/cache.go (Outdated)

```diff
 	cacheEndPoint := "ml-pipeline" + common.GetPodNamespace() + "8887"
 	glog.Infof("Connecting to cache endpoint %s", cacheEndPoint)
 	conn, err := grpc.NewClient(
-		defaultKfpApiEndpoint,
+		cacheEndPoint,
```
Critical: Malformed endpoint string breaks cache connection.

Line 91 constructs the cache endpoint with incorrect string concatenation, missing the required separators:

- Missing `.` between service name and namespace
- Missing `:` between host and port

For example, if the namespace is kubeflow, this produces `ml-pipelinekubeflow8887` instead of `ml-pipeline.kubeflow:8887`, causing the gRPC client connection to fail.

Additionally, the properly implemented `cacheDefaultEndpoint()` function (lines 108-124) exists but is never called. It should be used here instead of manual string construction.
Apply this diff to fix the endpoint construction:

```diff
-	cacheEndPoint := "ml-pipeline" + common.GetPodNamespace() + "8887"
+	cacheEndPoint := cacheDefaultEndpoint()
 	glog.Infof("Connecting to cache endpoint %s", cacheEndPoint)
 	conn, err := grpc.NewClient(
-		defaultKfpApiEndpoint,
 		cacheEndPoint,
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
🤖 Prompt for AI Agents

```
In backend/src/v2/cacheutils/cache.go around lines 91 to 94, the cache endpoint
is built via incorrect string concatenation (missing '.' between service and
namespace and ':' before port) which yields e.g. "ml-pipelinekubeflow8887" and
breaks the gRPC connection; replace the manual construction with a call to the
existing cacheDefaultEndpoint() helper (or otherwise build
"ml-pipeline.<namespace>:8887") and pass that value to grpc.NewClient, and
update the log to print the corrected endpoint value returned by
cacheDefaultEndpoint().
```
Change to PR detected. A new PR build was completed.
Signed-off-by: alyssacgoins <[email protected]>
Force-pushed: cb92bc3 to 93c2e76 (Compare)
Change to PR detected. A new PR build was completed.
Closed in favor of kubeflow#12463
…a2159facde391da1 on cacheDefaultEndpoint() in backend/src/v2/cacheutils/cache.go.