
Conversation


@alyssacgoins commented Nov 18, 2025

…a2159facde391da1 on cacheDefaultEndpoint() in backend/src/v2/cacheutils/cache.go.

Description of your changes:

Checklist:

Summary by CodeRabbit

  • Chores
    • Improved service endpoint discovery to support environment-based configuration, enhancing deployment flexibility and connectivity resilience. No user-facing API changes.



coderabbitai bot commented Nov 18, 2025

Walkthrough

Adds environment-based endpoint discovery scaffolding to cache utilities: a new helper cacheDefaultEndpoint() reads ML_PIPELINE_SERVICE_HOST/ML_PIPELINE_SERVICE_PORT_GRPC (or falls back to default) and NewClient computes and logs a resolved cacheEndPoint value.

Changes

Cohort / File(s) Summary
Cache endpoint discovery scaffolding
backend/src/v2/cacheutils/cache.go
Adds cacheDefaultEndpoint() to resolve the ml-pipeline endpoint from ML_PIPELINE_SERVICE_HOST/ML_PIPELINE_SERVICE_PORT_GRPC or fallback. Imports os and common. NewClient computes cacheEndPoint and updates logging to show the discovered endpoint.
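
For orientation, the discovery pattern the summary describes looks roughly like the Go sketch below. The function and environment variable names come from the summary above; the package wrapper, import set, and the default endpoint value are filled in here for illustration only, and the actual implementation is quoted later in the review thread.

package cacheutils

import (
	"os"

	"github.com/golang/glog"
)

// Assumed value for illustration; the real constant is defined elsewhere in cache.go.
const defaultKfpApiEndpoint = "ml-pipeline.kubeflow:8887"

// cacheDefaultEndpoint returns host:port from the ML_PIPELINE_SERVICE_* env vars
// when both are set, and falls back to the package default otherwise.
func cacheDefaultEndpoint() string {
	host := os.Getenv("ML_PIPELINE_SERVICE_HOST")
	port := os.Getenv("ML_PIPELINE_SERVICE_PORT_GRPC")
	if host != "" && port != "" {
		return host + ":" + port
	}
	glog.Infof("ML_PIPELINE_SERVICE_HOST/ML_PIPELINE_SERVICE_PORT_GRPC not set; defaulting to %s", defaultKfpApiEndpoint)
	return defaultKfpApiEndpoint
}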

Sequence Diagram(s)

sequenceDiagram
  participant NewClient
  participant cacheDefaultEndpoint
  participant Env as Environment
  participant Fallback as defaultKfpApiEndpoint

  NewClient->>cacheDefaultEndpoint: request endpoint for namespace
  cacheDefaultEndpoint->>Env: read ML_PIPELINE_SERVICE_HOST, ML_PIPELINE_SERVICE_PORT_GRPC
  alt env vars present
    Env-->>cacheDefaultEndpoint: host + port
    cacheDefaultEndpoint-->>NewClient: resolved host:port
  else env vars missing
    Fallback-->>cacheDefaultEndpoint: defaultKfpApiEndpoint
    cacheDefaultEndpoint-->>NewClient: fallback endpoint
  end
  NewClient->>NewClient: compute cacheEndPoint, log resolved endpoint

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Single-file change with localized logic.
  • Pay attention to:
    • Correct handling and fallback when env vars are partially set.
    • Logging content and any unintended exposure of sensitive host/port.
    • Whether the computed cacheEndPoint is actually used downstream as intended.

Poem

🐰 In burrows of code I softly peep,

I sniff the host and count the port deep,
If envs are shy, I hop to default's tone,
A tiny endpoint finds its home,
Hooray — the cache has found its phone! 📡

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
  • Description check (⚠️ Warning): The description is incomplete; the 'Description of your changes' section is empty with only placeholder text remaining, violating the template requirement for substantive change documentation. Resolution: fill in the 'Description of your changes' section with specific details about what was reverted, why the revert was necessary, and what the expected impact is.
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 50.00%, which is below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
  • Title check (✅ Passed): The title clearly identifies the main change: a partial manual revert of specific changes to cacheDefaultEndpoint() in cache.go, which accurately reflects the changeset.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cb92bc3 and 93c2e76.

📒 Files selected for processing (1)
  • backend/src/v2/cacheutils/cache.go (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
backend/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Run golangci-lint on Go backend code to enforce Go linting rules

Files:

  • backend/src/v2/cacheutils/cache.go
🧬 Code graph analysis (1)
backend/src/v2/cacheutils/cache.go (1)
backend/src/apiserver/common/config.go (1)
  • GetPodNamespace (114-116)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
  • GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
  • GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
  • GitHub Check: build / image-build-with-cache (apiserver, backend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (driver, backend/Dockerfile.driver, .)
  • GitHub Check: build / image-build-with-cache (launcher, backend/Dockerfile.launcher, .)
  • GitHub Check: build / image-build-with-cache (scheduledworkflow, backend/Dockerfile.scheduledworkflow, .)
  • GitHub Check: build / image-build-with-cache (apiserver, backend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (launcher, backend/Dockerfile.launcher, .)
  • GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
  • GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
  • GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (metadata-writer, backend/metadata_writer/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (scheduledworkflow, backend/Dockerfile.scheduledworkflow, .)
  • GitHub Check: build / image-build-with-cache (launcher, backend/Dockerfile.launcher, .)
  • GitHub Check: build / image-build-with-cache (apiserver, backend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (driver, backend/Dockerfile.driver, .)
  • GitHub Check: build / image-build-with-cache (driver, backend/Dockerfile.driver, .)
  • GitHub Check: backend-tests
  • GitHub Check: check_ci_status


@dsp-developers

A set of new images has been built to help with testing out this PR:
API Server: quay.io/opendatahub/ds-pipelines-api-server:pr-229
DSP DRIVER: quay.io/opendatahub/ds-pipelines-driver:pr-229
DSP LAUNCHER: quay.io/opendatahub/ds-pipelines-launcher:pr-229
Persistence Agent: quay.io/opendatahub/ds-pipelines-persistenceagent:pr-229
Scheduled Workflow Manager: quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-229
MLMD Server: quay.io/opendatahub/mlmd-grpc-server:latest
MLMD Envoy Proxy: registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2
UI: quay.io/opendatahub/ds-pipelines-frontend:pr-229
TESTS: quay.io/opendatahub/ds-pipelines-tests:pr-229

@dsp-developers

An OCP cluster where you are logged in as cluster admin is required.

The Data Science Pipelines team recommends testing this using the Data Science Pipelines Operator. Check here for more information on using the DSPO.

To use and deploy a DSP stack with these images (assuming the DSPO is deployed), first save the following YAML to a file named dspa.pr-229.yaml:

apiVersion: datasciencepipelinesapplications.opendatahub.io/v1
kind: DataSciencePipelinesApplication
metadata:
  name: pr-229
spec:
  dspVersion: v2
  apiServer:
    image: "quay.io/opendatahub/ds-pipelines-api-server:pr-229"
    argoDriverImage: "quay.io/opendatahub/ds-pipelines-driver:pr-229"
    argoLauncherImage: "quay.io/opendatahub/ds-pipelines-launcher:pr-229"
  persistenceAgent:
    image: "quay.io/opendatahub/ds-pipelines-persistenceagent:pr-229"
  scheduledWorkflow:
    image: "quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-229"
  mlmd:  
    deploy: true  # Optional component
    grpc:
      image: "quay.io/opendatahub/mlmd-grpc-server:latest"
    envoy:
      image: "registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2"
  mlpipelineUI:
    deploy: true  # Optional component 
    image: "quay.io/opendatahub/ds-pipelines-frontend:pr-229"
  objectStorage:
    minio:
      deploy: true
      image: 'quay.io/opendatahub/minio:RELEASE.2019-08-14T20-37-41Z-license-compliance'

Then run the following:

cd $(mktemp -d)
git clone [email protected]:opendatahub-io/data-science-pipelines.git
cd data-science-pipelines/
git fetch origin pull/229/head
git checkout -b pullrequest 5cb54cdf0036ca5c8a1a8306426f13c82a37b3c0
oc apply -f dspa.pr-229.yaml

More instructions here on how to deploy and test a Data Science Pipelines Application.

…Endpoint() in backend/src/v2/cacheutils/cache.go

Signed-off-by: alyssacgoins <[email protected]>
@alyssacgoins force-pushed the revert-cache-endpoint-change branch from 5cb54cd to b8dbb48 on November 19, 2025 13:52
@dsp-developers

Change to PR detected. A new PR build was completed.
A set of new images has been built to help with testing out this PR:
API Server: quay.io/opendatahub/ds-pipelines-api-server:pr-229
DSP DRIVER: quay.io/opendatahub/ds-pipelines-driver:pr-229
DSP LAUNCHER: quay.io/opendatahub/ds-pipelines-launcher:pr-229
Persistence Agent: quay.io/opendatahub/ds-pipelines-persistenceagent:pr-229
Scheduled Workflow Manager: quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-229
MLMD Server: quay.io/opendatahub/mlmd-grpc-server:latest
MLMD Envoy Proxy: registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2
UI: quay.io/opendatahub/ds-pipelines-frontend:pr-229
TESTS: quay.io/opendatahub/ds-pipelines-tests:pr-229

2 similar comments

Signed-off-by: alyssacgoins <[email protected]>
@alyssacgoins force-pushed the revert-cache-endpoint-change branch from b533134 to 4b58574 on November 19, 2025 18:22
@alyssacgoins marked this pull request as ready for review on November 19, 2025 18:36
@openshift-ci bot requested review from gmfrasca and mprahl on November 19, 2025 18:37

@coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/src/v2/cacheutils/cache.go (1)

98-100: Correct the misleading error message.

The error message references "metadata.NewClient()" but this is actually in the cache client's NewClient function. This appears to be a copy-paste artifact from similar code in the metadata client.

Apply this diff to fix the error message:

 	if err != nil {
-		return nil, fmt.Errorf("metadata.NewClient() failed: %w", err)
+		return nil, fmt.Errorf("cache.NewClient() failed: %w", err)
 	}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 294a328 and 4b58574.

📒 Files selected for processing (3)
  • backend/src/v2/cacheutils/cache.go (3 hunks)
  • manifests/kustomize/env/cert-manager/base-tls-certs/kfp-api-cert.yaml (1 hunks)
  • manifests/kustomize/env/cert-manager/platform-agnostic-standalone-tls/patches/ml-pipeline-apiserver-deployment.yaml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
backend/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Run golangci-lint on Go backend code to enforce Go linting rules

Files:

  • backend/src/v2/cacheutils/cache.go
🧬 Code graph analysis (1)
backend/src/v2/cacheutils/cache.go (1)
backend/src/v2/metadata/client.go (1)
  • NewClient (119-144)
🔇 Additional comments (4)
manifests/kustomize/env/cert-manager/base-tls-certs/kfp-api-cert.yaml (1)

11-11: LGTM! DNS name addition supports FQDN service access.

Adding the fully-qualified domain name ensures TLS certificate validation works correctly when the service is accessed using its complete DNS name within the cluster.

backend/src/v2/cacheutils/cache.go (2)

10-10: LGTM! Import addition is necessary.

The os import is required for the environment variable discovery logic in cacheDefaultEndpoint().


90-93: LGTM! Dynamic endpoint resolution implemented correctly.

The client now discovers the cache endpoint dynamically and logs the connection target, which improves observability and flexibility for different deployment configurations.

manifests/kustomize/env/cert-manager/platform-agnostic-standalone-tls/patches/ml-pipeline-apiserver-deployment.yaml (1)

23-26: No verification issues found; the configuration is correct.

The environment variables properly enable endpoint discovery for the cache client. Both DNS names resolve correctly to the same service:

  • Any pod in the cluster can resolve ml-pipeline.kubeflow or the full FQDN and obtain that service's ClusterIP
  • The certificate in kfp-api-cert.yaml already includes all three DNS name formats (short name, namespace-qualified, and FQDN), so using "ml-pipeline.kubeflow" is supported
  • Port 8887 is correct and verified across the codebase for gRPC

No hostname format consistency issue exists. The deployment correctly uses one of the certificate's supported DNS names.

Comment on lines +107 to +123
func cacheDefaultEndpoint() string {
	// Discover ml-pipeline in the same namespace by env var.
	// https://kubernetes.io/docs/concepts/services-networking/service/#environment-variables
	cacheHost := os.Getenv("ML_PIPELINE_SERVICE_HOST")
	glog.Infof("cacheHost: %s", cacheHost)
	cachePort := os.Getenv("ML_PIPELINE_SERVICE_PORT_GRPC")
	glog.Infof("cachePort: %s", cachePort)
	if cacheHost != "" && cachePort != "" {
		// If there is a ml-pipeline Kubernetes service in the same namespace,
		// ML_PIPELINE_SERVICE_HOST and ML_PIPELINE_SERVICE_PORT env vars should
		// exist by default, so we use it as default.
		return cacheHost + ":" + cachePort
	}
	// If the env vars do not exist, use default ml-pipeline grpc endpoint `ml-pipeline.kubeflow:8887`.
	glog.Infof("Cannot detect ml-pipeline in the same namespace, default to %s as KFP endpoint.", defaultKfpApiEndpoint)
	return defaultKfpApiEndpoint
}


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Verify the implementation and consider minor refinements.

The endpoint discovery logic is functionally correct. However, consider the following:

  1. The comment on lines 115-117 mentions "in the same namespace," but these environment variables are set explicitly in the deployment (not automatically by Kubernetes). The phrasing might mislead future maintainers.

  2. The logging is quite verbose with separate Info logs for host, port, and fallback. While helpful for debugging, you might consider consolidating these logs or using a debug level for the individual host/port logs.

Run the following verification to confirm the gRPC port value and environment variable usage pattern:


🏁 Script executed:

#!/bin/bash
# Verify the gRPC port configuration across the codebase

echo "=== Checking gRPC port references ==="
rg -n "8887" --type yaml --type go -C 2

echo ""
echo "=== Checking service definitions for ml-pipeline ==="
fd -e yaml -e yml --exec rg -l "kind: Service" | xargs rg -A 10 "name: ml-pipeline"

echo ""
echo "=== Checking for other ML_PIPELINE_SERVICE environment variables ==="
rg -n "ML_PIPELINE_SERVICE" --type yaml -C 2

Length of output: 74906


Clarify the endpoint discovery comment to reflect explicit environment variable injection.

The gRPC port value (8887) is correctly configured across the codebase. However, the comment on lines 115-117 is misleading—it suggests these environment variables are automatically discovered by Kubernetes service discovery, but the deployment manifests show ML_PIPELINE_SERVICE_HOST and ML_PIPELINE_SERVICE_PORT_GRPC are explicitly injected as environment variables, not auto-populated based on in-cluster service discovery.

Update the comment to clarify that these variables are explicitly configured in the deployment, not automatically set by Kubernetes.

Regarding logging: the three separate Info log statements (host, port, fallback) are verbose but may be intentional for debugging. If consolidation is preferred, consider reducing to a single structured log entry.

🤖 Prompt for AI Agents
In backend/src/v2/cacheutils/cache.go around lines 107 to 123, update the
misleading comment that says the ML_PIPELINE_SERVICE_HOST and
ML_PIPELINE_SERVICE_PORT_GRPC env vars are discovered automatically by
Kubernetes; replace it with a brief note that these variables are explicitly
injected via the deployment manifests (not auto-populated by in-cluster service
discovery). Also tidy the logging: either keep a single debug/info log that
reports the discovered host and port and the fallback to defaultKfpApiEndpoint,
or consolidate the three separate glog.Infof calls into one structured log entry
showing host, port and which endpoint will be used.
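
If consolidation is preferred, one possible shape is a single Infof per outcome. The following is a sketch of that suggestion only, not code from this PR; it reuses the env var names and the defaultKfpApiEndpoint constant from the function quoted above.

func cacheDefaultEndpoint() string {
	// These variables are injected explicitly by the deployment manifests,
	// not auto-populated by in-cluster Kubernetes service discovery.
	cacheHost := os.Getenv("ML_PIPELINE_SERVICE_HOST")
	cachePort := os.Getenv("ML_PIPELINE_SERVICE_PORT_GRPC")
	if cacheHost != "" && cachePort != "" {
		endpoint := cacheHost + ":" + cachePort
		glog.Infof("Resolved KFP endpoint from env (host=%q, port=%q): %s", cacheHost, cachePort, endpoint)
		return endpoint
	}
	glog.Infof("ML_PIPELINE_SERVICE_HOST/ML_PIPELINE_SERVICE_PORT_GRPC unset; defaulting KFP endpoint to %s", defaultKfpApiEndpoint)
	return defaultKfpApiEndpoint
}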

@dsp-developers

Change to PR detected. A new PR build was completed.
A set of new images has been built to help with testing out this PR:
API Server: quay.io/opendatahub/ds-pipelines-api-server:pr-229
DSP DRIVER: quay.io/opendatahub/ds-pipelines-driver:pr-229
DSP LAUNCHER: quay.io/opendatahub/ds-pipelines-launcher:pr-229
Persistence Agent: quay.io/opendatahub/ds-pipelines-persistenceagent:pr-229
Scheduled Workflow Manager: quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-229
MLMD Server: quay.io/opendatahub/mlmd-grpc-server:latest
MLMD Envoy Proxy: registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2
UI: quay.io/opendatahub/ds-pipelines-frontend:pr-229
TESTS: quay.io/opendatahub/ds-pipelines-tests:pr-229

dnsNames:
- ml-pipeline
- ml-pipeline.kubeflow
- ml-pipeline.kubeflow.svc.cluster.local


can we actually use kustomize replacements to replace the hardcoded namespace with the actual namespace, example here:
https://github.com/nsingla/kfp_pipelines/blob/master/.github/resources/manifests/standalone/default/kustomization.yaml#L39

value: "true"
- name: ML_PIPELINE_SERVICE_HOST
value: "ml-pipeline.kubeflow.svc.cluster.local"
value: "ml-pipeline.kubeflow"


Same here: can we avoid hard-coding the namespace here?


openshift-ci bot commented Nov 19, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign hbelmiro for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
backend/src/v2/cacheutils/cache.go (1)

108-124: Unused function: endpoint discovery logic is dead code.

The cacheDefaultEndpoint() function is defined but never called. Line 91 constructs the endpoint manually instead of using this function, making this entire endpoint discovery implementation dead code.

Once the critical issue on line 91 is fixed to call this function, the concerns from the previous review will apply:

  • The comment on lines 116-118 is misleading about environment variables being "discovered automatically by Kubernetes" when they're explicitly injected via deployment manifests (as noted in past reviews)
  • The logging is verbose with three separate glog.Infof calls (lines 112, 114, 122)

For now, fixing line 91 to call this function will activate this code. Then consider addressing the comment clarity and log consolidation as suggested in the previous review.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4b58574 and cb92bc3.

📒 Files selected for processing (1)
  • backend/src/v2/cacheutils/cache.go (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
backend/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Run golangci-lint on Go backend code to enforce Go linting rules

Files:

  • backend/src/v2/cacheutils/cache.go
🧬 Code graph analysis (1)
backend/src/v2/cacheutils/cache.go (1)
backend/src/apiserver/common/config.go (1)
  • GetPodNamespace (114-116)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (44)
  • GitHub Check: build / image-build-with-cache (apiserver, backend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
  • GitHub Check: build / image-build-with-cache (driver, backend/Dockerfile.driver, .)
  • GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (launcher, backend/Dockerfile.launcher, .)
  • GitHub Check: build / image-build-with-cache (driver, backend/Dockerfile.driver, .)
  • GitHub Check: build / image-build-with-cache (scheduledworkflow, backend/Dockerfile.scheduledworkflow, .)
  • GitHub Check: build / image-build-with-cache (launcher, backend/Dockerfile.launcher, .)
  • GitHub Check: build / image-build-with-cache (apiserver, backend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (scheduledworkflow, backend/Dockerfile.scheduledworkflow, .)
  • GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
  • GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
  • GitHub Check: build / image-build-with-cache (scheduledworkflow, backend/Dockerfile.scheduledworkflow, .)
  • GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (driver, backend/Dockerfile.driver, .)
  • GitHub Check: build / image-build-with-cache (launcher, backend/Dockerfile.launcher, .)
  • GitHub Check: build / image-build-with-cache (apiserver, backend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (launcher, backend/Dockerfile.launcher, .)
  • GitHub Check: build / image-build-with-cache (apiserver, backend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (scheduledworkflow, backend/Dockerfile.scheduledworkflow, .)
  • GitHub Check: build / image-build-with-cache (driver, backend/Dockerfile.driver, .)
  • GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
  • GitHub Check: build / image-build-with-cache (apiserver, backend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (scheduledworkflow, backend/Dockerfile.scheduledworkflow, .)
  • GitHub Check: build / image-build-with-cache (driver, backend/Dockerfile.driver, .)
  • GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (launcher, backend/Dockerfile.launcher, .)
  • GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
  • GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
  • GitHub Check: build / image-build-with-cache (launcher, backend/Dockerfile.launcher, .)
  • GitHub Check: build / image-build-with-cache (apiserver, backend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (driver, backend/Dockerfile.driver, .)
  • GitHub Check: build / image-build-with-cache (scheduledworkflow, backend/Dockerfile.scheduledworkflow, .)
  • GitHub Check: build / image-build-with-cache (scheduledworkflow, backend/Dockerfile.scheduledworkflow, .)
  • GitHub Check: build / image-build-with-cache (driver, backend/Dockerfile.driver, .)
  • GitHub Check: build / image-build-with-cache (frontend, frontend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (apiserver, backend/Dockerfile, .)
  • GitHub Check: build / image-build-with-cache (launcher, backend/Dockerfile.launcher, .)
  • GitHub Check: build / image-build-with-cache (persistenceagent, backend/Dockerfile.persistenceagent, .)
  • GitHub Check: backend-tests
  • GitHub Check: check_ci_status

Comment on lines 91 to 94

 	cacheEndPoint := "ml-pipeline" + common.GetPodNamespace() + "8887"
 	glog.Infof("Connecting to cache endpoint %s", cacheEndPoint)
 	conn, err := grpc.NewClient(
-		defaultKfpApiEndpoint,
+		cacheEndPoint,


⚠️ Potential issue | 🔴 Critical

Critical: Malformed endpoint string breaks cache connection.

Line 91 constructs the cache endpoint with incorrect string concatenation, missing the required separators:

  • Missing . between service name and namespace
  • Missing : between host and port

For example, if the namespace is kubeflow, this produces ml-pipelinekubeflow8887 instead of ml-pipeline.kubeflow:8887, causing the gRPC client connection to fail.

Additionally, the properly implemented cacheDefaultEndpoint() function (lines 108-124) exists but is never called. It should be used here instead of manual string construction.

Apply this diff to fix the endpoint construction:

-	cacheEndPoint := "ml-pipeline" + common.GetPodNamespace() + "8887"
+	cacheEndPoint := cacheDefaultEndpoint()
 	glog.Infof("Connecting to cache endpoint %s", cacheEndPoint)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-	cacheEndPoint := "ml-pipeline" + common.GetPodNamespace() + "8887"
-	glog.Infof("Connecting to cache endpoint %s", cacheEndPoint)
-	conn, err := grpc.NewClient(
-		defaultKfpApiEndpoint,
-		cacheEndPoint,
+	cacheEndPoint := cacheDefaultEndpoint()
+	glog.Infof("Connecting to cache endpoint %s", cacheEndPoint)
+	conn, err := grpc.NewClient(
+		cacheEndPoint,
🤖 Prompt for AI Agents
In backend/src/v2/cacheutils/cache.go around lines 91 to 94, the cache endpoint
is built via incorrect string concatenation (missing '.' between service and
namespace and ':' before port) which yields e.g. "ml-pipelinekubeflow8887" and
breaks the gRPC connection; replace the manual construction with a call to the
existing cacheDefaultEndpoint() helper (or otherwise build
"ml-pipeline.<namespace>:8887") and pass that value to grpc.NewClient, and
update the log to print the corrected endpoint value returned by
cacheDefaultEndpoint().

@dsp-developers

Change to PR detected. A new PR build was completed.
A set of new images has been built to help with testing out this PR:
API Server: quay.io/opendatahub/ds-pipelines-api-server:pr-229
DSP DRIVER: quay.io/opendatahub/ds-pipelines-driver:pr-229
DSP LAUNCHER: quay.io/opendatahub/ds-pipelines-launcher:pr-229
Persistence Agent: quay.io/opendatahub/ds-pipelines-persistenceagent:pr-229
Scheduled Workflow Manager: quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-229
MLMD Server: quay.io/opendatahub/mlmd-grpc-server:latest
MLMD Envoy Proxy: registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2
UI: quay.io/opendatahub/ds-pipelines-frontend:pr-229
TESTS: quay.io/opendatahub/ds-pipelines-tests:pr-229

@alyssacgoins force-pushed the revert-cache-endpoint-change branch from cb92bc3 to 93c2e76 on November 19, 2025 20:42
@dsp-developers

Change to PR detected. A new PR build was completed.
A set of new images has been built to help with testing out this PR:
API Server: quay.io/opendatahub/ds-pipelines-api-server:pr-229
DSP DRIVER: quay.io/opendatahub/ds-pipelines-driver:pr-229
DSP LAUNCHER: quay.io/opendatahub/ds-pipelines-launcher:pr-229
Persistence Agent: quay.io/opendatahub/ds-pipelines-persistenceagent:pr-229
Scheduled Workflow Manager: quay.io/opendatahub/ds-pipelines-scheduledworkflow:pr-229
MLMD Server: quay.io/opendatahub/mlmd-grpc-server:latest
MLMD Envoy Proxy: registry.redhat.io/openshift-service-mesh/proxyv2-rhel8:2.3.9-2
UI: quay.io/opendatahub/ds-pipelines-frontend:pr-229
TESTS: quay.io/opendatahub/ds-pipelines-tests:pr-229

@alyssacgoins (Author)

Closed in favor of kubeflow#12463
