Skip to content

Conversation

@nsingla
Copy link

@nsingla nsingla commented Dec 3, 2025

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive MLflow multitenancy test plan covering P0–P3 priorities with end-to-end test cases and expected outcomes; includes prerequisites, environment/setup and execution steps, data and automation guidance, and detailed scenarios for workspace isolation, authentication, RBAC, permissions (admin/resource), artifact/run operations, proxy/client integration, caching/configuration, namespace filtering, nested runs, artifact URI resolution, and edge-case considerations.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 3, 2025

Walkthrough

Added a new MLflow Multitenancy Test Plan document that defines prioritized (P0–P3) end-to-end test cases, environment/setup requirements, execution steps, automation guidance, and appendices for MLflow with the Kubernetes Workspace Provider, covering authentication, RBAC, workspace/resource permissions, artifacts, caching, proxy/client integration, and edge cases.

Changes

Cohort / File(s) Change Summary
MLflow Multitenancy Test Plan Documentation
proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md
New comprehensive test plan covering P0–P3 priorities, detailed test cases (prerequisites, steps, expected outcomes), environment and infrastructure requirements, test data setup, execution guidance, automation considerations, and an appendix with notes and priorities.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Check clarity and completeness of test cases and expected outcomes
  • Verify environment/setup instructions, RBAC snippets, and example commands
  • Confirm automation guidance, priority ordering, and referenced configurations

Poem

🐇 I hop through test plans with whiskers bright,
Namespaces neat and RBAC set just right.
Artifacts safe, runs kept in line,
Multitenant gardens orderly and fine.
A rabbit cheers — this plan will shine.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is a template with empty/placeholder sections. The 'What changes are proposed in this pull request?' section is completely empty, and only placeholder text is provided throughout. Complete all required sections: fill in the proposed changes describing the multitenancy test plan added, reference any related issues, and indicate the testing approach used (e.g., manual review/verification).
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'adding multi tenancy test plan' is directly related to the changeset, which adds a comprehensive MLflow Multitenancy Test Plan document. It clearly identifies the main change.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ff8e545 and f6db208.

📒 Files selected for processing (1)
  • proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md

157-157: Spaces inside emphasis markers

(MD037, no-space-in-emphasis)


157-157: Spaces inside emphasis markers

(MD037, no-space-in-emphasis)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
  • GitHub Check: pyfunc (3)
  • GitHub Check: pyfunc (4)
  • GitHub Check: pyfunc (2)
  • GitHub Check: evaluate (1)
  • GitHub Check: genai
  • GitHub Check: evaluate (2)
  • GitHub Check: windows (4)
  • GitHub Check: pyfunc (1)
  • GitHub Check: models (1)
  • GitHub Check: windows (3)
  • GitHub Check: windows (1)
  • GitHub Check: models (2)
  • GitHub Check: windows (2)
  • GitHub Check: flavors
  • GitHub Check: python (3)
  • GitHub Check: python (2)
  • GitHub Check: python (1)
  • GitHub Check: java
  • GitHub Check: python-skinny
  • GitHub Check: database
  • GitHub Check: build (skinny)
  • GitHub Check: build (tracing)
  • GitHub Check: build (dev)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md (2)

76-81: Clarify cache verification approach for caching tests.

The caching tests (TC-CACHE-001 to TC-CACHE-003) reference monitoring Kubernetes API calls but don't specify the verification mechanism. For TC-CACHE-001, consider explicitly documenting how to detect cache hits (e.g., via metrics endpoint, logging, or network request inspection). This will help test implementers understand what evidence constitutes a passing test.

Consider adding a subsection under "Authorization Caching" with guidance like:

Verification Approach: Cache hits can be verified by:

  • Querying MLflow metrics/tracing endpoints to confirm reduced API call volume
  • Enabling MLflow debug logging to observe [CACHE HIT] markers
  • Monitoring Kubernetes API server metrics to confirm no additional SelfSubjectAccessReview requests

208-247: Add guidance on test data cleanup/teardown.

The "Test Data Setup" section provides detailed provisioning steps but doesn't include cleanup instructions for post-test teardown. Consider adding a cleanup section to restore the cluster to a clean state (e.g., removing test namespaces, service accounts, and role bindings). This is important for test isolation and preventing namespace resource exhaustion in repeated test cycles.

Add a "Test Data Cleanup" section similar to the provisioning section:

### Test Data Cleanup

After completing all tests, clean up resources:

\`\`\`bash
# Remove role bindings and service accounts
kubectl delete rolebinding -n namespace-A mlflow-reader-binding
kubectl delete rolebinding -n namespace-B mlflow-writer-binding
kubectl delete rolebinding -n default mlflow-multiuser-reader-A
kubectl delete rolebinding -n default mlflow-multiuser-writer-B

kubectl delete serviceaccount mlflow-reader -n namespace-A
kubectl delete serviceaccount mlflow-writer -n namespace-B
kubectl delete serviceaccount mlflow-multiuser -n default

# Remove test namespaces
kubectl delete namespace namespace-A
kubectl delete namespace namespace-B
kubectl delete namespace namespace-C
\`\`\`
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e90a465 and 7d4aaed.

📒 Files selected for processing (1)
  • proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md

[uncategorized] ~42-~42: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...idden error | --- ## P1 - High Priority Tests ### Multi-Workspace Access | Te...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[uncategorized] ~72-~72: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... | --- ## P2 - Medium Priority Tests ### Authorization Caching | Tes...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[uncategorized] ~114-~114: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...nshift-*) filtered out | --- ## P3 - Low Priority Tests ### Non-Kubernetes Deployment |...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[style] ~129-~129: This adverb was used twice in the sentence. Consider removing one of them or replacing them with a synonym.
Context: ... | Workspace name properly encoded/decoded in API calls ...

(ADVERB_REPETITION_PREMIUM)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
  • GitHub Check: windows (3)
  • GitHub Check: windows (1)
  • GitHub Check: windows (4)
  • GitHub Check: pyfunc (4)
  • GitHub Check: pyfunc (2)
  • GitHub Check: windows (2)
  • GitHub Check: pyfunc (3)
  • GitHub Check: pyfunc (1)
  • GitHub Check: flavors
  • GitHub Check: genai
  • GitHub Check: python (1)
  • GitHub Check: evaluate (2)
  • GitHub Check: python (2)
  • GitHub Check: python (3)
  • GitHub Check: evaluate (1)
  • GitHub Check: models (1)
  • GitHub Check: java
  • GitHub Check: models (2)
  • GitHub Check: python-skinny
  • GitHub Check: database
  • GitHub Check: build (dev)
  • GitHub Check: build (tracing)
  • GitHub Check: build (skinny)
  • GitHub Check: typescript-sdk (20)
🔇 Additional comments (1)
proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md (1)

20-69: Well-structured and comprehensive test cases.

The test cases across all priority levels (P0–P3) are well-defined with clear prerequisites, step-by-step instructions, and measurable expected results. The progression from foundational authentication (P0) through complex multi-workspace scenarios (P1–P2) to edge cases (P3) is logical and appropriate. The use of consistent formatting with test IDs, complexity/impact ratings, and prerequisite checklists makes the plan actionable.

Comment on lines 1 to 14
# MLflow Multitenancy Test Plan

## Overview

This test plan covers MLflow Multitenancy functionality using the Kubernetes Workspace Provider. Tests are prioritized based on complexity and impact:

- **P0 (Critical)**: Core security, basic RBAC, authentication - 8 tests
- **P1 (High)**: Multi-workspace scenarios, admin access, resource-level permissions - 11 tests
- **P2 (Medium)**: Caching, configuration, proxies, client integration - 13 tests
- **P3 (Low)**: Non-Kubernetes deployment, edge cases - 7 tests

**Total Test Cases**: 39

---
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

MLflow 3.6 version workspaces feature release status

💡 Result:

Short answer: MLflow 3.6.0 was released, but it does not introduce a new OSS "workspaces" feature — the 3.6 release notes (OpenTelemetry, trace UI, experiment UI, Agent Server, etc.) do not mention a workspace feature. Workspace/Workspace Model Registry functionality is a Databricks product concern (Databricks continues to document workspace vs. Unity Catalog behavior and MLflow 3 compatibility). See MLflow release notes and Databricks docs for details. [1][2][3]

Sources:

  • MLflow 3.6.0 release notes. [1]
  • MLflow GitHub releases (v3.6.0). [2]
  • Databricks documentation on the Workspace Model Registry and MLflow 3 behavior. [3]

Clarify the intended scope: MLflow 3.6 does not include a workspaces feature.

MLflow 3.6.0 was released, but the OSS project does not include a "workspaces" feature. The 3.6 release notes document OpenTelemetry support, trace UI, experiment UI, and Agent Server—not workspaces. Workspace functionality is a Databricks product concern, not part of MLflow's open-source offering.

This test plan cannot be executed as written because:

  • Configuration flags --enable-workspaces and --workspace-store-uri do not exist in MLflow 3.6
  • The kubernetes:// URI scheme for workspace storage is not supported
  • All 39 test cases depend on a feature that does not exist in the referenced version

Clarify whether this test plan is intended for:

  1. A future/planned MLflow version (not yet released)
  2. A Databricks-specific MLflow deployment
  3. A different product or internal fork with custom workspaces support

Alternatively, revise the test plan to align with actual MLflow 3.6 capabilities and document the correct prerequisites, version requirements, and feature assumptions.

🤖 Prompt for AI Agents
In proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md lines
1-14, the test plan assumes a "workspaces" feature, CLI flags
(--enable-workspaces, --workspace-store-uri) and a kubernetes:// URI scheme that
do not exist in MLflow 3.6; either explicitly state the intended target (a
future MLflow release, a Databricks-specific MLflow build, or an internal fork)
or rewrite the plan to match MLflow 3.6: remove workspace-specific tests and
flags, list exact MLflow version and any required custom builds/plugins, replace
workspace scenarios with achievable alternatives (e.g., Kubernetes namespaces +
RBAC, separate tracking servers or model registry multi-tenant patterns), and
update prerequisites and each test case to reference supported features and
configuration steps.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md (1)

1-14: Unresolved critical issue: Test plan assumes MLflow 3.6 workspaces feature that does not exist.

This issue was previously flagged in the code review but remains unaddressed. MLflow 3.6.0 does not include a workspaces feature in the open-source project. The test plan cannot be executed as written because:

  • Configuration flags --enable-workspaces and --workspace-store-uri do not exist in MLflow 3.6
  • The kubernetes:// URI scheme is not supported in MLflow 3.6
  • Workspace APIs (e.g., GET /api/2.0/mlflow/workspaces/list) are not available
  • All 71 test cases depend on non-existent workspace functionality

Clarify whether this test plan targets:

  1. A future/unreleased MLflow version with workspaces support planned
  2. A Databricks-specific MLflow deployment or custom fork
  3. An internal MLflow build with custom extensions

Alternatively, revise the plan to align with MLflow 3.6 capabilities by documenting an achievable multitenancy pattern (e.g., Kubernetes namespaces + RBAC without workspace features, or separate tracking servers).

Also applies to: 141-141, 147-154

🧹 Nitpick comments (1)
proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md (1)

42-42: Apply hyphenation to compound adjectives preceding nouns.

Three section headers use compound adjectives without hyphens. Correct to: "P1 - High-Priority Tests" (line 42), "P2 - Medium-Priority Tests" (line 72), and "P3 - Low-Priority Tests" (line 114).

Also applies to: 72-72, 114-114

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7d4aaed and 3753bc7.

📒 Files selected for processing (1)
  • proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md

[uncategorized] ~42-~42: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...idden error | --- ## P1 - High Priority Tests ### Multi-Workspace Access | Te...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[uncategorized] ~72-~72: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... | --- ## P2 - Medium Priority Tests ### Authorization Caching | Tes...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[uncategorized] ~114-~114: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...nshift-*) filtered out | --- ## P3 - Low Priority Tests ### Non-Kubernetes Deployment |...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[style] ~129-~129: This adverb was used twice in the sentence. Consider removing one of them or replacing them with a synonym.
Context: ... | Workspace name properly encoded/decoded in API calls ...

(ADVERB_REPETITION_PREMIUM)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: evaluate (1)
  • GitHub Check: evaluate (2)
  • GitHub Check: windows (4)
  • GitHub Check: pyfunc (2)
  • GitHub Check: pyfunc (4)
  • GitHub Check: pyfunc (3)
  • GitHub Check: java
  • GitHub Check: windows (1)
  • GitHub Check: flavors
  • GitHub Check: windows (2)
  • GitHub Check: pyfunc (1)
  • GitHub Check: database
  • GitHub Check: windows (3)
  • GitHub Check: models (1)
  • GitHub Check: models (2)
  • GitHub Check: genai
  • GitHub Check: python (3)
  • GitHub Check: python-skinny
  • GitHub Check: python (1)
  • GitHub Check: python (2)
  • GitHub Check: typescript-sdk (20)
  • GitHub Check: typescript-sdk (24)
  • GitHub Check: typescript-sdk (22)
  • GitHub Check: lint-macos
  • GitHub Check: build (dev)
  • GitHub Check: build (skinny)
  • GitHub Check: build (tracing)
  • GitHub Check: core

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md (1)

1-14: Critical blocker: MLflow 3.6 does not include a workspaces feature—entire test plan cannot execute as written.

MLflow 3.6.0 release notes document OpenTelemetry support, trace UI, experiment UI, and Agent Server, but do not mention workspaces. The OSS project does not include multitenancy/workspaces functionality. This means:

  • Configuration flags --enable-workspaces and --workspace-store-uri do not exist in MLflow 3.6
  • The kubernetes:// URI scheme is not supported for workspace storage
  • All 71 test cases depend on a non-existent feature

Before proceeding, clarify the intended scope: Is this test plan for (1) a future/unreleased MLflow version, (2) a Databricks-specific MLflow deployment, or (3) an internal fork with custom workspaces support? Alternatively, revise the test plan to align with actual MLflow 3.6 capabilities (e.g., using Kubernetes namespaces + RBAC patterns without assuming built-in workspace support).

Also applies to: 228-228, 235-241

🧹 Nitpick comments (2)
proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md (2)

68-68: Apply hyphenation to section header compound adjectives.

Per LanguageTool static analysis, compound adjectives modifying nouns should be hyphenated. Consider:

  • Line 68: "High Priority Tests" → "High-Priority Tests"
  • Line 132: "Medium Priority Tests" → "Medium-Priority Tests"
  • Line 194: "Low Priority Tests" → "Low-Priority Tests"

Apply this diff:

-## P1 - High Priority Tests
+## P1 - High-Priority Tests

-## P2 - Medium Priority Tests
+## P2 - Medium-Priority Tests

-## P3 - Low Priority Tests
+## P3 - Low-Priority Tests

Also applies to: 132-132, 194-194


216-216: Simplify redundant adverb use.

Line 216 uses "properly encoded/decoded"—the adverb "properly" is applied to both verbs, which is redundant. Consider rephrasing for clarity:

  • Current: "Workspace name properly encoded/decoded in API calls"
  • Suggested: "Workspace name correctly encoded and decoded in API calls" or "Workspace name handling verified in API calls"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3753bc7 and ff8e545.

📒 Files selected for processing (1)
  • proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
proposals/multi-tenancy-test-plan/mlflow_multitenancy_test_plan.md

[uncategorized] ~68-~68: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...MISSION_DENIED | --- ## P1 - High Priority Tests ### Multi-Workspace Access | Te...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[uncategorized] ~132-~132: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... | --- ## P2 - Medium Priority Tests ### Authorization Caching | Tes...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[uncategorized] ~194-~194: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... (no workspace prefix) | --- ## P3 - Low Priority Tests ### Non-Kubernetes Deployment |...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[style] ~216-~216: This adverb was used twice in the sentence. Consider removing one of them or replacing them with a synonym.
Context: ... | Workspace name properly encoded/decoded in API calls ...

(ADVERB_REPETITION_PREMIUM)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
  • GitHub Check: windows (4)
  • GitHub Check: windows (1)
  • GitHub Check: windows (3)
  • GitHub Check: windows (2)
  • GitHub Check: models (1)
  • GitHub Check: models (2)
  • GitHub Check: pyfunc (4)
  • GitHub Check: pyfunc (3)
  • GitHub Check: pyfunc (2)
  • GitHub Check: genai
  • GitHub Check: pyfunc (1)
  • GitHub Check: flavors
  • GitHub Check: evaluate (2)
  • GitHub Check: java
  • GitHub Check: evaluate (1)
  • GitHub Check: python (2)
  • GitHub Check: python (1)
  • GitHub Check: python (3)
  • GitHub Check: python-skinny
  • GitHub Check: database
  • GitHub Check: typescript-sdk (22)
  • GitHub Check: typescript-sdk (24)
  • GitHub Check: typescript-sdk (20)
  • GitHub Check: build (tracing)
  • GitHub Check: build (dev)
  • GitHub Check: build (skinny)
  • GitHub Check: core

Signed-off-by: Nelesh Singla <[email protected]>
Signed-off-by: Nelesh Singla <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant