@nikochiko
Member

  • feat: update commonImg to version 9
  • feat: add 'WHISPER_TOKENIZER_FROM' and update image version for sunbird model

Legal Boilerplate

Look, I get it. The entity doing business as “Gooey.AI” and/or “Dara.network” was incorporated in the State of Delaware in 2020 as Dara Network Inc. and is gonna need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Dara Network Inc can use, modify, copy, and redistribute my contributions, under its choice of terms.

@coderabbitai

coderabbitai bot commented Sep 10, 2025

📝 Walkthrough

Updates chart/model-values.yaml to bump the commonImg anchor from gooey-gpu-common:8 to :9 and switch several Whisper deployments to use this anchor. Adds WHISPER_TOKENIZER_FROM to sunbird-short. Introduces a new deployment common-whisper-sunbird-long with specified autoscaling, GPU/memory limits, and WHISPER_MODEL_IDS plus tokenizer settings. Renames common-whisper-sunbird-swahili-long to common-whisper-swahili-long, reducing GPU and memory limits and updating WHISPER_MODEL_IDS to remove Sunbird. Some Whisper deployments (akera-kikuyu short/long) now reference *commonImg instead of explicit images.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • feat: add sunbird asr model #34 — Also modifies chart/model-values.yaml for Whisper deployments, touching the same entries (image anchors, model IDs, and sunbird/swahili variants).

Suggested reviewers

  • devxpy

Pre-merge checks (2 passed, 1 warning)

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Title Check | ⚠️ Warning | The current title “fix: languages for sunbird asr model” does not capture the primary changes in this pull request: bumping the commonImg anchor to version 9, standardizing image references across Whisper deployments, and adding a new sunbird-long deployment with the WHISPER_TOKENIZER_FROM environment variable. Instead, it vaguely references “languages” without specifying which deployments or the core image updates, making it misleading for reviewers. A concise title that highlights the image upgrade and tokenizer addition would better reflect the main intent of the changes. | Rename the title to directly reference the key changes, for example “chore: bump gooey-gpu-common image to v9 and add WHISPER_TOKENIZER_FROM for Sunbird deployments”, so it clearly summarizes the image version update and the new tokenizer environment variable.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | The description includes two clear bullet points summarizing the version bump and addition of WHISPER_TOKENIZER_FROM, and it exactly matches the required “### Legal Boilerplate” section from the repository template. All mandated sections are present and correctly formatted.
Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped.

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Up to 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 265226b and bb9a84d.

📒 Files selected for processing (1)
  • chart/model-values.yaml (5 hunks)
🔇 Additional comments (6)
chart/model-values.yaml (6)

217-231: Akera Kikuyu short: anchor swap to commonImg is good.

Keeps image pin via the central anchor; env already includes tokenizer. No issues spotted.


233-245: Akera Kikuyu long: anchor swap consistent; resources unchanged.

Looks fine and aligned with the short variant and other long whisper entries.


159-173: common/whisper.py properly consumes WHISPER_TOKENIZER_FROM for tokenizer selection; no WHISPER_LANGUAGES env required.
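
For readers unfamiliar with the override, here is a minimal sketch of how an environment variable like WHISPER_TOKENIZER_FROM could drive tokenizer selection. This is not the actual code in common/whisper.py: the helper name `resolve_tokenizer_source`, its fallback to the model's own ID, and the model IDs in the usage note are all assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_tokenizer_source(
    model_id: str, env: Optional[Mapping[str, str]] = None
) -> str:
    """Return the model ID whose tokenizer should be loaded.

    Hypothetical helper (not taken from the repository): prefer
    WHISPER_TOKENIZER_FROM when it is set and non-empty, otherwise
    fall back to the model's own ID.
    """
    # Allow a plain dict to be passed in so the logic is easy to test.
    env = os.environ if env is None else env
    override = env.get("WHISPER_TOKENIZER_FROM", "").strip()
    return override or model_id
```

With a placeholder model ID such as "example/fine-tuned-whisper", calling `resolve_tokenizer_source("example/fine-tuned-whisper", {})` returns the model ID unchanged, while setting the variable in the mapping returns the override instead.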


203-216: Confirm resource reduction and standardize image anchor

  • Verify via your metrics pipeline that lowering GPU from 18Gi→8Gi and memory from 50Gi→20Gi does not cause OOMs under peak or large-batch workloads.
  • Use the current image anchor for consistency:
-    image: *commonImgOld
+    image: *commonImg
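
One way to frame the resource check above is to compare an observed peak against the new limit with some safety margin. The sketch below assumes binary-suffixed quantities such as "8Gi" and a hypothetical 10% headroom; `parse_quantity` and `fits_limit` are illustrative names, and the peak values must come from your own metrics, not from this review.

```python
import re

# Binary suffixes as they appear in the limits quoted above (e.g. "18Gi").
_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(quantity: str) -> int:
    """Convert a quantity such as '8Gi' into a byte count."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti)", quantity.strip())
    if match is None:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2)])


def fits_limit(peak: str, limit: str, headroom: float = 0.1) -> bool:
    """True if the observed peak stays within limit * (1 - headroom)."""
    return parse_quantity(peak) <= parse_quantity(limit) * (1 - headroom)
```

For example, a hypothetical observed peak of "7Gi" fits under an "8Gi" limit with 10% headroom, while "7.5Gi" does not.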

40-40: Bump all commonImg references to :9 and verify tag availability
In chart/model-values.yaml, replace every image: *commonImgOld with image: *commonImg so all pods use the new :9 anchor. Confirm crgooeyprodwestus1.azurecr.io/gooey-gpu-common:9 is published in each cluster’s ACR (e.g. via az acr repository show-tags) and that common.whisper runtime deps remain backward-compatible.
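
A quick way to find leftover references is a textual scan of the values file. The sketch below is an assumption-laden illustration, not part of the repository: `find_stale_images` is a hypothetical helper that uses a regular expression instead of a YAML parser, so it will also flag entries that intentionally pin a different image.

```python
import re
from typing import List, Tuple

# Matches lines like "    image: *commonImgOld" (optionally as a list item).
IMAGE_LINE = re.compile(r"^\s*(?:-\s*)?image:\s*(\S+)")


def find_stale_images(
    yaml_text: str, anchor: str = "commonImg"
) -> List[Tuple[int, str]]:
    """Return (line number, value) for image lines not aliasing the anchor."""
    stale = []
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        match = IMAGE_LINE.match(line)
        if match is None:
            continue
        value = match.group(1)
        # Skip anchor definitions ("&name") and the expected alias itself.
        if value.startswith("&") or value == f"*{anchor}":
            continue
        stale.append((lineno, value))
    return stale
```

Running it over the text of chart/model-values.yaml (for instance via `pathlib.Path(...).read_text()`) lists each line that still uses another alias or an explicit image string, which can then be reviewed by hand.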


174-188: Approve Sunbird long deployment configuration. Settings (minReplicaCount=0, 10 Gi GPU / 27 Gi memory) align with the short variant, both entries share the same WHISPER_TOKENIZER_FROM, and only two Sunbird deployments target the model as expected.


@nikochiko nikochiko merged commit 5ed6ae0 into main Sep 10, 2025
6 checks passed