@nikochiko
Member

Legal Boilerplate

Look, I get it. The entity doing business as “Gooey.AI” and/or “Dara.network” was incorporated in the State of Delaware in 2020 as Dara Network Inc. and is gonna need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Dara Network Inc can use, modify, copy, and redistribute my contributions, under its choice of terms.

@coderabbitai

coderabbitai bot commented Aug 28, 2025

📝 Walkthrough

WhisperInputs in api.py was modified: task now accepts Literal["translate","transcribe"] | None (default None), language is now str | None, decoder_kwargs is dict | None and moved after a newly added max_length: int | None = None, and batch_size: int = 16 was added. common/whisper.py now inserts max_length into generate_kwargs when provided. chart/model-values.yaml adds two deployments: common-whisper-akera-kikuyu-short and common-whisper-akera-kikuyu-long with specified images, GPU/memory limits, QUEUE_PREFIX, tokenizer/model IDs, and IMPORTS.
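
The max_length handling described above can be sketched as follows. This is a minimal illustration, not the code from common/whisper.py, which is not shown in this thread: the helper name with_max_length and the choice to copy the dict are assumptions made for the example.

```python
from typing import Any, Dict, Optional


def with_max_length(
    generate_kwargs: Dict[str, Any], max_length: Optional[int]
) -> Dict[str, Any]:
    """Hypothetical helper: add max_length to generate_kwargs only when set."""
    # Copy so the caller's dict is left untouched (an assumption of this sketch).
    merged = dict(generate_kwargs)
    if max_length is not None:
        merged["max_length"] = max_length
    return merged


if __name__ == "__main__":
    print(with_max_length({"task": "transcribe"}, None))
    print(with_max_length({"task": "transcribe"}, 448))
```

Checking `is not None` rather than truthiness keeps the default path (no max_length key at all) distinct from an explicit value, so the model's own generation defaults apply when the field is omitted.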

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • devxpy

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 911468b and 68fe01e.

📒 Files selected for processing (1)
  • chart/model-values.yaml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • chart/model-values.yaml

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
api.py (1)

102-106: Tighten input typing/validation for new optional fields.

Use Optional consistently and validate max_length as positive; prefer Mapping for decoder kwargs to avoid unintended mutation.

Apply this diff:

 class WhisperInputs(BaseModel):
     audio: str
-    task: typing.Literal["translate", "transcribe"] | None = None
-    language: str | None = None
+    task: typing.Optional[typing.Literal["translate", "transcribe"]] = None
+    language: typing.Optional[str] = None
     return_timestamps: bool = False
-    max_length: int | None = None
-    decoder_kwargs: dict | None = None
+    max_length: typing.Optional[int] = None  # consider enforcing >0 with pydantic Field if desired
+    decoder_kwargs: typing.Optional[typing.Mapping[str, typing.Any]] = None

If you want runtime validation for max_length > 0, I can add a Field(ge=1) with the corresponding import.
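
For reference, the Field(ge=1) variant could look roughly like the sketch below. It assumes pydantic is installed and mirrors the field list from the walkthrough; it is not the merged api.py, and the sample values are made up for illustration.

```python
from typing import Any, Literal, Mapping, Optional

from pydantic import BaseModel, Field, ValidationError


class WhisperInputs(BaseModel):
    audio: str
    task: Optional[Literal["translate", "transcribe"]] = None
    language: Optional[str] = None
    return_timestamps: bool = False
    # ge=1 rejects zero and negative values; None is still allowed.
    max_length: Optional[int] = Field(default=None, ge=1)
    decoder_kwargs: Optional[Mapping[str, Any]] = None
    batch_size: int = 16


if __name__ == "__main__":
    try:
        WhisperInputs(audio="sample.wav", max_length=0)
    except ValidationError as exc:
        print("rejected:", exc)
```

With this in place, a bad max_length fails at request parsing instead of surfacing later inside the model's generate call.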

scripts/run-dev.sh (1)

22-24: Make env overrides flexible with sane defaults.

Allow overriding via host env while keeping current defaults; safer for local/dev variations.

-  -e QUEUE_PREFIX="gooey-gpu/short" \
-  -e WHISPER_MODEL_IDS=akera/whisper-large-v3-kik-full_v2 \
-  -e WHISPER_TOKENIZER_FROM=akera/whisper-large-v3-kik-full_v2 \
+  -e QUEUE_PREFIX="${QUEUE_PREFIX:-gooey-gpu/short}" \
+  -e WHISPER_MODEL_IDS="${WHISPER_MODEL_IDS:-akera/whisper-large-v3-kik-full_v2}" \
+  -e WHISPER_TOKENIZER_FROM="${WHISPER_TOKENIZER_FROM:-akera/whisper-large-v3-kik-full_v2}" \

Note: WHISPER_MODEL_IDS is split on spaces in setup_queues, so provide space-separated IDs if passing multiple.
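
As a rough Python-side illustration of the same default-and-split behavior: the function name read_model_ids is hypothetical, and the real setup_queues code is not shown in this thread, so treat this as a sketch of the idea only.

```python
import os
from typing import List, Mapping

DEFAULT_MODEL_ID = "akera/whisper-large-v3-kik-full_v2"


def read_model_ids(env: Mapping[str, str] = os.environ) -> List[str]:
    """Hypothetical reader: fall back to a default, then split on whitespace."""
    # `or` mirrors the shell's ${VAR:-default}: unset and empty both use the default.
    raw = env.get("WHISPER_MODEL_IDS") or DEFAULT_MODEL_ID
    return raw.split()


if __name__ == "__main__":
    print(read_model_ids({"WHISPER_MODEL_IDS": "org/model-a org/model-b"}))
```

Note that str.split() with no argument splits on any run of whitespace, which is slightly more lenient than splitting on a single space.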

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 137307b and 29f61c7.

📒 Files selected for processing (3)
  • api.py (1 hunks)
  • common/whisper.py (1 hunks)
  • scripts/run-dev.sh (1 hunks)

@nikochiko nikochiko marked this pull request as ready for review August 28, 2025 16:25
@nikochiko
Member Author

@devxpy let me know if this change is okay. i'll add the image & model to the chart once this is OK

docker rm -f $IMG || true
docker run \
-e IMPORTS=$IMPORTS \
-e WAV2LIP_MODEL_IDS="
Member

keep these around? easy to test

Member Author

yes, will revert this one. i did have to add WHISPER_TOKENIZER_FROM for this model to work. we don't set that for any other whisper model

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
chart/model-values.yaml (3)

202-216: Avoid hardcoding image tag; use an anchor for maintainability.

You’ve pinned these entries to gooey-gpu-common:9 while the file defines anchors for other common images. Define a new anchor for :9 and reference it here to ease future bumps.

Apply within this hunk:

-    image: "crgooeyprodwestus1.azurecr.io/gooey-gpu-common:9"
+    image: *commonImg9

And add (outside this hunk, near existing image anchors):

commonImg9: &commonImg9 "crgooeyprodwestus1.azurecr.io/gooey-gpu-common:9"

217-231: Mirror the image anchor change for the long variant.

Keep both akera deployments pointing to the same image via the new anchor.

-    image: "crgooeyprodwestus1.azurecr.io/gooey-gpu-common:9"
+    image: *commonImg9

217-231: Capacity guardrail: set explicit maxReplicaCount if burst expected.

Most deployments inherit the global maxReplicaCount: 3. If akera long jobs are heavy, consider an explicit cap to avoid queue buildup or GPU contention.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between e2c487b and 911468b.

📒 Files selected for processing (1)
  • chart/model-values.yaml (1 hunks)
🔇 Additional comments (2)
chart/model-values.yaml (2)

202-231: Right-size resources for KIK model; confirm 5Gi/13Gi and 18Gi/50Gi are sufficient.

Given large-v3 variants can be memory-sensitive, please confirm these limits match runtime peaks; otherwise we’ll see OOMs or throttling.

You can validate by checking recent pod OOM events and GPU memory telemetry in your cluster monitoring.


202-216: WHISPER_TOKENIZER_FROM is consumed by common.whisper. The env var is read in common/whisper.py to configure the tokenizer, so it should be retained.

@nikochiko nikochiko merged commit 265226b into main Sep 3, 2025
6 checks passed
@nikochiko nikochiko deleted the akera-asr branch September 3, 2025 13:01
3 participants