fix: change kwargs for akera ASR model #40
Conversation
📝 Walkthrough

WhisperInputs in api.py was modified.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 1
🧹 Nitpick comments (2)
api.py (1)
102-106: Tighten input typing/validation for new optional fields.

Use `Optional` consistently and validate `max_length` as positive; prefer `Mapping` for decoder kwargs to avoid unintended mutation. Apply this diff:

```diff
-class WhisperInputs(BaseModel):
+class WhisperInputs(BaseModel):
     audio: str
-    task: typing.Literal["translate", "transcribe"] | None = None
-    language: str | None = None
+    task: typing.Optional[typing.Literal["translate", "transcribe"]] = None
+    language: typing.Optional[str] = None
     return_timestamps: bool = False
-    max_length: int | None = None
-    decoder_kwargs: dict | None = None
+    max_length: typing.Optional[int] = None  # consider enforcing >0 with pydantic Field if desired
+    decoder_kwargs: typing.Optional[typing.Mapping[str, typing.Any]] = None
```

If you want runtime validation for `max_length > 0`, I can add a `Field(ge=1)` with the corresponding import.
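As a minimal sketch of the suggested api.py change (assuming pydantic is available; `Field(ge=1)` rejects zero and negative values while still allowing the field to be omitted or `None`):

```python
import typing
from pydantic import BaseModel, Field

class WhisperInputs(BaseModel):
    audio: str
    task: typing.Optional[typing.Literal["translate", "transcribe"]] = None
    language: typing.Optional[str] = None
    return_timestamps: bool = False
    # ge=1 enforces max_length > 0 at parse time; None stays allowed
    max_length: typing.Optional[int] = Field(default=None, ge=1)
    decoder_kwargs: typing.Optional[typing.Mapping[str, typing.Any]] = None
```

With this, `WhisperInputs(audio="a.wav", max_length=0)` raises a ValidationError instead of passing a nonsensical value through to the decoder.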
scripts/run-dev.sh (1)

22-24: Make env overrides flexible with sane defaults.

Allow overriding via host env while keeping current defaults; safer for local/dev variations.

```diff
-  -e QUEUE_PREFIX="gooey-gpu/short" \
-  -e WHISPER_MODEL_IDS=akera/whisper-large-v3-kik-full_v2 \
-  -e WHISPER_TOKENIZER_FROM=akera/whisper-large-v3-kik-full_v2 \
+  -e QUEUE_PREFIX="${QUEUE_PREFIX:-gooey-gpu/short}" \
+  -e WHISPER_MODEL_IDS="${WHISPER_MODEL_IDS:-akera/whisper-large-v3-kik-full_v2}" \
+  -e WHISPER_TOKENIZER_FROM="${WHISPER_TOKENIZER_FROM:-akera/whisper-large-v3-kik-full_v2}" \
```

Note: `WHISPER_MODEL_IDS` is split on spaces in `setup_queues`, so provide space-separated IDs if passing multiple.
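A hedged sketch of the parsing behavior described above (this is not the actual `setup_queues` code; the function name and default are illustrative): one env var holds space-separated model IDs, and a `${VAR:-default}`-style fallback kicks in when it is unset or empty.

```python
def parse_model_ids(env, default="akera/whisper-large-v3-kik-full_v2"):
    # `or default` also covers the empty-string case, mirroring ${VAR:-default}
    raw = env.get("WHISPER_MODEL_IDS") or default
    # split on whitespace: one entry per model ID, multiple IDs space-separated
    return raw.split()
```

Usage would be `parse_model_ids(os.environ)`; passing `"id1 id2"` yields two IDs, which is why comma-separated values would silently become a single (wrong) ID.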
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- api.py (1 hunks)
- common/whisper.py (1 hunks)
- scripts/run-dev.sh (1 hunks)
@devxpy let me know if this change is okay. I'll add the image & model to the chart once this is OK.
```sh
docker rm -f $IMG || true
docker run \
  -e IMPORTS=$IMPORTS \
  -e WAV2LIP_MODEL_IDS="
```
keep these around? easy to test
Yes, will revert this one. I did have to add WHISPER_TOKENIZER_FROM for this model to work; we don't add that for any other whisper model.
Actionable comments posted: 0
🧹 Nitpick comments (3)
chart/model-values.yaml (3)
202-216: Avoid hardcoding the image tag; use an anchor for maintainability.

You've pinned these entries to gooey-gpu-common:9 while the file defines anchors for other common images. Define a new anchor for :9 and reference it here to ease future bumps.

Apply within this hunk:

```diff
-  image: "crgooeyprodwestus1.azurecr.io/gooey-gpu-common:9"
+  image: *commonImg9
```

And add (outside this hunk, near existing image anchors):

```yaml
commonImg9: &commonImg9 "crgooeyprodwestus1.azurecr.io/gooey-gpu-common:9"
```
217-231: Mirror the image anchor change for the long variant.

Keep both akera deployments pointing to the same image via the new anchor.

```diff
-  image: "crgooeyprodwestus1.azurecr.io/gooey-gpu-common:9"
+  image: *commonImg9
```
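Taken together, the two anchor suggestions would look roughly like this; the anchor is defined once and both akera deployments alias it. Deployment names and surrounding keys below are placeholders, not the chart's actual values:

```yaml
# Sketch only: &commonImg9 defines the anchor, *commonImg9 reuses it,
# so bumping the tag later is a one-line change.
commonImg9: &commonImg9 "crgooeyprodwestus1.azurecr.io/gooey-gpu-common:9"

deployments:
  - name: "akera-whisper-short"   # placeholder name
    image: *commonImg9
  - name: "akera-whisper-long"    # placeholder name
    image: *commonImg9
```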
217-231: Capacity guardrail: set an explicit maxReplicaCount if bursts are expected.

Most deployments inherit the global maxReplicaCount: 3. If akera long jobs are heavy, consider an explicit cap to avoid queue buildup or GPU contention.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- chart/model-values.yaml (1 hunks)
🔇 Additional comments (2)
chart/model-values.yaml (2)
202-231: Right-size resources for the KIK model; confirm 5Gi/13Gi and 18Gi/50Gi are sufficient.

Given large-v3 variants can be memory-sensitive, please confirm these limits match runtime peaks; otherwise we'll see OOMs or throttling.
You can validate by checking recent pod OOM events and GPU memory telemetry in your cluster monitoring.
202-216: WHISPER_TOKENIZER_FROM is consumed by common.whisper. The env var is read in common/whisper.py to configure the tokenizer, so it should be retained.
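The lookup described above might be sketched as follows; the function name and the fallback-to-model-id behavior are assumptions for illustration, not the actual common/whisper.py code:

```python
import os

def resolve_tokenizer_id(model_id, env=os.environ):
    # Hypothetical sketch: prefer WHISPER_TOKENIZER_FROM when set,
    # otherwise load the tokenizer from the model repo itself.
    return env.get("WHISPER_TOKENIZER_FROM") or model_id
```

The resolved ID would then feed something like `AutoTokenizer.from_pretrained(...)`, which is why removing the env var from the chart would break models whose tokenizer lives in a different repo.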
Legal Boilerplate
Look, I get it. The entity doing business as “Gooey.AI” and/or “Dara.network” was incorporated in the State of Delaware in 2020 as Dara Network Inc. and is gonna need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Dara Network Inc can use, modify, copy, and redistribute my contributions, under its choice of terms.