@vincentkoc (Member) commented Nov 7, 2025

Warning

Large base64 payloads such as video should ideally be moved to a blob store, with appropriate routes, handlers, and retention configured. Inline base64 works for test purposes but could become unstable at large volumes. See the SDK changes in #3998 for how we might use attachments in traces and spans.
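As a rough illustration of the concern above, here is a minimal sketch of a size guard that decides when a base64 payload is too large to keep inline. The names `MAX_INLINE_BYTES`, `decoded_size`, and `should_offload` and the 5 MiB limit are hypothetical and are not part of this PR or the Opik SDK; the sketch also assumes standard padded base64 input.

```python
# Hypothetical limit; pick a value that fits your storage and payload budget.
MAX_INLINE_BYTES = 5 * 1024 * 1024


def decoded_size(b64: str) -> int:
    """Return the decoded byte length of a padded base64 string without decoding it."""
    # Every 4 base64 characters encode 3 bytes; each trailing "=" removes one byte.
    padding = b64[-2:].count("=")
    return len(b64) * 3 // 4 - padding


def should_offload(b64: str, limit: int = MAX_INLINE_BYTES) -> bool:
    """True when the payload exceeds the inline limit and belongs in a blob store."""
    return decoded_size(b64) > limit
```

A caller could run `should_offload(payload)` before attaching a video to a trace and upload to the blob store instead when it returns `True`.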

Note

Additional work might be required to ensure that video output from supported SDKs/LLMs is captured and can be routed to online evaluation. This has not yet been tested (e.g. with Wan 2.5). See #3998 for some support via attachments.

Note

The frontend UI with separate image (+) and video (+) buttons is crowded and should really use a single file (+) button with smart MIME/extension detection. With a URL-based approach, however, the media type might not be apparent, so we might want to improve the UX here later.

Details

Add support for video in online and SDK-based evals (LLM-as-a-judge) and in video datasets (base64, URL), and consolidate the normalizing functions into a single "media" handler that covers both video and image, for simplification.
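To make the "media" handler idea concrete, here is a minimal sketch of a normalizer that accepts either a data URI or a plain URL and classifies it as image or video. It is an illustration under assumptions, not the implementation in this PR: the `Media` type and `normalize_media` function are hypothetical names, and URL detection here relies only on the file extension via the standard library's `mimetypes`.

```python
import mimetypes
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Media:
    kind: str    # "image", "video", or "unknown"
    source: str  # "data_uri" or "url"
    mime: str    # best-effort MIME type, "" when it cannot be determined


def normalize_media(value: str) -> Media:
    """Classify a data URI or URL as image/video (hypothetical helper)."""
    if value.startswith("data:"):
        # Data URI layout: data:[<mime>][;base64],<payload>
        header = value[len("data:"):].split(",", 1)[0]
        mime = header.split(";", 1)[0].lower()
        source = "data_uri"
    else:
        # Guess from the path only, so query strings do not hide the extension.
        path = urlparse(value).path
        mime = mimetypes.guess_type(path)[0] or ""
        source = "url"
    top_level = mime.split("/", 1)[0]
    kind = top_level if top_level in ("image", "video") else "unknown"
    return Media(kind=kind, source=source, mime=mime)
```

A URL without a recognizable extension comes back as `unknown`, which is the case where the UI would need another signal (for example, the user's explicit choice) to pick the media type.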

Change checklist

  • User facing
  • Documentation update

Issues

  • OPIK-2940
  • OPIK-2880

Testing

Testing requires a local inference model; for this I have been using Ollama with a vLLM endpoint. For cloud testing you can tunnel using ngrok. Ollama support was added in ollama/ollama#12962; a local build can be provided if that PR is not merged.

Documentation

Updated

Examples

[Screenshots: 2025-11-06 at 15:13:15 and 2025-11-06 at 15:21:19]

Important

Traces with video attachments (MinIO/S3) are visible to the online evaluation flow but are not exposed in an accessible manner. See the current fields from #3998.

[Screenshot: 2025-11-07 at 17:46:55]

@comet-ml comet-ml deleted a comment from github-actions bot Nov 7, 2025
@github-actions (Contributor)

🔄 Test environment deployment started

Building images for PR #3988...

You can monitor the build progress here.

@CometActions (Collaborator)

Test environment is now available!

Access Information

The deployment has completed successfully and the version has been verified.

@Nimrod007 changed the title from "[OPIK-2940] [FE][BE][DOCS] Video: Add support for Video LLM-as-a-judge, Datasets and Playground" to "[OPIK-2940] [FE][BE][DOCS] Video: Add support for Video LLM-as-a-judge, Datasets and Playground - WIP DO NOT MERGE" on Nov 12, 2025
@vincentkoc added and removed the test-environment label (Deploy Opik adhoc environment) on Nov 12, 2025
…to feat/video-eval

* 'feat/video-eval' of https://github.com/comet-ml/opik:
  Update base version to 1.9.10
  [OPIK-2856] [FE] Hide All time option in metrics tab and support optional date filtering (#4052)
  Update TypeScript SDK version to 1.9.9
8 participants