
Commit 4a3cfed

update agent evaluation
1 parent 8d2a4ee commit 4a3cfed

17 files changed: +5020, -4481 lines


README.md

Lines changed: 23 additions & 22 deletions
@@ -9,30 +9,36 @@

 <br />
 <div align="center">
-    <img src="docs/teaser/teaser.png" alt="Logo">
+    <div style="display: flex; justify-content: center; align-items: center; gap: 10px;">
+        <img src="./docs/teaser/teaser2.png" alt="Teaser Image" style="max-width: auto; height: 220px;">
+        <img src="./docs/teaser/teaser.png" alt="Teaser Image" style="max-width: auto; height: 220px;">
+    </div>

-<h1 align="center">VideoGen-Eval 1.0</h1>
+<h1 align="center">VideoGen-Eval: Agent-based System for Video Generation Evaluation</h1>

-#### [<code>Project Page 🚀</code>](https://ailab-cvc.github.io/VideoGen-Eval/) | [<code>Technical Report 📝</code>](http://arxiv.org/abs/2410.05227) | [<code>Prompt 🎬</code>](https://ailab-cvc.github.io/VideoGen-Eval/specifc_model/prompt.html) | [<code>Video Download 🤩</code>](https://drive.google.com/drive/folders/11WxQudsVgqI-ETXQB5PQjd7dzhz41-E0?usp=sharing) | [<code>Join WeChat 💬</code>](https://github.com/AILab-CVC/VideoGen-Eval/blob/main/docs/specifc_model/wechat.md)
+#### [<code>Project Page 🚀</code>](https://ailab-cvc.github.io/VideoGen-Eval/) | [<code>Agent Evaluation Report 📝</code>](http://arxiv.org/abs/2503.23452) | [<code>Survey Report 📝</code>](http://arxiv.org/abs/2410.05227) | [<code>Prompt 🎬</code>](https://ailab-cvc.github.io/VideoGen-Eval/specifc_model/prompt.html) | [<code>Video Download 🤩</code>](https://drive.google.com/drive/folders/11WxQudsVgqI-ETXQB5PQjd7dzhz41-E0?usp=sharing) | [<code>Join WeChat 💬</code>](https://github.com/AILab-CVC/VideoGen-Eval/blob/main/docs/specifc_model/wechat.md)

 <p align="center">
-    To observe and compare the video quality of recent video generative models!
+    <span class="author"><a href="https://yyvhang.github.io/" target="_blank">Yuhang Yang</a><sup>1</sup></span>,
+    <span class="author"><a href="https://scholar.google.com/citations?user=b_5HJmQAAAAJ&hl=zh-CN" target="_blank">Ke Fan</a><sup>2</sup></span>,
+    <span class="author"><a href="" target="_blank">Shangkun Sun</a><sup>3</sup></span>,
+    <span class="author"><a href="https://lihxxx.github.io/" target="_blank">Hongxiang Li</a><sup>3</sup></span>,
+    <span class="author"><a href="https://ailingzeng.site/" target="_blank">Ailing Zeng</a><sup>4,*</sup></span>,
+    <span class="author"><a href="https://feilinh.cn/" target="_blank">Feilin Han</a><sup>5</sup></span>, <br>
+    <span class="author"><a href="https://tiaotiao11-22.github.io/wzhai/" target="_blank">Wei Zhai</a><sup>1,*</sup></span>,
+    <span class="author"><a href="https://scholar.google.com/citations?user=AjxoEpIAAAAJ&hl=en" target="_blank">Wei Liu</a><sup>4</sup></span>,
+    <span class="author"><a href="https://scholar.google.com/citations?user=K7rTHNcAAAAJ&hl=zh-CN" target="_blank">Yang Cao</a><sup>1</sup></span>,
+    <span class="author"><a href="https://scholar.google.fr/citations?user=gDnBC1gAAAAJ&hl=en" target="_blank">Zheng-Jun Zha</a><sup>1</sup></span>
     <br />
-    <a href="https://ailingzeng.site/">Ailing Zeng<sup>1</sup><sup>*</sup></a>
-    ·
-    <a href="https://yyvhang.github.io/">Yuhang Yang<sup>2</sup><sup>*</sup></a>
-    ·
-    <a href="">Weidong Chen<sup>1</sup></a>
-    ·
-    <a href="https://scholar.google.com/citations?user=AjxoEpIAAAAJ&hl=en">Wei Liu<sup>1</sup></a>
-    <br />
-    <p> <sub><sup>1</sup> Tencent AI Lab, <sup>2</sup> USTC. *Equal contribution</sub></p>
+    <p><sup>1</sup>USTC, <sup>2</sup>SJTU, <sup>3</sup>PKUSZ, <sup>4</sup>Tencent, <sup>5</sup>BFA</p>
+    <p><sup>*</sup>Corresponding Author</p>
 </p>
 </div>



 ## 🔥 Project Updates
+- **News**: ```2025/3/31```: We propose an agent-based video generation evaluation that is dynamic, flexible, and evolving.
 - **News**: ```2024/12/21```: We update the results of [Pyramid Flow](https://pyramid-flow.github.io/); please check our website.
 - **News**: ```2024/12/10```: We update the results of [Sora](https://openai.com/sora/) and the comparison results of the latest six SOTA models.
 - **News**: ```2024/12/04```: We update the results of [Hunyuan](https://github.com/Tencent/HunyuanVideo); please check our website.
@@ -64,14 +70,9 @@

 ## 💡 About The Project
 High-quality video generation, such as text-to-video (T2V), image-to-video (I2V), and video-to-video (V2V) generation, holds considerable significance in content creation and world simulation. Models like SORA have advanced video generation toward higher resolution, more natural motion, better vision-language alignment, and increased controllability, particularly for long video sequences. These improvements have been driven by the evolution of model architectures, shifting from UNet to more scalable and parameter-rich DiT models, along with large-scale data expansion and refined training strategies. However, despite the emergence of several DiT-based closed-source and open-source models, a comprehensive investigation into their capabilities and limitations is still lacking. Additionally, existing evaluation metrics often fail to align with human preferences.

-This report v1.0 studies a series of SORA-like T2V, I2V, and V2V models to bridge the gap between academic research and industry practice and provide a deeper analysis of recent video generation advancements. This is achieved by demonstrating and comparing over 8,000 generated video cases from **ten closed-source and several open-source models** (Kling 1.0, Kling 1.5, Gen-3, Luma 1.0, Luma 1.6, Vidu, Qingying, MiniMax Hailuo, Tongyi Wanxiang, Pika 1.5) via our 700 critical prompts. Seeing is believing. We encourage readers to visit our [Website](https://ailab-cvc.github.io/VideoGen-Eval/) to browse these results online. Our study systematically examines four core aspects:
-
-* Impacts on vertical-domain application models, such as human-centric animation and robotics;
-* Key objective capabilities, such as text alignment, motion diversity, composition, stability, etc.;
-* Video generation across ten real-life application scenarios;
-* In-depth discussions on potential usage scenarios and tasks, challenges, and future work.
+This [survey report](http://arxiv.org/abs/2410.05227) studies a series of SORA-like T2V, I2V, and V2V models to bridge the gap between academic research and industry practice and provide a deeper analysis of recent video generation advancements. This is achieved by demonstrating and comparing over 8,000 generated video cases from **ten closed-source and several open-source models** (Kling 1.0, Kling 1.5, Gen-3, Luma 1.0, Luma 1.6, Vidu, Qingying, MiniMax Hailuo, Tongyi Wanxiang, Pika 1.5) via our 700 critical prompts. Seeing is believing. We encourage readers to visit our [Website](https://ailab-cvc.github.io/VideoGen-Eval/) to browse these results online; the four core aspects our study examines are detailed in the report.

+Our agent-based evaluation emphasizes a **flexible, scalable, and evolving** system to keep up with the rapid development of video generation.

 We assign an ID to each case. The input text and the names of input images and videos correspond to this ID. The results generated by different models are named `model_name+id.mp4`. Please refer to the [prompt](https://ailab-cvc.github.io/VideoGen-Eval/specifc_model/prompt.html). All the results are publicly accessible, and we will continuously update them as new models are released and existing ones undergo version updates.

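A note on the naming convention described above: because every case shares a single ID across its input text, images, and videos, and outputs are saved as `model_name+id.mp4`, collecting all model outputs for one case reduces to a filename glob. A minimal sketch, assuming a flat results directory (the project's actual layout may differ):

```python
# Sketch only: gather every model's output for one case ID, assuming a flat
# results directory; the layout here is an assumption, not the repo's spec.
from pathlib import Path

def outputs_for_case(results_dir: str, case_id: str) -> dict[str, Path]:
    """Map model name -> video path for files named model_name+id.mp4."""
    suffix = f"{case_id}.mp4"
    return {
        video.name[: -len(suffix)]: video  # whatever precedes "<id>.mp4" is the model name
        for video in Path(results_dir).glob(f"*{suffix}")
    }

# Illustrative call (paths are hypothetical):
# outputs_for_case("./videos", "0123")
# -> {"kling1.5_": Path("videos/kling1.5_0123.mp4"), ...}
```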
@@ -101,8 +102,8 @@ results = load_prompt.get_prompts([id_list], 'test_model_name')
 ## 🦉 Job List

 - [x] VideoGen-Eval-1.0 released
-- [x] Add results of Seaweed, PixVerse.
-- [ ] Make the arena for video generation models.
+- [x] Add results of multiple models.
+- [ ] Release the agent-based evaluation system.

 <!-- CONTRIBUTING -->
 ## 💗 Contributing
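The last hunk's header shows the README's prompt-loading call, `results = load_prompt.get_prompts([id_list], 'test_model_name')`. The `load_prompt` module itself is not part of this diff, so the following is only a hypothetical sketch of such a helper, assuming prompts are stored as a JSON mapping from case ID to prompt text:

```python
# Hypothetical sketch of a get_prompts helper; the repo's real load_prompt
# module is not shown in this diff and may work differently.
import json

def get_prompts(id_list, model_name, prompt_file="prompts.json"):
    """Return (case_id, prompt, expected output filename) triples.

    Assumes prompt_file maps case IDs to prompt strings and that outputs
    follow the README's model_name+id.mp4 naming convention.
    """
    with open(prompt_file, encoding="utf-8") as f:
        prompts = json.load(f)  # e.g. {"0001": "A corgi surfing at sunset", ...}
    return [
        (case_id, prompts[case_id], f"{model_name}{case_id}.mp4")
        for case_id in id_list
        if case_id in prompts
    ]

# Usage mirroring the README snippet:
# results = get_prompts(["0001", "0002"], "test_model_name")
```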
