diff --git a/README.md b/README.md
index 9c39c88..23f645e 100644
--- a/README.md
+++ b/README.md
@@ -388,6 +388,7 @@ This is the first work to correct hallucination in multimodal large language mod
 ## Multimodal Chain-of-Thought
 | Title | Venue | Date | Code | Demo |
 |:--------|:--------:|:--------:|:--------:|:--------:|
+| ![Star](https://img.shields.io/github/stars/yu-rp/VisualPerceptionToken.svg?style=social&label=Star) <br> [**Introducing Visual Perception Token into Multimodal Large Language Model**](https://arxiv.org/pdf/2502.17425.pdf) <br> | arXiv | 2025-02-24 | [Github](https://github.com/yu-rp/VisualPerceptionToken) | - |
 | ![Star](https://img.shields.io/github/stars/dongyh20/Insight-V.svg?style=social&label=Star) <br> [**Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models**](https://arxiv.org/pdf/2411.14432) <br> | arXiv | 2024-11-21 | [Github](https://github.com/dongyh20/Insight-V) | - |
 | ![Star](https://img.shields.io/github/stars/ggg0919/cantor.svg?style=social&label=Star) <br> [**Cantor: Inspiring Multimodal Chain-of-Thought of MLLM**](https://arxiv.org/pdf/2404.16033.pdf) <br> | arXiv | 2024-04-24 | [Github](https://github.com/ggg0919/cantor) | Local Demo |
 | ![Star](https://img.shields.io/github/stars/deepcs233/Visual-CoT.svg?style=social&label=Star) <br> [**Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models**](https://arxiv.org/pdf/2403.16999.pdf) <br> | arXiv | 2024-03-25 | [Github](https://github.com/deepcs233/Visual-CoT) | Local Demo |