Commit e4040b1

Update README.md
1 parent: f982acc · commit: e4040b1

File tree

1 file changed: +26 -26 lines changed


README.md

Lines changed: 26 additions & 26 deletions
@@ -43,7 +43,32 @@
 * [Model Release] Oct 2022, released implementation of **PNP-VQA** (**EMNLP Findings 2022**, _"Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training"_, by Anthony T.M.H. et al), <br>
 [Paper](https://arxiv.org/abs/2210.08773), [Project Page](https://github.com/salesforce/LAVIS/tree/main/projects/pnp-vqa), [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/salesforce/LAVIS/blob/main/projects/pnp-vqa/pnp_vqa.ipynb))
 > A modular zero-shot VQA framework that requires no PLMs training, achieving SoTA zero-shot VQA performance.
-
+
+## Technical Report and Citing LAVIS
+You can find more details in our [technical report](https://arxiv.org/abs/2209.09019).
+
+**If you're using LAVIS in your research or applications, please cite it using this BibTeX**:
+```bibtex
+@inproceedings{li-etal-2023-lavis,
+    title = "{LAVIS}: A One-stop Library for Language-Vision Intelligence",
+    author = "Li, Dongxu and
+      Li, Junnan and
+      Le, Hung and
+      Wang, Guangsen and
+      Savarese, Silvio and
+      Hoi, Steven C.H.",
+    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)",
+    month = jul,
+    year = "2023",
+    address = "Toronto, Canada",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2023.acl-demo.3",
+    pages = "31--41",
+    abstract = "We introduce LAVIS, an open-source deep learning library for LAnguage-VISion research and applications. LAVIS aims to serve as a one-stop comprehensive library that brings recent advancements in the language-vision field accessible for researchers and practitioners, as well as fertilizing future research and development. It features a unified interface to easily access state-of-the-art image-language, video-language models and common datasets. LAVIS supports training, evaluation and benchmarking on a rich variety of tasks, including multimodal classification, retrieval, captioning, visual question answering, dialogue and pre-training. In the meantime, the library is also highly extensible and configurable, facilitating future development and customization. In this technical report, we describe design principles, key components and functionalities of the library, and also present benchmarking results across common language-vision tasks.",
+}
+```
+
+
 ## Table of Contents
 - [Introduction](#introduction)
 - [Installation](#installation)
@@ -293,31 +318,6 @@ We note that models in LAVIS provide no guarantees on their multimodal abilities
 inappropriate behaviors in the future.
 
 
-## Technical Report and Citing LAVIS
-You can find more details in our [technical report](https://arxiv.org/abs/2209.09019).
-
-If you're using LAVIS in your research or applications, please cite using this BibTeX:
-```bibtex
-@inproceedings{li-etal-2023-lavis,
-    title = "{LAVIS}: A One-stop Library for Language-Vision Intelligence",
-    author = "Li, Dongxu and
-      Li, Junnan and
-      Le, Hung and
-      Wang, Guangsen and
-      Savarese, Silvio and
-      Hoi, Steven C.H.",
-    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)",
-    month = jul,
-    year = "2023",
-    address = "Toronto, Canada",
-    publisher = "Association for Computational Linguistics",
-    url = "https://aclanthology.org/2023.acl-demo.3",
-    pages = "31--41",
-    abstract = "We introduce LAVIS, an open-source deep learning library for LAnguage-VISion research and applications. LAVIS aims to serve as a one-stop comprehensive library that brings recent advancements in the language-vision field accessible for researchers and practitioners, as well as fertilizing future research and development. It features a unified interface to easily access state-of-the-art image-language, video-language models and common datasets. LAVIS supports training, evaluation and benchmarking on a rich variety of tasks, including multimodal classification, retrieval, captioning, visual question answering, dialogue and pre-training. In the meantime, the library is also highly extensible and configurable, facilitating future development and customization. In this technical report, we describe design principles, key components and functionalities of the library, and also present benchmarking results across common language-vision tasks.",
-}
-}
-```
-
 ## Contact us
 If you have any questions, comments or suggestions, please do not hesitate to contact us at [email protected].
 