TempViz: On the Evaluation of Temporal Knowledge in Text-to-Image Models

Paper Abstract

Time alters the visual appearance of entities in our world, like objects, places, and animals. Thus, for accurately generating contextually-relevant images, knowledge and reasoning about time can be crucial (e.g., for generating a landscape in spring vs. in winter). Yet, although substantial work exists on understanding and improving temporal knowledge in natural language processing, research on how temporal phenomena appear and are handled in text-to-image (T2I) models remains scarce. We address this gap with TempViz, the first data set to holistically evaluate temporal knowledge in image generation, consisting of 7.9k prompts and more than 600 reference images. Using TempViz, we study the capabilities of five T2I models across five temporal knowledge categories. Human evaluation shows that temporal competence is generally weak, with no model exceeding 75% accuracy across categories. Towards larger-scale studies, we also examine automated evaluation methods, comparing several established approaches against human judgments. However, none of these approaches provides a reliable assessment of temporal cues - further indicating the pressing need for future research on temporal knowledge in T2I.

Getting Started

We conducted all our experiments with Python 3.10. Before getting started, make sure you install the requirements listed in the requirements.txt file.

pip install -r requirements.txt

📂 Directory/File Structure Overview

This repository contains all the code and data needed to reproduce the experiments and results reported in our paper.

Data

The data folder contains the following files and folders:

tempviz
- Contains the TempViz dataset.
goldenRecordImages
- Contains the golden record images for the Maps and Artworks category, which supplement the TempViz dataset because they cannot be accessed directly via a link.
eval_annotations
- Contains the annotations on the evaluation subset of TempViz (500 instances).
llm_prompts
- Contains the prompts generated by Llama3 70B to evaluate the generated images.
model_results
- Contains all model outputs of the automatic evaluation approaches.
additional_annotations
- Contains additional annotation results. Use them with care, as they were generated using crowdsourced data.
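
To illustrate how human judgments such as those in eval_annotations could be aggregated into a per-category accuracy, here is a minimal sketch. The record layout (a category name paired with a correct/incorrect flag), the function name accuracy_per_category, and the toy category names are assumptions made for illustration; they do not describe the actual file format or the scripts in this repository.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_per_category(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the fraction of records judged correct, grouped by category."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, is_correct in records:
        total[category] += 1
        if is_correct:
            correct[category] += 1
    # Every category in `total` has at least one record, so no division by zero.
    return {category: correct[category] / total[category] for category in total}


# Hypothetical annotations: (category, did the image match the temporal cue?)
toy_records = [
    ("category_a", True),
    ("category_a", False),
    ("category_b", True),
    ("category_b", True),
]

scores = accuracy_per_category(toy_records)
for category, score in sorted(scores.items()):
    print(f"{category}: {score:.2f}")
```

To use something like this with the real annotation files, you would first need a loader that maps each annotated instance to its category and judgment.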

Code

Includes all Python files and notebooks used for this paper.

The code folder contains the following files:

creation_of_paper_plots.ipynb
- This notebook can be used to recreate all plots present in the paper, based on the experimental results.
calculate_clipscore_or_captioning.py
- Contains the code to compute the CLIPScore values and the captioning cosine similarities.
generate_images.py
- Contains the code to generate the images by prompting the T2I models.
get_answers_openai.py
- Contains the code to prompt GPT-5 to analyze temporal knowledge in images.
prompt_llms.py
- Contains the code to prompt LLMs to generate questions about the generated image based on the initial prompt.
prompt_vlm_models.py
- Contains the code to prompt the VLMs to analyze temporal knowledge in images.
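
For readers unfamiliar with the cosine similarity mentioned for calculate_clipscore_or_captioning.py, the following is a minimal sketch of the measure itself. It is not the code from that script: the function name cosine_similarity and the toy vectors are assumptions for illustration, and in practice the vectors would be embeddings produced by a text or image encoder.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings (hypothetical values).
prompt_embedding = [1.0, 0.0, 1.0]
caption_embedding = [1.0, 0.0, 1.0]
unrelated_embedding = [0.0, 1.0, 0.0]

print(cosine_similarity(prompt_embedding, caption_embedding))
print(cosine_similarity(prompt_embedding, unrelated_embedding))
```

Identical directions give a similarity of 1 and orthogonal vectors give 0, so a higher value suggests that a generated caption is closer in meaning to the original prompt.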

References

Please use the following bibtex entry to cite us:

@inproceedings{}

Author contact information: carolin.holtermann@uni-hamburg.de

License

All source code is made available under a
