This repo contains the code for PATIMT-Bench: A Multi-Scenario Benchmark for Position-Aware Text Image Machine Translation in Large Vision-Language Models. Please give us a like ❤️ if you find it useful!
# input.jsonl should contain the image path and the original OCR result
python adaptive_refine.py --input_file input.jsonl
python gpt_api_label.py
bash eval.sh

@inproceedings{zhuang2025patimtbench,
title={{PATIMT}-Bench: A Multi-Scenario Benchmark for Position-Aware Text Image Machine Translation in Large Vision-Language Models},
author={Wanru Zhuang and Wenbo Li and Zhibin Lan and Xu Han and Peng Li and Jinsong Su},
booktitle={The 2025 Conference on Empirical Methods in Natural Language Processing},
year={2025},
url={https://openreview.net/forum?id=UYcNSzYyi9}
}