This repository contains the official Python package for CREPE — a fast, interpretable, and clinically grounded metric for automated chest X‑ray report evaluation.

```bash
# Python >= 3.9
pip install --upgrade pip
pip install "torch>=2.1" "transformers>=4.41"
```

This repo is a small Python package; you can import it directly from the project root (editable install optional).

The default checkpoint is hosted on the Hugging Face Hub (`gihuncho/crepe-biomedbert`) and will be downloaded on first use. If your environment has restricted internet access, pass a local `cache_dir` or model path (see below).
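For fully offline use, you can also point the loader at a local checkpoint directory (a minimal sketch; the path below is hypothetical and assumes the checkpoint files were downloaded or saved beforehand):

```python
import crepe

# Hypothetical local directory containing the checkpoint files
# (config, weights, and tokenizer files saved ahead of time).
local_ckpt = "/data/models/crepe-biomedbert"

model, tokenizer = crepe.load_model_and_tokenizer(model_name_or_path=local_ckpt)
```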
```python
import torch
import crepe

# (optional) set a cache directory for HF model files
cache_dir = "your/path/to/cache/dir"

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 1) Load model & tokenizer (defaults to gihuncho/crepe-biomedbert)
model, tokenizer = crepe.load_model_and_tokenizer(cache_dir=cache_dir)
model.to(device).eval()

# 2) Score a pair of reports
reference = "Normal chest radiograph"
candidate = "Bilateral pleural effusions noted"

result = crepe.compute(
    model=model,
    tokenizer=tokenizer,
    reference_report=reference,
    candidate_report=candidate,
    device=device.type,  # "cuda" or "cpu"
)
print(result)
# {
#     "crepe_score": <float>,                             # lower is better
#     "predicted_error_counts": [nA, nB, nC, nD, nE, nF]  # continuous, >= 0
# }
```

Tokenization uses pair encoding: `(reference, candidate)`, with truncation/padding to 512 tokens to match the training setup.
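For reference, this pair encoding corresponds roughly to a standard Hugging Face tokenizer call like the one below (a sketch of the described setup, not necessarily the exact arguments used inside `crepe.compute`):

```python
# Pair encoding of (reference, candidate), truncated/padded to 512 tokens.
# Sketch only; the exact call inside crepe.compute may differ.
inputs = tokenizer(
    reference,             # segment A: reference report
    candidate,             # segment B: candidate report
    truncation=True,
    padding="max_length",
    max_length=512,
    return_tensors="pt",
)
```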
- `predicted_error_counts`: a list of six non‑negative floats `[nA, nB, nC, nD, nE, nF]`, ordered by categories A→F as defined above. Values are continuous (not forced to integers).
- `crepe_score`: the unweighted sum of the six predicted counts. Lower is better (fewer predicted discrepancies); see the quick check below.
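As a quick check on the relationship between the two fields (illustrative only; small floating-point differences are possible):

```python
# crepe_score is the unweighted sum of the six predicted error counts.
counts = result["predicted_error_counts"]   # [nA, nB, nC, nD, nE, nF]
print(sum(counts), result["crepe_score"])   # these should match closely
```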
```python
# crepe/__init__.py
load_model_and_tokenizer(model_name_or_path="gihuncho/crepe-biomedbert", cache_dir=None)
# -> (model, tokenizer)

compute(model, tokenizer, reference_report, candidate_report, device="cpu")
# -> {"crepe_score": float, "predicted_error_counts": List[float]}
```
Advanced utilities (optional):

```python
# crepe.models.get_model_and_tokenizer(ckpt_path, device=None)
# crepe.models.get_predicted_counts(model, tokenizer, gt, pred, device=None)
```

Under the hood, the model is a `PreTrainedModel` with six regression heads; auxiliary presence heads exist for training but are not used at inference. Predictions are clipped to be non‑negative. See `crepe/models.py`.
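As a rough illustration of the inference-time behaviour described above (a hedged sketch, not the actual `crepe/models.py` implementation; the hidden size and head layout are assumptions):

```python
import torch
import torch.nn as nn

hidden_size, num_categories = 768, 6      # assumed sizes, for illustration only
regression_head = nn.Linear(hidden_size, num_categories)

pooled = torch.randn(1, hidden_size)      # stand-in for the encoder's pooled output
counts = torch.clamp(regression_head(pooled), min=0.0)  # clip predictions to be non-negative
print(counts.shape)                       # torch.Size([1, 6])
```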
If you have any questions, feel free to reach out!
```bibtex
@inproceedings{cho-etal-2025-crepe,
    title = "{CREPE}: Rapid Chest {X}-ray Report Evaluation by Predicting Multi-category Error Counts",
    author = "Cho, Gihun and
      Jang, Seunghyun and
      Ko, Hanbin and
      Baek, Inhyeok and
      Park, Chang Min",
    editor = "Christodoulopoulos, Christos and
      Chakraborty, Tanmoy and
      Rose, Carolyn and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.emnlp-main.1102/",
    pages = "21749--21766",
    ISBN = "979-8-89176-332-6"
}
```