MetricMate is a full-stack application for generating evaluation criteria with the help of Large Language Models (LLMs).
It provides a Python backend for API handling and model interaction, and a JavaScript/React frontend for an interactive user interface.
MetricMate helps users define and refine evaluation criteria for model assessments by leveraging OpenAI’s language models.
It is designed for researchers, evaluators, and AI practitioners who want to:
- Automate criteria creation for experiments
- Standardize evaluation frameworks
- Quickly adapt metrics to different tasks
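To make the idea concrete, here is a minimal sketch of the kind of criteria-generation request such a tool might send to an OpenAI chat model. It is illustrative only: the prompt wording, the model name, and the `build_criteria_request` helper are assumptions, not MetricMate's actual implementation.

```python
import json

def build_criteria_request(task_description: str, n_criteria: int = 5) -> dict:
    """Build a Chat Completions request body asking an LLM to propose
    evaluation criteria for the given task (illustrative sketch only)."""
    system = (
        "You are an evaluation expert. Propose clear, measurable criteria "
        "for judging model outputs on the user's task."
    )
    user = (
        f"Task: {task_description}\n"
        f"List {n_criteria} evaluation criteria, one per line, "
        "each with a short definition."
    )
    return {
        "model": "gpt-4o-mini",  # assumed model; MetricMate's choice may differ
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }

if __name__ == "__main__":
    # Inspect the request body that would be sent to the API
    print(json.dumps(build_criteria_request("Summarize scientific abstracts"),
                     indent=2))
```

The returned dictionary matches the shape expected by OpenAI's Chat Completions endpoint, so it could be passed to any client library; the actual backend may structure its prompts quite differently.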
## Installation

### Backend

1. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Configure the API key by updating `backend/config.ini` with your personal OpenAI API key:

   ```ini
   api_key = YOUR_OPENAI_API_KEY
   ```

3. Run the backend:

   ```bash
   python main.py
   ```

### Frontend

1. Install dependencies:

   ```bash
   npm install
   ```

2. Start the frontend:

   ```bash
   npm start
   ```

## Citation

If you use our tools/code for your work, please cite the following paper:
```bibtex
@inproceedings{gebreegziabher2025metricmate,
  title={MetricMate: An Interactive Tool for Generating Evaluation Criteria for LLM-as-a-Judge Workflow},
  author={Gebreegziabher, Simret Araya and Chiang, Charles and Wang, Zichu and Ashktorab, Zahra and Brachman, Michelle and Geyer, Werner and Li, Toby Jia-Jun and G{\'o}mez-Zar{\'a}, Diego},
  booktitle={Proceedings of the 4th Annual Symposium on Human-Computer Interaction for Work},
  pages={1--18},
  year={2025}
}
```
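For reference, a key stored in `backend/config.ini` as shown in the setup steps can be loaded with Python's standard `configparser` module. This is a minimal sketch, not the repository's actual loading code, and the `[settings]` section name is an assumption; match it to the header used in the real file.

```python
import configparser

def load_api_key(path: str = "backend/config.ini",
                 section: str = "settings") -> str:
    """Read the api_key value from an INI-style config file.
    The section name is hypothetical; adjust to the actual file."""
    parser = configparser.ConfigParser()
    with open(path) as fh:
        parser.read_file(fh)
    return parser[section]["api_key"]
```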