An LLM-powered paper evaluation system that provides structured peer reviews following academic conference guidelines. The system uses multiple LLM judges via OpenRouter API to evaluate research papers in PDF or LaTeX format.
- Multi-format support: Process PDF and LaTeX files
- Multiple LLM judges: Use different models with specialized personas
- Structured evaluation: Follows NeurIPS review guidelines
- Comprehensive scoring: Quality, Clarity, Significance, and Originality ratings
- Batch processing: Evaluate with multiple judges simultaneously
- Paper improvement: Iterative self-improvement based on LLM reviews
- Interactive mode: Review improvement plans before applying changes
- Automatic mode: Multi-round improvement without user intervention
Clone the repository:
```bash
git clone https://github.com/ChicagoHAI/paper_evaluator.git
cd paper_evaluator
```
Install dependencies using uv:
```bash
uv sync
```
This will install the project dependencies and make the `paper_evaluator` command available via `uv run`.
Set up your configuration:
```bash
cp config.yaml config.local.yaml
# Edit config.local.yaml and add your OpenRouter API key
```
- Get an API key from OpenRouter
- Edit `config.local.yaml` and replace `"sk-or-v1-your-actual-api-key-here"` with your actual API key
- Customize judges and their personas as needed
The configuration includes two free models by default:
- `moonshotai/kimi-k2:free`: specialized in ML and AI systems
- `z-ai/glm-4.5-air:free`: specialized in NLP and computational linguistics
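As a quick sanity check on your setup, the sketch below reads the judge list out of `config.local.yaml`. It assumes the config holds a list of judges, each with a name, model, and persona; the field names here are illustrative, not a documented schema.

```python
# Minimal sketch: list the configured judges (field names are assumptions).
import yaml  # PyYAML

with open("config.local.yaml") as f:
    config = yaml.safe_load(f)

for judge in config.get("judges", []):
    print(judge.get("name"), "->", judge.get("model"), "|", judge.get("persona"))
```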
Evaluate a paper with all configured judges:
```bash
uv run paper_evaluator paper.pdf config.local.yaml
```
Or equivalently:
```bash
source .venv/bin/activate
paper_evaluator paper.pdf config.local.yaml
```

```bash
# Specify output directory
uv run paper_evaluator paper.tex config.local.yaml --output my_reviews/
# Use only a single judge
uv run paper_evaluator paper.pdf config.local.yaml --single-judge "Kimi"
# Enable verbose output
uv run paper_evaluator paper.pdf config.local.yaml --verbose
# Save prompts to logs/ directory for inspection
uv run paper_evaluator paper.pdf config.local.yaml --log-prompts
# Combine options
uv run paper_evaluator paper.pdf config.local.yaml --output reviews/ --verbose --log-prompts
```

Automatically improve a LaTeX paper based on LLM reviews:
```bash
# Automatic improvement (3 rounds by default)
uv run paper_evaluator paper.tex config.local.yaml --improve
# Automatic improvement with custom rounds
uv run paper_evaluator paper.tex config.local.yaml --improve --rounds 5
# Interactive improvement (pause for plan review)
uv run paper_evaluator paper.tex config.local.yaml --improve --interactive
# Combine with other options
uv run paper_evaluator paper.tex config.local.yaml --improve --interactive --verbose
```
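For orientation, here is a minimal sketch of the round-based flow these flags drive: collect reviews, draft a plan, optionally pause for approval in interactive mode, then apply the plan. The helper names are hypothetical placeholders, not the actual `improver.py` API.

```python
# Hypothetical sketch of the improvement loop; not the project's improver.py code.
from __future__ import annotations


def collect_reviews(paper: str) -> list[str]:
    return ["(review text from each configured judge)"]     # placeholder: run all judges

def draft_plan(reviews: list[str]) -> str:
    return "(improvement plan, saved as round_N_plan.txt)"   # placeholder

def apply_plan(paper: str, plan: str) -> None:
    pass                                                     # placeholder: rewrite the LaTeX source

def improve(paper: str, rounds: int = 3, interactive: bool = False) -> None:
    for n in range(1, rounds + 1):
        plan = draft_plan(collect_reviews(paper))
        if interactive and input(f"Apply round {n} plan? [y/N] ").lower() != "y":
            break                                            # interactive mode pauses for plan review
        apply_plan(paper, plan)
```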
The system generates:
- Reviews: individual review files named `paper_name.judge_name.review.txt`
- Summary: a summary file (multi-judge runs) named `paper_name.summary.txt`
- Improvements: improved papers under `improvements/session_timestamp/`
- Plans: improvement plans named `round_N_plan.txt`
- Logs: prompt logs (if enabled) at `logs/{timestamp}_{paper}_{model}_{persona}.prompt.txt`
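If you want to gather a run's outputs programmatically, here is a small sketch based on the naming scheme above (the directory paths are illustrative; point them at your `--output` directory):

```python
# Collect generated outputs; adjust the paths for your run.
from pathlib import Path

out = Path("reviews")
reviews = sorted(out.glob("*.review.txt"))       # one file per judge
summaries = sorted(out.glob("*.summary.txt"))    # multi-judge summary
plans = sorted(Path("improvements").rglob("round_*_plan.txt"))
print(f"{len(reviews)} reviews, {len(summaries)} summaries, {len(plans)} plans")
```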
Each review includes:
- Paper summary and key contributions
- Strengths and weaknesses analysis
- Numerical ratings (1-4) for Quality, Clarity, Significance, Originality
- Overall recommendation (1-6) with confidence score
- Actionable questions for authors
- Assessment of limitations
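The reviews are written as plain text; purely as an illustration of the structure (not a schema the tool exposes), the fields above map onto something like:

```python
# Illustrative only: the shape of one review, mirroring the fields listed above.
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Review:
    summary: str                 # paper summary and key contributions
    strengths: list[str]
    weaknesses: list[str]
    quality: int                 # 1-4
    clarity: int                 # 1-4
    significance: int            # 1-4
    originality: int             # 1-4
    recommendation: int          # overall, 1-6
    confidence: int
    questions: list[str] = field(default_factory=list)   # actionable questions for authors
    limitations: str = ""                                 # assessment of limitations
```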
Test with the sample paper:
```bash
uv run paper_evaluator tests/sample_paper.tex config.local.yaml --verbose --log-prompts
```

Project layout:
```
paper_evaluator/
├── src/
│ ├── main.py # Command-line interface
│ ├── evaluator.py # LLM evaluation logic
│ ├── file_processor.py # PDF/LaTeX processing
│ ├── prompts.py # Prompt generation
│ └── improver.py # Paper improvement logic
├── resource/
│ └── neurips_guidelines.txt # NeurIPS review guidelines
├── config.yaml # Example configuration
├── config.local.yaml # Local configuration (with API key)
└── pyproject.toml # Project dependencies
```
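`evaluator.py` talks to the judge models through the OpenRouter chat-completions API. As a rough illustration of the kind of request involved (this is not the project's actual code, and the prompt content is a placeholder):

```python
# Minimal OpenRouter chat-completions request using the requests library.
import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": "Bearer sk-or-v1-your-actual-api-key-here"},
    json={
        "model": "moonshotai/kimi-k2:free",
        "messages": [{"role": "user", "content": "Review this paper: ..."}],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```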
Run tests (when available):
```bash
uv run pytest
```
Format code:
```bash
uv run black src/
```
Lint code:
```bash
uv run flake8 src/
```

Requirements:
- Python 3.8+
- OpenRouter API key
- For PDF processing: PyPDF2 (see the sketch after this list)
- For LaTeX processing: Basic text processing (included)
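PDF text extraction of the kind `file_processor.py` performs can be sketched with PyPDF2 as follows (a minimal illustration, not the project's exact code):

```python
# Extract plain text from a PDF with PyPDF2.
from PyPDF2 import PdfReader

reader = PdfReader("paper.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)
print(text[:500])  # preview the first 500 characters
```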
To contribute:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License. See the LICENSE file for details.
If you use this tool in your research, please cite:
```bibtex
@software{paper_evaluator,
  title  = {Paper Evaluator},
  author = {Chenhao Tan},
  year   = {2025},
  url    = {https://github.com/ChicagoHAI/paper_evaluator}
}
```