Paper Evaluator

An LLM-powered paper evaluation system that produces structured peer reviews following academic conference guidelines. The system uses multiple LLM judges via the OpenRouter API to evaluate research papers in PDF or LaTeX format.

Features

  • Multi-format support: Process PDF and LaTeX files
  • Multiple LLM judges: Use different models with specialized personas
  • Structured evaluation: Follows NeurIPS review guidelines
  • Comprehensive scoring: Quality, Clarity, Significance, and Originality ratings
  • Batch processing: Evaluate with multiple judges simultaneously
  • Paper improvement: Iterative self-improvement based on LLM reviews
  • Interactive mode: Review improvement plans before applying changes
  • Automatic mode: Multi-round improvement without user intervention

Installation

  1. Clone the repository:

    git clone https://github.com/ChicagoHAI/paper_evaluator.git
    cd paper_evaluator
  2. Install dependencies using uv:

    uv sync

    This will install the project dependencies and make the paper_evaluator command available via uv run.

  3. Set up your configuration:

    cp config.yaml config.local.yaml
    # Edit config.local.yaml and add your OpenRouter API key

Configuration

  1. Get an API key from OpenRouter
  2. Edit config.local.yaml and replace "sk-or-v1-your-actual-api-key-here" with your actual API key
  3. Customize judges and their personas as needed

The configuration includes two free models by default:

  • moonshotai/kimi-k2:free - Specialized in ML and AI systems
  • z-ai/glm-4.5-air:free - Specialized in NLP and computational linguistics
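A minimal sketch of what config.local.yaml might look like, assuming field names such as api_key, judges, model, and persona (the second judge's name "GLM" is hypothetical; see the shipped config.yaml for the exact schema):

    api_key: "sk-or-v1-your-actual-api-key-here"   # replace with your OpenRouter key
    judges:
      - name: "Kimi"                               # judge name usable with --single-judge
        model: "moonshotai/kimi-k2:free"
        persona: "Reviewer specialized in ML and AI systems"
      - name: "GLM"                                # hypothetical name for the second judge
        model: "z-ai/glm-4.5-air:free"
        persona: "Reviewer specialized in NLP and computational linguistics"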

Usage

Basic Usage

Evaluate a paper with all configured judges:

uv run paper_evaluator paper.pdf config.local.yaml

Or equivalently:

source .venv/bin/activate
paper_evaluator paper.pdf config.local.yaml

Advanced Options

# Specify output directory
uv run paper_evaluator paper.tex config.local.yaml --output my_reviews/

# Use only a single judge
uv run paper_evaluator paper.pdf config.local.yaml --single-judge "Kimi"

# Enable verbose output
uv run paper_evaluator paper.pdf config.local.yaml --verbose

# Save prompts to logs/ directory for inspection
uv run paper_evaluator paper.pdf config.local.yaml --log-prompts

# Combine options
uv run paper_evaluator paper.pdf config.local.yaml --output reviews/ --verbose --log-prompts

Paper Improvement

Automatically improve a LaTeX paper based on LLM reviews:

# Automatic improvement (3 rounds by default)
uv run paper_evaluator paper.tex config.local.yaml --improve

# Automatic improvement with custom rounds
uv run paper_evaluator paper.tex config.local.yaml --improve --rounds 5

# Interactive improvement (pause for plan review)
uv run paper_evaluator paper.tex config.local.yaml --improve --interactive

# Combine with other options
uv run paper_evaluator paper.tex config.local.yaml --improve --interactive --verbose

Output

The system generates:

  • Reviews: one file per judge: paper_name.judge_name.review.txt
  • Summary: a combined summary for multi-judge runs: paper_name.summary.txt
  • Improvements: improved papers in improvements/session_timestamp/
  • Plans: improvement plans: round_N_plan.txt
  • Logs: prompt logs (when --log-prompts is set): logs/{timestamp}_{paper}_{model}_{persona}.prompt.txt
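For example, running two judges on paper.tex with --output reviews/, --improve, and --log-prompts might leave files roughly like this (names follow the patterns above; the timestamp format and the "GLM" judge name are assumptions):

    reviews/
    ├── paper.Kimi.review.txt
    ├── paper.GLM.review.txt
    └── paper.summary.txt
    improvements/
    └── session_20250101_120000/
        ├── round_1_plan.txt
        └── ...
    logs/
    └── {timestamp}_paper_{model}_{persona}.prompt.txt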

Each review includes:

  • Paper summary and key contributions
  • Strengths and weaknesses analysis
  • Numerical ratings (1-4) for Quality, Clarity, Significance, Originality
  • Overall recommendation (1-6) with confidence score
  • Actionable questions for authors
  • Assessment of limitations
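A single review file is organized roughly along these lines (headings and scores are illustrative of the items above, not the exact output format):

    Summary: ...
    Key contributions: ...
    Strengths: ...
    Weaknesses: ...
    Quality: 3/4   Clarity: 3/4   Significance: 2/4   Originality: 2/4
    Overall recommendation: 4/6 (confidence: ...)
    Questions for authors: ...
    Limitations: ...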

Testing

Test with the sample paper:

uv run paper_evaluator tests/sample_paper.tex config.local.yaml --verbose --log-prompts

Project Structure

paper_evaluator/
├── src/
│   ├── main.py              # Command-line interface
│   ├── evaluator.py         # LLM evaluation logic
│   ├── file_processor.py    # PDF/LaTeX processing
│   ├── prompts.py           # Prompt generation
│   └── improver.py          # Paper improvement logic
├── resource/
│   └── neurips_guidelines.txt # NeurIPS review guidelines
├── config.yaml              # Example configuration
├── config.local.yaml        # Local configuration (with API key)
└── pyproject.toml           # Project dependencies

Development

Run tests (when available):

uv run pytest

Format code:

uv run black src/

Lint code:

uv run flake8 src/

Requirements

  • Python 3.8+
  • OpenRouter API key
  • For PDF processing: PyPDF2
  • For LaTeX processing: Basic text processing (included)

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

This project is licensed under the MIT License. See the LICENSE file for details.

Citation

If you use this tool in your research, please cite:

@software{paper_evaluator,
  title = {Paper Evaluator},
  author = {Chenhao Tan},
  year = {2025},
  url = {https://github.com/ChicagoHAI/paper_evaluator}
}
