🏆 Gradient "Build your own AI lab" Competition Entry
Submission Deadline: November 30, 2025
A local AI application built with Parallax for detecting circular reasoning bias in AI research papers. This tool analyzes experimental data from papers to identify statistical anomalies that suggest evaluation protocol manipulation.
- 🔒 Privacy First: Sensitive research data stays local - no cloud uploads
- ⚡ Distributed Processing: Parallel analysis across multiple nodes for faster results
- 🌐 Cross-Platform: Works on GPU/CPU/Mac environments
- 🔄 P2P Network: Efficient node communication via Lattica
This tool identifies three types of bias in AI paper evaluations:
- PSI (Performance-Structure Independence): Hyperparameters repeatedly adjusted until results "look good"
- CCS (Constraint-Consistency Score): Evaluation conditions (compute, memory, dataset) changed across runs
- ρ_PC (Performance-Constraint Correlation): Performance improvements suspiciously correlate with constraint changes
- 📊 Statistical Analysis: Bootstrap confidence intervals, p-values, adaptive thresholds (see the bootstrap sketch after this list)
- 📈 Visualization: Heatmaps, correlation matrices, interactive dashboards
- 🎯 Risk Assessment: Clear risk levels (No Risk / Low / Medium / High)
- 💡 Actionable Recommendations: Specific suggestions to fix evaluation protocols
- 🚀 Fast & Local: Powered by Parallax distributed inference
- Python 3.8+
- Parallax framework
- (Optional) GPU for faster processing
```bash
# Install Parallax
pip install parallax-ai

# Or from source
git clone https://github.com/GradientHQ/parallax.git
cd parallax
pip install -e .
```

```bash
# Clone this repository and install dependencies
git clone https://github.com/YOUR_USERNAME/parallax-cbd-lab.git
cd parallax-cbd-lab
pip install -r requirements.txt
```

```bash
# Start a Parallax node for CBD detection
parallax start --config config/cbd_node.yaml

# Analyze paper evaluation data
python src/main.py --data data/sample_paper_eval.csv
```

The tool will generate:
- Console output with bias detection results
- `results/report.json`: Detailed JSON report (see the loading sketch below)
- `results/visualizations/`: Charts and heatmaps
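The exact schema of `report.json` is defined by the detector; the key names below are assumptions for illustration only. A quick way to inspect a finished run might look like:

```python
import json

# NOTE: "risk_level" and "indicators" are hypothetical key names;
# check results/report.json from a real run for the actual schema.
with open("results/report.json") as f:
    report = json.load(f)

print(report.get("risk_level"))   # e.g. "MEDIUM"
print(report.get("indicators"))   # e.g. {"PSI": ..., "CCS": ..., "rho_PC": ...}
```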
```
┌─────────────────────────────────────────┐
│        User Interface (Web/CLI)         │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│         Parallax Routing Layer          │
│ (Request Scheduling & Load Balancing)   │
└─────────────────────────────────────────┘
                    │
        ┌───────────┴───────────┐
        ▼                       ▼
┌──────────────┐        ┌──────────────┐
│  CBD Node 1  │        │  CBD Node 2  │
│ (PSI + CCS)  │        │    (ρ_PC)    │
└──────────────┘        └──────────────┘
        │                       │
        └───────────┬───────────┘
                    ▼
            ┌───────────────┐
            │ Result Merger │
            └───────────────┘
```
- P2P Communication: Nodes communicate via Lattica for low-latency data transfer
- Dynamic Scheduling: Parallax routes requests to available nodes based on load
- Pipeline Parallelism: Different bias indicators computed in parallel (sketched after this list)
- Local Inference: All processing happens on your hardware - no external API calls
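The fan-out code is not shown in this README, so here is a hedged sketch of the pipeline-parallel step using plain `asyncio`; `compute_psi_ccs` and `compute_rho_pc` are hypothetical stand-ins for the work Parallax routes to CBD Node 1 and CBD Node 2:

```python
import asyncio
import pandas as pd

# Hypothetical per-node workers; in the real system Parallax schedules
# these on CBD Node 1 (PSI + CCS) and CBD Node 2 (ρ_PC).
async def compute_psi_ccs(df: pd.DataFrame) -> dict:
    return {"PSI": 0.0, "CCS": 1.0}  # placeholder results

async def compute_rho_pc(df: pd.DataFrame) -> dict:
    return {"rho_PC": 0.0}  # placeholder result

async def analyze(df: pd.DataFrame) -> dict:
    # Run both indicator groups concurrently, then merge the partial
    # results -- the "Result Merger" stage in the diagram above.
    psi_ccs, rho = await asyncio.gather(compute_psi_ccs(df), compute_rho_pc(df))
    return {**psi_ccs, **rho}

# result = asyncio.run(analyze(pd.read_csv("data/sample_paper_eval.csv")))
```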
Input CSV should contain:
| Column | Type | Description |
|---|---|---|
| `time_period` | int | Evaluation round (1, 2, 3, ...) |
| `algorithm` | str | Model/algorithm name |
| `performance` | float | Performance metric (0-1) |
| `constraint_compute` | float | Compute limit (FLOPs, GPU hours) |
| `constraint_memory` | float | Memory limit (GB) |
| `constraint_dataset_size` | int | Training dataset size (optional) |
Example:
```csv
time_period,algorithm,performance,constraint_compute,constraint_memory
1,GPT-4,0.85,512,16.0
1,Claude-3,0.82,512,16.0
2,GPT-4,0.87,550,18.0
2,Claude-3,0.84,550,18.0
```

See `data/sample_paper_eval.csv` for a complete example.
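Before analysis it can help to validate an input file against the schema above. This sketch is an assumption about what `src/cbd_service/validator.py` might do, not its actual contents:

```python
import pandas as pd

REQUIRED_COLUMNS = ["time_period", "algorithm", "performance",
                    "constraint_compute", "constraint_memory"]

def load_eval_csv(path: str) -> pd.DataFrame:
    """Load an evaluation CSV and check the required schema (illustrative)."""
    df = pd.read_csv(path)
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    # The performance metric is documented as lying in [0, 1]
    if not df["performance"].between(0, 1).all():
        raise ValueError("performance values must lie in [0, 1]")
    return df

# df = load_eval_csv("data/sample_paper_eval.csv")
```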
```
🔴 BIAS DETECTED - MEDIUM RISK

PSI:  0.18 (> 0.15) - Hyperparameters changed during eval
CCS:  0.82 (< 0.85) - Inconsistent resource limits
ρ_PC: 0.65 (> 0.50) - Performance correlates with constraints

RECOMMENDATION:
1. Lock all hyperparameters (e.g., temperature, max_tokens)
2. Use identical evaluation settings across runs
3. Re-evaluate with fixed protocol
```
- **Authors**: Verify experimental protocols before submitting papers to conferences.
- **Reviewers**: Quickly validate the credibility of reported experimental results.
- **PhD students and researchers**: Ensure experiment designs meet statistical standards.
- PSI (Performance-Structure Independence): Measures parameter stability across evaluation periods
- CCS (Constraint-Consistency Score): Evaluates consistency of constraint specifications
- ρ_PC (Performance-Constraint Correlation): Detects suspicious correlations (see the sketch below)
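The actual formulas live in `src/cbd_service/core.py`; as a rough illustration of the idea only (not the project's definitions), CCS and ρ_PC could be computed as below. PSI additionally needs per-run hyperparameter records, which the CSV schema above does not include, so it is omitted:

```python
import pandas as pd

def rho_pc(df: pd.DataFrame, constraint: str = "constraint_compute") -> float:
    """Pearson correlation between performance and one constraint column.
    Per the sample report above, |rho_PC| > 0.50 is treated as suspicious."""
    return df["performance"].corr(df[constraint])

def ccs(df: pd.DataFrame,
        constraints=("constraint_compute", "constraint_memory")) -> float:
    """One minus the mean coefficient of variation of the constraints.
    Near 1.0 means resource limits were held fixed; the sample report
    flags values below 0.85."""
    cvs = [df[c].std() / df[c].mean() for c in constraints]
    return 1.0 - sum(cvs) / len(cvs)

# df = pd.read_csv("data/sample_paper_eval.csv")
# print(rho_pc(df), ccs(df))
```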
The CBD detection service is wrapped as a Parallax node:
```python
import pandas as pd

from parallax import ParallaxService
from cbd_detector import PaperBiasDetector


class CBDParallaxService(ParallaxService):
    def __init__(self):
        super().__init__(name="cbd-paper-detector")
        self.detector = PaperBiasDetector()

    async def process_request(self, request):
        # Parallax handles routing and load balancing; this node only
        # loads the CSV and runs the detector locally
        csv_data = pd.read_csv(request.data['file_path'])
        result = self.detector.detect(csv_data)
        return result
```

- Latency: < 2 seconds for typical paper datasets (100-500 records)
- Throughput: 10+ concurrent requests via Parallax load balancing
- Scalability: Add more nodes to handle larger workloads
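For quick local testing without a running Parallax node, the detector can presumably be called in-process, assuming `PaperBiasDetector.detect` accepts a DataFrame as in the service wrapper above:

```python
import pandas as pd
from cbd_detector import PaperBiasDetector  # same import as the service wrapper

# Bypass the Parallax routing layer and run the analysis in-process
df = pd.read_csv("data/sample_paper_eval.csv")
result = PaperBiasDetector().detect(df)
print(result)
```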
```
parallax-cbd-lab/
├── src/
│   ├── cbd_service/          # Core detection algorithms
│   │   ├── core.py           # PSI, CCS, ρ_PC
│   │   ├── detector.py       # Main detector class
│   │   └── validator.py      # Data validation
│   ├── parallax_node/        # Parallax integration
│   │   ├── service.py        # Service wrapper
│   │   └── config.yaml       # Node configuration
│   └── frontend/             # Web interface
│       ├── index.html
│       └── app.js
├── data/
│   └── sample_paper_eval.csv
├── tests/
│   ├── test_core.py
│   └── test_integration.py
├── requirements.txt
└── README.md
```
Contributions welcome! Please see CONTRIBUTING.md for guidelines.
- Code: MIT License
- Documentation: CC BY 4.0
If you use this tool in your research, please cite:
```bibtex
@software{parallax_cbd_lab,
  author = {Zhang, Hongping},
  title  = {Parallax CBD Lab: Paper Bias Detector},
  year   = {2025},
  url    = {https://github.com/YOUR_USERNAME/parallax-cbd-lab}
}
```

- Parallax Framework: Gradient
- CBD Algorithms: Circular Bias Detection Project
- Competition: Gradient "Build your own AI lab" Competition
Built for: Gradient "Build your own AI lab" Competition
Submission Date: November 2025
Category: Research Tools / Data Analysis
Academic integrity is crucial for AI research. This tool helps researchers ensure their evaluations are statistically sound, preventing publication of inflated or biased results. By running locally with Parallax, it protects sensitive research data while providing fast, distributed analysis.
Try Demo • View Code • Report Issue
Made with ❤️ for AI Research Integrity