-
Notifications
You must be signed in to change notification settings - Fork 4
Description
Summary
Evaluate whether switching the SQLite database from delete journal mode to wal (Write-Ahead Logging) mode improves tracker performance, and measure the impact using the existing Grafana dashboard.
Prerequisites
- Depends on: Use SQLite .backup command instead of cp for database backups #85 (Use SQLite
.backupcommand instead ofcp)
The .backup command handles WAL mode automatically, so that should be implemented first.
Current State
The tracker database currently uses delete journal mode:
$ sqlite3 ./storage/tracker/lib/database/sqlite3.db "PRAGMA journal_mode;"
deleteTraffic Profile
The Torrust Live Demo is a high-traffic production system:
- ~8,000-17,000 UDP requests per second continuously
- 17 GB database
- No low-traffic periods - constant traffic 24/7
This makes the demo an ideal testbed for evaluating WAL mode performance impact under real production load.
Why Consider WAL Mode?
WAL mode offers several potential benefits:
| Benefit | Description |
|---|---|
| Better concurrency | Readers don't block writers, writers don't block readers |
| Faster writes | Writes are sequential to WAL file |
| Fewer fsync calls | Better performance, especially on slower disks |
| Read performance | Readers see consistent snapshots without locking |
| Persistent mode | Setting persists across database close/reopen |
Potential Drawbacks
| Drawback | Mitigation |
|---|---|
| Three files instead of one | .backup command handles this automatically |
| Shared memory requirement | Not an issue (not using NFS) |
| WAL file can grow large | Regular checkpointing (automatic) |
| Checkpoint starvation | Monitor WAL file size, ensure reader gaps |
Proposed Experiment
Phase 1: Baseline Metrics
Before making any changes, collect baseline metrics from Grafana:
- Request latency (announce, scrape)
- Requests per second
- Database operation timing (if available)
- Resource usage (CPU, memory, disk I/O)
Phase 2: Switch to WAL Mode
# Stop the tracker (or during maintenance window)
sqlite3 /path/to/sqlite3.db "PRAGMA journal_mode=WAL;"
# Restart the trackerNote: WAL mode is persistent - it stays active across database close/reopen. You can switch back anytime with PRAGMA journal_mode=DELETE;
Phase 3: Collect Comparison Metrics
After running in WAL mode for a representative period (e.g., 1 week):
- Compare the same metrics from Phase 1
- Document any observed differences
- Note any issues or unexpected behavior
- Monitor WAL file size for checkpoint starvation
Phase 4: Decision
Based on the data:
- If performance improves: Keep WAL mode
- If no significant change: Keep WAL mode (still safer for concurrent access)
- If issues arise: Revert to delete mode
Expected Outcome
Document the measured performance difference (if any) between delete and WAL modes in a real production environment. This data will inform:
- Whether to recommend WAL mode for Torrust Tracker deployments
- Default settings in the Torrust Tracker Deployer project
Context
This issue is part of backup strategy research for the Torrust Tracker Deployer project. The Live Demo serves as a production lab to validate improvements before applying them more broadly.
Research documentation: torrust/torrust-tracker-deployer#310