This guide covers the performance analysis and profiling tools available in the Cache Simulator.
The Cache Simulator provides several analysis capabilities:
- Cache hit/miss analysis
- Working set analysis
- Reuse distance calculation
- Access pattern classification
- Configuration comparison
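To make the reuse distance item concrete: the reuse distance of an access is the number of distinct addresses touched since the previous access to the same address. The following is a minimal sketch of that definition, not the simulator's own implementation; the function name `reuse_distances` and the list-of-addresses trace format are assumptions made for illustration.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return one reuse distance per access; None marks a first-time access.

    Hypothetical helper for illustration only. It runs in O(N * M) time for
    N accesses and M distinct addresses, which is fine for small traces.
    """
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Distinct addresses touched since the previous access to addr.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances
```

A distance smaller than the number of blocks in a fully associative LRU cache means that access would hit in such a cache, which is why a reuse distance histogram is a useful guide to cache sizing.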
# Run basic analysis
./cachesim --vis traces/workload.txt

Output includes:
- L1/L2 hit rates
- Miss type breakdown (compulsory, capacity, conflict)
- Access pattern statistics
# Enable detailed statistics
./cachesim --verbose traces/workload.txt

# Export to CSV
./cachesim --export results.csv traces/workload.txt

Create multiple configuration files and compare results:
# Compare two configurations
./cachesim --config config1.json traces/workload.txt > result1.txt
./cachesim --config config2.json traces/workload.txt > result2.txt

# Run benchmark mode
./cachesim --benchmark traces/workload.txt

Benchmark mode provides:
- Processing time metrics
- Throughput (accesses/second)
- Memory usage statistics
# Enable visualization
./cachesim --vis traces/workload.txt

Displays ASCII tables showing:
- Cache block states
- Tag values
- Valid/dirty status
- Access counts
The visualization output includes:
- Access distribution histogram
- Memory address heatmap
- Hit rate over time
# Default 45nm technology
./cachesim --power traces/workload.txt
# Specify technology node
./cachesim --power --tech-node 7 traces/workload.txt

| Metric | Description | Unit |
|---|---|---|
| Dynamic Energy | Energy per access | pJ |
| Total Energy | Cumulative energy | nJ |
| Leakage Power | Static power consumption | mW |
| EDP | Energy-Delay Product | pJ*ns |
Supported nodes: 7nm, 14nm, 22nm, 32nm, 45nm
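The units in the metrics table can be related with a little arithmetic. The sketch below assumes that total energy is the sum of dynamic energy over all accesses plus leakage over the runtime, and that EDP is energy multiplied by delay; how the simulator actually aggregates these values is not stated here, so both function names and both formulas are illustrative assumptions.

```python
def total_energy_nj(dynamic_pj_per_access, accesses, leakage_mw, runtime_ns):
    """Estimate total energy in nJ (assumed model: dynamic + leakage)."""
    dynamic_pj = dynamic_pj_per_access * accesses
    # Unit check: 1 mW * 1 ns = 1e-3 W * 1e-9 s = 1e-12 J = 1 pJ.
    leakage_pj = leakage_mw * runtime_ns
    return (dynamic_pj + leakage_pj) / 1000.0  # 1000 pJ = 1 nJ


def energy_delay_product(energy_pj, delay_ns):
    """Energy-delay product in pJ*ns; lower is better."""
    return energy_pj * delay_ns
```

EDP penalizes a configuration that saves energy only by running slower, which makes it a convenient single number when comparing technology nodes or cache sizes.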
./cachesim --verbose traces/workload.txt

Review:
- Miss rate distribution
- Access patterns
- Working set size
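As a rough illustration of what working set size measures, the sketch below counts the distinct addresses seen in a sliding window of the most recent accesses. The window length, the function name `working_set_sizes`, and the list-of-addresses trace format are assumptions for this example; the simulator may define its window differently.

```python
from collections import Counter, deque


def working_set_sizes(trace, window):
    """Return the number of distinct addresses in the last `window` accesses,
    evaluated after every access. Hypothetical helper for illustration."""
    counts = Counter()  # occurrences of each address inside the window
    recent = deque()    # the accesses currently inside the window
    sizes = []
    for addr in trace:
        recent.append(addr)
        counts[addr] += 1
        if len(recent) > window:
            old = recent.popleft()
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
        sizes.append(len(counts))
    return sizes
```

If the working set regularly exceeds the number of blocks the cache can hold, capacity misses are the likely result.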
Look for:
- High conflict miss rate -> Consider victim cache
- High capacity miss rate -> Increase cache size
- Low reuse -> Check prefetching settings
# Test with victim cache
./cachesim --victim-cache traces/workload.txt
# Test different replacement policies
./cachesim --config lru_config.json traces/workload.txt
./cachesim --config nru_config.json traces/workload.txt

Compare hit rates, access times, and power consumption to find the optimal configuration.
| Hit Rate | Interpretation |
|---|---|
| > 95% | Excellent - cache is well-sized |
| 85-95% | Good - typical for most workloads |
| 70-85% | Moderate - consider optimizations |
| < 70% | Poor - investigate miss patterns |
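The bands in the table above can be applied mechanically when post-processing many runs. This is a small sketch; the function name is hypothetical, and because the table does not say which band owns the boundary values, the code assumes that exactly 95% counts as Good and exactly 85% and 70% fall in the higher of the two adjacent bands.

```python
def interpret_hit_rate(hit_rate_pct):
    """Map a hit rate in percent to the interpretation bands of the table."""
    if not 0.0 <= hit_rate_pct <= 100.0:
        raise ValueError("hit rate must be between 0 and 100 percent")
    if hit_rate_pct > 95.0:
        return "Excellent"
    if hit_rate_pct >= 85.0:  # assumption: lower bound is inclusive
        return "Good"
    if hit_rate_pct >= 70.0:  # assumption: lower bound is inclusive
        return "Moderate"
    return "Poor"
```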
- Compulsory: First access to a block; unavoidable without prefetching
- Capacity: Cache too small to hold the working set
- Conflict: Blocks that map to the same set evict each other because associativity is too low
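One common way to separate the three miss types is to run a fully associative LRU cache of the same capacity alongside the real cache: a miss on a never-seen block is compulsory, a miss that the fully associative cache would also take is a capacity miss, and the remainder are conflict misses. The sketch below follows that textbook scheme and is not necessarily how the simulator classifies misses; the function name, the block-number trace format, and the modulo set mapping are assumptions.

```python
from collections import OrderedDict


def classify_misses(blocks, num_sets, ways):
    """Count hits and 3C misses for a set-associative LRU cache."""
    sets = [OrderedDict() for _ in range(num_sets)]  # real cache, LRU per set
    full = OrderedDict()       # fully associative LRU reference cache
    capacity = num_sets * ways
    seen = set()               # every block referenced so far
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}
    for b in blocks:
        # Update the fully associative reference cache first.
        full_hit = b in full
        if full_hit:
            full.move_to_end(b)
        else:
            if len(full) >= capacity:
                full.popitem(last=False)  # evict least recently used
            full[b] = True
        # Then update the real cache and classify the outcome.
        s = sets[b % num_sets]
        if b in s:
            s.move_to_end(b)
            counts["hit"] += 1
        else:
            if len(s) >= ways:
                s.popitem(last=False)
            s[b] = True
            if b not in seen:
                counts["compulsory"] += 1
            elif full_hit:
                counts["conflict"] += 1
            else:
                counts["capacity"] += 1
        seen.add(b)
    return counts
```

For example, in a direct-mapped cache with two sets, alternating between blocks 0 and 2 produces conflict misses after the first two accesses, because both blocks map to set 0 while the other set stays empty.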
| Miss Type | High Percentage | Recommendation |
|---|---|---|
| Compulsory | > 50% | Enable prefetching |
| Capacity | > 40% | Increase cache size |
| Conflict | > 30% | Add victim cache or increase associativity |
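The thresholds in the table above can be turned into a small rule check for scripted analysis. This sketch assumes the percentages are each miss type's share of all misses and that the comparison is strict, as the `>` signs suggest; the names `THRESHOLDS` and `recommend` are hypothetical.

```python
# (miss type, threshold in percent, recommendation), taken from the table.
THRESHOLDS = [
    ("compulsory", 50.0, "Enable prefetching"),
    ("capacity", 40.0, "Increase cache size"),
    ("conflict", 30.0, "Add victim cache or increase associativity"),
]


def recommend(miss_pct):
    """Return the recommendations whose threshold is exceeded.

    miss_pct maps a miss type name to its percentage (assumed: of all misses).
    """
    return [
        advice
        for kind, limit, advice in THRESHOLDS
        if miss_pct.get(kind, 0.0) > limit
    ]
```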
For large trace files:
# Enable parallel processing
./cachesim --parallel 4 traces/large_workload.txt

Performance scaling:
- 2 threads: ~1.8x speedup
- 4 threads: ~3.2x speedup
- 8 threads: ~5.5x speedup
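The speedup figures above can be read as parallel efficiency (speedup divided by thread count) to judge when adding threads stops paying off. The sketch below also computes the Karp-Flatt metric, an experimentally determined estimate of the serial fraction; both function names are chosen for this example and are not part of the simulator.

```python
def parallel_efficiency(speedup, threads):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / threads


def serial_fraction(speedup, threads):
    """Karp-Flatt metric: estimated serial fraction of the run (threads > 1)."""
    if threads <= 1:
        raise ValueError("the Karp-Flatt metric needs more than one thread")
    return (1.0 / speedup - 1.0 / threads) / (1.0 - 1.0 / threads)
```

Applied to the approximate figures listed above, efficiency falls from 90% at 2 threads to 80% at 4 threads and about 69% at 8 threads, so the extra threads still help but with diminishing returns.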
- Start with the default configuration to establish a baseline
- Profile before optimizing to identify bottlenecks
- Change one parameter at a time for clear comparisons
- Use representative workloads for accurate analysis
- Consider power/performance tradeoffs with power analysis
- Configuration - Configuration options
- Power Modeling - Power analysis details
- Examples - More usage examples