# NVIDIA Detection Cache Fix

## Problem

On Windows development environments, tests took excessively long because the 2.2GB+ torch package was downloaded repeatedly. This was caused by:

1. **Inconsistent NVIDIA Detection**: The `has_nvidia_smi()` function returned different values between runs
2. **Dynamic PyProject Generation**: Multiple modules generate different `pyproject.toml` content depending on NVIDIA GPU detection
3. **uv-iso-env Behavior**: The `uv-iso-env` package performs a "nuke and pave" reinstall whenever the `pyproject.toml` fingerprint changes
4. **Repeated Downloads**: Each fingerprint change triggered a complete reinstall, including the large torch download

## Root Cause

`has_nvidia_smi()` was called multiple times during test runs, and on Windows the detection could be inconsistent due to:
- System state changes
- Process timing issues
- Environment variable changes
- Path resolution inconsistencies

This caused different `pyproject.toml` content to be generated between runs, which changed the fingerprint and triggered reinstalls.

## Solution

### 1. NVIDIA Detection Caching

Enhanced `has_nvidia_smi()` in `src/transcribe_anything/util.py` to:
- Cache detection results based on a system fingerprint
- Store the cache in `~/.transcribe_anything_nvidia_cache.json`
- Use system information (platform, machine, version) plus nvidia-smi existence as the fingerprint
- Provide consistent results across runs for the same system configuration

### 2. Debug Logging

Added debug logging to the environment generation functions in:
- `src/transcribe_anything/whisper.py`
- `src/transcribe_anything/insanley_fast_whisper_reqs.py`
- `src/transcribe_anything/whisper_mac.py`

Each now logs the MD5 hash of its generated `pyproject.toml` content to help track changes.
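
The logging could look something like the sketch below; the helper name `log_pyproject_hash` is illustrative, not the actual function used in those modules.

```python
import hashlib

def log_pyproject_hash(module_name: str, pyproject_content: str) -> str:
    """Print the MD5 of generated pyproject.toml content so fingerprint
    changes between runs are easy to spot in test output."""
    digest = hashlib.md5(pyproject_content.encode("utf-8")).hexdigest()
    print(f"[debug] {module_name}: pyproject.toml md5={digest}")
    return digest
```

If two consecutive runs print different digests for the same module, the environment fingerprint changed and a reinstall will follow.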

### 3. Cache Management

Added a command-line option to clear the cache when needed:
```bash
transcribe-anything --clear-nvidia-cache
```
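
Wiring such a flag in `_cmd.py` might look like this argparse sketch; the function names `make_parser` and `handle_clear_cache` are hypothetical.

```python
import argparse
from pathlib import Path

CACHE_FILE = Path.home() / ".transcribe_anything_nvidia_cache.json"

def make_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the cache-clearing flag (illustrative)."""
    parser = argparse.ArgumentParser(prog="transcribe-anything")
    parser.add_argument(
        "--clear-nvidia-cache",
        action="store_true",
        help="Delete the cached NVIDIA detection result and force re-detection.",
    )
    return parser

def handle_clear_cache(args: argparse.Namespace) -> None:
    """Remove the cache file when the flag is passed."""
    if args.clear_nvidia_cache:
        CACHE_FILE.unlink(missing_ok=True)
```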

### 4. Testing

Created tests in `tests/test_nvidia_cache.py` to verify:
- Caching behavior works correctly
- Cache clearing works
- Different system fingerprints are handled properly
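
A pytest-style sketch of what such a test might check; this is illustrative and not the actual contents of `tests/test_nvidia_cache.py`.

```python
import json
from pathlib import Path

def test_cache_round_trip(tmp_path: Path) -> None:
    """Store two fingerprints, read them back, then clear the cache."""
    cache_file = tmp_path / "nvidia_cache.json"
    # Two different system fingerprints can coexist in one cache file.
    cache_file.write_text(json.dumps({"fp-windows": True, "fp-linux": False}))
    cache = json.loads(cache_file.read_text())
    assert cache["fp-windows"] is True
    assert cache["fp-linux"] is False
    # Clearing the cache removes the file entirely, forcing re-detection.
    cache_file.unlink()
    assert not cache_file.exists()
```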

## Files Modified

- `src/transcribe_anything/util.py` - Enhanced NVIDIA detection with caching
- `src/transcribe_anything/whisper.py` - Added debug logging
- `src/transcribe_anything/insanley_fast_whisper_reqs.py` - Added debug logging
- `src/transcribe_anything/whisper_mac.py` - Added debug logging
- `src/transcribe_anything/_cmd.py` - Added the clear-cache command-line option
- `tests/test_nvidia_cache.py` - New test file for cache functionality

## Usage

### Normal Operation
Caching is automatic and transparent. The first run detects NVIDIA availability and caches the result; subsequent runs use the cached result, ensuring consistent `pyproject.toml` generation.

### Debugging
If you suspect caching issues, you can:

1. **View debug output**: The system prints debug messages showing:
   - Cached vs. fresh NVIDIA detection results
   - `pyproject.toml` content hashes for each module

2. **Clear the cache**: If hardware changes or you need to force re-detection:
   ```bash
   transcribe-anything --clear-nvidia-cache
   ```

### Expected Behavior
- **First run**: Detects NVIDIA, caches the result, generates the environment
- **Subsequent runs**: Use the cached result and generate an identical environment
- **No more repeated downloads**: Same fingerprint = no reinstall needed

## Benefits

1. **Faster Testing**: Eliminates repeated 2.2GB+ torch downloads
2. **Consistent Behavior**: The same system configuration always produces the same results
3. **Debuggable**: Clear logging shows what's happening
4. **Manageable**: Easy cache clearing when needed
5. **Backward Compatible**: No changes to the existing API or behavior

## Technical Details

The cache file (`~/.transcribe_anything_nvidia_cache.json`) maps system fingerprints to detection results:

```json
{
  "Windows-AMD64-10.0.19041-nvidia_smi:true": true,
  "Linux-x86_64-5.4.0-nvidia_smi:false": false
}
```

The system fingerprint includes:
- Platform system (Windows, Linux, Darwin)
- Machine architecture (AMD64, x86_64, arm64)
- Platform version
- Whether the nvidia-smi executable exists

This ensures that hardware or driver changes are detected while maintaining consistency for the same configuration.
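
Building a key from those components can be sketched with the standard `platform` module; the exact separator and ordering here are assumptions based on the example keys above.

```python
import platform
import shutil

def system_fingerprint() -> str:
    """Join the fingerprint components into a single cache key (illustrative)."""
    has_smi = shutil.which("nvidia-smi") is not None
    return "-".join([
        platform.system(),    # "Windows", "Linux", or "Darwin"
        platform.machine(),   # "AMD64", "x86_64", "arm64", ...
        platform.version(),   # OS build/version string
        f"nvidia_smi:{str(has_smi).lower()}",
    ])
```

A driver install or GPU swap changes the final component, producing a new key and triggering one fresh detection rather than reusing a stale result.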