Summary
The macOS EXO app shell sustains ~500–620 KB/sec of file-backed memory dirtied while the cluster-state polling loop is running. macOS treats this as anomalous (the per-process daily-average disk-write limit is ~25 KB/sec) and emits microstackshot diagnostic reports under `/Library/Logs/DiagnosticReports/EXO_*.diag`.
The writes come entirely from `__CFURLCache::CreateAndStoreCacheNode` flushing HTTP response bodies to `~/Library/Caches/exolabs.EXO/`. They serve no functional purpose — `ClusterStateService` polls `/state` at 2 Hz and never reads from the cache, only writes to it.
A fix is straightforward; PR follows.
Environment
- macOS 26.4.1 (Build 25E253)
- Mac Studio (Mac15,14), M3 Ultra, 512 GB
- EXO 1.0.71 (1000071999)
- Single-node and multi-node configurations both reproduce
Symptom
Six microstackshot reports collected on one node over eight days:
| Filename | Total writes | Duration | Sustained rate |
| --- | --- | --- | --- |
| EXO_2026-04-22-131216_atlas.diag | 2.15 GB | 4077 s | 527 KB/s |
| EXO_2026-04-22-180238_atlas.diag | 8.59 GB | 17421 s | 493 KB/s |
| EXO_2026-04-23-111208_atlas.diag | 2.15 GB | 3487 s | 616 KB/s |
| EXO_2026-04-23-150044_atlas.diag | 8.59 GB | 13715 s | 626 KB/s |
| EXO_2026-04-24-062837_atlas.diag | 34.36 GB | 55673 s (≈15 h) | 617 KB/s |
| EXO_2026-04-29-125940_atlas.diag | 2.15 GB | 3463 s | 620 KB/s |
Each report headline:
```
Event:        disk writes
Action taken: none
Writes:       <N> MB of file backed memory dirtied over <secs> seconds (<rate> KB/sec average),
              exceeding limit of 24.86 KB per second over 86400 seconds
```
Root cause
The heaviest stack on every report (177 of 185 samples on the most recent one; 3066 of 3116 samples on the 15-hour one) is:

```
start_wqthread
_pthread_wqthread
_dispatch_workloop_worker_thread
_dispatch_root_queue_drain_deferred_wlh
_dispatch_lane_invoke
_dispatch_lane_serial_drain
_dispatch_client_callout
_dispatch_block_async_invoke2
invocation function for block in __CFURLCache::CreateAndStoreCacheNode(...)
_CFURLCacheFSWriteCachedResponseToFS
write + 8 (libsystem_kernel.dylib)
```
That stack tells us that 96–98% of the dispatched work is CFNetwork's URL response cache writing cached response bodies to disk.
The relevant code is in `app/EXO/EXO/Services/ClusterStateService.swift`:
- `init` defaults `session: URLSession = .shared`. `URLSession.shared` ships with `URLCache.shared` attached, which has a nonzero on-disk `diskCapacity` by default.
- `startPolling(interval:)` defaults to 0.5 s and calls `fetchSnapshot()` on every tick — that's `GET /state` against the local exo Python server twice per second.
- The per-`URLRequest` cache policy is set to `.reloadIgnoringLocalCacheData`, but that only affects read behavior — the response is still written to the URL cache after each successful fetch. (See Apple's docs on `URLRequest.CachePolicy`.)

So every snapshot poll persists its response body to disk, regardless of whether the client will ever read it back. `/state` responses scale with the number of models loaded × peers × instances and can easily reach tens of KB; at 2 Hz that's hundreds of KB/sec sustained — exactly what the diagnostic reports show.
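The write-but-never-read behavior can be demonstrated in isolation. The sketch below is illustrative, not EXO code — the host and port are placeholders for wherever the local exo server listens — and it shows that a `.reloadIgnoringLocalCacheData` request issued through `URLSession.shared` still lands in `URLCache.shared` once the response arrives:

```swift
import Foundation

// Illustrative sketch, not EXO code. The host/port are placeholders
// for the local exo server's /state endpoint.
let url = URL(string: "http://localhost:8000/state")!
var request = URLRequest(url: url)
request.cachePolicy = .reloadIgnoringLocalCacheData  // affects reads only

let task = URLSession.shared.dataTask(with: request) { _, _, _ in
    // The policy above skipped the cache on the way in, but CFNetwork
    // still stored the response on the way out.
    if URLCache.shared.cachedResponse(for: request) != nil {
        print("response was written to the cache anyway")
    }
}
task.resume()
```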
Why this matters
- SSD wear — 34 GB of cache writes for a 15-hour idle-ish polling session is gratuitous. Internal SSDs on Mac Studios can't be replaced without sending the unit to Apple.
- Background CPU — `_dispatch_block_async_invoke2 → write` sustained on a worker thread.
- Cache directory growth — `~/Library/Caches/exolabs.EXO/` accumulates indefinitely.
- macOS resource-limit microstackshots — macOS tags the process as "noisy on disk" (`Action taken: none` today, but the OS may escalate over time).
Cross-checked: zero microstackshot reports on the same network's other M3 Ultra running EXO 1.0.71 and serving inference but not running the EXO macOS app shell (the Swift menubar process is what hits this — the headless Python `exo` CLI alone does not). That confirms the issue is in the Swift shell's URL cache behavior, not in the Python core.
Suggested fix
Switch `ClusterStateService`'s default session from `URLSession.shared` to an ephemeral session with `urlCache = nil`. Cluster-state responses are time-sensitive and small; nothing benefits from being cached on disk.
```swift
private static func makeNonCachingSession() -> URLSession {
    let config = URLSessionConfiguration.ephemeral
    config.urlCache = nil
    config.requestCachePolicy = .reloadIgnoringLocalCacheData
    return URLSession(configuration: config)
}
```
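For completeness, here is one way the wiring could look — a hypothetical sketch of the initializer change, since the actual shape of `ClusterStateService`'s `init` may differ from this:

```swift
import Foundation

final class ClusterStateService {
    private let session: URLSession

    // Hypothetical wiring: default to the non-caching session instead
    // of URLSession.shared. (The real initializer may take more parameters.)
    init(session: URLSession = ClusterStateService.makeNonCachingSession()) {
        self.session = session
    }

    private static func makeNonCachingSession() -> URLSession {
        let config = URLSessionConfiguration.ephemeral
        config.urlCache = nil
        config.requestCachePolicy = .reloadIgnoringLocalCacheData
        return URLSession(configuration: config)
    }
}
```

Callers that construct the service with no arguments pick up the non-caching session automatically, and tests can still inject a custom `URLSession`.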
PR with this fix incoming as a follow-up.
Alternative considered
App-wide `URLCache.shared = URLCache(memoryCapacity: 0, diskCapacity: 0)` at app launch. This would also cover `BugReportService` (which uses `URLSession.shared` for crash report uploads) and any future callers. It's a one-line change but has a larger blast radius — `Sparkle.framework` and other system code that uses the shared session would also lose caching. The per-service fix is the minimal surgical change.
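For reference, the app-wide variant would be a single assignment early in launch. This is a sketch against a hypothetical `AppDelegate`; EXO's actual entry point may differ:

```swift
import AppKit

final class AppDelegate: NSObject, NSApplicationDelegate {
    func applicationDidFinishLaunching(_ notification: Notification) {
        // Replace the shared cache with a zero-capacity one: every client
        // of URLSession.shared in the process stops caching to disk,
        // including Sparkle and any future callers.
        URLCache.shared = URLCache(memoryCapacity: 0, diskCapacity: 0)
    }
}
```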
Happy to switch the PR to the app-wide approach if maintainers prefer.
Reproduction
- Run EXO 1.0.71 on macOS 26.x (any recent version).
- Let it idle (no inference) for 30+ minutes.
- Check `/Library/Logs/DiagnosticReports/` for `EXO_*.diag` files.
- The first sample arrives once macOS detects that the per-process daily disk-write average has been exceeded.

The headless Python `exo` CLI does not reproduce — only the macOS menubar app.