fix: don't block stdio transport on cache warmup#249

Closed
mark-liu wants to merge 3 commits into korotovsky:master from mark-liu:fix/stdio-lazy-startup
Conversation

@mark-liu

Problem

The stdio transport blocks serving until IsReady() returns true, which requires both the users and channels caches to be fully populated. On large workspaces (Kubernetes Slack has 225k+ members), the user cache load takes several seconds — long enough that MCP clients with startup timeouts (like Claude Code) fail the connection before the server is ready.

The SSE and HTTP transports don't have this problem — they serve immediately and log "still warming up caches" while the background goroutine populates.

Fix

Remove the blocking IsReady() loop for stdio and match the SSE/HTTP behaviour: serve immediately, let caches warm in the background.

Tools that depend on the user cache (DM channel naming, user search, display name resolution in message history) gracefully degrade by showing user IDs until the cache is ready.

Testing

Tested against Kubernetes Slack (225,807 members, 555 channels):

  • Before: stdio transport times out during cache load (~4s blocked)
  • After: stdio transport serves immediately, caches warm in background

The stdio transport blocks on IsReady() which requires both the users
and channels caches to be fully populated before serving. On large
workspaces (200k+ members like Kubernetes Slack), the user cache takes
several seconds to load, causing MCP clients with tight startup timeouts
to fail the connection.

This brings stdio in line with the SSE and HTTP transports, which already
serve immediately while caches warm in the background. Tools that depend
on the user cache gracefully degrade (showing user IDs instead of display
names) until the cache is ready.

Signed-off-by: Mark Liu <mark@prove.com.au>
…AuthTest dedup

Three fixes for running multiple MCP server instances concurrently:

1. Atomic cache writes — write to temp file then rename, prevents
   concurrent readers from seeing partial 400MB+ JSON files.

2. Startup jitter — random 0-3s delay before first API call, staggers
   concurrent instances to avoid Slack rate limit thundering herd.

3. AuthTest deduplication — validateAuthAndGetTeamID now returns the
   full AuthTestResponse which is passed to NewMCPSlackClient, eliminating
   a redundant AuthTest call per startup (was 2 per instance, now 1).
@mark-liu
Author

Superseded by #259 (squashed and rebased with #256).
