Cache pooler_check_query response per pool#254
Merged
Conversation
added 8 commits
May 15, 2026 20:50
Before, only ";" worked as pooler_check_query: pg_doorman always replied with a hardcoded EmptyQueryResponse + ReadyForQuery, regardless of the configured value. Setting "select 1" — the keep-alive query that Wildfly, HikariCP and many JDBC pools default to — produced the same empty response, which is semantically wrong for a non-empty query. Now pg_doorman forwards the first matching SimpleQuery to PostgreSQL, caches the response per pool, and serves subsequent matching probes from cache without touching the backend. The cache validates against the current general.pooler_check_query value on every lookup, so a RELOAD that changes the value triggers a fresh backend probe on the next ping without an explicit invalidation hook. ErrorResponse from the backend is never cached: the next probe re-fetches. The operator's responsibility: pooler_check_query must be a stable, side-effect-free SimpleQuery. Same input must produce the same output every time. Safe choices include ";" (default), "select 1", "select 'pg_doorman'", "select version()". Unsafe choices include "select now()", "select pg_is_in_recovery()" (changes on failover), or any UPDATE/DELETE/CALL — their first-call result would be cached and replayed indefinitely. Default ";" is no longer special-cased: it goes through the same cache flow. On a cold pool the first ";" now performs one backend round-trip and caches PostgreSQL's EmptyQueryResponse. Load balancers pinging through pg_doorman will surface a backend-unreachable condition as a probe failure on the very first probe of a fresh pool, instead of an unconditional OK from a hardcoded reply. Two new counter metrics expose the cache hit rate: - pg_doorman_pooler_check_query_backend_total (cache miss → backend probe) - pg_doorman_pooler_check_query_cache_total (cache hit → served from memory)
Operator-facing documentation for the cache shipped in the previous commit: - Stability contract: the query must be deterministic and side-effect free. - Safe values: ";", "select 1", "select 'pg_doorman'", "select version()". - Unsafe values with explicit cache-freeze failure modes: select now() / clock_timestamp() (frozen timestamp), pg_is_in_recovery() (failover blindness), counting queries (frozen count), UPDATE/INSERT/DELETE/CALL/DO (one-shot side effect cached as success forever). - Cold-pool behavior change called out: the first probe per pool now performs one PostgreSQL round-trip, even for the default ";". - Two new counter metrics with the hit-rate formula. - 3.10.0 changelog entry. pg_doorman.toml and pg_doorman.yaml regenerated from the updated fields.yaml so the library reference tests stay in sync.
The field was declared as Option<Vec<u8>> on the config struct, defaulted to None, and read by poller_check_query_request_bytes_vec via an if-let-Some shortcut. Nothing ever wrote a Some value into it, so the shortcut was unreachable and the field was permanently dead. Removed the field, the unreachable branch, and its entry in the fields.yaml structural-allowlist for the config validator.
…lity
Multi-discipline review flagged three correctness issues and a hot-path
regression:
- Stale-client cache poisoning across RELOAD. Before, the byte-match used
a snapshot captured at client connect, while the cache key came from the
live config in the handler. A RELOAD that races with an in-flight probe
could write a cache entry for one query under the key of another. Now
the byte-match and the cache key read the same process-wide snapshot,
swapped atomically when the config reloads. Bytes from a stale config
no longer match the active probe; they fall through to the normal SQL
path instead.
- The handler re-resolved the pool through the global pool map, which
could pick a different generation than the client's authenticated pool
after a RELOAD or dynamic-pool swap. The authenticated pool is now
threaded through.
- Backend lifecycle was incomplete. recv/send/cleanup errors did not mark
the backend bad before returning it to the pool, and the cache accepted
any response without an ErrorResponse, including non-idle states like
an open transaction or a COPY in progress. Errors now mark the backend
bad before propagating, and the cache only accepts responses that end
in ReadyForQuery('I')dle.
Hot path: the cache reader switched from load_full to a guarded read,
dropping an atomic refcount bump per warm hit. The full Config clone
per probe is gone — the new snapshot already carries the query string.
Cleanup: removed the per-client request_bytes vec and its initialisers,
the unused poller_check_query_request_bytes_vec helper, and a config
field that was declared but never written.
Docs: the two probe counters are now in the canonical prometheus.md
reference (EN regenerated, RU hand-edited).
Tests:
- 9 new unit tests (snapshot encoding, idle-RFQ detector).
- 1 new BDD scenario: stale probe bytes after RELOAD bypass the cache
path and reach the backend as regular SQL; the new probe value drives
a fresh backend hit and populates a fresh cache entry.
The BDD matrix in .github/workflows/bdd-tests.yml filters scenarios by @rust-1..4 / @go / @nodejs / etc. tags. The cache feature shipped with only @pooler-check-query-cache, so every CI job silently skipped it while reporting green. Adding the @rust @rust-2 tags routes the six scenarios into the existing Rust part 2 job, the least-loaded Rust shard in the current matrix.
The BDD matrix in .github/workflows/bdd-tests.yml dispatches scenarios by matrix tag (@rust-1..4, @go, @python, @nodejs, etc.). Sixteen feature files carried only their own private tags (@config-test, @connection, @hba, @prometheus, ...) — they passed `make test-bdd TAGS=@foo` locally but every CI job silently skipped them while the suite reported green. The lapse was caught when feat/pooler-check-query-cache landed on the same trap. Added @rust @rust-N to each, distributing across the three less-loaded shards (rust-1 +5, rust-2 +5, rust-4 +6) and leaving rust-3 (sleep-heavy) untouched. Resulting balance: 8 / 9 / 20 / 14. `bench.feature` already runs via .github/workflows/bench.yml in release mode, so it stays on @bench alone.
The matrix wiring in the previous commit pulled sixteen orphan features
into CI. Two of them masked existing bugs and two have pre-existing
product issues outside this PR's scope.
Fixed:
- src/app/config.rs: when -t / --test-config exits on parse error, the
message now always goes to stderr. Previously the branch for
non-terminal stdin used `log::error!`, but the logger is not yet
initialized at that point, so CI captured an empty stream. The
scenario "Test invalid configuration file with -t flag" now sees the
expected "Config parse error:" prefix.
- tests/go/hba/hba_trust_test.go: the HBA reject error message no longer
prefixes the peer with "IP address" and now includes the source port.
The Go test for HBA deny was pinned to the older string and would
never match the current output. Loosened the substring assertion to
the stable portion ("to <user>@<db> (TLS: false) is not permitted by
HBA configuration") which still proves the path is taken.
Deferred (matrix tag reverted, scenarios silently skipped again, TODO
issue notes saved in MEMORY/project_orphan_bdd_features.md):
- pool-size-show-pools.feature "Dynamic pool" scenario: auth_query
passthrough returns pool_size=1 instead of the configured
default_pool_size=3. Product bug in dynamic-pool size resolution.
- rust-deferred-begin-bug.feature: all four scenarios skip because a
step definition the file relies on is no longer registered in the
test binary. Needs a separate investigation of the extended-session
helper.
pool-size-show-pools.feature "Dynamic pool" scenario: the YAML used auth_query.default_pool_size, a key that was renamed to auth_query.pool_size in commit e383a7a but never propagated here. With the new value the SHOW POOLS column matches the configured 3. Matrix tag @rust-4 restored. rust-deferred-begin-bug.feature: implemented the missing "session should receive ParseComplete/BindComplete/CommandComplete 'tag'/ReadyForQuery 'status'" step bindings as a thin helper over the already-captured session message log. Also renamed "we create extended session" to "we create session" — extended-protocol sessions are the same regular sessions, the Parse/Bind/Execute/Sync steps already exist. The deferred-begin scenarios still fail one assertion: after Parse "BEGIN" + Bind + Execute + Sync the server backend is allocated (sv_active=1), and the test wants 0. That is a real product gap — pg_doorman only short-circuits standalone BEGIN over SimpleQuery (is_standalone_begin in client/transaction.rs), not over the extended protocol. Implementing extended-BEGIN deferral is its own change. The matrix tag stays off until the gap is closed; the new step bindings are kept because they are useful to whichever PR picks this up.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replaces the hardcoded local reply for
general.pooler_check_querywith aper-pool response cache. The first matching SimpleQuery in each pool's
lifetime is forwarded to PostgreSQL; every subsequent matching probe is
answered from cache without touching the backend. This unblocks non-empty
keepalive queries that JDBC pools default to —
select 1from WildFly,HikariCP and most JDBC drivers now returns the real row instead of an
empty response.
Behavior changes
;is no longer special-cased. The first probe per poolperforms one PostgreSQL round-trip and the response (empty for
;) iscached. If PostgreSQL is unreachable at that moment, the probing client
sees a failure instead of an unconditional OK. The previous hardcoded
reply reported the pooler healthy even when PostgreSQL was down.
RELOADthat changespooler_check_queryinvalidates the cache on the next ping. A reloadthat keeps the same value keeps the cached response.
ErrorResponsefrom backend is forwarded to the client and never cached.
Operator contract
pooler_check_querymust be stable: same input → same bytes, no sideeffects. Safe:
;,select 1,select 'pg_doorman',select version().Unsafe (cache will freeze the response):
select now(),select pg_is_in_recovery()(failover blindness),select count(*) from <table>, UPDATE/INSERT/DELETE/CALL/DO.New metrics
pg_doorman_pooler_check_query_backend_total— probes forwarded toPostgreSQL (cache miss).
pg_doorman_pooler_check_query_cache_total— probes served from cache.Ratio
cache_total / (cache_total + backend_total)is the hit rate.Test plan
BufferingWriter,has_error_responseframe parser,CheckQueryCacheget/set semantics.make test-bdd TAGS=@pooler-check-query-cache):default
;cache, customselect 1cache, RELOAD invalidation,backend ErrorResponse never cached, large response (10 KB) drained
via the recv-loop with pool staying in sync.
cargo fmt --allclean.cargo clippy --lib --tests -- --deny warningsclean.test_prometheus_server_basicreproduces onmaster too; unrelated.
make generatere-emittedpg_doorman.toml/yamland EN referencedocs from the updated
fields.yaml; library reference tests in sync.