Symptom
GET https://aevatar-console-backend-api.aevatar.ai/api/health returns HTTP 503 with gagent-service reported as unhealthy and the entire host as not-ready.
{
  "name": "gagent-service",
  "category": "capability",
  "critical": true,
  "status": "unhealthy",
  "message": "Elasticsearch query failed: 400 Bad Request. body={\"error\":{\"root_cause\":[{\"type\":\"query_shard_exception\",\"reason\":\"No mapping found for [updated_at_utc_value] in order to sort on\",\"index_uuid\":\"C_2X-E2gRsGUX_sigX8cug\",\"index\":\"aevatar-mainnet-script-catalog-entries\"}], ... ,\"status\":400}",
  "details": {
    "requiredRoutes": "/api/services, /api/scopes/{scopeId}/binding, /api/scopes/{scopeId}/workflows, /api/scopes/{scopeId}/scripts",
    "exceptionType": "System.InvalidOperationException"
  }
}
Because gagent-service is critical: true, the whole host is not-ready. Other capabilities (scripting-bundle, studio, workflow-bundle, workflow-document-readmodel, workflow-graph-readmodel) all report healthy.
Captured at 2026-04-25T06:31:04Z (UTC) against Aevatar.Mainnet.Host.Api.
Root cause
The gagent-service probe issues an Elasticsearch query that explicitly sorts on UpdatedAt, but the index aevatar-mainnet-script-catalog-entries either has no documents (empty dynamic mapping) or has docs whose updated_at_utc_value is serialized as a nested struct (not a sortable scalar). When the sort field has no usable mapping, Elasticsearch returns 400 unless the sort clause includes unmapped_type / missing hints.
The default-sort path includes those hints; the explicit-sort path does not. That asymmetry is the bug.
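To make the asymmetry concrete, the two sort clauses differ roughly as sketched below. This is an illustrative Python rendering of the request bodies, not the repository's C# code: the `size` value and the variable names are assumptions, while `missing` and `unmapped_type` are the Elasticsearch sort options named above.

```python
import json

FIELD = "updated_at_utc_value"

# Explicit-sort path: no hints. If the index has no mapping for FIELD,
# Elasticsearch rejects the query with query_shard_exception (HTTP 400).
failing_body = {
    "size": 1,  # assumed page size, for illustration only
    "sort": [{FIELD: {"order": "desc"}}],
}

# Default-sort path: the hints tell Elasticsearch how to treat an unmapped
# field and where to place documents that lack a value.
tolerant_body = {
    "size": 1,  # assumed page size, for illustration only
    "sort": [
        {FIELD: {"order": "desc", "missing": "_last", "unmapped_type": "date"}}
    ],
}

print(json.dumps(failing_body["sort"]))
print(json.dumps(tolerant_body["sort"]))
```

The only difference between the two bodies is the pair of hint keys, which is why the failure depends on index state and not on the query itself.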
Call path
1. GAgentServiceCapabilityHostBuilderExtensions.AddGAgentServiceCapabilityBundle's probe at src/platform/Aevatar.GAgentService.Hosting/Endpoints/GAgentServiceCapabilityHostBuilderExtensions.cs:34-35 calls IScopeScriptQueryPort.ListAsync("health", ct).
2. ScopeScriptQueryApplicationService.ListAsync (src/platform/Aevatar.GAgentService.Application/Scripts/ScopeScriptQueryApplicationService.cs:22-35) → IScriptCatalogQueryPort.ListCatalogEntriesAsync(catalogActorId, take, ct).
3. ProjectionScriptCatalogQueryPort.ListCatalogEntriesAsync (src/Aevatar.Scripting.Projection/ReadPorts/ProjectionScriptCatalogQueryPort.cs:73-94) builds a ProjectionDocumentQuery with explicit Sorts = [{ FieldPath = nameof(ScriptCatalogEntryDocument.UpdatedAt), Direction = Desc }].
4. The Elasticsearch payload builder resolves "UpdatedAt" to the proto field name updated_at_utc_value via ResolveFieldPath / BuildFieldCandidates (src/Aevatar.CQRS.Projection.Providers.Elasticsearch/Stores/ElasticsearchProjectionDocumentStore.cs:317-394). The _utc_value candidate matches ScriptCatalogEntryDocument.updated_at_utc_value (proto field 5, google.protobuf.Timestamp, see src/Aevatar.Scripting.Projection/script_projection_read_models.proto:48-63).
5. BuildSortSpec in src/Aevatar.CQRS.Projection.Providers.Elasticsearch/Stores/ElasticsearchProjectionDocumentStorePayloadSupport.cs:176-202 takes the explicit branch and emits BuildSortClause(..., includeMissingHints: false).
6. BuildSortClause (...PayloadSupport.cs:204-224) only adds "missing":"_last" and "unmapped_type":"date" when includeMissingHints == true. The default-sort path passes true; the explicit-sort path passes false.
7. The index metadata provider ScriptCatalogEntryDocumentMetadataProvider (src/Aevatar.Scripting.Projection/Metadata/ScriptCatalogEntryDocumentMetadataProvider.cs:8-15) declares only "dynamic": true with no explicit field mappings.
8. Without unmapped_type and with no document-derived mapping for updated_at_utc_value, Elasticsearch returns query_shard_exception: No mapping found for [updated_at_utc_value] in order to sort on.
The other two probes in the same bundle don't hit this path: IServiceLifecycleQueryPort.ListServicesAsync and IScopeWorkflowQueryPort.ListAsync either don't sort or sort in-process after fetch (ScopeWorkflowQueryApplicationService.cs:40 orders the materialized list client-side). So the failure surfaces only on the script catalog leg.
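The branching described in the call path can be sketched as follows. This is a Python stand-in for BuildSortSpec and BuildSortClause, written only from the description above: the function signatures, the tuple shape for explicit sorts, the `default_field` parameter, and the default sort direction are all assumptions, not the actual C# code.

```python
from typing import Dict, List, Sequence, Tuple


def build_sort_clause(field: str, descending: bool,
                      include_missing_hints: bool) -> Dict[str, dict]:
    """Stand-in for BuildSortClause: hints are added only when requested."""
    options = {"order": "desc" if descending else "asc"}
    if include_missing_hints:
        options["missing"] = "_last"
        options["unmapped_type"] = "date"
    return {field: options}


def build_sort_spec(explicit_sorts: Sequence[Tuple[str, bool]],
                    default_field: str) -> List[Dict[str, dict]]:
    """Stand-in for BuildSortSpec: the two branches pass different flags."""
    if explicit_sorts:
        # Explicit branch: hints are omitted, which is the reported defect.
        return [build_sort_clause(field, descending, include_missing_hints=False)
                for field, descending in explicit_sorts]
    # Default branch: hints are included (descending order is assumed here).
    return [build_sort_clause(default_field, True, include_missing_hints=True)]


explicit = build_sort_spec([("updated_at_utc_value", True)], "updated_at_utc_value")
default = build_sort_spec([], "updated_at_utc_value")
print(explicit)
print(default)
```

Both calls sort on the same field in the same direction, yet only the default branch produces a clause that survives an unmapped field.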
Why now / scope of impact
- This affects any deployment where aevatar-{env}-script-catalog-entries is empty or where updated_at_utc_value was never auto-mapped, i.e. mainnet today, and any new environment before the first script publish.
- The same defect applies to every projection query that issues an explicit Sorts = [...] against a field that may not yet be mapped (timestamp fields are the typical victim, since proto Timestamp serializes to a nested struct under System.Text.Json defaults).
- Existing tests assert that the default-sort path correctly emits unmapped_type (test/Aevatar.CQRS.Projection.Core.Tests/ElasticsearchProjectionDocumentStoreBehaviorTests.cs:77-78), but no test covers the explicit-sort path.
Suggested fix (pick or combine)
1. Always include the safety hints on every sort clause. In BuildSortClause, always emit "missing":"_last" and "unmapped_type":.... Infer unmapped_type from the resolved field path (*_utc_value → "date", otherwise "keyword"), or thread the proto FieldType/MessageType through fieldPathResolver so the sort builder knows the target type.
2. Declare explicit ES mappings for sortable fields in IProjectionDocumentMetadataProvider implementations, at minimum updated_at_utc_value / created_at_utc_value as date (and document the expected JSON shape).
3. Normalize Timestamp serialization so updated_at_utc_value lands in ES as an ISO-8601 string instead of { "seconds":..., "nanos":... }. This keeps the dynamic-mapping path honest.
Option 1 is the smallest defensible patch and unblocks the health check immediately; option 2 or 3 is the correct structural fix and should follow.
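A minimal sketch of the three options, in Python for brevity; it is not the repository's C# code. All function and constant names are hypothetical, the mapping body assumes the standard Elasticsearch "properties" layout, and the timestamp helper truncates nanoseconds to milliseconds as a simplifying assumption.

```python
from datetime import datetime, timezone


# Option 1: always emit the hints, inferring unmapped_type from the field name.
def infer_unmapped_type(resolved_field: str) -> str:
    # Heuristic from the proposal: *_utc_value is a date, anything else a keyword.
    return "date" if resolved_field.endswith("_utc_value") else "keyword"


def build_sort_clause(resolved_field: str, descending: bool) -> dict:
    return {
        resolved_field: {
            "order": "desc" if descending else "asc",
            "missing": "_last",
            "unmapped_type": infer_unmapped_type(resolved_field),
        }
    }


# Option 2: declare the sortable fields explicitly instead of relying on
# dynamic mapping alone.
EXPLICIT_MAPPINGS = {
    "dynamic": True,
    "properties": {
        "updated_at_utc_value": {"type": "date"},
        "created_at_utc_value": {"type": "date"},
    },
}


# Option 3: serialize a proto Timestamp (seconds, nanos) as an ISO-8601 string
# so that dynamic mapping can detect it as a date.
def timestamp_to_iso8601(seconds: int, nanos: int = 0) -> str:
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    millis = nanos // 1_000_000  # truncate to millisecond precision
    return moment.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}Z"


print(build_sort_clause("updated_at_utc_value", descending=True))
print(build_sort_clause("script_id", descending=False))  # hypothetical field
print(timestamp_to_iso8601(0))
```

A regression test for the explicit-sort path could assert on the clause produced by option 1, mirroring the existing default-sort assertion mentioned under scope of impact.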
Repro
curl -sSL https://aevatar-console-backend-api.aevatar.ai/api/health | jq '.components[] | select(.name=="gagent-service")'
Related
Not a duplicate of #355 (Channel registration startup degraded mode) — same shape (read-model bootstrap masks an unhealthy capability) but different read model, different code path.