.NET 8 API: Graylog → OpenAI classification → PostgreSQL incidents → optional Jira outbound. On-demand POST /api/log/analyze uses a bounded queue.
| Component | Role |
|---|---|
| LogAnalyzer.Api | HTTP host, EF migrations, controllers. |
| LogAnalyzer.Processor | Queue worker + PeriodicLogAnalysisBackgroundService (Graylog batch → OpenAI when needed). |
| LogAnalyzer.Infrastructure | Npgsql, Graylog client, incidents, outbound channel + Jira REST. |
| LogAnalyzer.AI | OpenAI Chat Completions client. |
| LogAnalyzer.ServiceDefaults | OTEL (LogAnalyzer.Outbound meter), /health, /alive. |
| LogAnalyzer.AppHost | Runs LogAnalyzer.Api locally + Aspire Dashboard. |
| Docker Compose | Postgres, pgAdmin, mock-log-producer (GELF → Graylog). No API container. |
Periodic pipeline, per cycle:
- Dedupe the fetched lines.
- Partition uncached lines by operational fingerprint (evidence-based groupId).
- Call OpenAI AnalyzeAsync once per group, so similar lines share one prompt.
- Write one LogAnalyses row per line (line-hash cache) and upsert one incident per group, which gives a separate Jira task per operational group.
- Enqueue outbound work when enabled.

OpenAI call volume is capped by PeriodicAnalysis:MaxOpenAiCallsPerCycle. The first cycle runs only after PeriodicAnalysis:IntervalMinutes has elapsed; nothing runs at process start.
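The per-cycle cap can be pictured with a small shell sketch. It is not the repository's implementation: it assumes the groupId has already been computed for each line, and it assumes groups are served in first-seen order, which the README does not state. The function name `plan_cycle` and the output format are hypothetical.

```shell
#!/bin/sh
# plan_cycle MAX_CALLS
# Reads "groupId<TAB>log line" rows on stdin (groupId assumed precomputed).
# Prints one row per group: ANALYZE for groups within the cap, DEFER for the rest.
plan_cycle() {
  awk -F '\t' -v max="$1" '
    !($1 in count) { order[++n] = $1 }   # remember groups in first-seen order
    { count[$1]++ }                      # lines in a group share one prompt
    END {
      for (i = 1; i <= n; i++) {
        action = (i <= max + 0) ? "ANALYZE" : "DEFER"
        printf "%s %s lines=%d\n", action, order[i], count[order[i]]
      }
    }'
}
```

With a cap of 2 and lines spread over three groups, the first two groups would each get one call and the third would wait for the next cycle.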
| Layer | What runs |
|---|---|
| Compose | Database + pgAdmin + producer sending synthetic logs toward Graylog via GELF. |
| AppHost / VS | API process + Aspire Dashboard + tracing defaults. |
The API does not run inside Compose for day-to-day development. An optional production-style image can still be built from the repo Dockerfile with a manual docker build; it is not part of docker compose.
Graylog uses two different surfaces:
| Surface | Purpose | Where |
|---|---|---|
| GELF UDP | Docker logging driver for mock-log-producer only; this is where log lines are sent into Graylog. | .env → GRAYLOG_GELF_ADDRESS (Compose substitution). Not read by ASP.NET GraylogOptions. |
| REST API | API periodic job searches Graylog (GraylogLogProvider). | Graylog in appsettings*.json / user-secrets (BaseUrl, ApiToken, Query, timeouts, pagination, …). |
Jira: configure the Jira section in appsettings.json (defaults) and appsettings.Development.json (local overrides), or use dotnet user-secrets for tokens. Never put Jira settings in .env in this repo's Compose layout.
| Concern | Files |
|---|---|
| Postgres | POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD in .env.example → .env, substituted into docker-compose.yml. |
| pgAdmin | PGADMIN_DEFAULT_EMAIL, PGADMIN_DEFAULT_PASSWORD |
| mock-log-producer → Graylog | GRAYLOG_GELF_ADDRESS (GELF UDP endpoint Docker-side). Producer binary itself has no extra env in this stack. |
Do not put OpenAI__, Graylog__, Jira__, PeriodicAnalysis__, ConnectionStrings__, or any other ASP.NET double-underscore variables in .env.
DB pairing: POSTGRES_* must match ConnectionStrings:DefaultConnection (database name, user, password on localhost) in appsettings.json.
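A quick way to catch a mismatch is to compare the two sources directly. The sketch below is illustrative only: it assumes a plain `KEY=value` .env file without quoting, and a connection string that uses the Npgsql keywords `Database`, `Username` and `Password` with exactly that casing and no `;` inside values. The function names are hypothetical.

```shell
#!/bin/sh
# env_value FILE KEY: print the value of KEY from a simple KEY=value .env file.
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# conn_value STRING KEY: print the value of KEY from a "K=v;K=v" connection string.
conn_value() {
  printf '%s\n' "$1" | tr ';' '\n' | sed -n "s/^ *$2 *= *//p" | head -n 1
}

# check_db_pairing ENV_FILE CONNECTION_STRING: report mismatches, return 1 if any.
check_db_pairing() {
  status=0
  for pair in POSTGRES_DB:Database POSTGRES_USER:Username POSTGRES_PASSWORD:Password; do
    env_key=${pair%%:*}
    conn_key=${pair##*:}
    if [ "$(env_value "$1" "$env_key")" != "$(conn_value "$2" "$conn_key")" ]; then
      echo "mismatch: $env_key vs $conn_key"
      status=1
    fi
  done
  return $status
}
```

If `jq` is installed and appsettings.json contains no comments, the connection string could be read with `jq -r '.ConnectionStrings.DefaultConnection' LogAnalyzer/appsettings.json` and passed as the second argument.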
| Concern | Files |
|---|---|
| ConnectionStrings, OpenAI, Graylog (REST), PeriodicAnalysis (interval, line caps, MaxOpenAiCallsPerCycle), IncidentReuse, IncidentAiSnapshot, Webhook, Jira, host Logging / AllowedHosts | LogAnalyzer/appsettings.json |
| Local/dev overrides | LogAnalyzer/appsettings.Development.json (merged over appsettings.json). For shared repos prefer dotnet user-secrets instead of committing secrets here. |
| Secrets (OpenAI:ApiKey, Graylog:ApiToken, Jira tokens, …) | dotnet user-secrets on LogAnalyzer.Api |
| Concern | Files |
|---|---|
| AppHost / DCP log noise | LogAnalyzer.AppHost/appsettings.json (and .Development.json if present). Does not replace LogAnalyzer/appsettings*.json for the API. |
LogAnalyzer/configuration.template.json — same JSON as appsettings.json (secrets empty); reference / diff only — never loaded by the runtime.

    dotnet user-secrets set "OpenAI:ApiKey" "<key>" --project LogAnalyzer/LogAnalyzer.Api.csproj
    dotnet user-secrets set "Graylog:ApiToken" "<token>" --project LogAnalyzer/LogAnalyzer.Api.csproj

- `docker compose up -d`: Postgres, pgAdmin, mock-log-producer.
- Graylog running with a GELF UDP input matching producer endpoint (default `udp://host.docker.internal:12201` from inside mock container).
- Visual Studio or `dotnet run --project LogAnalyzer.AppHost/LogAnalyzer.AppHost.csproj`: API starts; console shows Aspire Dashboard URL.
- mock-log-producer emits structured lines → Graylog indexes them.
- `PeriodicLogAnalysisBackgroundService` (after first interval) pulls Graylog → per operational group OpenAI → `LogAnalyses` / incidents (ensure `Graylog:*` + `OpenAI:*` configured).
- Jira outbound: if `Jira:EnableIntegration` is true, dispatcher processes queue → mock issue key (`UseMockClient: true`) or REST create (`UseMockClient: false` + valid auth). Watch logs, `/health`, OTEL meter `LogAnalyzer.Outbound`.

Shortcut without Graylog delay: POST /api/log/analyze with a JSON body (still exercises OpenAI path).
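A hedged sketch of calling the shortcut with curl follows. The request body schema is not described here, so the function takes a JSON file supplied by the caller rather than inventing one. The base URL is also an assumption; adjust it to the launch profile in use. The helper names `analyze_once` and `describe_status` are hypothetical.

```shell
#!/bin/sh
# analyze_once BASE_URL BODY_FILE
# POSTs the JSON file to /api/log/analyze and prints only the HTTP status code.
analyze_once() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
    --request POST "$1/api/log/analyze" \
    --header 'Content-Type: application/json' \
    --data-binary "@$2"
}

# describe_status CODE: turn a status code into a short hint.
describe_status() {
  case "$1" in
    2??) echo "request succeeded" ;;
    429) echo "queue full, retry later" ;;     # bounded analysis queue is full
    000) echo "API not reachable" ;;           # curl got no HTTP response
    *)   echo "unexpected status $1" ;;
  esac
}

# Example (assumed URL and file name):
#   describe_status "$(analyze_once http://localhost:5294 body.json)"
```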
- Graylog (external) + GELF input.
- `docker compose up -d`
- AppHost or `dotnet run` API: EF migrations apply on startup.
Ports (typical): Postgres 5432, pgAdmin 5050, API 7225/5294 (VS profiles), Swagger under Development.
- OpenAI: `OpenAI:Model`, caps in `appsettings.json`; `OpenAI:ApiKey` via user-secrets (never commit real keys in JSON). Optional `OpenAI:Organization` / `OpenAI:Project` → `OpenAI-Organization` / `OpenAI-Project` request headers when set.
- Jira: `Jira` section in appsettings. `EnableIntegration: false` → queue/dispatcher idle. `UseMockClient: true` → no HTTP; deterministic fake keys. `UseMockClient: false` + `EnableIntegration: true` → REST; startup validates BaseUrl, ProjectKey, Basic or Bearer auth fields.
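The interaction of the two Jira switches can be summarized as a small decision function. This is only a reading of the rules listed above, not code from the repository; in particular it assumes that a disabled integration wins regardless of the mock setting. The name `jira_mode` is hypothetical.

```shell
#!/bin/sh
# jira_mode ENABLE_INTEGRATION USE_MOCK_CLIENT
# Both arguments are the literal strings "true" or "false".
jira_mode() {
  if [ "$1" != "true" ]; then
    echo "idle"   # queue/dispatcher do nothing
  elif [ "$2" = "true" ]; then
    echo "mock"   # no HTTP; deterministic fake issue keys
  else
    echo "rest"   # real Jira REST calls; startup validates BaseUrl, ProjectKey, auth
  fi
}
```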
When AppHost starts, the dashboard URL is printed to the console (Aspire assigns the port). Use it for resource status, structured logs, and traces emitted via AddServiceDefaults.

    dotnet build LogAnalyzer.slnx -c Release
    dotnet test LogAnalyzer.Infrastructure.Tests/LogAnalyzer.Infrastructure.Tests.csproj -c Release --no-build

HTTP: `GET /alive`, `GET /health`, `POST /api/log/analyze`, `GET /log-analysis`.
- OpenTelemetry metrics/tracing via ServiceDefaults; `LogAnalyzer.Outbound` for queue/dispatch/Jira HTTP.
- `GET /health` includes outbound readiness (`ready` tag) when Jira integration is registered.

| Symptom | Check |
|---|---|
| Startup validation fails | OpenAI:ApiKey, Graylog:BaseUrl, Graylog:ApiToken (user-secrets or temporary edits to appsettings.Development.json — avoid committing secrets). |
| Periodic never hits OpenAI | Graylog empty/query window; unchanged aggregate log hash; all lines cached; interval not elapsed; MaxOpenAiCallsPerCycle exhausted (remaining uncached lines wait next cycle). |
| No logs in Graylog from producer | GELF input port/host vs GRAYLOG_GELF_ADDRESS; firewall. |
| 429 on analyze | Analysis queue full (1000). |
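When the analysis queue is full, a client can back off and try again. The sketch below is a generic retry wrapper, not part of this repository: it assumes the wrapped command prints an HTTP status code (for example a curl call using `--write-out '%{http_code}'`). The function name `retry_on_429` and the `RETRY_BASE_DELAY` variable are hypothetical.

```shell
#!/bin/sh
# retry_on_429 MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND, which must print an HTTP status code. While the code is 429
# and attempts remain, sleeps with a doubling delay and tries again.
# Prints the last status code seen.
retry_on_429() {
  max=$1
  shift
  attempt=1
  delay=${RETRY_BASE_DELAY:-1}   # seconds before the first retry
  while :; do
    code=$("$@")
    if [ "$code" != "429" ] || [ "$attempt" -ge "$max" ]; then
      echo "$code"
      return 0
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

A persistent 429 after several attempts suggests the worker is not draining the queue, which is worth checking in the API logs before retrying further.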
Do not commit secrets. Prefer user-secrets. If an old .env ever contained API keys, rotate them — .env must stay Compose-only (see .env.example).
Placeholder — align with your organization.
- Periodic analysis (breaking behavior vs. prior batch): Uncached lines are clustered by evidence-only operational fingerprint (`OperationalIncidentFingerprintHeuristics.ComputeGroupIdFromEvidenceOnly` → `ILogGroupingService`). Each cluster gets its own `AnalyzeAsync` (similar lines stay in one prompt). Each line still gets a `LogAnalyses` row keyed by stable line hash (cache preserved). Each cluster drives one incident upsert → one Jira issue per operational group (not one mega-batch issue). OpenAI volume per cycle is limited by `PeriodicAnalysis:MaxOpenAiCallsPerCycle` (default 16); overflow lines are deferred with a warning.
- Removed from periodic: `EnableMultiCandidate`, single mega-batch `AnalyzeAsync`, and `AnalyzeBatchCandidatesAsync` / `BatchIncidentCandidates` cache usage (the table and AI method remain in the solution for other scenarios).
- Fingerprint: Canonical incident grouping is evidence-derived (service/component/issue class, TLS CN, SQL state hints); AI candidate titles are no longer part of `groupId` for periodic grouping.
- Incidents & Jira: `Incidents` gained `OperationalTitle` and `EvidenceLogExcerpt` (migration `AddIncidentOperationalPresentationFields`); periodic upserts pass `IncidentUpsertPresentation`. `JiraIssueDescriptionFormatter` adds optional "Operational title" / "Evidence" sections, uses `OperationalTitle` for the Jira issue summary when present, and labels model text "AI technical summary".
- OpenAI HTTP: Optional `OpenAI:Organization` and `OpenAI:Project` configuration maps to OpenAI platform headers.
- `.env` / `.env.example`: strictly Compose (Postgres, pgAdmin, `GRAYLOG_GELF_ADDRESS`). Removed ASP.NET-style variables from `.env`.
- `appsettings.Development.json`: local/demo overrides (OpenAI, Graylog REST, periodic tuning); production forks should use user-secrets and keep this file non-secret or gitignored.
- `configuration.template.json` kept in lockstep with `appsettings.json` (application surface only).
- Docker Compose: Postgres + pgAdmin + mock-log-producer only; removed API (`log-analyzer`) service from compose.
- README: Compose infra + AppHost API model.
- Runbook in README; `UserSecretsId` on `LogAnalyzer.Api`; periodic analysis waits first timer tick before first Graylog cycle.
- Outbound metrics, Jira health/readiness, dispatcher hardening; mock producer `eval_*` harness labels (not classification ground truth).
- EF schema, `GraylogLogProvider`, `OpenAiLogAnalyzer`, `PeriodicLogAnalysisBackgroundService`, `GET /log-analysis`.
- Solution split, bounded queue + `429`, AI retry/fallback, grouping.