55 changes: 54 additions & 1 deletion README.md
@@ -60,6 +60,8 @@ See `docs/TESTING.md` for install and CLI smoke tests.
3. ACP calls an OpenAI-compatible LLM API to rank candidates.
4. Core validates and filters LLM output.

The flow now returns explicit LLM diagnostics and fallback metadata. When ACP falls back to deterministic core behavior, the response includes `llm_used`, `llm_status`, `fallback_reason`, `fallback_mode`, and a small `diagnostics` object instead of silently degrading.
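
A client might branch on this metadata as follows. This is a hypothetical consumer sketch: the field names match the documented response, but the `response` dict and the handling logic are illustrative, not part of ACP.

```python
# Hypothetical client-side handling of the documented fallback metadata.
# The `response` dict below is an illustrative example, not real ACP output.
response = {
    "llm_used": False,
    "llm_status": "timeout",
    "fallback_reason": "llm_timeout",
    "fallback_mode": "stub",
    "diagnostics": {"llm_duration_seconds": 300.0},
}

if response.get("llm_used"):
    ranking_source = "llm"
else:
    # Deterministic core output; surface why the LLM path was skipped.
    ranking_source = f"core-fallback ({response.get('fallback_reason', 'unknown')})"

print(ranking_source)  # core-fallback (llm_timeout)
```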

For details on the design, see `docs/PHENOTYPE_RECOMMENDATION_DESIGN.md`.

### `phenotype_improvements` flow (ACP + MCP + LLM)
@@ -198,12 +200,18 @@ Recommended contents of `.env`:
EMBED_API_KEY=<your api key>
EMBED_MODEL=<an embedding model>
EMBED_URL=http://172.17.0.1:3000/ollama/api/embed # or equivalent
EMBED_TIMEOUT=120
LLM_API_KEY=<your api key>
LLM_API_URL=http://172.17.0.1:3000/api/chat/completions # or equivalent
LLM_MODEL=<a chat completion model>
LLM_LOG=1
LLM_USE_RESPONSES=0
LLM_TIMEOUT=300
STUDY_AGENT_MCP_TIMEOUT=240
ACP_TIMEOUT=360
LLM_CANDIDATE_LIMIT=5
LLM_RECOMMENDATION_TOP_K=20
LLM_RECOMMENDATION_MAX_RESULTS=3
STUDY_AGENT_ALLOW_CORE_FALLBACK=0
STUDY_AGENT_DEBUG=1
```
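
A minimal sketch of reading these values with the documented defaults; the `env_int` helper is hypothetical, not part of the codebase.

```python
import os

# Illustrative loader mirroring the documented defaults; the key names match
# the `.env` entries above, but this helper itself is hypothetical.
def env_int(name: str, default: int) -> int:
    raw = os.environ.get(name)
    return int(raw) if raw else default

acp_timeout = env_int("ACP_TIMEOUT", 360)
llm_timeout = env_int("LLM_TIMEOUT", 300)
mcp_timeout = env_int("STUDY_AGENT_MCP_TIMEOUT", 240)

print(acp_timeout > llm_timeout > mcp_timeout)  # True with the defaults above
```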
@@ -243,6 +251,51 @@ Notes:
- ACP is exposed on port 8765 and MCP on port 8790.
- The phenotype index is mounted from `./data/phenotype_index` into MCP at `/data/phenotype_index`.

### Constrained deployment guidance

Recommended timeout ladder:

- `ACP_TIMEOUT=360`
- `LLM_TIMEOUT=300`
- `STUDY_AGENT_MCP_TIMEOUT=240`
- `EMBED_TIMEOUT=120` to `180`

Recommendation tuning knobs:

- `LLM_USE_RESPONSES=0` is the default and the recommended setting when `LLM_API_URL` points at `/api/chat/completions`.
- `LLM_CANDIDATE_LIMIT=5` trims recommendation prompts before the LLM call.
- `LLM_RECOMMENDATION_TOP_K=20` controls retrieval breadth before truncation.
- `LLM_RECOMMENDATION_MAX_RESULTS=3` keeps early recommendation outputs small and easier to parse.
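
The interaction between the two knobs can be sketched as follows, under the assumption that retrieval returns scored candidates: retrieve up to `LLM_RECOMMENDATION_TOP_K`, then trim the prompt to `LLM_CANDIDATE_LIMIT` before the LLM call. The candidate data here is synthetic.

```python
# Synthetic candidates with descending scores, standing in for retrieval output.
candidates = [{"id": i, "score": 1.0 - i * 0.01} for i in range(50)]

top_k = 20           # LLM_RECOMMENDATION_TOP_K: retrieval breadth
candidate_limit = 5  # LLM_CANDIDATE_LIMIT: prompt size before the LLM call

retrieved = sorted(candidates, key=lambda c: c["score"], reverse=True)[:top_k]
prompt_candidates = retrieved[:candidate_limit]

print(len(retrieved), len(prompt_candidates))  # 20 5
```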

Recommendation responses now include explicit fallback metadata and diagnostics. Example parse-failure fallback:

```json
{
"status": "ok",
"llm_used": false,
"llm_status": "json_parse_failed",
"fallback_reason": "llm_json_parse_failed",
"fallback_mode": "stub",
"candidate_limit": 5,
"candidate_count": 5,
"diagnostics": {
"llm_status": "json_parse_failed",
"llm_duration_seconds": 12.5,
"llm_error": "json_parse_failed",
"llm_parse_stage": "chat_completions_content:json_loads",
"llm_schema_valid": false
}
}
```
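
One way a caller might consume this payload is to treat a parse failure as a degraded-but-usable response rather than an error. This is an assumed consumption pattern, not prescribed by ACP; the payload mirrors the example above.

```python
import json

# Hypothetical consumer of the parse-failure fallback payload shown above.
payload = json.loads("""
{"status": "ok", "llm_used": false, "llm_status": "json_parse_failed",
 "fallback_reason": "llm_json_parse_failed", "fallback_mode": "stub",
 "diagnostics": {"llm_schema_valid": false, "llm_duration_seconds": 12.5}}
""")

# "ok" status with llm_used=false means deterministic core results were served.
degraded = payload["status"] == "ok" and not payload["llm_used"]
print(degraded, payload["diagnostics"]["llm_duration_seconds"])  # True 12.5
```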

You can also derive environment-specific timeout starting values with:

```bash
doit calibrate_timeouts
```

The calibration task samples the main structured phenotype flows, compares multiple candidate limits, and writes both a recommended timeout `.env` fragment and a JSON timing report.
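
This is not the calibration task's actual algorithm, but the shape of its output can be sketched as: take the slowest observed sample per knob, apply a safety factor, and emit a `.env` fragment. The timing numbers and the factor of 3 are illustrative assumptions.

```python
# Sketch only: observed per-flow durations in seconds (synthetic data).
timings = {
    "LLM_TIMEOUT": [40.0, 55.0, 48.0],
    "STUDY_AGENT_MCP_TIMEOUT": [12.0, 20.0, 15.0],
}
SAFETY = 3.0  # assumed headroom multiplier over the slowest sample

fragment = "\n".join(
    f"{name}={int(max(samples) * SAFETY)}" for name, samples in timings.items()
)
print(fragment)
```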

Detailed tests can be found in `docs/TESTING.md`, but this one is useful as a quick check that a tool using chat completions is reachable and functioning:
```
# phenotype_intent_split
```
17 changes: 15 additions & 2 deletions acp_agent/README.md
@@ -14,10 +14,23 @@ Set these environment variables to enable LLM calls from ACP:
- `LLM_API_URL` (default `http://localhost:3000/api/chat/completions`)
- `LLM_API_KEY` (required)
- `LLM_MODEL` (default `agentstudyassistant`)
- `LLM_TIMEOUT` (default `300`)
- `LLM_LOG` (default `0`)
- `LLM_DRY_RUN` (default `0`)
- `LLM_USE_RESPONSES` (default `0`, use OpenAI Responses API payload/parse instead of Chat Completions; unrelated to MCP tool use)
- `LLM_CANDIDATE_LIMIT` (default `5`)
- `LLM_RECOMMENDATION_TOP_K` (default `20`)
- `LLM_RECOMMENDATION_MAX_RESULTS` (default `3`)
- `STUDY_AGENT_MCP_TIMEOUT` (default `240`)
- `EMBED_TIMEOUT` (default `120`)

Recommended timeout ladder for constrained deployments:

- `ACP_TIMEOUT > LLM_TIMEOUT > STUDY_AGENT_MCP_TIMEOUT`
- Recommended starting point: `ACP_TIMEOUT=360`, `LLM_TIMEOUT=300`, `STUDY_AGENT_MCP_TIMEOUT=240`
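
The ordering rule can be checked mechanically at startup; a minimal sketch, where the `config` dict is illustrative rather than ACP's real settings object:

```python
# Minimal ladder check mirroring the ordering rule above.
# The config dict is illustrative, not ACP's actual settings object.
config = {"ACP_TIMEOUT": 360, "LLM_TIMEOUT": 300, "STUDY_AGENT_MCP_TIMEOUT": 240}

ladder_ok = (
    config["ACP_TIMEOUT"] > config["LLM_TIMEOUT"] > config["STUDY_AGENT_MCP_TIMEOUT"]
)
print("timeout ladder ok" if ladder_ok else "timeout ladder violated")
```

Keeping each outer timeout strictly larger than its inner one ensures the caller, not an ambiguous transport error, reports which layer timed out.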

ACP recommendation flows now expose explicit diagnostics and fallback metadata including `llm_status`, `llm_duration_seconds`, `llm_parse_stage`, `llm_schema_valid`, `fallback_reason`, and `fallback_mode`.

To estimate environment-specific starting values instead of relying only on defaults, run `doit calibrate_timeouts`. It writes a recommended timeout env fragment and a JSON timing summary based on repeated ACP smoke-flow samples.

See `docs/TESTING.md` for CLI smoke tests.