feat: add Hermes Agent session support #282

Open
srujun wants to merge 1 commit into wesm:main from srujun:feat/hermes-agent

@srujun srujun commented Apr 5, 2026

Summary

Add Hermes Agent as a supported agent with full session parsing and discovery.

Hermes Agent (github.com/hermes-ai/hermes-agent) is a multi-platform AI coding agent that records sessions across CLI, Discord, webhooks, and cron jobs. Each platform gets its own project in the UI (hermes-cli, hermes-discord, hermes-webhook, hermes-cron).

Two session formats

Gateway (.jsonl) — Line-delimited JSON used by messaging platform sessions (Discord, webhooks, cron):

  • First line is a session_meta header with model, platform, and tool definitions
  • Subsequent lines are user/assistant/tool messages with per-message timestamps
  • Tool calls embedded in assistant messages as tool_calls array

CLI (.json) — Single JSON object used by interactive CLI sessions:

  • Envelope metadata: session_id, model, platform, session_start, last_updated
  • Messages array with same role/content/tool_calls structure but no per-message timestamps
  • Filename pattern: session_<id>.json (vs <id>.jsonl for gateway)

Deduplication

Some sessions exist in both formats (gateway saves both). The discovery function collects .jsonl files first, then only adds .json files whose session ID isn't already covered — .jsonl takes priority since it has richer per-message timestamps.
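
A rough sketch of the two-pass deduplication described above, assuming the session ID can be recovered from the filename patterns alone (`<id>.jsonl` and `session_<id>.json`). The function names are hypothetical and do not reflect the actual discovery code in hermes.go.

```go
package main

import (
	"path/filepath"
	"strings"
)

// sessionIDFromPath derives the session ID from the two filename patterns:
// <id>.jsonl (gateway) and session_<id>.json (CLI).
func sessionIDFromPath(path string) (string, bool) {
	base := filepath.Base(path)
	switch {
	case strings.HasSuffix(base, ".jsonl"):
		return strings.TrimSuffix(base, ".jsonl"), true
	case strings.HasPrefix(base, "session_") && strings.HasSuffix(base, ".json"):
		id := strings.TrimSuffix(strings.TrimPrefix(base, "session_"), ".json")
		return id, true
	}
	return "", false
}

// dedupeSessionFiles keeps every .jsonl file, then adds only those .json
// files whose session ID has not been seen yet.
func dedupeSessionFiles(paths []string) []string {
	seen := make(map[string]bool)
	var out []string
	// Pass 1: .jsonl wins because it carries per-message timestamps.
	for _, p := range paths {
		if !strings.HasSuffix(p, ".jsonl") {
			continue
		}
		if id, ok := sessionIDFromPath(p); ok && !seen[id] {
			seen[id] = true
			out = append(out, p)
		}
	}
	// Pass 2: .json only fills gaps left by pass 1.
	for _, p := range paths {
		if !strings.HasSuffix(p, ".json") {
			continue
		}
		if id, ok := sessionIDFromPath(p); ok && !seen[id] {
			seen[id] = true
			out = append(out, p)
		}
	}
	return out
}
```

With this ordering the result does not depend on the order in which the directory walk returns files: a `.json` file is dropped whenever a matching `.jsonl` exists anywhere in the input.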

Naive timestamps

Hermes records wall-clock time without UTC offsets. Timestamps are parsed with time.ParseInLocation(... time.Local) so they're interpreted in the server's timezone rather than defaulting to UTC.

Files changed

  • internal/parser/hermes.go: parser, discovery, helpers (new, ~550 lines)
  • internal/parser/types.go: AgentHermes const + Registry entry
  • internal/parser/taxonomy.go: Hermes tool name → category mappings
  • internal/sync/engine.go: processHermes dispatch + method

roborev-ci bot commented Apr 5, 2026

roborev: Combined Review (b1de379)

Verdict: Changes are close, but 1 high and 2 medium issues should be addressed before merge.

High

  • internal/parser/hermes.go:534-543
    stripHermesSkillPrefix uses strings.LastIndex(s, "\n\n") to recover the user message after the injected skill block. This will truncate any multi-paragraph prompt, blank-line-separated list, or example block down to only the final paragraph, corrupting parsed Content and FirstMessage for affected Hermes sessions.
    Fix: Strip the skill envelope structurally instead of taking the final paragraph-like chunk; preserve the full remainder of the user message.

Medium

  • internal/parser/hermes.go:120-133
    In the JSONL parser, HasToolUse is derived only from finish_reason == "tool_calls", even though tool calls are parsed from the tool_calls array itself. If Hermes emits tool calls with a missing or different finish_reason, the message will contain ToolCalls but still be marked as not using tools, creating inconsistent downstream state.
    Fix: Set HasToolUse whenever a non-empty tool_calls array is present, or derive it from len(toolCalls) > 0 after parsing.

  • internal/parser/hermes.go:294, internal/parser/hermes.go:344, internal/parser/hermes.go:362
    parseHermesJSONSession does not populate per-message Timestamp fields, unlike parseHermesJSONLSession. If Hermes JSON sessions include message-level timestamps, they are currently dropped.
    Fix: Extract the timestamp from each JSON message and assign it to the corresponding ParsedMessage.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@srujun srujun force-pushed the feat/hermes-agent branch from b1de379 to 52b057a Compare April 5, 2026 15:08
Add support for indexing Hermes Agent (github.com/hermes-ai/hermes-agent)
session logs. Hermes is a multi-platform AI agent that records sessions
across CLI, Discord, webhooks, and cron — each getting its own project
in the UI (hermes-cli, hermes-discord, hermes-webhook, hermes-cron).

Hermes uses two session formats:
- Gateway (.jsonl): line-delimited JSON with a session_meta header,
  per-message timestamps, and tool_calls embedded in assistant messages.
- CLI (.json): single JSON object with envelope metadata (session_id,
  model, platform, session_start, last_updated) and a messages array.

The parser discovers both formats, with .jsonl taking priority when a
session exists in both (gateway sessions are saved in both formats).
Naive timestamps (no timezone offset) are parsed with time.Local since
Hermes records wall-clock time without UTC indicators.

Files changed:
- internal/parser/hermes.go    — parser, discovery, helpers (new)
- internal/parser/types.go     — AgentHermes const + Registry entry
- internal/parser/taxonomy.go  — Hermes tool name categorization
- internal/sync/engine.go      — processHermes dispatch
@srujun srujun force-pushed the feat/hermes-agent branch from 52b057a to f8fd9e1 Compare April 5, 2026 15:15

srujun commented Apr 5, 2026

Review feedback addressed

  • stripHermesSkillPrefix now parses the skill injection envelope structurally
    (keying on the explicit instruction marker) instead of using a LastIndex("\n\n")
    heuristic that could truncate multi-paragraph user messages.
  • HasToolUse is derived from len(toolCalls) > 0 rather than finish_reason,
    so it stays consistent even if Hermes omits or changes the finish reason field.
  • JSON parser now extracts per-message timestamp fields when present, matching
    the JSONL parser behavior.
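
As a sketch of the HasToolUse change, the flag can be derived from the parsed array rather than from finish_reason. The struct and function names here are hypothetical; only the `finish_reason` and `tool_calls` keys come from the review above.

```go
package main

import "encoding/json"

// hermesMessage holds only the fields relevant to tool-use detection.
// Tool calls stay as raw JSON because their schema is not shown here.
type hermesMessage struct {
	Role         string            `json:"role"`
	FinishReason string            `json:"finish_reason"`
	ToolCalls    []json.RawMessage `json:"tool_calls"`
}

// hasToolUse reports whether the message carries at least one tool call,
// regardless of what finish_reason says (or whether it is present).
func hasToolUse(raw []byte) (bool, error) {
	var m hermesMessage
	if err := json.Unmarshal(raw, &m); err != nil {
		return false, err
	}
	return len(m.ToolCalls) > 0, nil
}
```

A message with `finish_reason` set to "stop" but a non-empty `tool_calls` array is reported as using tools, and a message with `finish_reason` set to "tool_calls" but no array is not.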

@srujun srujun marked this pull request as ready for review April 5, 2026 15:27

roborev-ci bot commented Apr 5, 2026

roborev: Combined Review (f8fd9e1)

Verdict: Changes are not ready to merge due to 1 high-severity data-loss issue and 1 medium-severity timestamp integrity issue.

High

  • internal/parser/hermes.go:159, internal/parser/hermes.go:309
    role:"tool" handling drops the actual tool output and preserves only ContentLength. Hermes sessions that include tool usage will therefore import with empty tool-result bodies, breaking transcript fidelity and any UI/search behavior that depends on tool output text.
    Fix: Preserve the tool result payload when constructing ParsedMessage / ParsedToolResult, consistent with the other parsers.

Medium

  • internal/parser/hermes.go:228
    parseHermesJSONSession seeds StartedAt / EndedAt from top-level session_start and last_updated, but does not reconcile those values with per-message timestamps even though message timestamps are parsed. If the top-level fields are missing, malformed, or stale, imported sessions can end up with incorrect zero or outdated bounds despite valid message timestamps being available.
    Fix: Track min/max msgTS while iterating messages and use them as a fallback, or to correct stale top-level timestamps.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

Note: gemini review skipped (agent quota exhausted)