Skip to content

Add robust research persona system#257

Open
SergioHC95 wants to merge 7 commits into
mainfrom
research-persona-system
Open

Add robust research persona system#257
SergioHC95 wants to merge 7 commits into
mainfrom
research-persona-system

Conversation

@SergioHC95

@SergioHC95 SergioHC95 commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

Research Persona: a memory layer for how a scientist actually thinks

This PR turns build-persona from a profile sketch into a serious, inspectable research-persona system. The goal is not just to remember a user’s name or topic area, but to let GPD build a structured model of a researcher’s intellectual terrain: papers, collaborators, tools, mathematical style, experimental instincts, theoretical taste, explanation preferences, privacy boundaries, and recurring research moves.

The result is a persona layer that can be used safely by commands and agents without dumping raw user memory into prompts.

What this unlocks

  • Researcher Doppelganger: generates a bounded, evidence-aware preview of how the user would likely frame a problem, critique an idea, choose next experiments, or triage a research direction.
  • Expertise-Aware Explanations: adapts explanations to the user’s known background, emphasizing the right mix of math, code, experiments, fundamentals, theory, or applications.
  • Scientific Taste Model: scores candidate ideas against the user’s inferred research taste: novelty appetite, rigor expectations, elegance, empirical tolerance, preferred tools, and fit with existing work.
  • Source-grounded persona building: ingests explicit source documents into conservative persona patches instead of inventing profile facts.
  • Runtime-safe persona capsules: exposes compact, prompt-safe persona context to downstream workflows without raw-profile leakage.
  • Audit and maintenance: adds read-only diagnostics for staleness, conflicts, weak evidence, privacy issues, and suggested profile cleanups.

Implementation

  • Adds core modules for ingestion, application previews, runtime capsule generation, and audit reports.
  • Adds CLI surfaces:
    • research-persona ingest-source
    • research-persona doppelganger
    • research-persona explain-plan
    • research-persona taste-check
    • research-persona audit
  • Adds dedicated agent specs for the three ambitious application modes:
    • gpd-researcher-doppelganger
    • gpd-expertise-explainer
    • gpd-scientific-taste
  • Extends the build-persona workflow with a source-ingestion stage and richer approval/apply semantics.
  • Updates help/docs so the persona system is visible from the public command surface.
  • Adds focused tests across core models, CLI behavior, workflow integration, privacy contracts, runtime behavior, and end-to-end persona acceptance.

Design principles

  • Evidence before memory: extracted profile facts are candidates until reviewed.
  • No invisible surveillance: source ingestion is explicit and privacy defaults are conservative.
  • Useful without being creepy: applications expose bounded previews and structured reasoning, not unbounded imitation.
  • Prompt safety by construction: downstream commands receive compact capsules, not raw persona dumps.
  • Scientist-first ergonomics: the model captures how a researcher thinks, not just what fields appear in a profile form.

Verification

  • uv run ruff check .: passed.
  • Generated artifact checks passed:
    • uv run python scripts/sync_repo_graph_contract.py --check
    • uv run python scripts/render_public_surface.py --check
    • uv run python scripts/render_help_surface.py --check
    • uv run python scripts/render_bootstrap_installer_metadata.py --check
  • git diff --check: passed.
  • Full suite: 13132 passed, 14 skipped, 20 warnings in 118.69s.
  • GitHub Actions: all required checks passing on PR run 26983776033.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added local research persona profile management via new gpd research-persona CLI commands: show, validate, audit, diff, apply-patch, forget, and export-capsule.
    • Added gpd init build-persona command to build a candidate research persona profile through a staged workflow with explicit consent and approval.
    • Added advisory helper commands: doppelganger (objection preview), explain-plan (explanation guidance), and taste-check (direction evaluation) using privacy-preserving profile capsules.
  • Documentation

    • Added comprehensive command, workflow, and agent specifications for Research Persona feature.
  • Tests

    • Added extensive test coverage for Research Persona core, CLI, storage, ingestion, privacy contracts, and workflows.

@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds a private Research Persona subsystem: core models/storage/patching/projection, CLI commands and payload helpers, ingestion/audit/applications/runtime, staged build-persona workflow and docs, plus extensive tests and diagnostics/budget updates.

Changes

Research Persona local profile and applications

Layer / File(s) Summary
Core persona foundation and staged workflow contracts
src/gpd/core/*, src/gpd/agents/*, src/gpd/specs/workflows/*, src/gpd/specs/references/*, src/gpd/registry.py, README.md
Implements strict Research Persona models, storage and patching, projections/capsules, advisory application payloads, runtime capsule bundles, auditing, CLI payload core, and initial staged build-persona workflow specs and references.
Behavioral tests and phase acceptance reports
tests/core/*, tmp/phase-0*
Adds comprehensive tests for models, storage, CLI show/validate/mutate, ingestion, auditing, applications, projections, runtime, workflow staging, privacy/acceptance, and prompt/diagnostic budgets; includes Phase 1–3 reports.
Agent, CLI, command, and reference surfaces
src/gpd/cli.py, src/gpd/commands/*, src/gpd/specs/*, src/gpd/core/*
Adds Typer subcommands for research-persona, build-persona init, command/help surfaces, model-profile tiers, return-field registry, workflow staging hooks, and documentation entries for the new command and applications.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant CLI as gpd research-persona
  participant Core as research_persona_cli
  participant Store as persona storage
  User->>CLI: audit | show | diff/apply | export-capsule
  CLI->>Core: build_*_payload(...)
  Core->>Store: load/validate/apply (if not dry-run)
  Store-->>Core: persona/capsule/diff results
  Core-->>CLI: JSON payload
  CLI-->>User: Rendered result
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120+ minutes

Possibly related PRs

Suggested reviewers

  • madeleinesong
  • cmaloney111

Poem

A rabbit with notebooks and tea by its side,
Patches a persona with cautious pride.
Diff, then apply—only when you agree—
Capsules kept prompt-safe, private as can be.
Doppel hops, taste sniffs, plans explain—
Local burrow logs every grain.
Thump! Research done, with privacy retained.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch research-persona-system

@marcpickett1

Copy link
Copy Markdown
Collaborator

🤖 RoastBot: The AI needed a persona system to do research. Buddy, if the AI doesn't know who it is without a personality quiz, maybe it shouldn't be writing your papers either.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (2)
tests/core/test_research_persona_projection.py (1)

87-104: ⚡ Quick win

Add a regression test for axis fact_ids privacy filtering.

Current tests validate fact-value redaction, but they do not assert that prompt/public outputs avoid exposing private fact identifiers through axes[].fact_ids. Adding this case will lock in the prompt-safety contract.

✅ Suggested test addition
+def test_prompt_projection_axis_fact_ids_do_not_reference_filtered_private_facts() -> None:
+    persona = _persona_with_mixed_privacy_facts()
+    persona.axes = [
+        {
+            "id": "axis-methods",
+            "name": "methods",
+            "value": 0.4,
+            "confidence": "confirmed",
+            "privacy": "safe_to_share",
+            "fact_ids": [SAFE_FACT_ID, "rp-private-local-canary"],
+            "evidence_refs": [],
+        }
+    ]  # type: ignore[assignment]
+
+    projection = project_research_persona(persona, purpose="prompt")
+    rendered = _render(projection)
+    assert SAFE_FACT_ID in rendered
+    assert "rp-private-local-canary" not in rendered
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/core/test_research_persona_projection.py` around lines 87 - 104, Extend
the existing
test_project_projection_includes_project_private_only_for_project_purpose to
also assert that private fact identifiers are removed from axes[].fact_ids in
prompt/public projections: call project_research_persona(persona,
purpose="prompt") and inspect the returned projection (or its rendered form via
_render) to ensure PROJECT_PRIVATE_FACT_ID does not appear in any
axes[].fact_ids for the prompt projection, while it does appear for the project
projection; use the same helper symbols (_render, PROJECT_PRIVATE_FACT_ID,
project_research_persona,
test_project_projection_includes_project_private_only_for_project_purpose) to
locate and add these assertions.
tests/core/test_research_persona_help_docs.py (1)

24-29: ⚡ Quick win

Decouple CLI command extraction from an unrelated sentinel function.

Parsing up to def _integrations_config_path makes this test fragile to unrelated src/gpd/cli.py refactors.

Proposed simplification
 def _research_persona_cli_commands() -> tuple[str, ...]:
     cli = _read(CLI_PATH)
-    start_match = re.search(r"research_persona_app\s*=\s*typer[.]Typer", cli)
-    assert start_match is not None
-    start = start_match.start()
-    end = cli.index("def _integrations_config_path", start)
-    block = cli[start:end]
-    return tuple(re.findall(r'`@research_persona_app`\.command\("([^"]+)"\)', block))
+    return tuple(re.findall(r'`@research_persona_app`\.command\("([^"]+)"\)', cli))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/core/test_research_persona_help_docs.py` around lines 24 - 29, The test
currently slices the CLI text up to the unrelated sentinel function def
_integrations_config_path which makes extraction fragile; instead, after finding
the research_persona_app initializer with start_match (searching for
research_persona_app\s*=\s*typer\.Typer), extract the commands by applying
re.findall(r'`@research_persona_app`\.command\("([^"]+)"\)', cli) on the relevant
scope (preferably the whole cli string or the substring starting at start) and
return that tuple—replace the block/end slicing logic (start/end/block) with a
direct re.findall call scoped from start or globally so the test no longer
depends on _integrations_config_path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@README.md`:
- Around line 537-546: The table cells contain literal pipe characters that
break Markdown table parsing; update each command example (e.g., the `gpd
research-persona validate`, `diff`, `apply-patch`, `audit`, `forget`,
`ingest-source`, `export-capsule`, `doppelganger`, `explain-plan`, and
`taste-check` lines) to escape the pipe characters by replacing "|" with "\|"
inside the code spans so the pipes are treated as literal characters and the
table renders correctly.

In `@src/gpd/core/research_persona.py`:
- Around line 339-343: The current early return for purpose in
{"prompt","public"} checks only fact.privacy and makes the subsequent
sensitive-category guard dead; change the logic in the code that filters facts
for prompt/public (the block referencing purpose, fact.privacy, fact.category,
and _SENSITIVE_PROMPT_CATEGORIES) to combine the checks so any fact for
prompt/public is rejected if either fact.privacy != "safe_to_share" OR
fact.category.casefold() is in _SENSITIVE_PROMPT_CATEGORIES; update the
conditional (and keep using casefold() for category comparison) to return None
when either condition is true.
- Around line 1204-1213: The axis projection is leaking private fact identifiers
because axes are projected via _project_axis without reconciling axis.fact_ids
against the filtered persona.facts; compute the set of allowed fact ids from the
projected facts (the comprehension using _project_fact) and when projecting axes
(the comprehension using _project_axis) remove or filter any axis.fact_ids not
present in that allowed set (or omit the field entirely for prompt/public
purpose). Update _project_axis or the axis comprehension to perform this
filtering so axes in prompt/public payloads cannot reference non-prompt-safe
facts.

In `@src/gpd/specs/workflows/build-persona.md`:
- Around line 211-213: The workflow doc uses patch operation names add_fact,
replace_fact, add_axis, replace_axis, add_list_item, remove_list_item (and
pointer ops) but the persona builder schema expects upsert_fact, append_list,
upsert_axis, tombstone_fact; update the workflow text to use the schema's
canonical operation names (replace add_fact/replace_fact →
upsert_fact/tombstone_fact as appropriate, add_list_item/remove_list_item →
append_list or the tombstone pattern, add_axis/replace_axis → upsert_axis) so
generated patches match the research-persona patch contract; search for the
operation names in this file and replace them with the corresponding schema
names (upsert_fact, append_list, upsert_axis, tombstone_fact) and adjust any
examples or guidance to match the builder's expected semantics.

In `@src/gpd/specs/workflows/build-persona/persona-intake.md`:
- Around line 41-44: Replace the ambiguous "gpd research-persona diff"
invocation with the explicit diff form that includes the candidate patch
artifact path so the approval boundary always diffs the exact patch; locate the
"gpd research-persona diff" command in persona-intake.md and change it to call
the diff with the same patch argument used by "gpd research-persona apply-patch"
(e.g., the candidate patch variable/path) to ensure consistent, unambiguous
behavior.

In `@tests/core/test_research_persona_builder_privacy.py`:
- Around line 61-67: In tests/core/test_research_persona_builder_privacy.py the
comparison uses line.casefold() against the storage_needles tuple which contains
mixed-case tokens (e.g. "$GPD_DATA_DIR"), causing case-fold mismatches; fix by
normalizing the needles as well (e.g., build a lowercased/casefolded version of
storage_needles before the membership check) or compare using both sides
casefolded so the check uses line.casefold() in combination with
needle.casefold(); apply the same change to the second occurrence referenced
around lines 81-83 where the other tuple is checked.

In `@tests/core/test_research_persona_phase04_05_acceptance.py`:
- Around line 234-237: The test picks module_paths[0] which is ambiguous; update
test_phase4_source_ingestion_module_exposes_patch_only_api to assert a single
canonical module by checking that len(module_paths) == 1 (or using an explicit
equality assertion) before assigning module_path = module_paths[0]; use the
existing symbols _require_existing and PHASE4_MODULE_CANDIDATES to obtain
module_paths, assert exactly one candidate, then set module_path from that
single element. Apply the same change to the other analogous test block that
uses module_paths and module_path (the one around the second occurrence).
- Around line 145-148: The current public-API collector uses ast.walk(tree)
which traverses nested definitions; change it to iterate only module-level
statements (e.g., for node in tree.body) so only top-level ast.FunctionDef,
ast.AsyncFunctionDef and ast.ClassDef names are appended to names; keep the
existing underscore-name filter and return tuple(sorted(names)) as before.

In `@tests/core/test_research_persona_system_acceptance.py`:
- Around line 288-290: The test currently runs the audit-command assertion
whenever any optional module exists; change it to only run when audit-related
optional modules are present by replacing the generic
_existing_optional_modules() gate with a check for audit modules (e.g., call a
variant like _existing_optional_modules("audit") or use an explicit membership
test against your audit module list), then keep the existing has_audit_command
logic that scans command_names against AUDIT_COMMAND_HINTS and uses _message for
the assertion failure.

---

Nitpick comments:
In `@tests/core/test_research_persona_help_docs.py`:
- Around line 24-29: The test currently slices the CLI text up to the unrelated
sentinel function def _integrations_config_path which makes extraction fragile;
instead, after finding the research_persona_app initializer with start_match
(searching for research_persona_app\s*=\s*typer\.Typer), extract the commands by
applying re.findall(r'`@research_persona_app`\.command\("([^"]+)"\)', cli) on the
relevant scope (preferably the whole cli string or the substring starting at
start) and return that tuple—replace the block/end slicing logic
(start/end/block) with a direct re.findall call scoped from start or globally so
the test no longer depends on _integrations_config_path.

In `@tests/core/test_research_persona_projection.py`:
- Around line 87-104: Extend the existing
test_project_projection_includes_project_private_only_for_project_purpose to
also assert that private fact identifiers are removed from axes[].fact_ids in
prompt/public projections: call project_research_persona(persona,
purpose="prompt") and inspect the returned projection (or its rendered form via
_render) to ensure PROJECT_PRIVATE_FACT_ID does not appear in any
axes[].fact_ids for the prompt projection, while it does appear for the project
projection; use the same helper symbols (_render, PROJECT_PRIVATE_FACT_ID,
project_research_persona,
test_project_projection_includes_project_private_only_for_project_purpose) to
locate and add these assertions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: e4c5a861-bc16-4bed-b115-361641b3f4ab

📥 Commits

Reviewing files that changed from the base of the PR and between 0f41769 and 21efd59.

📒 Files selected for processing (49)
  • README.md
  • src/gpd/agents/gpd-expertise-explainer.md
  • src/gpd/agents/gpd-persona-builder.md
  • src/gpd/agents/gpd-researcher-doppelganger.md
  • src/gpd/agents/gpd-scientific-taste.md
  • src/gpd/cli.py
  • src/gpd/commands/build-persona.md
  • src/gpd/commands/help.md
  • src/gpd/core/research_persona.py
  • src/gpd/core/research_persona_applications.py
  • src/gpd/core/research_persona_audit.py
  • src/gpd/core/research_persona_cli.py
  • src/gpd/core/research_persona_ingestion.py
  • src/gpd/core/research_persona_runtime.py
  • src/gpd/registry.py
  • src/gpd/specs/references/research-persona-applications.md
  • src/gpd/specs/workflows/build-persona-stage-manifest.json
  • src/gpd/specs/workflows/build-persona.md
  • src/gpd/specs/workflows/build-persona/approval-and-apply.md
  • src/gpd/specs/workflows/build-persona/persona-intake.md
  • src/gpd/specs/workflows/build-persona/persona-synthesis.md
  • src/gpd/specs/workflows/build-persona/source-ingestion.md
  • tests/core/test_research_persona_application_surfaces.py
  • tests/core/test_research_persona_applications.py
  • tests/core/test_research_persona_audit.py
  • tests/core/test_research_persona_builder_acceptance.py
  • tests/core/test_research_persona_builder_help.py
  • tests/core/test_research_persona_builder_privacy.py
  • tests/core/test_research_persona_builder_registry.py
  • tests/core/test_research_persona_cli_audit.py
  • tests/core/test_research_persona_cli_ingestion_applications.py
  • tests/core/test_research_persona_cli_mutation.py
  • tests/core/test_research_persona_cli_privacy.py
  • tests/core/test_research_persona_cli_show_validate.py
  • tests/core/test_research_persona_cli_support.py
  • tests/core/test_research_persona_help_docs.py
  • tests/core/test_research_persona_ingestion_core.py
  • tests/core/test_research_persona_models.py
  • tests/core/test_research_persona_patch_history.py
  • tests/core/test_research_persona_phase04_05_acceptance.py
  • tests/core/test_research_persona_privacy_contract.py
  • tests/core/test_research_persona_projection.py
  • tests/core/test_research_persona_runtime.py
  • tests/core/test_research_persona_source_ingestion_workflow.py
  • tests/core/test_research_persona_storage.py
  • tests/core/test_research_persona_system_acceptance.py
  • tmp/phase-01-foundation-report.md
  • tmp/phase-02-cli-report.md
  • tmp/phase-03-persona-builder-report.md

Comment thread README.md
Comment on lines +537 to +546
| `gpd research-persona validate [PROFILE_JSON|-]` | Validate stored or supplied persona JSON |
| `gpd research-persona diff PATCH_JSON|-` | Preview a candidate patch without writing |
| `gpd research-persona apply-patch PATCH_JSON|-` | Apply an approved patch, with `--dry-run` for preview only |
| `gpd research-persona audit [PROFILE_JSON|-]` | Inspect persona quality, privacy, evidence, and capsule readiness without writing |
| `gpd research-persona forget FACT_ID` | Tombstone one fact, with `--dry-run` for preview only |
| `gpd research-persona ingest-source SOURCE_JSON|-` | Build a candidate patch from an explicit source document; optional `--output` writes only that candidate patch artifact |
| `gpd research-persona export-capsule --role doppelganger` | Export a prompt-safe role capsule; roles include planner, executor, verifier, paper_writer, literature, recovery, explainer, doppelganger, and taste |
| `gpd research-persona doppelganger` | Preview likely user objections, standards, and next moves through a prompt-safe Researcher Doppelganger capsule |
| `gpd research-persona explain-plan [PLAN_JSON|-]` | Preview Expertise-Aware Explanations guidance from an explainer capsule |
| `gpd research-persona taste-check [CANDIDATE_JSON|-]` | Preview Scientific Taste Model feedback from a taste capsule |

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Escape literal | in table command examples to prevent broken rendering

Lines 537–546 include unescaped pipe characters inside table cells, so Markdown parses extra columns and drops command text in some renderers.

Suggested fix
-| `gpd research-persona validate [PROFILE_JSON|-]` | Validate stored or supplied persona JSON |
-| `gpd research-persona diff PATCH_JSON|-` | Preview a candidate patch without writing |
-| `gpd research-persona apply-patch PATCH_JSON|-` | Apply an approved patch, with `--dry-run` for preview only |
-| `gpd research-persona audit [PROFILE_JSON|-]` | Inspect persona quality, privacy, evidence, and capsule readiness without writing |
+| `gpd research-persona validate [PROFILE_JSON\|-]` | Validate stored or supplied persona JSON |
+| `gpd research-persona diff PATCH_JSON\|-` | Preview a candidate patch without writing |
+| `gpd research-persona apply-patch PATCH_JSON\|-` | Apply an approved patch, with `--dry-run` for preview only |
+| `gpd research-persona audit [PROFILE_JSON\|-]` | Inspect persona quality, privacy, evidence, and capsule readiness without writing |
 | `gpd research-persona forget FACT_ID` | Tombstone one fact, with `--dry-run` for preview only |
-| `gpd research-persona ingest-source SOURCE_JSON|-` | Build a candidate patch from an explicit source document; optional `--output` writes only that candidate patch artifact |
+| `gpd research-persona ingest-source SOURCE_JSON\|-` | Build a candidate patch from an explicit source document; optional `--output` writes only that candidate patch artifact |
 | `gpd research-persona export-capsule --role doppelganger` | Export a prompt-safe role capsule; roles include planner, executor, verifier, paper_writer, literature, recovery, explainer, doppelganger, and taste |
 | `gpd research-persona doppelganger` | Preview likely user objections, standards, and next moves through a prompt-safe Researcher Doppelganger capsule |
-| `gpd research-persona explain-plan [PLAN_JSON|-]` | Preview Expertise-Aware Explanations guidance from an explainer capsule |
-| `gpd research-persona taste-check [CANDIDATE_JSON|-]` | Preview Scientific Taste Model feedback from a taste capsule |
+| `gpd research-persona explain-plan [PLAN_JSON\|-]` | Preview Expertise-Aware Explanations guidance from an explainer capsule |
+| `gpd research-persona taste-check [CANDIDATE_JSON\|-]` | Preview Scientific Taste Model feedback from a taste capsule |
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 537-537: Table column count
Expected: 2; Actual: 3; Too many cells, extra data will be missing

(MD056, table-column-count)


[warning] 538-538: Table column count
Expected: 2; Actual: 3; Too many cells, extra data will be missing

(MD056, table-column-count)


[warning] 539-539: Table column count
Expected: 2; Actual: 3; Too many cells, extra data will be missing

(MD056, table-column-count)


[warning] 540-540: Table column count
Expected: 2; Actual: 3; Too many cells, extra data will be missing

(MD056, table-column-count)


[warning] 542-542: Table column count
Expected: 2; Actual: 3; Too many cells, extra data will be missing

(MD056, table-column-count)


[warning] 545-545: Table column count
Expected: 2; Actual: 3; Too many cells, extra data will be missing

(MD056, table-column-count)


[warning] 546-546: Table column count
Expected: 2; Actual: 3; Too many cells, extra data will be missing

(MD056, table-column-count)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 537 - 546, The table cells contain literal pipe
characters that break Markdown table parsing; update each command example (e.g.,
the `gpd research-persona validate`, `diff`, `apply-patch`, `audit`, `forget`,
`ingest-source`, `export-capsule`, `doppelganger`, `explain-plan`, and
`taste-check` lines) to escape the pipe characters by replacing "|" with "\|"
inside the code spans so the pipes are treated as literal characters and the
table renders correctly.

Comment on lines +339 to +343
if purpose in {"prompt", "public"} and fact.privacy != "safe_to_share":
return None
if purpose in {"prompt", "public"} and fact.category.casefold() in _SENSITIVE_PROMPT_CATEGORIES:
if fact.privacy != "safe_to_share":
return None

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sensitive-category prompt filter is currently ineffective.

The guard on sensitive categories is dead in prompt/public mode because Line 340 already returns for any non-safe_to_share fact, so the nested check never triggers. This currently allows contact/identity/private_paper facts through whenever they are marked safe_to_share, which conflicts with the explicit _SENSITIVE_PROMPT_CATEGORIES safety intent.

🔧 Proposed fix
-    if purpose in {"prompt", "public"} and fact.privacy != "safe_to_share":
-        return None
-    if purpose in {"prompt", "public"} and fact.category.casefold() in _SENSITIVE_PROMPT_CATEGORIES:
-        if fact.privacy != "safe_to_share":
-            return None
+    if purpose in {"prompt", "public"} and fact.privacy != "safe_to_share":
+        return None
+    if purpose in {"prompt", "public"} and fact.category.casefold() in _SENSITIVE_PROMPT_CATEGORIES:
+        return None
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if purpose in {"prompt", "public"} and fact.privacy != "safe_to_share":
return None
if purpose in {"prompt", "public"} and fact.category.casefold() in _SENSITIVE_PROMPT_CATEGORIES:
if fact.privacy != "safe_to_share":
return None
if purpose in {"prompt", "public"} and fact.privacy != "safe_to_share":
return None
if purpose in {"prompt", "public"} and fact.category.casefold() in _SENSITIVE_PROMPT_CATEGORIES:
return None
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/core/research_persona.py` around lines 339 - 343, The current early
return for purpose in {"prompt","public"} checks only fact.privacy and makes the
subsequent sensitive-category guard dead; change the logic in the code that
filters facts for prompt/public (the block referencing purpose, fact.privacy,
fact.category, and _SENSITIVE_PROMPT_CATEGORIES) to combine the checks so any
fact for prompt/public is rejected if either fact.privacy != "safe_to_share" OR
fact.category.casefold() is in _SENSITIVE_PROMPT_CATEGORIES; update the
conditional (and keep using casefold() for category comparison) to return None
when either condition is true.

Comment on lines +1204 to +1213
"facts": [
projected
for fact in persona.facts
if (projected := _project_fact(fact, purpose=normalized_purpose)) is not None
],
"axes": [
projected
for axis in persona.axes
if (projected := _project_axis(axis, purpose=normalized_purpose)) is not None
],

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prompt/public projection can leak private fact identifiers through axes.fact_ids.

facts are filtered by privacy, but axes are projected independently and currently preserve fact_ids as-is. If a prompt-safe axis references a non-prompt-safe fact, the private fact identifier can still be exported in prompt/public payloads.

🔧 Proposed fix
-    projection: dict[str, object] = {
+    projected_facts = [
+        projected
+        for fact in persona.facts
+        if (projected := _project_fact(fact, purpose=normalized_purpose)) is not None
+    ]
+    allowed_fact_ids = {
+        fact["id"]
+        for fact in projected_facts
+        if isinstance(fact, dict) and isinstance(fact.get("id"), str)
+    }
+    projected_axes: list[dict[str, object]] = []
+    for axis in persona.axes:
+        projected = _project_axis(axis, purpose=normalized_purpose)
+        if projected is None:
+            continue
+        if normalized_purpose in {"prompt", "public"}:
+            fact_ids = projected.get("fact_ids")
+            if isinstance(fact_ids, list):
+                kept = [fid for fid in fact_ids if isinstance(fid, str) and fid in allowed_fact_ids]
+                if kept:
+                    projected["fact_ids"] = kept
+                else:
+                    projected.pop("fact_ids", None)
+
+        projected_axes.append(projected)
+
+    projection: dict[str, object] = {
         "schema_version": persona.schema_version,
         "purpose": normalized_purpose,
-        "facts": [
-            projected
-            for fact in persona.facts
-            if (projected := _project_fact(fact, purpose=normalized_purpose)) is not None
-        ],
-        "axes": [
-            projected
-            for axis in persona.axes
-            if (projected := _project_axis(axis, purpose=normalized_purpose)) is not None
-        ],
+        "facts": projected_facts,
+        "axes": projected_axes,
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/core/research_persona.py` around lines 1204 - 1213, The axis
projection is leaking private fact identifiers because axes are projected via
_project_axis without reconciling axis.fact_ids against the filtered
persona.facts; compute the set of allowed fact ids from the projected facts (the
comprehension using _project_fact) and when projecting axes (the comprehension
using _project_axis) remove or filter any axis.fact_ids not present in that
allowed set (or omit the field entirely for prompt/public purpose). Update
_project_axis or the axis comprehension to perform this filtering so axes in
prompt/public payloads cannot reference non-prompt-safe facts.

Comment on lines +211 to +213
- Use `add_fact`, `replace_fact`, `add_axis`, `replace_axis`, `add_list_item`,
`remove_list_item`, or pointer operations supported by the research-persona
patch schema.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Patch operation names conflict with the documented patch contract.

This workflow instructs add_fact/replace_fact/add_axis/..., but the builder schema in src/gpd/agents/gpd-persona-builder.md defines upsert_fact, append_list, upsert_axis, and tombstone_fact. This mismatch can produce invalid candidate patches during synthesis.

Suggested doc fix
-- Use `add_fact`, `replace_fact`, `add_axis`, `replace_axis`, `add_list_item`,
--   `remove_list_item`, or pointer operations supported by the research-persona
--   patch schema.
+- Use `upsert_fact` for fact updates, `append_list` for list projections,
+-   `upsert_axis` for numeric preference axes, and `tombstone_fact` only when
+-   forgetting/correcting a durable fact id, consistent with the patch schema.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- Use `add_fact`, `replace_fact`, `add_axis`, `replace_axis`, `add_list_item`,
`remove_list_item`, or pointer operations supported by the research-persona
patch schema.
- Use `upsert_fact` for fact updates, `append_list` for list projections,
`upsert_axis` for numeric preference axes, and `tombstone_fact` only when
forgetting/correcting a durable fact id, consistent with the patch schema.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/specs/workflows/build-persona.md` around lines 211 - 213, The
workflow doc uses patch operation names add_fact, replace_fact, add_axis,
replace_axis, add_list_item, remove_list_item (and pointer ops) but the persona
builder schema expects upsert_fact, append_list, upsert_axis, tombstone_fact;
update the workflow text to use the schema's canonical operation names (replace
add_fact/replace_fact → upsert_fact/tombstone_fact as appropriate,
add_list_item/remove_list_item → append_list or the tombstone pattern,
add_axis/replace_axis → upsert_axis) so generated patches match the
research-persona patch contract; search for the operation names in this file and
replace them with the corresponding schema names (upsert_fact, append_list,
upsert_axis, tombstone_fact) and adjust any examples or guidance to match the
builder's expected semantics.

Comment on lines +41 to +44
```text
gpd research-persona validate
gpd research-persona diff
gpd research-persona apply-patch

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the explicit diff command form in the approval boundary.

Line 43 uses gpd research-persona diff without a patch path, but the rest of this workflow family expects the candidate patch artifact to be diffed explicitly. Keep this command consistent to avoid ambiguous operator behavior.

Suggested doc fix
 gpd research-persona validate
-gpd research-persona diff
+gpd research-persona diff GPD/persona/candidate-patch.json
 gpd research-persona apply-patch
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
```text
gpd research-persona validate
gpd research-persona diff
gpd research-persona apply-patch
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/specs/workflows/build-persona/persona-intake.md` around lines 41 -
44, Replace the ambiguous "gpd research-persona diff" invocation with the
explicit diff form that includes the candidate patch artifact path so the
approval boundary always diffs the exact patch; locate the "gpd research-persona
diff" command in persona-intake.md and change it to call the diff with the same
patch argument used by "gpd research-persona apply-patch" (e.g., the candidate
patch variable/path) to ensure consistent, unambiguous behavior.

Comment on lines +61 to +67
storage_needles = (
"research-persona/profile.json",
"research_persona/profile.json",
"~/.gpd/research-persona/profile.json",
"$GPD_DATA_DIR/research-persona/profile.json",
"${GPD_DATA_DIR:-~/.gpd}/research-persona/profile.json",
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Case-fold mismatch weakens private-profile detection.

line.casefold() is compared against storage_needles that include uppercase env var tokens, so those needles can be missed and unsafe lines may escape this test.

Suggested fix
 def _non_negated_storage_lines(text: str, *, verbs: tuple[str, ...]) -> list[str]:
-    storage_needles = (
+    storage_needles = tuple(
+        needle.casefold()
+        for needle in (
         "research-persona/profile.json",
         "research_persona/profile.json",
         "~/.gpd/research-persona/profile.json",
         "$GPD_DATA_DIR/research-persona/profile.json",
         "${GPD_DATA_DIR:-~/.gpd}/research-persona/profile.json",
-    )
+        )
+    )

Also applies to: 81-83

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/core/test_research_persona_builder_privacy.py` around lines 61 - 67, In
tests/core/test_research_persona_builder_privacy.py the comparison uses
line.casefold() against the storage_needles tuple which contains mixed-case
tokens (e.g. "$GPD_DATA_DIR"), causing case-fold mismatches; fix by normalizing
the needles as well (e.g., build a lowercased/casefolded version of
storage_needles before the membership check) or compare using both sides
casefolded so the check uses line.casefold() in combination with
needle.casefold(); apply the same change to the second occurrence referenced
around lines 81-83 where the other tuple is checked.

Comment on lines +145 to +148
for node in ast.walk(tree):
if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)) and not node.name.startswith("_"):
names.append(node.name)
return tuple(sorted(names))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Limit public API detection to module-level definitions.

Using ast.walk includes nested defs/classes, which can falsely satisfy public-helper coverage assertions.

Proposed fix
 def _public_api_names(tree: ast.Module) -> tuple[str, ...]:
     names: list[str] = []
-    for node in ast.walk(tree):
+    for node in tree.body:
         if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)) and not node.name.startswith("_"):
             names.append(node.name)
     return tuple(sorted(names))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for node in ast.walk(tree):
if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)) and not node.name.startswith("_"):
names.append(node.name)
return tuple(sorted(names))
for node in tree.body:
if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)) and not node.name.startswith("_"):
names.append(node.name)
return tuple(sorted(names))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/core/test_research_persona_phase04_05_acceptance.py` around lines 145 -
148, The current public-API collector uses ast.walk(tree) which traverses nested
definitions; change it to iterate only module-level statements (e.g., for node
in tree.body) so only top-level ast.FunctionDef, ast.AsyncFunctionDef and
ast.ClassDef names are appended to names; keep the existing underscore-name
filter and return tuple(sorted(names)) as before.

Comment thread tests/core/test_research_persona_phase04_05_acceptance.py
Comment on lines +288 to +290
if _existing_optional_modules():
has_audit_command = any(any(hint in command for hint in AUDIT_COMMAND_HINTS) for command in command_names)
assert has_audit_command, _message("missing optional audit CLI command", command_names)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Gate audit command checks on audit modules only.

Current logic triggers audit-command enforcement when any optional module exists, including non-audit modules.

Proposed fix
-    if _existing_optional_modules():
+    optional_modules = set(_existing_optional_modules())
+    audit_modules = {"research_persona_audit", "research_persona_audit_log", "research_persona_runtime_audit"}
+    if optional_modules & audit_modules:
         has_audit_command = any(any(hint in command for hint in AUDIT_COMMAND_HINTS) for command in command_names)
         assert has_audit_command, _message("missing optional audit CLI command", command_names)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if _existing_optional_modules():
has_audit_command = any(any(hint in command for hint in AUDIT_COMMAND_HINTS) for command in command_names)
assert has_audit_command, _message("missing optional audit CLI command", command_names)
optional_modules = set(_existing_optional_modules())
audit_modules = {"research_persona_audit", "research_persona_audit_log", "research_persona_runtime_audit"}
if optional_modules & audit_modules:
has_audit_command = any(any(hint in command for hint in AUDIT_COMMAND_HINTS) for command in command_names)
assert has_audit_command, _message("missing optional audit CLI command", command_names)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/core/test_research_persona_system_acceptance.py` around lines 288 -
290, The test currently runs the audit-command assertion whenever any optional
module exists; change it to only run when audit-related optional modules are
present by replacing the generic _existing_optional_modules() gate with a check
for audit modules (e.g., call a variant like _existing_optional_modules("audit")
or use an explicit membership test against your audit module list), then keep
the existing has_audit_command logic that scans command_names against
AUDIT_COMMAND_HINTS and uses _message for the assertion failure.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/gpd/cli.py (1)

5416-5433: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return a non-zero exit code when persona validation fails.

src/gpd/core/research_persona_cli.py:579-596 returns a {valid: false, ...} payload instead of raising, so Line 5433 currently exits 0 for invalid persona JSON. That breaks the usual validate command contract for scripts and CI.

💡 Proposed fix
 def research_persona_validate(
     input_path: str | None = typer.Argument(
         None,
         metavar="[PROFILE_JSON|-]",
         help="Optional profile JSON path, or - for stdin. Defaults to the stored profile.",
     ),
 ) -> None:
     """Validate a research persona profile document or the stored profile."""
 
     document = _load_json_document_or_error(input_path) if input_path is not None else None
     payload = _research_persona_payload(
         "build_validate_payload",
         cwd=_get_cwd(),
         document=document,
         input_path=input_path,
     )
     _output(payload)
+    if isinstance(payload, dict) and payload.get("valid") is False:
+        raise typer.Exit(code=1)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/cli.py` around lines 5416 - 5433, The validate command currently
always exits 0 because research_persona_validate builds a payload via
_research_persona_payload and writes it with _output without checking the
result; update research_persona_validate to inspect the returned payload (e.g.,
check payload.get("valid") or equivalent boolean returned by
_research_persona_payload), and if validation failed (valid is False) call
sys.exit(non-zero) or raise typer.Exit(code=1) after emitting the payload so
scripts/CI see a failing exit code; keep using _load_json_document_or_error for
input handling and still call _output(payload) before exiting.
🧹 Nitpick comments (1)
tests/core/test_cli.py (1)

613-618: ⚡ Quick win

Derive expected tier-mix text from canonical counts instead of hard-coding literals.

This assertion will keep breaking on profile catalog updates even when behavior is correct. Build the expected string from _profile_tier_mix("review") so the test stays semantic-focused.

Proposed patch
 def _assert_cost_posture_semantics(output: str) -> None:
     assert _COST_TEST_RUNTIME in output
     assert "review" in output
     assert "runtime defaults" in output
-    assert "tier-1=14, tier-2=13, tier-3=1" in output
+    expected_mix = ", ".join(
+        f"{tier}={count}"
+        for tier in ("tier-1", "tier-2", "tier-3")
+        if (count := _profile_tier_mix("review").get(tier, 0)) > 0
+    )
+    assert expected_mix in output
     assert "Advisory only; counts profile-to-tier assignments" in output
     assert "set-tier-models" in output
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/core/test_cli.py` around lines 613 - 618, The test
_assert_cost_posture_semantics currently asserts a hard-coded tier mix string
which will break when profile counts change; replace the literal check for
"tier-1=14, tier-2=13, tier-3=1" with a derived expected string built by calling
_profile_tier_mix("review") (e.g., expected_tier_mix =
_profile_tier_mix("review")) and assert that expected_tier_mix is in output,
removing the hard-coded literal so the test stays semantic-focused; keep the
other assertions unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/gpd/specs/workflows/help.md`:
- Around line 287-290: Update the gpd:build-persona command signature to match
the usage examples by either expanding the signature `gpd:build-persona
[focus|--from-current-project|--interview-only]` to include the other supported
flags (`--paper`, `--bibtex`, `--manual-patch`) or replace the signature with a
concise form that indicates more options (e.g., `gpd:build-persona
[focus|--from-current-project|--interview-only|...]`) and add a short note that
`--paper`, `--bibtex`, and `--manual-patch` are valid source-ingestion options
referenced in the usage examples; ensure the signature text and the usage
examples remain consistent.

---

Outside diff comments:
In `@src/gpd/cli.py`:
- Around line 5416-5433: The validate command currently always exits 0 because
research_persona_validate builds a payload via _research_persona_payload and
writes it with _output without checking the result; update
research_persona_validate to inspect the returned payload (e.g., check
payload.get("valid") or equivalent boolean returned by
_research_persona_payload), and if validation failed (valid is False) call
sys.exit(non-zero) or raise typer.Exit(code=1) after emitting the payload so
scripts/CI see a failing exit code; keep using _load_json_document_or_error for
input handling and still call _output(payload) before exiting.

---

Nitpick comments:
In `@tests/core/test_cli.py`:
- Around line 613-618: The test _assert_cost_posture_semantics currently asserts
a hard-coded tier mix string which will break when profile counts change;
replace the literal check for "tier-1=14, tier-2=13, tier-3=1" with a derived
expected string built by calling _profile_tier_mix("review") (e.g.,
expected_tier_mix = _profile_tier_mix("review")) and assert that
expected_tier_mix is in output, removing the hard-coded literal so the test
stays semantic-focused; keep the other assertions unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 83c05621-28cb-454b-8a29-f6894b395603

📥 Commits

Reviewing files that changed from the base of the PR and between 21efd59 and 4c5e2bc.

📒 Files selected for processing (36)
  • src/gpd/agents/gpd-expertise-explainer.md
  • src/gpd/agents/gpd-researcher-doppelganger.md
  • src/gpd/agents/gpd-scientific-taste.md
  • src/gpd/cli.py
  • src/gpd/commands/build-persona.md
  • src/gpd/commands/help.md
  • src/gpd/core/config.py
  • src/gpd/core/context.py
  • src/gpd/core/return_fields.py
  • src/gpd/core/workflow_staging.py
  • src/gpd/specs/references/help/detailed-command-reference.md
  • src/gpd/specs/references/orchestration/model-profiles.md
  • src/gpd/specs/references/research/research-persona-applications.md
  • src/gpd/specs/workflows/build-persona-stage-manifest.json
  • src/gpd/specs/workflows/help.md
  • src/gpd/specs/workflows/set-profile.md
  • tests/README.md
  • tests/adapters/test_runtime_projection_diagnostics_budget.py
  • tests/core/test_agent_prompt_budget.py
  • tests/core/test_cli.py
  • tests/core/test_command_prompt_budget.py
  • tests/core/test_config.py
  • tests/core/test_prompt_surface_diagnostics_budget.py
  • tests/core/test_prompt_surface_diagnostics_cli.py
  • tests/core/test_research_persona_application_surfaces.py
  • tests/core/test_research_persona_builder_help.py
  • tests/core/test_research_persona_help_docs.py
  • tests/core/test_research_persona_models.py
  • tests/core/test_research_persona_patch_history.py
  • tests/core/test_research_persona_phase04_05_acceptance.py
  • tests/core/test_research_persona_privacy_contract.py
  • tests/core/test_research_persona_storage.py
  • tests/core/test_research_persona_system_acceptance.py
  • tests/core/test_workflow_init_authority.py
  • tests/core/test_workflow_init_builder_registry.py
  • tests/repo_graph_contract.json
💤 Files with no reviewable changes (2)
  • src/gpd/specs/references/research/research-persona-applications.md
  • src/gpd/specs/workflows/build-persona-stage-manifest.json
✅ Files skipped from review due to trivial changes (5)
  • src/gpd/specs/workflows/set-profile.md
  • tests/core/test_prompt_surface_diagnostics_cli.py
  • src/gpd/specs/references/help/detailed-command-reference.md
  • tests/core/test_command_prompt_budget.py
  • src/gpd/agents/gpd-scientific-taste.md
🚧 Files skipped from review as they are similar to previous changes (10)
  • src/gpd/agents/gpd-expertise-explainer.md
  • src/gpd/commands/help.md
  • src/gpd/agents/gpd-researcher-doppelganger.md
  • tests/core/test_research_persona_privacy_contract.py
  • tests/core/test_research_persona_builder_help.py
  • src/gpd/commands/build-persona.md
  • tests/core/test_research_persona_system_acceptance.py
  • tests/core/test_research_persona_patch_history.py
  • tests/core/test_research_persona_storage.py
  • tests/core/test_research_persona_models.py

Comment on lines +287 to +290
**`gpd:build-persona [focus|--from-current-project|--interview-only]`**
Build a private research persona patch from explicit interview or consented source ingestion
Usage: `gpd:build-persona --interview-only`; `gpd:build-persona --from-current-project "math/code balance and citation style"`; `gpd:build-persona --paper paper/main.tex`; `gpd:build-persona --bibtex references/library.bib`; `gpd:build-persona --manual-patch /tmp/persona-patch.json`
Notes: Emits a candidate ResearchPersonaPatch only; apply it separately with `gpd research-persona apply-patch`. Interviewing, paper imports, BibTeX imports, repository scans, and manual patch review require exact source consent. Source-derived candidates use `gpd research-persona ingest-source SOURCE_JSON|- --output PATCH_JSON`. Never writes persona storage directly.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the signature to match the usage examples.

The command signature on line 287 shows [focus|--from-current-project|--interview-only], but the usage examples on lines 289-290 demonstrate additional options (--paper, --bibtex, --manual-patch) that aren't listed in the signature. This creates a documentation inconsistency.

Either expand the signature to include all available options or add an indicator like [...] to signal that additional options exist beyond those shown.

📝 Proposed fix to make the signature comprehensive
-**`gpd:build-persona [focus|--from-current-project|--interview-only]`**
+**`gpd:build-persona [focus|--from-current-project|--interview-only|--paper PATH|--bibtex PATH|--manual-patch PATH]`**

Alternatively, if the signature should remain brief:

-**`gpd:build-persona [focus|--from-current-project|--interview-only]`**
+**`gpd:build-persona [focus|--from-current-project|--interview-only|...]`**

Then add a note explaining that additional source-ingestion options (--paper, --bibtex, --manual-patch) are available.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
**`gpd:build-persona [focus|--from-current-project|--interview-only]`**
Build a private research persona patch from explicit interview or consented source ingestion
Usage: `gpd:build-persona --interview-only`; `gpd:build-persona --from-current-project "math/code balance and citation style"`; `gpd:build-persona --paper paper/main.tex`; `gpd:build-persona --bibtex references/library.bib`; `gpd:build-persona --manual-patch /tmp/persona-patch.json`
Notes: Emits a candidate ResearchPersonaPatch only; apply it separately with `gpd research-persona apply-patch`. Interviewing, paper imports, BibTeX imports, repository scans, and manual patch review require exact source consent. Source-derived candidates use `gpd research-persona ingest-source SOURCE_JSON|- --output PATCH_JSON`. Never writes persona storage directly.
**`gpd:build-persona [focus|--from-current-project|--interview-only|--paper PATH|--bibtex PATH|--manual-patch PATH]`**
Build a private research persona patch from explicit interview or consented source ingestion
Usage: `gpd:build-persona --interview-only`; `gpd:build-persona --from-current-project "math/code balance and citation style"`; `gpd:build-persona --paper paper/main.tex`; `gpd:build-persona --bibtex references/library.bib`; `gpd:build-persona --manual-patch /tmp/persona-patch.json`
Notes: Emits a candidate ResearchPersonaPatch only; apply it separately with `gpd research-persona apply-patch`. Interviewing, paper imports, BibTeX imports, repository scans, and manual patch review require exact source consent. Source-derived candidates use `gpd research-persona ingest-source SOURCE_JSON|- --output PATCH_JSON`. Never writes persona storage directly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/gpd/specs/workflows/help.md` around lines 287 - 290, Update the
gpd:build-persona command signature to match the usage examples by either
expanding the signature `gpd:build-persona
[focus|--from-current-project|--interview-only]` to include the other supported
flags (`--paper`, `--bibtex`, `--manual-patch`) or replace the signature with a
concise form that indicates more options (e.g., `gpd:build-persona
[focus|--from-current-project|--interview-only|...]`) and add a short note that
`--paper`, `--bibtex`, and `--manual-patch` are valid source-ingestion options
referenced in the usage examples; ensure the signature text and the usage
examples remain consistent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants