[Feat] CoreSolver 2-Phase 풀이 구현 및 문제 풀이 API 엔드포인트#43
Conversation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
📝 Walkthrough워크스루POST /api/v1/solve 요청을 ProovyState로 초기화해 LangGraph를 실행합니다. 도구(code_generate/code_execute)로 Phase 1 검증을 반복하고 Phase 2에서 검증된 결과를 LLM 스트리밍으로 SSE 전송하며, executor/emitter 컨텍스트와 크레딧 기록을 함께 처리합니다. 변경 사항전체 문제 풀이 시스템
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/proovy_agent/app/api/v1/solve.py`:
- Around line 42-57: The current_emitter token is set in the handler context but
reset inside the background task, causing ValueError; move the
current_emitter.set(emitter) call into the async _run() function (so set and
current_emitter.reset(token) happen in the same context), capture the returned
token inside _run, and call current_emitter.reset(token) in the _run() finally
block alongside await emitter.close(); ensure any references to token outside
_run are removed and that emitter.emit/error/close remain inside _run.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c811b74e-a4cf-496b-a421-e4dfeeeeaa6c
📒 Files selected for processing (9)
src/proovy_agent/app/api/v1/router.pysrc/proovy_agent/app/api/v1/solve.pysrc/proovy_agent/app/schemas/solve.pysrc/proovy_agent/common/sandbox/executor_var.pysrc/proovy_agent/common/sse/context.pysrc/proovy_agent/graph/agents/core_solver/agent.pysrc/proovy_agent/graph/tools/__init__.pysrc/proovy_agent/graph/tools/code_execute.pysrc/proovy_agent/graph/tools/code_generate.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@coderabbitai 지금 확인 가능해? |
|
The PR description images show a terminal demonstrating a successful end-to-end test of the
PR ✅ Actions performedReview triggered.
|
Preprocessor / Router / Planner / PlanExecutor / GeneralNode / CreditSettler / VideoNode(스텁) 노드 구현 및 LangGraph StateGraph 빌드. POST /api/v1/solve 엔드포인트를 전체 그래프 흐름으로 전환. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/proovy_agent/graph/builder.py`:
- Around line 12-20: The declared return types for get_graph() and _build() are
incorrect: they are annotated as StateGraph but actually return the result of
builder.compile(), which is a CompiledStateGraph; update the function signatures
and any _graph variable/type annotation to use CompiledStateGraph (and add an
import for CompiledStateGraph if missing) so the type hints match the
builder.compile() return type; ensure references to get_graph, _build, _graph,
and builder.compile are updated accordingly.
In `@src/proovy_agent/graph/nodes/plan_executor.py`:
- Around line 31-33: The early return in plan_executor that does "if not ready:
return Command(goto='credit_settler')" can cause premature transition while
other parallel branches are still in "running"; change the readiness check in
plan_executor to verify that there are no remaining running steps across
parallel branches (not just absence of new pending steps) before returning
Command(goto='credit_settler'). Specifically, augment the logic that uses the
ready flag to also inspect the aggregate state of in-flight steps (e.g., any
running/pending entries in the plan or step status store) and only emit
Command(goto='credit_settler') when all branches/steps are in terminal states;
otherwise keep execution suspended (return None or a retry command) so the
executor waits for running branches to finish.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f18a49e3-0224-4b0a-9d15-f0f5e1587353
📒 Files selected for processing (9)
src/proovy_agent/app/api/v1/solve.pysrc/proovy_agent/graph/builder.pysrc/proovy_agent/graph/nodes/credit_settler.pysrc/proovy_agent/graph/nodes/general_node.pysrc/proovy_agent/graph/nodes/plan_executor.pysrc/proovy_agent/graph/nodes/planner.pysrc/proovy_agent/graph/nodes/preprocessor.pysrc/proovy_agent/graph/nodes/router.pysrc/proovy_agent/graph/nodes/video_node.py
✅ Files skipped from review due to trivial changes (1)
- src/proovy_agent/graph/nodes/video_node.py
builder.py: get_graph/_build/_graph 반환 타입을 CompiledStateGraph로 수정 plan_executor: running 상태 step 존재 시 credit_settler 조기 전환 방지 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
chowon442
left a comment
There was a problem hiding this comment.
-
현재 테스트 목록에서 plan_executor, core_solver, solve_endpoint, credit_settler, get_graph 관련 테스트가 없습니다. 이번 작업에 대한 테스트 추가 부탁드립니다.
-
Planner structured output에서
selected_model을 LLM이 직접 반환하게 하는 것은 운영 정책과 도메인 판단이 섞이는 구조로 보입니다.- LLM은
difficulty만 반환하고,difficulty -> selected_model매핑은 코드 또는 config/env에서 관리하는 편이 더 안전합니다. 모델 alias나 실제 모델은 비용/성능/운영 정책에 따라 바뀔 수 있으므로 프롬프트와 schema에 고정하지 않는 게 좋을 것 같아요! - 그리고 이건 별로 중요한 건 아닌데 difficulty 관련 값에 Enum을 써도 괜찮을 것 같아요. (지금도 괜찮긴 해요)
- LLM은
| return [ | ||
| Send( | ||
| _ACTION_TO_NODE[step.action], | ||
| { | ||
| **state.model_dump(), | ||
| "plan": [s.model_dump() for s in plan], | ||
| "executing_step_idx": idx, | ||
| }, | ||
| ) | ||
| for idx, step in ready | ||
| ] |
There was a problem hiding this comment.
PlanExecutor 병렬 경로가 LangGraph 런타임에서 실패합니다.
plan_executor노드가list[Send]를 직접 반환합니다.- 현재 LangGraph에서 노드 반환값은 state update dict 또는
Command여야 합니다. Send는 conditional edge에서 반환하거나, 노드에서는Command(goto=[Send(...), ...])형태로 감싸야 합니다.
| if len(ready) == 1: | ||
| idx, step = ready[0] | ||
| plan[idx] = step.model_copy(update={"status": "running"}) | ||
| return Command( | ||
| update={"plan": plan, "executing_step_idx": idx}, | ||
| goto=_ACTION_TO_NODE[step.action], | ||
| ) |
There was a problem hiding this comment.
plan_executor는 단일 ready step을 running으로 바꾼 뒤 _ACTION_TO_NODE[step.action]로 라우팅합니다.
- PDF 요청의 경우
step.action == "pdf"이므로pdf_node로 이동합니다. - 그런데 현재 구현된
PDFNode는 실행 후 해당 step을done으로 갱신하지 않아, 다시plan_executor로 돌아왔을 때 PDF step이 계속running으로 남습니다. - 직접 pull을 받아서 수정해주시거나, @gaeunee2 님에게 전달 부탁드립니다. (PDFNode를 이 그래프 프로토콜에 맞게 감싸거나 수정해서 executing_step_idx의 step을 done/error로 업데이트해야 합니다.)
| async def _phase1_verify( | ||
| state: ProovyState, | ||
| emitter: object | None, | ||
| ) -> tuple[str, list, int]: | ||
| """Phase 1: 내부 풀이 + 코드 검증. (verified_summary, new_messages, execute_count) 반환.""" | ||
| llm = get_llm(state.selected_model) | ||
| llm_with_tools = llm.bind_tools(_TOOLS) | ||
|
|
||
| system_msg = SystemMessage(_build_verify_system(state)) | ||
| messages: list = list(state.messages) | ||
| execute_count = 0 | ||
|
|
||
| if emitter: | ||
| await emitter.emit( | ||
| "solve_progress", | ||
| {"text": "수학 문제를 분석하고 코드로 검증하는 중입니다..."}, | ||
| ) | ||
|
|
||
| for iteration in range(_MAX_ITERATIONS): | ||
| trimmed = _trim_tool_messages(messages) | ||
| response = await llm_with_tools.ainvoke([system_msg, *trimmed]) | ||
| messages.append(response) | ||
|
|
||
| if not response.tool_calls: | ||
| # 도구 호출 없음 → LLM이 검증 완료 판단 | ||
| content = response.content | ||
| if isinstance(content, list): | ||
| content = "".join( | ||
| block.get("text", "") if isinstance(block, dict) else str(block) | ||
| for block in content | ||
| ) | ||
| return str(content), messages, execute_count | ||
|
|
||
| # 도구 호출 실행 | ||
| for tool_call in response.tool_calls: | ||
| tool_name = tool_call["name"] | ||
| tool = _TOOLS_BY_NAME.get(tool_name) | ||
| if tool is None: | ||
| result = f"Unknown tool: {tool_name}" | ||
| else: | ||
| try: | ||
| result = await tool.ainvoke(tool_call["args"]) | ||
| if tool_name == "code_execute": | ||
| execute_count += 1 | ||
| except Exception as exc: | ||
| result = f"Tool error: {exc}" | ||
|
|
||
| messages.append(ToolMessage(content=str(result), tool_call_id=tool_call["id"])) | ||
|
|
||
| if emitter and iteration > 0: | ||
| await emitter.emit( | ||
| "solve_progress", | ||
| {"text": f"검증 재시도 중... ({iteration + 1}/{_MAX_ITERATIONS})"}, | ||
| ) | ||
|
|
||
| # 최대 반복 도달 — 마지막 AI 메시지를 결과로 사용 | ||
| last_ai = next( | ||
| (m for m in reversed(messages) if isinstance(m, AIMessage) and not m.tool_calls), | ||
| None, | ||
| ) | ||
| summary = str(last_ai.content) if last_ai else "검증 결과를 확인하지 못했습니다." | ||
| return summary, messages, execute_count |
There was a problem hiding this comment.
CoreSolver가 실제 코드 검증 없이도 "검증된 풀이"로 설명을 생성할 수 있습니다.
- 도구 호출이 없으면 곧바로 검증 완료로 간주합니다.
- 최대 반복에 도달해도 실패로 멈추지 않고 summary를 만들어 Phase 2로 갑니다. 그리고 Phase 2는 무조건 실행됩니다.
- 의도하신 동작인지 확인 부탁드립니다. 서비스의 핵심 목적이 "코드 실행으로 검증된 풀이만 사용자에게 노출"이므로, 최소 1회 이상의 성공한 code_execute를 추적하고 실패 시 최종 설명 생성 대신 명시적 실패 상태로 가는 방향이 좋을까요?
| task: asyncio.Task[None] = asyncio.create_task(_run()) | ||
| _active_tasks.add(task) | ||
| task.add_done_callback(_active_tasks.discard) | ||
|
|
||
| return EventSourceResponse(emitter.stream()) |
There was a problem hiding this comment.
SSE 연결 종료 후에도 그래프/LLM/샌드박스 작업이 계속 실행될 수 있습니다!
- 엔드포인트가 producer를
EventSourceResponse생명주기 밖의 background task로 분리합니다. - 클라이언트가 끊겨도
_run()은 취소되지 않습니다. 게다가 큐가 bounded라 consumer가 사라진 뒤 emit이 누적되면 producer가 막히고 cleanup까지 못 갈 수 있습니다. EventSourceResponse의data_sender_callable/close handler를 쓰거나, disconnect 시 task를 cancel하고SandboxManager.destroy_executor()까지 도달하는지 보장해야 합니다.- 어떻게 처리하는게 좋을지 검토 후, SSE 연결 종료 시에 대한 처리 방향을 문서에도 반영 부탁드립니다.
| credit_entries.append( | ||
| CreditEntry( | ||
| node="core_solver", | ||
| action="llm_call", | ||
| model=state.selected_model, | ||
| cost=model_cost, | ||
| ) | ||
| ) |
There was a problem hiding this comment.
크레딧 정산이 실제 LLM 호출 수를 크게 과소계산합니다.
- CoreSolver는 전체 실행에 대해 LLM credit을 한 번만 기록합니다. 하지만 실제로는 Phase 1 반복 호출, Phase 2 streaming 호출, code_generate 내부 Flash 호출이 추가로 발생합니다.
- 또한
Router/Planner의 Flash 호출도 현재credit_log에 들어가지 않습니다. CreditSettler를 붙인 PR이므로 실제 호출 단위로 중앙 accounting wrapper를 두는 편이 안전합니다. 다른 방식도 고려해보시고 좋은 방향으로 진행 부탁드립니다!
- solve.py: SSE 스트림 disconnect 시 백그라운드 태스크 취소 처리 - builder.py: CompiledStateGraph 타입 annotation 수정, PDFNode wrapper 추가 - core_solver: _code_execute_succeeded 헬퍼, llm_call_count/codegen_count 세분화 추적 - plan_executor: Command(goto=[Send(...)]) LangGraph 2.0 API 준수, running 가드 추가 - planner/router: credit_log 반환 및 page_start SSE 이벤트 추가 - 테스트: core_solver, plan_executor, credit_settler, builder, solve_endpoint 단위 테스트 추가 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/proovy_agent/graph/nodes/planner.py (1)
51-63:⚠️ Potential issue | 🟠 Major | ⚡ Quick winmath_task 경로에서는
solvestep을 후처리로 보장해야 합니다.
planner()가 LLM 출력을 그대로 신뢰해서video/steps=[]를 받으면,plan_executor는 실행할 step 없이 바로credit_settler로 끝납니다. 이 경로에서는 최소 1개의solvestep을 강제하거나, 없으면 명시적으로 실패시키는 검증이 필요합니다.수정 예시
plan = [PlanStep(action=s.action, description=s.description) for s in result.steps] + if not any(step.action == "solve" for step in plan): + raise ValueError("planner must include at least one solve step for math_task") + selected_model = _DIFFICULTY_TO_MODEL[result.difficulty]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/proovy_agent/graph/nodes/planner.py` around lines 51 - 63, In the planner() function ensure the "math_task" path always yields at least one PlanStep with action "solve": after building plan from result.steps (and before returning/credit logging and before plan_executor runs), check result.path or state.path for "math_task" and verify any(s.action == "solve" for s in result.steps); if none, either append a PlanStep(action="solve", description="Solve the math problem") to plan or raise a clear validation error so plan_executor/credit_settler cannot proceed with an empty or non-solving plan; update the return to reflect the adjusted plan and credit behavior.tests/graph/test_plan_executor.py (1)
21-88: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win테스트 함수들에
-> None을 일괄 추가해 주세요.Line 21~88의 테스트 함수 선언에서 반환 타입 힌트가 빠져 있습니다.
제안 수정안
-def test_solve_has_no_deps(): +def test_solve_has_no_deps() -> None: @@ -def test_pdf_waits_for_solve(): +def test_pdf_waits_for_solve() -> None: @@ -def test_pdf_ready_after_solve_done(): +def test_pdf_ready_after_solve_done() -> None: @@ -def test_video_and_pdf_both_ready_after_solve(): +def test_video_and_pdf_both_ready_after_solve() -> None: @@ -def test_running_step_is_not_ready(): +def test_running_step_is_not_ready() -> None: @@ `@pytest.mark.asyncio` -async def test_all_done_routes_to_credit_settler(): +async def test_all_done_routes_to_credit_settler() -> None: @@ `@pytest.mark.asyncio` -async def test_single_step_routes_to_correct_node(): +async def test_single_step_routes_to_correct_node() -> None: @@ `@pytest.mark.asyncio` -async def test_parallel_steps_wrapped_in_command(): +async def test_parallel_steps_wrapped_in_command() -> None: @@ `@pytest.mark.asyncio` -async def test_running_step_does_not_transition_to_credit_settler(): +async def test_running_step_does_not_transition_to_credit_settler() -> None:As per coding guidelines
Use Python 3.12+ with mandatory type hints in all code.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/graph/test_plan_executor.py` around lines 21 - 88, Add explicit return type hints "-> None" to every test function in this block (functions like test_solve_has_no_deps, test_pdf_waits_for_solve, test_pdf_ready_after_solve_done, test_video_and_pdf_both_ready_after_solve, test_running_step_is_not_ready, test_all_done_routes_to_credit_settler, test_single_step_routes_to_correct_node, test_parallel_steps_wrapped_in_command, test_running_step_does_not_transition_to_credit_settler) so each def line declares a return type; do this for both sync tests and async tests (keep the `@pytest.mark.asyncio` decorators and async def for plan_executor-related tests) to comply with the project’s mandatory type-hint rule.
🧹 Nitpick comments (2)
tests/app/test_solve_endpoint.py (1)
31-42: ⚡ Quick win이 테스트는 아직 스트림 producer를 검증하지 못합니다.
지금은 헤더만 확인해서
_run()이 시작되지 않거나get_graph().ainvoke()가 호출되지 않아도 통과할 수 있습니다. 최소 한 이벤트를 소비하고mock_graph.ainvoke.assert_awaited_once()까지 확인해 두는 게 안전합니다.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/app/test_solve_endpoint.py` around lines 31 - 42, Update the test_valid_request_returns_sse_stream to actually consume at least one SSE event and assert the mocked graph was awaited: after posting to "/api/v1/solve" (keeping the existing get_graph patch and mock_graph with mock_graph.ainvoke = AsyncMock()), iterate the response streaming body (e.g. use response.iter_lines() / response.iter_text() or another TestClient streaming iterator) and read the first non-empty SSE line/event, then assert mock_graph.ainvoke.assert_awaited_once() in addition to the existing status and content-type checks; ensure you reference the test function name test_valid_request_returns_sse_stream, the patched get_graph, and mock_graph.ainvoke when making these assertions.src/proovy_agent/app/api/v1/solve.py (1)
47-49: ⚡ Quick win같은 실패에
errorSSE가 두 번 나갈 수 있습니다.
core_solver()는 검증 실패 시 이미error이벤트를 보낸 뒤 예외를 다시 올립니다. 여기서 generic error를 한 번 더 emit하면 클라이언트는 동일 실패를 두 개의 오류 이벤트로 받게 됩니다. 사용자에게 이미 전달된 예외는 중복 emit을 피할 수 있게 구분해 두는 편이 좋습니다.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/proovy_agent/app/api/v1/solve.py` around lines 47 - 49, core_solver() already emits an "error" SSE for validation failures and re-raises, so the generic except block in solve.py currently causes duplicate SSEs; change the except clause to capture the exception (except Exception as e:) and only call await emitter.emit("error", ...) if the exception does not carry a marker indicating the error was already emitted (e.g., check hasattr(e, "sse_emitted") or getattr(e, "sse_emitted", False) == False); also ensure core_solver() sets that marker on exceptions it emits (e.g., set e.sse_emitted = True) so the solve.py handler can reliably skip re-emitting, while still calling logger.exception("solve 실행 중 오류 발생") and letting the exception propagate as needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/proovy_agent/graph/agents/core_solver/agent.py`:
- Around line 71-73: The current _code_execute_succeeded function only treats
"exit_code: 1" as failure, causing false positives for other non-zero exit
codes; update _code_execute_succeeded to parse the numeric exit_code from the
result (e.g., with a regex for "exit_code:\s*(-?\d+)") and return False if an
exit_code is present and not equal to 0, and also return False if the result
contains "error:" (case-insensitive); if no exit_code is present, fall back to
the existing "error:" check to decide success.
In `@src/proovy_agent/graph/builder.py`:
- Around line 62-68: The current broad except Exception around importing and
instantiating PDFNode hides real runtime/initialization bugs; change the handler
to only catch import-related failures (e.g., except (ImportError,
ModuleNotFoundError) as e) so that failures inside PDFNode() bubble up, and
update the warning log (logging.getLogger(__name__).warning) to include the
exception details; keep the fallback to _pdf_stub and keep using
_pdf_step_done_wrapper(PDFNode()) for the normal path.
In `@tests/app/test_solve_endpoint.py`:
- Around line 45-56: The test test_thread_id_auto_generated currently only
checks status 200 but must verify that a thread_id was auto-generated and passed
to the graph; after posting via client.post (using the patched get_graph
returning mock_graph), assert mock_graph.ainvoke was awaited and inspect the
arguments it was called with to ensure the passed-in state's thread_id is
present/non-empty (e.g., access the first await call on mock_graph.ainvoke and
assert the state's thread_id is not None/empty); update the test to perform this
check in addition to the existing status_code assertion.
In `@tests/graph/test_builder.py`:
- Around line 15-28: Add explicit return type annotations "-> None" to the two
test functions so they comply with the project type-hint rule: update the defs
for test_get_graph_builds_only_once and test_get_graph_returns_build_result to
include "-> None" after the parameter list (i.e., def
test_get_graph_builds_only_once() -> None: and def
test_get_graph_returns_build_result() -> None:) while keeping the existing
bodies and mocks unchanged.
In `@tests/graph/test_credit_settler.py`:
- Around line 15-51: Four async test functions (test_sums_total_credit,
test_emits_credit_settled_event, test_no_emitter_does_not_raise,
test_empty_log_total_is_zero) are missing the required return type hint; update
each function definition to include the explicit return type annotation "->
None" on the async def signature (e.g., async def test_sums_total_credit(...) ->
None:) so they comply with the Python 3.12+ mandatory type-hinting guideline.
---
Outside diff comments:
In `@src/proovy_agent/graph/nodes/planner.py`:
- Around line 51-63: In the planner() function ensure the "math_task" path
always yields at least one PlanStep with action "solve": after building plan
from result.steps (and before returning/credit logging and before plan_executor
runs), check result.path or state.path for "math_task" and verify any(s.action
== "solve" for s in result.steps); if none, either append a
PlanStep(action="solve", description="Solve the math problem") to plan or raise
a clear validation error so plan_executor/credit_settler cannot proceed with an
empty or non-solving plan; update the return to reflect the adjusted plan and
credit behavior.
In `@tests/graph/test_plan_executor.py`:
- Around line 21-88: Add explicit return type hints "-> None" to every test
function in this block (functions like test_solve_has_no_deps,
test_pdf_waits_for_solve, test_pdf_ready_after_solve_done,
test_video_and_pdf_both_ready_after_solve, test_running_step_is_not_ready,
test_all_done_routes_to_credit_settler, test_single_step_routes_to_correct_node,
test_parallel_steps_wrapped_in_command,
test_running_step_does_not_transition_to_credit_settler) so each def line
declares a return type; do this for both sync tests and async tests (keep the
`@pytest.mark.asyncio` decorators and async def for plan_executor-related tests)
to comply with the project’s mandatory type-hint rule.
---
Nitpick comments:
In `@src/proovy_agent/app/api/v1/solve.py`:
- Around line 47-49: core_solver() already emits an "error" SSE for validation
failures and re-raises, so the generic except block in solve.py currently causes
duplicate SSEs; change the except clause to capture the exception (except
Exception as e:) and only call await emitter.emit("error", ...) if the exception
does not carry a marker indicating the error was already emitted (e.g., check
hasattr(e, "sse_emitted") or getattr(e, "sse_emitted", False) == False); also
ensure core_solver() sets that marker on exceptions it emits (e.g., set
e.sse_emitted = True) so the solve.py handler can reliably skip re-emitting,
while still calling logger.exception("solve 실행 중 오류 발생") and letting the
exception propagate as needed.
In `@tests/app/test_solve_endpoint.py`:
- Around line 31-42: Update the test_valid_request_returns_sse_stream to
actually consume at least one SSE event and assert the mocked graph was awaited:
after posting to "/api/v1/solve" (keeping the existing get_graph patch and
mock_graph with mock_graph.ainvoke = AsyncMock()), iterate the response
streaming body (e.g. use response.iter_lines() / response.iter_text() or another
TestClient streaming iterator) and read the first non-empty SSE line/event, then
assert mock_graph.ainvoke.assert_awaited_once() in addition to the existing
status and content-type checks; ensure you reference the test function name
test_valid_request_returns_sse_stream, the patched get_graph, and
mock_graph.ainvoke when making these assertions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 22600553-34c1-43fa-a559-ec251437f348
📒 Files selected for processing (13)
src/proovy_agent/app/api/v1/solve.pysrc/proovy_agent/graph/agents/core_solver/agent.pysrc/proovy_agent/graph/builder.pysrc/proovy_agent/graph/nodes/plan_executor.pysrc/proovy_agent/graph/nodes/planner.pysrc/proovy_agent/graph/nodes/router.pytests/app/test_solve_endpoint.pytests/graph/__init__.pytests/graph/agents/__init__.pytests/graph/agents/test_core_solver.pytests/graph/test_builder.pytests/graph/test_credit_settler.pytests/graph/test_plan_executor.py
- gemini-2.0-flash → gemini-2.5-flash: 구조화 출력 정확도 개선 - claude-opus-4-5 → claude-sonnet-4-5: 도구 호출 속도 3-4배 개선, 품질 동등 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- _code_execute_succeeded: exit_code 정규식으로 0 외 모든 비정상 코드 감지 - builder.py: PDFNode 임포트 오류만 잡도록 except 범위 축소, 오류 메시지 포함 - solve.py + core_solver: sse_emitted 마커로 중복 SSE error 이벤트 방지 - planner.py: LLM이 solve step 누락 시 자동 삽입 (plan_executor 안전성 보장) - 테스트: -> None 반환 타입 힌트 추가, ainvoke 호출 및 thread_id 검증 추가 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
모두 수정해서 반영했습니다! |
chowon442
left a comment
There was a problem hiding this comment.
src/proovy_agent/common/sse/emitter.py
await self._queue.put(None)disconnect 이후 consumer가 사라진 상태에서 큐가 가득 차 있으면 close()가 여기서 대기할 수 있습니다.
SSEEmitter 큐는 bounded queue인데, token stream처럼 이벤트가 많이 쌓인 상태에서 _stream() 소비가 중단되면 sentinel을 넣는 put(None)도 block됩니다. 그러면 _run()의 finally가 끝나지 않아 cleanup 완료가 불안정해집니다.
최소 재현에서도 256개 이벤트로 큐를 채운 뒤 close()를 호출하면 timeout이 납니다. disconnect cleanup 보장을 위해 close는 non-blocking 방식으로 sentinel을 넣거나, queue drain/drop 정책을 명확히 해야 할 것 같습니다.
async with self._close_lock:
if self._closed:
logger.debug("emit() 무시됨 — 이미 닫힌 이미터 (event=%s)", event)
return
await self._queue.put(SSEEvent(event=event, data=data))emit()도 lock을 잡은 상태에서 bounded queue put()을 기다립니다. 큐가 full이면 producer가 여기서 멈추고, close()도 같은 lock을 기다리게 되어 disconnect/cleanup 경로가 더 취약해집니다.
SSE는 best-effort 실시간 전달 성격이 강하므로, full queue에서 backpressure를 줄지/drop할지/timeout 처리할지 정책을 명확히 하는 게 필요해 보입니다.
| try: | ||
| from proovy_agent.graph.nodes.pdf_node.pdf_node import PDFNode | ||
|
|
||
| pdf_node = _pdf_step_done_wrapper(PDFNode()) | ||
| except (ImportError, ModuleNotFoundError) as e: | ||
| logging.getLogger(__name__).warning("PDFNode 로드 실패 — 스텁으로 대체합니다. 원인: %s", e) | ||
| pdf_node = _pdf_stub |
There was a problem hiding this comment.
broad Exception을 줄인 방향은 맞는데, #44의 Builder 요구사항에는 weasyprint native dependency 문제로 인한 OSError 방지가 포함되어 있습니다.
현재는 import 계열만 fallback 처리해서, WeasyPrint/Cairo/Pango 같은 native library 로딩 실패가 OSError로 발생하는 환경에서는 graph build 자체가 실패할 수 있습니다.
PDFNode() 내부 로직 버그는 숨기지 않되, import/native loading 단계에서 발생하는 OSError는 fallback 대상으로 포함하는 방식이 안전해 보입니다.
| finally: | ||
| # 클라이언트 disconnect 시 백그라운드 태스크 취소 | ||
| if not task.done(): | ||
| task.cancel() |
There was a problem hiding this comment.
이 정책은 이전 SSE 이슈의 방향과 충돌할 수 있어 확인이 필요합니다.
#9는 SSE 연결이 끊겨도 messages를 durable source로 두고 복원할 수 있다는 방향을 설명합니다. 그런데 현재 구현은 disconnect 시 graph task를 cancel하므로 풀이 진행 자체가 중단됩니다.
의도적으로 "disconnect = solve cancel"로 정책을 바꾼 것이라면 문서/이슈 설명도 같이 맞춰야 하고, 아니라면 SSE stream 수명주기와 graph 실행 수명주기를 분리해야 할 것 같습니다.
| return Command( | ||
| update={"plan": plan}, | ||
| goto=[ | ||
| Send( | ||
| _ACTION_TO_NODE[step.action], | ||
| { | ||
| **state.model_dump(), | ||
| "plan": [s.model_dump() for s in plan], | ||
| "executing_step_idx": idx, | ||
| }, | ||
| ) | ||
| for idx, step in ready | ||
| ], | ||
| ) |
There was a problem hiding this comment.
이 병렬 경로는 실제 graph 실행에서 깨질 수 있습니다.
Send payload가 dict라서 video_node/pdf_node가 받는 state는 ProovyState가 아니게 됩니다. 해당 노드들은 state.plan, state.executing_step_idx처럼 attribute access를 사용하므로 병렬 실행 시 AttributeError: 'dict' object has no attribute 'plan'가 발생합니다.
또 payload를 ProovyState로 바꾸더라도 video와 pdf 브랜치가 같은 superstep에서 둘 다 plan을 업데이트하면, 현재 plan 필드는 reducer가 없는 plain list라 LangGraph가 concurrent update를 병합하지 못합니다.
실제 병렬 fan-out을 graph로 실행하는 회귀 테스트를 추가하고, plan 업데이트 병합 방식도 같이 정리해야 할 것 같습니다.
📌 관련 이슈
🏷️ PR 타입
📝 작업 내용
Preprocessor— @커맨드 파싱, ocr_text/tags 추출Router— Flash LLM + structured output으로 general_chat / math_task 분류Planner— Flash LLM으로 plan / difficulty / selected_model / use_page 결정, use_page 시page_startSSE 발행PlanExecutor— Command API 단일 라우팅, Send API 병렬 디스패치, 완료 시 CreditSettler 이동GeneralNode— Flash LLM 토큰 스트리밍,tokenSSE 발행, credit_log 0.5cr 기록CreditSettler— credit_log 합산,credit_settledSSE 발행VideoNode— 스텁 구현 (plan step 상태 done 처리)builder.py— lazy singleton 패턴, PDFNode 등 무거운 임포트를_build()내부로 지연 (weasyprint OSError 방지)solve.py— 직접 CoreSolver 호출에서get_graph().ainvoke(state)전체 그래프 흐름으로 전환📸 스크린샷
✅ 체크리스트
📎 기타 참고사항
pdf_node/코드는 수정하지 않음. 로컬 Mac 환경에서 weasyprint 실행을 위해brew install pango필요Summary by CodeRabbit
릴리스 노트
새 기능
/solveAPI 엔드포인트 추가 (Server-Sent Events로 실시간 스트리밍)테스트