Bug Description
Environment
- OpenViking server 0.3.10 (PyPI wheel openviking-0.3.10-cp310-abi3-manylinux_2_31_x86_64.whl)
- Python 3.10
- Rerank model: doubao-seed-rerank 251028
- Real production deployment (38 users, ~7100 vectors)
`hierarchical_retriever` calls `RerankClient.rerank_batch(query, documents)` with
`documents` lists that contain empty strings. The VikingDB rerank API responds
to such requests with the body `null`, so `response.json()` returns Python `None`.
The next line evaluates `"result" not in None`, which raises
`TypeError: argument of type 'NoneType' is not iterable`. The broad
`except Exception` swallows the error and returns `None`, and the retriever logs
`Invalid rerank result, fallback to vector scores`.
The result: rerank is 100% disabled in production, while the logs show only a
generic warning. All retrieval ranking is silently downgraded to raw vector
scores, which is a major recall-quality regression.
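The failure chain described above can be reproduced without any network call. The following is a minimal sketch using only the standard library; `json.loads` stands in for `response.json()` (an assumption, since the HTTP client used by `RerankClient` is not shown in this report), and the membership test is the same expression that appears in the traceback.

```python
import json

# A body consisting of the literal `null` is valid JSON and decodes to None.
result = json.loads("null")

# The membership check from rerank_batch then fails: None is not a container.
error_message = None
try:
    "result" not in result
except TypeError as exc:
    error_message = str(exc)

# On Python 3.10 this is the message seen in the log; the exact wording
# may differ on other Python versions.
print(error_message)
```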
This is the same root cause as #1658 (Message.content is empty for ToolPart-only
messages). PR #1675 fixed the sessions.py token-counting symptom of that bug,
but the retrieve/rerank path was not updated and still hits it.
Steps to Reproduce
。
Expected Behavior
。
Actual Behavior
。
Minimal Reproducible Example
Proposed fix, in two layers. Either is sufficient; doing both is safer.
Layer 1: make `rerank_batch` defensive against empty documents (cheap, local):
    def rerank_batch(self, query: str, documents: List[str]) -> Optional[List[float]]:
        if not documents:
            return []
        # Skip empty documents; assign score 0 for them, rerank the rest, then
        # merge back into the original positions.
        nonempty_idx = [i for i, d in enumerate(documents) if d and d.strip()]
        if not nonempty_idx:
            return [0.0] * len(documents)
        nonempty_docs = [documents[i] for i in nonempty_idx]
        # ... existing request build/send using nonempty_docs ...
        # returned_scores: scores parsed from the response, in nonempty_docs order.
        scores = [0.0] * len(documents)
        for i_orig, score in zip(nonempty_idx, returned_scores):
            scores[i_orig] = score
        return scores
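To make the merge-back step concrete, here is a self-contained sketch of the same idea that can be run in isolation. The function name `rerank_skipping_empty`, the `score_fn` callback, and `fake_score_fn` are hypothetical stand-ins for the real request code, which is elided above; the length check on the returned scores is an extra safeguard, not something the current client is known to do.

```python
from typing import Callable, List, Optional, Sequence


def rerank_skipping_empty(
    documents: Sequence[str],
    score_fn: Callable[[List[str]], Optional[List[float]]],
) -> Optional[List[float]]:
    """Score only non-empty documents; empty ones get 0.0 at their original index."""
    nonempty_idx = [i for i, d in enumerate(documents) if d and d.strip()]
    if not nonempty_idx:
        # Nothing worth sending; this also covers an empty input list.
        return [0.0] * len(documents)

    returned = score_fn([documents[i] for i in nonempty_idx])
    if returned is None or len(returned) != len(nonempty_idx):
        # Backend failed or returned a mismatched number of scores.
        return None

    scores = [0.0] * len(documents)
    for i_orig, score in zip(nonempty_idx, returned):
        scores[i_orig] = score
    return scores


def fake_score_fn(docs: List[str]) -> List[float]:
    # Hypothetical backend: it must never receive an empty document.
    assert all(d.strip() for d in docs)
    return [float(len(d)) for d in docs]


merged = rerank_skipping_empty(["", "abc", "   ", "de"], fake_score_fn)
print(merged)  # [0.0, 3.0, 0.0, 2.0]
```

Passing the scorer in as a callback keeps the filtering logic testable without touching the VikingDB API.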
Also harden the response parsing:
    try:
        result = response.json()
    except ValueError:
        result = None
    if not isinstance(result, dict) or not isinstance(result.get("result"), dict):
        logger.warning("[RerankClient] Unexpected response format: %r", result)
        return None
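The same validation can be exercised against raw body text. This is a sketch under assumptions: `parse_rerank_body` is a hypothetical helper, the `"result"` and `"data"` keys are taken from the traceback in this report, and the contents of `"data"` in the sample payload are made up for illustration.

```python
import json
from typing import Any, Dict, Optional


def parse_rerank_body(body_text: str) -> Optional[Dict[str, Any]]:
    """Return the inner "result" dict, or None if the body has an unexpected shape."""
    try:
        payload = json.loads(body_text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    if not isinstance(payload, dict):
        return None  # covers null, lists, strings, and numbers
    inner = payload.get("result")
    if not isinstance(inner, dict) or "data" not in inner:
        return None
    return inner


null_body = parse_rerank_body("null")   # the failing case from this report
empty_body = parse_rerank_body("")      # empty body is not valid JSON
good_body = parse_rerank_body('{"result": {"data": [0.5]}}')  # made-up payload
```

Both failing shapes collapse to `None` without raising, so the caller can log a specific warning instead of relying on a broad `except Exception`.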
Layer 2: fix `hierarchical_retriever` (and any other rerank caller) so that it
does not forward an empty `Message.content`. Derive the document text from the
message parts (text parts plus serialized tool input/output snippets), the same
way #1675 made `m.estimated_tokens` cover all parts.
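One possible shape for Layer 2 is sketched below. Every class and field name here (`TextPart`, `ToolPart`, `Message.parts`, `tool_input`, `tool_output`) is a hypothetical stand-in, not OpenViking's actual data model, and `rerank_text` with its `max_tool_chars` limit is likewise an illustration of the idea rather than a proposed API.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List, Union


@dataclass
class TextPart:  # hypothetical stand-in
    text: str


@dataclass
class ToolPart:  # hypothetical stand-in
    tool_name: str
    tool_input: Any = None
    tool_output: Any = None


@dataclass
class Message:  # hypothetical stand-in
    parts: List[Union[TextPart, ToolPart]] = field(default_factory=list)


def rerank_text(message: Message, max_tool_chars: int = 200) -> str:
    """Build rerank document text from all parts, not only plain text content."""
    chunks = []
    for part in message.parts:
        if isinstance(part, TextPart):
            if part.text.strip():
                chunks.append(part.text.strip())
        elif isinstance(part, ToolPart):
            # Serialize tool input/output and truncate to keep the document short.
            snippet = json.dumps(
                {"input": part.tool_input, "output": part.tool_output},
                ensure_ascii=False,
                default=str,
            )[:max_tool_chars]
            chunks.append(f"[{part.tool_name}] {snippet}")
    return "\n".join(chunks)


tool_only = Message(parts=[ToolPart("search", {"q": "report"}, "3 hits")])
doc = rerank_text(tool_only)
print(doc)
```

A message with no parts still yields an empty string, so the Layer 1 guard in `rerank_batch` remains useful as a backstop.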
Error Logs
After patching `except Exception as e: logger.error(...)` to
`logger.exception(...)` plus dumping inputs, a single search produced:
[RerankClient] Rerank failed: argument of type 'NoneType' is not iterable;
query_repr='麦肯锡组织现状报告';
doc_count=10;
first_doc_repr='' <-- empty document
[RerankClient] response_text_preview=''
Traceback (most recent call last):
  File ".../openviking/models/rerank/volcengine_rerank.py", line 130, in rerank_batch
    if "result" not in result or "data" not in result["result"]:
TypeError: argument of type 'NoneType' is not iterable
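The logging change that surfaced this traceback can be sketched as follows. The logger name `rerank_demo`, the `check` function, and the in-memory stream are assumptions for the demonstration; the point is only that `logger.exception` appends the active traceback, whereas `logger.error` with the same arguments does not.

```python
import io
import logging

logger = logging.getLogger("rerank_demo")  # hypothetical logger name
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output out of the root logger


def check(result, documents):
    """Mimic the failing membership test and log it with full context."""
    try:
        return "result" in result
    except Exception:
        # logger.exception logs at ERROR level and includes the traceback.
        logger.exception(
            "[RerankClient] Rerank failed; doc_count=%d; first_doc_repr=%r",
            len(documents),
            documents[0] if documents else None,
        )
        return None


outcome = check(None, [""])
captured = stream.getvalue()
print(captured)
```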
For comparison, calling `rerank_batch` directly with **non-empty** documents
returns `200 OK` with valid scores:
>>> client.rerank_batch(query="麦肯锡组织现状报告",
...                     documents=["麦肯锡 2026 报告交付", "无关文本"])
[0.4557728057342673, 0.10216368026616135]
VikingDB API response in the failing case (raw):
- HTTP 200 (or empty)
- Body: literal `null` → `response.json()` returns `None`
OpenViking Version
0.3.10
Python Version
3.10
Operating System
Linux
Model Backend
None
Additional Context
No response