feat: support HTTP extra headers#169
Conversation
| for k, v := range cfg.ExtraHeaders { | ||
| opts = append(opts, openaiopt.WithHeader(k, v)) | ||
| } |
There was a problem hiding this comment.
Security concern: Extra headers are appended after auth-related headers (e.g., Authorization, X-Api-Key) and User-Agent. If a user (or misconfigured environment variable) sets an extra header with one of these reserved names, it could override the authentication credentials or other critical headers.
Consider either:
- Blocking/reserving sensitive header names (e.g.,
authorization,x-api-key,user-agent) inParseExtraHeadersor here before appending, or - Applying extra headers before the auth headers so that auth headers always take precedence.
There was a problem hiding this comment.
Maybe, or not.
Should we explicitly block this? In some hacky cases, users might configure it intentionally.
| for k, v := range cfg.ExtraHeaders { | ||
| opts = append(opts, option.WithHeader(k, v)) | ||
| } |
There was a problem hiding this comment.
Security concern: Same issue as in NewOpenAIClient — extra headers are appended after auth headers and could override Authorization, X-Api-Key, or User-Agent. For the Anthropic client this is particularly risky because the auth mechanism uses WithAuthToken/WithAPIKey combined with WithHeaderDel, and a subsequent WithHeader("Authorization", ...) could reintroduce a conflicting value.
Consider filtering out reserved header names or applying extra headers before the auth-specific options.
| // ParseExtraHeaders parses a string of comma-separated key=value pairs into a dictionary. | ||
| func ParseExtraHeaders(raw string) (map[string]string, error) { | ||
| if raw == "" { | ||
| return nil, nil | ||
| } | ||
| result := make(map[string]string) | ||
| for _, pair := range strings.Split(raw, ",") { | ||
| pair = strings.TrimSpace(pair) | ||
| if pair == "" { | ||
| continue | ||
| } | ||
| parts := strings.SplitN(pair, "=", 2) | ||
| if len(parts) != 2 { | ||
| return nil, fmt.Errorf("invalid extra header %q: expected key=value", pair) | ||
| } | ||
| key := strings.TrimSpace(parts[0]) | ||
| value := strings.TrimSpace(parts[1]) | ||
| if key == "" { | ||
| return nil, fmt.Errorf("invalid extra header %q: empty header name", pair) | ||
| } | ||
| result[key] = value | ||
| } | ||
| return result, nil | ||
| } |
There was a problem hiding this comment.
Security concern: ExtraHeaders are applied after the authentication headers (Authorization, X-Api-Key) and User-Agent are set in both NewOpenAIClient and NewAnthropicClient. This means a user (or a misconfigured environment variable) could inadvertently or maliciously override the auth token by setting e.g. OCR_LLM_EXTRA_HEADERS=Authorization=Bearer-evil-token. Consider either:
- Blocking reserved header names (e.g.,
authorization,x-api-key,user-agent) inParseExtraHeadersor at the point of application, or - Applying
ExtraHeadersbefore the auth headers so that auth always takes precedence.
| return nil, nil | ||
| } | ||
| result := make(map[string]string) | ||
| for _, pair := range strings.Split(raw, ",") { |
There was a problem hiding this comment.
Correctness issue: Header values containing commas cannot be represented with this format. For example, Accept=application/json, text/plain would be split into Accept=application/json and text/plain (which lacks = and causes a parse error). Common HTTP headers like Accept, Cache-Control, and Content-Type can legitimately contain commas.
Consider using a different delimiter (e.g., semicolons), supporting an escape mechanism, or documenting this limitation clearly in user-facing documentation.
There was a problem hiding this comment.
Legit.
However, I adopted a format widely used in other ecosystems, and it likely has the same constraints.
To avoid any issue, one could use a format like key1: value1\nkey2: value2, similar to Claude Code. However, such a format is difficult to read and error-prone when editing manually.
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
lizhengfeng101
left a comment
There was a problem hiding this comment.
Nice feature! Two suggestions below regarding robustness.
|
|
||
| // ensureMessagesSuffix appends /v1/messages to base URLs that lack a versioned path. | ||
| func ensureMessagesSuffix(rawURL string) string { | ||
| u := strings.TrimRight(rawURL, "/") |
There was a problem hiding this comment.
Suggestion: validate against sensitive header names
Currently ParseExtraHeaders accepts any header name, including Authorization, X-Api-Key, Content-Type, etc. In both NewAnthropicClient and NewOpenAIClient, extra headers are appended after auth/content-type setup, so they silently override critical headers.
While this is user-configured (not an external attack surface), it can cause confusing failures — e.g., a typo like Authorization=wrong-token silently breaks auth with no clear error.
Suggestion: add a blocklist for known sensitive headers:
var sensitiveHeaders = map[string]bool{
"authorization": true,
"x-api-key": true,
"content-type": true,
}
// after parsing, before return:
for k := range result {
if sensitiveHeaders[strings.ToLower(k)] {
return nil, fmt.Errorf("extra header %q conflicts with a reserved header; use the dedicated config field instead", k)
}
}| parts := strings.SplitN(pair, "=", 2) | ||
| if len(parts) != 2 { | ||
| return nil, fmt.Errorf("invalid extra header %q: expected key=value", pair) | ||
| } |
There was a problem hiding this comment.
Suggestion: support quoted values to allow commas
The comma-separated key=value format breaks when a value itself contains commas:
ocr config set llm.extra_headers "X-Forwarded-For=1.2.3.4,5.6.7.8"
# parsed as: {"X-Forwarded-For": "1.2.3.4"} + error on "5.6.7.8"A backward-compatible fix: support optional double-quote wrapping on the value side.
ocr config set llm.extra_headers 'X-Forwarded-For="1.2.3.4,5.6.7.8",X-Org=abc'Sketch — replace strings.Split(raw, ",") with a scanner:
var value string
if len(raw) > 0 && raw[0] == '"' {
closeIdx := strings.IndexByte(raw[1:], '"')
if closeIdx < 0 {
return nil, fmt.Errorf("unclosed quote for key %q", key)
}
value = raw[1 : closeIdx+1]
raw = raw[closeIdx+2:] // skip past closing quote + comma
} else {
commaIdx := strings.IndexByte(raw, ',')
if commaIdx < 0 {
value = strings.TrimSpace(raw)
raw = ""
} else {
value = strings.TrimSpace(raw[:commaIdx])
raw = raw[commaIdx+1:]
}
}Unquoted inputs parse identically to today, so fully backward-compatible.
Description
This makes it possible to configure custom HTTP headers for LLM requests.
By environment variable (
OCR_LLM_EXTRA_HEADERS) or a config entry (llm.extra_headers)It's useful for custom LLM backend users with some special header requirements.
Type of Change
How Has This Been Tested?
make testpasses locallyChecklist
go fmt,go vet)Related Issues
Resolves #166