Return error in case of runner crash #69


Draft · p1-0tr wants to merge 2 commits into main

Conversation

@p1-0tr (Member) commented Jun 5, 2025

In case the runner crashes, it would be nice to return an error to the user. So, add an error handler on the proxy and use it to try to figure out whether the runner crashed, and format the response error accordingly.

Based on: #68
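
For illustration only, a minimal sketch of that idea, assuming the runner exposes a done channel and an error accessor; the names below are invented for the sketch and are not the PR's actual code:

package example

import (
	"fmt"
	"net/http"
	"net/http/httputil"
)

// installErrorHandler is an illustrative sketch only: runnerDone and runnerErr
// are assumed stand-ins for the runner's real state, not names from this PR.
func installErrorHandler(proxy *httputil.ReverseProxy, runnerDone <-chan struct{}, runnerErr func() error) {
	proxy.ErrorHandler = func(w http.ResponseWriter, req *http.Request, err error) {
		select {
		case <-runnerDone:
			// The backend exited while the request was in flight; say so
			// instead of returning a bare proxy error.
			http.Error(w, fmt.Sprintf("inference runner crashed: %v", runnerErr()), http.StatusBadGateway)
		default:
			// Any other proxy failure (connection refused, timeout, ...).
			http.Error(w, err.Error(), http.StatusBadGateway)
		}
	}
}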

p1-0tr added 2 commits June 5, 2025 13:16
In case a runner becomes defunct, e.g. as a result of a backend crash, it would be neat to be able to reload it. So, if the loader finds an existing runner, have it check whether the runner is still alive, and create a new one if the runner is defunct.

Signed-off-by: Piotr Stankiewicz <[email protected]>
In case the runner crashes, it would be nice to return an error to the user. So, add an error handler on the proxy and use it to try to figure out whether the runner crashed, and format the response error accordingly.

Signed-off-by: Piotr Stankiewicz <[email protected]>
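
A minimal, self-contained sketch of the reload-on-defunct behavior described in the first commit above; the type and function names are assumptions for illustration, not the PR's actual loader code:

package example

import "log"

// runner is a stand-in for the loader's per-slot runner state; the real
// fields live in pkg/inference/scheduling/loader.go and are not shown here.
type runner struct {
	done chan struct{} // closed when the backend process exits
	err  error         // why it exited, if it did
}

// reuseOrReload returns the existing runner while it is still alive, and
// starts a replacement once its done channel has been closed (defunct).
func reuseOrReload(existing *runner, start func() *runner, backend, model string) *runner {
	select {
	case <-existing.done:
		log.Printf("Will reload defunct %s runner for %s. Runner error: %v.", backend, model, existing.err)
		return start()
	default:
		return existing
	}
}

Checking the done channel with a non-blocking select keeps the common case (a live runner) free of extra blocking.
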
@p1-0tr p1-0tr requested review from xenoscopic and doringeman June 5, 2025 12:05
pkg/inference/scheduling/loader.go (excerpt flagged by code scanning):

			return l.slots[existing], nil
		select {
		case <-l.slots[existing].done:
			l.log.Warnf("Will reload defunct %s runner for %s. Runner error: %s.", backendName, model,

Check failure: Code scanning / CodeQL

Log entries created from user input (High)

This log entry depends on a user-provided value.

Copilot Autofix (AI), 3 days ago

To fix the issue, the user-provided model value should be sanitized before being logged. Since the log entries are plain text, we can remove newline characters (\n and \r) from the model string to prevent log forgery. This can be achieved using the strings.ReplaceAll function.

The sanitization should be applied in the load function in loader.go before the model value is used in the Warnf log statement. This ensures that any malicious input is neutralized before being logged.


Suggested changeset 1
pkg/inference/scheduling/loader.go

Autofix patch

Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/pkg/inference/scheduling/loader.go b/pkg/inference/scheduling/loader.go
--- a/pkg/inference/scheduling/loader.go
+++ b/pkg/inference/scheduling/loader.go
@@ -12,2 +12,3 @@
 	"github.com/docker/model-runner/pkg/logging"
+	"strings"
 )
@@ -376,3 +377,5 @@
 			case <-l.slots[existing].done:
-				l.log.Warnf("Will reload defunct %s runner for %s. Runner error: %s.", backendName, model,
+				sanitizedModel := strings.ReplaceAll(model, "\n", "")
+				sanitizedModel = strings.ReplaceAll(sanitizedModel, "\r", "")
+				l.log.Warnf("Will reload defunct %s runner for %s. Runner error: %s.", backendName, sanitizedModel,
 					l.slots[existing].err)
EOF
@@ -134,6 +135,28 @@ func run(
		proxyLog: proxyLog,
	}

	proxy.ErrorHandler = func(w http.ResponseWriter, req *http.Request, err error) {
Collaborator commented:

Since this is going to be sent back to OpenAI clients, I would try to structure the error responses in a format they can parse. Documentation is a little sparse, but Google says the error format looks like:

{
  "error": {
    "message": "Invalid 'messages[1].content': string too long. Expected a string with maximum length 1048576, but got a string with length 1540820 instead.",
    "type": "invalid_request_error",
    "param": "messages[1].content",
    "code": "string_above_max_length"
  }
}

The only docs I can find for the API are https://platform.openai.com/docs/api-reference/responses-streaming/error
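
For context, a small sketch of what writing that envelope could look like on the Go side; the helper name, status code, and error type value are assumptions, not part of this PR:

package example

import (
	"encoding/json"
	"net/http"
)

// openAIError mirrors the "error" object in the JSON above.
type openAIError struct {
	Message string `json:"message"`
	Type    string `json:"type"`
	Param   string `json:"param,omitempty"`
	Code    string `json:"code,omitempty"`
}

// writeOpenAIError is a hypothetical helper the proxy's ErrorHandler could
// call so that OpenAI clients receive a parseable error envelope.
func writeOpenAIError(w http.ResponseWriter, status int, msg string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(struct {
		Error openAIError `json:"error"`
	}{Error: openAIError{
		Message: msg,
		Type:    "server_error", // assumed value for a crashed backend
	}})
}

The error handler added in this PR could then call such a helper instead of writing a plain-text error body.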
