Commit 78a2261
fix: Server v2 production issues (PR #788)
Fixes four critical production issues in llamafile Server v2:
1. **Fix .args loading timing** (llama.cpp main/main.cpp)
- Move cosmo_args() call before determine_program()
- Ensures --server --v2 flags in .args are seen when determining program mode
- Fixes #783
2. **Add URL prefix normalization** (llamafile/flags.cpp)
- Consolidate consecutive slashes (//api/v1 → /api/v1)
- Ensure leading slash, remove trailing slash
- Validate AFTER normalization
- Use static std::string for proper lifetime management (no memory leak)
- Fixes #767
3. **Robust partial write handling** (llamafile/server/client.cpp)
- Implement full write loop to handle partial writes correctly
- Handle EINTR (signal interruption) gracefully
- Properly detect connection closure
- Increase file transfer buffer from 512B to 16KB for better performance
4. **Remove aggressive client dropping** (llamafile/server/worker.cpp)
- Remove code that kills oldest active connection when all workers busy
- Let TCP listen backlog naturally queue incoming connections
- Provides better UX (graceful queuing vs abrupt disconnection)
- Fixes #787
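The URL prefix normalization in item 2 can be sketched as follows. This is a minimal illustration of the rules listed there, not the code from llamafile/flags.cpp: the names `normalize_url_prefix` and `store_url_prefix` are hypothetical, and returning "/" for an empty or all-slash prefix is an assumption, since the commit message does not say how that case is handled.

```cpp
#include <string>

// Hypothetical helper: applies the three rules from the commit message.
std::string normalize_url_prefix(const std::string &raw) {
    std::string out;
    out.reserve(raw.size() + 1);
    out.push_back('/');                      // ensure a leading slash
    for (char c : raw) {
        if (c == '/' && out.back() == '/')
            continue;                        // collapse consecutive slashes
        out.push_back(c);
    }
    while (out.size() > 1 && out.back() == '/')
        out.pop_back();                      // remove trailing slash (keep lone "/")
    return out;
}

// Hypothetical holder showing the "static std::string" lifetime idea: the
// string owns the buffer for the life of the process, so nothing has to be
// freed by hand. The returned pointer is only valid until the next call.
const char *store_url_prefix(const std::string &raw) {
    static std::string storage;
    storage = normalize_url_prefix(raw);
    return storage.c_str();
}
```

Any validation would run on the returned value, so that an input such as `//api/v1` is checked in its normalized form `/api/v1`.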
All fixes improve upon original PR #788 with better error handling
and no memory leaks.

1 parent 9509d91 commit 78a2261
File tree
4 files changed, +63 -22 lines changed
- llama.cpp.patches/patches
- llamafile
  - server