Commit d05e8ce

Fix integer overflow DoS vulnerability in tokenization
Fixes #835

When an extremely large prompt (more than 2^31 characters) was sent to the llamafile server, the tokenization function experienced integer overflow, crashing with std::length_error and terminating the entire server process.

Root cause: In llamafile/llama.cpp line 50, text.size() (a size_t, i.e. uint64) was added to a small value and assigned to an int (int32), overflowing when text.size() exceeded INT_MAX.

Fix: Added a bounds check before the addition to prevent overflow. If the input text is too large, we now throw std::length_error with the same error message that llama.cpp naturally throws, which the worker exception handler catches and logs. This matches the behavior of standalone llama.cpp, whose internal std::vector bounds checks produce a controlled 500 error rather than crashing the process.

Security impact: Prevents a remote unauthenticated DoS attack in which an attacker could crash the llamafile server by sending an oversized prompt.
1 parent 78a2261 commit d05e8ce

File tree

1 file changed: 8 additions, 0 deletions

llamafile/llama.cpp

Lines changed: 8 additions & 0 deletions
@@ -18,6 +18,8 @@
 #include "llama.h"
 #include "llama.cpp/llama.h"
 #include <cassert>
+#include <climits>
+#include <stdexcept>
 #include <string>
 #include <vector>

@@ -47,6 +49,12 @@ std::string llamafile_token_to_piece(const llama_context *ctx, llama_token token
 std::vector<llama_token> llamafile_tokenize(const struct llama_model *model,
                                             const std::string_view &text, bool add_special,
                                             bool parse_special) {
+    // Prevent integer overflow: ensure text.size() + 2 fits in an int.
+    // INT_MAX is typically 2147483647, so check before the addition.
+    if (text.size() > static_cast<size_t>(INT_MAX) - 2) {
+        throw std::length_error("cannot create std::vector larger than max_size()");
+    }
+
     int n_tokens = text.size() + 2 * add_special;
     std::vector<llama_token> result(n_tokens);
     n_tokens = llama_tokenize(model, text.data(), text.size(), result.data(), result.size(),
