Current approach is assigning every new character a very low probability ( $log(p) = -18$ ). - Random distribution with mean matching the existing tokenizer and standard deviation 0 - High probability relative to existing tokens