Skip to content

Commit 9f6a5ce

Browse files
authored
fix bos_token is None in TextDataset (#139)
1 parent 95e4bda commit 9f6a5ce

File tree

1 file changed

+5
-1
lines changed

1 file changed

+5
-1
lines changed

angelslim/data/text_dataset.py

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -102,7 +102,11 @@ def _load_jsonl_data(self, data_path: str, num_samples: int):
102102
thinking_data = True
103103
break
104104
if thinking_data:
105-
text = self.processor.bos_token
105+
text = (
106+
self.processor.bos_token
107+
if self.processor.bos_token is not None
108+
else ""
109+
)
106110
for dic in messages:
107111
if dic["role"] == "system":
108112
text += dic["content"]

0 commit comments

Comments
 (0)