Intentionally bad translator: multi-hop NLLB-200 with optional noise (vowel drop/shuffle/char del) and similarity metrics (BLEU/ROUGE/chrF). CLI + Gradio UI.

davidyen1124/bad-translator

Bad Translator

Because sometimes you want your prose to see the world and come back weird.

Bad Translator hopscotches your text through multiple languages (NLLB‑200), sprinkles in tasteful chaos (vowel drop, word shuffles, random deletions, accent stripping), then measures how gloriously off the rails it went (BLEU, ROUGE‑L, chrF, edit distance).

Bad Translator UI screenshot

Setup

  • Requires Python 3.10+ and uv (install via brew install uv on macOS or the official install script).
  • First run downloads the NLLB‑200 distilled model (~1.3 GB). GPU optional, patience mandatory.

```shell
# CLI
uv run bad-translator --text "The quick brown fox jumps over the lazy dog." --langs en,es,fr,zh,en \
  --noise-drop-vowels 0.4 --noise-shuffle 0.25 --noise-char-del 0.02

# UI (include optional deps)
uv run --extra ui bad-translator-ui
```

What’s happening

  • Multi‑hop translate with NLLB‑200: en → es → fr → zh → en (your call).
  • Add noise after each hop: drop vowels, shuffle words, delete random chars, strip accents.
  • Score the final vs. original: BLEU, ROUGE‑L(F1), chrF, edit distance.
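
To make the steps above concrete, here is a minimal, stdlib-only sketch of the noise passes and an edit-distance score. It is not the project's actual code: every function name, the `run_chain` helper, and the exact meaning of each probability (per-vowel, per-sentence, per-character) are assumptions for illustration, and the `translate` callable is a placeholder for whatever wraps NLLB‑200.

```python
import random
import unicodedata
from typing import Callable

VOWELS = set("aeiouAEIOU")

def drop_vowels(text: str, p: float, rng: random.Random) -> str:
    """Remove each vowel independently with probability p (assumed semantics)."""
    return "".join(c for c in text if not (c in VOWELS and rng.random() < p))

def shuffle_words(text: str, p: float, rng: random.Random) -> str:
    """With probability p, shuffle the whitespace-separated words (assumed semantics)."""
    words = text.split()
    if len(words) > 1 and rng.random() < p:
        rng.shuffle(words)
        return " ".join(words)
    return text

def delete_chars(text: str, p: float, rng: random.Random) -> str:
    """Delete each character independently with probability p."""
    return "".join(c for c in text if rng.random() >= p)

def strip_accents(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def run_chain(
    text: str,
    hops: list[str],
    translate: Callable[[str, str, str], str],
    rng: random.Random,
    drop: float = 0.0,
    shuffle: float = 0.0,
    char_del: float = 0.0,
    accents: bool = False,
) -> str:
    """Translate along consecutive language pairs, adding noise after each hop."""
    current = text
    for src, tgt in zip(hops, hops[1:]):
        current = translate(current, src, tgt)  # placeholder for the real model call
        current = drop_vowels(current, drop, rng)
        current = shuffle_words(current, shuffle, rng)
        current = delete_chars(current, char_del, rng)
        if accents:
            current = strip_accents(current)
    return current

if __name__ == "__main__":
    # Identity "translator" so the sketch runs without downloading a model.
    identity = lambda t, src, tgt: t
    original = "The quick brown fox jumps over the lazy dog."
    mangled = run_chain(original, ["en", "es", "fr", "zh", "en"], identity,
                        random.Random(7), drop=0.4, shuffle=0.25, char_del=0.02)
    print(mangled)
    print("edit distance:", edit_distance(original, mangled))
```

Passing one `random.Random` instance through every pass keeps a run reproducible for a given seed, which is the behaviour a `--seed` option would be expected to provide.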

Language codes

Two‑letter codes like en, es, fr, de, it, pt, nl, sv, no, da, fi, pl, cs, ro, tr, ru, uk, ar, he, fa, hi, bn, ur, zh, ja, ko, sw are supported.
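
NLLB‑200 checkpoints identify languages with FLORES‑200 style codes such as `eng_Latn`, so the two‑letter codes have to be mapped somewhere. The sketch below shows one way that mapping and the `--langs` chain parsing could look. The `NLLB_CODES` table, the `parse_chain` name, and the choice of `zho_Hans` (Simplified) for `zh` are assumptions, not the project's actual implementation, and only a handful of the supported codes are listed.

```python
# Hypothetical mapping from two-letter codes to FLORES-200 codes used by NLLB-200.
NLLB_CODES = {
    "en": "eng_Latn",
    "es": "spa_Latn",
    "fr": "fra_Latn",
    "de": "deu_Latn",
    "ru": "rus_Cyrl",
    "zh": "zho_Hans",  # assumption: Simplified Chinese
    "ja": "jpn_Jpan",
    "ko": "kor_Hang",
}

def parse_chain(langs: str) -> list[tuple[str, str]]:
    """Turn 'en,es,fr' into consecutive (source, target) NLLB code pairs."""
    codes = [c.strip().lower() for c in langs.split(",") if c.strip()]
    if len(codes) < 2:
        raise ValueError("need at least two languages to form a hop")
    unknown = [c for c in codes if c not in NLLB_CODES]
    if unknown:
        raise ValueError(f"unsupported language code(s): {', '.join(unknown)}")
    full = [NLLB_CODES[c] for c in codes]
    return list(zip(full, full[1:]))

if __name__ == "__main__":
    for src, tgt in parse_chain("en,es,fr,zh,en"):
        print(f"{src} -> {tgt}")
```

Validating the whole chain before the first hop avoids loading a 1.3 GB model only to fail on a typo in the last language code.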

Tips

  • Want a different model? --model facebook/nllb-200-3.3B (bring a GPU and snacks).
  • Results are stochastic; set --seed for repeatably bad outcomes.
  • If model downloads or loads are slow, point HF_HOME (or TRANSFORMERS_CACHE) at a directory on a fast disk.

Disclaimer

This project is intentionally mischievous. Do not use it for real translation unless your goal is chaos.
