You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository was archived by the owner on May 16, 2025. It is now read-only.
Hello Rebuff team,
I have developed a small open-source tool called Puppetry Detector. It detects policy puppetry and prompt injection attempts in LLM prompts using regular expressions. The tool is modular and already includes integration with Rebuff, so it can be adapted as an optional heuristic module.
If somebody gives it a look an confirms possibility of integration, I'd be happy.
I would be happy to prepare a pull request to add this as an optional feature, if you think it could be useful. Please let me know if you are open to this idea or if you have any suggestions before I make a PR.
Thank you for your time and for your great work on Rebuff!
Hello Rebuff team,
I have developed a small open-source tool called Puppetry Detector. It detects policy puppetry and prompt injection attempts in LLM prompts using regular expressions. The tool is modular and already includes integration with Rebuff, so it can be adapted as an optional heuristic module.
If somebody gives it a look an confirms possibility of integration, I'd be happy.
I would be happy to prepare a pull request to add this as an optional feature, if you think it could be useful. Please let me know if you are open to this idea or if you have any suggestions before I make a PR.
Thank you for your time and for your great work on Rebuff!
The tool itself is here: https://github.com/metawake/puppetry-detector