Add first draft of base_jailbreak #177

Open
rmura498 wants to merge 2 commits into main from 169-basejailbreak-skeleton-for-text-attack-interface
@rmura498 (Collaborator) commented Jan 28, 2026

This PR introduces a first draft of a BaseJailbreakAttack abstraction.
The goal is to define a minimal, prompt-wise interface for jailbreak attacks, aligned with the existing design used for evasion attacks.

The base class orchestrates the attack over a list of harmful behaviors and leaves the attack-specific logic, success criteria, and objectives to concrete attack implementations.
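
The split described above could look roughly like the sketch below. This is not the code from the PR: the method names `_run` and `_is_successful`, the `JailbreakResult` container, the `PrefixAttack` subclass, and the callable `model` are all hypothetical, chosen only to illustrate a base class that loops over behaviors while subclasses supply the attack logic and the success criterion.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class JailbreakResult:
    """Outcome of attacking a single behavior (hypothetical container)."""
    behavior: str
    adversarial_prompt: str
    response: str
    success: bool


class BaseJailbreakAttack(ABC):
    """Runs an attack prompt by prompt over a list of behaviors."""

    def __init__(self, model):
        # Assumption: `model` is any callable mapping a prompt string
        # to a response string.
        self.model = model

    @abstractmethod
    def _run(self, behavior: str) -> str:
        """Attack-specific logic: return the adversarial prompt."""

    @abstractmethod
    def _is_successful(self, behavior: str, response: str) -> bool:
        """Attack-specific success criterion."""

    def __call__(self, behaviors):
        # Orchestration only: no attack logic lives in the base class.
        results = []
        for behavior in behaviors:
            prompt = self._run(behavior)
            response = self.model(prompt)
            success = self._is_successful(behavior, response)
            results.append(JailbreakResult(behavior, prompt, response, success))
        return results


class PrefixAttack(BaseJailbreakAttack):
    """Toy subclass: prepends a fixed prefix to each behavior."""

    def __init__(self, model, prefix: str):
        super().__init__(model)
        self.prefix = prefix

    def _run(self, behavior: str) -> str:
        return f"{self.prefix} {behavior}"

    def _is_successful(self, behavior: str, response: str) -> bool:
        # Toy criterion: anything that is not a refusal counts as success.
        return not response.startswith("I cannot")


def toy_model(prompt: str) -> str:
    """Stand-in for a language model, used only to exercise the loop."""
    if prompt.startswith("[override]"):
        return "Sure: ..."
    return "I cannot help with that."


results = PrefixAttack(toy_model, prefix="[override]")(["behavior A", "behavior B"])
```

Keeping the loop in `__call__` and the two hooks abstract means a new attack only has to define how a prompt is built and how success is judged.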

@rmura498 rmura498 linked an issue Jan 28, 2026 that may be closed by this pull request

codecov bot commented Jan 28, 2026

Codecov Report

❌ Patch coverage is 95.34884% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.96%. Comparing base (9a798ff) to head (b3c57a9).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/secmlt/adv/jailbreak/base_jailbreak_attack.py | 94.44% | 1 Missing ⚠️ |
| src/secmlt/tests/test_jailbreaks.py | 96.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #177      +/-   ##
==========================================
+ Coverage   91.90%   91.96%   +0.06%     
==========================================
  Files          67       69       +2     
  Lines        2173     2216      +43     
==========================================
+ Hits         1997     2038      +41     
- Misses        176      178       +2     
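
The percentages in the report can be recomputed from the hit and line counts in the diff above. The sketch below assumes that project coverage is truncated, not rounded, to two decimals; that assumption is consistent with the figures shown here but is not stated in the report. The helper name `pct_truncated` is made up for this example.

```python
import math


def pct_truncated(hits: int, lines: int) -> float:
    """Coverage percentage truncated (not rounded) to two decimals."""
    return math.floor(hits / lines * 10_000) / 100


base = pct_truncated(1997, 2173)       # main: 1997 hits out of 2173 lines
head = pct_truncated(2038, 2216)       # this PR: 2038 hits out of 2216 lines
delta = round(head - base, 2)          # change in project coverage
patch = round((43 - 2) / 43 * 100, 5)  # 43 new lines, 2 of them uncovered

print(base, head, delta, patch)
```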

@rmura498 rmura498 marked this pull request as ready for review January 28, 2026 16:20
@rmura498 rmura498 requested a review from maurapintor January 28, 2026 16:20
Development

Successfully merging this pull request may close these issues.

BaseJailbreak — skeleton for text-attack interface