FEAT: Add Image functionality to TAP #1036
Open
awksrj wants to merge 14 commits into microsoft:main
nina-msft reviewed Jul 30, 2025
awksrj (Contributor, Author) commented:
Thanks for all the comments. I'll go through them and push changes soon!
…PyRIT into feature/tap-image-target
awksrj (Contributor, Author) commented:
I added two unit tests to cover the pruning logic, ensuring blocked responses are scored as 0.0 and pruning only occurs when we exceed …
romanlutz (Contributor) reviewed Aug 31, 2025 and left a comment:
One of the maintainers should run the notebook as well once it exists, just to make sure we aren't missing anything.
…p-image-target
…attack notebooks, run notebooks
Resolve conflicts: keep both error_score_map (PR) and initial_prompt/prepended_conversation_config (main). Take main version for doc files (need separate follow-up).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add error_score_map parameter to TreeOfAttacksWithPruningAttack and
_TreeOfAttacksNode that maps response error types to fixed scores.
This prevents premature branch pruning when targets return blocked
or content-filtered responses (e.g., image generation targets).
Key changes:
- Default error_score_map maps 'blocked' -> 0.0 (pass {} to disable)
- Intercepts mapped errors in _score_response_async before calling scorer
- Creates synthetic float_scale Score for mapped errors
- Propagates map through duplicate() and _create_attack_node()
- Copies dict to avoid shared mutable state
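The interception described in the key changes above can be sketched roughly as follows. This is a minimal, self-contained illustration rather than the PR's code: `Response` and `Node` are hypothetical stand-ins for PyRIT's message and node types, and the scorer is a plain callable. Only the `error_score_map` name, its `{'blocked': 0.0}` default, and the "pass `{}` to disable" behavior come from the commit message.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Response:
    """Hypothetical stand-in for a target response (not PyRIT's Message type)."""

    text: str
    error: str = "none"


class Node:
    """Hypothetical stand-in for a tree node (not PyRIT's _TreeOfAttacksNode)."""

    def __init__(
        self,
        scorer: Callable[[Response], float],
        error_score_map: Optional[Dict[str, float]] = None,
    ) -> None:
        self._scorer = scorer
        # None selects the default mapping; an explicit {} disables it.
        if error_score_map is None:
            error_score_map = {"blocked": 0.0}
        # Copy the dict so nodes never share mutable state with the caller.
        self._error_score_map = dict(error_score_map)

    def score(self, response: Response) -> float:
        # Mapped errors get their fixed score without calling the scorer.
        if response.error in self._error_score_map:
            return self._error_score_map[response.error]
        return self._scorer(response)

    def duplicate(self) -> "Node":
        # The map is propagated to the duplicate (and copied again in __init__).
        return Node(self._scorer, self._error_score_map)
```

With the default map, a blocked response scores 0.0 and the node stays eligible for ranking instead of being dropped for lacking a score; passing `{}` sends every response, blocked or not, to the scorer.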
Updates from original PR microsoft#1036:
- Adapted to current Message/MessagePiece API (was PromptRequestResponse)
- Fixed Score constructor args (message_piece_id, score_category as list)
- Made default None -> {'blocked': 0.0} per reviewer feedback
- Added comprehensive unit tests for error interception, scoring, and
map propagation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add input validation: keys must be valid PromptResponseError values, scores must be in [0, 1] range. Errors caught at construction time.
- Persist synthetic scores to memory via add_scores_to_memory()
- Fix multi-piece handling: iterate all message_pieces to find the error piece, not just the first piece
- Add validation unit tests for invalid key and out-of-range value
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
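A construction-time check like the one this commit describes might look like the sketch below. The function name `validate_error_score_map` and the `VALID_ERRORS` set are assumptions made for illustration; of the error values listed, only `'blocked'` is confirmed by this PR, and the real set of PromptResponseError values should be taken from PyRIT itself.

```python
from typing import Dict, Mapping

# Assumed set of valid error values; only "blocked" is confirmed by this PR.
VALID_ERRORS = frozenset({"none", "blocked", "processing", "empty", "unknown"})


def validate_error_score_map(error_score_map: Mapping[str, float]) -> Dict[str, float]:
    """Return a validated copy of the map, raising ValueError on bad input."""
    for error_type, score in error_score_map.items():
        if error_type not in VALID_ERRORS:
            raise ValueError(f"Unknown response error type: {error_type!r}")
        if not 0.0 <= score <= 1.0:
            raise ValueError(
                f"Score for {error_type!r} must be in [0, 1], got {score}"
            )
    # Copy so later changes to the caller's dict do not leak in.
    return dict(error_score_map)
```

Failing at construction time means a typo such as `'bloked'` surfaces immediately instead of silently never matching during an attack run.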
- Add TAPSystemPromptPaths enum with TEXT_GENERATION and IMAGE_GENERATION variants, matching the RTASystemPromptPaths pattern
- Export TAPSystemPromptPaths from pyrit.executor.attack
- Add image generation target example to TAP doc (tap_attack.py/.ipynb) demonstrating use of IMAGE_GENERATION system prompt
- Add TAP integration tests for both text and image targets
- Regenerate tap_attack.ipynb from updated .py
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
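The enum named in this commit could take a shape like the sketch below. The enum name and its two variants come from the commit message; the directory, the YAML file names, and the `system_prompt_path_for` helper are hypothetical placeholders, since the PR text does not state the real paths.

```python
import enum
from pathlib import Path

# Hypothetical directory; the real location is not stated in this PR.
_PROMPT_DIR = Path("tree_of_attacks")


class TAPSystemPromptPaths(enum.Enum):
    # File names below are placeholders chosen for illustration only.
    TEXT_GENERATION = _PROMPT_DIR / "text_generation.yaml"
    IMAGE_GENERATION = _PROMPT_DIR / "image_generation.yaml"


def system_prompt_path_for(is_image_target: bool) -> Path:
    """Hypothetical helper: pick the system prompt matching the target type."""
    if is_image_target:
        return TAPSystemPromptPaths.IMAGE_GENERATION.value
    return TAPSystemPromptPaths.TEXT_GENERATION.value
```

Keeping the paths in an enum lets callers select a system prompt by name rather than by hard-coded file path.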
Description
This PR adds a code cell to `tree_of_attacks_with_pruning.ipynb` to demonstrate an image target example and modifies `tree_of_attacks.py` to adapt the Tree of Attacks orchestrator for image targets. In particular, it adds a dictionary that maps `content_policy_violation` errors to an `objective_score` of 0.0, which ensures nodes are kept in the completed nodes list until the branch width limit is exceeded, preventing premature pruning.
Related Issue
Closes: #585
Tests and Documentation
No tests included in this commit.