test: adversarial payload library for pipeline validation#5
test: adversarial payload library for pipeline validation#5VibhorGautam wants to merge 1 commit into
Conversation
Initial set of 14 attack payloads organized by pipeline layer (prompt, context, normalization, memory/provenance). Each payload specifies which detection layer should catch it and the expected enforcement action, so we can track coverage gaps as modules get built. Ref: c2siorg#2
|
Great work on this taxonomy, @VibhorGautam . This is exactly the 'Ground Truth' we need for the PDP Evaluation Pipeline. I’ve particularly noted the Memory Layer payloads; these will be the primary test cases for the Stateful Aggregator logic we’ve defined in the Phase-1 Architecture Contract. I’ll be referencing this library in my GSoC proposal as the baseline for our Phase 2 validation. This ensures that every scanner we build has a clear target to hit. |
|
Excellent engineering, @VibhorGautam . This taxonomy is exactly what we need to ensure our pipeline is mathematically sound. While this will definitely serve as the ground truth for the Phase 2 scanners, we actually need to use this immediately in Phase 1. Before these payloads even reach the PDP Evaluation Pipeline, they have to survive the UDS IPC handshake. Your inclusion of encoding evasion payloads (Unicode homoglyphs, zero-width chars, Base64) is critical here. We need to guarantee that the binary packing in the PEP SDK interceptor (PR #4) does not corrupt or strip these obfuscations during the socket transfer before the scanners can evaluate them. I am pulling this branch down locally today. I will pipe these exact 14 payloads through the /tmp/acf.sock UDS transport we are building to stress-test the byte-offsets and ensure lossless transmission to the sidecar. Outstanding work providing the exact telemetry exhaust we need to validate the OTel audit plane. |
|
This is a really strong baseline, especially the coverage of RAG poisoning, tool reinjection, and multi-turn drift. While going through the payloads, I noticed a few potential extensions that could improve real-world robustness:
Happy to contribute a small set of payloads covering these if that aligns with the direction. |
|
Good suggestions @Ananya44444 - cross-layer attacks are definitely a gap right now. Payloads that look benign pre-normalization but become malicious after decoding would catch a whole class of bugs where stage ordering matters False positive cases are a good call too, if we only test with malicious inputs we have no idea what the precision looks like Happy to have you contribute those, open a PR against the same payloads directory and i can review, or push them into this branch directly - either way works Also going to restructure the existing payloads to align with Tharindu's v0.2 pipeline stages (Validate, Normalise, Scan, Aggregate) so everything maps to the canonical architecture |
|
@VibhorGautam Thanks! I’ve added cross-layer payloads and a false positive case in a follow-up PR #11 , built on top of this branch. |
|
superseded by #26 |
Adds the initial adversarial payload library from issue #2
This is test data and coverage mapping, not implementation - wanted to get the attack taxonomy locked down so it can be used to validate whichever architecture and policy language the team converges on
What's included
Payload schema
Every payload has:
id,name,description,payload,expected_detection_layer,expected_action,severity,tagsThe schema is framework-agnostic so payloads work regardless of whether the pipeline uses LangGraph, LangChain, or a custom loop
Closes #2