Is your feature request related to a problem? Please describe.
`ConfigurationLoader` (`pyrit/setup/configuration_loader.py`) is a
strict `@dataclass`, and its `from_dict` constructor calls
`cls(**filtered_data)`. Any top-level key not in the dataclass's field
list raises `TypeError`.
Minimal repro:
```python
from pyrit.setup.configuration_loader import ConfigurationLoader

ConfigurationLoader.from_dict({
    "memory_db_type": "in_memory",
    "targets": [{"name": "x"}],
})
# TypeError: ConfigurationLoader.__init__() got an unexpected keyword
# argument 'targets'
```
This makes it awkward for teams building red-teaming frameworks on
top of PyRIT to colocate their own config alongside PyRIT's. Common
downstream concepts — target definitions with custom auth, scan modes
with threshold rules, scenario-to-dataset maps — naturally live next
to PyRIT's `memory_db_type` / `initializers` / `env_files`. Today the
options are:
- Keep a separate config file with a custom parser. Works, but
fragments the "here is the config entrypoint" story for users who
touch both.
- Subclass ConfigurationLoader and add fields. Works, but every
downstream framework ends up with its own loader class that other
tooling doesn't recognize.
- Fork PyRIT. Not sustainable.
If `ConfigurationLoader` tolerated unknown top-level keys, downstream
frameworks could put their config in the same YAML file under their
own namespace, and users would have a single entrypoint.
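Concretely, a single entrypoint file could look like this (the `my_framework` block and everything under it are made up for illustration; the top-level keys mirror PyRIT's existing fields):

```yaml
# PyRIT's own fields, still strictly validated
memory_db_type: in_memory
env_files:
  - .env

# Hypothetical downstream namespace: opaque to PyRIT,
# validated by the downstream framework itself
my_framework:
  targets:
    - name: x
      auth: custom_token
  scan_mode: aggressive
```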
Describe the solution you'd like
Two lightweight shapes I'd be happy to implement; I don't have a
preference — both solve the problem:
Option A — passthrough field for unknown keys. Add an
`extensions: dict[str, Any] = field(default_factory=dict)` field.
`from_dict` routes known keys to their existing fields and any
remaining keys into `extensions`. The downstream framework reads
`loader.extensions["targets"]` and validates its own sub-schema.
```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class ConfigurationLoader(YamlLoadable):
    # existing fields...
    extensions: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, data):
        known = {k: v for k, v in data.items() if k in cls.__dataclass_fields__}
        extras = {k: v for k, v in data.items() if k not in cls.__dataclass_fields__}
        return cls(**known, extensions=extras)
```
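To show the intended behavior end to end, here is a self-contained sketch of Option A using a minimal stand-in dataclass (`YamlLoadable` and the real loader's other fields are omitted; `memory_db_type` is borrowed from the real loader):

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class ConfigurationLoader:
    # stand-in for the real loader's fields
    memory_db_type: str = "in_memory"
    extensions: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ConfigurationLoader":
        # Route known keys to their dataclass fields, everything else
        # into the extensions passthrough.
        known = {k: v for k, v in data.items() if k in cls.__dataclass_fields__}
        extras = {k: v for k, v in data.items() if k not in cls.__dataclass_fields__}
        return cls(**known, extensions=extras)

loader = ConfigurationLoader.from_dict({
    "memory_db_type": "in_memory",
    "targets": [{"name": "x"}],  # unknown key: no longer a TypeError
})
assert loader.extensions["targets"] == [{"name": "x"}]
```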
Option B — opt-in permissive mode. Add a classmethod like
`from_dict(data, strict: bool = True)`. When `strict=True` (the default),
behavior is unchanged. When `strict=False`, unknown keys are attached
to a generic attribute (e.g. `_raw_extras`) instead of raising. This
keeps the default strict and confines the change to opt-in callers.
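Option B could be sketched roughly like this, again with a minimal stand-in dataclass (the `_raw_extras` name and exact signature are up for discussion):

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class ConfigurationLoader:
    memory_db_type: str = "in_memory"  # stand-in field

    @classmethod
    def from_dict(cls, data: dict[str, Any], *, strict: bool = True):
        known = {k: v for k, v in data.items() if k in cls.__dataclass_fields__}
        extras = {k: v for k, v in data.items() if k not in cls.__dataclass_fields__}
        if strict and extras:
            # Default path: identical to today's behavior, unknown keys fail loudly.
            raise TypeError(f"unexpected top-level keys: {sorted(extras)}")
        loader = cls(**known)
        loader._raw_extras = extras  # only populated for opt-in callers
        return loader
```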
Whatever shape, PyRIT's own fields should continue to be strictly
validated — the concern is only about additional top-level keys, not
about deep merging or relaxing validation of known fields.
Describe alternatives you've considered, if relevant
- Parallel config file with its own parser. Functional but costs
users a second config surface.
- Subclassing ConfigurationLoader. Adds fields but means each
downstream framework ships an incompatible loader class; tooling
that type-checks against ConfigurationLoader doesn't see the new
fields.
- Plugin protocol (ConfigurationLoader.register_extension(...)
with a namespace + typed sub-schema). More structured than A/B but
a heavier change; probably only worth it if several downstream
frameworks ask for this.
- Environment variables / CLI flags. Works for flat scalars but
most framework config is structured (lists of targets, nested
thresholds).
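For completeness, the plugin-protocol alternative above could look something like this (the `register_extension` / `parse_extras` names and signatures are entirely hypothetical, not existing PyRIT API):

```python
from typing import Any, Callable

# Registry mapping a claimed namespace to a validator for its sub-schema.
_extension_schemas: dict[str, Callable[[Any], Any]] = {}

def register_extension(namespace: str, validator: Callable[[Any], Any]) -> None:
    """Let a downstream framework claim a top-level namespace."""
    _extension_schemas[namespace] = validator

def parse_extras(extras: dict[str, Any]) -> dict[str, Any]:
    """Validate unknown top-level keys against registered extensions."""
    parsed = {}
    for key, value in extras.items():
        if key not in _extension_schemas:
            raise TypeError(f"unknown top-level key {key!r} and no extension claims it")
        parsed[key] = _extension_schemas[key](value)
    return parsed
```

This keeps the core schema strict (unclaimed keys still raise) while giving each framework a typed hook, at the cost of a registration API that PyRIT would have to maintain.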
Additional context
Prior art for this pattern:
- `pyproject.toml` `[tool.*]` tables — every tool claims a namespace.
- Kubernetes CRDs and annotations.
- OpenAPI `x-*` extension fields.
All solve the same shape via namespaced passthrough while keeping
the core schema strict.
Happy to draft the PR once there's agreement on Option A vs. B (or a
different shape the maintainers prefer).