265 changes: 265 additions & 0 deletions doc/code/scenarios/3_adaptive_scenarios.ipynb
@@ -0,0 +1,265 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "94e7f44a",
"metadata": {},
"source": [
"# Adaptive Scenarios\n",
"\n",
"An **adaptive scenario** doesn't run every attack technique against every objective.\n",
"Instead, it picks which technique to try next per-objective, learns from what worked,\n",
"and stops as soon as one technique succeeds. This concentrates spend on techniques\n",
"that actually work on your target.\n",
"\n",
"## How it works (high level)\n",
"\n",
"For each objective, the scenario tries up to `max_attempts_per_objective` techniques:\n",
"\n",
"- With probability `epsilon`, it **explores** — picks a random technique.\n",
"- Otherwise it **exploits** — picks the technique with the highest observed success\n",
" rate so far.\n",
"- It records the outcome and stops early on success.\n",
"\n",
"Unseen techniques are tried first, so the first few objectives effectively round-robin\n",
"through every technique before the scenario settles on the best performers.\n",
"\n",
"## Adaptive vs. static scenarios\n",
"\n",
"| Feature | Static scenarios | Adaptive scenarios |\n",
"|---------------------|-----------------------------------|------------------------------------|\n",
"| Technique selection | Run every selected technique | Pick per-objective from outcomes |\n",
"| Early stopping | No | Yes — stops on first success |\n",
"| Cost | O(techniques × objectives) | O(max_attempts × objectives) |\n",
"\n",
"`AdaptiveScenario` is the modality-agnostic base class.\n",
"[`TextAdaptive`](../../../pyrit/scenario/scenarios/adaptive/text_adaptive.py) is the\n",
"text subclass used in the examples below."
]
},
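{
"cell_type": "markdown",
"id": "7c1e0a52",
"metadata": {},
"source": [
"The selection rule described under *How it works* can be sketched in plain Python.\n",
"This is a minimal illustration under stated assumptions, not PyRIT's implementation:\n",
"`pick_technique` and the `stats` mapping are hypothetical names, and the real selector\n",
"may order the unseen-first check or break ties differently.\n",
"\n",
"```python\n",
"import random\n",
"\n",
"\n",
"def pick_technique(stats, epsilon, rng):\n",
"    # stats maps technique name -> (successes, attempts); hypothetical layout.\n",
"    unseen = [name for name, (_, tries) in stats.items() if tries == 0]\n",
"    if unseen:\n",
"        # Assumption: unseen techniques are tried before any epsilon draw.\n",
"        return unseen[0]\n",
"    if rng.random() < epsilon:\n",
"        # Explore: uniform random choice over all techniques.\n",
"        return rng.choice(list(stats))\n",
"    # Exploit: highest observed success rate so far.\n",
"    return max(stats, key=lambda name: stats[name][0] / stats[name][1])\n",
"\n",
"\n",
"stats = {'technique_a': (3, 4), 'technique_b': (1, 4), 'technique_c': (0, 0)}\n",
"print(pick_technique(stats, epsilon=0.2, rng=random.Random(42)))  # technique_c (never tried)\n",
"```\n",
"\n",
"Setting `epsilon=0` makes the sketch always exploit; setting `epsilon=1` makes it always\n",
"explore once every technique has been tried."
]
},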
{
"cell_type": "markdown",
"id": "cb716650",
"metadata": {},
"source": [
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4b536900",
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"\n",
"from pyrit.registry import TargetRegistry\n",
"from pyrit.scenario import DatasetConfiguration\n",
"from pyrit.scenario.printer.console_printer import ConsoleScenarioResultPrinter\n",
"from pyrit.scenario.scenarios.adaptive import TextAdaptive\n",
"from pyrit.setup import initialize_from_config_async\n",
"\n",
"await initialize_from_config_async(config_path=Path(\"../../scanner/pyrit_conf.yaml\")) # type: ignore\n",
"\n",
"objective_target = TargetRegistry.get_registry_singleton().get_instance_by_name(\"openai_chat\")\n",
"printer = ConsoleScenarioResultPrinter()"
]
},
{
"cell_type": "markdown",
"id": "9f9ff786",
"metadata": {},
"source": [
"## Basic usage\n",
"\n",
"With no arguments, `TextAdaptive` uses `max_attempts_per_objective=3`, an epsilon-greedy\n",
"selector with `epsilon=0.2`, and the subclass's default datasets."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "33aa89d3",
"metadata": {},
"outputs": [],
"source": [
"scenario = TextAdaptive()\n",
"\n",
"await scenario.initialize_async( # type: ignore\n",
" objective_target=objective_target,\n",
")\n",
"result = await scenario.run_async() # type: ignore\n",
"await printer.write_async(result) # type: ignore"
]
},
{
"cell_type": "markdown",
"id": "5083bbed",
"metadata": {},
"source": [
"## Configuring a run\n",
"\n",
"- **`max_attempts_per_objective`** — caps techniques tried per objective. Higher means\n",
" more chances to succeed and more API calls. Set via `set_params_from_args`.\n",
"- **`selector`** — a pre-built `TechniqueSelector` instance. Pass an\n",
" `EpsilonGreedyTechniqueSelector(epsilon=..., random_seed=...)`\n",
" to tune the selection algorithm. Defaults to an epsilon-greedy selector with\n",
" `epsilon=0.2`.\n",
"- **`scenario_strategies`** (on `initialize_async`) — restricts which techniques the\n",
" selector can pick from. Use `TextAdaptive.get_strategy_class()` to access the enum.\n",
"\n",
"The cell below exercises all of them at once."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "db966395",
"metadata": {},
"outputs": [],
"source": [
"from pyrit.scenario.scenarios.adaptive import EpsilonGreedyTechniqueSelector\n",
"\n",
"strategy_class = TextAdaptive.get_strategy_class()\n",
"\n",
"configured_scenario = TextAdaptive(\n",
" selector=EpsilonGreedyTechniqueSelector(\n",
" epsilon=0.3,\n",
" random_seed=42,\n",
" ),\n",
")\n",
"configured_scenario.set_params_from_args(args={\"max_attempts_per_objective\": 5})\n",
"\n",
"await configured_scenario.initialize_async( # type: ignore\n",
" objective_target=objective_target,\n",
" scenario_strategies=[strategy_class(\"single_turn\")],\n",
" dataset_config=DatasetConfiguration(\n",
" dataset_names=[\"airt_hate\", \"airt_violence\"],\n",
" max_dataset_size=4,\n",
" ),\n",
")\n",
"configured_result = await configured_scenario.run_async() # type: ignore\n",
"await printer.write_async(configured_result) # type: ignore"
]
},
{
"cell_type": "markdown",
"id": "ba7e7126",
"metadata": {},
"source": [
"## Resuming a run\n",
"\n",
"Adaptive scenarios are resumable: pass `scenario_result_id=...` to the `TextAdaptive`\n",
"constructor and the run picks up where it left off. The resumed run must use the same\n",
"configuration as the original run, which is why the cell below repeats the selector,\n",
"parameters, strategies, and datasets from the configured run."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4857bace",
"metadata": {},
"outputs": [],
"source": [
"resumed_scenario = TextAdaptive(\n",
" selector=EpsilonGreedyTechniqueSelector(\n",
" epsilon=0.3,\n",
" random_seed=42,\n",
" ),\n",
" scenario_result_id=str(configured_result.id),\n",
")\n",
"resumed_scenario.set_params_from_args(args={\"max_attempts_per_objective\": 5})\n",
"\n",
"await resumed_scenario.initialize_async( # type: ignore\n",
" objective_target=objective_target,\n",
" scenario_strategies=[strategy_class(\"single_turn\")],\n",
" dataset_config=DatasetConfiguration(\n",
" dataset_names=[\"airt_hate\", \"airt_violence\"],\n",
" max_dataset_size=4,\n",
" ),\n",
")\n",
"resumed_result = await resumed_scenario.run_async() # type: ignore\n",
"await printer.write_async(resumed_result) # type: ignore"
]
},
{
"cell_type": "markdown",
"id": "e267467c",
"metadata": {},
"source": [
"## Inspecting which techniques were tried\n",
"\n",
"The dispatcher stamps every objective's `AttackResult.metadata` with `adaptive_attempts`:\n",
"the ordered list of `{\"technique\", \"outcome\"}` dicts recording exactly which techniques\n",
"the selector picked and what happened.\n",
"\n",
"Walk that metadata to see the per-objective trail and aggregate counts."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3a95436b",
"metadata": {},
"outputs": [],
"source": [
"from collections import Counter\n",
"\n",
"# Per-objective trail\n",
"for results in resumed_result.attack_results.values():\n",
" for r in results:\n",
" attempts = r.metadata.get(\"adaptive_attempts\", [])\n",
" trail = \" → \".join(f\"{a['technique']}({a['outcome']})\" for a in attempts)\n",
" print(f\"[{r.outcome.value:7s}] {r.objective!r}: {trail}\")\n",
"\n",
"# Aggregate per-technique pick counts and success rate across the run\n",
"picks: Counter[str] = Counter()\n",
"wins: Counter[str] = Counter()\n",
"for results in resumed_result.attack_results.values():\n",
" for r in results:\n",
" for step in r.metadata.get(\"adaptive_attempts\", []):\n",
" picks[step[\"technique\"]] += 1\n",
" if step[\"outcome\"] == \"success\":\n",
" wins[step[\"technique\"]] += 1\n",
"\n",
"print(\"\\nTechnique wins / picks rate\")\n",
"for technique, n in picks.most_common():\n",
" print(f\"{technique:20s} {wins[technique]:>4} / {n:<4} {wins[technique] / n:.0%}\")"
]
},
{
"cell_type": "markdown",
"id": "37cd0756",
"metadata": {},
"source": [
"## Running from the scanner CLI\n",
"\n",
"You can run `TextAdaptive` directly from the `pyrit_scan` CLI without writing Python:\n",
"\n",
"```bash\n",
"# Basic run with defaults\n",
"pyrit_scan --scenario TextAdaptive --target openai_chat\n",
"\n",
"# Tune max attempts and restrict strategies\n",
"pyrit_scan --scenario TextAdaptive --target openai_chat \\\n",
" --params max_attempts_per_objective=5 \\\n",
" --strategies single_turn\n",
"\n",
"# Use specific datasets and limit size\n",
"pyrit_scan --scenario TextAdaptive --target openai_chat \\\n",
" --datasets airt_hate airt_violence \\\n",
" --max-dataset-size 10\n",
"```"
]
}
],
"metadata": {
"jupytext": {
"main_language": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}