FEAT: Add blur_images flag to pyrit.output for safer image rendering #1768
Merged
Adds a blur_images (and configurable blur_radius) flag to the output module that applies a Gaussian blur to images before they are rendered to a reviewer:

- Pretty conversation printer blurs image bytes in-memory before IPython display.
- Markdown conversation printer writes a sibling <stem>_blurred.png next to the original (idempotent) and links to the blurred copy instead.
- Flag is plumbed through PrettyAttackResultPrinter, MarkdownAttackResultPrinter, their *MemoryPrinter subclasses, and the top-level output_attack_async and output_conversation_async helpers.
- Shared helper pyrit/output/_image_utils.py centralizes the PIL transformation.
- Defaults to False, so existing behavior is unchanged.

Inspired by microsoft#1659 (safe_outputs in the legacy display_response path) but adapted to the refactored output module and using a simpler Gaussian blur transformation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…re tests

- Add `blurred_dir` param to MarkdownConversationPrinter (and downstream printers/helpers) so callers can redirect blurred copies out of the source tree.
- Switch _maybe_blur_image_on_disk to a write-temp + os.replace pattern with temp cleanup, so concurrent or failing renders cannot leave partial blurred files behind.
- Document the stale-blurred-file caching behavior in docstrings.
- Add tests for blurred_dir redirection, atomic-write failure cleanup, original-file immutability, and end-to-end blur kwargs through output_attack_async.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
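The write-temp + os.replace pattern and the `blurred_dir` redirection described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function names `blurred_path_for` and `write_atomically` are hypothetical, and the real `_maybe_blur_image_on_disk` may differ in signature and error handling.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def blurred_path_for(original: Path, blurred_dir: Optional[Path] = None) -> Path:
    """Return <stem>_blurred.png next to the original, or inside blurred_dir if given.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    target_dir = blurred_dir if blurred_dir is not None else original.parent
    return target_dir / f"{original.stem}_blurred.png"


def write_atomically(dest: Path, data: bytes) -> None:
    """Write data to a temp file in dest's directory, then move it into place.

    Hypothetical helper. Readers of dest see either the old file or the
    complete new one, never a partially written file.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_name, dest)
    except BaseException:
        # Remove the temp file so a failed render leaves nothing behind.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
```

Creating the temporary file in the destination directory, rather than the system temp directory, keeps the final `os.replace` a same-filesystem rename.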
rlundeen2
commented
May 20, 2026
On Windows, `os.path.relpath` raises ValueError when the target path is on a different mount than the cwd (e.g. pytest tmp_path on D: vs cwd on C:). The markdown image formatter now catches that and falls back to the absolute path. Added a regression test that forces the ValueError via patch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
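A minimal sketch of the fallback described in this commit, assuming a standalone helper (the name `link_path` is hypothetical; the PR applies the same idea inside the markdown image formatter):

```python
import os


def link_path(target: str, start: str) -> str:
    """Prefer a path relative to start; fall back to an absolute path.

    Hypothetical helper for illustration only.
    """
    try:
        return os.path.relpath(target, start=start)
    except ValueError:
        # On Windows, relpath raises ValueError when target and start
        # are on different drives, so no relative path exists.
        return os.path.abspath(target)
```

A regression test can force the failure on any platform by patching `os.path.relpath` with `unittest.mock.patch(..., side_effect=ValueError)`, which matches the approach the commit message describes.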
jsong468
reviewed
May 21, 2026
jsong468
approved these changes
May 21, 2026
…ngs, doc example

- On blur failure, emit a plain-text link to the original instead of an inline image, so reviewers are never silently exposed to unblurred content (jsong468 feedback).
- Log a debug message on blurred-file cache hit (jsong468 NIT).
- Make output_conversation_async.blur_images docstring consistent with output_attack_async, including the fallback note.
- Add a 'Blurring Images' section to doc/code/output/0_output (.py and .ipynb).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
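The failure behavior in the first bullet can be illustrated with a small sketch. The function name, parameters, and link text below are all assumptions made for illustration; the PR's actual formatter and wording are not shown on this page.

```python
from typing import Optional


def format_image_markdown(
    original_path: str,
    blurred_path: Optional[str],
    blur_images: bool,
) -> str:
    """Return the markdown for one image reference.

    Hypothetical helper: blurred_path is None when blurring was requested
    but failed.
    """
    if not blur_images:
        # Blurring not requested: render the original inline.
        return f"![image]({original_path})"
    if blurred_path is not None:
        # Blurring succeeded: render the blurred copy inline.
        return f"![image]({blurred_path})"
    # Blurring failed: emit a plain link (no leading "!") so the original
    # is reachable but never rendered inline.
    return f"[image (blur failed, original not shown inline)]({original_path})"
```

The design point is that a failed blur degrades to less exposure rather than more: the reviewer has to click through deliberately to see the unblurred image.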
- Execute 0_output.ipynb so the new 'Blurring Images' section has real rendered output.
- Reorder the bootstrap cell so initialize_pyrit_async runs before CentralMemory.get_memory_instance() (the previous order errored on a fresh kernel).
- Update .github/instructions/docs.instructions.md with a kernel/credentials checklist for jupytext --execute and an explicit 'keep cell outputs' rule.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous example called blur_images=True on a text-only attack result, so the output had no image to blur. Add an OpenAIImageTarget attack and print the result twice (once normally and once with blur_images=True) so the rendered notebook actually shows the blurred file in action.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
When reviewing attack results that include generated images, reviewers can be unnecessarily exposed to graphic content while skimming output. PR #1659 introduced a `safe_outputs` flag on the legacy `display_response` / printer paths, but the output module has since been refactored into `pyrit/output/`. This PR adds analogous functionality to the new module via a `blur_images` flag (with a configurable `blur_radius`), so reviewers can scan results with a Gaussian blur applied to any rendered imagery.

The flag is plumbed end-to-end:

- `pyrit/output/_image_utils.py::blur_image_bytes` centralizes the PIL `GaussianBlur` transformation and falls back to the original bytes on any failure.
- The pretty conversation printer (`conversation/pretty.py`) blurs image bytes in memory before handing them to `IPython.display.Image`.
- The Markdown conversation printer (`conversation/markdown.py`) writes a sibling `<stem>_blurred.png` next to each referenced image (idempotent: existing blurred files are reused) and emits the markdown link to the blurred copy instead of the original.
- The flag is threaded through `PrettyAttackResultPrinter`, `MarkdownAttackResultPrinter`, their `*MemoryPrinter` subclasses, and the top-level `output_attack_async` / `output_conversation_async` convenience functions.

Default is `blur_images=False`, so existing behavior is unchanged.

Inspired by #1659 but intentionally diverges in two ways: it targets the refactored `pyrit/output/` module rather than the legacy printer files, and it uses a single Gaussian blur transformation rather than the resize + desaturate + rotate stack, which keeps the helper simple and the intent (visual reduction) obvious.

Tests and Documentation

- `tests/unit/output/test_image_utils.py` covers the shared blur helper (PNG round-trip, byte-changes on multi-color images, graceful fallback on invalid input).
- `tests/unit/output/test_blur_images.py` covers:
  - the Markdown conversation printer, which writes `<stem>_blurred.png`, links to it, is idempotent across repeated calls, and falls back to the original path when blurring fails;
  - the attack result printers, which forward `blur_images` / `blur_radius` into their conversation sub-printers.
- All tests under `tests/unit/output/` pass; `ruff check` and `ruff format --check` are clean.
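For readers unfamiliar with the PIL side, the in-memory blur described for the shared helper can be sketched as below. This is an approximation under stated assumptions, not the contents of `pyrit/output/_image_utils.py`: the function name `blur_image_bytes_sketch`, the default radius of 20, and the RGBA conversion are choices made for this example. It requires Pillow.

```python
import io

from PIL import Image, ImageFilter


def blur_image_bytes_sketch(image_bytes: bytes, radius: float = 20.0) -> bytes:
    """Return PNG bytes of the input image with a Gaussian blur applied.

    Illustrative only: the default radius is an assumption, not the PR's value.
    On any failure the original bytes are returned unchanged, mirroring the
    fallback behavior the description attributes to the shared helper.
    """
    try:
        with Image.open(io.BytesIO(image_bytes)) as img:
            # Convert first: palette-mode images cannot be filtered directly.
            blurred = img.convert("RGBA").filter(ImageFilter.GaussianBlur(radius=radius))
        out = io.BytesIO()
        blurred.save(out, format="PNG")
        return out.getvalue()
    except Exception:
        return image_bytes
```

Because the function works on bytes in and bytes out, the same helper can serve both the in-memory display path and the on-disk `<stem>_blurred.png` path.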