FEAT: Add blur_images flag to pyrit.output for safer image rendering#1768

Merged
rlundeen2 merged 7 commits into
microsoft:mainfrom
rlundeen2:users/rlundeen/blur_images
May 22, 2026

Conversation

@rlundeen2
Contributor

Description

When reviewing attack results that include generated images, reviewers can be unnecessarily exposed to graphic content while skimming output. PR #1659 introduced a safe_outputs flag on the legacy display_response / printer paths, but the output module has since been refactored into pyrit/output/. This PR adds analogous functionality to the new module via a blur_images flag (with a configurable blur_radius), so reviewers can scan results with a Gaussian blur applied to any rendered imagery.

The flag is plumbed end-to-end:

  • Shared helper pyrit/output/_image_utils.py::blur_image_bytes centralizes the PIL GaussianBlur transformation and falls back to the original bytes on any failure.
  • Pretty path (conversation/pretty.py) blurs image bytes in memory before handing them to IPython.display.Image.
  • Markdown path (conversation/markdown.py) writes a sibling <stem>_blurred.png next to each referenced image (idempotent: existing blurred files are reused) and emits the markdown link to the blurred copy instead of the original.
  • Forwarded through PrettyAttackResultPrinter, MarkdownAttackResultPrinter, their *MemoryPrinter subclasses, and the top-level output_attack_async / output_conversation_async convenience functions.

Default is blur_images=False, so existing behavior is unchanged.
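
As a rough illustration of the shared helper described above, here is a minimal sketch. It is not the code merged in this PR: the signature, the default radius of 20, the RGBA conversion, and the choice to always emit PNG are assumptions. Only the name `blur_image_bytes`, the use of PIL's `GaussianBlur`, and the fall-back-to-original behavior come from the description.

```python
from io import BytesIO

from PIL import Image, ImageFilter


def blur_image_bytes(image_bytes: bytes, blur_radius: float = 20.0) -> bytes:
    """Return a Gaussian-blurred PNG copy of image_bytes.

    Hypothetical signature; the default radius is an assumption.
    On any failure (for example, undecodable input) the original
    bytes are returned unchanged.
    """
    try:
        with Image.open(BytesIO(image_bytes)) as img:
            # Convert to RGBA so palette and grayscale inputs blur consistently.
            blurred = img.convert("RGBA").filter(
                ImageFilter.GaussianBlur(radius=blur_radius)
            )
        out = BytesIO()
        blurred.save(out, format="PNG")
        return out.getvalue()
    except Exception:
        # Fall back to the caller's bytes rather than raising.
        return image_bytes
```

Note that a silent fallback here means the caller cannot tell whether blurring succeeded; a later commit in this PR changes the markdown path so a failed blur no longer renders the original inline.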

This PR is inspired by #1659 but intentionally diverges from it in two ways: it targets the refactored pyrit/output/ module rather than the legacy printer files, and it uses a single Gaussian blur transformation rather than the resize + desaturate + rotate stack, which keeps the helper simple and the intent (visual reduction) obvious.

Tests and Documentation

  • New tests/unit/output/test_image_utils.py covers the shared blur helper (PNG round-trip, byte-changes on multi-color images, graceful fallback on invalid input).
  • New tests/unit/output/test_blur_images.py covers:
    • Pretty memory printer blurs bytes before display when enabled, and leaves them untouched by default.
    • Markdown printer writes <stem>_blurred.png, links to it, is idempotent across repeated calls, and falls back to the original path when blurring fails.
    • Memory printers correctly forward blur_images / blur_radius into their conversation sub-printers.
  • All 157 tests in tests/unit/output/ pass; ruff check and ruff format --check are clean.
  • No documentation updates were necessary beyond the docstrings on the new flag.

rlundeen2 and others added 2 commits May 20, 2026 15:53
Adds a blur_images (and configurable blur_radius) flag to the output module
that applies a Gaussian blur to images before they are rendered to a reviewer:

- Pretty conversation printer blurs image bytes in-memory before IPython display.
- Markdown conversation printer writes a sibling <stem>_blurred.png next to the
  original (idempotent) and links to the blurred copy instead.
- Flag is plumbed through PrettyAttackResultPrinter, MarkdownAttackResultPrinter,
  their *MemoryPrinter subclasses, and the top-level output_attack_async and
  output_conversation_async helpers.
- Shared helper pyrit/output/_image_utils.py centralizes the PIL transformation.
- Defaults to False — existing behavior unchanged.

Inspired by microsoft#1659 (safe_outputs in the legacy display_response path) but adapted
to the refactored output module and using a simpler Gaussian blur transformation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…re tests

- Add `blurred_dir` param to MarkdownConversationPrinter (and downstream printers/helpers) so callers can redirect blurred copies out of the source tree.
- Switch _maybe_blur_image_on_disk to a write-temp + os.replace pattern with temp cleanup, so concurrent or failing renders cannot leave partial blurred files behind.
- Document the stale-blurred-file caching behavior in docstrings.
- Add tests for blurred_dir redirection, atomic-write failure cleanup, original-file immutability, and end-to-end blur kwargs through output_attack_async.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
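
The write-temp + os.replace pattern from this commit could look roughly like the sketch below. The function name `write_blurred_copy` and its `transform` parameter are hypothetical stand-ins chosen for illustration (the PR's actual helper is `_maybe_blur_image_on_disk`, whose signature is not shown here); the `<stem>_blurred.png` naming, the `blurred_dir` redirection, the cache reuse, and the temp cleanup follow the commit message.

```python
import contextlib
import os
import tempfile
from pathlib import Path
from typing import Callable, Optional


def write_blurred_copy(
    original: Path,
    transform: Callable[[bytes], bytes],
    blurred_dir: Optional[Path] = None,
) -> Path:
    """Write <stem>_blurred.png atomically and return its path (hypothetical helper)."""
    target_dir = blurred_dir if blurred_dir is not None else original.parent
    target = target_dir / f"{original.stem}_blurred.png"
    if target.exists():
        # Cache hit: reused as-is, so it is stale if the original changed.
        return target

    target_dir.mkdir(parents=True, exist_ok=True)
    # Temp file lives in the target directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(transform(original.read_bytes()))
        os.replace(tmp_name, target)  # atomic rename over the final name
    except BaseException:
        # Never leave a partial file behind; the original is only ever read.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
    return target
```

Because readers only ever see the final name after `os.replace`, a concurrent or failed render cannot expose a half-written blurred file.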
@rlundeen2 rlundeen2 marked this pull request as ready for review May 20, 2026 23:17
Comment thread pyrit/output/attack_result/markdown.py
On Windows, `os.path.relpath` raises ValueError when the target path is on a different mount than the cwd (e.g. pytest tmp_path on D: vs cwd on C:). The markdown image formatter now catches that and falls back to the absolute path. Added a regression test that forces the ValueError via patch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
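
A minimal sketch of the cross-drive fallback described in this commit. The function name `display_path` is hypothetical; the behavior of `os.path.relpath` raising `ValueError` across Windows drives and the absolute-path fallback are taken from the commit message.

```python
import os


def display_path(target: str, start: str = os.curdir) -> str:
    """Return target relative to start, or absolute if no relative path exists."""
    try:
        return os.path.relpath(target, start=start)
    except ValueError:
        # On Windows, paths on different drives (e.g. D: vs C:) have no
        # relative form, so fall back to the absolute path.
        return os.path.abspath(target)
```

A regression test can force the failure on any platform by patching `os.path.relpath` to raise `ValueError`, which is the approach the commit message describes.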
Comment thread pyrit/output/conversation/markdown.py
@jsong468 jsong468 self-assigned this May 21, 2026
Comment thread pyrit/output/conversation/markdown.py
Comment thread pyrit/output/helpers.py
Comment thread pyrit/output/helpers.py
…ngs, doc example

- On blur failure, emit a plain-text link to the original instead of an inline image, so reviewers are never silently exposed to unblurred content (jsong468 feedback).
- Log a debug message on blurred-file cache hit (jsong468 NIT).
- Make output_conversation_async.blur_images docstring consistent with output_attack_async, including the fallback note.
- Add a 'Blurring Images' section to doc/code/output/0_output (.py and .ipynb).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
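
The fail-closed behavior from the first bullet of this commit can be illustrated with the sketch below. The function `image_markdown`, its `blur_to_disk` callback, and the exact link text are hypothetical; what comes from the commit message is the rule that a failed blur yields a plain-text link to the original rather than an inline image.

```python
from typing import Callable, Optional


def image_markdown(
    path: str,
    blur_images: bool,
    blur_to_disk: Callable[[str], Optional[str]],
) -> str:
    """Build the markdown for one image (hypothetical helper).

    blur_to_disk is assumed to return the blurred file's path,
    or None when blurring failed.
    """
    if not blur_images:
        return f"![image]({path})"  # default: inline original, unchanged behavior

    blurred = blur_to_disk(path)
    if blurred is None:
        # Fail closed: a plain link does not render the image, so the
        # reviewer must click through deliberately to see the original.
        return f"[image (blur failed, original not shown inline)]({path})"
    return f"![image]({blurred})"
```

The difference between the two failure outputs is only the leading `!`, which is what decides whether a markdown viewer renders the picture or shows a link.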
@rlundeen2 rlundeen2 enabled auto-merge May 22, 2026 20:50
@rlundeen2 rlundeen2 added this pull request to the merge queue May 22, 2026
rlundeen2 and others added 2 commits May 22, 2026 14:09
- Execute 0_output.ipynb so the new 'Blurring Images' section has real rendered output.
- Reorder the bootstrap cell so initialize_pyrit_async runs before CentralMemory.get_memory_instance() (the previous order errored on a fresh kernel).
- Update .github/instructions/docs.instructions.md with a kernel/credentials checklist for jupytext --execute and an explicit 'keep cell outputs' rule.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous example called blur_images=True on a text-only attack result, so the output had no image to blur. Add an OpenAIImageTarget attack and print the result twice — once normally and once with blur_images=True — so the rendered notebook actually shows the blurred file in action.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@rlundeen2 rlundeen2 removed this pull request from the merge queue due to a manual request May 22, 2026
@rlundeen2 rlundeen2 enabled auto-merge May 22, 2026 21:21
@rlundeen2 rlundeen2 added this pull request to the merge queue May 22, 2026
Merged via the queue into microsoft:main with commit 624f539 May 22, 2026
48 checks passed
@rlundeen2 rlundeen2 deleted the users/rlundeen/blur_images branch May 22, 2026 21:58

3 participants