Fix pathologically slow assertion diffs for large inputs (#8998) by kirilklein · Pull Request #14543 · pytest-dev/pytest

kirilklein · 2026-06-01T07:00:21Z

Closes #8998.

Problem

Comparing very large strings, lists, or dataclasses inside an assert can hang for a long time (sometimes minutes) while pytest builds the failure diff.

Profiling the reproductions from the issue confirms the root cause is difflib.ndiff:

its character-level "fancy replace" step is quadratic in the size of the differing region (so two large, mostly-different strings are catastrophic), and
the underlying SequenceMatcher is quadratic in the number of lines — a large nested structure pretty-prints to a huge number of lines (the dataclass example in the issue pformats to ~418,000 lines).

Approach

Following the maintainer discussion in the issue, this uses a deterministic size heuristic rather than wall-clock timeouts (which are non-deterministic and can't reliably interrupt difflib).

A new helper module _pytest/assertion/_diff.py provides:

ndiff_too_slow(left_lines, right_lines) — True when the combined input exceeds a character budget or a line-count budget, the two dimensions that make ndiff slow.
fast_unified_diff(...) — a coarse but fast line-level difflib.unified_diff, capped to a bounded number of lines so it always completes in milliseconds. It notes in the output that a faster diff is being shown (and how many lines were hidden).
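As a rough illustration of the two helpers described above, the following is a minimal sketch, not the code from the PR. The constant names and values (`MAX_CHARS`, `MAX_LINES`, `MAX_OUTPUT_LINES`), the signatures, and the wording of the note are assumptions made for the example; only `difflib.unified_diff` and the overall idea (size check, then a capped line-level diff) come from the description.

```python
import difflib
from collections.abc import Iterator, Sequence
from itertools import islice

# Hypothetical budgets, chosen only for this sketch.
MAX_CHARS = 5_000
MAX_LINES = 1_000
MAX_OUTPUT_LINES = 50


def ndiff_too_slow(left: Sequence[str], right: Sequence[str]) -> bool:
    """Return True when the combined input exceeds either size budget."""
    if len(left) + len(right) > MAX_LINES:
        return True
    chars = sum(len(line) for line in left) + sum(len(line) for line in right)
    return chars > MAX_CHARS


def fast_unified_diff(left: Sequence[str], right: Sequence[str]) -> Iterator[str]:
    """Yield a line-level unified diff, capped at MAX_OUTPUT_LINES lines."""
    diff = difflib.unified_diff(left, right, lineterm="", n=0)
    shown = list(islice(diff, MAX_OUTPUT_LINES))
    hidden = sum(1 for _ in diff)  # count what is left without storing it
    yield from shown
    if hidden:
        yield f"... {hidden} more diff lines hidden"
```

A caller would check `ndiff_too_slow(...)` first and only use `fast_unified_diff(...)` when it returns True, keeping `difflib.ndiff` for everything below the budgets.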

Both pathological call sites fall back to it when needed:

compare_text._diff_text (string comparisons)
_compare_sequence._compare_eq_iterable (list / dataclass / iterable comparisons)

Comparisons below the cutoffs keep the existing detailed ndiff output unchanged.

Results

On the reproductions from the issue (dataclass with large lists + two large random strings), with -v:

before: hangs (one repro profiled at ~384s of find_longest_match)
after: ~0.7s, with a useful fallback diff

Tests

Added regression tests in testing/test_assertion.py: unit tests for the ndiff_too_slow heuristic, and integration tests that large string / many-line / large-iterable comparisons fall back to the fast diff (no ndiff ? guide lines), still show which lines differ, and emit the line-cap notice. Thresholds were chosen from benchmarking.

🤖 Generated with Claude Code

Pierre-Sassoulas · 2026-06-01T07:22:32Z

We have an in-flight MR (#14523) to use generators in the assertion repr, which could help with this when we don't have to show the actual output.

…8998)

Comparing very large strings, lists, or dataclasses in an ``assert`` could hang for a long time (sometimes minutes) while pytest built the failure diff. The cost comes from ``difflib.ndiff``: its character-level "fancy replace" step is quadratic in the size of the differing region, and the underlying ``SequenceMatcher`` is quadratic in the number of lines (a large nested structure can pretty-print to hundreds of thousands of lines).

Add a deterministic size heuristic (no wall-clock timeouts, per the maintainer discussion in the issue): when the input is too large for ``ndiff`` to be fast, fall back to a coarser line-level ``unified_diff``, capped to a bounded number of lines so it always completes in milliseconds, and note this in the output. Smaller comparisons keep the existing detailed ``ndiff`` output unchanged.

kirilklein · 2026-06-11T17:06:54Z

Thanks! I looked at #14523. It and this PR are complementary:

#14523 ("Use streaming in all assertion comparisons consumers") avoids computing the diff when it will be truncated anyway (great for the default/-v case via pformat_cap), but its cap is None on CI and with -vv, where ndiff's SequenceMatcher stays quadratic. It also doesn't touch the string path (compare_text._diff_text), which is the original repro in this issue.
This PR caps the diff input deterministically regardless of verbosity/CI and covers both strings and iterables, so the pathological hang can't happen even when the full output is shown.

They do overlap in _compare_eq_iterable. Happy to rebase on top of #14523 once it lands, or to narrow this PR to just the cases #14523 doesn't cover (the string path + CI/-vv) — whichever you prefer.

Pierre-Sassoulas

I think this makes sense: ndiff is really costly, and if there are a ton of changes no one is going to look at everything in great detail. Maybe we could make some lines fancy and not show everything, instead of showing all the lines as non-fancy. Or make only the first line fancy, because -vvv means "show me the full diff" after all.

Pierre-Sassoulas · 2026-06-14T07:32:57Z

+    size = sum(len(line) for line in left_lines) + sum(
+        len(line) for line in right_lines
+    )


We're summing everything here; we need to exit early as soon as size becomes greater than NDIFF_MAX_INPUT_SIZE.
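The early exit asked for here could look roughly like the sketch below. The function name `exceeds_char_budget`, its `budget` parameter, and the value given to `NDIFF_MAX_INPUT_SIZE` are assumptions for illustration; the thread only names the constant.

```python
from collections.abc import Iterable
from itertools import chain

NDIFF_MAX_INPUT_SIZE = 5_000  # assumed value; not stated in the thread


def exceeds_char_budget(
    left_lines: Iterable[str],
    right_lines: Iterable[str],
    budget: int = NDIFF_MAX_INPUT_SIZE,
) -> bool:
    """Return True as soon as the running character count passes the budget."""
    size = 0
    for line in chain(left_lines, right_lines):
        size += len(line)
        if size > budget:
            return True  # stop here; remaining lines are never measured
    return False
```

Because the loop returns on the first line that pushes the total over the budget, the work done is bounded by the budget rather than by the input size.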

Pierre-Sassoulas · 2026-06-14T07:34:06Z

+    yield (
+        f"Diff too large to compute in full (over {NDIFF_MAX_INPUT_SIZE} "
+        "characters); showing a faster line-level diff instead:"
+    )


The message is wrong here: the cause could be either too many lines or too many characters.
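One way to address this is to build the note from whichever limit was actually exceeded. This is only a sketch: the function `fallback_notice`, its parameters, and the exact wording are hypothetical, and the message the PR ended up with is not shown in this thread.

```python
def fallback_notice(n_chars: int, n_lines: int, max_chars: int, max_lines: int) -> str:
    """Describe which limit(s) triggered the fallback diff."""
    reasons = []
    if n_chars > max_chars:
        reasons.append(f"over {max_chars} characters")
    if n_lines > max_lines:
        reasons.append(f"over {max_lines} lines")
    if not reasons:
        # The notice only makes sense when a limit was exceeded.
        raise ValueError("no limit exceeded")
    return (
        f"Diff too large to compute in full ({' and '.join(reasons)}); "
        "showing a faster line-level diff instead:"
    )
```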

Pierre-Sassoulas · 2026-06-14T07:35:42Z

+    left_lines = left.splitlines(keepends)
+    right_lines = right.splitlines(keepends)


Do we have to split lines? Can't we just count the line separators?
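Counting separators instead of splitting might be sketched as follows. The helper name `line_count` is hypothetical. Note the assumption baked in: only `"\n"` is counted, whereas `str.splitlines` also splits on other boundaries such as a lone `"\r"`, so the two can disagree on unusual input.

```python
def line_count(text: str) -> int:
    """Estimate the number of lines without building a list of them.

    Only "\n" is treated as a separator, so this can undercount compared
    to str.splitlines(), which recognises more line boundaries.
    """
    if not text:
        return 0
    # A trailing newline terminates the last line rather than starting a new one.
    return text.count("\n") + (0 if text.endswith("\n") else 1)
```

For a size heuristic an approximate count is likely enough, since the goal is only to decide whether the input is too large.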

Pierre-Sassoulas · 2026-06-14T07:43:20Z

+        assert ndiff_too_slow(["spam"], ["eggs"]) is False
+
+    def test_many_characters_is_too_slow(self) -> None:
+        assert ndiff_too_slow(["a" * 6000], ["b" * 6000]) is True


Let's mock the values; we don't have to actually construct an enormous list to test the behavior.
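A sketch of the idea, shrinking the limit for the duration of a test instead of building large inputs. To stay self-contained it uses `unittest.mock.patch.object` on a stand-in namespace; in pytest's own test suite the `monkeypatch` fixture's `setattr` would play the same role. The `limits` object, the simplified `ndiff_too_slow`, and the limit value are assumptions for the example.

```python
from types import SimpleNamespace
from unittest import mock

# Hypothetical stand-in for the module that holds the limit.
limits = SimpleNamespace(NDIFF_MAX_INPUT_SIZE=5_000)


def ndiff_too_slow(left_lines, right_lines) -> bool:
    """Simplified heuristic that reads the limit at call time."""
    size = 0
    for line in (*left_lines, *right_lines):
        size += len(line)
        if size > limits.NDIFF_MAX_INPUT_SIZE:
            return True
    return False


def test_many_characters_is_too_slow() -> None:
    # With the limit patched down to 5, eight characters are "too many".
    with mock.patch.object(limits, "NDIFF_MAX_INPUT_SIZE", 5):
        assert ndiff_too_slow(["spam"], ["eggs"]) is True
    # Outside the patch the original limit applies again.
    assert ndiff_too_slow(["spam"], ["eggs"]) is False
```

This only works because the heuristic looks the limit up when called; a limit captured as a default argument or a local copy would not see the patched value.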

Pierre-Sassoulas · 2026-06-14T07:43:37Z

        assert "- " + "a" * 50 + "eggs" in lines
        assert "+ " + "a" * 50 + "spam" in lines

+    def test_text_diff_large_input_skips_ndiff(self) -> None:


Let's also mock here

…est-dev#8998)

Responding to review feedback on the size heuristic and fallback:

- Show a real ``ndiff`` over a bounded prefix instead of a coarse ``unified_diff``, so the character-level diff is kept for the part shown (the fallback no longer drops to a "non-fancy" line diff).
- Bound the input to ``ndiff`` by both line and character count: its "fancy replace" cost grows with the product of the two, so a few hundred similar lines (e.g. a pretty-printed list of repeated values) could still take seconds. Lower DIFF_MAX_LINES accordingly so the worst case stays under ~1s.
- The "too slow" checks now short-circuit instead of measuring the whole input, and the text check counts line separators instead of splitting the string into a list first.
- Fix the fallback message, which wrongly claimed only the character limit was exceeded when it could be either limit.
- Tests shrink the limits via monkeypatch instead of building huge data.

kirilklein · 2026-06-17T19:00:19Z

@Pierre-Sassoulas thanks for the review

Fallback now runs a real ndiff over a bounded prefix, so the character-level diff is kept for the shown part (no more flat line-only diff)
While doing this I found ndiff's cost scales with lines × chars for similar lines, so a bounded slice of ~500 pretty-printed lines still took ~30s. I lowered DIFF_MAX_LINES to 100 so the worst case stays under ~1s on both paths
Fixed the note (it now mentions both the char and line limits)
Tests shrink the limits via monkeypatch instead of allocating huge inputs
The cap applies at all verbosities (including -vvv) to guarantee it never hangs
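The bounded-prefix idea might be sketched like this. The real `_bounded_prefix` is not shown in this thread, so the signature, return value, and exact behaviour below are guesses based on the four branches named in the follow-up commit (within limits, line cap, char-truncated line, exact-fill drop).

```python
import difflib


def bounded_prefix(
    lines: list[str], max_lines: int, max_chars: int
) -> tuple[list[str], bool]:
    """Return (prefix, truncated) holding at most max_lines lines and max_chars chars."""
    out: list[str] = []
    remaining = max_chars
    for line in lines:
        if len(out) >= max_lines:
            return out, True  # line cap reached
        if remaining <= 0:
            return out, True  # budget exactly filled: drop the rest
        if len(line) > remaining:
            out.append(line[:remaining])  # keep only what fits of this line
            return out, True
        out.append(line)
        remaining -= len(line)
    return out, False  # everything fit within both limits


# Usage: run the detailed ndiff only on the bounded prefixes.
left, _ = bounded_prefix(["spam", "eggs"], max_lines=100, max_chars=1_000)
right, _ = bounded_prefix(["spam", "ham"], max_lines=100, max_chars=1_000)
diff = list(difflib.ndiff(left, right))
```

Bounding both dimensions matters because, as noted above, the cost for similar lines grows with lines × chars, so a line cap alone does not bound the worst case.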

Add a direct unit test exercising all four branches of _bounded_prefix (within limits, line cap, char-truncated line, and exact-fill drop) so patch coverage stays complete.

Pierre-Sassoulas

Thank you! Let's reach full coverage, then I'll review again. A lot has changed, so I'll start from scratch :)

psf-chronographer Bot added the bot:chronographer:provided (automation) changelog entry is part of PR label Jun 1, 2026

kirilklein force-pushed the fix-8998-large-diff-perf branch from c992d71 to e232573 Compare June 11, 2026 17:04

Pierre-Sassoulas requested changes Jun 14, 2026

Pierre-Sassoulas added the type: performance performance or memory problem/improvement label Jun 14, 2026

kirilklein added 2 commits June 17, 2026 20:56

Merge branch 'main' into fix-8998-large-diff-perf

53a6d26

Cover _bounded_prefix edge branches (pytest-dev#8998)

76143eb

Pierre-Sassoulas requested changes Jun 17, 2026

kirilklein wants to merge 4 commits into pytest-dev:main from kirilklein:fix-8998-large-diff-perf
