feat: Mutation caching and transitive dependency tracking #509

Open
nicklafleur wants to merge 7 commits into boxed:main from lyft:nicklafleur/function_hashing

@nicklafleur
Collaborator

@nicklafleur nicklafleur commented Apr 26, 2026

Summary

Adds incremental mutation testing to mutmut by skipping mutants in unchanged code, with transitive invalidation via a runtime call graph. On re-runs, only mutants in functions whose source (or whose dependencies' source) changed are re-tested.
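
To make the transitive invalidation concrete, here is a minimal sketch, assuming the call graph is stored as a callee → callers map as described later in this PR. The function name `invalidated_functions` and the sample graph are hypothetical and are not taken from the mutmut code.

```python
from collections import deque


def invalidated_functions(
    changed: set[str],
    callers_by_callee: dict[str, set[str]],
) -> set[str]:
    """Return the changed functions plus everything that transitively calls them."""
    stale = set(changed)
    queue = deque(changed)
    while queue:
        callee = queue.popleft()
        for caller in callers_by_callee.get(callee, ()):
            if caller not in stale:
                stale.add(caller)
                queue.append(caller)
    return stale


# Hypothetical call graph: api -> service -> util, and report -> util.
graph = {
    "util": {"service", "report"},
    "service": {"api"},
}

print(sorted(invalidated_functions({"util"}, graph)))  # ['api', 'report', 'service', 'util']
print(sorted(invalidated_functions({"api"}, graph)))   # ['api']
```

Changing a leaf utility invalidates every chain above it, while changing a top-level entry point invalidates only itself, which is why utility functions dominate the cache-busting cost.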

High-level

  • Incremental mutation testing, which cuts mutation run duration roughly linearly with the fraction of code changed (the less code changes, the faster the run).
    • In practice, in large codebases this means a >95% reduction in runtime on average, since the amount of unchanged code far outweighs the amount of changed code.
    • Utility functions are particularly susceptible to "cache busting": even a no-op syntactic change that modifies the AST invalidates every call chain that relies on those functions (technically correct, since the code did change, but something to be aware of).
  • UI support will come in a future PR

Commit Breakdown:

  1. feat: per-function hashing for incremental cache invalidation

When a source file changes, only re-test mutants in functions whose AST
hash changed; preserve prior results for unchanged functions in the same
file.

  • compute_function_hashes / _compute_mutated_function_hashes in file_mutation.py: class-qualified mangled keys (x_foo / xǁClassǁmethod) -> 12-char sha256 of the function AST. Methods and nested-class methods are indexed under the same key the merge looks up, closing the latent silent-preservation bug for changed methods.
  • mutate_file_contents returns a 3-tuple (code, names, hashes).
  • SourceFileMutationData gains hash_by_function_name, persisted in .meta with a pop-with-default so old files still load.
  • create_mutants_for_file: mtime short-circuit now preserves all prior results instead of resetting them; on a real change, load-and-merge compares new hashes against old, resets only changed/unhashed mutants, and preserves the rest.
  • Tests: update all mutate_file_contents unpack sites; add tests for hash stability, body-change detection, comment-insensitivity, method key inclusion, two-function preserve/reset integration, and the method regression guard.
  2. feat: cross-call dependency tracking for incremental stats invalidation

Records caller->callee edges at stats collection time so stale outgoing
call edges can be cleared when a callee's code changes.

  • state.py: MutmutState singleton holding old_function_hashes, current_function_hashes, and function_dependencies (callee → callers).
  • core.py: MutmutCallStack ContextVar propagates caller context through call chains.
  • trampoline.py stats branch: resolves caller via MutmutCallStack, passes it to record_trampoline_hit, sets updated context for inner calls, respects MUTMUT_DEPENDENCY_DEPTH env ceiling.
  • record_trampoline_hit gains caller param; upstream's source-path-resolving max_stack_depth walk preserved verbatim; dependency edge written only when track_dependencies=True.
  • FileMutationResult gains changed_functions/current_hashes (deferred from commit 1); create_mutants accumulates current_hashes into state().current_function_hashes across worker results.
  • create_mutants_for_file builds module-qualified current_hashes and changed_functions for return to parent.
  • load_stats/save_stats persist function_hashes and function_dependencies alongside existing test associations (backwards-compatible pop-with-default on load).
  • _cleanup_stale_stats: removes test associations and dependency edges for modules absent from current_function_hashes.
  • _invalidate_stale_dependency_edges: clears changed functions from all caller sets so stale outgoing edges are rebuilt on next stats run.
  • collect_or_load_stats: on incremental load, runs cleanup always and invalidation when track_dependencies; persists the result.
  • Config gains track_dependencies (default True) and dependency_tracking_depth (default None); run_stats_collection sets MUTMUT_DEPENDENCY_DEPTH from config.
  • Tests: record_trampoline_hit with/without track_dependencies, _cleanup_stale_stats removes unknown modules, _invalidate_stale_dependency_edges clears changed callers and no-ops on first run, config defaults asserted.
  3. e2e: add benchmark project with 1k mutants
  • Add e2e_projects/benchmark_1k/ with ~1000 mutants for testing
  • Includes modules: numbers, strings, booleans, operators, comparisons,
    arguments, returns, complex (recursion, higher-order functions)
  • Configurable delays via BENCHMARK_IMPORT_DELAY, BENCHMARK_CONFTEST_DELAY,
    BENCHMARK_TEST_DELAY environment variables to simulate the performance
    under variable test and startup runtimes.

  4. feat: invalidate cache on config and dependency changes

Cached verdicts were only invalidated when a function body changed, so
changes to config or dependency files silently produced stale results.

  • Config.config_fingerprint() hashes result-affecting config, grouped so we reset only what each change can affect:
    • timeout change -> reset only timeout verdicts
    • type_check_command change -> reset mutants whose type-check status flips (symmetric difference of old exit-37 and newly-caught)
    • pytest_add_cli_args / test-selection change -> reset all results and force full stats recollection
    • set-affecting config (source_paths, only_mutate, ...) is ignored: new mutants are uncached and dropped ones stop being walked
  • compute_watched_file_hashes() hashes dependency/build files (pyproject.toml, setup.cfg/py, requirements*.txt, lockfiles) plus user globs from the new cache_invalidation_files config. The on_dependency_change config ("warn" | "rerun" | "ignore", default "warn") controls whether a change warns or resets all results.
  • Fingerprints persist in mutmut-stats.json with pop-with-default, so old caches load and a missing fingerprint triggers no invalidation.

  5. feat: use git to detect non-Python dependency file changes

Replace the fixed watched-file list with git-based change detection. mutmut now uses git diff/git ls-files to find every non-.py file changed since the last full run, falling back to the curated list when git is unavailable. A default exclude set (*.md, *.rst, docs/, LICENSE, etc.) drops files that never affect tests; users can extend it with cache_invalidation_exclude. The git commit and file hashes are persisted together as a baseline so a later git-less environment (e.g. a separate CI stage) can still detect changes to previously-tracked files by re-hashing them. New options: use_git_change_detection (default true) and cache_invalidation_exclude.
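
A rough sketch of the git-based detection, assuming `git diff --name-only <commit>` and `git ls-files --others --exclude-standard` as the underlying commands. The helper names, the exclude subset and the exact matching rules are illustrative assumptions, not the implementation in this PR.

```python
import fnmatch
import subprocess

# Illustrative subset of a default exclude set.
DEFAULT_EXCLUDE = ("*.md", "*.rst", "docs/*", "LICENSE")


def relevant_changes(paths, exclude=DEFAULT_EXCLUDE):
    """Drop .py files (tracked via function hashes) and excluded patterns."""
    kept = []
    for path in paths:
        if path.endswith(".py"):
            continue
        if any(fnmatch.fnmatchcase(path, pattern) for pattern in exclude):
            continue
        kept.append(path)
    return sorted(kept)


def changed_since(commit: str):
    """Return paths changed since `commit`, or None when git cannot be used."""
    try:
        diff = subprocess.run(
            ["git", "diff", "--name-only", commit],
            capture_output=True, text=True, check=True,
        )
        untracked = subprocess.run(
            ["git", "ls-files", "--others", "--exclude-standard"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # caller falls back to hashing a curated file list
    return diff.stdout.splitlines() + untracked.stdout.splitlines()


sample = ["src/main.py", "src/config.yml", "pyproject.toml", "README.md", "docs/index.html"]
print(relevant_changes(sample))  # ['pyproject.toml', 'src/config.yml']
```

Returning `None` rather than an empty list keeps "git is unavailable" distinguishable from "nothing changed", so the fallback path is only taken when it is actually needed.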

Known Issues

  • Because we only track dependencies at runtime through the trampoline logic, un-mutated functions are omitted from the dependency graph that is built. The graph therefore represents the calls between mutated functions, not the global call graph.
  • We end up looping over all walkable files a few times, pushing time complexity higher than before. This is still a smaller penalty than the caching gain, but definitely something that can be improved.
  • The "cache" is a JSON file right now, which is horrifically inefficient for the sparse reads/writes that are typical in this workflow. Moving to an SQLite-based store of the state could unlock some significant storage and parallelism breakthroughs.
    • I have a follow-up PR that will branch out into different forking strategies that could be extended to include easy hookups for this kind of reporting strategy.
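
For the SQLite idea, a minimal sketch of what a per-mutant store could look like using the standard `sqlite3` module. The table layout and helper names are hypothetical; nothing like this exists in the PR yet.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store with one row per mutant."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "mutant_key TEXT PRIMARY KEY, exit_code INTEGER)"
    )
    return conn


def record(conn: sqlite3.Connection, mutant_key: str, exit_code) -> None:
    """Write a single result without rewriting the rest of the cache."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO results (mutant_key, exit_code) VALUES (?, ?)",
            (mutant_key, exit_code),
        )


def lookup(conn: sqlite3.Connection, mutant_key: str):
    """Return the stored exit code, or None if unknown or reset."""
    row = conn.execute(
        "SELECT exit_code FROM results WHERE mutant_key = ?", (mutant_key,)
    ).fetchone()
    return None if row is None else row[0]


store = open_store()
record(store, "x_foo__mutmut_1", 1)
record(store, "x_foo__mutmut_1", 0)  # sparse update of one key
print(lookup(store, "x_foo__mutmut_1"), lookup(store, "x_bar__mutmut_1"))  # 0 None
```

Each write touches one row instead of re-serializing the whole file, which is the property the JSON cache lacks for sparse updates.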

@nicklafleur nicklafleur changed the title Nicklafleur/function hashing feat: Mutation caching and transitive dependency tracking Apr 26, 2026
@Otto-AA
Collaborator

Otto-AA commented May 1, 2026

Hi, thanks for the PR, I think this will improve working with mutmut in general :)

I think I would fix #477 before taking the time to review this PR (because I think it would be nice to fix the regression some time soon, also because I'd like to unify the external / "normal" method injection setup a bit to reduce complexity, and tbh also because currently I'm more in the mood of writing code myself rather than reviewing, as I only spend little time on open source currently).

Some initial thoughts on this PR:

I guess a (reasonable) limitation is that caching will only notice changes within functions/methods. So none of the following would trigger mutant reruns:

  • external library changes (dependency updates)
  • configuration changes (pyproject.toml, yaml files, etc.)
  • data file changes (my_query.sql, etc.)
  • import-time code changes (dataclass/pydantic model change, import statements, etc.)

All of these cannot be tied to some function/method, so we would need some other system than callstacks for tracking dependencies. I think it's fair to say these are out of scope.

What happens when mutmut configs change? e.g. in the first run we set the filter to only mutate some files and in the next run other files? Or we add a new pytest flag. Should we simply keep the cache, or clear it, or ask the user?

Introduces MutationMetadata (line number, mutation type, human-readable description) carried on every Mutation and serialized to JSON, plus an OPERATOR_TO_TYPE mapping and helpers (_determine_mutation_type, _describe_mutation).

Is this relevant to caching or an additional feature? The _describe_mutation method feels like the git diff of the mutmut browse

@nicklafleur
Collaborator Author

nicklafleur commented May 1, 2026

yeah #477 and the unification of the trampoline patterns seem like great candidates to merge before this work. The dependency change thing is something that I don't really have a great answer to. My personal view here is that generally people should be proactive about doing full reruns when making big library changes, but having a "false cache" is definitely not the kind of thing that most people would clue into.

The naive approach would be to detect these things in some way and simply force a rerun in those cases, which is effectively the status quo today so there's no regression in that sense.

The mutation metadata is something I've been kinda messing with in the context of LLM-driven testing. There's been a big industry push to having unit tests be written by AI, but there isn't really a mechanism to give AI meaningful feedback on the quality of passing tests. One can imagine that a math-focused lib may want to kill all calculation-based/boolean mutants but not care as much for string mutations for example. Having this kind of metadata is what would be needed to be able to filter for/express this data.

I believe (I'd have to go back and check, been a while since I made the changes) that I've included this information in my updates to the browser in the TMP branch, but I'll be sure to include that if not.

On a more general note, if you're having reviewer burnout please take some time to just do some code changes, I've been blessed by @boxed as a collaborator, and will be happy to take on the review burden of your (and others') changes in the short term and leave mine to sit on the sidelines for a bit, you've reviewed more than enough of my code to have earned that break, especially given the size and density of my changes :), though if you have the opportunity to test out this branch to get a feel for the speed increases and workflows, I would love to hear your hands-on experience.

@Otto-AA
Collaborator

Otto-AA commented May 2, 2026

On a more general note, if you're having reviewer burnout please take some time to just do some code changes, I've been blessed by @boxed as a collaborator, and will be happy to take on the review burden of your (and others') changes in the short term and leave mine to sit on the sidelines for a bit, you've reviewed more than enough of my code to have earned that break, especially given the size and density of my changes :), though if you have the opportunity to test out this branch to get a feel for the speed increases and workflows, I would love to hear your hands-on experience.

Thanks for your offer ❤️ I am already taking it slow, only looking at open source a few times a month and then doing only the work I feel happy doing right now. Regarding reviewing other PRs, feel free to do so but no pressure. You could also review and ask me if there are any open questions.

My personal view here is that generally people should be proactive about doing full reruns when making big library changes, but having a "false cache" is definitely not the kind of thing that most people would clue into.
The naive approach would be to detect these things in some way and simply force a rerun in those cases, which is effectively the status quo today so there's no regression in that sense.

If we want to pull in git as a dependency, we could:

  • on a full run: store the commit hash (+ changes? not sure if that's possible)
  • on a cached run
    • make a git diff to the last full run
    • inform the users about changed non-python files

So something like this (just a first idea, feel free to redesign):

# initial full run
mutmut run

# modify some files
vim src/main.py
vim src/config.yml
vim pyproject.toml

# partially cached run
mutmut run
[info] following files changed since the last full run, but cannot be tracked for changes:
[info] src/config.yml pyproject.toml (not displaying src/main.py, because we track changes there)
[info] Consider clearing the mutants cache if the changes are relevant for your tests

This would help the external files issue. I think only the import-time code caching would still be a blind spot.

I already previously thought about using git archive to setup the mutants directory, instead of the source_paths and also_include configs. So maybe adding git as an (optional?) dependency could be nice anyway.

Also somewhat related is the git option by infection: https://infection.github.io/guide/command-line-options.html#git-diff-filter (probably useful for CI; could be added in addition to this PR imo)

The mutation metadata is something I've been kinda messing with in the context of LLM-driven testing. There's been a big industry push to having unit tests be written by AI, but there isn't really a mechanism to give AI meaningful feedback on the quality of passing tests. One can imagine that a math-focused lib may want to kill all calculation-based/boolean mutants but not care as much for string mutations for example. Having this kind of metadata is what would be needed to be able to filter for/express this data.

I've been thinking about a setting to enable/disable specific types of mutations (disable_mutation_operators = [ 'string.case', 'number' ] or something like this), maybe that would be helpful for this use case as well? Though the mutation operators are also changing more frequently, so the identifiers are probably not 100% stable.

I haven't given a lot of thought yet to how mutmut can be used by agents. I'd guess the git diff could work well (diffing the old and new function), and we could also output a short description in the mutation operators in node_mutation.py. But I'm pretty sure you have more AI experience, so take it just as input :)

@nicklafleur
Collaborator Author

nicklafleur commented May 2, 2026

Thanks for your offer ❤️ I am already taking it slow, only looking at open source a few times a month and then doing only the work I feel happy doing right now. Regarding reviewing other PRs, feel free to do so but no pressure. You could also review and ask me if there are any open questions.

Glad to hear you're prioritizing yourself, I've been merging the easy ones like dependabot, I plan on checking out some of the more recent ones without conflicts and potentially poking the older ones for signs of life.

I've been thinking about a setting to enable/disable specific types of mutations (disable_mutation_operators = [ 'string.case', 'number' ] or something like this), maybe that would be helpful for this use case as well? Though the mutation operators are also changing more frequently, so the identifiers are probably not 100% stable.

I haven't given a lot of thought yet to how mutmut can be used by agents. I'd guess the git diff could work well (diffing the old and new function), and we could also output a short description in the mutation operators in node_mutation.py. But I'm pretty sure you have more AI experience, so take it just as input :)

Having agents use mutmut is actually a big reason why I worked on the caching. For the workflow loop to be somewhat reasonable for our larger repos we needed to bring the runtime as low as possible so that it could be driven by subagent-type flows. I figured that diff style workflows are a large part of modern agent training data so wanted to make use of that in the way we report uncaught mutations to the LLMs instead of the json results which would require a lot of parsing and token spend to extract semantic meaning.

on a cached run

  • make a git diff to the last full run
  • inform the users about changed non-python files

That's an interesting idea, we could pretty reliably capture most typical python configs (toml, reqs.txt, manifests, etc) and potentially even offer a mechanism for people to register their own in case they have some custom internal tooling. That way it assumes that no change happened (bumping a lib patch version practically never affects behaviour in a meaningful way) while also avoiding a completely silent pass.

btw, I plan on taking on #404 sometime soon, just need to set it up on my personal setup and I'll get a working windows impl that doesn't require wsl.

@Otto-AA
Collaborator

Otto-AA commented May 3, 2026

That's an interesting idea, we could pretty reliably capture most typical python configs (toml, reqs.txt, manifests, etc) and potentially even offer a mechanism for people to register their own in case they have some custom internal tooling. That way it assumes that no change happened (bumping a lib patch version practically never affects behaviour in a meaningful way) while also avoiding a completely silent pass.

I think simply informing the user about changed files (excluding ones ending with .py) would be good enough. Usually not many files change, so the user should be able to decide if that's worth a full re-run or they want to continue with cached runs.

btw, I plan on taking on #404 sometime soon, just need to set it up on my personal setup and I'll get a working windows impl that doesn't require wsl.

The main reason I discontinued working on this is that re-using workers from a pool is more brittle to errors. If I run mutant A in a process and this mutant breaks some global setup, then running mutant B in the same process will produce wrong results. The fork method executes each mutant in its own sandbox, so if mutant A breaks some global setup, mutant B won't be affected by this.

@boxed
Owner

boxed commented May 3, 2026

A method to handle the brittleness is to ensure that a full test run runs cleanly inside the recycled worker before it gets a new process, but I think that will destroy the performance gains anyway. I just don't see how to get away from using fork and keep all the upsides.

@ChristopheDuong

ChristopheDuong commented Jun 4, 2026

+1 from a production user. Our own GCS cache infrastructure is effectively no-op because of create_mutants_for_file's unconditional reset of exit_code_by_key. This PR's function-hash approach is exactly what's missing. Looking forward to it landing.

@nicklafleur
Collaborator Author

+1 from a production user. Our own GCS cache infrastructure is effectively no-op because of create_mutants_for_file's unconditional reset of exit_code_by_key. This PR's function-hash approach is exactly what's missing. Looking forward to it landing.

hey @ChristopheDuong, sorry about the delay in merging this; it's been a crazy couple of weeks for me with team changes and various other stuff. I intend to get this rebased and merged this weekend at the latest. Thanks for the interest!

@nicklafleur nicklafleur force-pushed the nicklafleur/function_hashing branch 3 times, most recently from b73def9 to 7c3ecb0 Compare June 6, 2026 01:23
@nicklafleur nicklafleur closed this Jun 6, 2026
@nicklafleur nicklafleur force-pushed the nicklafleur/function_hashing branch from e83c17f to e92d763 Compare June 6, 2026 14:57
@nicklafleur nicklafleur reopened this Jun 6, 2026
@nicklafleur
Collaborator Author

@Otto-AA I implemented the optional git-based tracking, as well as a few other config knobs for adding/excluding files from the caching. I ended up removing the metadata tracking for now, until I decide what direction I want to go with that.

cc @ChristopheDuong @percy-raskova


Copilot AI left a comment

Pull request overview

This PR introduces incremental mutation testing to mutmut by caching per-function mutation results, tracking transitive dependencies via a runtime call graph, and invalidating caches when relevant config or non-Python dependency files change (optionally detected via git). It also adds an end-to-end benchmark project to exercise performance characteristics at ~1k mutants.

Changes:

  • Add per-function AST hashing and persist hashes in per-file .meta to preserve/reset cached mutant results more selectively.
  • Record runtime caller→callee relationships during stats collection and invalidate dependency edges when function hashes change.
  • Detect cache-invalidating config/dependency changes (with git-backed non-.py change detection) and add an e2e benchmark project.

Reviewed changes

Copilot reviewed 39 out of 41 changed files in this pull request and generated 3 comments.

File Description
uv.lock Updates locked mutmut version metadata.
tests/test_mutation regression.py Adjusts mutate_file_contents unpacking for new return shape.
tests/test_configuration.py Extends Config construction/default assertions for new incremental/dependency settings.
tests/mutation/test_mutation.py Adds extensive unit tests for hashing, dependency tracking, config/dependency invalidation, and git change detection; updates call sites for new return shape.
tests/mutation/test_mutation_runtime.py Updates runtime tests for new mutate_file_contents return shape.
src/mutmut/utils/format_utils.py Adds helper to derive module name from mangled function keys (used for stale-stats cleanup).
src/mutmut/state.py Introduces a singleton state container for function hashes, dependencies, and change-detection baselines.
src/mutmut/mutation/trampoline.py Propagates async-safe caller context and records dependency edges during stats runs with depth limiting.
src/mutmut/mutation/file_mutation.py Implements per-function hashing and returns hashes from mutation generation.
src/mutmut/mutation/data.py Persists hash_by_function_name alongside cached mutant results in .meta.
src/mutmut/core.py Adds ContextVar-backed call context for dependency tracking.
src/mutmut/configuration.py Adds config options and config fingerprinting for targeted cache invalidation.
src/mutmut/__main__.py Integrates incremental cache merge/reset logic, dependency tracking persistence, config/dependency invalidation, and git-based non-.py change detection.
src/mutmut/__init__.py Ensures new global state is reset with other mutmut globals.
README.rst Documents dependency/config change detection behavior and configuration options.
HISTORY.rst Adds an Unreleased changelog entry describing the new incremental features.
e2e_projects/benchmark_1k/tests/test_strings.py Adds benchmark tests for string-focused mutation targets.
e2e_projects/benchmark_1k/tests/test_returns.py Adds benchmark tests for return/assignment mutation targets.
e2e_projects/benchmark_1k/tests/test_operators.py Adds benchmark tests for operator mutation targets.
e2e_projects/benchmark_1k/tests/test_numbers.py Adds benchmark tests for numeric mutation targets.
e2e_projects/benchmark_1k/tests/test_complex.py Adds benchmark tests for deep call chains/recursion/HOF patterns.
e2e_projects/benchmark_1k/tests/test_comparisons.py Adds benchmark tests for comparison/membership/identity patterns.
e2e_projects/benchmark_1k/tests/test_booleans.py Adds benchmark tests for boolean literals/operators/conditions.
e2e_projects/benchmark_1k/tests/test_arguments.py Adds benchmark tests for argument patterns and common call shapes.
e2e_projects/benchmark_1k/tests/conftest.py Adds benchmark test delays to simulate conftest/test runtime overhead.
e2e_projects/benchmark_1k/tests/__init__.py Declares benchmark tests package.
e2e_projects/benchmark_1k/src/benchmark/strings.py Adds benchmark mutation target implementations (strings).
e2e_projects/benchmark_1k/src/benchmark/returns.py Adds benchmark mutation target implementations (returns/assignments).
e2e_projects/benchmark_1k/src/benchmark/operators.py Adds benchmark mutation target implementations (operators).
e2e_projects/benchmark_1k/src/benchmark/numbers.py Adds benchmark mutation target implementations (numbers).
e2e_projects/benchmark_1k/src/benchmark/complex.py Adds benchmark mutation target implementations (complex call patterns).
e2e_projects/benchmark_1k/src/benchmark/comparisons.py Adds benchmark mutation target implementations (comparisons).
e2e_projects/benchmark_1k/src/benchmark/booleans.py Adds benchmark mutation target implementations (booleans).
e2e_projects/benchmark_1k/src/benchmark/arguments.py Adds benchmark mutation target implementations (arguments).
e2e_projects/benchmark_1k/src/benchmark/__init__.py Adds benchmark package initializer and configurable import delay.
e2e_projects/benchmark_1k/run_benchmark.py Adds benchmark runner for comparing process isolation/warmup strategies.
e2e_projects/benchmark_1k/requirements.txt Declares benchmark project test dependency.
e2e_projects/benchmark_1k/README.md Documents benchmark usage and expected outcomes.
e2e_projects/benchmark_1k/pyproject.toml Adds benchmark project config (mutmut + build metadata).
e2e_projects/benchmark_1k/mutmut_preload.txt Lists modules for the benchmark “import” warmup strategy.
e2e_projects/benchmark_1k/benchmark_results.json Adds a sample benchmark output dataset.

Comment thread src/mutmut/__main__.py
Comment on lines 304 to 308
# source_mtime > mutant_mtime: the source file was modified after the mutant has been created
# source_mtime == mutant_mtime: only copied, otherwise the mutant file is untouched
# source_mtime < mutant_mtime: the mutations have been saved after copying; source file untouched
if source_mtime < mutant_mtime:
    # reset the mutation stats
    source_file_mutation_data = SourceFileMutationData(path=filename)
    source_file_mutation_data.load()
    for key in source_file_mutation_data.exit_code_by_key:
        source_file_mutation_data.exit_code_by_key[key] = None
    source_file_mutation_data.save()

    return FileMutationResult(unmodified=True)
"use_setproctitle", not platform.system() == "Darwin"
), # False on Mac, true otherwise as default (https://github.com/boxed/mutmut/pull/450#issuecomment-4002571055)
track_dependencies=s("track_dependencies", True),
dependency_tracking_depth=s("dependency_tracking_depth", None),
Comment on lines +24 to +31
def benchmark_test_delay():
    """Add realistic per-test runtime variance."""
    if _test_delay > 0:
        # Apply +/-10% gaussian jitter (std = 10% of mean)
        jittered = random.gauss(_test_delay, _test_delay * 0.1)
        # Clamp to 0.01s
        time.sleep(max(0.01, jittered))
    yield

5 participants