Skip to content

Fix per-proof memory leak: clear JavaParserFacade cache on dispose#3854

Merged
wadoon merged 2 commits into
mainfrom
bubel/fix-dispose-leak
Jun 21, 2026
Merged

Fix per-proof memory leak: clear JavaParserFacade cache on dispose#3854
wadoon merged 2 commits into
mainfrom
bubel/fix-dispose-leak

Conversation

@unp1

@unp1 unp1 commented Jun 20, 2026

Copy link
Copy Markdown
Member

Intended Change

This bug was found thanks to stress testing the MT PRs on 128 core :-)

Repeatedly loading, proving and disposing the same obligation in a single JVM leaked roughly a
whole proof tree per cycle — about +56.6 MB per proof, growing linearly — so a disposed Proof
was never garbage-collected. This is long-standing (it is why RunAllProofs forks a JVM per group)
and is not multithreading-specific.

Root cause

javaparser's JavaParserFacade keeps a process-global cache:

private static final Map<TypeSolver, JavaParserFacade> instances = new WeakHashMap<>();
public static synchronized JavaParserFacade get(TypeSolver ts) {
    return instances.computeIfAbsent(ts.getRoot(), JavaParserFacade::new);
}

Each value (new JavaParserFacade(typeSolver)) strongly references its own weak key. A weak
key that stays strongly reachable from its own value can never become weakly reachable, so the
entry is never evicted. KeY's JavaParserFactory type solvers (DynamicTypeSolver,
LogicalTypeSolver) are non-static inner classes that transitively reference the owning Services,
so every disposed proof's entire proof tree was pinned through this cache.

The exact GC-root chain (found with the in-process root finder added here):

static JavaParserFacade.instances  (WeakHashMap)
  → table[].value → JavaParserFacade
     .typeSolver → CombinedTypeSolver → LogicalTypeSolver.this$0
        → JavaParserFactory → Services → Proof  (+ whole tree)

Fix

A single proof registers several short-lived type-solver keys (the DynamicTypeSolver wrapper
plus every CombinedTypeSolver produced by rebuild()), and the fork exposes no per-key removal, so
JavaParserFactory.dispose() drops the cache wholesale via the public
JavaParserFacade.clearInstances(). It synchronizes on the JavaParserFacade class monitor that
get() uses, because the backing map is a plain WeakHashMap and clearInstances() is itself
unsynchronized. The facade is a pure resolution cache that is only populated while parsing Java
during loading and is rebuilt lazily on the next get, so clearing it on dispose is safe.

Proof.dispose() invokes this via the new nullable accessor Services.getJavaServiceOrNull(), which
avoids the assertion in getJavaService() for pure first-order proofs.

Result

Measured with MtRetentionProbe (8 load/prove/dispose cycles of symmArray, -Xmx2g):

Before After
Slope +56.6 MB / proof +0.1 MB / proof
Live heap (rep 0 → rep 7) 130 → 526 MB 130 → 131 MB
Verdict RETENTION flat

All eight proofs still load, prove and close after the fix, confirming that clearing the cache
between proofs does not break symbol resolution.

Type of pull request

  • Bug fix (non-breaking change which fixes an issue)
  • There are changes to the (Java) code
  • There are changes to the deployment/CI infrastructure (gradle, github, ...)

Ensuring quality

  • I made sure that introduced/changed code is well documented (javadoc and inline comments).
  • I added new test case(s) for new functionality.
  • I have tested the feature as follows:
    ./gradlew :key.core:test --tests '*MtRetentionProbe' -PstressHeap=2g \
        -Dkey.mt.retention=true -Dkey.mt.retention.reps=8
    
    verdict: flat (dispose releases per-proof state) (+0.1 MB/proof); before the fix the same
    command reports RETENTION (+56.6 MB/proof).
  • I have checked that runtime performance has not deteriorated.

Additional information and contact(s)

This change is orthogonal to the multithreading PRs (those merely mitigate the same leak for the
stress battery by forking one JVM per proof); this is the underlying fix.

Relationship to the JavaParser fork fix (wadoon/key-javaparser#7)

The true root cause is in our JavaParser fork and is addressed there by
wadoon/key-javaparser#7, which holds the JavaParserFacade.instances cache values weakly so the
self-cycle can no longer pin a TypeSolver. The two fixes are complementary:

What to do once wadoon/key-javaparser#7 is released and the JP_VERSION dependency is bumped:
this dispose()-time clearInstances() becomes belt-and-suspenders and can be relaxed. Concretely:

  1. Bump JP_VERSION in the build to the fork release containing <deleted> #7 and confirm the leak stays flat
    (re-run MtRetentionProbe; it should remain flat without the clearInstances() call).

  2. Then either:

    • remove JavaParserFactory.dispose() and its call in Proof.dispose() entirely (rely on the
      fork's weak values), or
    • keep a dispose() but switch it from the global clearInstances() to a targeted eviction of
      just this factory's type-solver entries — gentler for the IDE / multi-proof case, where clearing
      the whole shared cache on every proof disposal is wasteful.

    The global clearInstances() is used here only because the published fork exposes no per-key
    removal and a proof registers several short-lived type-solver keys; <deleted> #7 removes the need for it.
    Services.getJavaServiceOrNull() can stay regardless (it is a generally useful nullable accessor).

Two opt-in, test-only diagnostics are included as a regression guard and for future leak hunts, both
gated behind -Dkey.mt.retention=true so they never run in normal CI:

  • MtRetentionProbe — quantifies live-heap growth across disposed proofs.
  • HeapRootFinder — a MAT-free, in-process "path to GC roots" (enumerates loaded classes via the
    gcClassHistogram diagnostic command, BFS from static fields and every thread's ThreadLocalMap,
    excluding weak/soft referents). JFR's OldObjectSample only reached "VM Global / Global Object
    Handle", which is why this was needed.

The contributions within this pull request are licensed under GPLv2 (only) for inclusion in KeY.

Disposed proofs were never collected: live heap grew ~56.6 MB per
load/prove/dispose cycle of the same obligation in one JVM (linearly),
which is why RunAllProofs forks a JVM per group.

Root cause is javaparser's JavaParserFacade.instances, a
  static WeakHashMap<TypeSolver, JavaParserFacade>
whose value (new JavaParserFacade(typeSolver)) strongly references its
own weak key. A weak key that stays strongly reachable from its own
value never becomes weakly reachable, so the entry is never evicted.
KeY's JavaParserFactory type solvers transitively reference the owning
Services, so each disposed proof's whole proof tree was retained through
this cache.

A proof registers several short-lived type-solver keys (the
DynamicTypeSolver wrapper plus every CombinedTypeSolver from rebuild()),
and the fork exposes no per-key removal, so JavaParserFactory.dispose()
drops the cache wholesale via the public clearInstances(), synchronized
on the JavaParserFacade monitor that get() uses (the backing map is a
plain WeakHashMap and clearInstances() is itself unsynchronized). The
facade is a load-time-only resolution cache, rebuilt lazily, so this is
safe. Proof.dispose() invokes it via the new nullable accessor
Services.getJavaServiceOrNull() (avoids the assert in getJavaService()
for pure first-order proofs).

Measured with MtRetentionProbe (8 reps, symmArray): +56.6 MB/proof
before, +0.1 MB/proof after.
@unp1 unp1 self-assigned this Jun 20, 2026
@unp1 unp1 force-pushed the bubel/fix-dispose-leak branch from 8bc530c to 2bf95ab Compare June 20, 2026 19:39
Test-only diagnostics for the dispose leak, all gated behind
-Dkey.mt.retention=true so they never run in normal CI:

- MtRetentionProbe: loads/proves/disposes an obligation N times, forces
  GC each cycle and reports live MB per rep, a verdict, and (with
  .histo) the top live-instance growth. Serves as a regression guard
  for the dispose leak.
- HeapRootFinder: MAT-free, in-process "path to GC roots". Enumerates
  loaded classes via the gcClassHistogram diagnostic command, BFS-walks
  the object graph from their static fields and from every thread's
  ThreadLocalMap, excludes weak/soft referents, and prints the shortest
  strong reference chain to a target object. This is how the
  JavaParserFacade.instances root was identified after JFR/OldObjectSample
  only reached "VM Global / Global Object Handle".

build.gradle forwards -Dkey.mt.retention* and -PstressHeap to the test
JVM, and adds --add-opens for java.base packages when
-Dkey.mt.retention.root=true (the root finder reflects into JDK
collection internals).
@unp1 unp1 force-pushed the bubel/fix-dispose-leak branch from 2bf95ab to e15580c Compare June 20, 2026 19:45
@unp1 unp1 added the 🐞 Bug label Jun 20, 2026
@unp1 unp1 added this to the v3.0.0 milestone Jun 20, 2026
@wadoon wadoon enabled auto-merge June 21, 2026 20:52
@wadoon wadoon added this pull request to the merge queue Jun 21, 2026
Merged via the queue into main with commit f272740 Jun 21, 2026
36 checks passed
@wadoon wadoon deleted the bubel/fix-dispose-leak branch June 21, 2026 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants