Recover from AbandonedMutexException in GlobalMutex (Windows) #1504

Closed
myieye wants to merge 1 commit into master from fix/global-mutex-abandoned-recovery

myieye commented Jun 2, 2026

Collaborator

A bit of a gutsy PR perhaps.

Fixes LT-21834 and a related shutdown-time crash from WritingSystemManager.SaveGlobalWritingSystemRepository.TryGetGlobalMutex.Lock().

WindowsGlobalMutexAdapter.Wait() now catches AbandonedMutexException, traces a warning, and proceeds. Why this feels like a safe change:

  • It matches the .NET Mutex contract verbatim. Per Microsoft's AbandonedMutexException docs: "Whether or not the exception was thrown, the current thread owns the mutex, and must release it." Subsequent _mutex.ReleaseMutex() in Release() succeeds.
  • No diagnostic value is lost by catching. AbandonedMutexException carries no information about who abandoned the mutex or why — propagating it just produced an unhandled crash. The Trace.TraceWarning preserves the only signal that actually exists ("an abandonment was observed here").
  • One fix covers both reported stack traces. Both GlobalMutex.Lock() (the new shutdown crash) and GlobalMutex.InitializeAndLock() → Init(initiallyOwned: true) (LT-21834) funnel through Wait(). The LT-21834 stack shows Init rather than Wait() as the top frame because Wait() is JIT-inlined; the try/catch is preserved through inlining.
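
For illustration only, here is a minimal sketch of the catch-and-continue pattern described in the bullets above. It is not the libpalaso code: AbandonTolerantMutex, Demo, the mutex name, and the warning text are all hypothetical, and only Mutex, AbandonedMutexException, and Trace.TraceWarning are real .NET APIs. The demo abandons a named mutex from a worker thread, then acquires it through the wrapper.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical stand-in for WindowsGlobalMutexAdapter; names are assumptions.
sealed class AbandonTolerantMutex : IDisposable
{
    private readonly Mutex _mutex;

    public AbandonTolerantMutex(string name) => _mutex = new Mutex(false, name);

    // Blocks until the mutex is owned by the calling thread.
    // Returns true if a previous owner abandoned it.
    public bool Wait()
    {
        try
        {
            _mutex.WaitOne();
            return false;
        }
        catch (AbandonedMutexException)
        {
            // The wait still succeeded: this thread owns the mutex and must release it.
            Trace.TraceWarning("Acquired an abandoned mutex; continuing.");
            return true;
        }
    }

    public void Release() => _mutex.ReleaseMutex();

    public void Dispose() => _mutex.Dispose();
}

static class Demo
{
    static void Main()
    {
        string name = "demo-abandoned-" + Guid.NewGuid().ToString("N");

        // Hold a handle on this thread so the named mutex outlives the worker.
        using var adapter = new AbandonTolerantMutex(name);

        // Acquire on a worker thread and exit without releasing.
        var worker = new Thread(() =>
        {
            var raw = new Mutex(false, name);
            raw.WaitOne();
        });
        worker.Start();
        worker.Join();

        bool wasAbandoned = adapter.Wait(); // does not throw either way
        adapter.Release();                  // ownership is real, so release succeeds
        Console.WriteLine(wasAbandoned);
    }
}
```

Returning a flag from Wait() is a choice made for this sketch so the demo can report what happened; the PR description only says the adapter traces a warning and proceeds.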

As far as I understand, the one real downside of this fix is that we and our users will stop seeing these crashes. The errors are unlikely to help pinpoint a bug, but they do indicate that some code somewhere is presumably not handling the mutex correctly.


Per the .NET Mutex contract, AbandonedMutexException means the wait
succeeded and the calling thread now owns the mutex; the exception is
purely informational. The previous behaviour propagated it, crashing
every libpalaso consumer that takes the lock via `using (mutex.Lock())`.
Trace the recovery so self-abandonment by a thread in our own process
(a real bug, distinct from the cross-process case this catch absorbs)
stays visible.

Fixes FieldWorks LT-21834; also covers a recently reported shutdown-time
crash from the WritingSystemManager.Save path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
github-actions Bot commented Jun 2, 2026

Palaso Tests

     4 files  ±0       4 suites  ±0   9m 57s ⏱️ - 2m 39s
 5 103 tests +2   4 869 ✅ +2  234 💤 ±0  0 ❌ ±0 
16 621 runs  +6  15 899 ✅ +5  722 💤 +1  0 ❌ ±0 

Results for commit 32ca8a0. ± Comparison against base commit fd1cf9c.

// Acquire and intentionally don't release: thread exit marks the mutex abandoned.
var worker = new Thread(() =>
{
    var raw = new Mutex(false, name);

myieye commented Jun 18, 2026

Collaborator Author

This is obviously somewhat hairy. I'm not actually up for this right now.

myieye closed this Jun 18, 2026
@imnasnainaec

Contributor

Should an issue be opened instead?

myieye commented Jun 18, 2026

Collaborator Author

I added a comment to the Jira issue pointing at the PR.
