[ISSUE #10521] Use madvise(MADV_RANDOM) to disable kernel read-ahead during correctMinOffset binary search #10523
…ahead during correctMinOffset binary search

correctMinOffset performs binary search on mmap'd ConsumeQueue files (random access). The kernel's default read-ahead window is aggressively large on NVMe devices, causing each page fault to pull in far more data than needed. On cloud disks where read/write bandwidth share a single quota, these read pulses squeeze CommitLog writes and cause periodic send-RT spikes.

Use madvise(MADV_RANDOM) before binary search to disable read-ahead, restore MADV_NORMAL in the finally block afterwards. Controlled by config switch correctMinOffsetMadviseEnable (default: off). Skipped on Windows where madvise is not available.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #10523 +/- ##
=============================================
- Coverage 48.18% 48.08% -0.10%
+ Complexity 13394 13369 -25
=============================================
Files 1377 1377
Lines 100730 100753 +23
Branches 13012 13019 +7
=============================================
- Hits 48536 48449 -87
- Misses 46264 46345 +81
- Partials 5930 5959 +29
RockteMQ-AI
left a comment
Review by github-manager-bot
Summary
Uses madvise(MADV_RANDOM) to disable kernel read-ahead during ConsumeQueue.correctMinOffset binary search on mmap'd files, addressing periodic disk read pulses that squeeze CommitLog writes on cloud disks.
Findings
- [Info] ConsumeQueue.java:619–640: The madvise(MADV_RANDOM) setup is placed before the try block, and the restore (MADV_NORMAL) is in the finally block. This is correct: if the setup itself throws, MADV_RANDOM was never applied, so there is nothing to restore. There is no gap between a successful madvise and entering try.
- [Info] ConsumeQueue.java:46: IS_LINUX = !MixAll.isWindows() is cached as static final. Good: this avoids repeated platform evaluation on every correctMinOffset call.
- [Info] MessageStoreConfig.java: New correctMinOffsetMadviseEnable defaults to false. Safe rollout path; operators opt in explicitly.
- [Info] ConsumeQueueTest.java: Three test scenarios (5000 entries, 10 entries, empty queue) with sequential correction calls verify that MADV_NORMAL is properly restored between invocations. Good coverage.
- [Info] LibC.java: MADV_RANDOM = 1 and madvise(Pointer, NativeLong, int) already exist in the codebase; no new native bindings needed.
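The setup-before-try, restore-in-finally structure noted in the first finding can be sketched roughly as follows. This is not the PR's code: Advisor is a hypothetical stand-in for a native madvise binding, withRandomAdvice and demo are illustrative names, and the constants are the Linux values (MADV_NORMAL = 0, MADV_RANDOM = 1). The sketch also assumes the restore is skipped when the initial call did not succeed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Sketch of the guard structure only. Advisor is a hypothetical stand-in for
// a native madvise binding; it is not RocketMQ's LibC interface.
final class MadviseGuardSketch {
    static final int MADV_NORMAL = 0; // Linux value
    static final int MADV_RANDOM = 1; // Linux value

    /** Hypothetical hook: returns 0 on success, like madvise(2). */
    interface Advisor {
        int advise(int advice);
    }

    /**
     * Runs the search with MADV_RANDOM applied when enabled, and restores
     * MADV_NORMAL afterwards even if the search throws.
     */
    static <T> T withRandomAdvice(boolean enabled, Advisor advisor, Supplier<T> search) {
        boolean applied = false;
        if (enabled) {
            // Setup sits before try: if it throws, nothing needs restoring.
            applied = advisor.advise(MADV_RANDOM) == 0;
        }
        try {
            return search.get();
        } finally {
            // Restore only if the random-access advice actually took effect.
            if (applied) {
                advisor.advise(MADV_NORMAL);
            }
        }
    }

    /** Records the advice calls made around a search that fails midway. */
    static List<Integer> demo(boolean enabled) {
        List<Integer> calls = new ArrayList<>();
        Advisor recorder = advice -> {
            calls.add(advice);
            return 0;
        };
        try {
            withRandomAdvice(enabled, recorder, () -> {
                throw new IllegalStateException("simulated failure during search");
            });
        } catch (IllegalStateException expected) {
            // By this point the finally block has already run.
        }
        return calls;
    }
}
```

With the switch on, demo records MADV_RANDOM followed by MADV_NORMAL even though the search throws; with the switch off, no advice call is made at all.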
Suggestions
- Minor: Consider adding a log.debug when madvise(MADV_RANDOM) is successfully applied (not just on failure), to aid production diagnostics when the feature is enabled. Optional.
Verdict
Well-scoped optimization with production validation data (p99 stabilized at ~4ms, zero pulses vs 7/hour before). Config-gated, properly guarded, and well-tested. LGTM.
Automated review by github-manager-bot
…id JNA arm64 UnsatisfiedLinkError
What is the purpose of the change
close #10521
ConsumeQueue.correctMinOffset performs binary search on mmap files (random access pattern). The Linux kernel default read_ahead_kb on NVMe devices is aggressively large, so each page fault during binary search pulls in far more data than actually needed, producing periodic disk read pulses.
On cloud disks where read/write bandwidth share a single quota, these read pulses squeeze CommitLog writes and cause periodic send-RT spikes.
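To make the random access pattern concrete, here is a small sketch that counts how many distinct pages a binary search reads over fixed-size entries. It makes several assumptions: a 20-byte entry laid out as long offset, int size, long tag code; a 4096-byte page; and a heap ByteBuffer standing in for the mapped file. The class and method names are illustrative and do not come from the PR.

```java
import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.Set;

final class BinarySearchPageSketch {
    static final int UNIT_SIZE = 20;   // assumed entry: long offset, int size, long tag code
    static final int PAGE_SIZE = 4096; // assumed page size

    /** Fills a buffer with entries whose offsets grow by 100 per entry. */
    static ByteBuffer build(int entries) {
        ByteBuffer buf = ByteBuffer.allocate(entries * UNIT_SIZE);
        for (int i = 0; i < entries; i++) {
            buf.putLong(i * 100L).putInt(100).putLong(0L);
        }
        return buf;
    }

    /**
     * Returns the index of the first entry whose offset is >= minOffset, or
     * entries if none qualifies. Every page read by a probe is added to pages.
     */
    static int firstAtOrAbove(ByteBuffer buf, int entries, long minOffset, Set<Integer> pages) {
        int low = 0;
        int high = entries;
        while (low < high) {
            int mid = (low + high) >>> 1;
            int pos = mid * UNIT_SIZE;
            // An 8-byte read may straddle a page boundary, so record both ends.
            pages.add(pos / PAGE_SIZE);
            pages.add((pos + Long.BYTES - 1) / PAGE_SIZE);
            if (buf.getLong(pos) < minOffset) {
                low = mid + 1;
            } else {
                high = mid;
            }
        }
        return low;
    }

    public static void main(String[] args) {
        int entries = 5000;
        ByteBuffer buf = build(entries);
        Set<Integer> pages = new HashSet<>();
        int index = firstAtOrAbove(buf, entries, 250_050L, pages);
        int totalPages = (entries * UNIT_SIZE + PAGE_SIZE - 1) / PAGE_SIZE;
        System.out.println("index=" + index + ", pages read=" + pages.size() + " of " + totalPages);
    }
}
```

The early probes land far apart, so each one reads from a different page and only a few bytes per page are used. Any neighbouring pages fetched speculatively around those probes would not be read by the search itself.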
Brief changelog
- Use madvise(MADV_RANDOM) before binary search to disable read-ahead, restore madvise(MADV_NORMAL) in the finally block
- Controlled by config switch correctMinOffsetMadviseEnable (default: off)
- Skipped on Windows where madvise is not available
- Platform check cached as static final to avoid repeated evaluation
Verifying this change