chore(ci): run benchmark sanity checks on size-xl-x64 runner #3012
danceratopz wants to merge 4 commits into
The standard `ubuntu-latest` runner exposes only 2 cores, so `-n auto --maxprocesses 10` was capped at 2 workers. Use the `size-xl-x64` self-hosted runner so the existing `--maxprocesses 10` cap can bind, matching `test.yaml`.
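The worker arithmetic behind this change can be sketched in a few lines of Python. This is a minimal model, not pytest-xdist's actual code: it assumes `-n auto` resolves to the detected CPU count and that `--maxprocesses` can only lower that number. The function name `effective_workers` is hypothetical; the core counts (2 and 16) are the ones quoted in this PR.

```python
from typing import Optional


def effective_workers(cpu_count: int, maxprocesses: Optional[int] = None) -> int:
    """Approximate the worker count chosen for `-n auto`.

    Assumption: `-n auto` yields the detected CPU count, and
    `--maxprocesses` acts purely as an upper bound on it.
    """
    workers = max(cpu_count, 1)
    if maxprocesses is not None:
        workers = min(workers, maxprocesses)
    return workers


# Core counts taken from the PR text: 2 on ubuntu-latest, 16 on size-xl-x64.
print(effective_workers(2, maxprocesses=10))   # cap never binds on 2 cores
print(effective_workers(16, maxprocesses=10))  # cap binds on the XL runner
print(effective_workers(16))                   # cap removed
```

Under these assumptions the cap is irrelevant on a 2-core runner and only starts to matter once the runner exposes more than 10 cores.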
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## forks/amsterdam #3012 +/- ##
================================================
Coverage 93.20% 93.20%
================================================
Files 620 620
Lines 38777 38777
Branches 3342 3342
================================================
Hits 36144 36144
Misses 1773 1773
Partials 860 860
Flags with carried forward coverage won't be shown.
The self-hosted `size-xl-x64` image lacks `build-essential`, `pkg-config`, and `libsecp256k1-dev`, so `coincurve` failed to build from source. Add the `setup-env` action, matching `test.yaml`.
Drop `enable-cache: "false"` so the benchmark `setup-uv` steps use the action default, caching uv's resolved dependencies and the from-source `coincurve` build, matching `test.yaml`.
Remove the `--maxprocesses 10` cap so `-n auto` can use all 16 cores on the `size-xl-x64` runner, now that ethereum#2751 cut per-worker peak RSS to roughly 3.4 GB. Add `--durations=100` to the fill calls to surface per-test timings and find the serial bottleneck.
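A rough memory budget shows why the per-worker RSS figure matters once the cap is lifted. The sketch below is a worst-case estimate that assumes every worker hits its peak at the same time; the 3.4 GB figure and the 16-core count come from the commit message, while the helper name `aggregate_peak_rss_gb` is hypothetical and the runner's total RAM is not stated in this PR.

```python
PEAK_RSS_GB_PER_WORKER = 3.4  # "roughly 3.4 GB" per the commit message


def aggregate_peak_rss_gb(
    workers: int, per_worker_gb: float = PEAK_RSS_GB_PER_WORKER
) -> float:
    """Worst-case total RSS, assuming all workers peak simultaneously."""
    return workers * per_worker_gb


# Compare the previous cap (10 workers) with all 16 cores in use.
for workers in (10, 16):
    total = aggregate_peak_rss_gb(workers)
    print(f"{workers} workers -> ~{total:.1f} GB peak RSS")
```

In practice workers rarely peak together, so the real footprint is likely lower than this bound.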
Thanks for jumping in with a review, @LouisTsai-Csie. It didn't lead to an improvement (which is curious). I'll revisit when I get the chance; I think we need #2693.
My friend told me that:
We only ignore these test cases so far:
An easy quick win would be to mark this test slow:
But tbh, I think there's an easy optimization to be had here that would help releases; will work on that. Perhaps it will even supersede #3058.
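One possible reading of why extra workers did not help is that a single long test bounds the wall time no matter how many workers run in parallel. The sketch below computes a simple lower bound on parallel wall time; the durations are entirely hypothetical (they are not measurements from this PR), and `wall_time_lower_bound` is an illustrative helper, not part of pytest or `fill`.

```python
from typing import Sequence


def wall_time_lower_bound(durations_s: Sequence[float], workers: int) -> float:
    """Lower bound on parallel wall time in seconds.

    No schedule can finish before the longest single test completes,
    nor faster than a perfectly balanced split of the total work.
    """
    if not durations_s:
        return 0.0
    return max(max(durations_s), sum(durations_s) / workers)


# Hypothetical suite: one 600 s test plus 200 tests of 5 s each.
durations = [600.0] + [5.0] * 200
for workers in (2, 10, 16):
    bound = wall_time_lower_bound(durations, workers)
    print(f"{workers} workers -> at least {bound:.0f} s")
```

With these made-up numbers, going from 10 to 16 workers changes nothing because the 600 s test dominates, which is the kind of case `--durations=100` is meant to expose.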

🗒️ Description
Run the benchmark `sanity-checks` jobs (`Benchmark Gas Values`, `Fixed Opcode Count CLI`, `Fixed Opcode Count Config`) on the `size-xl-x64` self-hosted runner instead of `ubuntu-latest`.

The standard `ubuntu-latest` runner only exposes 2 cores, so `fill`/`pytest` with `-n auto --maxprocesses 10` was capped at 2 workers, making `Benchmark Gas Values` take roughly 28 minutes. The XL runner has enough cores for `-n auto` to scale toward the existing `--maxprocesses 10` cap, matching the runner already used by `test.yaml`.

🔗 Related Issues or PRs
N/A.
✅ Checklist
`just static`
`type(scope):`.