Skip to content

chore(ci): speed up bench-gas by filling only the blockchain_test format#3057

Draft
danceratopz wants to merge 1 commit into
ethereum:forks/amsterdamfrom
danceratopz:benchmark-smoke-test
Draft

chore(ci): speed up bench-gas by filling only the blockchain_test format#3057
danceratopz wants to merge 1 commit into
ethereum:forks/amsterdamfrom
danceratopz:benchmark-smoke-test

Conversation

@danceratopz

@danceratopz danceratopz commented Jun 29, 2026

Copy link
Copy Markdown
Member

🗒️ Description

bench-gas (the Benchmark Gas Values leg of the Benchmarking workflow) is the longest job in CI. It is a correctness smoke check: It fills the tests/benchmark/compute suite with the configured EVM (geth) and replays the fixtures through EELS, confirming the two implementations agree. It is not the released benchmark artifact, which is built separately from the release matrix in .github/configs/feature.yaml and is left untouched here.

This PR restructures the recipe into three steps:

  1. fill --generate-pre-alloc-groups: Generate the pre-allocation groups. This keeps the BlockchainEngineX code path smoke-tested, since pre-alloc generation is the part unique to the all-formats run and where its risk is concentrated.
  2. fill -m "blockchain_test and (not derived_test) and (not slow)": Fill only the blockchain_test format, the one EELS consumes.
  3. Replay the fixtures through EELS via json_loader. Added --no-dev --group test so the PyPy environment builds without the dev Rust crates, matching the fill-pypy recipe.

🔗 Related Issues or PRs

Addresses #2673. Alternative to #2693 (accepting float --gas-benchmark-values): Lowering the gas value cannot help, since it is floored at roughly 0.9M by the most expensive single benchmark op, and fill time is dominated by per-format generation rather than gas magnitude. Also an alternative to #3012 (running on a larger runner), which did not improve the time because the two-phase fill does not scale with cores.

✅ Checklist

  • All: Ran fast static checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    just static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).

The bench-gas smoke check filled all fixture formats via
`--generate-all-formats`, but `json_loader` only replays `blockchain_test`
against EELS, so the `engine` and `engine_x` fixtures were filled but never
verified, spending roughly two-thirds of the fill for no coverage.

Restructure the recipe into three steps: Generate the pre-alloc groups (which
keeps the `BlockchainEngineX` code path smoke-tested), fill only the
`blockchain_test` format, then verify against EELS. Add `--no-dev --group
test` to the PyPy verify so its environment builds without the dev Rust
crates, matching the `fill-pypy` recipe.

Addresses ethereum#2673.
@danceratopz danceratopz added C-chore Category: chore A-ci Area: Continuous Integration labels Jun 29, 2026
@danceratopz danceratopz changed the title perf(ci): speed up bench-gas by filling only the blockchain_test format chore(ci): speed up bench-gas by filling only the blockchain_test format Jun 29, 2026
@codecov

codecov Bot commented Jun 29, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.22%. Comparing base (3f888bc) to head (e8dced9).

Additional details and impacted files
@@               Coverage Diff                @@
##           forks/amsterdam    #3057   +/-   ##
================================================
  Coverage            93.22%   93.22%           
================================================
  Files                  624      624           
  Lines                36926    36926           
  Branches              3377     3377           
================================================
  Hits                 34424    34424           
  Misses                1708     1708           
  Partials               794      794           
Flag Coverage Δ
unittests 93.22% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@LouisTsai-Csie LouisTsai-Csie self-requested a review June 29, 2026 10:21
@danceratopz

Copy link
Copy Markdown
Member Author

This cuts CI run time from ~25-26 min to ~12 min, based on the run here (recent run without this change: https://github.com/ethereum/execution-specs/actions/runs/28234305627/job/83645530740).

Not filling the engine_x format for benchmark tests is less than ideal, but seems like a necessary evil for this quick win.

@danceratopz danceratopz changed the title chore(ci): speed up bench-gas by filling only the blockchain_test format chore(ci): speed up bench-gas by filling only the blockchain_test format Jun 29, 2026
@danceratopz danceratopz marked this pull request as draft June 29, 2026 11:24

@spencer-tb spencer-tb left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with this overall! Thanks for the quick wins.

Comment thread Justfile
bench-gas *args:
@mkdir -p "{{ output_dir }}/bench-gas/tmp" "{{ output_dir }}/bench-gas/logs"
@echo "==> Step 1/2: Filling benchmark fixtures with configured EVM (EVM_BIN={{ evm_bin }})"
@echo "==> Step 1/3: Generating pre-alloc groups (smoke-tests the BlockchainEngineX path)"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How essential is this phase for CI?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this checks the most fragile part of engine_x filling by checking the prealloc groups are good. For example, that no hard-coded addresses were inadvertently added to the tests (that could lead to clashes).

Comment thread Justfile
@echo "==> Step 2/3: Filling blockchain_test fixtures with configured EVM (EVM_BIN={{ evm_bin }})"
uv run fill \
--evm-bin="{{ evm_bin }}" \
--gas-benchmark-values 1 \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can use 0.9 here then I think we might as well?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a small gain and it's not entirely for free (requires PR, but the changeset is not huge):

I was leaning towards skipping this, will re-check it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-ci Area: Continuous Integration C-chore Category: chore

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants