test(autobahn): add FlatKV Integration CI matrix entry #3493

Draft
wen-coding wants to merge 5 commits into main from wen/add_autobahn_flatkv_test

@wen-coding
Contributor

Summary

  • Adds a new Autobahn FlatKV Integration row to the integration-test matrix, mirroring the vanilla FlatKV Integration row with AUTOBAHN=true GIGA_EXECUTOR=true GIGA_STORAGE=true GIGA_OCC=true.
  • Hardcodes the in-container repo path (/sei-protocol/sei-chain) in deploy_flatkv_evm_fixture.sh and verify_flatkv_evm_store.sh. The previous git rev-parse --show-toplevel resolved correctly in a clean checkout but failed inside the container when the host repo was a git worktree (the .git pointer file references a host-only absolute path).
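The worktree failure described above comes from how git lays out linked worktrees: `.git` is a small pointer file of the form `gitdir: <path>` rather than a directory, and that path lives on the host. A minimal Go sketch of detecting that condition follows; the function names are hypothetical and not part of this PR, which fixes the scripts by hardcoding the path instead.

```go
package main

import (
	"os"
	"path/filepath"
	"strings"
)

// parseGitPointer extracts the target of a ".git" pointer file, whose
// content has the form "gitdir: <path>". ok is false for any other content.
// (Hypothetical helper, for illustration only.)
func parseGitPointer(content string) (gitdir string, ok bool) {
	line := strings.TrimSpace(content)
	if !strings.HasPrefix(line, "gitdir:") {
		return "", false
	}
	return strings.TrimSpace(strings.TrimPrefix(line, "gitdir:")), true
}

// worktreeGitDir reports whether repoRoot/.git is a pointer file rather than
// a directory and, if so, returns the path it points at.
func worktreeGitDir(repoRoot string) (gitdir string, isPointer bool, err error) {
	p := filepath.Join(repoRoot, ".git")
	info, err := os.Stat(p)
	if err != nil {
		return "", false, err
	}
	if info.IsDir() {
		return "", false, nil // ordinary checkout: .git is the repository itself
	}
	data, err := os.ReadFile(p)
	if err != nil {
		return "", false, err
	}
	gitdir, isPointer = parseGitPointer(string(data))
	return gitdir, isPointer, nil
}
```

If `worktreeGitDir("/sei-protocol/sei-chain")` returned a pointer whose target does not exist inside the container, any git command that needs the repository metadata would fail there, which is consistent with the behavior reported for `git rev-parse --show-toplevel`.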

Local results under Autobahn (2026-05-21)

| # | Script | rc |
|---|--------|----|
| 1 | deploy_flatkv_evm_fixture.sh | 0 |
| 2 | flatkv_evm_test.yaml | 0 |
| 3 | verify_flatkv_evm_store.sh | 0 |
| 4 | verify_flatkv_crash_recovery.sh | 0 |
| 5 | verify_flatkv_statesync_crash_recovery.sh | 0 |
| 6 | verify_flatkv_total_loss_recovery.sh | 0 |
| 7 | verify_flatkv_partial_loss_fails_loudly.sh | 1 |

Step 7 fails with a WAL-hole / snapshot-0 panic when the cosmos snapshot exporter opens FlatKV read-only after a partial-loss restart. In that scenario only state_commit/flatkv is deleted, so the chain at height N keeps writing WAL entries starting at version N while FlatKV committedVersion=0. Pushing as draft so we can confirm that the same failure surfaces in CI.
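To make the "WAL hole" condition concrete, here is a rough Go sketch of a generic continuity check between a committed version and the WAL entries that follow it. It is an illustration under assumptions, not FlatKV's actual code: the function name, the int64 versions, and the rule that replay must resume at committedVersion+1 are all hypothetical.

```go
package main

import "fmt"

// checkWALContiguous is a hypothetical helper. committed is the version the
// store was last committed at (0 for an empty store) and walVersions holds
// the WAL entry versions in ascending order. Replay is assumed to need every
// version from committed+1 onward with no gaps.
func checkWALContiguous(committed int64, walVersions []int64) error {
	next := committed + 1
	for _, v := range walVersions {
		if v < next {
			continue // already covered by the committed state
		}
		if v != next {
			return fmt.Errorf("WAL hole: expected version %d, found %d", next, v)
		}
		next++
	}
	return nil
}
```

Under these assumptions, a store with committed version 0 and a WAL that begins at version N > 1 is reported as a hole, which matches the shape of the partial-loss scenario described above.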

Test plan

  • CI: confirm steps 1–6 pass under Autobahn
  • CI: confirm step 7 surfaces the WAL-hole panic (and nothing else)
  • Decide whether to omit step 7 or fix the underlying FlatKV bug before un-drafting

@github-actions

github-actions Bot commented May 21, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
|-------|--------|------|----------|---------------|
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | May 22, 2026, 4:17 AM |

@codecov

codecov Bot commented May 22, 2026

Codecov Report

❌ Patch coverage is 30.55556% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.10%. Comparing base (d4fc9c5) to head (bd17238).

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| sei-db/state_db/consistency/check.go | 0.00% | 10 Missing ⚠️ |
| sei-tendermint/node/node.go | 0.00% | 9 Missing ⚠️ |
| sei-tendermint/internal/p2p/giga_router.go | 25.00% | 6 Missing ⚠️ |
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3493      +/-   ##
==========================================
- Coverage   59.11%   59.10%   -0.01%     
==========================================
  Files        2187     2188       +1     
  Lines      182237   182272      +35     
==========================================
+ Hits       107730   107740      +10     
- Misses      64851    64876      +25     
  Partials     9656     9656              
| Flag | Coverage Δ |
|------|------------|
| sei-chain-pr | 70.17% <30.55%> (?) |
| sei-db | 70.41% <ø> (ø) |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
|--------------------------|------------|
| sei-tendermint/internal/p2p/giga/data.go | 70.21% <100.00%> (+1.67%) ⬆️ |
| sei-tendermint/internal/p2p/giga/service.go | 100.00% <100.00%> (ø) |
| sei-tendermint/internal/p2p/giga_router.go | 68.07% <25.00%> (-1.69%) ⬇️ |
| sei-tendermint/node/node.go | 63.70% <0.00%> (-1.49%) ⬇️ |
| sei-db/state_db/consistency/check.go | 0.00% <0.00%> (ø) |

wen-coding force-pushed the wen/add_autobahn_flatkv_test branch from 35a0d3f to d5b88f5 on May 22, 2026 02:56
wen-coding and others added 5 commits May 21, 2026 21:15
Mirrors the vanilla FlatKV Integration matrix row with Autobahn's
AUTOBAHN/GIGA_EXECUTOR/GIGA_OCC env. Also hardcodes the container's
repo path in deploy_flatkv_evm_fixture.sh and verify_flatkv_evm_store.sh
so they no longer rely on `git rev-parse --show-toplevel`, which fails
inside the container when the host checkout is a git worktree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pp state

Under Tendermint the consensus replay handshake compares state.AppHash
against app.Info().LastBlockAppHash at startup and aborts with
"Did you reset Tendermint without resetting your application's data?"
on mismatch. Autobahn skips that handshake (it does not maintain
Tendermint's block/state stores), so a partial wipe of state_commit
(e.g. state_commit/flatkv removed without state-sync) goes undetected
and the node silently commits blocks with a divergent AppHash.

Add the equivalent intra-node consistency check on the Autobahn path,
using the avail prune anchor (avail_inner_a.pb / avail_inner_b.pb) as
the persisted-AppHash source. The comparison itself lives in a new
consensus-agnostic sei-db/state_db/consistency package; the engine
adapter in node.OnStart reads the persisted hash via the giga router
and the current hash via ABCI Info, panicking when they disagree.

Skips silently when the engine has no persisted AppQC yet (fresh chain
or mid-state-sync recovery) and when the app's reported version does
not match the persisted version (chain mid-catchup).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
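
The skip-or-fail rules in the commit message above can be sketched in Go as follows. This is an illustrative reduction, not the code in sei-db/state_db/consistency: the type and function names are hypothetical, and the sketch returns an error where the commit message says node.OnStart panics.

```go
package main

import (
	"bytes"
	"fmt"
)

// persistedAppQC is a hypothetical stand-in for what the engine last
// persisted (per the commit message, read from the avail prune anchor).
type persistedAppQC struct {
	Version int64
	AppHash []byte
}

// appInfo is a hypothetical subset of the ABCI Info response.
type appInfo struct {
	LastBlockHeight  int64
	LastBlockAppHash []byte
}

// checkAppHash returns an error only when both sides report the same
// version but different hashes. It skips silently in the two cases the
// commit message lists: nothing persisted yet, or versions that differ.
func checkAppHash(persisted *persistedAppQC, app appInfo) error {
	if persisted == nil {
		return nil // fresh chain or mid-state-sync recovery
	}
	if persisted.Version != app.LastBlockHeight {
		return nil // chain mid-catchup: the hashes are not comparable
	}
	if !bytes.Equal(persisted.AppHash, app.LastBlockAppHash) {
		return fmt.Errorf("AppHash mismatch at version %d: persisted %X, app reports %X",
			persisted.Version, persisted.AppHash, app.LastBlockAppHash)
	}
	return nil
}
```

A caller on the startup path would treat a non-nil error as fatal, so that a partial wipe of state_commit is caught before the node commits blocks with a divergent AppHash.
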
…tion

Rename LatestCommittedAppHash to LatestPersistedAppHash and add a TODO
explaining that the method reads through the avail prune anchor's
AppQC because PushAppHash itself never writes to disk
(NewState's WAL recovery does not repopulate appProposals). The
proper fix is direct AppHash persistence via the upcoming
sei-db/ledger_db/block.BlockDB; the consistency check should be
re-pointed at that once a writer is wired up.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds INFO logs at the giga catchup choke points so we can see what a
restarted Autobahn node is actually doing while it waits for peer blocks:

* block fetcher started / stopped with err
* per-block: waiting for QC / requesting from peers / block fetched
* outbound peer dial + handshake success (the existing log only fires on
  failure, so a healthy connection was invisible)
* inbound peer accept

Useful for diagnosing the matrix step 5/6 stalls where the Tendermint-
shaped wait_for_statesync_log_and_kill regex can't tell what an Autobahn
node is doing during catchup. INFO level because catchup is the time we
care about; idle nodes don't loop the fetcher so the log volume stays
proportional to work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Keeps only "FlatKV Integration" (vanilla) and "Autobahn FlatKV
Integration" in the integration-test matrix so CI iterations finish
in minutes instead of an hour while we diagnose the Autobahn step 5/6
catchup behavior. Revert before merging — see git history for the
full matrix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
wen-coding force-pushed the wen/add_autobahn_flatkv_test branch from 751c850 to bd17238 on May 22, 2026 04:15