From 2d9f6b995ad41181695dec5efdb9e94110439c3b Mon Sep 17 00:00:00 2001 From: rob Date: Wed, 13 May 2026 21:41:46 +0200 Subject: [PATCH 1/2] Added docs generated by opus4.7 --- docs/architecture/00-overview.md | 509 ++++++++++++ docs/architecture/01-event-bus.md | 331 ++++++++ docs/architecture/02-chaser-organize.md | 616 +++++++++++++++ docs/architecture/03-chaser-check.md | 470 +++++++++++ docs/architecture/04-chaser-validate.md | 601 ++++++++++++++ docs/architecture/05-chaser-confirm.md | 592 ++++++++++++++ .../architecture/06-sessions-and-protocols.md | 746 ++++++++++++++++++ docs/architecture/07-header-protocols.md | 550 +++++++++++++ docs/architecture/08-block-out-protocols.md | 531 +++++++++++++ docs/architecture/09-filter-out-70015.md | 486 ++++++++++++ docs/architecture/10-tx-protocols.md | 450 +++++++++++ docs/architecture/11-protocol-block-in-106.md | 512 ++++++++++++ docs/architecture/12-periphery-chasers.md | 512 ++++++++++++ docs/architecture/README.md | 231 ++++++ 14 files changed, 7137 insertions(+) create mode 100644 docs/architecture/00-overview.md create mode 100644 docs/architecture/01-event-bus.md create mode 100644 docs/architecture/02-chaser-organize.md create mode 100644 docs/architecture/03-chaser-check.md create mode 100644 docs/architecture/04-chaser-validate.md create mode 100644 docs/architecture/05-chaser-confirm.md create mode 100644 docs/architecture/06-sessions-and-protocols.md create mode 100644 docs/architecture/07-header-protocols.md create mode 100644 docs/architecture/08-block-out-protocols.md create mode 100644 docs/architecture/09-filter-out-70015.md create mode 100644 docs/architecture/10-tx-protocols.md create mode 100644 docs/architecture/11-protocol-block-in-106.md create mode 100644 docs/architecture/12-periphery-chasers.md create mode 100644 docs/architecture/README.md diff --git a/docs/architecture/00-overview.md b/docs/architecture/00-overview.md new file mode 100644 index 00000000..ce47dfa2 --- /dev/null +++ 
b/docs/architecture/00-overview.md @@ -0,0 +1,509 @@ +# libbitcoin-node — Architectural Overview + +> **Purpose.** This document is the top-down map of the node. It is the entry +> point for the deeper subsystem docs in this directory. It is written for two +> downstream audiences: +> +> 1. A re-implementer producing a functional (e.g. Lisp) port — needs precise +> functional decomposition and clean interfaces. +> 2. A formal-verification effort — needs explicit state machines, invariants, +> and concurrency boundaries. +> +> All non-trivial claims are anchored with `path/to/file.ext:line` citations +> into the C++ so the spec can be re-derived if the source drifts. + +--- + +## 1. Layer stack + +libbitcoin is a stack of cooperating libraries. The node sits at the top and +composes the lower layers; it adds no new I/O primitives — only the chain +state machine and the orchestration of validation across many threads. + +```mermaid +flowchart BT + sys["libbitcoin-system\n(consensus primitives, hashing, script, chain types)"] + db["libbitcoin-database\n(memory-mapped store, query API, header/tx links)"] + net["libbitcoin-network\n(P2P channels, sessions, asio strands)"] + node["libbitcoin-node\n(chasers + node-level protocols)"] + server["libbitcoin-server\n(executable; not in this repo)"] + + sys --> db + sys --> net + db --> node + net --> node + node --> server +``` + +The node owns no executable +(`README.md:19` — *"This component contains no executable as it moved up to libbitcoin-server."*). +Embedders construct one `full_node` (`include/bitcoin/node/full_node.hpp:38`) +against an externally-owned `query&` (the database handle) and a +`configuration&`, then call `start() → run() → close()`. + +--- + +## 2. 
Top-level object graph + +```mermaid +classDiagram + class full_node { + +start(handler) + +run(handler) + +close() + +organize(header|block, h) + +get_hashes / put_hashes + +notify(ec, chase, value) + +subscribe_events(handler) + +suspend / resume / fault + +prune / snapshot / reload + -event_subscriber_ event_subscriber + -memory_ block_memory + -query_ query& + -chaser_block_, chaser_header_, chaser_check_, + chaser_validate_, chaser_confirm_, + chaser_transaction_, chaser_template_, + chaser_snapshot_, chaser_storage_ + } + class network_net["network::net (base)"] + class chaser + class session + class protocol + full_node --|> network_net : extends + full_node *-- chaser : owns 9 instances + full_node *-- "1" event_subscriber : owns + network_net <.. session : attaches (manual/inbound/outbound) + session <.. protocol : creates per-channel + chaser ..> event_subscriber : pub/sub + protocol ..> event_subscriber : sub +``` + +Key facts: + +- `full_node` is a `network::net` subclass + (`include/bitcoin/node/full_node.hpp:38-40`). Networking (peer discovery, + channels, message framing) is inherited unchanged from + libbitcoin-network. The node only *adds* chasers and *overrides* session + attachment to use node-specific sessions. +- `full_node` owns the **nine chasers as direct members** + (`include/bitcoin/node/full_node.hpp:189-197`). They are constructed in + order at `full_node` construction (`src/full_node.cpp:43-51`) and live + exactly as long as the node. +- All chasers share a single `event_subscriber_` + (`include/bitcoin/node/full_node.hpp:198`) — this is the central event bus. +- The store is **not owned** by `full_node`; only a reference to a `query` is + held (`src/full_node.cpp:51`). Lifetime is the embedder's responsibility. + +--- + +## 3. 
Concurrency model + +### 3.1 Three flavours of execution context + +| Context | Scope | Where defined | +| -------------------- | ------------------------------------ | ---------------------------------------------- | +| `full_node` strand | Event-bus mutation; chaser ctor | inherited from `network::net` | +| Per-chaser strand | One strand per chaser instance | `include/bitcoin/node/chasers/chaser.hpp:160` | +| Per-channel strand | One strand per peer channel/protocol | inherited from `network::channel` | + +> **Invariant (Concurrency-1).** Every chaser method that mutates chaser state +> must be on that chaser's own strand. The base provides `strand()`, +> `stranded()`, and `POST` macros to enforce this +> (`include/bitcoin/node/chasers/chaser.hpp:62-119`). + +> **Invariant (Concurrency-2).** All `event_subscriber_` mutation (subscribe, +> notify, notify_one) happens on the `full_node` strand +> (`src/full_node.cpp:187-247`). Calls from chasers/protocols are posted into +> that strand by `full_node::notify` (`src/full_node.cpp:187-193`). + +### 3.2 Notification flow + +```mermaid +sequenceDiagram + autonumber + participant ChaserA as Chaser A (its own strand) + participant Node as full_node (its strand) + participant Bus as event_subscriber_ + participant ChaserB as Chaser B (its own strand) + + ChaserA->>Node: notify(ec, chase::X, value) + Note right of Node: posted to full_node strand + Node->>Bus: do_notify(ec, X, value) + Bus-->>ChaserB: handler invoked on Chaser B strand + Bus-->>ChaserA: handler invoked on Chaser A strand +``` + +Implementation: `full_node::notify` posts to `strand()`, then +`do_notify` calls `event_subscriber_.notify(...)` +(`src/full_node.cpp:187-201`). Each subscriber's handler was registered with +a `BIND` that re-posts to the subscriber's own strand — so handlers always +run stranded with respect to their owning chaser/protocol. 
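The three hops described above (caller to node strand, node strand to bus, bus to each subscriber's own strand) can be sketched in plain C++ without asio. This is a minimal single-threaded illustration, not the libbitcoin implementation: `toy_strand`, `toy_node`, `toy_chase`, and `run_notify_demo` are hypothetical names, a strand is approximated by a FIFO queue that runs only when drained, and the payload is reduced to an `int`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an asio strand: a FIFO of deferred tasks that
// run only when drain() is called.
class toy_strand
{
public:
    void post(std::function<void()> task)
    {
        tasks_.push_back(std::move(task));
    }

    // Run queued tasks in FIFO order and return how many ran.
    std::size_t drain()
    {
        std::size_t count = 0;
        while (!tasks_.empty())
        {
            auto task = std::move(tasks_.front());
            tasks_.pop_front();
            task();
            ++count;
        }
        return count;
    }

private:
    std::deque<std::function<void()>> tasks_;
};

// Hypothetical event identifiers (the real enum is `chase`).
enum class toy_chase { start, bump, stop };
using toy_handler = std::function<void(toy_chase, int)>;

// Hypothetical node: owns the handler list and the strand that guards it.
class toy_node
{
public:
    toy_strand& strand() { return strand_; }

    // Wrap the handler so that delivery re-posts onto the owner's strand.
    void subscribe(toy_strand& owner, toy_handler handler)
    {
        handlers_.push_back([&owner, handler](toy_chase event, int value)
        {
            owner.post([handler, event, value]() { handler(event, value); });
        });
    }

    // Callable from any context: hop onto the node strand first.
    void notify(toy_chase event, int value)
    {
        strand_.post([this, event, value]() { do_notify(event, value); });
    }

private:
    // Runs on the node strand: fan out to every wrapped handler.
    void do_notify(toy_chase event, int value)
    {
        for (const auto& handler : handlers_)
            handler(event, value);
    }

    toy_strand strand_;
    std::vector<toy_handler> handlers_;
};

// Drive the hops one at a time and record when each subscriber runs.
std::vector<std::string> run_notify_demo()
{
    toy_node node;
    toy_strand strand_a;
    toy_strand strand_b;
    std::vector<std::string> trace;

    node.subscribe(strand_a, [&trace](toy_chase, int value)
    {
        trace.push_back("A:" + std::to_string(value));
    });
    node.subscribe(strand_b, [&trace](toy_chase, int value)
    {
        trace.push_back("B:" + std::to_string(value));
    });

    node.notify(toy_chase::bump, 42);   // queued on the node strand only
    trace.push_back("notify returned");
    node.strand().drain();              // bus fans out; handlers are queued
    trace.push_back("bus drained");
    strand_a.drain();                   // subscriber A runs on its strand
    strand_b.drain();                   // subscriber B runs on its strand
    return trace;
}
```

In this sketch no handler runs inside `notify` or inside the bus fan-out; each runs only when its owner's queue is drained, which is the property the `BIND` re-post is described as providing.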
+ +> **Invariant (Concurrency-3).** A subscription is created via +> `subscribe_events` from a chaser's `start()` +> (`include/bitcoin/node/chasers/chaser.hpp:101`), which must itself be called +> on the node strand. Protocols subscribe via the async variant +> (`include/bitcoin/node/full_node.hpp:99-100`) which posts to the strand and +> uses a completer. + +### 3.3 Why per-chaser strands + +Chasers do consensus-critical, often expensive work (script execution, +database writes, snapshotting). Each strand allows the chaser to serialize +its *own* mutations while still running in parallel with the other chasers +(see `include/bitcoin/node/chasers/chaser.hpp:32-35` — *"Each chaser operates +on its own strand … allowing concurrent chaser operations to the extent that +threads are available"*). + +This is the **central source of parallelism** in the node. The chasers form +a pipeline; each stage runs on its own strand and they communicate by +publishing events. + +--- + +## 4. The event bus + +The bus is an enumerated message type `enum class chase` +(`include/bitcoin/node/chase.hpp:27-164`) carried alongside a `code` (error) +and an `event_value` variant (`include/bitcoin/node/define.hpp:79-84`, +typed payloads: `height_t`, `header_t`, `peer_t`, `object_t`, `count_t`, +`transaction_t`). + +The enum **is the specification** of inter-chaser coupling. The summary +below is the *source-verified* mapping (issuer/handler columns reflect +actual `notify(chase::...)` and `case chase::...:` sites — not the +`chase.hpp` inline doc, which has several stale entries). For every site +citation and the full discrepancy list, see +[`01-event-bus.md`](01-event-bus.md). + +⚠ in the issuer column marks a divergence from the chase.hpp comment; +*dormant* means the handler/issuer exists in source but is currently +commented out or absent — wired but not yet active. 
+ +| Event | Payload | Issuer(s) | Handler(s) | +| ----------------- | --------------- | ---------------------------------- | ------------------------------------------------ | +| **Work shuffling** | +| `start` | `height_t` | `full_node` | `check`, `validate`, `confirm` | +| `space` | `count_t` | `full_node` | `storage` (snapshot does *not* handle) | +| `snap` | `height_t` | *dormant — no live issuer* ⚠ | `snapshot` | +| `bump` | `height_t` | `organize` (template) | `check`, `validate`, `confirm` | +| `suspend` | — | `full_node` **+** `organize` ⚠ | `observer` | +| `resume` | — | `full_node` | `check`, `validate`, `confirm` | +| `starved` | `object_t` | `protocol_block_in_31800` | `check` | +| `split` | `object_t` | `check` ⚠ (not `session_outbound`) | `protocol_block_in_31800` | +| `stall` | `peer_t` | `check` ⚠ (not `session_outbound`) | `protocol_block_in_31800` | +| `purge` | `peer_t` | `check` | `protocol_block_in_31800` | +| `report` | `count_t` | *external* (executor / server) | `protocol_block_in_31800` | +| **Candidate chain** | +| `blocks` | `height_t` | `chaser_block` (via `chase_object()`) | *dormant — snapshot arm commented out* | +| `headers` | `height_t` | `chaser_header` (via `chase_object()`) | `check` | +| `download` | `count_t` | `check` | `protocol_block_in_31800` | +| `regressed` | `height_t` | `organize` (template) | `check`, `validate`, `confirm` | +| `disorganized` | `height_t` | `organize` (template) | `check`, `validate`, `confirm` | +| **Check / Identify** | +| `checked` | `height_t` | `protocol_block_in_31800` | `check`, `validate` (snapshot arm dormant) | +| `unchecked` | `header_t` | `protocol_block_in_31800` | `organize` (template) | +| **Accept / Connect** | +| `valid` | `height_t` | `validate` | `check`, `confirm` (snapshot arm dormant) | +| `unvalid` | `header_t` | `validate` | `organize` (template) | +| **Confirm (block)** | +| `confirmable` | `header_t` | `confirm` | *snapshot arm commented out — currently no live consumer* | +| 
`unconfirmable` | `header_t` | `confirm` | `organize` (template) | +| **Confirm (chain)** | +| `block` | `header_t` | `confirm` ⚠ (not `transaction`) | `snapshot`, `protocol_block_out_106`, `protocol_header_out_70012` | +| `organized` | `header_t` | `confirm` | *no live consumer* | +| `reorganized` | `header_t` | `confirm` | *no live consumer* | +| **Mining** | +| `transaction` | `transaction_t` | `transaction` | `template`, `protocol_transaction_out_106` | +| `template_` | `height_t` | *dormant — no live issuer* ⚠ | (miners, external) | +| **Stop** | +| `stop` | — | `full_node` | all (chasers + subscribing protocols) | + +> **Note for the spec.** Treat this table as the **interface boundary** +> between chasers. A formal model can represent each chaser as a process +> with a single inbox typed `chase × event_value`; the C++ implementation +> just happens to use asio strands. +> +> Discrepancies (⚠) and *dormant* entries are documented at +> [`01-event-bus.md`](01-event-bus.md#22-verified-event-reference) with +> grep methodology so the table can be re-derived mechanically when source +> changes. + +There is also a parallel, lower-volume `events` enum +(`include/bitcoin/node/events.hpp:28-65`) — these are *reporting* events +(metrics: timespans, archived/organized counts), surfaced via +`network::net::span<...>` (e.g. `src/full_node.cpp:314, 335, 360`). They are +**not** part of the inter-chaser protocol; they exist only for telemetry and +should not be modelled in the spec. + +--- + +## 5. Chaser pipeline (logical view) + +The nine chasers form a directed pipeline. The diagram below is a +*conceptual* sketch; the **authoritative, source-verified** issuer/handler +graph (with the dormant edges shown dashed) is in +[`01-event-bus.md §3`](01-event-bus.md#3-verified-issuer--handler-diagram). 
+ +```mermaid +flowchart LR + subgraph Ingress["Ingress (peer messages)"] + H_in["protocol_header_in_*"] + B_in["protocol_block_in_31800"] + end + + subgraph Organize["Organize (templated chaser_organize)"] + H[chaser_header] + B[chaser_block] + end + + subgraph Pipeline["Validation pipeline"] + Chk[chaser_check] + Val[chaser_validate] + Cnf[chaser_confirm] + end + + subgraph Periphery["Periphery"] + Snp[chaser_snapshot] + Stg[chaser_storage] + Tx[chaser_transaction] + Tpl[chaser_template] + end + + H_in --> H + B_in --> Chk + H -- "chase::headers" --> Chk + Chk -- "chase::download" --> B_in + B_in -- "chase::checked / unchecked" --> Chk + Val -- "chase::valid" --> Chk + Val -- "chase::valid / unvalid" --> Cnf + Cnf -- "chase::confirmable / unconfirmable" --> Snp + Cnf -- "chase::organized / reorganized" --> Tx + Tx -- "chase::transaction" --> Tpl + Cnf -- "chase::blocks" --> Snp + Cnf -- "chase::snap" --> Snp + Stg -- "chase::space" --> Snp +``` + +Two organization modes (mutually exclusive per build/config): + +- **Headers-first** (`config_.node.headers_first == true`): `chaser_header` + starts; headers stream in first, blocks chase them + (`src/full_node.cpp:79-81`). +- **Blocks-first**: `chaser_block` starts; full blocks are organized + directly. The choice is made at the same site (`src/full_node.cpp:79-81`). + +The selection happens once at `do_start`. Both +`chaser_block` and `chaser_header` instantiate the templated +`chaser_organize` (`include/bitcoin/node/chasers/chaser_organize.hpp:33-35`), +so the organize state machine is identical; only the unit (header vs. +block) differs. This is a strong hint that a formal spec should model +"organize" once, parameterized. + +--- + +## 6. 
Lifecycle + +### 6.1 Construction → start → run → close + +```mermaid +sequenceDiagram + autonumber + actor Embedder + participant FN as full_node + participant Net as network::net (base) + participant CH as 9 chasers + participant Bus as event_subscriber_ + + Embedder->>FN: full_node(query, config, log) + FN->>CH: construct in order (block, header, check, validate,\nconfirm, transaction, template, snapshot, storage) + Embedder->>FN: start(handler) + FN->>Net: net::start (posts to strand) + Net->>FN: do_start [stranded] + FN->>CH: chaser_X.start() (header xor block first) + Note over FN,CH: any chaser->start() failure aborts startup + FN->>Net: net::do_start (seed + manual peer service) + Net-->>Embedder: handler(success) + Embedder->>FN: run(handler) + FN->>FN: do_notify(chase::start, height_t{}) + FN->>Net: net::do_run (inbound + outbound services) + Net-->>Embedder: handler(success) + Note over FN: node is live + Embedder->>FN: close() + FN->>Net: net::close → do_close + FN->>CH: stopping(service_stopped) (non-blocking) + FN->>Bus: stop(service_stopped, chase::stop, {}) + FN->>CH: stop() (blocking, joins threadpools) +``` + +Citations: +`src/full_node.cpp:62-72` (`start`), +`src/full_node.cpp:74-95` (`do_start`), +`src/full_node.cpp:97-119` (`run` / `do_run`), +`src/full_node.cpp:121-156` (`close` / `do_close`). + +> **Invariant (Lifecycle-1).** The store must be initialised before `start`: +> `full_node::start` short-circuits with `error::store_uninitialized` if +> `query_.is_initialized()` is false (`src/full_node.cpp:64-67`). + +> **Invariant (Lifecycle-2).** `do_close` initiates non-blocking +> `stopping(...)` first, then `event_subscriber_.stop(...)`, then `close()` +> blocks on per-chaser `stop()`. Order matters: chasers must observe stop +> before the bus is torn down (`src/full_node.cpp:139-156`). 
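The ordering in Lifecycle-2 can be sketched as three explicit phases. This is only an illustration of the sequencing under simplifying assumptions: `toy_chaser` and `close_in_order` are hypothetical names, everything runs on one thread, and the blocking `stop()` is reduced to a log entry rather than a threadpool join.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical chaser stand-in that records lifecycle calls in a shared log.
struct toy_chaser
{
    std::string name;
    std::vector<std::string>& log;

    // Phase 1: non-blocking signal that shutdown has begun.
    void stopping() { log.push_back(name + ".stopping"); }

    // Phase 2: the stop event arrives; returning false means "unsubscribe me".
    bool handle_stop() { log.push_back(name + ".saw_stop"); return false; }

    // Phase 3: in the real node this would block; here it only logs.
    void stop() { log.push_back(name + ".stop"); }
};

// Run the three phases in the order the invariant requires.
std::vector<std::string> close_in_order()
{
    std::vector<std::string> log;
    std::vector<toy_chaser> chasers{ { "check", log }, { "validate", log } };

    // Phase 1: every chaser is told to begin stopping.
    for (auto& chaser : chasers)
        chaser.stopping();

    // Phase 2: the bus broadcasts stop to every subscriber, then is cleared.
    for (auto& chaser : chasers)
        chaser.handle_stop();
    log.push_back("bus.cleared");

    // Phase 3: only now does the blocking per-chaser stop run.
    for (auto& chaser : chasers)
        chaser.stop();

    return log;
}
```

The point of the sketch is the relative order: every subscriber observes the stop event before the bus is cleared, and the blocking stop comes last.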
+ +### 6.2 Suspend / resume / fault + +`suspend(ec)` and `resume()` are pre-existing `network::net` overrides that +also emit `chase::suspend` / `chase::resume` +(`src/full_node.cpp:252-271`). `fault(ec)` is the unified +"something terminal happened" entry; it dispatches by store state +(`is_full` → `chase::space`, `is_fault` → log, otherwise resumable) and +always calls `suspend(ec)` (`src/full_node.cpp:273-295`). + +`prune` / `snapshot` / `reload` (`src/full_node.cpp:298-362`) all share a +pattern: suspend network, run the store operation, leave network suspended +on completion (caller resumes on success). Each emits a metrics event +(`events::prune_msecs`, `events::snapshot_secs`, `events::reload_msecs`). + +> **Invariant (Store-1).** During a store maintenance operation, all peer +> connections are suspended; the caller is responsible for `resume()` after +> success. A `wait_lock` event during the op causes a *renewed* `suspend` +> (`src/full_node.cpp:308-311, 329-332, 354-357`) — racy connections that +> missed the first suspend are caught here. + +--- + +## 7. Network layer at a glance + +`full_node` overrides three session-attachment hooks +(`src/full_node.cpp:444-457`): + +- `attach_manual_session` → `node::session_manual` +- `attach_inbound_session` → `node::session_inbound` +- `attach_outbound_session` → `node::session_outbound` + +Each `node::session_*` adds a `node::session` mixin +(`include/bitcoin/node/sessions/session.hpp:31-124`) which mostly forwards to +`full_node` (organize, get/put_hashes, notify, subscribe). The mixin's job is +to give every protocol instance on every channel a reference to the node's +event bus and organizers. 
+ +Protocols are versioned by Bitcoin P2P protocol number: + +| Family | Versions | Direction | +| --------- | ---------------------- | --------- | +| block | 106, 31800 | in | +| block | 106, 70012 | out | +| header | 31800, 70012 | in | +| header | 31800, 70012 | out | +| tx | 106 | in/out | +| filter | 70015 (BIP157/158) | out | +| meta | observer, peer, performer | — | + +Each protocol class lives in `include/bitcoin/node/protocols/` and is +implemented in `src/protocols/`. `protocol_block_in_31800` is the central +performance-sensitive class — it's the only one referenced from many +`chase` event handlers (download, starved, split, stall, purge, report, +checked, unchecked). It deserves its own subsystem doc. + +--- + +## 8. Memory model: the block arena + +`full_node` owns a `block_memory` controller +(`include/bitcoin/node/full_node.hpp:44, 185`) sized by +`allocation_multiple × network.threads` (`src/full_node.cpp:41`). This is +exposed to derived sessions via `get_memory()` and is the source of arena +allocations for block-shaped objects. + +The header carries an unusually strong lifetime warning +(`include/bitcoin/node/full_node.hpp:32-37`): + +> *"when full node is using block_memory controller, all shared block +> components invalidate when the block destructs. Lifetime of the block is +> assured for the extent of all methods below, however if a sub-object is +> retained by shared_ptr, beyond method completion, a copy of the block +> shared_ptr must also be retained. Taking a block or sub-object copy is +> insufficient, as copies are shallow (copy internal shared_ptr objects)."* + +> **Implication for a port.** A Lisp implementation that uses garbage +> collection should **not** replicate the arena-lifetime contract verbatim — +> it is a C++-specific optimisation. A formal model can abstract block +> contents as immutable values; the arena is below the spec layer. 
The +> closest correctness obligation is: any reference held beyond a chaser +> method must keep the root block alive. + +--- + +## 9. Failure model + +Errors flow as `std::error_code` values. The node-specific category is +defined in `include/bitcoin/node/error.hpp:32-101`. Categories: + +- **store**: `store_uninitialized`, `store_reload`, `store_prune`, `store_snapshot` +- **network**: `slow_channel`, `stalled_channel`, `exhausted_channel`, + `sacrificed_channel`, `suspended_channel`, `suspended_service` +- **blockchain**: `orphan_block`, `orphan_header`, `duplicate_block`, + `duplicate_header` +- **faults (terminal)**: `protocol1-2`, `header1`, `organize1-15`, + `validate1-8`, `confirm1-12` + +> **Note for the spec.** The `organizeN` / `validateN` / `confirmN` codes are +> *terminal* error markers (`error.hpp:62` — *"faults (terminal, code error +> and store corruption assumed)"*). Each represents an internal invariant +> violation. Each numbered code corresponds to a specific +> `BC_ASSERT`-equivalent site in the relevant chaser; locating each site is +> a prerequisite for a formal proof obligation list. **See the per-chaser +> docs (to be written) for the mapping.** + +--- + +## 10. Subsystem docs (planned) + +This overview will be followed by drill-downs at the same level of rigour +(state machines, invariants, sequence diagrams). Proposed structure: + +1. `01-event-bus.md` — every `chase` event with formal pre/post-conditions + and the C++ call sites that issue/consume it. +2. `02-chaser-organize.md` — the templated organize state machine + (shared by header/block). +3. `03-chaser-check.md` — block download orchestration + work splitting. +4. `04-chaser-validate.md` — script & consensus validation (the core + target for formal verification). +5. `05-chaser-confirm.md` — confirmation, reorg, UTXO commit. +6. `06-chaser-snapshot-storage.md` — disk-space, snapshot, prune, reload. +7. `07-chaser-transaction-template.md` — mempool + mining template. +8. 
`08-sessions.md` — inbound/outbound/manual lifecycle + suspend/resume. +9. `09-protocols-block-header.md` — versioned P2P protocols, esp. + `protocol_block_in_31800` (the chase-event-heavy class). +10. `10-protocols-tx-filter.md` — tx, filter protocols. +11. `11-memory-and-arena.md` — `block_arena` and lifetime contract. +12. `12-failure-model.md` — full enumeration of error codes to source + sites, with proof obligations. + +Each subsystem doc will explicitly call out: + +- **State variables** and their invariants +- **Pre/post-conditions** on every state-mutating operation +- **Events consumed/emitted** (typed) +- **Concurrency boundary** (which strand, what may run concurrently) +- **Source citations** with `file:line` + +--- + +## Appendix A — Glossary + +- **Chaser**: a long-running strand-confined state machine that owns one + responsibility in the blockchain pipeline (header sync, validation, etc.). +- **Organize**: the act of attaching a header or block to the candidate + chain. Implemented once for both via `chaser_organize`. +- **Candidate chain**: the chain of headers/blocks the node *prefers* but + has not yet fully confirmed. +- **Confirmed chain**: the chain of blocks that have passed all consensus + checks and been committed (UTXO-applied). +- **Strand**: an asio primitive that serializes execution of posted + callbacks; the node uses one per chaser and one per channel. +- **Event bus**: `event_subscriber_` on `full_node`; carries `chase` + enumerators between chasers and protocols. +- **Block arena**: a custom allocator owned by `full_node` that backs the + components of received blocks; lifetime is tied to the root block. diff --git a/docs/architecture/01-event-bus.md b/docs/architecture/01-event-bus.md new file mode 100644 index 00000000..d924364b --- /dev/null +++ b/docs/architecture/01-event-bus.md @@ -0,0 +1,331 @@ +# 01 — The Event Bus (`chase` events) + +> Companion to [`00-overview.md`](00-overview.md). 
This is the authoritative, +> source-verified reference for every `chase` enumerator: payload, issuer +> sites, handler sites, and known discrepancies with the inline +> `chase.hpp` documentation. +> +> Every fact below is grep-derived. Methodology and queries are in +> [§4](#4-methodology) so any future change can be re-verified mechanically. + +--- + +## 1. Subscriber model + +### 1.1 Storage + +``` +event_subscriber_ : full_node // include/bitcoin/node/full_node.hpp:198 + = network::desubscriber // define.hpp:87 +``` + +A `desubscriber` is the libbitcoin-network primitive that combines pub/sub +with explicit unsubscription via a key. All subscribers share one bus. + +### 1.2 Subscribe protocol + +There are two subscribe APIs, both eventually owning state on the +`full_node` strand: + +| API | Used by | Strand discipline | +| --------------------------------------------------------- | --------- | -------------------------------------------------------------- | +| `subscribe_events(notifier) → object_key` (sync) | chasers | Caller must already be on `full_node` strand | +| `subscribe_events(notifier, completer)` (async) | protocols | Posts to `full_node` strand internally; completer fires there | + +Chaser subscription path (`include/bitcoin/node/chasers/chaser.hpp:101`, +`src/chasers/chaser.cpp:92-94`): +``` +chaser::subscribe_events + → full_node::subscribe_events(handler) // src/full_node.cpp:219-225 + → event_subscriber_.subscribe(handler, key) +``` +The `BIND` macro that produces `notifier` re-posts the handler back to the +chaser's own strand, so handlers always execute stranded with respect to +their owning chaser. Each chaser subscribes exactly once, in its `start()`, +via `SUBSCRIBE_EVENTS(handle_event, _1, _2, _3)` — see +`include/bitcoin/node/chasers/chaser.hpp:167-168` for the macro. 
+ +Protocol subscription path (`src/protocols/protocol.cpp:72-89`): +``` +protocol::subscribe_events + → session::subscribe_events(handler, complete) // src/sessions/session.cpp:86-90 + → full_node::subscribe_events(handler, complete) // src/full_node.cpp:227-242 + → posts to strand → do_subscribe_events + → event_subscriber_.subscribe(handler, key) + → complete(ec, key) +``` +The async variant exists because a protocol is constructed on its channel +strand, not the node strand, so it must hop across. + +### 1.3 Notify protocol + +``` +chaser/protocol calls notify(ec, chase::X, value) + → full_node::notify posts to strand // src/full_node.cpp:187-193 + → do_notify on strand + → event_subscriber_.notify(ec, X, value) + → for each subscriber: re-post to subscriber's own strand +``` + +### 1.4 Unsubscribe semantics + +`full_node::unsubscribe_events(key)` does not remove the entry; it sends a +single `chase::stop` to that subscriber only +(`src/full_node.cpp:244-247`). The subscriber's `handle_event` returns +`false` on `chase::stop`, and the `desubscriber` interprets that as +removal. So: + +> **Invariant (Bus-1).** A `false` return from `handle_event` means +> "remove me". Every chaser returns `false` on `chase::stop` +> (verified: each `chaser_*.cpp` has `case chase::stop: return false;`). + +> **Invariant (Bus-2).** The bus is torn down only in `full_node::do_close`, +> which broadcasts `chase::stop` to all subscribers +> (`src/full_node.cpp:154`). After that point no further `notify` will +> deliver. + +--- + +## 2. Verified event reference + +Notation: +- **Payload** column shows the active variant member; types are in + `include/bitcoin/node/define.hpp:70-75`. +- **Issuer** is every `notify(... chase::X ...)` or `notify_one(... chase::X ...)` + call site that is not commented out. +- **Handler** is every `case chase::X:` arm that is not commented out. +- Discrepancies with the `chase.hpp` inline doc are flagged ⚠. 
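To make the Payload and Handler columns concrete, together with Invariant Bus-1 from §1.4, the following sketch shows a handler that switches on the event, reads the active variant member, and returns `false` on stop. All names here (`toy_chase`, `toy_height`, `toy_header`, `toy_value`, `toy_subscriber`) are hypothetical; in particular the distinct wrapper structs are an assumption made for illustration, and the real payload aliases in `define.hpp` may be defined differently.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical subset of the `chase` enum.
enum class toy_chase { bump, unchecked, stop };

// Hypothetical payload wrappers, kept as distinct types so the variant can
// tell them apart. This is an assumption, not the layout in define.hpp.
struct toy_height { std::uint64_t value; };
struct toy_header { std::uint32_t link; };

// std::monostate plays the role of the empty `{}` payload.
using toy_value = std::variant<std::monostate, toy_height, toy_header>;

struct toy_subscriber
{
    std::uint64_t last_height = 0;
    std::uint32_t last_header = 0;

    // Returns false to ask the bus to drop this subscription (Bus-1).
    bool handle_event(toy_chase event, const toy_value& value)
    {
        switch (event)
        {
            case toy_chase::bump:
                // Read the payload only if the expected member is active.
                if (const auto* height = std::get_if<toy_height>(&value))
                    last_height = height->value;
                return true;

            case toy_chase::unchecked:
                if (const auto* header = std::get_if<toy_header>(&value))
                    last_header = header->link;
                return true;

            case toy_chase::stop:
                return false;
        }

        // Unknown events are ignored and the subscription is kept.
        return true;
    }
};
```

Using `std::get_if` rather than `std::get` means a mismatched payload is ignored instead of throwing, which is one possible design choice for a handler that must not fail on the bus.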
+ +### 2.1 Work shuffling + +| Event | Payload | Issuer (file:line) | Handler (file:line) | +| ----------- | ---------- | ------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `start` | `height_t` | `src/full_node.cpp:115` (`do_run`) | `chaser_check.cpp:130`, `chaser_validate.cpp:77`, `chaser_confirm.cpp:75` | +| `space` | `count_t` | `src/full_node.cpp:279` (`fault` when `query_.is_full()`) | `chaser_storage.cpp:84` (`chaser_snapshot.cpp` does *not* handle `space` currently) | +| `snap` | `height_t` | **none** ⚠ — chase.hpp says issued by `confirm`, but no live `notify(chase::snap, ...)` in source | `chaser_snapshot.cpp:126` | +| `bump` | `height_t` | `chaser_organize.ipp:311` (after first non-stale candidate push) | `chaser_check.cpp:131`, `chaser_validate.cpp:78`, `chaser_confirm.cpp:76` | +| `suspend` | — (`{}`) | `src/full_node.cpp:270` (`suspend`); **also** `chaser_organize.ipp:442` (after `do_disorganize`) ⚠ | `protocol_observer.cpp:77` (only) | +| `resume` | — (`{}`) | `src/full_node.cpp:261` | `chaser_check.cpp:129`, `chaser_validate.cpp:76`, `chaser_confirm.cpp:74` | +| `starved` | `object_t` | `src/protocols/protocol_block_in_31800.cpp:423` | `chaser_check.cpp:120` | +| `split` | `object_t` | `src/chasers/chaser_check.cpp:238` (via `notify_one`) ⚠ | `src/protocols/protocol_block_in_31800.cpp:108` | +| `stall` | `peer_t` | `src/chasers/chaser_check.cpp:243` ⚠ | `src/protocols/protocol_block_in_31800.cpp:115` | +| `purge` | `peer_t` | `src/chasers/chaser_check.cpp:351` | `src/protocols/protocol_block_in_31800.cpp:123` | +| `report` | `count_t` | **external** — issued by `executor` in libbitcoin-server, not in this repo | `src/protocols/protocol_block_in_31800.cpp:139` | + +Discrepancy notes: +- ⚠ **`split`, 
`stall`**: `chase.hpp:60-66` lists `session_outbound` as the + issuer, but the actual emits are in `chaser_check`. The functional intent + matches (slow/stalled channel detection drives work redistribution), but + the issuer is mislabeled in the inline doc. +- ⚠ **`suspend`** is also issued by `chaser_organize` after a + disorganization (`chaser_organize.ipp:442`), not only by `full_node`. The + comment in chase.hpp omits this. +- ⚠ **`snap`** has no live issuer anywhere in this repo. The handler in + `chaser_snapshot.cpp:126` is wired but unreachable from the current + source. Treat as a planned/dormant event. + +### 2.2 Candidate chain + +| Event | Payload | Issuer (file:line) | Handler (file:line) | +| -------------- | ---------- | ----------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- | +| `blocks` | `height_t` | `chaser_organize.ipp:318` when `Block = system::chain::block` (via `chase_object()` helper) | None active. ⚠ `chaser_snapshot.cpp:89` has a commented-out arm. Spec-wise: handler dormant. | +| `headers` | `height_t` | `chaser_organize.ipp:318` when `Block = system::chain::header` | `chaser_check.cpp:151` | +| `download` | `count_t` | `chaser_check.cpp:409`, `chaser_check.cpp:455` | `protocol_block_in_31800.cpp:130` | +| `regressed` | `height_t` | `chaser_organize.ipp:265` | `chaser_check.cpp:142`, `chaser_validate.cpp:90`, `chaser_confirm.cpp:88` | +| `disorganized` | `height_t` | `chaser_organize.ipp:439` | `chaser_check.cpp:143`, `chaser_validate.cpp:91`, `chaser_confirm.cpp:89` | + +Notes: +- `chase_object()` is the template helper at + `chaser_organize.hpp:127-130` that selects `chase::blocks` vs + `chase::headers` based on `Block` parameter. So one source line + (`ipp:318`) is the issuer of *both* events — choose-by-template. 
+- `chase::blocks` has **no live handler** today (snapshot's handler is + commented out at `chaser_snapshot.cpp:89`). This is consistent with the + current emphasis on headers-first sync. + +### 2.3 Check / Identify + +| Event | Payload | Issuer (file:line) | Handler (file:line) | +| ----------- | ---------- | ------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------- | +| `checked` | `height_t` | `protocol_block_in_31800.cpp:348` | `chaser_check.cpp:136`, `chaser_validate.cpp:83` (snapshot arm commented out at `chaser_snapshot.cpp:90`) | +| `unchecked` | `header_t` | `protocol_block_in_31800.cpp:324` | `chaser_organize.ipp:89` (handled identically with `unvalid`, `unconfirmable`) | + +### 2.4 Accept / Connect + +| Event | Payload | Issuer (file:line) | Handler (file:line) | +| --------- | ---------- | -------------------------- | -------------------------------------------------------------------------------------------------------------- | +| `valid` | `height_t` | `chaser_validate.cpp:330` | `chaser_check.cpp:157`, `chaser_confirm.cpp:81` (snapshot arm commented out at `chaser_snapshot.cpp:99`) | +| `unvalid` | `header_t` | `chaser_validate.cpp:321` | `chaser_organize.ipp:90` (treated as disorganize trigger) | + +### 2.5 Confirm (block) + +| Event | Payload | Issuer (file:line) | Handler (file:line) | +| --------------- | ---------- | ------------------------- | -------------------------------------------------------------------- | +| `confirmable` | `header_t` | `chaser_confirm.cpp:345` | None active (snapshot arm commented out at `chaser_snapshot.cpp:108`) | +| `unconfirmable` | `header_t` | `chaser_confirm.cpp:338` | `chaser_organize.ipp:91` | + +### 2.6 Confirm (chain) and Mining + +| Event | Payload | Issuer (file:line) | Handler (file:line) | +| --------------- | --------------- | ----------------------------------- | 
------------------------------------------------------------------------------------------------ | +| `block` | `header_t` | `chaser_confirm.cpp:427` ⚠ | `chaser_snapshot.cpp:117`, `protocol_block_out_106.cpp:75`, `protocol_header_out_70012.cpp:71` | +| `organized` | `header_t` | `chaser_confirm.cpp:396` | None active (chase.hpp predicts `transaction` consumer; not wired) | +| `reorganized` | `header_t` | `chaser_confirm.cpp:363` | None active (chase.hpp predicts `transaction` consumer; not wired) | +| `transaction` | `transaction_t` | `chaser_transaction.cpp:85` | `chaser_template.cpp:67`, `protocol_transaction_out_106.cpp:72` | +| `template_` | `height_t` | **none** ⚠ — chase.hpp says `template` issues; currently dormant | (miners, external) | + +Discrepancy: +- ⚠ **`chase::block`**: `chase.hpp:138` says "Issued by 'transaction' and + handled by 'protocol_header/block_out'". In reality `chaser_confirm` + issues it (`chaser_confirm.cpp:427`), and `chaser_snapshot` is *also* a + live consumer (`chaser_snapshot.cpp:117`). + +### 2.7 Stop + +| Event | Payload | Issuer (file:line) | Handler | +| ------ | ------- | --------------------------------------------------------------------------------------------- | ------------------------ | +| `stop` | — (`{}`) | `full_node::unsubscribe_events` via `notify_one` (`src/full_node.cpp:246`); `full_node::do_close` via `event_subscriber_.stop(...)` (`src/full_node.cpp:154`) | Every chaser & protocol | + +--- + +## 3. Verified issuer / handler diagram + +This replaces and supersedes the pipeline diagram in `00-overview.md §5`, +with one node per *actual* source location rather than per chase.hpp +comment. 
+ +```mermaid +flowchart LR + %% Issuers + FN[full_node] + ORG["chaser_organize<Block>\n(template; instantiated as\nchaser_header and chaser_block)"] + CHK[chaser_check] + VAL[chaser_validate] + CNF[chaser_confirm] + TX[chaser_transaction] + PIN["protocol_block_in_31800"] + + %% Handlers (only those that introduce new edges) + SNP[chaser_snapshot] + STG[chaser_storage] + TPL[chaser_template] + OBS[protocol_observer] + POUT_B["protocol_block_out_106"] + POUT_H["protocol_header_out_70012"] + POUT_T["protocol_transaction_out_106"] + + %% Edges (event labels) + FN -- "start, resume, suspend, space, stop" --> CHK + FN -- "start, resume" --> VAL + FN -- "start, resume" --> CNF + FN -- "space" --> STG + FN -- "suspend" --> OBS + ORG -- "regressed, disorganized, bump" --> CHK + ORG -- "regressed, disorganized, bump" --> VAL + ORG -- "regressed, disorganized, bump" --> CNF + ORG -- "suspend (after disorganize)" --> OBS + ORG -- "headers" --> CHK + ORG -. "blocks (no live consumer)" .-> SNP + PIN -- "checked" --> CHK + PIN -- "checked" --> VAL + PIN -- "unchecked" --> ORG + PIN -- "starved" --> CHK + CHK -- "download, split, stall, purge" --> PIN + CHK -- "purge (also re-handles disorganize indirectly via organize)" --> ORG + VAL -- "valid" --> CHK + VAL -- "valid" --> CNF + VAL -- "unvalid" --> ORG + CNF -- "confirmable" -. "(dormant)" .-> SNP + CNF -- "unconfirmable" --> ORG + CNF -- "block" --> SNP + CNF -- "block" --> POUT_B + CNF -- "block" --> POUT_H + TX -- "transaction" --> TPL + TX -- "transaction" --> POUT_T + + style ORG fill:#eef + style CNF fill:#fee + style VAL fill:#fee + style CHK fill:#fee +``` + +(Solid = live; dashed = wired in source but currently dormant.) + +--- + +## 4. 
Methodology + +The tables above were generated by these greps from repo root: + +```sh +# Issuers (excluding notify_one) +grep -rn 'notify(.*chase::' src include | grep -v 'notify_one' + +# Issuers via notify_one +grep -rn 'notify_one(.*chase::' src include + +# Handlers +grep -rn 'case chase::' src include + +# Subscription sites +grep -rn 'SUBSCRIBE_EVENTS\|subscribe_events' src + +# Indirect issuers via the chase_object() template helper +grep -rn 'chase_object' src include +``` + +To recheck after source changes, run the same set and diff against the +tables here. Any new row must come with its source citation, and any +deletion should be cross-referenced with this doc to ensure no downstream +spec/Lisp port assumes the old behaviour. + +--- + +## 5. Specification view + +A formal model can treat the bus as follows. + +**Processes.** Each chaser and each event-subscribing protocol is a +process `P` with: +- a single inbox typed `(code, chase, event_value)` +- a private state space `Σ_P` +- a deterministic transition `δ_P : Σ_P × (code, chase, event_value) → Σ_P × Output_P` +- where `Output_P` is a finite set of `notify` calls and outbound + asynchronous actions (e.g. DB writes). + +**Bus.** The bus is broadcast-with-filter: every `notify(ec, X, v)` is +delivered to every subscriber, and each subscriber's `handle_event` +inspects the `chase` value to decide whether to act. + +**Invariants worth proving.** +1. *Liveness of stop*: after `full_node::do_close`, every process + eventually reaches a terminal state. (Follows from Bus-2 + the universal + `case chase::stop: return false;`.) +2. *Disorganize convergence*: every `chase::regressed`/`disorganized` is + eventually responded to by the consuming chasers' "rewind" branches + (`chaser_check.cpp:142-143`, `chaser_validate.cpp:90-91`, + `chaser_confirm.cpp:88-89`) before the next forward event is processed. +3. 
*No spurious validation*: `chase::valid` is emitted only after + `chaser_validate` has executed its validation routine on the height + (`chaser_validate.cpp:330` is the unique emit site). +4. *Confirm uniqueness*: `chase::confirmable`, `unconfirmable`, + `organized`, `reorganized`, `block` all have a single issuer + (`chaser_confirm`). Confirmation is centralized. + +**Dormant events (do not model as live).** +- `chase::snap` — handler exists, no issuer. +- `chase::template_` — handler is "miners" (external), no issuer in this + repo. +- `chase::report` — issuer is `executor` (external, libbitcoin-server). + +--- + +## Cross-references + +- [`00-overview.md`](00-overview.md) §3 (concurrency model) and §4 + (high-level event table). +- [`02-chaser-organize.md`](02-chaser-organize.md) — the templated state machine that + emits `bump`, `regressed`, `disorganized`, `headers`/`blocks`, + `suspend`. +- [`04-chaser-validate.md`](04-chaser-validate.md) — the unique issuer of `valid` / + `unvalid`. diff --git a/docs/architecture/02-chaser-organize.md b/docs/architecture/02-chaser-organize.md new file mode 100644 index 00000000..283c7b75 --- /dev/null +++ b/docs/architecture/02-chaser-organize.md @@ -0,0 +1,616 @@ +# 02 — `chaser_organize` (header and block organization) + +> Companion to [`00-overview.md`](00-overview.md) and +> [`01-event-bus.md`](01-event-bus.md). +> +> `chaser_organize` is the templated state machine that attaches +> incoming headers (and, in blocks-first mode, full blocks) to the +> **candidate chain**. Both `chaser_header` and `chaser_block` are +> instantiations of this template, so the state machine is *shared* — +> only a handful of hook methods differ. This makes it the single most +> important subsystem to specify cleanly for a port or formal model.
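
As a rough illustration of "one shared template, two instantiations", the sketch below picks an event tag at compile time from the `Block` parameter, in the spirit of the `chase_object()` helper cited in `01-event-bus.md` §2.2. The `header` and `block` structs, the two-value `chase` enum, and `organizer` are simplified stand-ins invented for this example; they are not the libbitcoin declarations.

```cpp
#include <type_traits>

// Hypothetical stand-ins for system::chain::header and system::chain::block.
struct header {};
struct block {};

// Hypothetical two-value subset of the chase enumeration.
enum class chase { headers, blocks };

// One template, two instantiations; only the emitted event tag differs here.
template <typename Block>
struct organizer
{
    // True when the template is instantiated for full blocks.
    static constexpr bool is_block() noexcept
    {
        return std::is_same<Block, block>::value;
    }

    // Selects the event announced after a successful organize.
    static constexpr chase chase_object() noexcept
    {
        return is_block() ? chase::blocks : chase::headers;
    }
};

using chaser_header_sketch = organizer<header>;
using chaser_block_sketch = organizer<block>;
```

Because the selection is `constexpr`, a single emit site in the shared implementation can serve both instantiations without a runtime branch.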
+ +| File | Role | +| ----------------------------------------------------------------------------------------------- | --------------------------------------------------- | +| `include/bitcoin/node/chasers/chaser_organize.hpp` | Template declaration + per-Block `static constexpr` differentiators | +| `include/bitcoin/node/impl/chasers/chaser_organize.ipp` | Template implementation (the core state machine) | +| `include/bitcoin/node/chasers/chaser_header.hpp` / `src/chasers/chaser_header.cpp` | Instantiation for `system::chain::header` | +| `include/bitcoin/node/chasers/chaser_block.hpp` / `src/chasers/chaser_block.cpp` | Instantiation for `system::chain::block` | + +--- + +## 1. State + +### 1.1 Persistent (the store, accessed via `archive()`) + +The chaser does not own the chain — the database does. It manipulates two +chain projections exposed by `database::query`: + +| Projection | Mutators called from organize | Read methods used | +| --------------- | ------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------ | +| Candidate chain | `push_candidate(link)`, `pop_candidate()` | `get_top_candidate`, `to_candidate(height)`, `is_candidate_header(link)`, `get_candidate_fork(out fork_point) → header_links` | +| Confirmed chain | — (organize never writes confirmed; see chaser_confirm) | `get_confirmed_fork(fork) → header_links` | +| Header records | `set_code(out link, block, ctx, milestone, checked)`, `set_block_unconfirmable(link)` | `to_header(hash)`, `to_parent(link)`, `is_unconfirmable(link)`, `get_header_state(id)`, `get_header_key(link)`, `get_chain_state(...)` | + +> **Invariant (Organize-Store-1).** Every store mutation in organize is +> performed on the chaser's own strand +> (`include/bitcoin/node/impl/chasers/chaser_organize.ipp:115, 330` — both +> `do_organize` and `do_disorganize` open with `BC_ASSERT(stranded())`). 
+> Concurrent writes from other chasers do not occur in this template's +> scope; coordination across chasers happens via the event bus. + +### 1.2 In-memory (this chaser only) + +```cpp +// chaser_organize.hpp:186-194 +const system::settings& settings_; // immutable +const system::chain::checkpoints& checkpoints_; // immutable + +bool bumped_{}; // strand-only +chain_state::cptr state_{}; // strand-only +block_tree tree_{}; // strand-only + // = unordered_map +``` + +- **`state_`** — current top of the *candidate* chain, cached as a + `chain_state` (height, hash, flags, BIP-window context). Initialised in + `start()` to top of candidate (`ipp:51-59`); updated at the end of every + successful `do_organize` (`ipp:323`); rewound by `do_disorganize` + (`ipp:434-436`). +- **`tree_`** — a memory-resident map of candidates that cannot yet be + archived: blocks that are not storable, or that sit on a branch too weak + to displace the current candidate (see §6.1). Populated by + `cache()` (`ipp:530-538`); drained by `push_block(key)` + (`ipp:518-527`). +- **`bumped_`** — latches `true` after the first time validation has been + kicked off via `chase::bump` (`ipp:304-313`). Used to fire `bump` exactly + once per "currentness" transition. + +> **Invariant (Organize-Mem-1).** `bumped_` is set on the strand and read +> on the strand. It exists to *suppress redundant validation bumps* after +> the first one in the current run, not to enforce a chain property. + +### 1.3 Concrete subclass state + +#### `chaser_header` (`src/chasers/chaser_header.cpp`) + +```cpp +const system::chain::checkpoint milestone_; // immutable, from settings +size_t active_milestone_height_{}; // strand-only +// (no node_witness_ member: not applicable to headers) +``` + +`active_milestone_height_` tracks the highest height for which a milestone +has been observed on the current candidate. See §6.2.
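
A minimal pure-function model of how `active_milestone_height_` might evolve, following the three rules summarised in §6.2 (set on a milestone match, retract when a branch reorganises below it, otherwise leave unchanged). The function names and the `milestone_on_branch` flag are assumptions made for illustration; the real `update_milestone` works on headers and `tree_` rather than on a precomputed boolean.

```cpp
#include <cstddef>

// Hypothetical model of update_milestone: returns the new active height.
// milestone_on_branch: the configured milestone was found on the new branch
// (either the new header itself or one of its ancestors in the tree).
constexpr std::size_t update_active(std::size_t active,
    std::size_t milestone_height, bool milestone_on_branch,
    std::size_t branch_point) noexcept
{
    // Rule 1 and 2: the milestone is on the branch, so it becomes active.
    if (milestone_on_branch)
        return milestone_height;

    // Rule 3: the branch forks below the active milestone, so retract.
    if (branch_point < active)
        return branch_point;

    // Otherwise the active height is unchanged.
    return active;
}

// Mirrors the documented predicate: h <= active_milestone_height_.
constexpr bool is_under_milestone(std::size_t height,
    std::size_t active) noexcept
{
    return height <= active;
}
```

Written this way, the only path on which the value decreases is the explicit retraction branch, which is the shape a monotonicity argument needs.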
+ +#### `chaser_block` (`src/chasers/chaser_block.cpp`) + +```cpp +const bool node_witness_; // immutable; affects get_block +``` + +`chaser_block` has **no milestone tracking** — +`is_under_milestone(size_t)` returns `false` unconditionally +(`chaser_block.cpp:164-167`). + +--- + +## 2. Public interface (process boundary) + +```mermaid +sequenceDiagram + autonumber + participant Peer as protocol_*_in (peer-receiving protocol) + participant Sess as session + participant FN as full_node + participant Org as chaser_organize<Block> + + Peer->>Sess: organize(block, h) + Sess->>FN: organize(block, h) + FN->>Org: organize(block, h) + Org->>Org: POST(do_organize, block, h) [own strand] + activate Org + Org->>Org: do_organize (see §3) + Org-->>Peer: h(ec, height) (via continuation chain) + deactivate Org +``` + +There are exactly **three external entry points** (state-mutating): + +1. **`organize(Block::cptr, organize_handler)`** — + `ipp:69-76` → posts `do_organize` to strand. +2. **`handle_event(code, chase, event_value)`** — bus subscriber. + On `chase::unchecked` / `chase::unvalid` / `chase::unconfirmable`, + posts `do_disorganize(header_t)` (`ipp:82-109`). +3. **`stop`** — base `chaser::stop`, drains the strand and unsubscribes. + +Plus one read-only: +4. **`tree()`** — used by subclasses to walk the tree + (`chaser_header.cpp:194-209` uses it for milestone matching). + +> **Invariant (Organize-Iface-1).** No `organize` call ever runs +> concurrently with another `organize` call or with a `do_disorganize` on +> the same chaser instance: both go through `POST(...)` to the chaser's +> strand and asio strand semantics serialise them. + +--- + +## 3. `do_organize`: the forward state machine + +This is the core. The diagram below labels each branch with the source +line and, where applicable, the `error::organize{N}` code it returns. 
+Faults marked **F** are terminal and call `fault(ec)` (which suspends the +network); non-fatal returns hand `ec` to the caller via `handler`. + +```mermaid +stateDiagram-v2 + [*] --> CHECK_CLOSED: do_organize(block, h) + CHECK_CLOSED: ipp:125-129 + CHECK_CLOSED --> RETURN_STOPPED: closed() + CHECK_CLOSED --> CHECK_TREE_DUP: open + + CHECK_TREE_DUP: ipp:131-136\nlook up hash in tree_ + CHECK_TREE_DUP --> RETURN_DUP_TREE: found + CHECK_TREE_DUP --> CHECK_STORE_DUP: not in tree + + CHECK_STORE_DUP: ipp:138-143\nduplicate(out height, hash) + CHECK_STORE_DUP --> RETURN_DUP_STORE: ec ≠ success + CHECK_STORE_DUP --> CHECK_PARENT_BAD: ec == success + + CHECK_PARENT_BAD: ipp:148-154\nis_unconfirmable(to_header(prev)) + CHECK_PARENT_BAD --> RETURN_BAD_PARENT: true + CHECK_PARENT_BAD --> GET_PARENT_STATE: false + + GET_PARENT_STATE: ipp:156-162\nget_chain_state(prev) + GET_PARENT_STATE --> RETURN_ORPHAN: null (parent unknown) + GET_PARENT_STATE --> ROLL_STATE: parent known + + ROLL_STATE: ipp:165-166\nchain_state(*parent, header, settings) + + ROLL_STATE --> CHECK_CHECKPOINT + CHECK_CHECKPOINT: ipp:171-175\ncheckpoint::is_conflict + CHECK_CHECKPOINT --> RETURN_CHKPT_CONFLICT: conflict + CHECK_CHECKPOINT --> VALIDATE_HOOK: ok + + VALIDATE_HOOK: ipp:188-192\nvalidate(*block, *state)\n[virtual hook — §4] + VALIDATE_HOOK --> RETURN_INVALID: ec ≠ block_success + VALIDATE_HOOK --> CHECK_STORABLE: ok + + CHECK_STORABLE: ipp:195-201\nis_storable(state) [virtual hook] + CHECK_STORABLE --> CACHE_AND_RETURN: false (cache in tree_) + CHECK_STORABLE --> COMPUTE_WORK: true + + COMPUTE_WORK: ipp:206-213\nget_branch_work(out work, tree_branch, store_branch) + COMPUTE_WORK --> F_organize2: false + COMPUTE_WORK --> STRONG_CHECK: true + + STRONG_CHECK: ipp:215-222\nget_strong_branch(work, branch_point) + STRONG_CHECK --> F_organize3: false + STRONG_CHECK --> CHECK_WEAK_BRANCH: ok + + CHECK_WEAK_BRANCH: ipp:225-231\n!strong + CHECK_WEAK_BRANCH --> CACHE_AND_RETURN: weak (cache, return) + 
CHECK_WEAK_BRANCH --> UPDATE_MS: strong + + UPDATE_MS: ipp:238-241\nupdate_milestone(header, h, bp) [virtual hook] + + UPDATE_MS --> CHECK_BP_BELOW_TOP + CHECK_BP_BELOW_TOP: ipp:243-249\nbranch_point ≤ top + CHECK_BP_BELOW_TOP --> F_organize4: branch_point > top + CHECK_BP_BELOW_TOP --> POP_TO_BP: ok + + POP_TO_BP: ipp:251-260\nset_reorganized(top--) for top down to bp+1 + POP_TO_BP --> F_organize5: pop_candidate() failed + + POP_TO_BP --> EMIT_REGRESSED: regress = (branch_point < top₀) + EMIT_REGRESSED: ipp:262-266\nif regress: notify(chase::regressed, bp) + + EMIT_REGRESSED --> PUSH_STORE_BRANCH + PUSH_STORE_BRANCH: ipp:268-276\nfor each in store_branch (rev):\n set_organized(link, ++top) + PUSH_STORE_BRANCH --> F_organize6: push_candidate failed + + PUSH_STORE_BRANCH --> PUSH_TREE_BRANCH + PUSH_TREE_BRANCH: ipp:278-288\nfor each in tree_branch (rev):\n push_block(key) [archive + set_organized] + PUSH_TREE_BRANCH --> F_PUSH_KEY: code != success + + PUSH_TREE_BRANCH --> PUSH_TOP_NEW + PUSH_TOP_NEW: ipp:290-295\npush_block(*block, ctx) + PUSH_TOP_NEW --> F_PUSH_NEW: code != success + + PUSH_TOP_NEW --> CHECK_CURRENT + CHECK_CURRENT: ipp:302\nis_block() || is_current(header.timestamp()) + CHECK_CURRENT --> EMIT_NOT_CURRENT: header && !current\n(skip bump and chase_object) + CHECK_CURRENT --> EMIT_BUMP_AND_OBJ: yes + + EMIT_BUMP_AND_OBJ: ipp:304-318\nif !bumped_: notify(chase::bump);\nbumped_ = true;\nnotify(chase_object(), bp)\n(chase::headers or chase::blocks) + + EMIT_BUMP_AND_OBJ --> UPDATE_STATE: ipp:322-324 + EMIT_NOT_CURRENT --> UPDATE_STATE + UPDATE_STATE: state_ = state; handler(success, h) + UPDATE_STATE --> [*] + + RETURN_STOPPED --> [*]: handler(service_stopped, {}) + RETURN_DUP_TREE --> [*]: handler(error_duplicate(), h) + RETURN_DUP_STORE --> [*]: handler(ec, h) + RETURN_BAD_PARENT --> [*]: handler(database::block_unconfirmable, {}) + RETURN_ORPHAN --> [*]: handler(error_orphan(), {}) + RETURN_CHKPT_CONFLICT --> [*]: handler(checkpoint_conflict, h) + 
RETURN_INVALID --> [*]: handler(ec, h) + CACHE_AND_RETURN --> [*]: cache(block, state); handler(success, h) + + F_organize2 --> [*]: F → fault(organize2) + F_organize3 --> [*]: F → fault(organize3) + F_organize4 --> [*]: F → fault(organize4) + F_organize5 --> [*]: F → fault(organize5) + F_organize6 --> [*]: F → fault(organize6) + F_PUSH_KEY --> [*]: F → fault(ec)\n(may be organize14, organize15 from push_block) + F_PUSH_NEW --> [*]: F → fault(ec)\n(may be organize14, organize15) +``` + +### 3.1 Logical phases (for the spec) + +``` +PHASE LINES OBSERVABLE EFFECTS +--------------------------------- ----------------- ---------------------------------- +A. Dedup / parent check ipp:131-154 none (read-only) +B. State roll-forward ipp:156-166 none (in-memory) +C. Consensus pre-validate ipp:171-192 none (read-only validate hook) +D. Storable decision ipp:195-201 may write to tree_ +E. Branch-work accounting ipp:206-222 none +F. Strong-branch test ipp:225-231 may write to tree_ +G. Reorganize candidate chain ipp:233-295 DB writes; emit regressed +H. Currency check & emit ipp:300-319 emit bump (once) + chase_object +I. State commit ipp:321-324 in-memory state_ update; handler +``` + +The split between (D) cache-on-not-storable and (F) cache-on-not-strong is +deliberate: storable means *we'd archive it if it became strong* — +checkpoints, milestones, and (for headers) "current & sufficient work" +qualify; everything else is held only in `tree_`. 
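
The two cache decisions in phases D and F can be condensed into a small decision function. This is only a sketch: `outcome`, `is_strong`, and `decide` are hypothetical names, work is modelled as a 64-bit integer rather than the 256-bit value a real chain needs, and "strong" is assumed to mean strictly more work than the competing candidate segment.

```cpp
#include <cstdint>

// Hypothetical outcome of one do_organize call once validation has passed.
enum class outcome { cached, reorganized };

// Assumption: a branch is strong when it has strictly more work than the
// candidate segment above the branch point.
constexpr bool is_strong(std::uint64_t branch_work,
    std::uint64_t candidate_work) noexcept
{
    return branch_work > candidate_work;
}

constexpr outcome decide(bool storable, std::uint64_t branch_work,
    std::uint64_t candidate_work) noexcept
{
    // Phase D: not storable, hold in tree_ without touching the store.
    if (!storable)
        return outcome::cached;

    // Phases E-F: storable but weak, also held in tree_.
    if (!is_strong(branch_work, candidate_work))
        return outcome::cached;

    // Phase G onward: reorganize the candidate chain.
    return outcome::reorganized;
}
```

Both cached paths report success to the caller, so a model needs a separate observable (for example, the store writes) to tell them apart from a reorganization.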
+ +### 3.2 Side-effect summary table + +| Path | DB writes | Bus emits | Reporting events | +| ------------------------------ | -------------------------------------------------------------------------------- | ------------------------------------------------------ | ------------------------------------------------------------------------- | +| Cache (weak or not-storable) | none | none | none | +| Reorg without regress | `set_strong_chain` ops (via push_block) + `push_candidate` per pushed link | `chase_object()` (headers or blocks); maybe `bump` | `events::header_archived/block_archived`; `events::header_organized` | +| Reorg with regress | pop_candidate (top→bp+1) + pushes above | `chase::regressed`, then `chase_object()`; maybe `bump`| `events::header_reorganized` for each popped; `events::*_organized` for each pushed | +| Non-current header (org. ok) | DB writes as above | *neither* `bump` nor `chase_object` | reporting events as above | + +> **Invariant (Organize-Emit-1).** `chase::bump` is emitted at most once +> per chaser instance lifetime (latched by `bumped_`, +> `ipp:304-313`). +> +> **Invariant (Organize-Emit-2).** `chase::headers` (or `chase::blocks`) +> is emitted on **every** successful organize that yields a current top, +> not only on regress. Source: emission at `ipp:318` is outside the +> `if (regress)` block. + +--- + +## 4. Hooks: where the template differs + +The state machine is identical; the seven virtual hooks tabulated below let +each instantiation plug in chain-specific semantics. See +`chaser_organize.hpp:60-83` for declarations. Below is what each +instantiation does.
+ +| Hook | `chaser_header` | `chaser_block` | +| ----------------------------- | ---------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- | +| `get_header(Block&)` | identity (`chaser_header.cpp:44-47`) | `block.header()` (`chaser_block.cpp:38-41`) | +| `get_block(out, link)` | `query.get_header(link)` (`chaser_header.cpp:49-55`) | `query.get_block(link, node_witness_)` (`chaser_block.cpp:43-49`) | +| `duplicate(out h, hash)` | If header exists: returns `block_unconfirmable` if state says so; else `duplicate_header`. (`chaser_header.cpp:57-89`) | Same idea but checks block state (`unassociated` is *not* a duplicate). (`chaser_block.cpp:51-83`) | +| `validate(block, state)` | `header.check()` + `header.accept(ctx)` (`chaser_header.cpp:91-114`) | `header.check + header.accept`; if under checkpoint: `block.identify` only; else: `block.check`, `populate`, `block.accept`, `block.connect`. (`chaser_block.cpp:85-154`) | +| `is_storable(state)` | checkpoint ∨ milestone ∨ (current ∧ hard) (`chaser_header.cpp:119-147`) | **`true`** unconditionally (`chaser_block.cpp:156-159`) | +| `is_under_milestone(h)` | `h ≤ active_milestone_height_` (`chaser_header.cpp:175-178`) | **`false`** unconditionally (`chaser_block.cpp:164-167`) | +| `update_milestone(hdr,h,bp)` | Updates `active_milestone_height_`; scans tree branch for milestone match (`chaser_header.cpp:182-221`) | **No-op**, returns false (`chaser_block.cpp:169-172`) | + +### 4.1 Spec implications + +For a formal model, these hooks are the **chain-shape parameters** of the +organize machine. 
A clean factoring: + +``` +organize(M : ChainParams) : Process + where ChainParams = { + Unit : Type, -- header | block + duplicate : Unit -> Store -> DupResult, + validate : Unit -> ChainState -> Code, -- consensus pre-check + is_storable : ChainState -> Bool, -- gating on tree promotion + milestone : Heightℕ -> Bool -- monotone height predicate + } +``` + +`chaser_block::validate` notably includes `block.connect(ctx)` (i.e. UTXO +*consumption* checks via `populate`), but the comment block at +`chaser_block.cpp:130-151` flags that **blocks-first does not currently +emit `chase::valid`** — the validation chaser has separate work to do that +isn't possible here because the block isn't archived yet. This is why +blocks-first is the secondary path and headers-first is the production +default. + +> **Implication.** A specification of "fully validated" must come from +> `chaser_validate`, not from this template's `validate` hook — even for +> `chaser_block`. The hook is `validate-up-to-consensus-acceptance`; the +> deeper UTXO/witness obligations are completed elsewhere. + +--- + +## 5. `do_disorganize`: the rollback state machine + +Triggered by `handle_event` on `chase::unchecked`, `chase::unvalid`, +`chase::unconfirmable` (`ipp:82-109`). Argument is the `header_t` link of +the offending header. 
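
The trigger described above amounts to a pure mapping from event tag to action, which can be sketched as follows. The reduced `chase` enum and the `action` enum are hypothetical; in the real chaser the disorganize case posts `do_disorganize` to the strand, and the stop case corresponds to the `return false` that unsubscribes the handler.

```cpp
// Hypothetical subset of the chase enumeration relevant to organize.
enum class chase { unchecked, unvalid, unconfirmable, bump, stop };

// Hypothetical actions the organize event handler can take.
enum class action { ignore, disorganize, unsubscribe };

// Pure dispatch: which action does a given event trigger?
constexpr action dispatch(chase event) noexcept
{
    switch (event)
    {
        // All three failure events are handled identically.
        case chase::unchecked:
        case chase::unvalid:
        case chase::unconfirmable:
            return action::disorganize;

        // Stop ends the subscription.
        case chase::stop:
            return action::unsubscribe;

        // Everything else is not of interest to this chaser.
        default:
            return action::ignore;
    }
}
```

Keeping the dispatch free of side effects separates "which events matter" from "what the rollback does", which is convenient for both a functional port and a model checker.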
+ +```mermaid +stateDiagram-v2 + [*] --> STRAND_GUARD: do_disorganize(link) + STRAND_GUARD: ipp:330\nBC_ASSERT(stranded()) + STRAND_GUARD --> CHECK_CLOSED + CHECK_CLOSED: ipp:334-335 + CHECK_CLOSED --> RETURN: closed() + CHECK_CLOSED --> ALREADY_REORG: open + + ALREADY_REORG: ipp:337-339\n!is_candidate_header(link)\n(may have been undone) + ALREADY_REORG --> RETURN: yes (silent return) + ALREADY_REORG --> GET_FORK: no + + GET_FORK: ipp:344-346\nget_candidate_fork(out fork_point) + GET_FORK --> PART_AT_LINK + PART_AT_LINK: ipp:349-351\npart(candidates, invalids, link) + PART_AT_LINK --> RETURN: !part (degenerate) + PART_AT_LINK --> GET_FORK_STATE: ok + + GET_FORK_STATE: ipp:356-361\nget_candidate_chain_state(fork_point) + GET_FORK_STATE --> F_organize7: null + GET_FORK_STATE --> REBUILD_TREE: ok + + REBUILD_TREE: ipp:363-375\nfor each candidate below link:\n state = chain_state(state, header)\n cache(block, state) + REBUILD_TREE --> F_organize8: get_block failed + REBUILD_TREE --> POP_INVALIDS + + POP_INVALIDS: ipp:380-393\nfor each invalid (top→link):\n set_block_unconfirmable\n set_reorganized(invalid) + POP_INVALIDS --> F_organize9: set_block_unconfirmable failed + POP_INVALIDS --> F_organize10: set_reorganized failed + POP_INVALIDS --> POP_WEAK + + POP_WEAK: ipp:398-405\nfor each candidate below link (rev):\n set_reorganized(candidate) + POP_WEAK --> F_organize11: set_reorganized failed + POP_WEAK --> PUSH_CONFIRMED + + PUSH_CONFIRMED: ipp:411-421\nfor each in get_confirmed_fork(fork):\n set_organized(confirmed, ++fork_point) + PUSH_CONFIRMED --> F_organize12: set_organized failed + PUSH_CONFIRMED --> REFRESH_STATE + + REFRESH_STATE: ipp:427-432\nget_candidate_chain_state(fork_point) + REFRESH_STATE --> F_organize13: null + REFRESH_STATE --> EMIT_DISORG + + EMIT_DISORG: ipp:435-442\nstate_ = state;\nnotify(chase::disorganized, fork_point);\nnotify(chase::suspend, {}) + EMIT_DISORG --> [*] + + F_organize7 --> [*] + F_organize8 --> [*] + F_organize9 --> [*] + 
F_organize10 --> [*] + F_organize11 --> [*] + F_organize12 --> [*] + F_organize13 --> [*] + RETURN --> [*] +``` + +### 5.1 What disorganize accomplishes + +After a successful run, post-conditions: + +1. **`set_block_unconfirmable`** has been called for the offending link + and every candidate at or above it. The store now permanently marks + these headers as bad (`ipp:382`). +2. The **candidate chain top equals the previous confirmed chain top**: + every candidate above the fork point has been popped + (`ipp:380-405`); every confirmed block above the fork point has been + re-pushed as candidate (`ipp:411-421`). +3. The valid portion of the candidate branch (below the bad link) has + been re-cached into `tree_` so that subsequent organize calls can + re-promote it (`ipp:363-375`). +4. `state_` is reset to the new (lower) candidate top (`ipp:435-436`). +5. **`chase::disorganized`** is published with payload = new fork point; + downstream chasers (`check`, `validate`, `confirm`) rewind themselves + to this height. +6. **`chase::suspend`** is published; peer connections are dropped. + +> **Invariant (Organize-Disorg-1).** After `chase::disorganized` is +> emitted, the candidate top equals the confirmed top (modulo races; the +> caller-side `query` interlock guarantees the snapshot used to compute +> the fork point is consistent). This is the strongest correctness +> obligation in the organize machine — see `ipp:438` comment: *"Candidate +> is same as confirmed, reset chasers to new top."* + +> **Invariant (Organize-Disorg-2).** `chase::suspend` follows +> `chase::disorganized` *unconditionally* on success (`ipp:439-442`). +> The two emissions are not reordered. A specification can treat them as +> a single atomic publish. + +### 5.2 Re-entrancy + +Disorganize can be triggered while another disorganize is still in +flight (a second `chase::unvalid` arrives). 
The `is_candidate_header` +short-circuit at `ipp:337-339` handles this gracefully: the second call +silently returns because the offending link is no longer on the +candidate chain. There is no explicit lock. + +> **Invariant (Organize-Disorg-3).** Repeated disorganize calls on the +> same link are idempotent: the second is a no-op after +> `is_candidate_header(link)` returns false. + +--- + +## 6. Auxiliary state machines + +### 6.1 The header tree (`tree_`) + +`tree_` is an in-memory cache of blocks/headers that have arrived but +cannot yet be archived. Two arrival paths: + +1. **Not storable** (`ipp:195-201`) — header isn't current/strong enough + to gate archival. Held until enough work accumulates and `is_storable` + becomes true on a subsequent arrival. +2. **Not strong** (`ipp:225-231`) — branch work doesn't beat the current + candidate top. + +Exit paths: +- **`push_block(key)`** (`ipp:518-527`) — extracts from `tree_`, archives, + pushes to candidate. +- **`cache` in disorganize** (`ipp:374`) — re-adds blocks below a bad + link so they remain candidates after the rollback. + +> **TODO marker.** `ipp:536` notes "guard cache against memory exhaustion +> (DoS)". The tree is currently unbounded. For a spec, this is a +> liveness assumption (peer cannot drive memory unbounded); a +> formal proof of safety should require a bound on `|tree_|`. + +### 6.2 Header milestone tracking (`chaser_header` only) + +A *milestone* is a configured `(hash, height)` pair that fixes the chain. +Functionally similar to a checkpoint but mutable per node settings. + +State: `active_milestone_height_` is the height of the *most recent +milestone observed on the current candidate*. Initialised by +`initialize_milestone()` (`chaser_header.cpp:153-173`) by reading the +candidate at the milestone height and comparing hashes. + +`update_milestone` (`chaser_header.cpp:182-221`) runs during organize +and: +- Sets `active_milestone_height_` if the new header *is* the milestone. 
+- Else scans `tree_` for an ancestor that matches the milestone; sets + height if found. +- Else, if the new branch reorganises below the previously-active + milestone, retracts `active_milestone_height_` to the branch point + (`chaser_header.cpp:214-218`). + +`is_under_milestone(h)` is then simply `h ≤ active_milestone_height_`. + +> **Invariant (Organize-Milestone-1).** `active_milestone_height_` is +> non-decreasing *along the current candidate*. It only decreases when +> the candidate is reorganized below the milestone (a rare event), and +> only via `update_milestone`. + +`chaser_block` skips milestones entirely. The full block already carries +enough state to validate without the heuristic. + +--- + +## 7. Error inventory (`organizeN` and friends) + +Every numbered fault is a *terminal* (calls `chaser::fault` which +suspends the network and is meant to indicate a programmer or +store-corruption error). For a formal model, each is a proof obligation: +"this branch is unreachable under the system invariants". 
+ +| Code | Site | Meaning (paraphrased from comments) | +| ------------------ | ----------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- | +| `header1` | `chaser_header.cpp:38-40` | `initialize_milestone()` failed (mismatch between milestone height and stored candidate) | +| `organize1` | `ipp:55-59` | `start()`: `get_candidate_chain_state(top)` returned null | +| `organize2` | `ipp:209-213` | `get_branch_work` failed (store inconsistency in branch sum) | +| `organize3` | `ipp:218-222` | `get_strong_branch` failed (could not decide strength) | +| `organize4` | `ipp:243-249` | branch_point > current candidate top (impossible if state is consistent) | +| `organize5` | `ipp:255-259` | `pop_candidate()` failed during reorg | +| `organize6` | `ipp:271-276` | `push_candidate(stored.link)` failed during reorg | +| `organize7` | `ipp:357-361` | disorganize: `get_candidate_chain_state(fork_point)` returned null | +| `organize8` | `ipp:367-370` | disorganize: `get_block(candidate)` returned null (missing block for known-candidate) | +| `organize9` | `ipp:382-386` | disorganize: `set_block_unconfirmable(invalid)` failed | +| `organize10` | `ipp:388-392` | disorganize: `set_reorganized(invalid)` failed (pop while marking unconfirmable) | +| `organize11` | `ipp:400-404` | disorganize: `set_reorganized(candidate)` failed (pop weak) | +| `organize12` | `ipp:416-420` | disorganize: `set_organized(confirmed, ...)` failed (push confirmed back as candidate) | +| `organize13` | `ipp:428-432` | disorganize: `get_candidate_chain_state(fork_point)` after rebuild returned null | +| `organize14` | `ipp:515` (in `push_block`) | `set_organized` failed after a successful `set_code` (we archived but couldn't push to candidate) | +| `organize15` | `ipp:521-523` (in `push_block(key)`)| Tree extract returned no handle (item was missing when expected) | +| `stalled_channel` | 
`ipp:471-475` (in `set_organized`, NDEBUG-only check) | Candidate height isn't `top+1` (broken sequencing) | +| `suspended_channel`| `ipp:477-483` (in `set_organized`, NDEBUG-only check) | Parent of new candidate isn't current top (broken sequencing) | + +> **Spec obligation list.** A formal model should be able to discharge +> `organize2` through `organize15` as unreachable, given: +> - Store consistency invariants supplied by libbitcoin-database. +> - The strand invariant (Organize-Iface-1) excluding concurrent writes. +> - `tree_` invariant (every key has a value). +> +> `organize1` and `header1` are startup-only and reduce to "store was +> initialised at boot". + +--- + +## 8. Differences from the inline `chase.hpp` comments + +(For book-keeping; cross-reference with +[`01-event-bus.md §2`](01-event-bus.md#2-verified-event-reference).) + +- `chase.hpp:46` says **`chase::bump`** is "Issued by 'organize'". ✓ + This template is the only issuer (`ipp:311`). +- `chase.hpp:79-85` say `chase::blocks` / `chase::headers` are issued by + `block` / `header` respectively. Technically correct: both come from + this template via `chase_object()` (`ipp:318`); the template + instantiates as `chaser_block` or `chaser_header`. +- `chase.hpp:91-97` say `chase::regressed` / `disorganized` are issued + by 'organize'. ✓ (`ipp:265`, `ipp:439`). +- `chase.hpp:50` says **`chase::suspend`** is issued by `full_node`. + Incomplete: this template *also* emits it after a disorganize + (`ipp:442`). This is the source of the ⚠ in + [`01-event-bus.md §2.1`](01-event-bus.md#21-work-shuffling). + +--- + +## 9. Notes for the Lisp port + +- The state machine is naturally functional. A pure-Lisp implementation + can model `do_organize` as a pipeline of `Maybe`-returning steps + with the store as an effectful argument. The DB writes and bus emits + are the only effectful boundaries; everything between them is pure + computation over `(block, parent_state, settings)`. 
+ +- The two-mode template can be implemented in Lisp as a single function + parameterised by a record of hook closures, mirroring the spec + factoring in §4.1. + +- `tree_` is naturally a `hash-table` keyed by header hash. The DoS + concern (`§6.1 TODO`) can be enforced by a size cap with a + least-recently-used eviction. + +- `update_milestone` walks `tree_` by parent-hash chain — straightforward + recursion. + +--- + +## 10. Notes for the formal model + +- The whole state machine is *strand-confined*, so single-threaded + semantics suffice for proving local invariants. + +- The store is treated as an oracle: every numbered failure mode + `organizeN` reduces to a store-consistency assumption. + +- Strongest provable property: after every `do_organize` that returns + `success`, the candidate top has either (a) advanced strictly + forward, (b) reorganized to a different top with at least as much + work, or (c) the block is cached in `tree_`. Cases (a)/(b) coincide + with emission of `chase_object()`; (c) does not emit. + +- The "after disorganize, candidate == confirmed" post-condition + (Organize-Disorg-1) is the hinge for downstream chasers: `check`, + `validate`, `confirm` all use it to bound their rewind. + +- A useful proof structure: define `Φ(σ)` = "candidate chain in `σ` has + cumulative work `W` and parent of top has `W − proof(top)`" as a + loop invariant for the reorg phase (G) — every iteration of the + inner loops at `ipp:269-288` preserves it. + +--- + +## Cross-references + +- [`00-overview.md`](00-overview.md) §2 (object graph), §3 (concurrency), + §5 (chaser pipeline). +- [`01-event-bus.md`](01-event-bus.md) §2 (every event with citations). +- Upcoming: `03-chaser-check.md` (consumer of `headers`, producer of + `download`/`split`/`stall`/`purge`). +- Upcoming: `04-chaser-validate.md` (consumer of `checked`, producer of + `valid`/`unvalid`). +- Upcoming: `05-chaser-confirm.md` (the only writer to the *confirmed* + chain). 
diff --git a/docs/architecture/03-chaser-check.md b/docs/architecture/03-chaser-check.md new file mode 100644 index 00000000..fe1da630 --- /dev/null +++ b/docs/architecture/03-chaser-check.md @@ -0,0 +1,470 @@ +# 03 — `chaser_check` (block-download orchestration) + +> Companion to [`00-overview.md`](00-overview.md), +> [`01-event-bus.md`](01-event-bus.md), +> [`02-chaser-organize.md`](02-chaser-organize.md). +> +> `chaser_check` is the dispatcher between the **header chain** and the +> peer **block-download protocols**. Conceptually: +> +> 1. The header chaser commits to a candidate chain of *headers*. +> 2. `chaser_check` notices which heights in that chain still lack block +> bodies (are *unassociated*) and partitions them into work-maps. +> 3. Peer protocols (`protocol_block_in_31800`) pull maps via +> `get_hashes`, download the blocks, and return any leftovers via +> `put_hashes`. +> 4. As blocks arrive (`chase::checked`), the position advances; as +> validation completes (`chase::valid`), the request window slides +> forward. +> +> `chaser_check` also runs the **performance-policing loop**: peers report +> their download speed, slow peers are detected by standard deviation, +> and starved peers cause work to be split off slow peers. + +| File | Role | +| ------------------------------------------------------------------ | ---------------------------------------------------------------------- | +| `include/bitcoin/node/chasers/chaser_check.hpp` | Class declaration; private types `speeds`, `maps` | +| `src/chasers/chaser_check.cpp` | Implementation | + +--- + +## 1. 
State + +### 1.1 Configuration (immutable, set at construction) + +```cpp +// chaser_check.hpp:97-101 / chaser_check.cpp:62-69 +const float allowed_deviation_; // σ multiplier for "slow" cutoff +const size_t maximum_concurrency_; // download window size cap +const size_t maximum_height_; // hard ceiling (0 → no cap) +const size_t connections_; // target outbound count (or inbound if no outbound) +const size_t step_; // = min(max_concurrency, max_inv × connections) +``` + +`connections_` is derived from network config +(`chaser_check.cpp:45-53`): outbound + manual peer count if any, else +inbound count. `step_` is the maximum number of hashes scanned per +`set_unassociated` call (`chaser_check.cpp:55-60`). + +### 1.2 Strand-protected runtime state + +```cpp +// chaser_check.hpp:104-111 +size_t inventory_{}; // per-channel inventory size — latched after first nonzero +size_t requested_{}; // height of last request issued +size_t advanced_{}; // count of chase::valid events received (NOT a height) +job::ptr job_{}; // network::race_all token; null ⇒ "purging" +speeds speeds_{}; // map +maps maps_{}; // deque of pending download chunks +``` + +Plus `position()` from the base `chaser` — the height at which all blocks +below are associated (downloaded) on the candidate chain. + +### 1.3 Why `advanced_` is a counter, not a height + +`do_advanced` (`chaser_check.cpp:357-367`) increments `advanced_` on every +`chase::valid` event, regardless of which height was validated. The +*round* is complete when `advanced_ == requested_`. This is correct +because: + +- Every block hash that is requested will eventually be validated or the + peer will fail (and the hash is returned to the pool). +- Once `requested_` round-trips have completed, the request window can + slide forward without leaving gaps below. + +It does **not** track an in-order height; out-of-order validation is +permitted as long as the count balances. 
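The counting rule above can be sketched as a small model. This is only an illustration under stated assumptions: `request_window`, `issue`, and `on_valid` are hypothetical names rather than members of `chaser_check`, and seeding both fields from the starting position follows the `start()` behaviour described elsewhere in this document.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the request window (not the real chaser_check type).
struct request_window
{
    std::size_t requested; // height of the last request issued
    std::size_t advanced;  // count of chase::valid events, seeded like requested

    // Both fields start at the current position, so the window begins balanced.
    explicit request_window(std::size_t position)
      : requested(position), advanced(position) {}

    // New work was issued up to and including `top`.
    void issue(std::size_t top)
    {
        assert(top >= requested);
        requested = top;
    }

    // A chase::valid event arrived. The validated height is deliberately
    // ignored: only the count matters. Returns true when the round is complete.
    bool on_valid()
    {
        assert(advanced < requested); // model assumption: no surplus events
        ++advanced;
        return advanced == requested;
    }
};
```

Because `on_valid` takes no height argument, three validations arriving in any order close a three-block round, which is the property the section relies on.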
+ +> **Invariant (Check-State-1).** `position() ≤ requested_` and +> `advanced_ ≤ requested_` always hold on the strand. `set_unassociated` +> short-circuits if either falls behind (`chaser_check.cpp:492-493`). + +### 1.4 The `job_` token (purge semaphore) + +`job_` is a `network::race_all` whose completer is +`handle_purged`. Semantics: + +- A `race_all` invokes its completer **once all copies are dropped**. +- `chaser_check` holds one copy in `job_`. Every peer protocol that + calls `get_hashes` receives a copy via the `map_handler` + (`chaser_check.cpp:443`: `handler(success, get_map(), job_)`). +- When `chaser_check` drops its copy via `stop_tracking()` + (`chaser_check.cpp:184-192`), the completer fires only after every + peer has also dropped its copy — i.e., released its current download + work. + +So `job_` is effectively a *purge barrier*: "wait until every +in-flight download has either completed or surrendered, then resume". + +> **Invariant (Check-Job-1).** `job_ == nullptr` ⇒ chaser is *purging* +> (currently waiting for the barrier). While purging, all of `do_bump`, +> `do_get_hashes`, `do_put_hashes`, and `set_unassociated` are no-ops: +> +> - `purging()` guards `do_bump` (`chaser_check.cpp:381-382`). +> - `purging()` guards `do_get_hashes` (`chaser_check.cpp:440-441`). +> - `purging()` guards `do_put_hashes` (`chaser_check.cpp:451-452`). +> - `purging()` guards `set_unassociated` (`chaser_check.cpp:488-489`). +> +> The transition out of purging is exactly the `do_handle_purged` +> callback (`chaser_check.cpp:204-210`) which calls `start_tracking()` +> followed by `do_bump(0)`. + +--- + +## 2. 
Public interface + +```mermaid +classDiagram + class chaser_check { + +start() : code + +stopping(ec) + +update(channel, speed, h) + +get_hashes(h) + +put_hashes(map, h) + -handle_event(ec, chase, value) + -do_bump, do_checked, do_advanced, do_headers, do_regressed + -do_starved, do_update + -do_get_hashes, do_put_hashes + -do_handle_purged + -start_tracking, stop_tracking, purging + } +``` + +| External entry | Caller | Stranded? | Effect summary | +| ----------------------------------------- | --------------------------------- | --------- | ------------------------------------------------------------------------------------------------------- | +| `start()` | `full_node::do_start` | on node strand at startup | initialise tracking, position, requested/advanced; subscribe to bus | +| `stopping(ec)` | `chaser::stop` path | no | drop `job_` | +| `update(channel, speed, h)` | `full_node::performance` → protocols | posts to strand | record speed; detect slow channel; reply `slow_channel` if so | +| `get_hashes(h)` | `full_node::get_hashes` → protocols | posts to strand | hand a map and the `job_` pointer to the caller | +| `put_hashes(map, h)` | `full_node::put_hashes` → protocols | posts to strand | re-queue the returned map; emit `chase::download(map.size())` | +| Bus subscription via `handle_event` | `event_subscriber_` | yes (strand-bound handler) | dispatch to `do_*` | + +--- + +## 3. 
Behaviour overview (3 cooperating state machines) + +```mermaid +flowchart LR + subgraph SM1 [Position tracker] + direction TB + P0([position]) + P0 -- chase::checked(h) --> P1{h == position+1?} + P1 -- yes --> P2[do_bump: skip associated\nadvance position] + P1 -- no --> P0 + P2 --> P3[do_headers] + P3 -- "added > 0" --> P4[emit chase::download(added)] + P3 -- "added == 0" --> P0 + end + + subgraph SM2 [Request window] + direction TB + R0([requested, advanced]) + R0 -- chase::valid --> R1[advanced++] + R1 -- "advanced == requested" --> R2[do_headers] + R2 -- "set_unassociated\nadds work" --> R3[requested = new top] + R3 -- emit chase::download --> R0 + end + + subgraph SM3 [Speed policy] + direction TB + S0([speeds map]) + S0 -- update(ch, speed) --> S1{speed==max?} + S1 -- yes --> S2[erase ch\nec = exhausted_channel] + S1 -- "speed==0" --> S3[erase ch\nec = stalled_channel] + S1 -- "else" --> S4[record\nif > 3 samples\ncheck σ deviation] + S4 -- "below mean - α·σ" --> S5[ec = slow_channel] + S4 -- "else" --> S0 + end +``` + +These three are interleaved on the same strand. The Mermaid grouping is +conceptual; in code they share `do_*` handlers. + +--- + +## 4. 
Bus integration (verified) + +Inputs (from [`01-event-bus.md §2`](01-event-bus.md#2-verified-event-reference)): + +| Event | Source | Reaction | +| --------------- | ------------------------------- | ----------------------------------------------------------------------- | +| `chase::start` | `full_node` | `do_bump(0)` — kickstart | +| `chase::resume` | `full_node` | `do_bump(0)` | +| `chase::bump` | `chaser_organize` | `do_bump(0)` | +| `chase::headers`| `chaser_header` (organize) | `do_headers(branch_point)` | +| `chase::checked`| `protocol_block_in_31800` | `do_checked(height)` | +| `chase::valid` | `chaser_validate` | `do_advanced(height)` | +| `chase::regressed`/`disorganized` | `chaser_organize` | `do_regressed(branch_point)` — purge work, emit `purge` | +| `chase::starved`| `protocol_block_in_31800` | `do_starved(channel)` — emit `split` or `stall` | +| `chase::stop` | service | unsubscribe (return false) | + +Outputs: + +| Event | When | Payload | +| ------------------ | ------------------------------------------------------------------- | --------------- | +| `chase::download` | After `set_unassociated` adds new work, or `do_put_hashes` re-queues | `count_t added` | +| `chase::purge` | After `do_regressed` purges all outstanding work | `peer_t` (= branch_point) | +| `chase::split` | (`notify_one`) On starvation, target the slowest tracked channel | `object_t` (= starved channel id) | +| `chase::stall` | On starvation when **no speeds** are tracked | `peer_t` (= starved channel id) | + +> **Discrepancy with chase.hpp comments**: chase.hpp:60-66 attribute +> `split` and `stall` to `session_outbound`. Reality (verified by grep): +> both come from `chaser_check::do_starved`. + +--- + +## 5. 
The download-window state machine + +```mermaid +stateDiagram-v2 + [*] --> CONFIGURED: start()\nset_position(get_fork())\nrequested_ = advanced_ = position\nstart_tracking + + CONFIGURED --> ACTIVE: subscribe + first bump + + state ACTIVE { + [*] --> IDLE + IDLE --> BUMP_HEIGHT: chase::checked(h)\nh == pos+1 + BUMP_HEIGHT --> SCAN_ASSOC: do_bump\nadvance pos over associated + SCAN_ASSOC --> ISSUE: do_headers\nset_unassociated → emit download + ISSUE --> IDLE + IDLE --> ADV_COUNT: chase::valid(h) + ADV_COUNT --> CHECK_ROUND: advanced_++ + CHECK_ROUND --> ISSUE: advanced_ == requested_ + CHECK_ROUND --> IDLE: else + IDLE --> REQUEUE: put_hashes(map) + REQUEUE --> IDLE: emit download(map.size()) + IDLE --> HANDOUT: get_hashes() + HANDOUT --> IDLE: hand pop_front(maps_) + job_ + } + + ACTIVE --> PURGING: chase::regressed(bp) || chase::disorganized(bp)\n[bp < position]\ndo_regressed:\n set_position(bp)\n stop_tracking (drop job_)\n maps_.clear()\n emit purge(bp) + + PURGING --> DRAINED: all peers release their job_ copies\nrace_all completer fires\nhandle_purged → do_handle_purged + + DRAINED --> ACTIVE: start_tracking()\ndo_bump(0) + + ACTIVE --> STOPPED: chase::stop / chaser::stop + PURGING --> STOPPED: chase::stop / chaser::stop + STOPPED --> [*] +``` + +> **Invariant (Check-Purge-1).** A `chase::purge` emission is always +> followed (eventually, after barrier completion) by an idempotent return +> to the ACTIVE state with `position_` rewound to `branch_point` and +> `maps_` empty. While in PURGING, no work is handed out and no new work +> is computed. + +> **Invariant (Check-Purge-2).** `do_regressed` ignores regressions that +> are *above* the current position (`chaser_check.cpp:344-345`). These +> are "inconsequential": the chaser hasn't reached that height yet, so +> there's no work to purge. + +--- + +## 6. 
The performance / speed policy + +`chaser_check` is the only chaser that interacts with peer connections +beyond bus events: `protocol_block_in_31800` calls +`full_node::performance(channel, speed, handler)` periodically, which +forwards to `chaser_check::update`. The reply `code` tells the peer +whether to keep going or terminate. + +### 6.1 `do_update` decision tree + +``` +speed == max_uint64: erase(channel); return exhausted_channel +speed == 0: erase(channel); return stalled_channel +otherwise: record speed = double(speed) + if count(speeds_) ≤ 3: return success + compute mean, σ from speeds_ + if speed ≥ mean: return success + if (mean - speed) > α·σ: return slow_channel + else: return success +``` + +(`chaser_check.cpp:269-334`) + +- `exhausted_channel`: peer announces "I have nothing else useful for + you". Peer drops itself; bus knows to release work. +- `stalled_channel`: peer reports zero progress. Same effect. +- `slow_channel`: peer is statistically below the herd. Returned only + when ≥ 4 samples and the gap exceeds `allowed_deviation × σ`. + +### 6.2 `do_starved` (when a peer has no work) + +`protocol_block_in_31800` emits `chase::starved(self)` when its queue +empties. `do_starved` (`chaser_check.cpp:215-244`): + +1. Removes `self` from `speeds_` (so it can't be picked as its own + victim). +2. Picks the slowest remaining channel by `speeds_` min. +3. If found: `notify_one(slow, chase::split, self)` — direct that + single channel to split its work and stop, leaving the starved + channel something to claim. +4. If `speeds_` is empty: broadcast `chase::stall(self)` to all + download-capable channels. + +The receiver of `split`/`stall` is `protocol_block_in_31800`; see future +doc on the protocols layer for that side. + +--- + +## 7. `set_unassociated` — the work generator + +`set_unassociated` (`chaser_check.cpp:484-532`) is the heart of work +generation. It: + +1. 
Early-exit if not all previously-requested work has cleared + (`position < requested || advanced < requested`) + (`chaser_check.cpp:492-493`). +2. Latches `inventory_` on first nonzero + (`chaser_check.cpp:496-498` → `get_inventory_size`). +3. Computes a `stop` height + (`min(position + max_concurrency, maximum_height)`). +4. Loops `query.get_unassociated_above(requested_, inventory_, stop)`, + chunking each result into a `map_ptr` and pushing onto `maps_`, until + `set_map` returns false (empty map, end of unassociated headers + below `stop`). +5. Returns total count added. + +`get_inventory_size` (`chaser_check.cpp:534-543`): + +- Returns 0 if no connections OR if the *confirmed* chain isn't current + (so no inventory work issues until the node is reasonably caught up). +- Otherwise: `ceilinged_divide(unassociated_count_above(fork, step), connections)`. + +> **Invariant (Check-Inventory-1).** `inventory_` is computed at most +> once (latch on first nonzero result). Stored in +> `chaser_check.cpp:496-498`. This is intentional: peer inventory size is +> fixed for the run, after the first "currentness" of the chain. + +--- + +## 8. Error inventory + +Unlike `chaser_organize`, `chaser_check` does **not** define numbered +faults. Its errors are channel-class codes returned to the peer (the +chaser itself does not call `fault`): + +| Code returned by `do_update` | Meaning | +| ------------------------------------------- | -------------------------------------------------- | +| `error::exhausted_channel` | peer has no more work to offer | +| `error::stalled_channel` | peer reports zero progress | +| `error::slow_channel` | peer below statistical floor (mean − α·σ) | +| `network::error::service_stopped` | chaser is closed; `update` short-circuits | +| `error::success` | continue | + +The peer-side reaction (drop connection vs. continue) lives in +`protocol_block_in_31800`. + +--- + +## 9. 
Coupling diagram + +```mermaid +flowchart LR + ORG[chaser_organize\n(header)] -- "chase::headers (bp)" --> CHK[chaser_check] + ORG -- "chase::regressed / disorganized (bp)" --> CHK + VAL[chaser_validate] -- "chase::valid (h)" --> CHK + FN[full_node] -- "chase::start, resume" --> CHK + + CHK -- "chase::download (n)" --> PIN[protocol_block_in_31800] + CHK -- "chase::purge (bp)" --> PIN + CHK -- "chase::split (target)" --> PIN_one[(one channel via notify_one)] + CHK -- "chase::stall (peer)" --> PIN + + PIN -- "chase::checked (h)" --> CHK + PIN -- "chase::starved (self)" --> CHK + PIN -- "get_hashes() →" --> CHK + PIN -- "put_hashes(map) →" --> CHK + PIN -- "performance(speed) →" --> CHK +``` + +--- + +## 10. Spec view + +### 10.1 As a process + +A formal model can represent `chaser_check` as a process with: + +- **inputs**: + - bus messages (`chase::*` listed in §4) + - request RPCs: `update`, `get_hashes`, `put_hashes` (each carries a + `result_handler` continuation) +- **outputs**: + - bus messages (`chase::download`, `chase::purge`, `chase::split`, + `chase::stall`) + - returned codes to request RPCs +- **state**: + - `position`, `requested`, `advanced` ∈ ℕ + - `inventory` ∈ ℕ (latched after first nonzero) + - `purging` ∈ {true, false} -- abstracts `job_` + - `maps` : Queue of `map` -- each `map` is a non-empty set of + `(height, hash)` pairs + - `speeds` : finite map from channel-id to ℝ⁺ + +### 10.2 Safety properties to prove + +1. **No-overshoot**: `position ≤ requested` and `advanced ≤ requested` + are loop invariants. +2. **Purge soundness**: while `purging`, `maps = ∅` and no `download` + is emitted. +3. **Eventual progress**: assuming peers honour `get_hashes`/`put_hashes` + contracts, the position increases monotonically when not purging. + *Liveness*; requires fairness on peer scheduling. +4. **Split correctness**: `chase::split(self)` is sent only to + `notify_one(slow)` where `slow ≠ self` + (`chaser_check.cpp:219-221, 238`). +5. 
**Purge barrier completeness**: every `chase::purge` is eventually + followed by a `chase::download` (or by `stop`). This depends on the + `race_all` semantics and peer compliance. + +### 10.3 Liveness assumptions + +- Peers eventually release `job_` copies after `chase::purge` — the + protocols layer must guarantee this (see future doc). +- `query.get_unassociated_above` terminates and is monotone in + `requested_`. + +--- + +## 11. Notes for the Lisp port + +- The three state machines (§3) are independent in their state space + but interleaved in time. A straightforward Lisp port can use one + agent/actor per chaser with a single message-pump. +- The `race_all` token (`job_`) is a non-trivial primitive. In Lisp it + could be modelled as a `(condition-variable, ref-count)` pair: each + peer increments on `get_hashes` and decrements when done; the chaser + waits on count → 0. +- `set_unassociated` is naturally a generator/iterator (lazy sequence + of maps). + +--- + +## 12. Notes for the formal model + +- All `do_*` methods run on the strand. Race-freedom on internal state + is by construction. +- The only externally-visible interaction beyond the bus is the peer + speed/inventory protocol (§6, §7). Model the peer as a process that + responds to `slow_channel`/`exhausted_channel`/`stalled_channel` + return values by dropping its connection. +- A model of `race_all` requires a counted semaphore or refcount; this + is the only place in `chaser_check` that needs more than + single-threaded reasoning. 
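The refcount reading of `race_all` mentioned above can be sketched with a shared pointer whose pointee runs a completer on destruction. This is a hypothetical stand-in, not the libbitcoin-network type: `purge_barrier`, `job_ptr`, and the free function `start_tracking` are names invented for the illustration, and the real primitive may differ in how and where it invokes its handler.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for the race_all token (assumption, not the real API).
class purge_barrier
{
public:
    explicit purge_barrier(std::function<void()> completer)
      : completer_(std::move(completer)) {}

    purge_barrier(const purge_barrier&) = delete;
    purge_barrier& operator=(const purge_barrier&) = delete;

    // Runs exactly once, when the last shared_ptr copy is released.
    ~purge_barrier()
    {
        if (completer_)
            completer_();
    }

private:
    std::function<void()> completer_;
};

using job_ptr = std::shared_ptr<purge_barrier>;

// Chaser side: create the token that is later copied to each peer.
inline job_ptr start_tracking(std::function<void()> on_purged)
{
    return std::make_shared<purge_barrier>(std::move(on_purged));
}
```

In this model the chaser resets its own copy to enter the purging state, and the completer fires only after every peer copy is gone. The completer runs on whichever thread drops the last copy, so a real handler would presumably need to hop back onto the strand before touching chaser state.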
+ +--- + +## Cross-references + +- [`00-overview.md`](00-overview.md) §5 (chaser pipeline overview) +- [`01-event-bus.md`](01-event-bus.md) §2.1 (work-shuffling events) and + §2.2 (candidate-chain events) +- [`02-chaser-organize.md`](02-chaser-organize.md) §3.2 (organize emits + `headers`/`bump`/`regressed`/`disorganized` consumed here) +- Upcoming: `04-chaser-validate.md` (issuer of `valid`) +- Upcoming: `09-protocol-block-in-31800.md` (peer-side counterpart; + consumer of `download`/`purge`/`split`/`stall`/`report`) diff --git a/docs/architecture/04-chaser-validate.md b/docs/architecture/04-chaser-validate.md new file mode 100644 index 00000000..35506b0e --- /dev/null +++ b/docs/architecture/04-chaser-validate.md @@ -0,0 +1,601 @@ +# 04 — `chaser_validate` (consensus validation) + +> Companion to [`00-overview.md`](00-overview.md), +> [`01-event-bus.md`](01-event-bus.md), +> [`02-chaser-organize.md`](02-chaser-organize.md), +> [`03-chaser-check.md`](03-chaser-check.md). +> +> `chaser_validate` is the **consensus pre-check stage**. For each block +> on the candidate chain (in heights-first order): +> +> - it ensures the block is downloaded (`chase::checked` already received) +> - it fetches the block from the store and runs full block-validation +> logic from libbitcoin-system (`block.populate`, `block.accept`, +> `block.connect`) +> - if valid, it persists prevout metadata and the filter body, and marks +> the block as `block_valid` +> - it emits `chase::valid(height)` on success or `chase::unvalid(link)` +> on failure +> +> This is the chaser most directly relevant to **formal verification**: it +> is the single source of consensus acceptance, and the only place script +> execution and UTXO availability are checked before confirmation. 
+ +| File | Role | +| --------------------------------------------------------------------- | ------------------------------------------------------------------------------------- | +| `include/bitcoin/node/chasers/chaser_validate.hpp` | Class declaration; private types `race`, `parallel<>` helper | +| `src/chasers/chaser_validate.cpp` | Implementation | + +--- + +## 1. Critical configuration & operating modes + +`chaser_validate` is **inactive in blocks-first mode**: +```cpp +// chaser_validate.cpp:52-61 +code chaser_validate::start() NOEXCEPT +{ + if (!node_settings().headers_first) + return error::success; + ... + SUBSCRIBE_EVENTS(handle_event, _1, _2, _3); + return error::success; +} +``` + +If `node.headers_first == false`, the chaser does not subscribe to the +bus and never validates. Validation in that mode is folded into +`chaser_block::validate` (the organize hook; see +[`02-chaser-organize.md §4`](02-chaser-organize.md#4-hooks-where-the-template-differs)). + +> **Invariant (Validate-Mode-1).** `chaser_validate` is active iff +> `headers_first == true`. In blocks-first mode, no +> `chase::valid`/`chase::unvalid` events are issued by this chaser. + +### 1.1 Other configuration + +```cpp +// chaser_validate.hpp:85-92, chaser_validate.cpp:38-50 +const uint32_t subsidy_interval_; // bitcoin: 210000 (halving interval) +const uint64_t initial_subsidy_; // bitcoin: 50 * 10^8 sat +const size_t maximum_backlog_; // concurrent in-flight validations cap +const bool node_witness_; // include witness data when fetching block +const bool defer_; // skip the consensus pass entirely +const bool filter_; // run filter body even when bypassing validation + // = !defer && archive.filter_enabled() +std::atomic backlog_{}; // in-flight count (atomic; updated off-strand) +``` + +`defer_` is a performance/operational lever. 
When true, validation is +"deferred" — the chaser emits `chase::valid` without doing actual +consensus work, leaving the obligation to a later phase (or never, for +trusted bootstraps). With `defer_ = false` AND `filter_enabled`, the +filter body is computed even on bypass paths so block-filter data +remains complete. + +### 1.2 Bypass conditions per block + +Each block-level decision uses a local `bypass` flag computed at +`chaser_validate.cpp:161-162`: + +``` +bypass = defer_ || is_under_checkpoint(height) || query.is_milestone(link) +``` + +- **Under checkpoint** — height is at or below the top configured + checkpoint (`chaser::is_under_checkpoint`). Hard-coded chain truth. +- **Milestone** — header was archived under an active milestone + (set during organize; see + [`02-chaser-organize.md §6.2`](02-chaser-organize.md#62-header-milestone-tracking-chaser_header-only)). + +> **Invariant (Validate-Bypass-1).** Under `bypass`: +> - `block.accept(ctx, ...)`, `block.connect(ctx)` are *not* run. +> - `query.set_prevouts(link, block)` is *not* called. +> - `query.set_block_valid(link)` is *not* called. +> +> Only `populate_without_metadata` and (optionally) `set_filter_body` +> run. This is the only place where consensus checks can be elided; it +> is gated on checkpoint/milestone/`defer_` — the rest of the system +> must trust that these gates are sound. + +--- + +## 2. Concurrency model — unique among the chasers + +`chaser_validate` is the only chaser with its **own thread pool**. + +```cpp +// chaser_validate.hpp:81-86 +network::threadpool validation_threadpool_; +network::asio::strand validation_strand_; // on validation_threadpool_ + +// chaser_validate.cpp:40-42, 358-366 +validation_threadpool_(node_settings.threads_(), node_settings.thread_priority_()), +validation_strand_(validation_threadpool_.service().get_executor()), +... 
+strand() → validation_strand_ // overrides base +stranded() → validation_strand_.running_in_this_thread() +``` + +This means: + +| Method | Runs on | +| -------------------------------------------- | ------------------------------------ | +| `handle_event` (bus callback) | validation strand | +| `do_bump`, `do_bumped`, `do_checked`, `do_regressed`, `post_block` | validation strand | +| `validate_block`, `populate`, `validate` | **validation threadpool (any worker)** — one job per block in flight | +| `complete_block` | validation threadpool worker (whichever ran `validate_block`) | + +The strand serialises *scheduling*; the pool parallelises *execution*. +`maximum_backlog_` caps in-flight parallelism. + +### 2.1 Lifecycle override + +```cpp +// chaser_validate.cpp:342-356 +void chaser_validate::stopping(const code& ec) NOEXCEPT +{ + validation_threadpool_.stop(); // ← halt new work; existing tasks finish + chaser::stopping(ec); +} + +void chaser_validate::stop() NOEXCEPT +{ + if (!validation_threadpool_.join()) { + BC_ASSERT_MSG(false, "failed to join threadpool"); + std::abort(); + } +} +``` + +This is why `full_node::close` calls per-chaser `stop()` blocking +(`src/full_node.cpp:127-135`): validate's pool must be joined before the +process tears down. + +> **Invariant (Validate-Lifecycle-1).** All posted `validate_block` jobs +> must complete or be cancelled before `stop()` returns. The +> threadpool's `join` is what enforces this. + +### 2.2 Backlog control + +```cpp +// chaser_validate.cpp:144-197 (do_bumped, on strand) +while ((backlog_ < maximum_backlog_) && !closed() && !suspended()) { + ... pick next height, decide bypass, post_block(link, bypass) ... 
+ set_position(height++); +} + +// chaser_validate.cpp:199-205 (post_block, on strand) +backlog_.fetch_add(1); +PARALLEL(validate_block, link, bypass); + +// chaser_validate.cpp:246-247 (validate_block, off-strand, at the END) +if (is_one(backlog_.fetch_sub(1))) + handle_event(error::success, chase::bump, height_t{}); +``` + +Two important details: + +1. The strand loop **does not block** for any validation to complete; it + pre-fills the backlog and exits. New iterations come from later + events. +2. The completing worker that pulls the counter to zero **calls + `handle_event` directly**, *not* through the bus + (`chaser_validate.cpp:247`). This is a stall-prevention optimisation: + re-bumps the strand without paying the bus round-trip cost. The + `handle_event` body then `POST`s `do_bump`, so the strand becomes + active again. + +> **Invariant (Validate-Backlog-1).** The number of concurrently running +> `validate_block` tasks is always ≤ `maximum_backlog_`. Enforced by +> the loop guard at `chaser_validate.cpp:151`. + +> **Invariant (Validate-Backlog-2).** Self-bump on backlog drain +> (`chaser_validate.cpp:246-247`) prevents validation from stalling when +> the loop guard would otherwise sit idle. From a spec standpoint it is +> equivalent to a bus-mediated `chase::bump`; treat it as such. + +--- + +## 3. 
Bus integration (verified) + +Inputs: + +| Event | Source | Reaction | +| ------------------ | ------------------------------- | ---------------------------------------------------------------------------- | +| `chase::start` | `full_node` | `do_bump(0)` — try next height | +| `chase::resume` | `full_node` | `do_bump(0)` | +| `chase::bump` | `chaser_organize` (or self) | `do_bump(0)` | +| `chase::checked` | `protocol_block_in_31800` | `do_checked(height)` — kick when the *next* sequential block arrives | +| `chase::regressed`/`disorganized` | `chaser_organize` | `do_regressed(branch_point)` — rewind position if affected | +| `chase::stop` | service | unsubscribe | + +Suspension gate: `handle_event` short-circuits with `return true` +(remain subscribed) if `suspended()` +(`chaser_validate.cpp:71-72`). Already-posted validations keep running. + +Outputs: + +| Event | Site | Payload | Trigger | +| --------------- | --------------------------------- | ------------------------ | ---------------------------------------------- | +| `chase::valid` | `complete_block` `:330` | `height_t height` | block validated (or bypassed without error) | +| `chase::unvalid`| `complete_block` `:321` | `header_t link` | validation produced a non-fault error | + +And implicit: +- `fire(events::block_unconfirmable, height)` `:322` +- `fire(events::block_validated, height)` `:334` + +--- + +## 4. The validation pipeline (per block) + +The block-level pipeline (off-strand, concurrent) runs four phases. Each +can fail and short-circuit to `complete_block` with a code. 
+ +```mermaid +flowchart TD + A[validate_block link, bypass] --> A0{closed?} + A0 -- yes --> Z[ return ] + A0 -- no --> B[query.get_block link, witness] + B -- null --> E2[ec = validate2] + B -- ok --> C[query.get_context ctx, link] + C -- false --> E3[ec = validate3] + C -- true --> D[populate bypass, block, ctx] + D -- error --> D1[query.set_block_unconfirmable] + D1 -- success --> CP[complete_block ec, link, h, bypass] + D1 -- fail --> E4[ec = validate4] + D -- success --> V[validate bypass, block, link, ctx] + V -- error --> V1[query.set_block_unconfirmable] + V1 -- success --> CP + V1 -- fail --> E5[ec = validate5] + V -- success --> CP + E2 --> CP + E3 --> CP + E4 --> CP + E5 --> CP + CP --> DEC[backlog_-- ; if was-one self-bump chase::bump] + DEC --> Z +``` + +### 4.1 `populate(bypass, block, ctx)` — prevout filling + +```cpp +// chaser_validate.cpp:250-275 +if (bypass) { + block.populate(ctx); // internal-only spends + if (!query.populate_without_metadata(block)) + return system::error::missing_previous_output; +} else { + if (const auto ec = block.populate(ctx)) // includes time/maturity locks + return ec; + if (!query.populate_with_metadata(block)) + return system::error::missing_previous_output; +} +return error::success; +``` + +`block.populate(ctx)` (libbitcoin-system) fills in internal prevouts — +inputs that reference outputs from earlier transactions in the same +block — and checks BIP68 sequence locks, coinbase maturity, etc. + +`query.populate_with_metadata(block)` fills external prevouts from the +store and attaches the spent-coin metadata used by `set_prevouts`. On +bypass, the metadata is unnecessary so the cheaper variant is used. + +> **Invariant (Validate-Populate-1).** Every input not satisfied by +> internal-block prevouts must be satisfied by the store, otherwise +> `populate` returns `missing_previous_output`. This is the *UTXO +> availability* check. 
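The availability check in the invariant above can be illustrated with a toy model. Everything here is an assumption made for the sketch: `toy_tx`, `outpoint`, and this `populate` are hypothetical and far simpler than the libbitcoin types, and the model ignores double spends, maturity, and sequence locks.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

using outpoint = std::pair<std::string, std::uint32_t>; // (txid, output index)

// Hypothetical, minimal transaction: just enough to resolve prevouts.
struct toy_tx
{
    std::string id;
    std::vector<outpoint> inputs; // empty for the coinbase
    std::uint32_t outputs;        // number of outputs created
};

enum class populate_result { success, missing_previous_output };

// Each input must resolve either to an output created earlier in the same
// block (internal) or to an output known to the store (external).
inline populate_result populate(const std::vector<toy_tx>& block,
    const std::set<outpoint>& store)
{
    std::set<outpoint> internal; // outputs created so far in this block

    for (const auto& tx : block)
    {
        for (const auto& in : tx.inputs)
            if (internal.count(in) == 0 && store.count(in) == 0)
                return populate_result::missing_previous_output;

        for (std::uint32_t index = 0; index < tx.outputs; ++index)
            internal.emplace(tx.id, index);
    }

    return populate_result::success;
}
```

Outputs are added to `internal` only after a transaction's own inputs are checked, so a spend of a later transaction's output fails in the same way as a spend of an unknown output.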
+ +### 4.2 `validate(bypass, block, link, ctx)` — consensus + +```cpp +// chaser_validate.cpp:277-304 +if (!bypass) { + if ((ec = block.accept(ctx, subsidy_interval_, initial_subsidy_))) + return ec; // (4.2a) + if ((ec = block.connect(ctx))) + return ec; // (4.2b) + if (!query.set_prevouts(link, block)) + return error::validate6; // (4.2c) +} + +if (!query.set_filter_body(link, block)) + return error::validate7; // (4.2d) + +if (!bypass && !query.set_block_valid(link)) + return error::validate8; // (4.2e) + +return error::success; +``` + +| Step | What it does | +| ----- | -------------------------------------------------------------------------------------------------- | +| 4.2a | **`block.accept(ctx, subsidy, initial)`** — block-wide consensus: coinbase amount, BIP30, fee sum, witness commitment, sigops, etc. The full set of "above-script" rules. | +| 4.2b | **`block.connect(ctx)`** — runs script verification on every input via libbitcoin-system's script engine. This is the expensive step. | +| 4.2c | **`query.set_prevouts(link, block)`** — persist prevout metadata used to short-cut confirmation. | +| 4.2d | **`query.set_filter_body(link, block)`** — BIP158 filter body, computed for every block. | +| 4.2e | **`query.set_block_valid(link)`** — mark block state in the store as `block_valid`. | + +> **Invariant (Validate-Ordering-1).** Steps 4.2c, 4.2d, 4.2e are +> ordered: +> 1. `set_prevouts` MUST run before `set_filter_body` +> (`chaser_validate.cpp:291` comment: *"Prevouts optimize confirmation"*). +> 2. `set_block_valid` MUST run after both +> (`chaser_validate.cpp:299` comment: *"Valid must be set after set_prevouts and set_filter_body"*). +> +> Reasoning: the store represents block state as a monotone progression +> (`unvalidated` → `block_valid` → `block_confirmable`/`block_unconfirmable`). +> Setting `block_valid` before metadata is complete would expose +> downstream chasers (confirm) to an incomplete record. 
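
The write ordering in Validate-Ordering-1 can be checked against a recording stand-in for the store. This is a hedged sketch: `recording_store`, the `code` enum, and the `consensus_ok` flag (which stands in for `block.accept` and `block.connect`) are assumptions made for illustration; only the call names mirror the excerpt above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical store that records the order of writes. The method names
// mirror the excerpt above; the argument-free signatures are assumptions.
struct recording_store
{
    std::vector<std::string> writes;

    bool set_prevouts() { writes.push_back("set_prevouts"); return true; }
    bool set_filter_body() { writes.push_back("set_filter_body"); return true; }
    bool set_block_valid() { writes.push_back("set_block_valid"); return true; }
};

enum class code { success, consensus_failure, validate6, validate7, validate8 };

// Same control flow as steps 4.2a-4.2e, with consensus reduced to one flag.
code validate(bool bypass, bool consensus_ok, recording_store& store)
{
    if (!bypass)
    {
        if (!consensus_ok)
            return code::consensus_failure; // 4.2a / 4.2b stand-in

        if (!store.set_prevouts())
            return code::validate6;         // 4.2c
    }

    if (!store.set_filter_body())
        return code::validate7;             // 4.2d

    if (!bypass && !store.set_block_valid())
        return code::validate8;             // 4.2e

    return code::success;
}
```

In this model a bypassed block writes only the filter body, and a block that fails consensus writes nothing, so the state bit can never be set ahead of the metadata.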
+ +### 4.3 `complete_block(ec, link, height, bypass)` + +Three terminal cases: + +```cpp +// chaser_validate.cpp:307-337 +if (ec) { + if (node::error::error_category::contains(ec)) { + fault(ec); // ← terminal: validate1..8 are all node errors + return; + } + notify(ec, chase::unvalid, link); // ← invalid block (consensus failure) + fire(events::block_unconfirmable, height); + return; +} +notify(ec, chase::valid, height); // ← VALID +if (!defer_) + fire(events::block_validated, height); +``` + +> **Invariant (Validate-Complete-1).** Three mutually-exclusive paths: +> (a) node-error → `fault` (suspends network), (b) consensus error → +> `chase::unvalid` (organize disorganises), (c) success → +> `chase::valid` (downstream chasers advance). Every concurrent +> `validate_block` invocation reaches exactly one of these. + +> **Invariant (Validate-Complete-2).** `chase::valid` is emitted even +> under `defer_` (`chaser_validate.cpp:330` is unguarded). This is +> *required* for `chaser_check` to advance its `advanced_` counter; see +> [`03-chaser-check.md §1.3`](03-chaser-check.md#13-why-advanced_-is-a-counter-not-a-height). + +--- + +## 5. 
The strand-level state machine (`do_*` methods) + +```mermaid +stateDiagram-v2 + [*] --> RUNNING: start (only if headers_first) + RUNNING --> RUNNING: chase::start/resume/bump → do_bump\nadvance if next height is validateable + RUNNING --> RUNNING: chase::checked(h) → do_checked\nif h == position+1: do_bumped + RUNNING --> RUNNING: do_bumped: walk heights, fill backlog + RUNNING --> RUNNING: chase::regressed/disorganized(bp) → do_regressed\nif bp < position: set_position(bp) + RUNNING --> SUSPENDED_BUT_FINISHING: suspend(ec)\n(handle_event short-circuits; in-flight continues) + SUSPENDED_BUT_FINISHING --> RUNNING: resume + RUNNING --> STOPPED: chase::stop / closed() + SUSPENDED_BUT_FINISHING --> STOPPED: chase::stop / closed() + STOPPED --> [*] +``` + +### 5.1 `do_bumped` height iteration + +`do_bumped` (`chaser_validate.cpp:144-197`) is the inner work-finder. +For each iteration: + +1. Compute `link = to_candidate(height)`; fetch `get_block_state(link)`. +2. Compute `bypass`. +3. Branch on the state: + - `unassociated` → return (we hit the gap; nothing more to do) + - `unvalidated` | `unknown_state` → + - if `!bypass || filter_` → `post_block` + - else → complete immediately with `success` (no work needed) + - `block_valid` | `block_confirmable` → already done, complete with + `success` (this advances `position_`) + - `block_unconfirmable` → return (chain top is bad; halt) + - anything else → `fault(validate1)` +4. `set_position(height++)`. + +> **Invariant (Validate-Iter-1).** `position_` increases monotonically +> in `do_bumped`, regardless of bypass. Each iteration that doesn't +> early-exit advances by one. + +> **Invariant (Validate-Iter-2).** Encountering an +> `unassociated` state aborts the loop (`chaser_validate.cpp:158-159`). +> This guarantees that **no block is bypassed-as-valid without being +> downloaded first** — there's no path through the loop that calls +> `set_block_valid` on an unassociated link. 
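
The height walk in §5.1 can be modelled as a pure function over a vector of block states. This is a simplified sketch: `walk`, `walk_result`, and the single `bypass` flag are hypothetical (the text computes bypass per height), and the backlog limit, the `unknown_state` arm, and the `fault(validate1)` arm are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class block_state
{
    unassociated, unvalidated, block_valid,
    block_confirmable, block_unconfirmable
};

struct walk_result
{
    std::size_t position;               // last height processed
    std::vector<std::size_t> posted;    // heights handed to the worker pool
    std::vector<std::size_t> completed; // heights completed without work
};

// chain[h] is the state of the candidate block at height h.
walk_result walk(const std::vector<block_state>& chain, std::size_t position,
    bool bypass, bool filter)
{
    assert(position < chain.size());

    walk_result out{ position, {}, {} };
    for (std::size_t height = position + 1; height < chain.size(); ++height)
    {
        switch (chain[height])
        {
            case block_state::unassociated:        // gap: not downloaded yet
            case block_state::block_unconfirmable: // bad block: halt
                return out;
            case block_state::unvalidated:
                if (!bypass || filter)
                    out.posted.push_back(height);
                else
                    out.completed.push_back(height);
                break;
            case block_state::block_valid:
            case block_state::block_confirmable:
                out.completed.push_back(height);   // already done
                break;
        }

        out.position = height; // advances by one per non-exiting iteration
    }

    return out;
}
```

Because the `unassociated` arm returns before anything is posted or completed, no height beyond a download gap can be reported, which is the content of Validate-Iter-2.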
+ +### 5.2 `do_checked` and `do_regressed` + +`do_checked(h)` (`chaser_validate.cpp:123-130`): if `h == position+1`, +call `do_bumped(h)`. This is the *fast path* for when a block arrives +right at the head — skip waiting for the next `chase::bump`. + +`do_regressed(bp)` (`chaser_validate.cpp:114-121`): if `bp < position`, +set position to `bp`. The next bump re-walks from there. Crucially, +**there is no purge** like in `chaser_check`: in-flight validations on +heights above `bp` keep running. Their results either: + +- Find the block still on the candidate chain (rare) → emit + `chase::valid(h)` harmlessly; check chaser will increment a stale + counter. +- Find the block no longer there (typical) → still emit + `chase::valid(h)`; downstream effects are absorbed. + +In either case, the validation work itself is correct; the only "wasted" +output is bus traffic. + +> **Invariant (Validate-Regress-1).** A regression does not invalidate +> in-flight validations. Their results are persisted via +> `set_block_valid(link)` only if validation succeeds and bypass is +> false; the store reconciles correctness via the link/state map. + +--- + +## 6. 
Error inventory + +| Code | Site | Origin | +| ------------ | --------------------------------------- | -------------------------------------------------------------------------------- | +| `validate1` | `chaser_validate.cpp:188` | `default` arm of `get_block_state` switch — unexpected state value | +| `validate2` | `chaser_validate.cpp:226` | `query.get_block(link, witness)` returned null | +| `validate3` | `chaser_validate.cpp:230` | `query.get_context(ctx, link)` returned false | +| `validate4` | `chaser_validate.cpp:235` | `set_block_unconfirmable` failed after a `populate` failure | +| `validate5` | `chaser_validate.cpp:240` | `set_block_unconfirmable` failed after a `validate` failure | +| `validate6` | `chaser_validate.cpp:293` | `query.set_prevouts(link, block)` failed | +| `validate7` | `chaser_validate.cpp:297` | `query.set_filter_body(link, block)` failed | +| `validate8` | `chaser_validate.cpp:301` | `query.set_block_valid(link)` failed | + +All eight are *node-category* errors (terminal). `validate2` through +`validate8` reach `complete_block`, which routes them to `fault(ec)` and so +suspends the network; `validate1` is faulted directly in `do_bumped` (§5.1). The remaining errors +(libbitcoin-system consensus failures from `block.accept`/`block.connect` +and `system::error::missing_previous_output` from `populate`) are +**not** node-category — they flow to `chase::unvalid`, which routes to +`chaser_organize::do_disorganize`. + +> **Spec obligation list.** `validate1` through `validate8` should be +> proved unreachable under the store-consistency invariants supplied by +> libbitcoin-database, combined with the strand discipline. Specifically: +> - `validate2`: `get_block_state` returned a non-error state, so a +> matching `get_block` must succeed. +> - `validate3`: same for `get_context`. +> - `validate4`, `validate5`: `set_block_unconfirmable` failure implies +> store I/O failure. +> - `validate6`, `validate7`, `validate8`: each is a store-write that +> "should not fail given previous reads succeeded". 
+> - `validate1`: the only unexpected state values are reachable only via +> external store mutation. + +--- + +## 7. Coupling diagram + +```mermaid +flowchart LR + PIN[protocol_block_in_31800] -- "chase::checked (h)" --> VAL[chaser_validate] + ORG[chaser_organize] -- "chase::regressed / disorganized" --> VAL + FN[full_node] -- "chase::start, resume" --> VAL + SELF[chaser_validate self-bump\non backlog drain] --> VAL + + VAL -- "chase::valid (h)" --> CHK[chaser_check] + VAL -- "chase::valid (h)" --> CNF[chaser_confirm] + VAL -- "chase::unvalid (link)" --> ORG_in[chaser_organize
(do_disorganize)] + + VAL -.->|"reads: get_block, get_context"| STORE[(libbitcoin-database query)] + VAL -.->|"writes: set_prevouts, set_filter_body,\nset_block_valid, set_block_unconfirmable"| STORE +``` + +--- + +## 8. Spec view + +### 8.1 Process abstraction + +`chaser_validate` is a process with: +- **inputs**: bus events listed in §3 +- **outputs**: bus events `chase::valid`, `chase::unvalid`; store mutations +- **state**: `position_ ∈ ℕ`, `backlog_ ∈ [0, maximum_backlog_]` +- **internal jobs**: bounded number (≤ `maximum_backlog_`) of pending + per-block validation tasks + +### 8.2 Properties + +**Safety** +1. **No validation without download** (Validate-Iter-2): for every + `chase::valid(h)`, the block at link `to_candidate(h)` has previously + been associated. This is the soundness contract relied on by all + downstream chasers. +2. **Bypass discipline** (Validate-Bypass-1): `set_block_valid` is + never called under bypass; the store distinguishes "validated" from + "checkpoint-bypassed". +3. **Ordering** (Validate-Ordering-1): `set_block_valid` after + `set_prevouts` and `set_filter_body`. +4. **Exactly one outcome per invocation**: each `validate_block` invocation + produces exactly one of `chase::valid`, `chase::unvalid`, or `fault` + (Validate-Complete-1). + +**Liveness** +- Provided `chaser_check` keeps issuing downloads and the store responds, + every height on the candidate chain eventually receives a + `complete_block` call. +- Backlog drain self-bump (Validate-Backlog-2) prevents stall when the + strand loop exits with backlog at max. 
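
The backlog drain self-bump in the liveness list can be illustrated with a small atomic counter. This is a sketch under assumptions: `backlog_gate` and its members are hypothetical names, and the "previous value was one" test follows the `backlog_--` node in the §4 pipeline diagram rather than quoted source.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical model of the shared backlog counter.
struct backlog_gate
{
    std::atomic<std::size_t> backlog{ 0 };
    std::size_t maximum;

    explicit backlog_gate(std::size_t max) : maximum(max) {}

    // Strand side: reserve a slot, or report that the pool is full.
    // Load-then-add is safe here only because the strand is the sole
    // incrementer; workers only ever decrement.
    bool try_post()
    {
        if (backlog.load() >= maximum)
            return false;

        backlog.fetch_add(1);
        return true;
    }

    // Worker side: returns true when this completion drained the backlog
    // (the previous value was one), which is the cue to self-bump.
    bool complete()
    {
        const auto previous = backlog.fetch_sub(1);
        assert(previous != 0);
        return previous == 1;
    }
};
```

If the strand loop stopped because `try_post` returned false, the worker that observes the drain is the only party left that can restart the walk, which is why the self-bump is needed for liveness.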
+ +### 8.3 The consensus surface + +For a formal model, the cleanest factoring is: + +``` +validate : (block, ctx, bypass, defer, filter) → {valid, unvalid(reason), fault(code)} +``` + +with the rules: + +``` +if defer ∨ checkpoint(height) ∨ milestone(link): + if filter: set_filter_body; succeed + else: succeed +else: + block.populate(ctx) > ok + query.populate_with_metadata(block) > ok + block.accept(ctx, subsidy_interval, initial_subsidy) > ok + block.connect(ctx) > ok + set_prevouts(link, block) + set_filter_body(link, block) + set_block_valid(link) + succeed +``` + +The libbitcoin-system functions (`block.accept`, `block.connect`) are +the actual consensus rules. They are *external* to this chaser and +deserve their own specification (which lives in libbitcoin-system docs; +this chaser only sequences and persists their outcomes). + +--- + +## 9. Notes for the Lisp port + +- The strand/loop pattern (one strand, many workers, atomic backlog) maps + to a typed actor with a bounded mailbox and a worker pool. +- All consensus work is delegated to libbitcoin-system; a Lisp port can + ignore the parallelism initially and inline `validate_block` into the + strand (correct but slow). The parallelism is purely a performance + feature — the *correctness* of the chaser does not depend on it. +- Bypass logic (checkpoint, milestone, defer) is a one-line predicate + and should be expressed as such. +- The store interface used by validate is narrow: + `get_block`, `get_context`, `get_block_state`, `is_validateable`, + `is_milestone`, `populate_with_metadata`, `populate_without_metadata`, + `set_prevouts`, `set_filter_body`, `set_block_valid`, + `set_block_unconfirmable`. Reproducing those signatures is enough. + +--- + +## 10. Notes for the formal model + +- Off-strand parallelism makes this the **only chaser that needs more + than single-threaded reasoning**. But the atomic `backlog_` is the + only shared variable; the strand serialises all other state. 
+- A simple proof outline: treat each posted `validate_block` as an + independent process that does not read or write `position_` or + `backlog_` except for the final `fetch_sub`. Then the strand and the + workers commute except at that one decrement. +- The hard part of a *consensus* proof is in libbitcoin-system + (`block.accept`, `block.connect`); a proof about this chaser only needs to show that + outputs of those functions are persisted in the right order and that the + right state transitions are emitted. + +--- + +## Cross-references + +- [`01-event-bus.md`](01-event-bus.md) §2.4 (Accept/Connect events: + `valid`, `unvalid`) +- [`02-chaser-organize.md`](02-chaser-organize.md) §4 (`chaser_block` + inlines a subset of validation when in blocks-first mode); §5 + (`chase::unvalid` triggers `do_disorganize`) +- [`03-chaser-check.md`](03-chaser-check.md) §1.3, §5 (consumer of + `chase::valid`) +- [`05-chaser-confirm.md`](05-chaser-confirm.md) (consumer of `chase::valid`) +- libbitcoin-system docs (external): `block.accept` and `block.connect` + consensus rules diff --git a/docs/architecture/05-chaser-confirm.md b/docs/architecture/05-chaser-confirm.md new file mode 100644 index 00000000..aa69187d --- /dev/null +++ b/docs/architecture/05-chaser-confirm.md @@ -0,0 +1,592 @@ +# 05 — `chaser_confirm` (chain confirmation and reorg) + +> Companion to [`00-overview.md`](00-overview.md), +> [`01-event-bus.md`](01-event-bus.md), +> [`02-chaser-organize.md`](02-chaser-organize.md), +> [`03-chaser-check.md`](03-chaser-check.md), +> [`04-chaser-validate.md`](04-chaser-validate.md). +> +> `chaser_confirm` is the **only writer to the *confirmed* chain**. It +> takes the validated portion of the candidate chain (heights from the +> current confirmed top up to where validation/checkpointing/milestoning +> guarantees acceptance) and: +> +> 1. compares branch work against the current confirmed branch, +> 2. reorganises the confirmed chain (pop ⇒ push) if the candidate is +> stronger, +> 3. 
runs **`query.block_confirmable(link)`** per block to do final +> UTXO double-spend checks against the now-committed confirmed +> chain, +> 4. persists `block_confirmable`, the filter head, and the candidate +> state via `push_confirmed`, +> 5. emits `chase::confirmable` / `chase::unconfirmable` per block, +> `chase::organized` / `chase::reorganized` per push/pop, and +> `chase::block` when announcing newly-confirmed blocks to peers. +> +> The full transition from "header arrived" to "confirmed in the chain" +> goes: +> +> ``` +> chaser_organize → chaser_check → chaser_validate → chaser_confirm +> (header sync) (download) (consensus) (UTXO + commit) +> ``` +> +> Confirmation is **sequential and not cancellable** — unlike validate's +> parallel pool. + +| File | Role | +| --------------------------------------------------------------- | --------------------------------------------------------------- | +| `include/bitcoin/node/chasers/chaser_confirm.hpp` | Class declaration | +| `src/chasers/chaser_confirm.cpp` | Implementation | + +--- + +## 1. Operating modes + +```cpp +// chaser_confirm.cpp:36-41 +chaser_confirm::chaser_confirm(full_node& node) NOEXCEPT + : chaser(node), + filter_(node.archive().filter_enabled()), + defer_(node.node_settings().defer_confirmation) +{ +} +``` + +### 1.1 Defer-confirmation mode + +```cpp +// chaser_confirm.cpp:43-59 +code chaser_confirm::start() NOEXCEPT { + const auto& query = archive(); + set_position(query.get_fork()); + ... + if (!defer_) { + SUBSCRIBE_EVENTS(handle_event, _1, _2, _3); + } + return error::success; +} +``` + +> **Invariant (Confirm-Mode-1).** `chaser_confirm` is active iff +> `defer_confirmation == false`. In deferred mode it does not subscribe +> and never confirms — the confirmed chain remains at the fork point +> for the run. + +This is symmetric with `chaser_validate`'s `headers_first` gate +(`Validate-Mode-1`). The two `defer_` flags are independent — a +deployment can defer confirmation (e.g. 
a bootstrap node) while still +validating. + +### 1.2 No parallel pool (unlike validate) + +`chaser_confirm` runs on the node's network strand (the standard chaser +strand, inherited from `chaser`); no override. Confirmation cannot be +parallelised because each block depends on the cumulative UTXO state of +the chain up to its parent — strictly sequential. + +> **Invariant (Confirm-Conc-1).** All confirm operations run on the +> chaser's own strand. Every `do_*` and helper has +> `BC_ASSERT(stranded())` at entry. No off-strand work. + +--- + +## 2. State + +The chaser holds *no in-memory state machine state* beyond the base +class's `position_` (read but only set in `start`). + +All durable state lives in the database, specifically: + +| Method | Effect on store | +| -------------------------------------------- | ----------------------------------------------------- | +| `query.push_confirmed(link, !under_checkpoint)` | Add to confirmed chain (strong = !under_checkpoint) | +| `query.pop_confirmed()` | Remove top of confirmed chain | +| `query.set_block_confirmable(link)` | Mark block as confirmed-in-chain (state transition) | +| `query.set_block_unconfirmable(link)` | Mark block as permanently rejected | +| `query.set_filter_head(link)` | Commit BIP158 filter head record | + +> **Invariant (Confirm-Store-1).** `chaser_confirm` and +> `chaser_organize
` are the only writers to the candidate / +> confirmed chain order. `chaser_confirm` is the only writer to the +> *confirmed* projection (push_confirmed / pop_confirmed); organize +> only touches *candidate*. This separation underpins the spec's +> two-projection view. + +--- + +## 3. Bus integration (verified) + +Inputs: + +| Event | Source | Reaction | +| ------------------------- | ---------------------------- | -------------------------------------------------------------------- | +| `chase::start` | `full_node` | `do_bump(0)` → `do_bumped({})` | +| `chase::resume` | `full_node` | `do_bump(0)` | +| `chase::bump` | `chaser_organize` (or self) | `do_bump(0)` | +| `chase::valid` | `chaser_validate` | `do_validated(h)` → `do_bumped({})` | +| `chase::regressed`/`disorganized` | `chaser_organize` | `do_regressed(h)` — **no-op** (only `BC_ASSERT(stranded())`) | +| `chase::stop` | service | unsubscribe | + +Suspension: `handle_event` short-circuits with `return true` if +`suspended()` (`chaser_confirm.cpp:69-70`). `do_bumped` also has an +explicit early exit on `suspended()` (`chaser_confirm.cpp:141-142`). + +> **Invariant (Confirm-Regress-1).** Regression events are *not acted +> upon*. The next bump-driven call to `do_bumped` will use +> `query.get_validated_fork` against the current candidate chain, which +> reflects the regression. So the chaser naturally tracks rollbacks +> without explicit handling. 
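
The input table above reduces to a small dispatch function. The sketch below is illustrative only: `toy_confirm` is a hypothetical type, `bumps` counts how often the fork would be recomputed, and checking suspension before the `stop` arm is an assumption about ordering rather than something the text states.

```cpp
#include <cassert>

enum class chase { start, resume, bump, valid, regressed, disorganized, stop };

// Hypothetical model of the confirm chaser's event handler.
struct toy_confirm
{
    bool suspended = false;
    int bumps = 0; // times the validated fork would be recomputed

    // Returns true to stay subscribed, false to unsubscribe.
    bool handle_event(chase event)
    {
        // Suspension gate: remain subscribed but do no work.
        if (suspended)
            return true;

        switch (event)
        {
            case chase::start:
            case chase::resume:
            case chase::bump:
            case chase::valid:
                ++bumps; // all four funnel into do_bumped
                return true;
            case chase::regressed:
            case chase::disorganized:
                return true; // no-op: the next bump re-reads the store
            case chase::stop:
                return false;
        }

        return true;
    }
};
```

The regression arms carry no state because the fork is recomputed from the store on every bump, which is the point of Confirm-Regress-1.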
+ +Outputs (all from `chaser_confirm.cpp`, all stranded): + +| Event | Site | Payload | Trigger | +| ------------------ | ------------------------------- | ------------ | ----------------------------------------------------------------------- | +| `chase::reorganized` | `:363` (in `set_reorganized`) | `header_t` | Each `pop_confirmed` during a reorg | +| `chase::organized` | `:396` (in `set_organized`) | `header_t` | Each `push_confirmed` during a reorg | +| `chase::block` | `:427` (in `announce`) | `header_t` | Each `set_organized` *if the confirmed chain is current* | +| `chase::unconfirmable` | `:338` (in `complete_block`)| `header_t` | Block failed `block_confirmable` check | +| `chase::confirmable` | `:345` (in `complete_block`) | `header_t` | Block passed | + +⚠ This differs from `chase.hpp`'s comment claim that `chase::block` is +"Issued by 'transaction'" — verified issuer is `chaser_confirm`. See +[`01-event-bus.md §2.6`](01-event-bus.md#26-confirm-chain-and-mining). + +--- + +## 4. 
The confirmation algorithm — overall flow + +```mermaid +sequenceDiagram + autonumber + participant VAL as chaser_validate + participant CNF as chaser_confirm (strand) + participant Q as query (store) + + VAL->>CNF: chase::valid(h) + CNF->>CNF: POST(do_validated) + Note over CNF: do_validated → do_bumped({}) + + CNF->>Q: get_validated_fork(out fork_point, top_checkpoint, filter) + Q-->>CNF: fork: header_states (or empty) + alt fork empty + CNF-->>CNF: return + end + + CNF->>Q: get_top_confirmed → top + alt fork_point > top + Note over CNF: F → fault(confirm1) + end + + opt fork_point < top + CNF->>Q: get_work(work, fork) + CNF->>Q: get_strong_fork(strong, work, fork_point) + alt !strong + CNF-->>CNF: return (wait for more candidates) + end + end + + Note over CNF: reorganize(fork, top, fork_point): + loop top > fork_point + CNF->>Q: pop_confirmed() + CNF->>CNF: notify(chase::reorganized, link) + end + + Note over CNF: organize(fork, popped, fork_point): + loop each state in fork + alt state.ec == bypassed + CNF->>Q: set_filter_head(link) + CNF-->>CNF: complete_block(success, bypass=true) + else state.ec == block_valid + CNF->>Q: block_confirmable(link) + alt confirmable returns error + CNF->>Q: set_block_unconfirmable(link) + CNF->>CNF: roll_back(popped, fork_point, height-1) + CNF-->>CNF: complete_block(ec) → emit chase::unconfirmable + Note over CNF: returns early + else success + CNF->>Q: set_filter_head(link) + CNF->>Q: set_block_confirmable(link) + CNF-->>CNF: complete_block(success) → emit chase::confirmable + end + else state.ec == block_confirmable (already) + CNF-->>CNF: complete_block(success, bypass=false) + else + Note over CNF: F → fault(confirm7) + end + CNF->>Q: push_confirmed(link, !under_checkpoint) + CNF->>CNF: notify(chase::organized, link) + opt is_current(true) + CNF->>CNF: notify(chase::block, link) + end + end + + CNF->>CNF: handle_event(chase::bump) [self-bump] +``` + +--- + +## 5. 
The confirm state machine + +```mermaid +stateDiagram-v2 + [*] --> IDLE: start (subscribed only if !defer_) + IDLE --> COMPUTE_FORK: do_bumped + COMPUTE_FORK: get_validated_fork(out fork_point) + COMPUTE_FORK --> IDLE: fork empty + COMPUTE_FORK --> CHECK_BOUNDS: fork non-empty + CHECK_BOUNDS: top = get_top_confirmed\nfork_point > top? + CHECK_BOUNDS --> F_confirm1: yes → fault + CHECK_BOUNDS --> WORK_CHECK: no + + WORK_CHECK: fork_point < top? + WORK_CHECK --> REORG: equal (no work check needed) + WORK_CHECK --> COMPARE_WORK: yes + COMPARE_WORK: get_work; get_strong_fork + COMPARE_WORK --> F_confirm2: work fetch failed + COMPARE_WORK --> F_confirm3: strong fetch failed + COMPARE_WORK --> IDLE: !strong (wait) + COMPARE_WORK --> REORG: strong + + REORG: reorganize(fork, top, fork_point)\nPop confirmed top down to fork_point + REORG --> F_confirm4: to_confirmed(top) terminal + REORG --> F_confirm5: pop_confirmed failed + + REORG --> ORG_LOOP: pop complete + + ORG_LOOP: organize(fork, popped, fork_point)\nFor each state in fork: + + ORG_LOOP --> BYPASS_PATH: state.ec == bypassed + BYPASS_PATH: set_filter_head + BYPASS_PATH --> F_confirm6: failed + BYPASS_PATH --> PUSH_NEXT: success → complete_block(true) + + ORG_LOOP --> CONFIRM_BLOCK: state.ec == block_valid + CONFIRM_BLOCK: query.block_confirmable(link) + CONFIRM_BLOCK --> UNCONFIRMABLE_PATH: ec != success + CONFIRM_BLOCK --> SUCCESS_PATH: success + + UNCONFIRMABLE_PATH: set_block_unconfirmable + UNCONFIRMABLE_PATH --> F_confirm9: failed + UNCONFIRMABLE_PATH --> ROLL_BACK: success + ROLL_BACK: roll_back(popped, fork_point, h-1) + ROLL_BACK --> F_confirm10: failed + ROLL_BACK --> COMPLETE_UNCONF: success + COMPLETE_UNCONF: complete_block(ec) → emit chase::unconfirmable; return + + SUCCESS_PATH: set_filter_head; set_block_confirmable + SUCCESS_PATH --> F_confirm11: filter_head failed + SUCCESS_PATH --> F_confirm12: set_block_confirmable failed + SUCCESS_PATH --> COMPLETE_CONF: success + COMPLETE_CONF: 
complete_block(success) → emit chase::confirmable + + ORG_LOOP --> PREV_CONFIRMABLE: state.ec == block_confirmable + PREV_CONFIRMABLE: complete_block(success, bypass=false) + + ORG_LOOP --> F_confirm7: anything else + + COMPLETE_CONF --> PUSH_NEXT + PREV_CONFIRMABLE --> PUSH_NEXT + PUSH_NEXT: set_organized: push_confirmed + emit chase::organized\n[+ chase::block if current] + PUSH_NEXT --> F_confirm8: push_confirmed failed + PUSH_NEXT --> ORG_LOOP: next state in fork + PUSH_NEXT --> SELF_BUMP: fork exhausted + + SELF_BUMP: handle_event(chase::bump) [internal] + SELF_BUMP --> IDLE + + COMPLETE_UNCONF --> IDLE + + F_confirm1 --> IDLE + F_confirm2 --> IDLE + F_confirm3 --> IDLE + F_confirm4 --> IDLE + F_confirm5 --> IDLE + F_confirm6 --> IDLE + F_confirm7 --> IDLE + F_confirm8 --> IDLE + F_confirm9 --> IDLE + F_confirm10 --> IDLE + F_confirm11 --> IDLE + F_confirm12 --> IDLE +``` + +--- + +## 6. The three confirmation paths + +`organize` (`chaser_confirm.cpp:222-280`) is a switch on the +*candidate-side state* of each block in the fork: + +### 6.1 `database::error::bypassed` + +```cpp +// chaser_confirm.cpp:234-244 +if (!query.set_filter_head(state.link)) { fault(confirm6); return; } +complete_block(error::success, state.link, height, true); // bypass=true +``` + +A block previously marked as bypass (under checkpoint or milestone in +validate). Skip confirmation work, just persist filter head, mark +"confirmed" (via `set_organized` below). + +### 6.2 `database::error::block_valid` + +The normal path — block was validated, must now be confirmed against the +*current* confirmed chain. 
+ +```cpp +// chaser_confirm.cpp:282-319 (confirm_block) +if (const auto ec = query.block_confirmable(link)) { + // UTXO double-spend check failed against confirmed chain + if (!query.set_block_unconfirmable(link)) { fault(confirm9); return false; } + if (!roll_back(popped, fork_point, height-1)) { fault(confirm10); return false; } + return complete_block(ec, link, height, false); // emits chase::unconfirmable +} +if (!query.set_filter_head(link)) { fault(confirm11); return false; } +if (!query.set_block_confirmable(link)) { fault(confirm12); return false; } +return complete_block(error::success, link, height, false); // emits chase::confirmable +``` + +`query.block_confirmable(link)` (in libbitcoin-database) is the +**UTXO-availability check across the confirmed chain** — the spend +of every input must reference an unspent output present in the +currently-confirmed UTXO set. This is the deepest correctness obligation +in the whole node. + +> **Invariant (Confirm-Order-1).** `set_filter_head` precedes +> `set_block_confirmable`. Symmetric with validate's ordering: state +> bit is the last write so it never advances over an incomplete record. + +> **Invariant (Confirm-Rollback-1).** A `block_confirmable` failure +> triggers `roll_back(popped, fork_point, height-1)` which: +> 1. Pops the confirmed top back to `fork_point` (reversing +> confirmations completed for this fork before the failing block). +> 2. Re-pushes the originally-popped blocks (restoring the previous +> confirmed chain). +> +> Implementation: `chaser_confirm.cpp:405-419`. Post-condition: confirmed +> chain == its state before this `do_bumped` call. + +### 6.3 `database::error::block_confirmable` + +```cpp +// chaser_confirm.cpp:253-258 +// Previously confirmable is not considered bypass. +complete_block(error::success, state.link, height, false); +``` + +Already-confirmed previously (e.g. during prior session). Just emit +`chase::confirmable` and continue. 
The block was confirmed in a previous +run; `push_confirmed` here re-establishes its position on the confirmed +chain after a previous shutdown/restart. + +### 6.4 All other states → fault + +```cpp +// chaser_confirm.cpp:259-266 +// error::unassociated, error::unknown_state, error::block_unconfirmable +default: + fault(error::confirm7); + return; +``` + +> **Invariant (Confirm-State-1).** `get_validated_fork` should not +> include blocks in states `unassociated`, `unknown_state`, or +> `block_unconfirmable`. If it does, the store is inconsistent and the +> chaser faults — proof obligation for libbitcoin-database. + +--- + +## 7. Side-effect summary table + +For each fork-element state, listing observable effects: + +| Block state on entry | DB writes | Bus emits | +| ------------------------- | -------------------------------------------------------------------------------------------------------- | ------------------------------------------ | +| `bypassed` | `set_filter_head`, `push_confirmed` | `chase::confirmable`, `chase::organized`, optionally `chase::block` | +| `block_valid` (success) | `set_filter_head`, `set_block_confirmable`, `push_confirmed` | `chase::confirmable`, `chase::organized`, optionally `chase::block` | +| `block_valid` (fail) | `set_block_unconfirmable`, plus rollback writes (`pop_confirmed` × N then `push_confirmed` × M to restore) | `chase::unconfirmable`, plus per-pop `chase::reorganized` and per-restore `chase::organized` | +| `block_confirmable` | `push_confirmed` only | `chase::confirmable`, `chase::organized`, optionally `chase::block` | + +Per pop during reorg: +- `pop_confirmed` +- `chase::reorganized(link)` + +> **Invariant (Confirm-Emit-1).** Reorg emits `chase::reorganized` per +> popped link before any `chase::organized` for newly-confirmed links. +> Order matters for subscribers that maintain auxiliary indexes. 
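
The pop, push, and rollback sequence, together with the event order required by Confirm-Emit-1, can be exercised on a toy chain. This is a minimal sketch with hypothetical names: `toy_chain` holds block ids indexed by height, the event strings stand in for bus notifications, and `confirmable` stands in for `query.block_confirmable`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical confirmed chain: index is height, value is a block id.
struct toy_chain
{
    std::vector<int> confirmed;
    std::vector<std::string> events;

    void pop() { confirmed.pop_back(); events.push_back("reorganized"); }
    void push(int id) { confirmed.push_back(id); events.push_back("organized"); }
};

// Pops down to fork_point, then pushes the fork. On the first block that is
// not confirmable, undoes this fork's pushes and restores the popped blocks.
bool reorganize(toy_chain& chain, std::size_t fork_point,
    const std::vector<int>& fork, bool (*confirmable)(int))
{
    assert(fork_point < chain.confirmed.size());

    std::vector<int> popped;
    while (chain.confirmed.size() - 1 > fork_point)
    {
        popped.push_back(chain.confirmed.back());
        chain.pop();
    }

    for (const int id : fork)
    {
        if (!confirmable(id))
        {
            // Roll back: remove what this fork pushed so far...
            while (chain.confirmed.size() - 1 > fork_point)
                chain.pop();

            // ...then restore the original blocks, lowest height first.
            for (auto it = popped.rbegin(); it != popped.rend(); ++it)
                chain.push(*it);

            return false;
        }

        chain.push(id);
    }

    return true;
}
```

After a failed call the `confirmed` vector equals its value before the call, which is the post-condition stated in Confirm-Rollback-1; the event log still shows every pop and push that occurred along the way.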
+ +> **Invariant (Confirm-Announce-1).** `chase::block(link)` is emitted +> only when `is_current(true)` (confirmed chain is current to the wall +> clock per node settings) — `chaser_confirm.cpp:421-428`. Peers don't +> get announcements for catch-up blocks. + +--- + +## 8. Error inventory + +All terminal (call `fault`, suspend network). + +| Code | Site | Cause | +| --------------------- | ------------------------------------- | ---------------------------------------------------------------------------------------------------------------- | +| `confirm1` | `chaser_confirm.cpp:157` | `fork_point > top` — fork above confirmed top (inconsistent store) | +| `confirm2` | `chaser_confirm.cpp:168` | `get_work(fork)` failed | +| `confirm3` | `chaser_confirm.cpp:176` | `get_strong_fork(strong, work, fork_point)` failed | +| `confirm4` | `chaser_confirm.cpp:202` | `to_confirmed(top)` returned terminal during pop | +| `confirm5` | `chaser_confirm.cpp:209` | `pop_confirmed()` failed in `set_reorganized` | +| `confirm6` | `chaser_confirm.cpp:238` | `set_filter_head` failed in bypassed path | +| `confirm7` | `chaser_confirm.cpp:264` | Unexpected block state in fork (unassociated / unknown_state / block_unconfirmable) | +| `confirm8` | `chaser_confirm.cpp:272` | `set_organized` (push_confirmed) failed | +| `confirm9` | `chaser_confirm.cpp:292` | `set_block_unconfirmable` failed after `block_confirmable` returned error | +| `confirm10` | `chaser_confirm.cpp:298` | `roll_back` failed | +| `confirm11` | `chaser_confirm.cpp:308` | `set_filter_head` failed before `set_block_confirmable` | +| `confirm12` | `chaser_confirm.cpp:314` | `set_block_confirmable` failed | +| `suspended_channel` | `chaser_confirm.cpp:379` (NDEBUG-only)| `confirmed_height != top+1` in `set_organized` — sequencing bug | +| `suspended_service` | `chaser_confirm.cpp:387` (NDEBUG-only)| `to_parent(link) != to_confirmed(previous_height)` — parent mismatch | + +> **Spec obligation list.** As with organize/validate, 
every `confirmN` +> is unreachable under store-consistency invariants plus the strand +> discipline. The non-trivial ones are: +> - `confirm1`: `get_validated_fork` returns a fork whose `fork_point` +> sits at or below `get_top_confirmed`. +> - `confirm7`: fork only contains valid/confirmable/bypassed entries. +> - `confirm10`: `roll_back` succeeds whenever the inputs are +> consistent with what was popped above. + +--- + +## 9. Coupling diagram + +```mermaid +flowchart LR + VAL[chaser_validate] -- "chase::valid (h)" --> CNF[chaser_confirm] + ORG[chaser_organize] -- "chase::regressed/disorganized\n(no-op handler)" --> CNF + FN[full_node] -- "chase::start, resume" --> CNF + SELF[chaser_confirm self-bump\nafter fork drained] --> CNF + + CNF -- "chase::confirmable (link)" --> SNP_X["chaser_snapshot\n(arm currently commented out)"] + CNF -- "chase::unconfirmable (link)" --> ORG_in[chaser_organize\n(do_disorganize)] + CNF -- "chase::organized (link)" --> X[(no live consumer)] + CNF -- "chase::reorganized (link)" --> Y[(no live consumer)] + CNF -- "chase::block (link)" --> SNP[chaser_snapshot] + CNF -- "chase::block (link)" --> POUT[protocol_block_out_106] + CNF -- "chase::block (link)" --> HOUT[protocol_header_out_70012] + + CNF -.->|"reads: get_validated_fork, get_top_confirmed,\nget_work, get_strong_fork, to_confirmed, to_parent,\nblock_confirmable"| STORE[(database query)] + CNF -.->|"writes: pop_confirmed, push_confirmed,\nset_block_confirmable, set_block_unconfirmable,\nset_filter_head"| STORE +``` + +--- + +## 10. Spec view + +### 10.1 Process abstraction + +``` +chaser_confirm : Process + state: { + position : ℕ, -- baseline; written in start + chain_confirmed : ℕ -- abstract: store-derived top + } + inputs: chase::{start, resume, bump, valid, regressed, disorganized, stop} + outputs: chase::{confirmable, unconfirmable, organized, reorganized, block} + effects: store mutations as in §7 +``` + +### 10.2 Safety properties + +1. 
**Confirmed monotonicity** (modulo reorgs): after each successful + `set_organized`, `top_confirmed` increases by exactly one. After + each successful `set_reorganized`, it decreases by exactly one. + +2. **Reorg atomicity** (Confirm-Rollback-1): if `confirm_block` fails + mid-fork, the post-rollback state equals the pre-`do_bumped` state. + Either the whole fork commits or no part of it does. + +3. **State-bit ordering** (Confirm-Order-1): `set_block_confirmable` is + the last write per block. A reader observing + `get_block_state(link) == block_confirmable` is guaranteed that + `set_filter_head` and (eventually) `push_confirmed` have completed. + +4. **Work monotonicity at switch** (`get_strong_fork`): the confirmed + chain only changes when the candidate's work strictly exceeds the + current confirmed work above the fork point. This is the consensus + "longest chain" rule for switching. + +5. **Bypassed soundness**: blocks confirmed in bypass mode were + previously gated by `is_under_checkpoint(height) ∨ + query.is_milestone(link)` in validate — a checkpoint/milestone is + the upstream proof of consensus correctness, transitively used here. + +### 10.3 Liveness + +- Provided validate keeps emitting `chase::valid`, every block on the + candidate chain with `state == block_valid` eventually receives a + `do_bumped` call that includes it in the fork. +- The self-bump after fork drain (`chaser_confirm.cpp:279`) ensures + no stall when `chase::valid` arrived during the in-progress + iteration. + +### 10.4 The UTXO oracle + +`query.block_confirmable(link)` is the UTXO double-spend check. Its +correctness is the responsibility of libbitcoin-database. 
For a formal +model, treat it as: + +``` +block_confirmable(link, store_state) → + Right(()) if every input refers to a UTXO present in store_state, + and double-spend checks pass + Left(error_code) otherwise +``` + +The chaser sequences calls so that `store_state` at the moment of +`block_confirmable(link)` reflects all blocks confirmed below `link`'s +height in this fork (because `set_block_confirmable` for prior heights +has already run by the loop ordering at `chaser_confirm.cpp:230-275`). + +--- + +## 11. Notes for the Lisp port + +- Sequential confirmation maps directly to a single-threaded function. + No parallelism to model. +- The reorg / rollback structure is a clean two-phase commit: + pop-confirmed-down-to-fork-point, then push-new-tops; on per-block + failure, run roll-back. +- The fork is returned by `get_validated_fork` as a list of + `header_state` records. Iterate forward. +- Three state arms (bypassed, block_valid, block_confirmable) and one + fault arm — easy `case` form. + +--- + +## 12. Notes for the formal model + +- The entire chaser is **strand-confined** (`Confirm-Conc-1`). + Single-threaded reasoning suffices. +- The most subtle property is the **rollback**: prove that if + `confirm_block` fails at height `h` after `j` prior successful + pushes in this fork, then `roll_back(popped, fork_point, h-1)` is + equivalent to undoing those `j` pushes and re-pushing the original + `popped`. +- The "current chain" gate on `chase::block` (`Confirm-Announce-1`) is + purely about peer-announcement filtering; not a safety property. 
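The rollback obligation above can be exercised on a toy model before attempting a proof. The following is a minimal sketch, not the chaser's implementation: the confirmed chain is a plain vector of integer block ids, and `try_reorganize` and its `confirmable` callback are hypothetical names introduced here for illustration (the work comparison that precedes a real reorganization is omitted).

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Toy model only: integer ids stand in for header links.
using chain_t = std::vector<int>;

// Attempts to replace everything above fork_point with `fork`.
// `confirmable` is a hypothetical stand-in for the store's per-block check.
// Returns true if the whole fork committed; otherwise `confirmed` is restored.
inline bool try_reorganize(chain_t& confirmed, std::size_t fork_point,
    const chain_t& fork, const std::function<bool(int)>& confirmable)
{
    // Phase 1: pop confirmed blocks down to the fork point, remembering them.
    chain_t popped;
    while (confirmed.size() > fork_point + 1)
    {
        popped.push_back(confirmed.back());
        confirmed.pop_back();
    }

    // Phase 2: push fork blocks in ascending order.
    std::size_t pushed = 0;
    for (const auto block : fork)
    {
        if (!confirmable(block))
        {
            // Roll back: undo the prior pushes, then re-push the popped
            // blocks in their original (ascending) order.
            for (; pushed > 0; --pushed)
                confirmed.pop_back();
            for (auto it = popped.rbegin(); it != popped.rend(); ++it)
                confirmed.push_back(*it);
            return false;
        }

        confirmed.push_back(block);
        ++pushed;
    }

    return true;
}
```

In this model the property to prove is simply that a `false` return leaves `confirmed` equal to its value on entry, for any fork and any failing position.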
+ +--- + +## Cross-references + +- [`00-overview.md`](00-overview.md) §5 (pipeline overview), §9 (error + categories — confirmN listed) +- [`01-event-bus.md`](01-event-bus.md) §2.5, §2.6 (confirm events: + `confirmable`, `unconfirmable`, `organized`, `reorganized`, `block`) +- [`02-chaser-organize.md`](02-chaser-organize.md) §5 (`chase::unvalid` + / `unconfirmable` trigger disorganize, which is how *this* chaser's + output flows back upstream) +- [`04-chaser-validate.md`](04-chaser-validate.md) §3 (issuer of + `chase::valid` consumed here) +- [`12-periphery-chasers.md`](12-periphery-chasers.md) (`chaser_snapshot`, consumer of `chase::block`) +- [`08-block-out-protocols.md`](08-block-out-protocols.md) and [`07-header-protocols.md`](07-header-protocols.md) + (consumers of `chase::block` for peer announcements) diff --git a/docs/architecture/06-sessions-and-protocols.md b/docs/architecture/06-sessions-and-protocols.md new file mode 100644 index 00000000..9893677d --- /dev/null +++ b/docs/architecture/06-sessions-and-protocols.md @@ -0,0 +1,746 @@ +# 06 — Sessions and the block-in protocol + +> Companion to [`00-overview.md`](00-overview.md), +> [`01-event-bus.md`](01-event-bus.md), and the chaser docs +> [`02`](02-chaser-organize.md)–[`05`](05-chaser-confirm.md). +> +> This doc covers: +> +> 1. **Session layer** — how `full_node` wires the node's chasers into +> libbitcoin-network's session machinery via a thin mixin, and the +> decision tree that determines which P2P protocols attach to a new +> channel. +> 2. **`protocol_block_in_31800`** — the heaviest peer-side class. It +> is the counterpart of `chaser_check` on the wire side: pulls +> download maps from `chaser_check`, sends `getdata`, receives +> blocks, checks and archives them, and emits `chase::checked` per +> block. +> +> The supporting bases `protocol`, `protocol_observer`, +> `protocol_performer` are described where they matter. 
+ +| File | Role | +| ----------------------------------------------------------------------- | ----------------------------------------------------------------- | +| `include/bitcoin/node/sessions/session.hpp` + `src/sessions/session.cpp`| `node::session` mixin (sibling to `network::session`) | +| `include/bitcoin/node/sessions/session_peer.hpp` + `.../impl/...ipp` | CRTP template combining mixin with `network::session_*` | +| `include/bitcoin/node/sessions/session_{inbound,outbound,manual}.hpp` | Thin subclasses of `session_peer<network::session_*>` (see §1.2) | +| `include/bitcoin/node/protocols/protocol.hpp` + `src/protocols/protocol.cpp` | `node::protocol` base — sibling of `network::protocol` | +| `src/protocols/protocol_observer.cpp` | Universal channel observer (handles `chase::suspend`) | +| `src/protocols/protocol_performer.cpp` | Mixin providing the speed-reporting loop | +| `src/protocols/protocol_block_in_31800.cpp` | Block download workhorse | + +--- + +## 1. The class hierarchy + +libbitcoin-node deliberately avoids diamond inheritance: `node::session` +is **a sibling**, not a parent, of `network::session`. The same applies +to protocols. 
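To make the sibling arrangement concrete, here is a minimal sketch under stated assumptions: `wire_session`, `node_mixin`, `peer_session`, and `as_node` are hypothetical stand-ins invented for this example, not the libbitcoin class definitions. It shows two unrelated polymorphic bases joined only in the most-derived class, and a cross-cast from one sibling to the other.

```cpp
#include <string>

// Hypothetical stand-ins; not the libbitcoin class definitions.
struct wire_session  // plays the role of network::session
{
    virtual ~wire_session() = default;
    virtual std::string attach() const { return "wire"; }
};

struct node_mixin  // plays the role of node::session
{
    virtual ~node_mixin() = default;
    int top_height() const { return 42; }  // would forward to the node
};

// The concrete session derives from both siblings. The siblings share no
// common base, so no diamond arises.
struct peer_session : wire_session, node_mixin
{
    std::string attach() const override
    {
        return wire_session::attach() + "+node@" + std::to_string(top_height());
    }
};

// Cross-cast from one sibling to the other through the most-derived object.
// Yields nullptr when the object is not also a node_mixin.
inline const node_mixin* as_node(const wire_session& session)
{
    return dynamic_cast<const node_mixin*>(&session);
}
```

The cross-cast only succeeds when the dynamic type really derives from both bases, which is the property the sibling pattern relies on.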
+ +### 1.1 Session hierarchy + +```mermaid +classDiagram + class network_session["network::session"] { + +start_channel + +attach_handshake + +attach_protocols (virtual) + } + class node_session["node::session"] { + +organize(header|block, h) + +get_hashes / put_hashes + +notify / notify_one / subscribe_events + +performance(channel, speed, h) + +get_memory() + +fault(ec) + +archive() / *_settings() + } + class network_session_outbound["network::session_outbound"] + class network_session_inbound["network::session_inbound"] + class network_session_manual["network::session_manual"] + class session_peer["session_peer<NetworkSession> (template)"] { + +create_channel (override) + +attach_handshake (override) + +attach_protocols (override) + } + class session_outbound + class session_inbound { + +enabled() override + } + class session_manual + + network_session <|-- network_session_outbound + network_session <|-- network_session_inbound + network_session <|-- network_session_manual + network_session_outbound <|-- session_peer + node_session <|-- session_peer + session_peer <|-- session_outbound + session_peer <|-- session_inbound + session_peer <|-- session_manual + + note for session_peer "Multiply derived:\n• node::session for chaser/bus access\n• network::session_* for socket lifecycle" + note for node_session "All methods forward to full_node" +``` + +The `node::session` mixin (`src/sessions/session.cpp:35-160`) is **pure +forwarding**: every method delegates to the held `full_node&`. So at the +session layer, there is *no node-specific state*; the mixin's only role +is to give protocols a typed handle on the node. + +### 1.2 Per-concrete-session specialisation + +```cpp +// session_outbound.hpp:28-35, session_manual.hpp:28-35: +class session_outbound : public session_peer<network::session_outbound> { ... }; +class session_manual : public session_peer<network::session_manual> { ... }; +// session_inbound.hpp:28-38 — only override: +class session_inbound : public session_peer<network::session_inbound> { +public: + ... 
+protected: + bool enabled() const NOEXCEPT override; // see §1.3 +}; +``` + +### 1.3 Inbound `enabled()` gate + +```cpp +// src/sessions/session_inbound.cpp:26-29 +bool session_inbound::enabled() const NOEXCEPT +{ + return !node_settings().delay_inbound || is_recent(); +} +``` + +> **Invariant (Session-Inbound-1).** Inbound connection attempts are +> rejected (the network layer disables the listener) until either +> `delay_inbound == false` *or* the confirmed chain is "recent". The +> definition of "recent" is the same as `full_node::is_recent` — top +> equals configured max height *or* top timestamp is within the +> `currency_window` (`src/full_node.cpp:415-425`). This prevents a +> not-yet-caught-up node from serving stale data. + +This is implemented via `enabled()` rather than the bus +`suspend`/`resume` mechanism so that the listener has independent +control flow. + +--- + +## 2. Channel construction and protocol attach + +The CRTP template `session_peer<NetworkSession>` overrides three hooks +that the network layer calls per accepted/connected channel. + +### 2.1 `create_channel(socket)` — line `session_peer.ipp:29-39` + +```cpp +const auto channel = system::emplace_shared<node::channel_peer>( + this->get_memory(), // ← block arena from full_node + this->log, socket, this->create_key(), + this->node_config(), this->options()); +``` + +Returns a `node::channel_peer` (the node's channel subclass) upcast to +`network::channel`. Critical: the channel is allocated against the +**node's block memory arena**, which gates block lifetime (see +[`00-overview.md §8`](00-overview.md#8-memory-model-the-block-arena)). + +### 2.2 `attach_handshake(channel, handler)` — line `session_peer.ipp:41-55` + +Runs *before* the version handshake. 
Sets the channel's `start_height` +from `query.get_top_confirmed()` so the outgoing `version` message +carries the right height to the peer: + +```cpp +const auto top = this->archive().get_top_confirmed(); +const auto peer = std::dynamic_pointer_cast<node::channel_peer>(channel); +peer->set_start_height(top); +NetworkSession::attach_handshake(channel, std::move(handler)); +``` + +The base then runs the standard version exchange from +libbitcoin-network. + +### 2.3 `attach_protocols(channel)` — line `session_peer.ipp:57-161` + +After handshake succeeds, this method decides **which P2P protocols run +on the channel** based on negotiated peer features and node +configuration. This is the central network-layer decision tree. + +```mermaid +flowchart TD + A[attach_protocols] --> B[NetworkSession::attach_protocols\nping/address/alert/reject] + B --> C[protocol_observer] + C --> D{node_client_filters\n&& blocks_out\n&& peer.is_negotiated bip157} + D -- yes --> D1[protocol_filter_out_70015] + D -- no --> E{node_network?} + D1 --> E + E -- no --> Z(["return (no block/tx wiring)"]) + E -- yes --> F{peer.is_peer_service node_network?} + F -- yes --> G{headers && peer.is_negotiated bip130?} + G -- yes --> G1[protocol_header_in_70012\n+ protocol_block_in_31800] + G -- no --> H{headers && peer.is_negotiated headers_protocol?} + H -- yes --> H1[protocol_header_in_31800\n+ protocol_block_in_31800] + H -- no --> H2[protocol_block_in_106] + G1 --> I + H1 --> I + H2 --> I + F -- no --> I + I{blocks_out?} + I -- yes --> J{headers && peer.is_negotiated bip130?} + J -- yes --> J1[protocol_header_out_70012\n+ protocol_block_out_70012] + J -- no --> K{headers && peer.is_negotiated headers_protocol?} + K -- yes --> K1[protocol_header_out_31800\n+ protocol_block_out_106] + K -- no --> K2[protocol_block_out_106] + J1 --> L + K1 --> L + K2 --> L + I -- no --> L + L{txs_in_out && peer_version.relay?} + L -- yes --> L1[protocol_transaction_out_106] + L -- no --> Z2(["return"]) + L1 --> Z2 +``` + +### 2.4 Predicate 
definitions + +| Predicate | Definition (file:line) | +| -------------------------- | ------------------------------------------------------------------------------------- | +| `delay` | `node_settings.delay_inbound` (`session_peer.ipp:70`) | +| `headers` | `node_settings.headers_first` (`session_peer.ipp:71`) | +| `relay` | `network_settings.enable_relay` (`session_peer.ipp:69`) | +| `node_network` | `services_maximum & service::node_network` (`session_peer.ipp:72-76`) | +| `node_client_filters` | `services_maximum & service::node_client_filters` (`session_peer.ipp:77-81`) | +| `blocks_out` | `!delay || is_recent()` (`session_peer.ipp:90`) | +| `txs_in_out` | `relay && peer.is_negotiated(bip37) && (!delay || is_current(true))` (`session_peer.ipp:111-112`) | + +`peer.is_negotiated(level)` consults the handshake-negotiated protocol +level. `peer.is_peer_service(service::node_network)` checks the peer's +advertised service bits. + +> **Invariant (Attach-1).** Every channel receives `protocol_observer` +> (`session_peer.ipp:87`), so every channel is subject to +> `chase::suspend` (and stop). Without this, suspend would not reach +> all channels. + +> **Invariant (Attach-2).** `protocol_block_in_31800` is attached +> exactly when the peer advertises `node_network` *and* either `bip130` +> or `headers_protocol` is negotiated *and* headers-first mode is on. +> The legacy `protocol_block_in_106` path is only for peers below +> protocol level 31800. + +> **Note for the port.** Future P2P protocol additions slot into this +> tree by adding new conditional branches. A spec can model the result +> as a *static* set of protocols per channel decided at handshake; the +> set never changes during the channel's lifetime. + +--- + +## 3. `node::protocol` base — sibling pattern + +`include/bitcoin/node/protocols/protocol.hpp:33-99`: + +`node::protocol` does **not** inherit from `network::protocol`. 
Instead, +concrete protocols multiply-inherit from both +`network::protocol_xxx` (the wire-level base) *and* `node::protocol` +(the node-aware mixin). The `subscribe_events` machinery uses +`shared_from_sibling` to get a `shared_ptr<node::protocol>` out of a +`network::protocol&` — see `protocol.cpp:74-84`. + +### 3.1 Event subscription protocol + +```cpp +// src/protocols/protocol.cpp:72-84 +void protocol::subscribe_events(event_notifier&& handler) NOEXCEPT { + const auto self = dynamic_cast<network::protocol&>(*this) + .shared_from_sibling<node::protocol, network::protocol>(); + + event_completer completer = std::bind(&protocol::handle_subscribed, + self, _1, _2); + + session_->subscribe_events(std::move(handler), + std::bind(&protocol::handle_subscribe, + self, _1, _2, std::move(completer))); +} +``` + +The chain: protocol → session → full_node → posts to node strand → +subscribes → completer fires → `handle_subscribe` stores the `key_` → +calls `handle_subscribed` → posts back to channel strand → calls the +override `subscribed(ec, key)` for any protocol-specific init. + +> **Invariant (Protocol-Sub-1).** Each protocol may have *at most one +> active subscription*. Enforced by `BC_ASSERT_MSG(is_zero(key_), +> "unsafe access")` in `handle_subscribe` (`protocol.cpp:91`). + +> **Invariant (Protocol-Sub-2).** `unsubscribe_events` must be called +> from the protocol's `stopping(...)` override +> (`include/bitcoin/node/protocols/protocol.hpp:80-81`). The base +> `subscribed` calls it on early failure (`protocol.cpp:114-121`). + +### 3.2 `protocol_observer` — channel-suspend listener + +Every channel has one. Its job is two-fold +(`src/protocols/protocol_observer.cpp`): + +1. **Translate `chase::suspend` to channel stop.** When the bus + broadcasts `suspend`, every `protocol_observer` calls + `stop(error::suspended_channel)` on its channel + (`protocol_observer.cpp:77-80`). Result: a `suspend` event tears + down every peer connection. +2. 
**Hygiene: drop peers that send unrequested tx inventories** if + relay is disallowed (`protocol_observer.cpp:101-127`). This is a + defensive check. + +> **Invariant (Observer-Suspend-1).** A single `chase::suspend` +> emission drops *all* channels via the per-channel observer. The base +> `network::net` also suspends its listeners (see +> [`00-overview.md §6.2`](00-overview.md#62-suspend--resume--fault)), so +> the combined effect is "stop all current and refuse new". + +--- + +## 4. `protocol_performer` — speed reporting loop + +`src/protocols/protocol_performer.cpp`. This is a mixin used by +`protocol_block_in_31800` to drive the speed-reporting cycle. + +```mermaid +stateDiagram-v2 + [*] --> IDLE + IDLE --> RUNNING: start_performance\n(start_=now; bytes_=0; timer) + RUNNING --> RUNNING: count(bytes)\nbytes_ += bytes + RUNNING --> TICK: timer fires + TICK --> SEND_RATE: !is_idle\nrate = bytes_ / elapsed_seconds + TICK --> PAUSED: is_idle (exhausted)\nsend_performance(max_uint64) + SEND_RATE --> APPLY: performance(rate, ...) + APPLY: do_handle_performance(ec) + APPLY --> IDLE: ec == exhausted → return + APPLY --> STOPPED: ec ∈ {stalled, slow} → stop + APPLY --> RUNNING: ec == success → start_performance + PAUSED --> [*]: (timer stopped until next chase::download) + STOPPED --> [*] +``` + +Key code paths: + +- `start_performance()` (`protocol_performer.cpp:35-46`): start clock, + reset byte counter, arm the timer. +- `handle_performance_timer(ec)` (`:48-76`): on tick, if idle ⇒ + `pause_performance` (sends `max_uint64`); else compute rate in bytes/s + and `send_performance`. +- `send_performance(rate)` (`:90-113`): dispatches based on + configuration: + - If `deviation_` is enabled: send via the + `performance(rate, handler)` RPC → chaser_check → maybe + `error::slow_channel`. + - Else: only stalled/exhausted detection (no σ analysis). 
+- `do_handle_performance(ec)` (`:123-152`): the dispatch table: + +| Reply code | Reaction | +| ------------------ | --------------------------------------------------------------- | +| `exhausted_channel`| Stop the timer; wait for `chase::download` to restart | +| `stalled_channel` | `stop(ec)` — drop the channel | +| `slow_channel` | `stop(ec)` — drop the channel | +| other error | `stop(ec)` | +| `success` | `start_performance()` — go again | + +> **Invariant (Performer-1).** `count(bytes)` accumulates incoming +> block bytes between ticks; `bytes_` is reset at every +> `start_performance`. So `rate = bytes_/elapsed` is bytes/sec since +> the last tick boundary. + +> **Invariant (Performer-2).** A channel that returns +> `exhausted_channel` is NOT dropped; only the timer is paused. It +> resumes on the next `chase::download` event (via +> `protocol_block_in_31800::do_get_downloads` calling +> `start_performance`, see §5.2). + +--- + +## 5. `protocol_block_in_31800` — the block-download protocol + +This class is the on-the-wire counterpart of `chaser_check`. The +chaser holds the *queue of pending downloads* (per [`03`](03-chaser-check.md)); +this protocol holds *one map of in-flight items per channel* and +shuttles between them. + +### 5.1 Per-channel state + +```cpp +// (private members; signatures from the .cpp) +map_ptr map_; // current download map; empty ⇒ idle +job::ptr job_; // shared race_all barrier (held while in flight) +size_t bytes_; // bytes received this performance window (inherited) +``` + +Plus inherited from `protocol_performer`, `protocol_peer`, and +`protocol`. 
+ +### 5.2 Lifecycle + +```cpp +// protocol_block_in_31800.cpp:45-56 +void start() { + if (started()) return; + subscribe_events(BIND(handle_event, _1, _2, _3)); + SUBSCRIBE_CHANNEL(block, handle_receive_block, _1, _2); + protocol_performer::start(); +} + +// :59-77 — subscribed (called after event subscription completes) +void subscribed(ec, key) { + if (stopped(ec)) { unsubscribe_events(); return; } + if (is_current(false)) { + start_performance(); + get_hashes(BIND(handle_get_hashes, _1, _2, _3)); + } +} + +// :80-88 — stopping +void stopping(ec) { + restore(map_); // return any unfinished work + map_ = chaser_check::empty_map(); + stop_performance(); + unsubscribe_events(); + protocol_performer::stopping(ec); +} +``` + +> **Invariant (BlockIn-Lifecycle-1).** On stop, the protocol *returns +> its current map* via `put_hashes`, so the chaser can re-queue the +> work for another channel. `restore` is a no-op if the map is empty +> (`protocol_block_in_31800.cpp:386-390`). + +> **Invariant (BlockIn-Lifecycle-2).** Initial download only starts if +> the candidate chain is *current* at subscription time +> (`:72-76`). Otherwise the protocol waits for `chase::download` +> events to kick it (see §5.3). 
+ +### 5.3 Bus event handling + +```mermaid +stateDiagram-v2 + [*] --> IDLE: start (map_ = empty, performance timer started) + + IDLE --> RECV: chase::download\nif is_idle: start_performance, get_hashes + RECV --> SENDING: send_get_data(map, job)\nSEND get_data message + SENDING --> RECEIVING: peer ack + RECEIVING --> RECEIVING: handle_receive_block(block)\ncheck → set_code → emit chase::checked\nerase from map_ + RECEIVING --> NEED_MORE: map_ empty\nget_hashes again + NEED_MORE --> IDLE: handle_get_hashes\nif map empty: emit chase::starved + NEED_MORE --> SENDING: handle_get_hashes\ngot work: send_get_data + + IDLE --> SPLIT_OFF: chase::stall\nif map_.size > 1: split half + restore both + stop + RECEIVING --> SPLIT_OFF: chase::stall\nsame + IDLE --> SPLIT_OFF: chase::split (notify_one)\nsame as stall but targeted + RECEIVING --> SPLIT_OFF: chase::split (notify_one)\nsame + SPLIT_OFF: stop(sacrificed_channel) + + IDLE --> PURGED: chase::purge\nclear map; stop + RECEIVING --> PURGED: chase::purge\nsame + SENDING --> PURGED: chase::purge\nsame + PURGED: stop(sacrificed_channel) + + IDLE --> [*]: chase::stop, channel stop, etc. 
+ RECEIVING --> [*] + SENDING --> [*] + NEED_MORE --> [*] +``` + +The mapping of bus events → handlers +(`protocol_block_in_31800.cpp:99-156`): + +| Event | Handler | Effect | +| ------------------ | ----------------- | ----------------------------------------------------------------------------------------------- | +| `chase::download` | `do_get_downloads`| If idle: `start_performance` + `get_hashes` | +| `chase::split` | `do_split` | If map_ > 1 item: split half, return both halves, stop with `sacrificed_channel` | +| `chase::stall` | `do_stall` | Same as split (split if divisible work, else no-op) | +| `chase::purge` | `do_purge` | Clear map_, stop with `sacrificed_channel` | +| `chase::report` | `do_report` | LOG only — current map size | +| `chase::stop` | (return false) | Unsubscribe | + +### 5.4 The download cycle (handle_get_hashes → send_get_data → handle_receive_block) + +```mermaid +sequenceDiagram + autonumber + participant CHK as chaser_check + participant PB as protocol_block_in_31800 + participant P as Peer + participant Q as query + participant Bus as event bus + + Note over PB: idle (map_ empty) + PB->>CHK: get_hashes(handle_get_hashes) + CHK-->>PB: (success, map, job) + alt map empty + PB->>Bus: notify(chase::starved, events_key) + else + PB->>PB: send_get_data(map, job)\nmap_ = map; job_ = job + PB->>P: getdata(items) + loop per block sent by peer + P->>PB: block message + PB->>PB: handle_receive_block(block) + alt hash not in map_ + Note over PB: ignore (unrequested), return true + else + PB->>PB: check(block, ctx, bypass) + alt malleated / commitment failure + PB->>P: stop(invalid_commitment) ← drops channel + else other check failure + PB->>Q: set_block_unconfirmable(link) + PB->>Bus: notify(chase::unchecked, link) + PB->>P: stop(ec) + else success + PB->>Q: set_code(block, link, checked, bypass, height) + PB->>Bus: notify(chase::checked, height) + PB->>PB: count(serialized_size); map_.erase(it) + alt map_ now empty + PB->>PB: job_.reset() 
(release barrier) + PB->>CHK: get_hashes(...) + end + end + end + end + end +``` + +### 5.5 The `check` function and bypass + +```cpp +// protocol_block_in_31800.cpp:365-381 +code check(const chain::block& block, const chain::context& ctx, + bool bypass) const NOEXCEPT +{ + if (bypass) { + if (const auto ec = block.identify()) return ec; + if (const auto ec = block.identify(ctx)) return ec; + } else { + if (const auto ec = block.check()) return ec; + if (const auto ec = block.check(ctx)) return ec; + } + return error::success; +} +``` + +- `block.identify` (libbitcoin-system): cheap surface checks — header + match, transaction commitment, witness commitment. +- `block.check`: full check (identify + size limits + tx checks + + sigops + …). + +`bypass = is_under_checkpoint(height) || query.is_milestone(link)` +(`:296-297`). + +> **Invariant (BlockIn-Check-1).** Under bypass, only identity is +> verified at receive time. Full consensus checks are skipped (the +> upstream checkpoint/milestone is the proof). Note: identity +> failure (`invalid_witness_commitment`, +> `invalid_transaction_commitment`) is interpreted as **malleation** — +> stop the peer but do *not* mark the block unconfirmable +> (`protocol_block_in_31800.cpp:303-314`). A different peer may yet +> deliver the canonical block. + +> **Invariant (BlockIn-Check-2).** Non-malleation check failure ⇒ +> `set_block_unconfirmable(link)` is called AND `chase::unchecked` +> is emitted (`:316-327`). `chaser_organize` will then disorganize +> down to this link. + +### 5.6 Storage write + +```cpp +// :333-340 +if (const auto code = query.set_code(*block, link, checked, bypass, height)) { + LOGF("Failure storing block ..."); + stop(fault(code)); + return false; +} +``` + +`query.set_code` associates the block body with its header link in the +store. Failure here is a node fault (suspends the network via +`fault(code)`). + +> **Invariant (BlockIn-Store-1).** A `chase::checked(height)` event is +> emitted *only after* `set_code` succeeded (`:333-348`). 
So consumers +> of `chase::checked` can rely on the block being durably associated. + +### 5.7 Self-stall detection + +```cpp +// :421-424 in handle_get_hashes +if (map->empty()) { + notify(error::success, chase::starved, events_key()); + return; +} +``` + +If the chaser hands the protocol an empty map, this protocol is *out of +work*. Emitting `chase::starved` causes `chaser_check::do_starved` to +either pick the slowest channel and signal a `split` (rerouting its +work to here) or — if no channels are tracked — broadcast `stall` +(asking *any* channel with divisible work to split). + +> **Invariant (BlockIn-Starve-1).** `chase::starved` is emitted with +> the protocol's own subscription key (`events_key()`). The chaser +> uses this key to *avoid* asking the starved peer to split its own +> (empty) work (`chaser_check.cpp:219-221`). + +--- + +## 6. Coupling diagram + +```mermaid +flowchart LR + SS[session_outbound / inbound / manual] -- "attach_protocols" --> P_observer[protocol_observer] + SS -- "attach if headers-first + bip130" --> P_block_in[protocol_block_in_31800] + SS -- "attach if headers-first + bip130" --> P_hin_70012[protocol_header_in_70012] + SS -- "attach if blocks_out" --> P_block_out[protocol_block_out_*] + SS -- "attach if filters & client" --> P_filter[protocol_filter_out_70015] + + P_block_in -- "get_hashes / put_hashes / performance" --> SES[node::session forwarder] + SES --> FN[full_node] + FN --> CHK[chaser_check] + + CHK -- "chase::download (count)" --> P_block_in + CHK -- "chase::purge (bp)" --> P_block_in + CHK -- "chase::split (target via notify_one)" --> P_block_in + CHK -- "chase::stall (peer)" --> P_block_in + + P_block_in -- "chase::checked (height)" --> CHK + P_block_in -- "chase::checked (height)" --> VAL[chaser_validate] + P_block_in -- "chase::unchecked (link)" --> ORG[chaser_organize] + P_block_in -- "chase::starved (key)" --> CHK + + P_observer -- "chase::suspend → stop(channel)" --> P_block_in + P_observer -- "chase::suspend → 
stop(channel)" --> P_block_out + P_observer -- "chase::suspend → stop(channel)" --> P_hin_70012 + + P_block_in -.->|"get_data / block via wire"| PEER[(peer)] + P_block_in -.->|"set_code(block, link, ...)"| STORE[(query)] + P_block_in -.->|"set_block_unconfirmable"| STORE +``` + +--- + +## 7. Error / outcome inventory for `protocol_block_in_31800` + +The protocol returns/uses these codes; only `protocol1` is a node-fault. + +| Code | Site | Trigger | +| ------------------------------------- | ------------------------------------------ | ------------------------------------------------------------------------------ | +| `system::error::invalid_transaction_commitment` | `:305-313` | Malleation in tx commitment — peer dropped; block left unmarked | +| `system::error::invalid_witness_commitment` | `:305-313` | Same, witness | +| any other consensus error | `:316-327` | `set_block_unconfirmable`; emit `chase::unchecked`; stop peer | +| `node::error::protocol1` | `:318` | `set_block_unconfirmable` itself failed — node fault | +| `network::error::sacrificed_channel` | `:188, :202, :216` | Self-sacrifice on `purge`/`stall`/`split` | +| any database error code | `:336-339` | `set_code` failure — node fault | + +--- + +## 8. Spec view + +### 8.1 Sessions as protocol selectors + +The session layer's only consensus-relevant action is the **attach +tree** (§2.3). A spec can model: + +``` +attach(channel, peer, settings) : Set Protocol = + { observer } ∪ filters(...) ∪ block_in_set(...) ∪ block_out_set(...) ∪ tx_set(...) +``` + +Once `start_channel` succeeds and `attach_protocols` runs, the set is +fixed for that channel's lifetime. + +### 8.2 `protocol_block_in_31800` as a process + +- **State**: `map ∈ MaybeQueue (HashPair)`, `job ∈ MaybeJobBarrier`, + `bytes ∈ ℕ`. 
+- **Inputs** (channel strand): + - peer `block` messages + - bus events listed in §5.3 + - performance timer ticks +- **Inputs** (chaser strand, via RPC results): + - `handle_get_hashes(ec, map, job)` + - `handle_put_hashes(ec, count)` +- **Outputs**: + - peer `getdata` messages + - bus events `chase::checked`, `chase::unchecked`, `chase::starved` + - store mutations: `set_code(...)`, `set_block_unconfirmable(...)` + - `chase::valid` is *not* emitted by this protocol (it only emits + `chase::checked`); validation comes from `chaser_validate`. + +### 8.3 Safety properties + +1. **Store-precedes-bus** (BlockIn-Store-1): `chase::checked(h)` ⇒ + the block is durably stored at the link `to_candidate(h)`. +2. **No silent invalidation**: every `chase::unchecked(link)` is + preceded by a successful `set_block_unconfirmable(link)`. Failure + path goes to fault, not silent drop (BlockIn-Check-2). +3. **Map ownership**: at any time, a given map entry is in exactly + one place: in `chaser_check.maps_`, in some channel's `map_`, or + in flight as a returned `restore` to the chaser. Proving this + requires reasoning about the asynchronous `restore` callback. +4. **Job barrier completeness**: while any channel holds `job_`, the + chaser's `race_all` is not complete. So the chaser cannot enter + `purging` state during a request → response cycle without an + explicit `chase::purge` first. + +### 8.4 Liveness + +- Each channel either keeps progressing (`chase::checked` per block) or + emits `chase::starved` when its map drains, prompting the chaser to + rebalance. +- A persistently slow channel is dropped via `slow_channel`; a + persistently stuck channel via `stalled_channel`. + +--- + +## 9. Notes for the Lisp port + +- **Sessions**: a Lisp port can collapse all three session subclasses + into one (`enabled` for inbound is the only customisation). The + attach tree (§2.3) is the only behavioural component. 
+- **`node::protocol` sibling pattern**: this is a workaround for C++ + multiple inheritance; in Lisp there is no analogue needed. Wire + protocols can simply contain a reference to the node. +- **`protocol_block_in_31800`**: model it as an actor with one mailbox + for peer messages and one for chaser events. The interleaving on the + channel strand makes it inherently single-threaded per channel; one + Lisp actor per channel suffices. +- **`protocol_performer`**: implementable as a recurring scheduled + task that snapshots `bytes_` and computes a rate. +- The **block arena** lifetime contract on `block::cptr` is the one + C++-specific subtlety that must be replaced by GC ownership in Lisp. + +--- + +## 10. Notes for the formal model + +- The session layer adds no shared state — it is a static dispatch + from configuration + handshake state to a finite protocol set. +- `protocol_block_in_31800` is strand-confined; its only off-strand + interactions are `get_hashes`/`put_hashes`/`performance` RPCs which + post to the chaser strand. The full system can be modelled as one + chaser strand + N independent channel strands communicating only by + the bus and these three RPCs. +- The **map-ownership** property (§8.3-3) is the non-trivial + correctness obligation here; an exchange protocol proof can be + modelled with a token-passing semantics. 
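The map-ownership obligation can be prototyped as a token-passing model before it is formalised. The sketch below is an illustration under simplifying assumptions, not the node's code: `chaser_model`, `channel_model`, and `owners` are hypothetical names, an `int` stands in for a download item, and the asynchronous `restore` hop is collapsed into a synchronous call (so the "in flight" state that a real proof must cover does not appear).

```cpp
#include <cstddef>
#include <deque>
#include <set>
#include <utility>
#include <vector>

// Toy model: an int stands in for one (hash, height) download item.
using map_t = std::set<int>;

struct chaser_model
{
    std::deque<map_t> maps;  // queued work, in the role of the chaser's map queue

    // Hand one map to a channel; an empty result means "no work" (starved).
    map_t get_hashes()
    {
        if (maps.empty())
            return {};

        map_t out = std::move(maps.front());
        maps.pop_front();
        return out;
    }

    // Take unfinished work back from a stopping channel.
    void put_hashes(map_t&& map)
    {
        if (!map.empty())
            maps.push_back(std::move(map));
    }
};

struct channel_model
{
    map_t map;  // items currently owned by this channel

    void request(chaser_model& chaser) { map = chaser.get_hashes(); }
    void received(int item) { map.erase(item); }
    void stop(chaser_model& chaser)
    {
        chaser.put_hashes(std::move(map));
        map.clear();  // make the moved-from state explicitly empty
    }
};

// Number of places holding `item`; the ownership invariant requires <= 1.
inline std::size_t owners(int item, const chaser_model& chaser,
    const std::vector<channel_model>& channels)
{
    std::size_t count = 0;
    for (const auto& queued : chaser.maps)
        count += queued.count(item);
    for (const auto& channel : channels)
        count += channel.map.count(item);
    return count;
}
```

A model checker or property test would assert `owners(item, ...) <= 1` after every step, and `== 1` for every item not yet received.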
+ +--- + +## Cross-references + +- [`00-overview.md`](00-overview.md) §7 (network layer at a glance) +- [`01-event-bus.md`](01-event-bus.md) §2.1 (work-shuffling events + emitted/consumed here), §2.3 (`checked`/`unchecked`) +- [`03-chaser-check.md`](03-chaser-check.md) §6 (the chaser side of + performance and starved/split/stall), §9 (coupling diagram — + upstream pair of this doc's §6) +- [`07-header-protocols.md`](07-header-protocols.md) (header sync protocols) +- [`08-block-out-protocols.md`](08-block-out-protocols.md) (block-serving counterpart) +- [`09-filter-out-70015.md`](09-filter-out-70015.md) (BIP158 filter serving) +- libbitcoin-network docs (external): the base `network::session_*`, + `network::protocol`, and channel/timer primitives. diff --git a/docs/architecture/07-header-protocols.md b/docs/architecture/07-header-protocols.md new file mode 100644 index 00000000..91998576 --- /dev/null +++ b/docs/architecture/07-header-protocols.md @@ -0,0 +1,550 @@ +# 07 — Header protocols (in/out, 31800 / 70012) + +> Companion to [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md) +> and [`02-chaser-organize.md`](02-chaser-organize.md). +> +> The header protocols are the wire-side of `chaser_header`: they sync +> headers from peers (`protocol_header_in_*`) and serve headers to peers +> (`protocol_header_out_*`). There are two versioned pairs: +> +> - **31800**: classic. Sync via `get_headers`/`headers`; announcements +> discovered indirectly via `inv` messages. +> - **70012** (BIP130): adds `sendheaders` — peers announce new blocks +> directly with `headers` messages, bypassing `inv`. +> +> 70012 *inherits* from 31800 in both directions — it's an extension, +> not a replacement. 
+ +| File | Lines | Role | +| ------------------------------------------------ | ----- | -------------------------------------------------------------------- | +| `src/protocols/protocol_header_in_31800.cpp` | 218 | Inbound sync + inv-based announcement detection | +| `src/protocols/protocol_header_in_70012.cpp` | 47 | Adds `sendheaders` request after initial sync | +| `src/protocols/protocol_header_out_31800.cpp` | 90 | Reply to `get_headers` | +| `src/protocols/protocol_header_out_70012.cpp` | 141 | Adds `chase::block` → announce new blocks via `headers` | + +--- + +## 1. Inheritance and overrides + +```mermaid +classDiagram + class protocol_peer["node::protocol_peer\n(forwards organize, performance,\nnotify; set_announced/was_announced)"] + class protocol_header_in_31800 { + +start + +handle_receive_headers + +handle_receive_inventory + +handle_organize + +complete + +subscribed : bool + } + class protocol_header_in_70012 { + +complete (override) + } + class protocol_header_out_31800 { + +start + +handle_receive_get_headers + +create_headers + } + class protocol_header_out_70012 { + +start (override) + +stopping (override) + +handle_event (override) + +handle_receive_send_headers + +do_announce + } + protocol_peer <|-- protocol_header_in_31800 + protocol_header_in_31800 <|-- protocol_header_in_70012 + protocol_peer <|-- protocol_header_out_31800 + protocol_header_out_31800 <|-- protocol_header_out_70012 +``` + +> **Invariant (HeaderProto-Inherit-1).** 70012 protocols *augment* +> 31800. Sync (in) and serve (out) work identically; 70012 only adds +> the announcement-via-headers path on top. + +--- + +## 2. 
`protocol_header_in_31800` — inbound sync + +### 2.1 Sync algorithm + +```mermaid +sequenceDiagram + autonumber + participant PHI as protocol_header_in_31800 + participant PEER as Peer + participant ORG as chaser_header (organize) + participant Q as query + + Note over PHI: start + PHI->>Q: get_candidate_hashes(heights(top_candidate)) + PHI->>PEER: get_headers (locator = candidate top hashes) + PEER-->>PHI: headers(2000 max) + loop per header in message + PHI->>ORG: organize(header, handle_organize) + ORG-->>PHI: handle_organize(ec, height) (off-strand) + end + alt size == max_get_headers + PHI->>PEER: get_headers (locator = last header hash) + else size < max_get_headers + Note over PHI: complete() — peer exhausted + end +``` + +Source: +- `start` at `protocol_header_in_31800.cpp:40-50` +- `handle_receive_headers` at `:57-97` +- `handle_organize` at `:100-128` (logs only) +- `complete` at `:132-143` + +### 2.2 The locator + +`create_get_headers()` (`:177-185`) builds the locator from the +*archived candidate chain*. The heights used are +`get_headers::heights(top_candidate)` — the standard exponential-back +locator. The peer uses this to find the highest common ancestor and +replies with up to `max_get_headers` (typically 2000) consecutive +headers above that. + +> **Invariant (HeaderIn-Sync-1).** Each channel syncs *independently* +> from the candidate top. The locator is recomputed from the store on +> every `get_headers` send. With many parallel channels, each will +> converge to the same head — duplicates are dropped by +> `chaser_organize::do_organize` (returns `error::duplicate_header`, +> ignored by `handle_organize`). + +> **Invariant (HeaderIn-Sync-2).** A response of fewer than +> `max_get_headers` headers means the peer is caught up (no more +> headers to send). The protocol transitions to "announcement mode" +> via `complete()`. 
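The exponential-back locator and the "short reply means exhausted" rule from §2.1 and §2.2 can be sketched as two small pure functions. This is a minimal sketch of the commonly used scheme (dense near the top, then doubling steps down to genesis); `locator_heights` and `peer_exhausted` are hypothetical names, and the exact heights produced by `get_headers::heights` may differ from this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: heights for a block locator, densest near the top.
// The first ten entries step back by one; after that the step doubles.
std::vector<std::size_t> locator_heights(std::size_t top)
{
    std::vector<std::size_t> heights;
    std::size_t step = 1;
    std::size_t height = top;

    while (height > 0)
    {
        heights.push_back(height);
        if (heights.size() >= 10)
            step *= 2;

        // Clamp at genesis rather than underflowing.
        height = (height > step) ? height - step : 0;
    }

    heights.push_back(0); // genesis always terminates the locator
    return heights;
}

// Hypothetical helper for HeaderIn-Sync-2: a reply shorter than the
// maximum means the peer has nothing further to send.
bool peer_exhausted(std::size_t received, std::size_t max_get_headers)
{
    return received < max_get_headers;
}
```

For a candidate top at height 3 the sketch yields `3, 2, 1, 0`; a full reply of 2000 headers keeps the sync loop going, while anything shorter would lead to `complete()`.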
+ +### 2.3 `complete()` — switching to announcement mode + +```cpp +// :132-143 +void protocol_header_in_31800::complete() NOEXCEPT +{ + if (!subscribed && is_current(true)) { + subscribed = true; + SUBSCRIBE_CHANNEL(inventory, handle_receive_inventory, _1, _2); + } +} +``` + +After initial sync, the protocol subscribes to **inventory** messages +to detect new block announcements indirectly (no `sendheaders` at +31800). + +`handle_receive_inventory` (`:149-172`): for each block-typed `inv` +item, if `query.is_block(hash)` is false (we don't have it), send a +fresh `get_headers` to learn about the new branch. + +> **Invariant (HeaderIn-Subscribe-1).** Announcement subscription is +> gated on `is_current(true)` — confirmed chain is recent. Until then, +> the protocol does only initial sync. This prevents announcement +> traffic during catch-up. + +> **Invariant (HeaderIn-Subscribe-2).** `subscribed` is one-shot: +> latched true on first successful `complete()`, never reset. Each +> channel subscribes at most once. + +### 2.4 `set_announced` hookup + +Inside `handle_receive_headers` (`:74-75`): + +```cpp +if (subscribed) + set_announced(ptr->get_hash()); +``` + +Headers received *after* subscription are recorded as +"announced-from-this-peer", to be checked by +`protocol_header_out_70012` so we don't echo them back. See §5.3 and §8. + +--- + +## 3. `protocol_header_in_70012` — adds `sendheaders` + +47 lines. Only override: `complete()`. + +```cpp +// :31-44 +void protocol_header_in_70012::complete() NOEXCEPT +{ + if (!subscribed) { + subscribed = true; + SEND(send_headers{}, handle_send, _1); // ← BIP130 + } + protocol_header_in_31800::complete(); +} +``` + +> **Invariant (HeaderIn-70012-1).** Order matters: +> `subscribed = true` is set *before* `protocol_header_in_31800::complete()` +> runs, so the base's `complete()` finds `subscribed == true` and +> *skips* the `inv` subscription. 
At 70012 the announcements arrive as +> `headers` messages directly — no `inv` round-trip — so the inv path +> is unnecessary. + +The peer, upon receiving `sendheaders`, will announce new blocks by +sending `headers` messages instead of `inv`. The base +`handle_receive_headers` handles them on the same code path as initial +sync — they flow into `organize` exactly the same way. + +--- + +## 4. `protocol_header_out_31800` — serve `get_headers` + +```cpp +// :40-50 +void start() NOEXCEPT { + SUBSCRIBE_CHANNEL(get_headers, handle_receive_get_headers, _1, _2); + protocol_peer::start(); +} + +// :54-67 +bool handle_receive_get_headers(ec, message) NOEXCEPT { + SEND(create_headers(*message), handle_send, _1); + return true; +} + +// :72-84 +network::messages::peer::headers create_headers(get_headers& locator) { + if (!is_current(true)) + return {}; + return { archive().get_headers(locator.start_hashes, locator.stop_hash, + max_get_headers) }; +} +``` + +> **Invariant (HeaderOut-Serve-1).** Header serving requires +> `is_current(true)` — empty reply otherwise. A node that isn't +> current sends an empty `headers` reply, which the requester +> interprets as "peer exhausted" (HeaderIn-Sync-2). This prevents +> propagating stale data. + +> **Invariant (HeaderOut-Serve-2).** The reply is built from +> `archive().get_headers(start_hashes, stop_hash, max_get_headers)` — +> standard libbitcoin-database query. Result is at most +> `max_get_headers` headers consecutively above the highest match. + +No bus subscription; this protocol is pure request/response. + +--- + +## 5. `protocol_header_out_70012` — announce via `headers` + +This is the only header protocol that **subscribes to the bus**. It +adds: + +1. A handler for the peer's `sendheaders` message: lazily activates the + bus subscription. +2. A handler for `chase::block`: posts `do_announce(link)`, which sends + a 1-header `headers` message to announce the newly-confirmed block. 
+ +### 5.1 Lifecycle + +```cpp +// :39-49 +void start() NOEXCEPT { + SUBSCRIBE_CHANNEL(send_headers, handle_receive_send_headers, _1, _2); + protocol_header_out_31800::start(); // (sets up get_headers serving) +} + +// :51-57 +void stopping(ec) NOEXCEPT { + unsubscribe_events(); + protocol_header_out_31800::stopping(ec); +} +``` + +The bus subscription is *not* set up in `start()` — it's deferred until +the peer sends `sendheaders` (the BIP130 opt-in). + +### 5.2 Activation via `sendheaders` + +```cpp +// :124-135 +bool handle_receive_send_headers(ec, message) NOEXCEPT { + subscribe_events(BIND(handle_event, _1, _2, _3)); + return false; // ← one-shot: stop receiving send_headers +} +``` + +`return false` from a `SUBSCRIBE_CHANNEL` handler unsubscribes from +that message. `sendheaders` is a one-time signal from the peer. + +> **Invariant (HeaderOut-70012-1).** Bus subscription for header +> announcements is *lazy* and *peer-driven*. A peer that doesn't send +> `sendheaders` never triggers it; the channel stays purely +> request/response (like 31800). + +### 5.3 The announce loop + +```cpp +// :62-85, handle_event +case chase::block: + POST(do_announce, std::get<header_t>(value)); + break; + +// :90-119, do_announce +bool do_announce(header_t link) NOEXCEPT { + const auto hash = query.get_header_key(link); + if (was_announced(hash)) // ← anti-echo + return true; + const auto ptr = query.get_header(link); + if (!ptr) return true; // (logged but suppressed) + SEND(headers{ { ptr } }, handle_send, _1); + return true; +} +``` + +> **Invariant (HeaderOut-70012-2).** A block is *not* announced back +> to the peer that announced it. Enforced by `was_announced(hash)` +> check at `:100`. Prevents announcement loops. + +> **Invariant (HeaderOut-70012-3).** `chase::block` is emitted by +> `chaser_confirm::announce` (`chaser_confirm.cpp:421-428`) only when +> the confirmed chain is current. So this protocol only announces +> "live" blocks, not catch-up state. 
See +> [`05-chaser-confirm.md §7`](05-chaser-confirm.md#7-side-effect-summary-table). + +--- + +## 6. End-to-end header-flow scenarios + +### 6.1 Initial sync on outbound connect (70012 peer) + +```mermaid +sequenceDiagram + autonumber + participant US as our node (header_in_70012) + participant PEER as peer + participant ORG as chaser_header + participant CHK as chaser_check + + US->>PEER: get_headers (locator: candidate top hashes) + PEER-->>US: headers[2000] + US->>ORG: organize(h_1) ... organize(h_2000) + ORG->>US: notify(chase::headers, branch_point) → CHK + ORG->>US: notify(chase::bump, branch_point) → VAL etc. + US->>PEER: get_headers (locator: last hash) + PEER-->>US: headers[N < 2000] + US->>ORG: organize(h_2001) ... organize(h_2000+N) + Note over US: complete() — peer exhausted + US->>PEER: sendheaders (BIP130 opt-in) + Note over US: bus subscription not active here + Note over US: subscribed = true; suppresses inv-listening path +``` + +### 6.2 Future block announcement (70012 ← peer) + +```mermaid +sequenceDiagram + autonumber + participant PEER as peer + participant US as our node (header_in_70012) + participant ORG as chaser_header + PEER->>US: headers[1] (new block header) + US->>US: set_announced(hash) (anti-echo bookkeeping) + US->>ORG: organize(h) + ORG-->>US: handle_organize(ec, height) (off-strand) +``` + +### 6.3 Future block announcement (us → peer, 70012) + +```mermaid +sequenceDiagram + autonumber + participant CNF as chaser_confirm + participant Bus as event bus + participant POUT as protocol_header_out_70012 + participant PEER as peer + + Note over POUT: peer earlier sent send_headers\nso bus subscription is active + CNF->>Bus: notify(chase::block, link) + Bus-->>POUT: handle_event(chase::block) + POUT->>POUT: do_announce(link) + alt was_announced(hash) + Note over POUT: suppress + else + POUT->>PEER: headers[1] + end +``` + +### 6.4 Block announcement at 31800 (no BIP130) + +```mermaid +sequenceDiagram + autonumber + participant PEER as 
peer + participant US as our node (header_in_31800) + Note over US: complete() ran, subscribed=true,\ninv subscription active + PEER->>US: inv (containing block hashes) + US->>US: handle_receive_inventory:\nfor each block hash, check is_block(hash) + alt any unknown + US->>PEER: get_headers (locator: candidate top) + PEER-->>US: headers[...] + Note over US: same path as initial sync + end +``` + +--- + +## 7. Bus integration summary + +Only **`protocol_header_out_70012`** subscribes to the bus. Its single +consumed event: + +| Event | Source | Reaction | +| ------------- | ------------------ | --------------------------------------- | +| `chase::block`| `chaser_confirm` | `do_announce(link)`: emit `headers[1]` | + +The header *in* protocols never emit bus events directly — they call +`session_->organize(header, ...)` which goes through +`chaser_organize::do_organize`. The chaser, in turn, emits +`chase::headers`, `chase::bump`, `chase::regressed`, `chase::disorganized` +(see [`02-chaser-organize.md §3`](02-chaser-organize.md#3-do_organize-the-forward-state-machine)). + +> **Invariant (HeaderBus-1).** No header-in protocol emits any +> `chase::*` event. They only feed the `organize` pipeline through +> the session-level RPC. + +--- + +## 8. Anti-echo mechanism (`set_announced` / `was_announced`) + +Lives on `node::channel_peer` (referenced from `protocol_peer.cpp:78-86`). +Each channel maintains a small per-channel set of recently-announced +hashes. + +- `set_announced(hash)`: called by header-in 70012 (and tx-in 106, etc.) + when receiving an announcement. +- `was_announced(hash)`: called by header-out 70012 (and the + corresponding block-out paths) before sending an announcement. + +This is **not** a global state; it's per-channel — exactly what's +needed to break the local A↔B echo. Cross-peer dedup happens at a +higher layer (organize sees duplicates). 
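The per-channel bookkeeping described above can be illustrated with a small bounded set. This is a sketch under stated assumptions, not the `channel_peer` implementation: the `announce_set` class, its capacity parameter and the oldest-first eviction policy are hypothetical, and `hash_digest` here is a stand-in alias rather than the libbitcoin-system type.

```cpp
#include <array>
#include <cstddef>
#include <deque>
#include <set>

// Stand-in for a 32-byte block hash (assumption, not the library type).
using hash_digest = std::array<unsigned char, 32>;

// Hypothetical per-channel record of hashes the peer announced to us.
// Eviction is oldest-first (FIFO), chosen here only to keep the set bounded.
class announce_set
{
public:
    explicit announce_set(std::size_t capacity) : capacity_(capacity) {}

    // Record a hash announced by the peer; evict the oldest when full.
    void set_announced(const hash_digest& hash)
    {
        if (capacity_ == 0 || !known_.insert(hash).second)
            return; // disabled, or already recorded

        order_.push_back(hash);
        if (order_.size() > capacity_)
        {
            known_.erase(order_.front());
            order_.pop_front();
        }
    }

    // True if the peer told us about this hash and it is still retained.
    bool was_announced(const hash_digest& hash) const
    {
        return known_.count(hash) != 0;
    }

private:
    std::size_t capacity_;
    std::set<hash_digest> known_;
    std::deque<hash_digest> order_;
};
```

An outbound announcer would consult `was_announced(hash)` before sending and skip the send when it returns true. Because the set is bounded, a very old announcement can be forgotten and echoed once; that costs bandwidth but does not affect correctness.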
+ +> **Invariant (Anti-Echo-1).** The pair `(set_announced, was_announced)` +> guarantees a peer that announced X to us is never sent X back, *on +> that channel*. It does not guarantee global no-duplication; multiple +> peers may announce the same block and our node will announce it to +> all peers except those that announced it. + +--- + +## 9. Error / outcome inventory + +| Event | Site | Effect | +| -------------------------------------- | --------------------------------------- | --------------------------------------------------------------- | +| organize returns `service_stopped` | `protocol_header_in_31800.cpp:104-106` | silently ignored | +| organize returns `duplicate_header` | same | silently ignored | +| organize returns any other code | `:108-123` | `stop(ec)` — channel dropped | +| `get_header(link)` returns null in out_70012 | `protocol_header_out_70012.cpp:108-113` | log warning; do not announce; do not stop (defensive) | + +None of these are node-faults — header protocol failures only affect +that one channel. (Contrast with `protocol_block_in_31800` which can +emit `protocol1` fault.) + +--- + +## 10. Configuration interactions + +From [`06-sessions-and-protocols.md §2.3`](06-sessions-and-protocols.md#23-attach_protocolschannel----line-session_peeripp57-161), +the attach tree determines which header protocol runs per channel: + +- `headers_first = false` ⇒ no header-in protocol attached. The node + relies on `protocol_block_in_106` for legacy blocks-first sync. +- `headers_first = true, BIP130 negotiated` ⇒ `protocol_header_in_70012` + + `protocol_header_out_70012`. +- `headers_first = true, headers_protocol negotiated but no BIP130` ⇒ + `protocol_header_in_31800` + `protocol_header_out_31800`. +- `headers_first = true, neither` ⇒ no header protocols; legacy + blocks-first path. + +> **Invariant (HeaderAttach-1).** A channel never runs both 31800 and +> 70012 simultaneously for the same direction. The attach tree is a +> strict `if/else if/else`. 
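The four attach cases in §10 reduce to a total function from configuration and negotiated version to one protocol pair. A minimal sketch follows, assuming the negotiated peer version is compared against the numeric levels 31800 and 70012 that name the protocols; `header_protocols` and `select_header_protocols` are hypothetical names, and the real attach tree in `session_peer.ipp` may test other conditions as well.

```cpp
#include <cstdint>

// Hypothetical result of the header part of the attach tree.
enum class header_protocols { none, v31800, v70012 };

// Assumed thresholds, taken from the protocol names used in this document.
constexpr std::uint32_t headers_version = 31800;
constexpr std::uint32_t bip130_version = 70012;

// Strict if / else-if / else: exactly one outcome per channel, so the
// 31800 and 70012 pairs can never be attached together.
constexpr header_protocols select_header_protocols(bool headers_first,
    std::uint32_t negotiated)
{
    if (!headers_first)
        return header_protocols::none;      // legacy blocks-first sync

    if (negotiated >= bip130_version)
        return header_protocols::v70012;    // in_70012 + out_70012

    if (negotiated >= headers_version)
        return header_protocols::v31800;    // in_31800 + out_31800

    return header_protocols::none;          // peer too old for headers
}
```

Returning a single enum value makes HeaderAttach-1 hold by construction: there is no code path that yields two versions for the same channel.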
+ +--- + +## 11. Spec view + +### 11.1 As processes + +Both directions are stateless transducers over the channel + store +(plus the per-channel `subscribed` latch and the announce set). + +``` +header_in : Process + state: subscribed ∈ {false, true} + inputs: peer.headers, peer.inv (31800 only), peer.sendheaders (70012) + outputs: peer.get_headers, peer.sendheaders (70012 only), + organize(header) RPC (one per received header) + +header_out : Process + state: events_subscribed ∈ {false, true} (70012 only) + inputs: peer.get_headers, peer.send_headers (70012 only), bus chase::block (70012 only) + outputs: peer.headers +``` + +### 11.2 Safety properties + +1. **No echo within a channel** (Anti-Echo-1). +2. **Sync convergence**: assuming a peer's header chain extends ours, + each `get_headers` reply strictly extends the candidate chain (or + is empty). Convergence after ⌈Δ/2000⌉ round-trips where Δ is the + gap. +3. **Announcement liveness gate** (HeaderIn-Subscribe-1): no + announcement subscription until `is_current(true)`. This is a + liveness *delay*, not a safety constraint. + +### 11.3 Liveness + +- Each channel makes progress as long as the peer responds to + `get_headers`. +- After `complete()`, the channel quiesces until the peer sends a + `headers` (70012) or `inv` (31800) message. + +--- + +## 12. Notes for the Lisp port + +- Header-in/out at 31800 are pure transducers; ideal for a functional + implementation. +- The `complete()` state transition is a one-shot latch — a single + Boolean. +- BIP130 is a strict extension over 31800; mirror that with a class + hierarchy or trait composition. +- `set_announced`/`was_announced` is a per-channel finite set; an + LRU cache is sufficient. + +--- + +## 13. Notes for the formal model + +- These protocols are stateful only in `subscribed` (one bit per + channel) and the per-channel announce set. The rest is computed + from message + store state. 
+- A full model can use a single labelled transition system per channel + for each direction; the announce set is a finite oracle that returns + `was_announced(hash)`. +- The end-to-end "candidate chain converges to peer's tip" property + requires reasoning across multiple channels — out of scope of a + per-protocol spec but provable given `chaser_organize`'s safety + properties. + +--- + +## Cross-references + +- [`02-chaser-organize.md`](02-chaser-organize.md) — the consumer of + headers organized by these protocols +- [`05-chaser-confirm.md`](05-chaser-confirm.md) §7 — emits + `chase::block`, the only bus event consumed here +- [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md) §2 — + attach tree; §3 — `node::protocol_peer` base +- [`08-block-out-protocols.md`](08-block-out-protocols.md) (block serving) and [`09-filter-out-70015.md`](09-filter-out-70015.md) (filter serving) +- [`10-tx-protocols.md`](10-tx-protocols.md) (transaction in/out) diff --git a/docs/architecture/08-block-out-protocols.md b/docs/architecture/08-block-out-protocols.md new file mode 100644 index 00000000..fb69675e --- /dev/null +++ b/docs/architecture/08-block-out-protocols.md @@ -0,0 +1,531 @@ +# 08 — Block-out protocols (106 / 70012) + +> Companion to [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md) +> and [`07-header-protocols.md`](07-header-protocols.md). +> +> Block-out is the **block-serving** side: respond to peer requests +> (`get_blocks`, `get_data`) with inventory / block bodies, and announce +> newly-confirmed blocks via `inv` so peers know to come ask. Two +> versions: +> +> - **106** (`protocol_block_out_106`) — full implementation: serves +> `get_blocks`/`get_data`, announces new blocks via `inv`. +> - **70012** (`protocol_block_out_70012`) — minimal extension that +> *suppresses* inv-based announcement when the peer has opted into +> `sendheaders` (BIP130). 
The header-out path takes over announcement; +> this protocol then only serves data on request. +> +> A common confusion: 70012 is **not** BIP152 (compact blocks). 
BIP152 +> is *not* implemented in this repo (see +> `session_peer.ipp:92-97` comment). The 70012 number reflects the +> handshake level at which BIP130 attaches, nothing more. + +| File | Lines | Role | +| --------------------------------------------- | ----- | --------------------------------------------------------------- | +| `src/protocols/protocol_block_out_106.cpp` | 262 | Full block-serving + inv announcement | +| `src/protocols/protocol_block_out_70012.cpp` | 75 | Adds `sendheaders` handler → sets `superseded_` flag | + +--- + +## 1. Inheritance and override surface + +```mermaid +classDiagram + class protocol_peer + class protocol_block_out_106 { + +start + +stopping + +handle_event + +do_announce + +superseded() default false + +handle_receive_get_blocks + +handle_receive_get_data + +send_block + -backlog_ : deque~inventory_item~ + -node_witness_, allow_overlapped_ + } + class protocol_block_out_70012 { + +start (override) + +superseded() override + +handle_receive_send_headers + -superseded_ : atomic_bool + } + protocol_peer <|-- protocol_block_out_106 + protocol_block_out_106 <|-- protocol_block_out_70012 +``` + +> **Invariant (BlockOut-Inherit-1).** 70012 is a strict superset of +> 106. All 106 behaviour remains in effect; 70012 only adds the +> "supersede" gate on outbound announcement. + +--- + +## 2. `protocol_block_out_106` — the workhorse + +### 2.1 Subscriptions + +```cpp +// :41-53 +void start() { + subscribe_events(BIND(handle_event, _1, _2, _3)); // bus + SUBSCRIBE_CHANNEL(get_data, handle_receive_get_data, _1, _2); + SUBSCRIBE_CHANNEL(get_blocks, handle_receive_get_blocks, _1, _2); + protocol_peer::start(); +} + +// :55-61 +void stopping(ec) { + unsubscribe_events(); + protocol_peer::stopping(ec); +} +``` + +The protocol does *three* things: +1. Listens for `chase::block` from the bus → emit `inv` to peer. +2. Listens for `get_blocks` from peer → reply with `inv`. +3. Listens for `get_data` from peer → start a streaming send of blocks. 
+ +### 2.2 Outbound announcement (`do_announce`) + +```cpp +// :99-121 +bool do_announce(header_t link) { + const auto hash = archive().get_header_key(link); + if (was_announced(hash)) // anti-echo + return true; + if (hash == null_hash) return true; // store inconsistency; logged only + SEND(inventory{ { { type_id::block, hash } } }, handle_send, _1); + return true; +} +``` + +> **Invariant (BlockOut-Announce-1).** A block is announced only if +> (a) it wasn't received from this peer (`was_announced` returns +> false), and (b) `get_header_key(link)` returned a non-null hash. +> Anti-echo discipline matches `protocol_header_out_70012` +> ([`07 §5.3`](07-header-protocols.md#53-the-announce-loop)). + +> **Invariant (BlockOut-Announce-2).** Announcement is via a single +> `inv` containing `{ type_id::block, hash }`. The peer then sends +> `get_data` to retrieve the block body if it wants it. See the +> get-data path in §2.4. + +> **Note for the spec.** `chase::block` is consumed by *three* +> protocols (`chaser_snapshot`, `protocol_block_out_106`, and +> `protocol_header_out_70012` — see +> [`01-event-bus.md §2.6`](01-event-bus.md#26-confirm-chain-and-mining)). +> All three react independently; per-channel anti-echo prevents +> duplicate sends. + +### 2.3 Inbound `get_blocks` (peer asks for hashes) + +```cpp +// :126-138 +bool handle_receive_get_blocks(ec, message) { + SEND(create_inventory(*message), handle_send, _1); + return true; +} + +// :244-256 +inventory create_inventory(get_blocks& locator) { + if (!is_current(true)) return {}; + return inventory::factory + ( + archive().get_blocks(locator.start_hashes, locator.stop_hash, + messages::peer::max_get_blocks), + type_id::block + ); +} +``` + +> **Invariant (BlockOut-Serve-1).** Like header-out, block-out is +> gated on `is_current(true)` — non-current nodes reply empty +> (HeaderOut-Serve-1 mirror). This prevents propagating stale block +> sets. 
+ +### 2.4 Inbound `get_data` (peer asks for block bodies) + +This is the streaming path with a per-channel backlog. The reasoning is +documented inline (`:170-181`): + +> *"Satoshi sends overlapping get_data requests, but assumes that the +> recipient is blocking all traffic until the previous is completed. +> So to prevent frequent drops of satoshi peers, and not let one +> protocol block all others, we must accumulate the requests into a +> backlog. If the backlog exceeds the individual message limit we drop +> the peer."* + +```cpp +// :143-191 +bool handle_receive_get_data(ec, message) { + if (!node_witness_ && message->any_witness()) { + stop(network::error::protocol_violation); // ← A + return false; + } + const auto size = message->count(get_data::selector::blocks); + if (is_zero(size)) return true; // no blocks; ignore + + const auto total = ceilinged_add(backlog_.size(), size); + if (total > network::messages::peer::max_inventory) { + stop(network::error::protocol_violation); // ← B + return false; + } + + const auto idle = backlog_.empty(); + if (!allow_overlapped_ && !idle) { + stop(network::error::protocol_violation); // ← C + return false; + } + + merge_inventory(message->items); // append block-typed items only + + if (idle) + send_block(error::success); // kick off send loop + + return true; +} +``` + +Three peer-drop conditions: + +- **A** — peer requested witness data but node doesn't advertise + witness service. +- **B** — total queued requests exceed `max_inventory` (= upper bound + for a single `inv` message; reused here as a defensive bound). +- **C** — overlapping request when `allow_overlapped == false`. + +> **Invariant (BlockOut-Backlog-1).** `backlog_.size() ≤ +> max_inventory` always. Once `merge_inventory` would push it over, +> the peer is dropped first. The bound is therefore strict, not +> "eventually consistent". + +> **Invariant (BlockOut-Backlog-2).** `merge_inventory` discards +> non-block items (`:236-242`). 
The deque holds only block requests. + +### 2.5 The send loop (`send_block`) + +```cpp +// :196-231 +void send_block(ec) { + if (stopped(ec)) return; + if (backlog_.empty()) return; + + const auto& item = backlog_.front(); + const auto witness = item.is_witness_type(); + if (!node_witness_ && witness) { + stop(network::error::protocol_violation); + return; + } + + const auto link = query.to_header(item.hash); + node::messages::block out{ query.get_wire_block(link, witness), witness }; + if (out.block_data.empty()) { + stop(system::error::not_found); + return; + } + + backlog_.pop_front(); + span(events::block_usecs, start); + SEND(std::move(out), send_block, _1); // ← recursive: callback is send_block +} +``` + +The continuation is `send_block` itself: each completed `SEND` triggers +the next iteration. The loop terminates when `backlog_` is empty (or +on error / stop). + +> **Invariant (BlockOut-Stream-1).** At any time, at most one +> `SEND(block, ...)` is in flight per channel. The next is started +> only from the previous's completion callback. This is the natural +> back-pressure: a slow peer slows the send rate. + +> **Invariant (BlockOut-Stream-2).** A `not_found` from +> `get_wire_block` drops the peer +> (`:218-225`). The comment notes "This block could not have been +> advertised to the peer" — only blocks we (or our peer set) previously +> announced should be requested. So a `not_found` is a protocol +> violation, not a store consistency issue. + +### 2.6 Witness-mode gate + +Two separate witness-gating sites: + +- `handle_receive_get_data` (request-time): drop if peer requests + witness data we don't advertise. +- `send_block` (per-item): drop if a queued item turns out to want + witness data (defensive, since `handle_receive_get_data` already + vetted). + +`node_witness_ = network_settings().witness_node()` — set at +construction (`protocol_block_out_106.hpp:39`). 
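The request-time checks from §2.4, including the first of the two witness gates listed above, can be condensed into one pure decision function. This is a sketch rather than the protocol's code: `admit`, `request`, `saturating_add` and `admit_get_data` are hypothetical names, `saturating_add` stands in for the `ceilinged_add` seen in the excerpt, and the inventory limit is passed in as a parameter instead of being read from the network library.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical outcome of inspecting one get_data message.
enum class admit { ignore, enqueue, enqueue_and_start, drop_peer };

// Hypothetical summary of the incoming message.
struct request
{
    std::size_t blocks;   // count of block-typed items
    bool any_witness;     // true if any item asks for witness data
};

// Overflow-safe addition; clamps at the maximum instead of wrapping.
constexpr std::size_t saturating_add(std::size_t a, std::size_t b)
{
    constexpr auto maximum = std::numeric_limits<std::size_t>::max();
    return (a > maximum - b) ? maximum : a + b;
}

// Mirrors drop conditions A, B and C in the order the excerpt checks them.
constexpr admit admit_get_data(const request& req, std::size_t backlog,
    bool node_witness, bool allow_overlapped, std::size_t max_inventory)
{
    if (!node_witness && req.any_witness)
        return admit::drop_peer;                    // A: witness not served

    if (req.blocks == 0)
        return admit::ignore;                       // nothing block-typed

    if (saturating_add(backlog, req.blocks) > max_inventory)
        return admit::drop_peer;                    // B: backlog would overflow

    const bool idle = (backlog == 0);
    if (!allow_overlapped && !idle)
        return admit::drop_peer;                    // C: overlapping request

    // Only an idle channel needs the send loop kicked off.
    return idle ? admit::enqueue_and_start : admit::enqueue;
}
```

Because the overflow check runs before anything is enqueued, the backlog bound holds at every step instead of being restored afterwards, which is the content of BlockOut-Backlog-1.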
+ +> **Invariant (BlockOut-Witness-1).** A non-witness-serving node will +> drop any peer that requests witness data, at either the inv-batch +> level or the per-item level. There is no path that serves +> non-witness data in response to a witness request. + +--- + +## 3. `protocol_block_out_70012` — the supersede gate + +Only ~30 lines of substantive code. + +### 3.1 Lifecycle + +```cpp +// :39-48 +void start() { + SUBSCRIBE_CHANNEL(send_headers, handle_receive_send_headers, _1, _2); + protocol_block_out_106::start(); +} + +// :53-63 +bool handle_receive_send_headers(ec, message) { + superseded_ = true; + return false; // ← one-shot +} + +// :66-69 +bool superseded() const { return superseded_; } +``` + +### 3.2 Interaction with base's `handle_event` + +The base `handle_event` checks `superseded()` first: + +```cpp +// protocol_block_out_106.cpp:66-71 +if (stopped() || superseded()) + return false; // ← unsubscribe from bus +``` + +When `superseded_` flips true: +1. The very next `chase::block` from the bus triggers `handle_event`. +2. `handle_event` sees `superseded() == true` and returns false. +3. The bus desubscriber removes this protocol from the event list. +4. Subsequent `chase::block` events are not delivered here at all. + +> **Invariant (BlockOut-Supersede-1).** After the peer sends +> `sendheaders`, this protocol no longer emits `inv`-based block +> announcements. Header-based announcement (via +> `protocol_header_out_70012`) takes over. + +> **Invariant (BlockOut-Supersede-2).** Supersede is one-way: once +> true, never reset. The peer cannot "un-opt-in" mid-channel. + +> **Note for the spec.** The combined behaviour after `sendheaders`: +> - `protocol_header_out_70012`: subscribes to `chase::block`, sends +> `headers[1]` announcement. +> - `protocol_block_out_70012` (this class): unsubscribes from +> `chase::block`. Still serves `get_data` (block bodies) and +> `get_blocks` (inventory locator queries). 
+> +> Net effect: announce via headers; serve via 106 paths. + +### 3.3 Why `superseded_` is `std::atomic_bool` + +`superseded_` is written in `handle_receive_send_headers` (channel +strand) and read in `handle_event` (bus subscriber's strand, which +posts back to channel strand for actual processing). In practice both +are the channel strand — see +[`06 §3.1`](06-sessions-and-protocols.md#31-event-subscription-protocol) +on subscription posting back to channel strand. The atomic is +defensive; a non-atomic `bool` would likely be sound, but the cost is +negligible. + +--- + +## 4. End-to-end flows + +### 4.1 Initial sync, peer requests blocks (legacy / 31800 path) + +```mermaid +sequenceDiagram + autonumber + participant PEER as peer + participant POUT as protocol_block_out_106 + participant Q as query + + PEER->>POUT: get_blocks (locator) + POUT->>Q: get_blocks(start_hashes, stop_hash, max) + POUT->>PEER: inv (block hashes) + PEER->>POUT: get_data (block hashes) + Note over POUT: backlog_ = [items]; idle so send_block() + loop until backlog empty + POUT->>Q: get_wire_block(link, witness) + POUT->>PEER: block (full body) + Note over POUT: send completion → send_block() again + end +``` + +### 4.2 New block confirmed, announce to peer (31800 / pre-BIP130) + +```mermaid +sequenceDiagram + autonumber + participant CNF as chaser_confirm + participant Bus as event bus + participant POUT as protocol_block_out_106 + participant PEER as peer + + CNF->>Bus: notify(chase::block, link) + Bus-->>POUT: handle_event + POUT->>POUT: do_announce(link) + alt was_announced(hash) + Note over POUT: suppress + else + POUT->>PEER: inv (one block hash) + end + opt peer wants it + PEER->>POUT: get_data + Note over POUT: enters send_block loop as in §4.1 + end +``` + +### 4.3 BIP130 peer opts into headers (70012 path) + +```mermaid +sequenceDiagram + autonumber + participant PEER as peer (BIP130) + participant POUT as protocol_block_out_70012 + participant HOUT as protocol_header_out_70012 
+ participant Bus as event bus + participant CNF as chaser_confirm + + PEER->>POUT: send_headers + POUT->>POUT: superseded_ = true + PEER->>HOUT: send_headers + HOUT->>HOUT: subscribe_events(handle_event) + + Note over POUT,HOUT: future chase::block routes to HOUT only + CNF->>Bus: notify(chase::block, link) + Bus-->>POUT: handle_event → returns false (supersede unsubscribes) + Bus-->>HOUT: handle_event → POST(do_announce) + HOUT->>PEER: headers[1] + opt peer wants body + PEER->>POUT: get_data + Note over POUT: 106 serving path + end +``` + +--- + +## 5. Bus integration summary + +| Protocol | Subscribes to | Reaction | +| ------------------------------ | -------------------- | ------------------------------------------------------------------------------------------------- | +| `protocol_block_out_106` | `chase::block` | `do_announce(link)` ⇒ send `inv` | +| `protocol_block_out_70012` | `chase::block` (inherited) | Same as 106 until `superseded_ == true`; then `handle_event` returns false → unsubscribed | + +Both protocols also subscribe to the bus during construction (via +`protocol_block_out_106::start`). 70012 inherits that and adds a +*condition* under which it self-unsubscribes. + +--- + +## 6. Error / outcome inventory + +| Code | Site | Trigger | +| ------------------------------------- | ------------------------------------------ | ------------------------------------------------------------- | +| `network::error::protocol_violation` | `:152-155, :163-168, :176-181, :206-210` | witness/inv-limit/overlap/witness-on-send violations | +| `system::error::not_found` | `:218-225` | requested block has no body in store | + +No node-faults from this protocol family. Failures are per-channel. + +--- + +## 7. Configuration knobs + +| Setting (file) | Effect | +| --------------------------------------- | -------------------------------------------------------------------------------------- | +| `network.witness_node` | Whether to serve witness data. 
If false: drop peers requesting witness | +| `node.allow_overlapped` | If false: drop peer that issues overlapping `get_data` while a previous is in flight | + +--- + +## 8. Spec view + +### 8.1 As a process (106) + +``` +protocol_block_out_106 : Process + state: backlog : Deque InventoryItem (bounded by max_inventory) + inputs: + bus chase::block(link) → emit inv to peer (filtered by was_announced) + peer get_blocks(locator) → emit inv (from get_blocks query) + peer get_data(items) → enqueue items; kick send loop if idle + send completion → pop one; SEND next; loop + outputs: + peer inv | block messages +``` + +### 8.2 70012 adds: + +``` +state: superseded : Atomic Bool (one-way latch) +inputs (additional): + peer send_headers → superseded := true +gating: + handle_event(chase::block) returns false if superseded +``` + +### 8.3 Safety properties + +1. **Anti-echo** (BlockOut-Announce-1). +2. **Backlog bounded** (BlockOut-Backlog-1): never exceeds + `max_inventory`. Peer is dropped before overflow. +3. **One in-flight send** (BlockOut-Stream-1): no concurrent block + sends per channel. +4. **Supersede monotone** (BlockOut-Supersede-2): once superseded, + always superseded. + +### 8.4 Liveness + +- The send loop drains the backlog as long as the peer accepts data + and the store yields block bodies. +- Stop is non-blocking; an in-flight send completes through the + channel's stop machinery (libbitcoin-network) and the backlog is + abandoned. + +--- + +## 9. Notes for the Lisp port + +- The async-send loop is naturally tail-recursive: `send_block` calls + `SEND(..., send_block, ...)`. In Lisp this is straightforward; in + C++ it relies on the callback being posted on the strand. +- The peer-drop conditions in `handle_receive_get_data` are + straightforward `cond` cases. +- `superseded_` is a one-bit latch — a single boolean. + +--- + +## 10. 
Notes for the formal model + +- The protocol is strand-confined except for the `superseded_` atomic + (and even that is realistically strand-only). Single-threaded + reasoning suffices per channel. +- The combined "announce via headers OR inv, never both" property + needs joint reasoning across `protocol_block_out_70012` and + `protocol_header_out_70012`: they consume the same `chase::block` + event, and exactly one (or zero) will issue an announcement per + block per channel. This is enforced by the supersede flag in this + protocol turning the header-out protocol into the sole announcer. +- A peer's behaviour (deciding to send `sendheaders` or not, when to + issue `get_data`, etc.) is an oracle in the model. + +--- + +## Cross-references + +- [`05-chaser-confirm.md`](05-chaser-confirm.md) §7 — issues + `chase::block` consumed here +- [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md) §2.3 + — attach tree (which version attaches when) +- [`07-header-protocols.md`](07-header-protocols.md) §5 — the + *header-out_70012* counterpart that takes over announcement when + this one is superseded +- Upcoming: `09-filter-out-70015.md` (BIP157/158 client filters) +- Upcoming: `10-tx-protocols.md` (transaction in/out 106) +- Upcoming: `11-protocol-block-in-106.md` (legacy blocks-first) diff --git a/docs/architecture/09-filter-out-70015.md b/docs/architecture/09-filter-out-70015.md new file mode 100644 index 00000000..057a4d27 --- /dev/null +++ b/docs/architecture/09-filter-out-70015.md @@ -0,0 +1,486 @@ +# 09 — `protocol_filter_out_70015` (BIP157/158 client filters) + +> Companion to [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md). +> +> `protocol_filter_out_70015` is the **light-client compact-block-filter +> serving protocol**, implementing BIP157 (`getcfcheckpt` / +> `getcfheaders` / `getcfilters`) over the Neutrino filter type +> (BIP158). 
It handles three peer requests and replies with data +> derived from the filter records persisted by `chaser_validate` +> (`set_filter_body`) and `chaser_confirm` (`set_filter_head`). +> +> The protocol is **stateless** per channel — pure request/response. +> The only stateful pattern is a *one-shot subscription* trick used to +> serialize filter-body streaming. + +| File | Lines | Role | +| ------------------------------------------------- | ----- | ------------------------------------------------- | +| `src/protocols/protocol_filter_out_70015.cpp` | 265 | All three BIP157 request handlers | + +--- + +## 1. BIP157/158 in one paragraph + +A *client filter* is a compact, deterministic summary of the +addresses/scripts in a block (BIP158: Golomb-Rice encoded set of +scriptPubKeys + spent prevout scripts). Light clients can ask any +full node for these filters, scan them locally to find blocks +relevant to their wallet, then fetch only those blocks. BIP157 is the +P2P protocol for serving them: + +- **Filter checkpoint** (`getcfcheckpt`): every 1000 blocks (the + `client_filter_checkpoint_interval`), the *filter header* (a hash + chain over filter bodies). Used by clients to anchor trust quickly. +- **Filter headers** (`getcfheaders`): a range of filter hashes from + `start_height` to `stop_hash`, plus the filter header at + `start_height - 1`. Up to `max_client_filter_headers` (2000) per + request. +- **Filters** (`getcfilters`): the actual filter bodies for a range, + up to `max_client_filters` (1000) per request. + +Only one filter type is recognised: `client_filter::type_id::neutrino` +(BIP158). Anything else is a protocol violation. + +> **Invariant (Filter-Type-1).** All three request handlers reject +> any `filter_type != neutrino` with `protocol_violation`. The node +> serves Neutrino filters or nothing. + +--- + +## 2. Where the data comes from + +`protocol_filter_out_70015` is purely a *server*; it reads from the +store. 
The data is computed and persisted upstream by: + +| Persisted by | Method | What it stores | +| ------------------------------------- | ---------------------------- | ------------------------------- | +| `chaser_validate::validate` | `query.set_filter_body(link, block)` (`chaser_validate.cpp:296`) | Filter body for each validated block (always, even bypass) | +| `chaser_confirm::confirm_block` | `query.set_filter_head(link)` (`chaser_confirm.cpp:306`) | Filter header (running hash) for each confirmable block | +| `chaser_confirm::organize` (bypass) | `query.set_filter_head(link)` (`chaser_confirm.cpp:236`) | Same, for bypass path | + +Read methods used here: + +| Method | Used by | +| --------------------------------------------------- | ------------------------------------------------ | +| `query.to_header(hash) → header_link` | All three request handlers (resolve stop hash) | +| `query.get_height(out h, link) → bool` | All three (validate stop_hash; get range bound) | +| `query.get_filter_heads(out, stop_h, interval)` | `getcfcheckpt` handler | +| `query.get_filter_hashes(out, prev_hdr, stop_link, count)` | `getcfheaders` handler | +| `query.get_ancestry(out, stop_link, count)` | `getcfilters` handler — header_link list | +| `query.get_filter_body(out, link)` | Streamed per ancestry entry | +| `query.get_header_key(link)` | Per filter body — peer expects `block_hash` | + +> **Invariant (Filter-Data-1).** All filter data is derived from +> already-persisted store records. The protocol performs **no +> computation** other than serialization; it can be modelled as a +> pure read transformer. + +--- + +## 3. 
Subscriptions + +```cpp +// :43-53 +void start() { + SUBSCRIBE_CHANNEL(get_client_filter_checkpoint, handle_receive_get_filter_checkpoint, _1, _2); + SUBSCRIBE_CHANNEL(get_client_filter_headers, handle_receive_get_filter_headers, _1, _2); + SUBSCRIBE_CHANNEL(get_client_filters, handle_receive_get_filters, _1, _2); + protocol_peer::start(); +} +``` + +No bus subscription. No `stopping` override (the base handles +unsubscribe). + +> **Invariant (Filter-Sub-1).** Three independent message +> subscriptions; each handler is independent of the others. No +> internal state is shared between request types beyond the channel +> itself. + +--- + +## 4. `getcfcheckpt` handler + +```cpp +// :58-102 +bool handle_receive_get_filter_checkpoint(ec, message) { + if (filter_type != neutrino) return PROTOCOL_VIOLATION; // A + + size_t stop_height{}; + const auto stop_link = query.to_header(message->stop_hash); + if (!query.get_height(stop_height, stop_link)) + return PROTOCOL_VIOLATION; // B + + client_filter_checkpoint out{}; + if (!query.get_filter_heads(out.filter_headers, stop_height, + client_filter_checkpoint_interval)) + return PROTOCOL_VIOLATION; // C + + out.stop_hash = message->stop_hash; + out.filter_type = neutrino; + span(events::filterchecks_msecs, start); + SEND(out, handle_send, _1); + return true; +} +``` + +Where `PROTOCOL_VIOLATION` means `stop(network::error::protocol_violation); return false;`. + +Three drop conditions: +- **A**: wrong filter type +- **B**: unknown stop hash (no height for it) +- **C**: filter head fetch failed (typically because the branch has + never been confirmed) + +> **Invariant (Filter-Checkpoint-1).** Inconsistency note from the +> comment block (`:86-88`): *"There is no guarantee that this set +> will be consistent across reorgs. However for it to be inconsistent +> there must be a ≥ 1000 block reorg."* The checkpoint interval is +> 1000; a reorg below the lowest sample makes earlier samples +> potentially stale. 
Spec consequence: do not treat +> `client_filter_checkpoint` as canonical across a 1000-block reorg +> window. + +> **Reporting event.** `events::filterchecks_msecs` is fired per +> response (see [`00-overview.md §9`](00-overview.md#9-failure-model) +> on the metrics enum). + +--- + +## 5. `getcfheaders` handler + +```cpp +// :107-166 +bool handle_receive_get_filter_headers(ec, message) { + if (filter_type != neutrino) return PROTOCOL_VIOLATION; + if (!query.get_height(stop_height, stop_link)) return PROTOCOL_VIOLATION; + + const size_t start_height = message->start_height; + if (is_subtract_overflow(stop_height, start_height)) // BIP157 "stop ≥ start" + return PROTOCOL_VIOLATION; + const auto count = subtract(stop_height, start_height); + if (count >= max_client_filter_headers) // BIP157 "< 2000" + return PROTOCOL_VIOLATION; + + client_filter_headers out{}; + if (!query.get_filter_hashes(out.filter_hashes, out.previous_filter_header, + stop_link, count)) + return PROTOCOL_VIOLATION; + + out.stop_hash = message->stop_hash; + out.filter_type = neutrino; + span(events::filterhashes_msecs, start); + SEND(out, handle_send, _1); + return true; +} +``` + +Additional drop conditions vs. checkpoint: + +- `stop_height < start_height` (subtraction underflow) — BIP157 + validation +- `count ≥ max_client_filter_headers` (= 2000) — BIP157 validation + +> **Invariant (Filter-Headers-1).** A consistent branch is assured by +> the comment at `:151-152`: *"The response is assured to represent a +> consistent branch."* Implementation relies on +> `query.get_filter_hashes` returning either a complete branch from +> `stop_link` going back `count` items, or false. No partial returns. + +> **Invariant (Filter-Headers-2).** `previous_filter_header` is the +> filter header at `start_height - 1` (or all-zero if `start_height +> == 0`), and is set by the store query in a single transaction with +> the hashes list. + +--- + +## 6. 
`getcfilters` handler and the streaming pattern + +This is the most interesting one. Filter bodies are large, so the +response is *streamed* — one `cfilter` message per block. + +### 6.1 The handler + +```cpp +// :171-226 +bool handle_receive_get_filters(ec, message) { + if (filter_type != neutrino) return PROTOCOL_VIOLATION; + if (!query.get_height(stop_height, stop_link)) return PROTOCOL_VIOLATION; + if (is_subtract_overflow(stop_height, start_height)) return PROTOCOL_VIOLATION; + const auto count = subtract(stop_height, start_height); + if (count >= max_client_filters) return PROTOCOL_VIOLATION; // < 1000 + + const auto ancestry = std::make_shared(); + if (!query.get_ancestry(*ancestry, stop_link, count)) + return PROTOCOL_VIOLATION; + + span(events::ancestry_msecs, start); + send_filter(error::success, ancestry); + return false; // ← ONE-SHOT +} +``` + +**The `return false`** is critical — it unsubscribes the channel from +further `get_client_filters` messages. The protocol then re-subscribes +once the streaming completes (§6.2). + +### 6.2 The streaming continuation + +```cpp +// :228-258 +void send_filter(ec, ancestry_ptr ancestry) { + if (ancestry->empty()) { + SUBSCRIBE_CHANNEL(get_client_filters, handle_receive_get_filters, _1, _2); + return; // ← RESUBSCRIBE + } + + const auto link = system::pop(*ancestry); // remove one from end + client_filter out{}; + if (!query.get_filter_body(out.filter, link)) { + stop(network::error::protocol_violation); + return; + } + + out.block_hash = query.get_header_key(link); + out.filter_type = neutrino; + span(events::filter_msecs, start); + SEND(out, send_filter, _1, ancestry); // ← recursive callback +} +``` + +This is a self-recursive async loop, identical in structure to +`protocol_block_out_106::send_block` +([`08 §2.5`](08-block-out-protocols.md#25-the-send-loop-send_block)). +The completion handler of each `SEND` is `send_filter` itself, with +the same `ancestry_ptr` shared across iterations. 
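
The one-shot gate and the back-to-front drain described above can be modelled synchronously. This is a minimal sketch, not the node's code: `filter_channel_model`, `on_get_filters`, and `on_send_complete` are hypothetical names, plain heights stand in for header links, and the asynchronous `SEND` completion is represented by an explicit call.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model of one channel's getcfilters gate (not libbitcoin API).
class filter_channel_model
{
public:
    // Models handle_receive_get_filters. Returns false when no handler is
    // subscribed, i.e. a request arrived while a stream is still draining.
    // Assumption: `ancestry` is ordered highest height first, so taking
    // from the back yields the lowest height first.
    bool on_get_filters(std::vector<std::size_t> ancestry)
    {
        if (!subscribed_)
            return false;

        subscribed_ = false;              // the handler's `return false`
        ancestry_ = std::move(ancestry);
        send_next();
        return true;
    }

    // Models the SEND completion handler re-entering send_filter.
    void on_send_complete() { send_next(); }

    bool subscribed() const { return subscribed_; }
    const std::vector<std::size_t>& wire() const { return wire_; }

private:
    void send_next()
    {
        if (ancestry_.empty())
        {
            subscribed_ = true;           // resubscribe once drained
            return;
        }

        wire_.push_back(ancestry_.back()); // one cfilter per call
        ancestry_.pop_back();
    }

    bool subscribed_{ true };
    std::vector<std::size_t> ancestry_;
    std::vector<std::size_t> wire_;
};
```

In this model a request for N filters needs N completions before the gate reopens: the first send happens inside `on_get_filters`, and the final completion finds the list empty and resubscribes.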
+ +> **Invariant (Filter-Stream-1).** Exactly one +> `get_client_filters` request is in progress per channel at any +> time. The handler unsubscribes (`return false`) before initiating +> streaming, and the streaming continuation re-subscribes only after +> the ancestry list is fully drained. + +> **Invariant (Filter-Stream-2).** The ancestry list is consumed +> back-to-front (`system::pop`), so the wire order is +> `start_height + 0`, `start_height + 1`, …, `stop_height`. This +> matches BIP157 expected ordering (low height first). + +> **Invariant (Filter-Stream-3).** A failure to fetch any filter body +> mid-stream drops the peer (`protocol_violation`). The ancestry is +> abandoned (its `shared_ptr` count drops when `send_filter` returns). +> The channel is not re-subscribed. + +### 6.3 Why one-shot resubscription? + +The pattern exists because of two constraints: + +1. Each `get_client_filters` request triggers up to 1000 outgoing + `cfilter` messages. Allowing concurrent requests would interleave + their streams. +2. The libbitcoin-network channel send model is single-stream-per + channel (back-pressure via callbacks); concurrent issuance would + create unbounded queueing. + +Unsubscribing for the duration of the stream **serializes** requests +without explicit locks: a peer's second `get_client_filters` arrives +when there is no handler for it, which the libbitcoin-network channel +treats as a *protocol violation* and drops the peer. + +> **Invariant (Filter-Stream-4).** A peer that issues a second +> `get_client_filters` before the first completes will be dropped by +> the channel layer (not by this protocol). This effectively makes +> `getcfilters` request/response exclusive per channel. + +--- + +## 7. 
State machine view + +```mermaid +stateDiagram-v2 + [*] --> SUBSCRIBED: start()\nsubscribe to: getcfcheckpt, getcfheaders, getcfilters + + SUBSCRIBED --> SUBSCRIBED: getcfcheckpt → send cfcheckpt + SUBSCRIBED --> SUBSCRIBED: getcfheaders → send cfheaders + SUBSCRIBED --> STREAMING: getcfilters → unsubscribe from getcfilters\nstart send_filter loop + + STREAMING --> STREAMING: SEND cfilter; on completion, send_filter + STREAMING --> SUBSCRIBED: ancestry empty → resubscribe to getcfilters + + SUBSCRIBED --> DROPPED: protocol_violation\n(any handler) + STREAMING --> DROPPED: filter body fetch fails + + SUBSCRIBED --> [*]: stop + STREAMING --> [*]: stop + DROPPED --> [*] +``` + +The state space is just `{SUBSCRIBED, STREAMING}` per channel — and +`STREAMING` only differs from `SUBSCRIBED` in that the +`getcfilters` subscription is inactive. + +--- + +## 8. Bus integration + +**None.** This protocol does not subscribe to or emit any `chase` +events. Its sole inputs are peer messages; its sole outputs are peer +messages and metrics events (`events::filter_msecs`, +`events::filterhashes_msecs`, `events::filterchecks_msecs`, +`events::ancestry_msecs`). + +--- + +## 9. Error / outcome inventory + +All errors result in dropping the peer; none are node-faults. 
+ +| Site | Code | Reason | +| ------------------------------------------------------------- | ------------------------------------- | --------------------------------------- | +| `:67-71` (cfcheckpt), `:116-120` (cfheaders), `:180-184` (cfilters) | `protocol_violation` | `filter_type != neutrino` | +| `:80-84`, `:128-133`, `:191-197` | `protocol_violation` | `stop_hash` not in store | +| `:136-141`, `:200-205` | `protocol_violation` | `stop_height < start_height` | +| `:144-149` | `protocol_violation` | cfheaders count ≥ 2000 | +| `:207-213` | `protocol_violation` | cfilters count ≥ 1000 | +| `:90-95` | `protocol_violation` | `get_filter_heads` returned false | +| `:154-159` | `protocol_violation` | `get_filter_hashes` returned false | +| `:217-221` | `protocol_violation` | `get_ancestry` returned false | +| `:249-252` | `protocol_violation` | `get_filter_body` returned false mid-stream | + +> **Note for the spec.** Every `protocol_violation` here represents +> either (a) a misbehaving peer or (b) a store inconsistency. The +> protocol does not distinguish; both produce a channel drop. + +--- + +## 10. Configuration + +Attachment-gate is `node_client_filters` (a service bit on +`network.services_maximum`, see +[`06 §2.4`](06-sessions-and-protocols.md#24-predicate-definitions)): + +``` +if (node_client_filters && blocks_out && peer.is_negotiated(bip157)) + channel->attach(self)->start(); +``` + +(`session_peer.ipp:102-104`) + +Plus the store-level toggle `filter_enabled()` consumed by +`chaser_validate` (`filter_ = !defer_ && archive.filter_enabled()`, +`chaser_validate.cpp:48`). If filters aren't being computed +upstream, this serving protocol will get empty results and drop +peers — so deployment must enable filters everywhere or nowhere. 
+ +> **Invariant (Filter-Deploy-1).** For `protocol_filter_out_70015` to +> serve usefully: +> - `archive.filter_enabled() == true` AND +> - `chaser_validate.filter_` is consequently `true` AND +> - peers actually want filters (advertise BIP157). +> +> Otherwise either filters aren't computed (store returns false) or +> no peers attach the protocol. + +--- + +## 11. Spec view + +### 11.1 As a process + +``` +protocol_filter_out_70015 : Process + state: streaming : Bool (false = subscribed to all 3; true = unsubscribed from getcfilters) + inputs: + peer get_client_filter_checkpoint + peer get_client_filter_headers + peer get_client_filters + send_filter continuation + outputs: + peer client_filter_checkpoint + peer client_filter_headers + peer client_filter (zero or more, streamed) + drop_channel(protocol_violation) + store reads: get_filter_heads, get_filter_hashes, get_ancestry, + get_filter_body, get_header_key, to_header, get_height +``` + +### 11.2 Safety properties + +1. **No data without persistence** (Filter-Data-1). +2. **Type uniformity** (Filter-Type-1): only Neutrino served. +3. **Single in-flight stream** (Filter-Stream-1). +4. **No interleaving across request types**: cfcheckpt and cfheaders + responses are single SENDs; cfilters streams atomically because + of the subscription gate. +5. **No state shared across channels**: no class statics, no globals + beyond the store. + +### 11.3 Liveness + +- Each request handler completes in O(1) SEND for cfcheckpt/cfheaders + and O(count) SENDs for cfilters. +- A slow peer back-pressures the stream via the send callback chain. +- The `STREAMING` state always returns to `SUBSCRIBED` on completion + or to terminal on error. 
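
For the formal-verification audience, the per-channel behaviour summarised in this section can be written as a total transition function. The sketch below is an assumption-laden encoding of the state diagram in §7: the `state` and `event` enumerators and the function `next` are hypothetical names, and the "second request while streaming" edge follows Filter-Stream-4 as stated in this document rather than anything verified against libbitcoin-network.

```cpp
// Hypothetical encoding of the per-channel state diagram (not libbitcoin API).
enum class state { subscribed, streaming, dropped };

enum class event
{
    single_response,   // getcfcheckpt or getcfheaders answered with one SEND
    filters_request,   // getcfilters arrives
    stream_drained,    // ancestry list is empty
    body_fetch_failed, // get_filter_body returned false mid-stream
    violation          // any handler guard failed
};

// Total transition function: every (state, event) pair has a successor.
constexpr state next(state current, event input)
{
    if (current == state::dropped)
        return state::dropped;            // terminal

    switch (input)
    {
        case event::single_response:
            return current;
        case event::filters_request:
            // Accepted when subscribed; per Filter-Stream-4 a second
            // request during streaming ends in a drop.
            return current == state::subscribed ?
                state::streaming : state::dropped;
        case event::stream_drained:
            return current == state::streaming ?
                state::subscribed : current;
        case event::body_fetch_failed:
            return current == state::streaming ?
                state::dropped : current;
        case event::violation:
            return state::dropped;
    }

    return current;
}
```

Keeping the function total makes it straightforward to enumerate all fifteen pairs in a model checker or a table-driven test.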
+ +### 11.4 Stateless modelling + +Because the only state variable is `streaming` (one bit), a spec can +collapse this protocol to a pure function modulo the streaming gate: + +``` +F : Request × Store → Response | Violation +F(getcfcheckpt(neutrino, stop_hash), store) = + if known(stop_hash) ∧ ∃filter_heads(stop_height, interval=1000) + then cfcheckpt(filter_heads) + else Violation + +F(getcfheaders(neutrino, start, stop_hash), store) = + if known(stop_hash) ∧ stop_h ≥ start ∧ (stop_h − start) < 2000 + ∧ ∃filter_hashes + then cfheaders(prev_hdr, hashes) + else Violation + +F(getcfilters(neutrino, start, stop_hash), store) = + if known(stop_hash) ∧ stop_h ≥ start ∧ (stop_h − start) < 1000 + ∧ ∃ancestry + then stream(cfilter[0], cfilter[1], …, cfilter[count−1]) + else Violation +``` + +--- + +## 12. Notes for the Lisp port + +- Three pure functions over the store + a tail-recursive streamer for + cfilters. +- The one-shot subscription trick can be replaced by an explicit + per-channel `:streaming?` flag or by serialised request dispatch. +- All BIP157 numeric bounds (1000, 2000, checkpoint interval 1000) + should be named constants matching libbitcoin-network exposed names. + +--- + +## 13. Notes for the formal model + +- Stateless except for the streaming subscription gate. +- Drop-or-respond is a strict guard; modelling as `Maybe Response` is + natural. +- No cross-channel state; the protocol is *fully parallelisable* + across channels. +- The only correctness obligation tied to upstream is that + `set_filter_body` and `set_filter_head` are called for every block + that this protocol expects to serve — a liveness constraint owned + by `chaser_validate` and `chaser_confirm`. 
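
The drop-or-respond guard shared by the `getcfheaders` and `getcfilters` handlers can be sketched as a `Maybe`-style function. This is an illustration under stated assumptions: `range_count` and the two `*_sketch` constants are names local to the sketch (the limits are the 2000 and 1000 quoted in this document), and it needs C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <optional>

// Limits as quoted in this document; the names are local to this sketch.
constexpr std::size_t max_headers_sketch = 2000;
constexpr std::size_t max_filters_sketch = 1000;

// Returns stop - start when both range guards pass, std::nullopt otherwise.
// A nullopt corresponds to the PROTOCOL_VIOLATION branches in the handlers.
inline std::optional<std::size_t> range_count(std::size_t start_height,
    std::size_t stop_height, std::size_t limit)
{
    // Guard 1: stop must not be below start (subtraction would underflow).
    if (stop_height < start_height)
        return std::nullopt;

    // Guard 2: the difference must be strictly below the limit.
    const auto count = stop_height - start_height;
    if (count >= limit)
        return std::nullopt;

    return count;
}
```

A caller would map `std::nullopt` to a channel drop and pass the contained count on to the store query, which keeps the guard itself free of side effects.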
+ +--- + +## Cross-references + +- [`04-chaser-validate.md`](04-chaser-validate.md) §4.2 (writes + `set_filter_body`; the only producer of filter bodies) +- [`05-chaser-confirm.md`](05-chaser-confirm.md) §6 (writes + `set_filter_head`) +- [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md) §2.3 + — `node_client_filters` attach predicate +- libbitcoin-system docs (external): BIP158 filter computation +- BIP157, BIP158 (external) diff --git a/docs/architecture/10-tx-protocols.md b/docs/architecture/10-tx-protocols.md new file mode 100644 index 00000000..d55b2f19 --- /dev/null +++ b/docs/architecture/10-tx-protocols.md @@ -0,0 +1,450 @@ +# 10 — Transaction protocols (in/out 106) + +> Companion to [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md), +> [`08-block-out-protocols.md`](08-block-out-protocols.md), and +> [`09-filter-out-70015.md`](09-filter-out-70015.md). +> +> The transaction protocols implement BIP35-era loose-tx relay. There +> are two versioned classes: +> +> - **`protocol_transaction_in_106`** — receive `inv(tx)` from peer, +> request and ingest. **Currently stubbed**: the handler exists but +> takes no action. Tx ingestion via this protocol is not implemented +> in this repo. +> - **`protocol_transaction_out_106`** — emit `inv(tx)` on bus +> `chase::transaction`; serve `get_data(tx)` requests. +> +> Both are pre-BIP339; the inline TODOs note that wtxidrelay handling +> is future work. There is no `transaction_*_70012` family — BIP130 +> does not apply to transactions, only headers. + +| File | Lines | Role | +| ------------------------------------------------------- | ----- | ----------------------------------------------------------------- | +| `src/protocols/protocol_transaction_in_106.cpp` | 76 | Stub: subscribes to `inv`, ignores | +| `src/protocols/protocol_transaction_out_106.cpp` | 194 | Outbound announce + serve | + +--- + +## 1. 
Why these are small + +A full Bitcoin node has a *mempool* — an in-memory store of unconfirmed +transactions used for mining templates and relay. In this codebase, the +mempool is represented by `chaser_transaction` and `chaser_template`, +both of which are currently **mostly inactive**: + +- `chaser_transaction` issues a single `chase::transaction(...)` event + at startup (`chaser_transaction.cpp:85`, the only live emit; another + emit at `:105` is commented out) and otherwise has no body beyond + the `chase::stop` handler. +- `chaser_template` consumes `chase::transaction` but currently issues + no `chase::template_` events. + +Consequently the transaction relay path is **wired but largely +dormant**. The protocols are present for the case where a future +mempool/template implementation drives them. + +> **Invariant (TxProto-State-1).** As of this repo state, the +> outbound protocol will emit at most one `inv(tx)` announcement at +> startup (driven by the single `chase::transaction` emit from +> `chaser_transaction`). Operational tx relay requires upstream +> changes. + +> **Invariant (TxProto-State-2).** The inbound protocol is a stub; +> `inv(tx)` from peers is observed but no `getdata` is sent. Tx +> ingestion does not currently produce store mutations. + +--- + +## 2. `protocol_transaction_in_106` — stub + +### 2.1 Full implementation + +```cpp +// :38-47 +void start() { + SUBSCRIBE_CHANNEL(inventory, handle_receive_inventory, _1, _2); + protocol_peer::start(); +} + +// :55-69 +bool handle_receive_inventory(ec, message) { + if (stopped(ec)) return false; + // TODO: get and handle tx as required. + ////const auto tx_count = message->count(type_id::transaction); + ////set_announced(hash); + return true; +} +``` + +That's all of it. The commented-out lines (`:66-67`) show the intended +shape: count tx-typed items in the inventory and record each as +"announced from this peer" via `set_announced(hash)` so that the +*out* protocol won't echo it back. 
None of this is currently active. + +### 2.2 TODO breadcrumbs + +Inline comments (`:51-53, :65`): + +> *"TODO: bip339: After a node has received a wtxidrelay message +> from a peer, the node SHOULD use a MSG_WTX getdata message to +> request any announced transactions."* + +> *"TODO: get and handle tx as required."* + +This protocol will need: + +1. A subscription to peer `wtxidrelay` to decide between MSG_TX and + MSG_WTX `getdata`. +2. A handler that calls `get_data` for unknown tx hashes. +3. A `tx` message handler that validates and stores the tx. +4. A `chase::transaction(link)` emission point (the only one currently + in the codebase is `chaser_transaction`, but the natural source + would be this protocol — see §6). + +> **Invariant (TxIn-Stub-1).** No bus events are emitted, no store +> mutations are performed, no further messages are sent in response +> to `inv`. The protocol exists solely to subscribe and (in future) +> implement. + +--- + +## 3. `protocol_transaction_out_106` — announce + serve + +### 3.1 Subscriptions + +```cpp +// :39-50 +void start() { + subscribe_events(BIND(handle_event, _1, _2, _3)); // bus + SUBSCRIBE_CHANNEL(get_data, handle_receive_get_data, _1, _2); + protocol_peer::start(); +} + +// :52-58 +void stopping(ec) { + unsubscribe_events(); + protocol_peer::stopping(ec); +} +``` + +One bus subscription, one channel subscription. Symmetric to +`protocol_block_out_106` ([`08 §2.1`](08-block-out-protocols.md#21-subscriptions)) +but for tx not block. 
+ +### 3.2 Bus event handler — `chase::transaction → do_announce` + +```cpp +// :63-86 +bool handle_event(ec, event_, value) { + if (stopped()) return false; + switch (event_) { + case chase::transaction: { + POST(do_announce, std::get(value)); + break; + } + default: break; + } + return true; +} + +// :93-117 +bool do_announce(transaction_t link) { + const auto hash = query.get_tx_key(link); + if (was_announced(hash)) // anti-echo + return true; + if (hash == null_hash) return true; // store inconsistency; logged only + SEND(inventory{ { { type_id::transaction, hash } } }, handle_send, _1); + return true; +} +``` + +> **Invariant (TxOut-Announce-1).** Same anti-echo discipline as +> `protocol_block_out_106` and `protocol_header_out_70012`: a peer +> that announced a tx to us does not receive an echo. Implementation +> identical except for the `type_id::transaction` selector. + +> **Invariant (TxOut-Announce-2).** `chase::transaction(link)` is +> emitted only by `chaser_transaction` (verified in +> [`01-event-bus.md §2.6`](01-event-bus.md#26-confirm-chain-and-mining)). +> Currently fires at most once (at startup), so the announce path is +> driven by upstream rather than by tx relay traffic. + +### 3.3 Inbound `get_data` — streamed tx send + +The pattern mirrors `protocol_filter_out_70015` (one-shot subscription +during streaming, resubscribe on completion) but threads through the +peer's `get_data.items` list rather than an ancestry list. + +```cpp +// :122-133 +bool handle_receive_get_data(ec, message) { + if (stopped(ec)) return false; + send_transaction(error::success, zero, message); // start at index 0 + return false; // ← one-shot +} +``` + +```cpp +// :142-188 (simplified) +void send_transaction(ec, size_t index, get_data::cptr message) { + if (stopped(ec)) return; + + // Skip over non-tx inventory items + for (; index < message->items.size(); ++index) + if (message->items.at(index).is_transaction_type()) + break; + + // BUGBUG: registration race. 
+ if (index >= message->items.size()) { + SUBSCRIBE_CHANNEL(get_data, handle_receive_get_data, _1, _2); // resubscribe + return; + } + + const auto& item = message->items.at(index); + const auto witness = item.is_witness_type(); + if (!node_witness_ && witness) { + stop(network::error::protocol_violation); + return; + } + + const auto ptr = query.get_transaction(query.to_tx(item.hash), witness); + if (!ptr) { + stop(system::error::not_found); + return; + } + + SEND(transaction{ ptr }, send_transaction, _1, sub1(index), message); +} +``` + +> **Invariant (TxOut-Stream-1).** One outstanding `tx` send per +> channel; the recursive `send_transaction` callback chains them. +> Same shape as block-out and filter-out streaming. + +> **Invariant (TxOut-Stream-2).** Resubscription to `get_data` +> happens only after the entire request's tx items have been served +> (or skipped). Until then, the channel is unsubscribed from +> `get_data` — a second incoming `get_data` while streaming will +> hit the libbitcoin-network "no handler" path. + +> ⚠ **Suspect: the `sub1(index)` continuation.** At `:187`, +> the next iteration is scheduled with `sub1(index)` (= `index - 1`), +> not `add1(index)` (= `index + 1`). The loop top is +> `for (; index < size; ++index)`, so the next call enters with +> `index - 1`, the `for` test passes, the inner `if` either matches +> at `index - 1` or `++index` runs and matches at the same `index` +> we just sent. Either way the same tx may be sent again. This +> reading suggests a possible off-by-one — likely intended +> `add1(index)`. Worth reviewing against intent before mirroring in +> a port. Flagged here rather than asserted as a bug because we have +> not built or run the codebase; there may be a subtlety we are +> missing. 
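
The index walk discussed in the note above can be checked in isolation. The following sketch models only the control flow of `send_transaction` as excerpted here; `item`, `step`, and `sent_indices` are hypothetical names, `sub1` and `add1` are assumed to be plain decrement and increment on `size_t`, and a send cap stands in for the fact that the real loop has no such bound.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an inventory item: only tx vs. non-tx matters.
struct item { bool is_tx; };

// Which continuation is used after each send.
enum class step { sub1, add1 };

// Returns the indices "sent" by the modelled loop, in order. Stops early
// after max_sends sends so a non-terminating walk is still observable.
std::vector<std::size_t> sent_indices(const std::vector<item>& items,
    step next, std::size_t max_sends)
{
    std::vector<std::size_t> sent;
    std::size_t index = 0;

    while (sent.size() < max_sends)
    {
        // Skip over non-tx inventory items, as in the excerpt.
        for (; index < items.size(); ++index)
            if (items[index].is_tx)
                break;

        // Past the end: the real code resubscribes and returns here.
        if (index >= items.size())
            break;

        sent.push_back(index);

        // Unsigned arithmetic: 0 - 1 wraps to the maximum size_t value.
        index = (next == step::add1) ? index + 1 : index - 1;
    }

    return sent;
}
```

Under these assumptions the `sub1` walk has two failure shapes: when the sent item is at index 0 the wrap-around ends the stream after one transaction, and when it is at a higher index the same transaction is selected again on every pass. The `add1` walk visits each transaction item once. Whether the real `sub1` behaves this way should be confirmed against the source before relying on this model.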
+ +### 3.4 Witness gating + +```cpp +// :164-170 +if (!node_witness_ && witness) + stop(network::error::protocol_violation); +``` + +Same logic as `protocol_block_out_106` (`08 §2.6`): a node that +doesn't advertise witness service drops any peer that requests +witness data. + +### 3.5 BIP339 TODOs + +Multiple inline TODOs at `:51-53, :90-91, :137-140` reference +**BIP339 (`wtxidrelay`)**: after exchanging `wtxidrelay`, the +inv/getdata semantics for transactions switch to `MSG_WTX` (using +witness txids). This is not yet implemented; the protocol currently +uses `MSG_TX` universally. + +--- + +## 4. Attachment + +From `session_peer.ipp:156-160`: + +```cpp +if (txs_in_out) { + if (peer->peer_version()->relay) + channel->attach<protocol_transaction_out_106>(self)->start(); +} +``` + +where `txs_in_out = relay && peer.is_negotiated(bip37) && (!delay || is_current(true))`. + +Two important consequences: + +> **Invariant (TxAttach-1).** Only the OUT protocol is attached +> here. `protocol_transaction_in_106` is **never attached** in this +> repo's current attach tree. This explains why its stubbed status is +> not yet a problem — it cannot receive anything because it isn't +> wired in. + +> **Invariant (TxAttach-2).** Tx out attaches only when (a) relay is +> enabled, (b) the peer has negotiated BIP37 (bloom filtering, which +> added the `relay` flag to the version message), (c) the node is +> not delaying or is already current (`!delay || is_current(true)`), +> and (d) the peer's version message claimed `relay = true`. So tx +> announcement traffic is opt-in on both sides. + +--- + +## 5. Bus integration + +| Protocol | Subscribes to | Emits | +| --------------------------------- | -------------------- | --------------- | +| `protocol_transaction_in_106` | none | none | +| `protocol_transaction_out_106` | `chase::transaction` | none | + +The out protocol never *emits* bus events. It is a pure consumer of +`chase::transaction`. Compare with `protocol_block_in_31800` which +both consumes (`chase::download`, etc.) and emits (`chase::checked`, +`chase::unchecked`, `chase::starved`) — block-in is much more +deeply integrated. + +--- + +## 6. 
Where `chase::transaction` comes from + +Currently only `chaser_transaction.cpp:85`: + +```cpp +notify(error::success, chase::transaction, transaction_t{}); +``` + +This is called from the constructor path with a *zero* `transaction_t` +value. The commented-out emit at `:105` would be the operational +emission point — once mempool ingestion exists, each accepted tx +would emit `chase::transaction(link)`. + +> **Invariant (TxEvent-1).** Until `chaser_transaction` is fleshed +> out, the out protocol's bus subscription receives effectively no +> traffic. The protocol is *correctly idle*, not broken — it would +> activate the moment the chaser starts emitting. + +The natural extension path: + +1. Implement `protocol_transaction_in_106::handle_receive_inventory` + to `getdata` for unknown tx hashes. +2. Add a `tx` message handler that runs consensus + policy checks and + calls a store mutation (analogous to `set_code` for blocks). +3. Emit `chase::transaction(link)` from that handler — or from a + chaser that consumes a per-channel "received tx" message. +4. Add `chaser_transaction` body that drives mempool eviction, + miner-template trigger, etc. + +--- + +## 7. State machine view (out_106) + +```mermaid +stateDiagram-v2 + [*] --> SUBSCRIBED: start (subscribe bus + get_data) + SUBSCRIBED --> SUBSCRIBED: chase::transaction → do_announce → SEND inv + SUBSCRIBED --> STREAMING: get_data → unsubscribe from get_data\nsend_transaction(0, msg) + STREAMING --> STREAMING: SEND tx; send_transaction(sub1(i), msg) + STREAMING --> SUBSCRIBED: index ≥ size → resubscribe to get_data + STREAMING --> DROPPED: witness mismatch / not_found / send error + SUBSCRIBED --> [*]: stop / chase::stop + STREAMING --> [*]: stop + DROPPED --> [*] +``` + +State space (per channel): `{SUBSCRIBED, STREAMING}` — same as +filter-out. + +--- + +## 8. 
Error / outcome inventory + +| Site | Code | Trigger | +| --------------------------------------------- | ------------------------------------- | --------------------------------------------- | +| `:167-169` | `protocol_violation` | witness requested but `node_witness_` false | +| `:183` | `system::error::not_found` | tx requested but missing from store | + +No node-faults. No store mutations originate here (the out protocol is +strictly read+send). + +--- + +## 9. Spec view + +### 9.1 As processes + +``` +protocol_transaction_in_106 : Process (stub) + state: ∅ + inputs: peer inv(tx) + outputs: none (ignored) + +protocol_transaction_out_106 : Process + state: streaming : Bool + inputs: + bus chase::transaction(link) → emit inv(tx) [filtered by was_announced] + peer get_data(items) → enter streaming + send_transaction continuation + outputs: + peer inv(tx) | tx messages + drop_channel + store reads: get_tx_key, to_tx, get_transaction +``` + +### 9.2 Safety properties + +1. **Anti-echo** (TxOut-Announce-1). +2. **Single in-flight stream** (TxOut-Stream-1). +3. **Witness consistency**: never serves witness tx if not advertising + witness service. +4. **Stub no-op** (TxIn-Stub-1, TxProto-State-2): the in protocol + produces no observable effects. + +### 9.3 Liveness + +Bounded entirely by upstream. Until `chase::transaction` fires +frequently, the out protocol is idle. + +### 9.4 Open question for the spec + +The `sub1(index)` continuation (§3.3) needs verification against +intended behaviour. If it's a bug, a fix to `add1(index)` is a +one-character change. If it's intentional, the rationale is non-obvious +and warrants a comment in the source. + +--- + +## 10. Notes for the Lisp port + +- The in protocol is currently a no-op; mirror it as such until the + mempool design is settled. +- The out protocol is a near-clone of block-out: one announce path, + one streaming serve path. Reuse the same actor template. 
+- The `sub1` vs `add1` continuation should be tested when porting. + +--- + +## 11. Notes for the formal model + +- The in protocol contributes no transitions; it can be modelled as + identity. +- The out protocol's streaming state machine is identical in shape + to filter-out and block-out — three serialised request types + collapse to one "drain queue then resubscribe" pattern in the + abstract. +- The dormancy of `chase::transaction` is a *deployment* fact, not a + model property. A spec should still encode the protocol's intended + behaviour given an active source. + +--- + +## Cross-references + +- [`01-event-bus.md`](01-event-bus.md) §2.6 (`chase::transaction` — + emitter / consumer table) +- [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md) §2.3 + — `txs_in_out` attach predicate +- [`08-block-out-protocols.md`](08-block-out-protocols.md) §2.5 + (streaming send loop — same pattern, working version) +- [`09-filter-out-70015.md`](09-filter-out-70015.md) §6 (streaming + + resubscribe pattern — same pattern, working version) +- BIPs 35, 37, 144, 339 (external) diff --git a/docs/architecture/11-protocol-block-in-106.md b/docs/architecture/11-protocol-block-in-106.md new file mode 100644 index 00000000..2b69dd5a --- /dev/null +++ b/docs/architecture/11-protocol-block-in-106.md @@ -0,0 +1,512 @@ +# 11 — `protocol_block_in_106` (legacy blocks-first) + +> Companion to [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md) +> §2.3 (attach tree) and +> [`02-chaser-organize.md`](02-chaser-organize.md) §4 +> (`chaser_block::validate`). +> +> `protocol_block_in_106` is the **legacy blocks-first sync protocol**. +> It is used when `node.headers_first` is configured false or when +> the peer does not negotiate `headers_protocol` (protocol version +> 31800). The protocol asks for block *inventory*, then for block +> *bodies*, organizing each block directly via `chaser_block` (the +> templated `chaser_organize` instantiation). 
+> +> Functionally it is the predecessor of `protocol_block_in_31800` but +> sits on a completely different pipeline: no `chaser_check`, no +> `chaser_validate`, no `chase::checked`/`chase::valid` events. + +| File | Lines | Role | +| --------------------------------------------------- | ----- | --------------------------------------------------------------- | +| `src/protocols/protocol_block_in_106.cpp` | 287 | Full implementation | +| `include/bitcoin/node/protocols/protocol_block_in_106.hpp` | 95 | Declaration + per-channel `track` struct | + +The file's own header comment is worth quoting (`:24-29`): + +> *"The block protocol is partially obsoleted by the headers protocol. +> Both block and header protocols conflate iterative requests and +> unsolicited announcements, which introduces several ambiguities. +> Furthermore inventory messages can contain a mix of types, further +> increasing complexity. Unlike header protocol, block protocol cannot +> leave announcement disabled until current and in both cases nodes +> announce to peers that are not current."* + +--- + +## 1. When is this attached? + +From the attach tree +([`06 §2.3`](06-sessions-and-protocols.md#23-attach_protocolschannel----line-session_peeripp57-161)): + +```cpp +// session_peer.ipp:115-133, the "block-in" arm: +if (headers && peer->is_negotiated(level::bip130)) { + channel->attach(self)->start(); + channel->attach(self)->start(); +} +else if (headers && peer->is_negotiated(level::headers_protocol)) { + channel->attach(self)->start(); + channel->attach(self)->start(); +} +else { + // Very hard to find < 31800 peer to connect with. + // Blocks-first synchronization (not base of block_in_31800). + channel->attach(self)->start(); +} +``` + +> **Invariant (BlockIn106-Attach-1).** This protocol is attached on a +> channel iff `headers_first == false` OR the peer doesn't negotiate +> `headers_protocol`. 
The inline comment notes such peers are rare in +> practice ("very hard to find < 31800 peer to connect with"). The +> protocol is present for completeness and as the operational mode +> when `node.headers_first` is configured false. + +> **Invariant (BlockIn106-Attach-2).** A channel never runs both +> `protocol_block_in_31800` and `protocol_block_in_106` — the +> `if/else if/else` is strict (the header explicitly states "This +> class does NOT inherit from protocol_block_in_106" at +> `protocol_block_in_31800.cpp:28`). + +--- + +## 2. State + +```cpp +// hpp:50-58 +using hashmap = std::unordered_set; + +struct track { + hashmap ids{}; // outstanding requested block hashes + size_t announced{}; // count from the inv that started this batch + system::hash_digest last{}; // last hash in that inv (for the next get_blocks) +}; + +const type_id block_type_; // witness_block or block (depending on node.witness) +track tracker_{}; // strand-protected +``` + +Compared to `protocol_block_in_31800`: + +| Feature | `_106` | `_31800` | +| ----------------------------- | -------------------- | ------------------------------ | +| Bus subscription | **no** | yes (`chase::download/purge/split/stall/report`) | +| Performance reporting | **no** | yes (via `protocol_performer`) | +| Per-channel work tracking | `hashmap` ids + count | `map_ptr` from `chaser_check` | +| Counterpart chaser | `chaser_block` (organize) | `chaser_check` (download orchestration) | +| Emits `chase::checked` etc. | **no** | yes | +| Speed/σ policing | **no** | yes | + +> **Invariant (BlockIn106-State-1).** All `tracker_` access is on the +> channel strand. The header `track` struct is "protected by strand" +> per `hpp:88`. + +--- + +## 3. 
Subscriptions and start + +```cpp +// :49-60 +void start() { + SUBSCRIBE_CHANNEL(block, handle_receive_block, _1, _2); + SUBSCRIBE_CHANNEL(inventory, handle_receive_inventory, _1, _2); + SEND(create_get_inventory(), handle_send, _1); // ← initial request + protocol_peer::start(); +} +``` + +No bus subscription. No `stopping` override. + +> **Invariant (BlockIn106-Sub-1).** Two channel subscriptions +> (`block`, `inventory`), one initial outbound message +> (`get_blocks`). No interaction with the chase event bus. + +--- + +## 4. The sync loop + +```mermaid +sequenceDiagram + autonumber + participant US as protocol_block_in_106 + participant PEER as peer + participant ORG as chaser_block (organize) + participant Q as query + + Note over US: start + US->>Q: get_candidate_hashes(heights(top_candidate)) + US->>PEER: get_blocks (locator) + PEER-->>US: inv (block hashes; up to max_get_blocks = 500) + US->>US: create_get_data: filter by !is_block(hash) + alt all known + opt inv.size == max_get_blocks + US->>PEER: get_blocks(last hash) + end + Note over US: log completion + else any new + US->>US: tracker_.ids = {new hashes}; announced; last + US->>PEER: get_data (new block hashes) + loop per block message + PEER-->>US: block + alt hash not in tracker_.ids + Note over US: log "unrequested"; ignore + else + US->>ORG: session.organize(block, handle_organize) + ORG-->>US: handle_organize(ec, height) (off-strand) + US->>US: POST do_handle_organize + Note over US: erase hash from tracker_.ids + alt ec error + US->>PEER: stop(ec) + else + Note over US: log success + alt tracker_.ids empty + alt announced == max_get_blocks + US->>PEER: get_blocks(tracker_.last) + else + Note over US: log completion + end + end + end + end + end + end +``` + +--- + +## 5. 
Inventory handling — `handle_receive_inventory` + +```cpp +// :66-119 +bool handle_receive_inventory(ec, message) { + if (stopped(ec)) return false; + + const auto block_count = message->count(type_id::block); + if (is_zero(block_count)) return true; // non-block inv; ignore + + // Work on only one block inventory at a time. + if (!tracker_.ids.empty()) { + // unrequested-while-busy: log and ignore + return true; + } + + const auto getter = create_get_data(*message); + + if (getter.items.empty()) { + // we already have everything in the inv + if (block_count == max_get_blocks) { + const auto& last = message->items.back().hash; + SEND(create_get_inventory(last), handle_send, _1); + } + // Otherwise: peer exhausted; just log + return true; + } + + // Some unknown blocks — request them + tracker_.announced = block_count; + tracker_.last = getter.items.back().hash; + tracker_.ids = to_hashes(block_count, getter); + SEND(getter, handle_send, _1); + return true; +} +``` + +> **Invariant (BlockIn106-Inv-1).** Only one batch of block requests +> is in flight per channel at a time. A new inv message arriving while +> `tracker_.ids` is non-empty is dropped (logged as "unrequested with +> pending"). This serializes inventory → request cycles on a single +> channel. + +> **Invariant (BlockIn106-Inv-2).** `create_get_data(inv)` filters by +> `!archive().is_block(hash)`. The node only requests blocks it +> doesn't already have. This eliminates most duplicate downloads even +> with multiple concurrent channels. + +> **Invariant (BlockIn106-Inv-3).** The "max-sized inv" heuristic: +> if the peer sent back exactly `max_get_blocks` (500) hashes, more +> are expected to follow; immediately request the next round with +> `get_blocks(last)`. If fewer than max, the peer is treated as +> exhausted (no further round-trip). 
+> +> The header comment block notes the ambiguity case at exactly 500 +> with no new blocks: completion is logged but no further round-trip +> is issued — "Completeness stalls if on 500 as empty message is +> ambiguous. This is ok, since complete is not used for anything +> essential." (`:202-205`) + +--- + +## 6. Block handling — `handle_receive_block` and `do_handle_organize` + +```cpp +// :125-146 +bool handle_receive_block(ec, message) { + if (stopped(ec)) return false; + const auto& block_ptr = message->block_ptr; + + // Unrequested block, may not have been announced via inventory. + if (tracker_.ids.find(block_ptr->get_hash()) == tracker_.ids.end()) { + return true; // log + ignore + } + + // organize is async; callback goes off-strand + organize(block_ptr, BIND(handle_organize, _1, _2, block_ptr)); + return true; +} + +// :149-153 (off-strand — post back to strand to access tracker_) +void handle_organize(ec, height, block_ptr) { + POST(do_handle_organize, ec, height, block_ptr); +} + +// :155-210 (stranded) +void do_handle_organize(ec, height, block_ptr) { + if (stopped() || ec == service_stopped) return; + + tracker_.ids.erase(block_ptr->get_hash()); + + if (ec == error::duplicate_block) return; // benign + if (ec) { stop(ec); return; } + + // Round complete? + if (tracker_.ids.empty()) { + if (tracker_.announced == max_get_blocks) { + SEND(create_get_inventory(tracker_.last), handle_send, _1); + } + // else: log completion + } +} +``` + +> **Invariant (BlockIn106-Recv-1).** Unrequested blocks are silently +> ignored. The protocol does not drop the peer for them — the comment +> notes "Many peers blindly broadcast blocks even at/above v31800, +> ugh" (`:176`). + +> **Invariant (BlockIn106-Recv-2).** Errors from `organize` (other +> than `duplicate_block` and `service_stopped`) drop the channel. +> This includes orphan_block (peer sent an out-of-order block) and +> any consensus failure detected in `chaser_block::validate`. 
+ +> **Invariant (BlockIn106-Order-1).** The header comment notes +> "Order is enforced by organize" (`:164`). Out-of-order blocks +> received from the peer become orphans (parent unknown) and +> `chaser_block::do_organize` returns `error_orphan` ⇒ this channel +> is dropped. So this protocol is intolerant of out-of-order +> delivery — different from headers-first which queues headers in a +> tree. + +### 6.1 The strand-hopping for tracker_ access + +`handle_organize` (off-strand) only POSTs to `do_handle_organize`; +all `tracker_` access is on the channel strand. This is the same +pattern used in `chaser_validate` for back-posting from the +validation threadpool. + +> **Invariant (BlockIn106-Strand-1).** `tracker_` is read/written +> only on the channel strand. The off-strand +> `handle_organize` does *no* state access; it merely POSTs. + +--- + +## 7. Locator construction + +```cpp +// :215-251 +get_blocks create_get_inventory() const { + const auto index = get_blocks::heights(query.get_top_candidate()); + return create_get_inventory(query.get_candidate_hashes(index)); +} + +get_blocks create_get_inventory(const hash_digest& last) const { + return create_get_inventory(hashes{ last }); +} + +get_blocks create_get_inventory(hashes&& hashes) const { + if (hashes.empty()) return {}; + return { std::move(hashes) }; +} +``` + +Notes from the inline comments (`:217-220`): +- Sync is from the archived (strong) candidate chain. +- Will bypass blocks with candidate headers if headers-first ran + previously — this can produce "block orphans if headers-first is + run followed by a restart and blocks-first". + +> **Invariant (BlockIn106-Locator-1).** Each channel syncs +> independently from the archived candidate top. Same logic as +> `protocol_header_in_31800::create_get_headers` but using +> `get_blocks::heights` for the locator (vs. +> `get_headers::heights`). 
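
For readers unfamiliar with block locators, the sketch below shows the conventional Bitcoin spacing of locator heights: dense near the top, then backing off exponentially down to genesis. `locator_heights` is a hypothetical name, and the exact threshold (ten dense entries) is an assumption; libbitcoin's `get_blocks::heights` is not reproduced here and may space heights differently.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical locator-height generator: the first ten heights step back
// by one, after which the step doubles on every entry. Height 0 (genesis)
// is always the final element.
std::vector<std::size_t> locator_heights(std::size_t top)
{
    constexpr auto half = std::numeric_limits<std::size_t>::max() / 2;
    std::vector<std::size_t> heights{};
    std::size_t step = 1;
    std::size_t height = top;

    while (height > 0)
    {
        heights.push_back(height);

        // Back off exponentially once the dense prefix is collected,
        // guarding the doubling against overflow.
        if (heights.size() >= 10 && step <= half)
            step *= 2;

        // Floored subtraction: never step below genesis.
        height = (height > step) ? height - step : 0;
    }

    heights.push_back(0);
    assert(heights.front() == top);
    return heights;
}
```

In this model a top of 20 yields `20, 19, ..., 11, 9, 5, 0`. The hashes at those heights on the candidate chain would then form the `get_blocks` locator, which lets a peer find the most recent common block in a handful of entries.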
+ +> **Note (BlockIn106-Mixed-Mode).** Switching from headers-first to +> blocks-first across a node restart can produce orphans because +> headers-first leaves "candidate headers without bodies" in the +> store. A blocks-first restart asks peers for blocks the headers of +> which are already on the candidate chain, but the parent linking +> may break. This is operational guidance, not a protocol +> obligation — flagged in the source at `:219-220`. + +--- + +## 8. Difference from `protocol_block_in_31800` — full table + +| Concern | `_106` (this) | `_31800` | +| ---------------------------------------- | ---------------------------------------------------------- | -------------------------------------------------------------- | +| Companion chaser | `chaser_block` (organize) | `chaser_check` (download orchestration) | +| What goes through `organize` | full block via `session.organize(block, ...)` | nothing — block goes to store directly via `query.set_code` | +| Validation lives where | `chaser_block::validate` hook (full block check + connect) | `chaser_validate` (separate strand, parallel pool) | +| Confirmation | Not emitted (`chaser_block` skips `chase::valid`) | `chaser_validate` emits `chase::valid` → `chaser_confirm` | +| Work attribution | per-channel `tracker_` | per-channel `map_ptr` from chaser; barrier (`job_`) | +| Inventory ↔ batches | One inv → one batch → next inv (strict serialisation) | Many maps concurrent across channels; split/stall rebalancing | +| `chase::checked`/`unchecked` emission | **none** | yes | +| Witness handling | `block_type_` set at construction | per-block at receive time | + +--- + +## 9. Bus integration + +**None.** Like `protocol_transaction_in_106`, this protocol does not +subscribe to nor emit bus events. 
Its only interactions with the rest +of the node are: + +- `session->organize(block, handler)` (forwarded to `chaser_block`) +- store reads in `create_get_inventory` and `create_get_data` + +> **Invariant (BlockIn106-Bus-1).** Zero `chase::` events flow +> through this protocol. The blocks-first pipeline is event-bus-free +> except for `chase::start`/`resume`/`bump` arriving at +> `chaser_block` (the organize template) and the +> `chase::regressed`/`disorganized` it emits. + +--- + +## 10. Error / outcome inventory + +| Source | Code | Behavior | +| ------------------------------------- | ------------------------------------- | -------------------------------------------------------- | +| organize returns `service_stopped` | (no action) | ignored | +| organize returns `duplicate_block` | (no action) | erase from tracker, ignored | +| organize returns any other error | `stop(ec)` | channel dropped | +| Unrequested block | (no action) | logged, ignored | +| Unrequested inv while pending | (no action) | logged, ignored | + +No node-faults. No `protocol_violation` drops either; this protocol +is *forgiving* of peers (matching the operational reality that peers +speaking this legacy protocol may be quirky). + +--- + +## 11. 
State machine view + +```mermaid +stateDiagram-v2 + [*] --> IDLE: start (SEND initial get_blocks) + IDLE --> AWAITING_INV: get_blocks sent + AWAITING_INV --> CHOOSE: inv received + CHOOSE --> AWAITING_INV: getter empty AND inv == max → SEND get_blocks(last) + CHOOSE --> IDLE: getter empty AND inv < max → log completion + CHOOSE --> AWAITING_BLOCKS: getter non-empty → SEND get_data, tracker filled + AWAITING_BLOCKS --> AWAITING_BLOCKS: block received, in tracker → organize\n(off-strand callback POSTs back) + AWAITING_BLOCKS --> AWAITING_BLOCKS: block received, NOT in tracker → ignore + AWAITING_BLOCKS --> ROUND_DONE: tracker empty + ROUND_DONE --> AWAITING_INV: announced == max → SEND get_blocks(tracker.last) + ROUND_DONE --> IDLE: announced < max → log completion + AWAITING_BLOCKS --> DROPPED: organize error (non-duplicate, non-stopped) + AWAITING_INV --> [*]: stop + AWAITING_BLOCKS --> [*]: stop + IDLE --> [*]: stop + DROPPED --> [*] +``` + +--- + +## 12. Spec view + +### 12.1 As a process + +``` +protocol_block_in_106 : Process + state: tracker : (ids : Set hash, announced : ℕ, last : hash) | Empty + inputs: + peer inv(items) → if tracker empty: enter round; else ignore + peer block(body) → if hash ∈ tracker.ids: organize then update tracker + outputs: + peer get_blocks(locator) + peer get_data(items) + session.organize(block, handler) + drop_channel(ec) + store reads: + get_top_candidate, get_candidate_hashes, is_block(hash) +``` + +### 12.2 Safety properties + +1. **Serial rounds per channel** (BlockIn106-Inv-1): one outstanding + request batch at a time. +2. **No duplicate fetch within node** (BlockIn106-Inv-2): only blocks + not already in the store are requested. +3. **Strict in-order acceptance** (BlockIn106-Order-1): out-of-order + blocks cause channel drop via organize error. +4. **No bus emissions** (BlockIn106-Bus-1): the protocol is invisible + on the event bus. + +### 12.3 Liveness + +- Progresses one round per peer reply. 
+- A peer that returns `max_get_blocks` keeps the channel active; one + that returns fewer ends the channel's contribution. + +### 12.4 Spec mapping to organize state machine + +The validation and storage of each block flow through +`chaser_block::do_organize` (`chaser_organize::do_organize`), +described in [`02 §3`](02-chaser-organize.md#3-do_organize-the-forward-state-machine). +Specifically: + +- `chaser_block::validate` (`02 §4`): runs `block.check + block.accept + + block.connect` synchronously per block. So in blocks-first mode, + full consensus validation happens *inside* the organize call from + this protocol. +- `chaser_block::is_storable` returns `true` always: every received + block is archived (not just cached). + +> **Spec implication.** In blocks-first mode, "checked" and +> "validated" coincide: a block that organize-returns success is +> validated. There is no separate `chase::valid` phase. The +> consensus contract is therefore concentrated in the +> `chaser_block::validate` hook, not split across check/validate +> chasers. + +--- + +## 13. Notes for the Lisp port + +- Single-stream-per-channel makes this much simpler than 31800. One + outstanding batch, one tracker hashset per channel. +- The strand-hopping for `tracker_` (§6.1) is the only concurrency + subtlety; modelable as posting back to a single mailbox. +- Out-of-order intolerance (BlockIn106-Order-1) simplifies the port — + no reordering buffer needed. + +--- + +## 14. Notes for the formal model + +- Pure transducer with one small piece of state (`tracker_`). +- Single-threaded per channel modulo the strand-hop, which is + observationally equivalent to a self-message. +- The "duplicate block" / "service stopped" exceptions are the only + non-terminating organize-error paths. 
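
As an illustration of the "pure transducer" view, the two inputs can be written as transition functions over the tracker. This is a hedged model, not the node's code: `tracker`, `action`, `result`, `on_inventory`, and `on_organized` are hypothetical names, hashes are plain strings, and the `unknown` argument stands in for the store filter performed by `create_get_data`.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using hash = std::string;                       // hypothetical: readable hashes
constexpr std::size_t max_get_blocks = 500;

struct tracker
{
    std::set<hash> ids{};       // outstanding requested block hashes
    std::size_t announced{};    // block count of the inv that opened the round
    hash last{};                // seed for the next get_blocks
};

enum class action { none, send_get_data, send_get_blocks, drop_channel };
enum class result { success, duplicate_block, service_stopped, failure };

// peer inv: `announced` is the inv's block count, `unknown` the subset of
// announced hashes not yet archived (in announcement order).
action on_inventory(tracker& state, std::size_t announced,
    const std::vector<hash>& unknown)
{
    assert(unknown.size() <= announced);

    if (announced == 0 || !state.ids.empty())
        return action::none;                    // non-block inv, or busy

    if (unknown.empty())                        // nothing new in this inv
        return announced == max_get_blocks ?
            action::send_get_blocks : action::none;

    state.ids = std::set<hash>(unknown.begin(), unknown.end());
    state.announced = announced;
    state.last = unknown.back();
    return action::send_get_data;
}

// organize completion for a requested block, already back on the strand.
action on_organized(tracker& state, const hash& id, result outcome)
{
    if (outcome == result::service_stopped)
        return action::none;

    state.ids.erase(id);

    if (outcome == result::duplicate_block)
        return action::none;                    // benign
    if (outcome == result::failure)
        return action::drop_channel;

    // Round complete: continue only after a max-sized inventory.
    if (state.ids.empty() && state.announced == max_get_blocks)
        return action::send_get_blocks;

    return action::none;
}
```

Keeping the effects as returned `action` values rather than side effects is a modelling choice; it leaves both functions deterministic in their inputs, which is what a model checker or a functional port would want.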
+ +--- + +## Cross-references + +- [`02-chaser-organize.md`](02-chaser-organize.md) §4 — the + `chaser_block::validate` hook that runs from this protocol's + `organize` call +- [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md) §2.3 + — attach tree (the `else` branch) +- [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md) §5 — + the modern counterpart `protocol_block_in_31800` diff --git a/docs/architecture/12-periphery-chasers.md b/docs/architecture/12-periphery-chasers.md new file mode 100644 index 00000000..b7820ae4 --- /dev/null +++ b/docs/architecture/12-periphery-chasers.md @@ -0,0 +1,512 @@ +# 12 — Periphery chasers (snapshot, storage, template, transaction) + +> Companion to the core-pipeline chaser docs +> [`02`](02-chaser-organize.md)–[`05`](05-chaser-confirm.md). +> +> The four chasers covered here are operationally peripheral to consensus. +> Three are **partial or stub implementations**; one (`chaser_storage`) +> is fully active. None are critical to the validate/confirm path. +> +> | Chaser | Role | Current state | Bus events | +> | ------------------- | ------------------------------------------- | --------------------------------- | --------------------------------------------------------- | +> | `chaser_snapshot` | Periodic store snapshots + one-shot prune | Partially live (prune only) | Consumes `chase::block`, `chase::snap` (dormant) | +> | `chaser_storage` | Disk-space recovery + reload after full | Fully live | Consumes `chase::space` | +> | `chaser_template` | Mining template construction | Stub | Consumes `chase::transaction` | +> | `chaser_transaction`| Tx graph / mempool | Stub | Emits `chase::transaction` once (no live consumers in pipeline) | +> +> Documented together because individually they are short and the +> design intent is most clearly understood as a single "operational +> support" subsystem. 
+ +| File | Lines | +| --------------------------------------------- | ----- | +| `src/chasers/chaser_snapshot.cpp` | 290 | +| `src/chasers/chaser_storage.cpp` | 178 | +| `src/chasers/chaser_template.cpp` | 96 | +| `src/chasers/chaser_transaction.cpp` | 112 | + +All four inherit from the base `chaser` ( +[`02 §2`](02-chaser-organize.md#2-public-interface-process-boundary) +applies to base lifecycle), run on their own strand on the network +pool, and subscribe to the bus exactly once in `start()`. + +--- + +## 1. `chaser_snapshot` + +### 1.1 Role + +Two distinct operational triggers: + +- **One-shot prune**: when the confirmed chain reaches its first + current block (`chase::block` from `chaser_confirm`), perform a + store *prune* of the prevout cache once. Latched by `pruned_`. +- **Explicit snapshot**: when a `chase::snap(height)` arrives, take + a full store snapshot. *No issuer of `chase::snap` exists in this + repo* — this handler is dormant + (see [`01-event-bus.md §2.1`](01-event-bus.md#21-work-shuffling)). + +### 1.2 State + +```cpp +// chaser_snapshot.cpp:39-48 (live state only) +std::atomic_bool pruned_{}; // one-shot latch for chase::block-driven prune + +// commented out — would gate periodic snapshots: +// const size_t snapshot_bytes_, snapshot_valid_, snapshot_confirm_; +// const bool enabled_bytes_, enabled_valid_, enabled_confirm_; +// size_t bytes_{}, valid_{}, confirm_{}; +``` + +The commented-out fields show the **planned design**: three +threshold-driven periodic-snapshot triggers (every N bytes archived, +every N blocks validated, every N blocks confirmed). The triggers and +their event handlers (`chase::blocks`/`checked`/`valid`/`confirmable`) +are commented out at `chaser_snapshot.cpp:89-116`. Only the +`chase::block` (singular) prune trigger is live. + +> **Invariant (Snapshot-State-1).** `pruned_` is set at most once per +> process lifetime. 
After the first successful prune, subsequent +> `chase::block` events are dropped at the handler entry +> (`chaser_snapshot.cpp:119`). + +### 1.3 Live event handlers + +```cpp +// :117-131 +case chase::block: // singular: from chaser_confirm.cpp:427 + if (pruned_.load()) break; + POST(do_prune, std::get(value)); + break; +case chase::snap: // payload height_t — no live issuer + POST(do_snap, std::get(value)); + break; +``` + +#### `do_prune(link)` — `chaser_snapshot.cpp:144-178` + +Calls the base `chaser::prune(handler)` (a wrapper around +`full_node::prune` — see +[`00-overview.md §6.2`](00-overview.md#62-suspend--resume--fault)). +The prune itself suspends the network, runs `query.prune(handler)`, +and leaves the network suspended. If prune *succeeded* and the chaser +was previously running (not suspended), this chaser **calls +`resume()` directly** to bring the network back up — and latches +`pruned_ = true`. + +If prune *failed* (typically because the store wasn't yet +coalesced), the chaser leaves `pruned_` false and the next +`chase::block` will retry. + +> **Invariant (Snapshot-Prune-1).** `chaser::prune` and the +> subsequent `resume()` are the cleanup of a suspend-then-resume +> sequence initiated by `full_node::prune`. The `running && +> !is_full()` check at `:169` ensures resume only happens if the +> chaser was running before *and* the prune didn't fill the disk. + +> **Invariant (Snapshot-Prune-2).** Prune is gated on +> `archive().is_coalesced()` (`:148`). If the store isn't ready, the +> handler returns silently; the next `chase::block` re-triggers it. +> Idempotency comes from the `pruned_` latch. + +#### `do_snap(height)` — `chaser_snapshot.cpp:180-188` + +Calls `take_snapshot(height)`, which calls the base +`chaser::snapshot(handler)`. Same suspend/resume dance as prune. + +Currently unreachable — no `chase::snap` issuer in source. 
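
The retry-until-success behaviour of the prune path in this section can be sketched as a small one-shot latch. The class below is a hypothetical model: `prune_latch`, `on_block`, and the injected `attempt` callback (standing in for `chaser::prune`) are assumptions, and the suspend/resume handling is deliberately left out.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical model of the pruned_ latch: each chase::block event may
// trigger one prune attempt until an attempt succeeds, after which all
// further events are ignored.
class prune_latch
{
public:
    // `attempt` stands in for the real prune call; it returns true on success.
    explicit prune_latch(std::function<bool()> attempt)
      : attempt_(std::move(attempt))
    {
        assert(static_cast<bool>(attempt_));
    }

    // Handles one chase::block event. Returns true only for the event on
    // which the prune actually succeeded.
    bool on_block(bool coalesced)
    {
        if (pruned_.load())
            return false;           // already latched, event dropped

        if (!coalesced)
            return false;           // store not ready, retry on a later event

        if (!attempt_())
            return false;           // prune failed, latch stays open

        pruned_.store(true);
        return true;
    }

    bool pruned() const { return pruned_.load(); }

private:
    std::function<bool()> attempt_;
    std::atomic_bool pruned_{ false };
};
```

The point of the model is that idempotency comes from the latch alone: failed or skipped attempts leave no state behind, so the next `chase::block` simply tries again.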
+ +### 1.4 Coupling + +```mermaid +flowchart LR + CNF[chaser_confirm] -- "chase::block (link)\nonly when current" --> SNP[chaser_snapshot] + DEAD[(no issuer)] -. "chase::snap (height)" .-> SNP + SNP -- "calls archive().prune,\narchive().snapshot,\nresume()" --> NODE[full_node] +``` + +### 1.5 Spec view + +- Process state: `pruned ∈ {false, true}` (monotone, one-way latch). +- One observable effect: at most one successful prune of the store + per process lifetime, gated on `archive().is_coalesced()`. Failed + attempts are retried on later `chase::block` events. +- The `chase::snap` arm is **dead code in current builds**; an + interpreter / formal model can either omit it or model it as + unreachable. + +--- + +## 2. `chaser_storage` + +### 2.1 Role + +Single, narrow responsibility: when the network is **suspended due to +a full-disk fault**, monitor disk free space and **reload + resume** +once space becomes available. + +This is the only chaser that owns a `network::deadline` timer +directly (separate from any threadpool). + +### 2.2 State + +```cpp +// chaser_storage.hpp (implied), chaser_storage.cpp:39-43, 51 +const std::filesystem::path store_; // database root path +network::deadline::ptr disk_timer_; // 1-second tick; constructed in start() +``` + +### 2.3 Lifecycle + +```cpp +// :48-55 +code start() { + disk_timer_ = std::make_shared<network::deadline>(log, strand(), seconds{1}); + SUBSCRIBE_EVENTS(handle_event, _1, _2, _3); + return error::success; +} + +// :57-71 +void stopping(ec) { POST(do_stopping, ec); } +void do_stopping(ec) { + if (disk_timer_) { disk_timer_->stop(); disk_timer_.reset(); } +} +``` + +### 2.4 Event handling + +```cpp +// :76-100 +case chase::space: // from full_node::fault when is_full + POST(do_space, count_t{}); + break; +case chase::stop: + return false; +``` + +`do_space` (`:105-112`) jump-starts the polling loop with an immediate +`handle_timer(success)` (no wait for the first tick). 
+ +### 2.5 The polling loop + +```mermaid +stateDiagram-v2 + [*] --> IDLE: start() + IDLE --> POLLING: chase::space → do_space → handle_timer(success) + POLLING --> POLLING: timer fires; suspended ∧ ¬is_fault ∧ ¬have_capacity\n→ restart timer + POLLING --> RELOAD: timer fires; have_capacity() + RELOAD --> IDLE: do_reload → archive.reload(); resume() + POLLING --> IDLE: !suspended OR is_fault → cancel (no action) + POLLING --> [*]: stop + IDLE --> [*]: stop +``` + +```cpp +// :117-142 +void handle_timer(ec) { + if (closed() || !disk_timer_ || ec == operation_canceled) return; + if (ec && ec != operation_timeout) { + LOGF(...); return; + } + // Network is resumed or store is failed, cancel monitoring. + if (!suspended() || archive().is_fault()) return; + + if (!have_capacity()) { + disk_timer_->start(BIND(handle_timer, _1)); // wait another second + return; + } + + // Disk now has space, reset store condition and resume network. + do_reload(); +} +``` + +```cpp +// :144-163 +void do_reload() { + if (const auto ec = reload([this](event_, table) NOEXCEPT { /* log */ })) + LOGF("Reload from disk full condition failed, " << ec.message()); + else { + resume(); + LOGN("Reload from disk full complete in ..."); + } +} +``` + +```cpp +// :165-173 +bool have_capacity() const { + size_t have{}; + const auto require = archive().get_space(); + return file::space(have, store_) && have >= require; +} +``` + +> **Invariant (Storage-Loop-1).** The polling loop exits exactly when +> either: +> - `!suspended()` (network was resumed externally) OR +> - `archive().is_fault()` (different, non-recoverable failure) OR +> - disk has capacity, triggering `do_reload + resume()`. +> +> Each exit returns the chaser to its IDLE state, ready for the next +> `chase::space`. + +> **Invariant (Storage-Reload-1).** `archive().reload(handler)` is +> called only after `file::space` returns ≥ `archive().get_space()`. 
+> A failure here is logged but not faulted upward — the chaser will +> *not* loop on reload failure; subsequent space events would. + +### 2.6 Coupling + +```mermaid +flowchart LR + FN[full_node::fault\n(when query.is_full)] -- "chase::space" --> STG[chaser_storage] + STG -- "archive().reload\nresume()" --> NODE[full_node] + STG -- "file::space(store_)" --> FS[(filesystem)] +``` + +### 2.7 Spec view + +- Process state: `{IDLE, POLLING}`; timer is conceptually a self-tick + message. +- One observable effect: clear the store's full-disk condition and + resume the network, exactly once per `chase::space` event (assuming + capacity is eventually restored). +- Liveness: assumes the operator adds disk space. + +--- + +## 3. `chaser_template` + +### 3.1 Role + +**Stub** for mining-template construction. Currently subscribes to +`chase::transaction` and ignores it. The intended behaviour (per +TODOs at `:43, :64, :86`) is: + +1. On chain top change (confirmed or candidate), recompute the + block template. +2. On transaction graph change, update the template's tx set. +3. Emit `chase::template_(height)` to wake any external miner. + +`chase::template_` has **no live issuer** in this repo +(see [`01-event-bus.md §2.6`](01-event-bus.md#26-confirm-chain-and-mining)). + +### 3.2 Live code + +The entire substantive body: + +```cpp +// :44-48 +code start() { + SUBSCRIBE_EVENTS(handle_event, _1, _2, _3); + return error::success; +} + +// :53-84 +bool handle_event(ec, event_, value) { + if (closed()) return false; + if (suspended()) return true; + + switch (event_) { + case chase::transaction: + POST(do_transaction, std::get<transaction_t>(value)); + break; + case chase::stop: + return false; + default: break; + } + return true; +} + +// :87-90 +void do_transaction(transaction_t) { + BC_ASSERT(stranded()); + // (empty body) +} +``` + +That's all the live code.
The TODO at `:64` notes: *"also handle +confirmed/unconfirmed"* — meaning `chase::organized` and +`chase::reorganized` are the next events to wire in. + +### 3.3 Spec view + +- **No state**, **no observable effects** in current builds. +- A spec / port can treat this as identity (no transitions modify + any shared state) until the chaser is implemented. +- The natural design once implemented: maintain a current `template_` + (block template) as a function of confirmed top + selected tx set; + emit `chase::template_(height)` after each recomputation. + +--- + +## 4. `chaser_transaction` + +### 4.1 Role + +**Stub** for mempool / tx-graph management. Subscribes to the bus but +its `handle_event` switches only on `chase::stop`. The +`do_confirmed(header_t)` method (not wired to any bus event currently) +contains the one live emission of `chase::transaction(transaction_t{})`, +giving rise to the single startup-time announcement that exercises +`chaser_template`'s and `protocol_transaction_out_106`'s subscriptions +([`10 §6`](10-tx-protocols.md#6-where-chasetransaction-comes-from)). + +### 4.2 Live code + +```cpp +// :44-48 +code start() { + SUBSCRIBE_EVENTS(handle_event, _1, _2, _3); + return error::success; +} + +// :53-78 +bool handle_event(ec, event_, value) { + if (closed()) return false; + // TODO: allow required messages. + switch (event_) { + case chase::stop: + return false; + default: break; + } + return true; +} + +// :81-86 — invoked from where? not from handle_event +void do_confirmed(header_t) { + BC_ASSERT(stranded()); + notify(error::success, chase::transaction, transaction_t{}); +} + +// :91-95 (stub) +void store(const transaction::cptr&) { + // Push new checked tx into store and update DAG. +} + +// :98-106 (stub) +void do_store(const transaction::cptr&) { + BC_ASSERT(stranded()); + // TODO: validate and store transaction. + ////notify(error::success, chase::transaction, link); // ← commented out +} +``` + +### 4.3 Where is `do_confirmed` called from? 
+ +Not from `handle_event` in this file. A grep across the codebase shows +**no caller** for `chaser_transaction::do_confirmed` in this repo. +The single `chase::transaction` emission is therefore reached only +through tests or via a future caller not yet wired. + +> **Invariant (TxChaser-State-1).** In a normal node run, *no* +> bus event in this repo's wiring causes `chaser_transaction` to +> emit `chase::transaction`. The protocol_transaction_out_106's bus +> subscription is therefore effectively idle in current deployments. +> This is a known design state, not a bug — the `chaser_transaction` +> is awaiting a mempool design. + +### 4.4 Spec view + +Same as `chaser_template`: no state, no observable effects beyond +`chase::stop`-driven unsubscribe. + +The skeleton of the intended design is visible in the comments at +`:91-95` (push tx into store + update DAG + emit `chase::transaction`). +A port or spec can encode it as the obvious mempool: a partial DAG +ordered by dependencies, with insertion emitting per-tx events. + +--- + +## 5. 
Summary table of bus interactions for the periphery + +Reproduces and refines [`01-event-bus.md §3`](01-event-bus.md#3-verified-issuer--handler-diagram) +for these four chasers: + +| Chaser | Subscribes to (live) | Subscribes to (commented out) | Emits (live) | Emits (planned) | +| ------------------- | ------------------------------------ | -------------------------------------------------- | ------------ | --------------- | +| `chaser_snapshot` | `chase::block`, `chase::snap` | `chase::blocks`, `chase::checked`, `chase::valid`, `chase::confirmable` | (none) | (none) | +| `chaser_storage` | `chase::space`, `chase::stop` | — | (none) | (none) | +| `chaser_template` | `chase::transaction`, `chase::stop` | "also handle confirmed/unconfirmed" | (none) | `chase::template_` | +| `chaser_transaction`| `chase::stop` only | (none) | `chase::transaction` (from `do_confirmed`, no live caller) | per accepted tx | + +> **Invariant (Periphery-Bus-1).** Three of the four periphery chasers +> are *consumers only* (or completely idle). `chaser_transaction` is +> the only one with a live emission, and that emission has no live +> trigger in the current wiring. Operationally these four are +> bookkeeping infrastructure awaiting upstream completeness. + +--- + +## 6. 
Spec view (combined) + +For a formal model the periphery breaks down cleanly: + +``` +chaser_snapshot : Process + state: pruned : Bool -- one-way latch + inputs: chase::block(link) — fires do_prune unless pruned + chase::snap(height) — fires do_snap (dormant; no issuer) + effects: archive.prune (network suspend → reload → resume), archive.snapshot + +chaser_storage : Process + state: in {IDLE, POLLING} + inputs: chase::space — switch IDLE → POLLING + timer tick — re-check capacity; if ok do_reload → IDLE + effects: archive.reload (clears fault), resume() + +chaser_template : Process + state: ∅ (stub) + inputs: chase::transaction (ignored) + effects: none + +chaser_transaction : Process + state: ∅ (stub) + inputs: chase::stop only + effects: none in normal operation +``` + +The two stubs (template, transaction) can be omitted from a formal +model entirely until they are implemented — neither modifies any +shared state. + +--- + +## 7. Notes for the Lisp port + +- Three of the four can start as one-liners. `chaser_storage` is the + only one with non-trivial behaviour, and even there the logic is + ~30 lines. +- A Lisp port can defer `chaser_template` and `chaser_transaction` + pending decisions about the mempool. +- `chaser_snapshot::pruned_` is a single boolean per process; trivial. +- `chaser_storage`'s deadline timer can be implemented with any + scheduler primitive (asio's `deadline_timer`, an actor with a + delayed self-message, etc.). + +--- + +## 8. Notes for the formal model + +- These four chasers can be modelled with very small state spaces and + contribute little to the global proof obligation set. +- `chaser_snapshot`'s `pruned_` latch is a standard monotone Boolean. +- `chaser_storage`'s polling loop is a classic "wait until precondition + holds" pattern; formally a liveness property "if `chase::space` is + emitted and disk eventually has capacity, the system eventually + reaches `RUNNING`". 
+- The two stubs contribute nothing until implemented; flag them as + "external" in any spec. + +--- + +## Cross-references + +- [`00-overview.md`](00-overview.md) §6.2 (the suspend/resume/fault + cycle that `chaser_storage` participates in) +- [`01-event-bus.md`](01-event-bus.md) §2 (verified event table — + rows for `block`, `snap`, `space`, `transaction`, `template_`) +- [`05-chaser-confirm.md`](05-chaser-confirm.md) §3 (emits + `chase::block` consumed here by `chaser_snapshot::do_prune`) +- [`10-tx-protocols.md`](10-tx-protocols.md) §6 (consumer of the dormant + `chase::transaction` from `chaser_transaction`) diff --git a/docs/architecture/README.md b/docs/architecture/README.md new file mode 100644 index 00000000..087d523c --- /dev/null +++ b/docs/architecture/README.md @@ -0,0 +1,231 @@ +# libbitcoin-node — Architecture documentation + +This directory contains a top-down, source-anchored description of +libbitcoin-node. It was written to serve two downstream purposes: + +1. A **Lisp re-implementation** — needs precise functional + decomposition, explicit module boundaries, and concurrency + semantics. +2. A **formal-verification effort** — needs state machines, + invariants, pre/post-conditions, and a clear bus-vs.-store + ownership model. + +Every behavioural claim is anchored with `file:line` references into +the C++ source so it can be re-verified or re-derived if source drifts. 
+ +--- + +## The 13 documents + +| # | File | Topic | +| -- | ------------------------------------------------------------------------- | --------------------------------------------------------------------------- | +| 00 | [`00-overview.md`](00-overview.md) | Top-down map: layer stack, object graph, threading model, lifecycle | +| 01 | [`01-event-bus.md`](01-event-bus.md) | Every `chase` event with verified issuer/handler sites; methodology | +| 02 | [`02-chaser-organize.md`](02-chaser-organize.md) | Templated header+block organize state machine; 15 `organizeN` fault sites | +| 03 | [`03-chaser-check.md`](03-chaser-check.md) | Block-download orchestration; race_all purge barrier; σ slow-peer detector | +| 04 | [`04-chaser-validate.md`](04-chaser-validate.md) | Consensus validation; own threadpool + backlog; 8 `validateN` fault sites | +| 05 | [`05-chaser-confirm.md`](05-chaser-confirm.md) | UTXO double-spend check + confirmed-chain writer; 12 `confirmN` fault sites | +| 06 | [`06-sessions-and-protocols.md`](06-sessions-and-protocols.md) | Sessions, the protocol attach tree, `protocol_block_in_31800` | +| 07 | [`07-header-protocols.md`](07-header-protocols.md) | Header in/out at 31800 + 70012; anti-echo discipline | +| 08 | [`08-block-out-protocols.md`](08-block-out-protocols.md) | Block out at 106 + 70012 supersede gate; streaming send loop | +| 09 | [`09-filter-out-70015.md`](09-filter-out-70015.md) | BIP157/158 client filters; three request types; one-shot stream trick | +| 10 | [`10-tx-protocols.md`](10-tx-protocols.md) | Tx in (stub) + tx out at 106; flags a suspect `sub1`/`add1` line | +| 11 | [`11-protocol-block-in-106.md`](11-protocol-block-in-106.md) | Legacy blocks-first sync (used when peer doesn't negotiate `headers_protocol`) | +| 12 | [`12-periphery-chasers.md`](12-periphery-chasers.md) | `chaser_snapshot`, `chaser_storage`, `chaser_template`, `chaser_transaction` | + +--- + +## Recommended reading orders + +### For a new contributor +Read in order: 
**00 → 01 → 02 → 03 → 04 → 05** for the consensus +pipeline, then **06 → 07/08/11** for the corresponding peer protocols. +Skim 09, 10, 12 as needed. + +### For a Lisp re-implementer +Start with **00** (layer stack + object graph), then **01** (the +event bus is the system's interface backbone — read this carefully). +Then **02–05** for the consensus pipeline, paying attention to each +doc's "Notes for the Lisp port" §s. Network layer +(**06–11**) maps cleanly onto per-channel actors. Periphery +(**12**) can be deferred — three of the four chasers there are stubs +or near-stubs. + +### For a formal-verification reader +**01** is foundational — the `chase` events are the inter-process +interface. Then **02 §3, §5, §7** (organize state machine, disorganize +state machine, error-code → proof-obligation list), **04 §4, §6** +(validate consensus + error inventory), **05 §4–§8** (confirm +algorithm, rollback atomicity, error inventory). The numbered +invariants throughout (`Validate-Backlog-1`, `Confirm-Rollback-1`, +etc.) are the recommended target set for a TLA+/Alloy spec. + +### For a peer-protocol implementer +**00 §7** for the network-layer overview, then **06** for the +attach tree, then read whichever versioned family you need +(**07** for headers, **08+11** for blocks, **09** for filters, +**10** for tx). + +--- + +## Conventions used throughout + +### File:line citations +Every non-trivial behavioural claim has a `path/to/file.cpp:N` (or +`.hpp`, `.ipp`) citation. If you change source, run +`grep -rn 'pattern' src include` from the repo root to find new +locations and update the citation. + +### Numbered invariants +Each doc names its invariants in the form `Topic-N` (e.g. +`Validate-Bypass-1`, `Confirm-Rollback-1`). The intent is that a +formal-spec encoding can cite them stably. 
Major invariant +families: + +| Prefix | Doc | Roughly | +| --------------- | -------------------------------- | -------------------------------------------------- | +| `Concurrency-*` | 00 | Strand discipline across the node | +| `Lifecycle-*` | 00 | start/run/close ordering | +| `Store-*` | 00 | Suspend/resume around store maintenance | +| `Bus-*` | 01 | Subscription/unsubscription semantics | +| `Organize-*` | 02 | Template state machine | +| `Check-*` | 03 | Download orchestration | +| `Validate-*` | 04 | Consensus validation | +| `Confirm-*` | 05 | Confirmation + UTXO | +| `Session-*`, `Attach-*`, `Protocol-*`, `Observer-*`, `Performer-*`, `BlockIn-*`, `HeaderIn-*`, `HeaderOut-*`, `HeaderBus-*`, `Anti-Echo-*`, `BlockOut-*`, `Filter-*`, `TxProto-*`, `TxAttach-*`, `TxOut-*`, `TxIn-*`, `TxEvent-*`, `BlockIn106-*` | 06–11 | Network layer | +| `Snapshot-*`, `Storage-*`, `Periphery-*`, `TxChaser-*` | 12 | Periphery chasers | + +### Mermaid diagrams +All diagrams use Mermaid (sequence, flowchart, stateDiagram-v2, +classDiagram). They render natively on GitHub. + +### "Spec view" / "Notes for the Lisp port" / "Notes for the formal model" +Most subsystem docs have these three closing sections. They are the +distilled-for-export view of each subsystem. + +--- + +## Honest caveats + +1. **No build verification.** Citations are textual. They were + re-grepped during writing but the code has not been built or + executed during this documentation effort. A version drift + between source and docs is possible. +2. **`block.accept` / `block.connect` semantics** live in + libbitcoin-system, not this repo. Docs 02/04/05 describe the + *sequencing* of those calls, not their content. The consensus + surface is delegated to libbitcoin-system documentation. +3. **Store consistency** is treated as an oracle. Every numbered + `organizeN`/`validateN`/`confirmN` fault is documented as a proof + obligation against store-consistency invariants supplied by + libbitcoin-database. 
A full proof would require importing those. +4. **Discrepancies with `chase.hpp` comments**: the inline event-bus + doc in `chase.hpp` has several stale entries (issuer + misattributions, dormant events). All are flagged in + [`01-event-bus.md`](01-event-bus.md) with `⚠` and discussed in + §2 of that doc. +5. **Stub subsystems**: `chaser_template`, `chaser_transaction`, and + `protocol_transaction_in_106` are stubs. The docs describe their + wiring and intended design but note clearly where current + behaviour is "no-op". +6. **Suspect source line**: [`10 §3.3`](10-tx-protocols.md#33-inbound-get_data--streamed-tx-send) + flags `protocol_transaction_out_106.cpp:187` as a possible + off-by-one (`sub1(index)` where `add1(index)` looks intended) — + not asserted as a bug; worth code review. + +--- + +## Coverage map + +``` +libbitcoin-node +├── src/ +│ ├── full_node.cpp ........................ 00 +│ ├── configuration.cpp / settings.cpp ..... (mentioned in 00, not detailed) +│ ├── error.cpp ............................ 00 §9 +│ ├── block_arena.cpp / block_memory.cpp ... 00 §8 (overview only) +│ │ +│ ├── chasers/ +│ │ ├── chaser.cpp ....................... 00 §3 (base; per-doc as needed) +│ │ ├── chaser_block.cpp ................. 02 §4 +│ │ ├── chaser_header.cpp ................ 02 §4, §6.2 +│ │ ├── chaser_check.cpp ................. 03 +│ │ ├── chaser_validate.cpp .............. 04 +│ │ ├── chaser_confirm.cpp ............... 05 +│ │ ├── chaser_snapshot.cpp .............. 12 §1 +│ │ ├── chaser_storage.cpp ............... 12 §2 +│ │ ├── chaser_template.cpp .............. 12 §3 +│ │ └── chaser_transaction.cpp ........... 12 §4 +│ │ +│ ├── sessions/ +│ │ ├── session.cpp ...................... 06 §1 +│ │ ├── session_inbound.cpp .............. 06 §1.3 +│ │ ├── session_outbound.cpp ............. 06 §1 (typedef) +│ │ └── session_manual.cpp ............... 06 §1 (typedef) +│ │ +│ └── protocols/ +│ ├── protocol.cpp ..................... 
06 §3 +│ ├── protocol_peer.cpp ................ 06 (referenced); 07 §8 anti-echo +│ ├── protocol_observer.cpp ............ 06 §3.2 +│ ├── protocol_performer.cpp ........... 06 §4 +│ │ +│ ├── protocol_block_in_106.cpp ........ 11 +│ ├── protocol_block_in_31800.cpp ...... 06 §5 +│ ├── protocol_block_out_106.cpp ....... 08 §2 +│ ├── protocol_block_out_70012.cpp ..... 08 §3 +│ │ +│ ├── protocol_header_in_31800.cpp ..... 07 §2 +│ ├── protocol_header_in_70012.cpp ..... 07 §3 +│ ├── protocol_header_out_31800.cpp .... 07 §4 +│ ├── protocol_header_out_70012.cpp .... 07 §5 +│ │ +│ ├── protocol_filter_out_70015.cpp .... 09 +│ ├── protocol_transaction_in_106.cpp .. 10 §2 +│ └── protocol_transaction_out_106.cpp . 10 §3 +│ +└── include/bitcoin/node/ + ├── full_node.hpp ........................ 00 + ├── chase.hpp ............................ 00 §4; 01 (verified against source) + ├── events.hpp ........................... 00 §4 (metrics enum) + ├── error.hpp ............................ 00 §9 + └── impl/chasers/chaser_organize.ipp ..... 02 +``` + +Files **not** detailed (treated as "use as documented"): +- `configuration.cpp` / `settings.cpp` — straightforward configuration +- `messages/` headers — wire types from libbitcoin-network +- `channels/` headers — channel base from libbitcoin-network with a + small node-specific subclass + +--- + +## Next steps (suggested follow-on docs, not yet written) + +- **`docs/spec/`** — TLA+ or Alloy skeletons encoding the numbered + invariants. The recommended starting set: `Validate-Backlog-1`, + `Validate-Ordering-1`, `Confirm-Rollback-1`, `Confirm-Order-1`, + `Organize-Disorg-1`, `Bus-1..2`. These are the smallest set whose + proofs would cover the consensus-critical safety claims. +- **`docs/lisp/`** — per-subsystem porting notes that turn each + doc's "Notes for the Lisp port" section into concrete data type + + module layouts. +- **A change log** — when source changes, the affected `file:line` + citations should be updated. 
A `docs/architecture/CHANGES.md` + recording which docs were re-verified at which git SHA would + help keep the corpus honest over time. + +--- + +## How to update these docs after source changes + +1. Run the verification greps from + [`01-event-bus.md §4`](01-event-bus.md#4-methodology) to detect + bus-graph drift. +2. For per-chaser docs, the relevant invariants reference specific + `file:line`s — re-grep those when their files change. +3. The error inventories in each chaser doc (`organizeN`, + `validateN`, `confirmN`) are exhaustive at the time of writing + — any new `error::N` entry should add a row. +4. The attach tree in + [`06 §2.3`](06-sessions-and-protocols.md#23-attach_protocolschannel----line-session_peeripp57-161) + should be re-checked whenever `session_peer.ipp` is modified. From 79fd214606cbe3347400d3087c2434ed6e730f03 Mon Sep 17 00:00:00 2001 From: rob Date: Sun, 17 May 2026 18:54:15 +0200 Subject: [PATCH 2/2] Apply review fixes to architecture docs Reviewer-confirmed corrections from evoskuil across 12 docs: - 02: milestone allows validation bypass (not chain-fixing); chaser_block skips milestones because blocks-first has no PoW DoS guard; debug-only checks are !NDEBUG; LRU eviction on tree_ would create a new DoS vector. - 03: get_inventory_size gates on candidate-chain currentness, with the weak-chain rationale (not "wait until caught up"). - 04: consensus is split across headers, block-receive, this chaser, and confirm; intro softened from "single source of consensus acceptance". - 05: !NDEBUG polarity fix; expanded block_confirmable to describe strong-tx association, maturity, and relative-locktime rules. - 06: session-template class diagram now shows all three instantiations; recent != current (max-height config for testing). - 08: superseded_ is atomic because superseded() is protected and read non-stranded from the base. - 09: unhandled channel messages are ignored, not protocol_violation. 
- 10: sub1/add1 was a real off-by-one bug, fixed in PR #1007. - 11: order-discipline is the same as headers-first; BIP130 typo fix. - 12: chaser_storage timer runs on the chaser's strand (network threadpool), not a separate pool. - 00, README: roll-up updates. --- docs/architecture/00-overview.md | 9 ++- docs/architecture/02-chaser-organize.md | 33 ++++++++--- docs/architecture/03-chaser-check.md | 13 ++++- docs/architecture/04-chaser-validate.md | 25 +++++++- docs/architecture/05-chaser-confirm.md | 58 ++++++++++++++----- .../architecture/06-sessions-and-protocols.md | 52 ++++++++++++----- docs/architecture/08-block-out-protocols.md | 17 +++--- docs/architecture/09-filter-out-70015.md | 21 ++++--- docs/architecture/10-tx-protocols.md | 41 ++++++------- docs/architecture/11-protocol-block-in-106.md | 13 +++-- docs/architecture/12-periphery-chasers.md | 6 +- docs/architecture/README.md | 17 ++++-- 12 files changed, 214 insertions(+), 91 deletions(-) diff --git a/docs/architecture/00-overview.md b/docs/architecture/00-overview.md index ce47dfa2..186b94af 100644 --- a/docs/architecture/00-overview.md +++ b/docs/architecture/00-overview.md @@ -155,9 +155,12 @@ its *own* mutations while still running in parallel with the other chasers on its own strand … allowing concurrent chaser operations to the extent that threads are available"*). -This is the **central source of parallelism** in the node. The chasers form -a pipeline; each stage runs on its own strand and they communicate by -publishing events. +This is one of the two main axes of parallelism in the node. The chasers +form a pipeline; each stage runs on its own strand and they communicate +by publishing events. The other axis, equally important, is **per-channel +strands**: every peer connection also runs on its own strand. Peers and +chasers therefore execute concurrently with each other, bounded only by +the shared threadpool size. 
--- diff --git a/docs/architecture/02-chaser-organize.md b/docs/architecture/02-chaser-organize.md index 283c7b75..0dee8590 100644 --- a/docs/architecture/02-chaser-organize.md +++ b/docs/architecture/02-chaser-organize.md @@ -469,8 +469,13 @@ Exit paths: ### 6.2 Header milestone tracking (`chaser_header` only) -A *milestone* is a configured `(hash, height)` pair that fixes the chain. -Functionally similar to a checkpoint but mutable per node settings. +A *milestone* is a configured `(hash, height)` pair. Unlike a +checkpoint, a milestone does **not** fix the chain — the node can +still reorganise around it. What a milestone *does* is allow the +**bypass of validation** of all blocks up to the milestone height, +*if* the milestone is found in the active candidate chain. So a +milestone is an operational optimisation gated on the node's own +configuration, not a consensus commitment. State: `active_milestone_height_` is the height of the *most recent milestone observed on the current candidate*. Initialised by @@ -493,8 +498,17 @@ and: > the candidate is reorganized below the milestone (a rare event), and > only via `update_milestone`. -`chaser_block` skips milestones entirely. The full block already carries -enough state to validate without the heuristic. +`chaser_block` skips milestones entirely, for a different reason than +"the full block carries enough state". The blocks-first design has no +PoW guard before archival: a peer can flood the node with full +blocks, and without an upstream headers-first chain to gate which +blocks are *worth* the work, the node has no cheap way to refuse +malicious blocks short of validating them. (One could imagine running +headers-first internally by downloading every block and stripping its +txs — but that is prohibitively expensive and redundant with running +headers-first directly.) 
So in blocks-first mode, **every block must +be validated before archival**, full stop; bypass-on-milestone would +defeat the only DoS guard the mode has. --- @@ -523,8 +537,8 @@ store-corruption error). For a formal model, each is a proof obligation: | `organize13` | `ipp:428-432` | disorganize: `get_candidate_chain_state(fork_point)` after rebuild returned null | | `organize14` | `ipp:515` (in `push_block`) | `set_organized` failed after a successful `set_code` (we archived but couldn't push to candidate) | | `organize15` | `ipp:521-523` (in `push_block(key)`)| Tree extract returned no handle (item was missing when expected) | -| `stalled_channel` | `ipp:471-475` (in `set_organized`, NDEBUG-only check) | Candidate height isn't `top+1` (broken sequencing) | -| `suspended_channel`| `ipp:477-483` (in `set_organized`, NDEBUG-only check) | Parent of new candidate isn't current top (broken sequencing) | +| `stalled_channel` | `ipp:471-475` (in `set_organized`, debug-only `!NDEBUG` check) | Candidate height isn't `top+1` (broken sequencing). Redundant safety check; release builds skip it. | +| `suspended_channel`| `ipp:477-483` (in `set_organized`, debug-only `!NDEBUG` check) | Parent of new candidate isn't current top (broken sequencing). Redundant safety check; release builds skip it. | > **Spec obligation list.** A formal model should be able to discharge > `organize2` through `organize15` as unreachable, given: @@ -570,8 +584,11 @@ store-corruption error). For a formal model, each is a proof obligation: factoring in §4.1. - `tree_` is naturally a `hash-table` keyed by header hash. The DoS - concern (`§6.1 TODO`) can be enforced by a size cap with a - least-recently-used eviction. + concern flagged at `§6.1 TODO` is real but the obvious mitigation + (an LRU eviction cap) is **not** the right answer — that would open + a new DoS vector where an attacker forces eviction of legitimate + weak branches. 
The right fix is more subtle and not solved here; + treat the unbounded-tree assumption as load-bearing. - `update_milestone` walks `tree_` by parent-hash chain — straightforward recursion. diff --git a/docs/architecture/03-chaser-check.md b/docs/architecture/03-chaser-check.md index fe1da630..7c9ed338 100644 --- a/docs/architecture/03-chaser-check.md +++ b/docs/architecture/03-chaser-check.md @@ -330,10 +330,19 @@ generation. It: `get_inventory_size` (`chaser_check.cpp:534-543`): -- Returns 0 if no connections OR if the *confirmed* chain isn't current - (so no inventory work issues until the node is reasonably caught up). +- Returns 0 if no connections OR if `is_current(false)` — i.e. the + **candidate** (header) chain isn't current. - Otherwise: `ceilinged_divide(unassociated_count_above(fork, step), connections)`. +The candidate-current gate is *not* "wait until caught up" — issuing +zero work until the node is caught up would just stall (you can't get +caught up without downloading). Instead it guards against dividing +work over a *weak* header chain: until headers are current, the +partitioning of unassociated heights into per-peer inventory chunks +could be meaningless or wrong, so block download is paused until +header sync has produced a current (and therefore presumed +near-canonical) candidate. + > **Invariant (Check-Inventory-1).** `inventory_` is computed at most > once (latch on first nonzero result). Stored in > `chaser_check.cpp:496-498`. 
This is intentional: peer inventory size is diff --git a/docs/architecture/04-chaser-validate.md b/docs/architecture/04-chaser-validate.md index 35506b0e..5c5c75cc 100644 --- a/docs/architecture/04-chaser-validate.md +++ b/docs/architecture/04-chaser-validate.md @@ -17,9 +17,28 @@ > - it emits `chase::valid(height)` on success or `chase::unvalid(link)` > on failure > -> This is the chaser most directly relevant to **formal verification**: it -> is the single source of consensus acceptance, and the only place script -> execution and UTXO availability are checked before confirmation. +> This is the chaser most directly relevant to **formal verification**: +> it is where script execution and prevout (UTXO availability) checks +> happen before confirmation. Consensus is **not** confined to this +> chaser however — it is split across several stages: +> +> - **Headers** (`chaser_header::validate`) — context-free header +> consensus (proof-of-work, version, etc.) plus limited-context checks. +> - **Blocks at receive time** (`protocol_block_in_31800::check`, +> `chaser_block::validate` in blocks-first mode) — context-free +> block-level consensus (size, sigops, commitments) and limited +> context checks (no prevouts yet). +> - **This chaser** — full block consensus with prevouts populated: +> `block.accept(ctx, …)` (block-wide rules) and `block.connect(ctx)` +> (script execution per input). +> - **`chaser_confirm`** — block-relative order-based consensus checks +> via `query.block_confirmable(link)`: previous-output is confirmed in +> a "strong" tx, maturity, relative-locktime rules (see +> [`05 §10.4`](05-chaser-confirm.md#104-the-utxo-oracle)). +> - **Transactions** (planned) — same shape, not yet implemented. +> +> So this chaser is the **largest single block of consensus work** and +> the natural focal point of a formal model, but not the sole source. 
| File | Role | | --------------------------------------------------------------------- | ------------------------------------------------------------------------------------- | diff --git a/docs/architecture/05-chaser-confirm.md b/docs/architecture/05-chaser-confirm.md index aa69187d..cde14b3a 100644 --- a/docs/architecture/05-chaser-confirm.md +++ b/docs/architecture/05-chaser-confirm.md @@ -442,8 +442,8 @@ All terminal (call `fault`, suspend network). | `confirm10` | `chaser_confirm.cpp:298` | `roll_back` failed | | `confirm11` | `chaser_confirm.cpp:308` | `set_filter_head` failed before `set_block_confirmable` | | `confirm12` | `chaser_confirm.cpp:314` | `set_block_confirmable` failed | -| `suspended_channel` | `chaser_confirm.cpp:379` (NDEBUG-only)| `confirmed_height != top+1` in `set_organized` — sequencing bug | -| `suspended_service` | `chaser_confirm.cpp:387` (NDEBUG-only)| `to_parent(link) != to_confirmed(previous_height)` — parent mismatch | +| `suspended_channel` | `chaser_confirm.cpp:379` (debug-only check, `!NDEBUG`) | `confirmed_height != top+1` in `set_organized` — sequencing bug. Redundant safety check; no effect in release builds. | +| `suspended_service` | `chaser_confirm.cpp:387` (debug-only check, `!NDEBUG`) | `to_parent(link) != to_confirmed(previous_height)` — parent mismatch. Redundant safety check; no effect in release builds. | > **Spec obligation list.** As with organize/validate, every `confirmN` > is unreachable under store-consistency invariants plus the strand @@ -528,23 +528,55 @@ chaser_confirm : Process no stall when `chase::valid` arrived during the in-progress iteration. -### 10.4 The UTXO oracle +### 10.4 What `query.block_confirmable(link)` actually checks -`query.block_confirmable(link)` is the UTXO double-spend check. Its -correctness is the responsibility of libbitcoin-database. For a formal -model, treat it as: +This is a **significant consensus operation**, not a narrow +double-spend probe. 
It evaluates *all block-relative, order-based +consensus constraints* on a block — everything except header chaining +and the chain-summation rules (cumulative work, MTP). Specifically: + +1. **Strong-tx association for every spent prevout.** For every input, + the previous output must be in a *strong* transaction — i.e. a tx + that is associated to a block which is either (a) confirmed, or (b) + in the confirmable candidate fork at *lesser* height than the + spending block. This is the property that subsumes the "is the + prevout in the UTXO set?" question, expressed in a tx→block + association model rather than a separate UTXO snapshot. + +2. **Coinbase maturity** (the 100-confirmation rule for spending + coinbase outputs). + +3. **Relative locktime rules** (BIP68 sequence locks). + +These checks also exist on the `system::chain` objects (for +completeness, e.g. for stand-alone validation tools), but driving them +there requires populating each input's metadata first. The store +optimizes by performing the queries that *would have populated that +metadata*, directly — much more efficient than populate-then-check. + +So `block_confirmable` is correctly read as: *"under the assumption +that all lower-height blocks in this fork are confirmable, is this +block consensus-valid against all order-sensitive rules?"* Its +correctness is the joint responsibility of the libbitcoin-database +query implementation and the consensus rules it encodes; a formal +model should treat the call as a non-trivial proof obligation, not a +thin UTXO oracle.
+ +For a model: ``` -block_confirmable(link, store_state) → - Right(()) if every input refers to a UTXO present in store_state, - and double-spend checks pass +block_confirmable(link, fork, store_state) → + Right(()) if for every input in block(link): + - prevout tx is strong under (store_state ∪ fork[< height]) + - coinbase maturity satisfied + - relative-locktime constraints satisfied + - no double-spends within fork[≤ height] Left(error_code) otherwise ``` -The chaser sequences calls so that `store_state` at the moment of -`block_confirmable(link)` reflects all blocks confirmed below `link`'s -height in this fork (because `set_block_confirmable` for prior heights -has already run by the loop ordering at `chaser_confirm.cpp:230-275`). +The chaser sequences calls so that prior fork blocks have already had +`set_block_confirmable` written by the time `block_confirmable(link)` +runs for this block (loop ordering at `chaser_confirm.cpp:230-275`). --- diff --git a/docs/architecture/06-sessions-and-protocols.md b/docs/architecture/06-sessions-and-protocols.md index 9893677d..bcdf0705 100644 --- a/docs/architecture/06-sessions-and-protocols.md +++ b/docs/architecture/06-sessions-and-protocols.md @@ -39,6 +39,17 @@ to protocols. ### 1.1 Session hierarchy +`session_peer` is a class template whose +`NetworkSession` parameter is instantiated separately for each +concrete session: `network::session_outbound`, +`network::session_inbound`, and `network::session_manual`. The +template inherits *from its parameter* (the network base) and *also* +from `node::session` (the mixin) — so each instantiation produces a +different concrete network-base parent. The class diagram below shows +the outbound instantiation explicitly; the inbound and manual +instantiations are structurally identical, parameterised on their +respective `network::session_*` base. 
+ ```mermaid classDiagram class network_session["network::session"] { @@ -58,11 +69,13 @@ classDiagram class network_session_outbound["network::session_outbound"] class network_session_inbound["network::session_inbound"] class network_session_manual["network::session_manual"] - class session_peer["session_peer<NetworkSession> (template)"] { + class session_peer_out["session_peer<network::session_outbound>"] { +create_channel (override) +attach_handshake (override) +attach_protocols (override) } + class session_peer_in["session_peer<network::session_inbound>"] + class session_peer_man["session_peer<network::session_manual>"] class session_outbound class session_inbound { +enabled() override @@ -72,14 +85,21 @@ classDiagram network_session <|-- network_session_outbound network_session <|-- network_session_inbound network_session <|-- network_session_manual - network_session_outbound <|-- session_peer - node_session <|-- session_peer - session_peer <|-- session_outbound - session_peer <|-- session_inbound - session_peer <|-- session_manual - - note for session_peer "Multiply derived:\n• node::session for chaser/bus access\n• network::session_* for socket lifecycle" - note for node_session "All methods forward to full_node" + + network_session_outbound <|-- session_peer_out + network_session_inbound <|-- session_peer_in + network_session_manual <|-- session_peer_man + + node_session <|-- session_peer_out + node_session <|-- session_peer_in + node_session <|-- session_peer_man + + session_peer_out <|-- session_outbound + session_peer_in <|-- session_inbound + session_peer_man <|-- session_manual + + note for session_peer_out "Three template instantiations,\nidentical except for which\nnetwork::session_* is the\nnetwork-side base." + note for node_session "Mixin: all methods\nforward to full_node." 
``` The `node::session` mixin (`src/sessions/session.cpp:35-160`) is **pure @@ -115,10 +135,16 @@ bool session_inbound::enabled() const NOEXCEPT > **Invariant (Session-Inbound-1).** Inbound connection attempts are > rejected (the network layer disables the listener) until either > `delay_inbound == false` *or* the confirmed chain is "recent". The -> definition of "recent" is the same as `full_node::is_recent` — top -> equals configured max height *or* top timestamp is within the -> `currency_window` (`src/full_node.cpp:415-425`). This prevents a -> not-yet-caught-up node from serving stale data. +> definition of "recent" is `full_node::is_recent` — top equals +> configured max height *or* top timestamp is within the +> `currency_window` (`src/full_node.cpp:415-425`). Note that **recent +> is weaker than current**: "recent" considers the configured +> `node.maximum_height`, so a node deliberately limited to a fixed +> height (typically for testing) can activate inbound service at that +> ceiling even though it would never satisfy the time-based +> "currentness" test. This prevents a not-yet-caught-up node from +> serving stale data in normal deployments while still allowing +> bounded-height test deployments. This is implemented via `enabled()` rather than the bus `suspend`/`resume` mechanism so that the listener has independent diff --git a/docs/architecture/08-block-out-protocols.md b/docs/architecture/08-block-out-protocols.md index fb69675e..3f312e49 100644 --- a/docs/architecture/08-block-out-protocols.md +++ b/docs/architecture/08-block-out-protocols.md @@ -324,14 +324,15 @@ When `superseded_` flips true: ### 3.3 Why `superseded_` is `std::atomic_bool` -`superseded_` is written in `handle_receive_send_headers` (channel -strand) and read in `handle_event` (bus subscriber's strand, which -posts back to channel strand for actual processing). 
In practice both -are the channel strand — see -[`06 §3.1`](06-sessions-and-protocols.md#31-event-subscription-protocol) -on subscription posting back to channel strand. The atomic is -defensive; a non-atomic `bool` would likely be sound, but the cost is -negligible. +`protocol_block_out_70012::superseded()` is **`protected`**, so the +derived class (`protocol_block_out_70012`) exposes read access to +its own base (`protocol_block_out_106::handle_event`) which uses it +as the supersede gate. The base reads `superseded()` from its bus +handler context; the derived class writes `superseded_` from its +`handle_receive_send_headers` channel handler. Making the flag +atomic allows that read to happen **without** posting through the +channel strand — i.e. the gate is non-stranded by design, and the +atomic is what makes that safe. --- diff --git a/docs/architecture/09-filter-out-70015.md b/docs/architecture/09-filter-out-70015.md index 057a4d27..7eb793f1 100644 --- a/docs/architecture/09-filter-out-70015.md +++ b/docs/architecture/09-filter-out-70015.md @@ -282,14 +282,19 @@ The pattern exists because of two constraints: create unbounded queueing. Unsubscribing for the duration of the stream **serializes** requests -without explicit locks: a peer's second `get_client_filters` arrives -when there is no handler for it, which the libbitcoin-network channel -treats as a *protocol violation* and drops the peer. - -> **Invariant (Filter-Stream-4).** A peer that issues a second -> `get_client_filters` before the first completes will be dropped by -> the channel layer (not by this protocol). This effectively makes -> `getcfilters` request/response exclusive per channel. +without explicit locks. A peer's second `get_client_filters` arriving +while the first is in flight has no handler registered; the +libbitcoin-network channel currently **ignores** unhandled messages +(this may be tightened in future). 
The serializing effect therefore +comes from the protocol simply not seeing the second request until it +re-subscribes — not from peer drops. + +> **Invariant (Filter-Stream-4).** While streaming, any additional +> `get_client_filters` from this peer is dropped on the floor (no +> handler). The first request completes; only after re-subscription +> can another arrive. `getcfilters` is therefore *effectively* +> serialized per channel, even though no explicit lock is held and +> no peer-drop policy enforces it. --- diff --git a/docs/architecture/10-tx-protocols.md b/docs/architecture/10-tx-protocols.md index d55b2f19..d9c49b9f 100644 --- a/docs/architecture/10-tx-protocols.md +++ b/docs/architecture/10-tx-protocols.md @@ -213,7 +213,7 @@ void send_transaction(ec, size_t index, get_data::cptr message) { return; } - SEND(transaction{ ptr }, send_transaction, _1, sub1(index), message); + SEND(transaction{ ptr }, send_transaction, _1, add1(index), message); } ``` @@ -224,21 +224,16 @@ void send_transaction(ec, size_t index, get_data::cptr message) { > **Invariant (TxOut-Stream-2).** Resubscription to `get_data` > happens only after the entire request's tx items have been served > (or skipped). Until then, the channel is unsubscribed from -> `get_data` — a second incoming `get_data` while streaming will -> hit the libbitcoin-network "no handler" path. - -> ⚠ **Suspect: the `sub1(index)` continuation.** At `:187`, -> the next iteration is scheduled with `sub1(index)` (= `index - 1`), -> not `add1(index)` (= `index + 1`). The loop top is -> `for (; index < size; ++index)`, so the next call enters with -> `index - 1`, the `for` test passes, the inner `if` either matches -> at `index - 1` or `++index` runs and matches at the same `index` -> we just sent. Either way the same tx may be sent again. This -> reading suggests a possible off-by-one — likely intended -> `add1(index)`. Worth reviewing against intent before mirroring in -> a port. 
Flagged here rather than asserted as a bug because we have -> not built or run the codebase; there may be a subtlety we are -> missing. +> `get_data`. A second incoming `get_data` arriving in that window +> currently has no handler registered, which the libbitcoin-network +> channel **ignores**. (Behaviour may be tightened in future to drop +> peers in that case; do not rely on either policy.) + +> **Note: previously-flagged off-by-one at `:187`.** Earlier +> revisions of this protocol used `sub1(index)` here, which produced +> an off-by-one (re-sending the same tx). Fixed in libbitcoin-node +> PR #1007 (commit 940ccea) to use `add1(index)`. Keep this in mind +> when reading older checkouts. ### 3.4 Witness gating @@ -343,7 +338,7 @@ stateDiagram-v2 [*] --> SUBSCRIBED: start (subscribe bus + get_data) SUBSCRIBED --> SUBSCRIBED: chase::transaction → do_announce → SEND inv SUBSCRIBED --> STREAMING: get_data → unsubscribe from get_data\nsend_transaction(0, msg) - STREAMING --> STREAMING: SEND tx; send_transaction(sub1(i), msg) + STREAMING --> STREAMING: SEND tx; send_transaction(add1(i), msg) STREAMING --> SUBSCRIBED: index ≥ size → resubscribe to get_data STREAMING --> DROPPED: witness mismatch / not_found / send error SUBSCRIBED --> [*]: stop / chase::stop @@ -404,12 +399,11 @@ protocol_transaction_out_106 : Process Bounded entirely by upstream. Until `chase::transaction` fires frequently, the out protocol is idle. -### 9.4 Open question for the spec +### 9.4 Historical note -The `sub1(index)` continuation (§3.3) needs verification against -intended behaviour. If it's a bug, a fix to `add1(index)` is a -one-character change. If it's intentional, the rationale is non-obvious -and warrants a comment in the source. +The `sub1(index)` continuation at `:187` was an off-by-one bug, +fixed in PR #1007 (commit 940ccea) to `add1(index)`. The spec +should treat the streamed-tx loop as a simple forward iteration. --- @@ -419,7 +413,8 @@ and warrants a comment in the source. 
mempool design is settled. - The out protocol is a near-clone of block-out: one announce path, one streaming serve path. Reuse the same actor template. -- The `sub1` vs `add1` continuation should be tested when porting. +- The continuation passes `(index + 1)` forward through `add1`; the + loop is a straightforward forward iteration over `message->items`. --- diff --git a/docs/architecture/11-protocol-block-in-106.md b/docs/architecture/11-protocol-block-in-106.md index 2b69dd5a..0c3d0183 100644 --- a/docs/architecture/11-protocol-block-in-106.md +++ b/docs/architecture/11-protocol-block-in-106.md @@ -291,9 +291,12 @@ void do_handle_organize(ec, height, block_ptr) { > "Order is enforced by organize" (`:164`). Out-of-order blocks > received from the peer become orphans (parent unknown) and > `chaser_block::do_organize` returns `error_orphan` ⇒ this channel -> is dropped. So this protocol is intolerant of out-of-order -> delivery — different from headers-first which queues headers in a -> tree. +> is dropped. This is the **same policy as headers-first**: headers +> are required by protocol to arrive in order, and blocks are +> requested in order; an out-of-order delivery on either side drops +> the peer. The `tree_` in headers-first does *not* relax this — it +> caches *weak-branch* candidates whose parent is already known, not +> orphans whose parent is unknown. ### 6.1 The strand-hopping for tracker_ access @@ -392,8 +395,8 @@ of the node are: | Unrequested inv while pending | (no action) | logged, ignored | No node-faults. No `protocol_violation` drops either — this protocol -is *forgiving* of peers (matching the operational reality that BIP31 -peers may be quirky). +is *forgiving* of peers (matching the operational reality that +pre-BIP130 peers may be quirky). 
--- diff --git a/docs/architecture/12-periphery-chasers.md b/docs/architecture/12-periphery-chasers.md index b7820ae4..253a57c1 100644 --- a/docs/architecture/12-periphery-chasers.md +++ b/docs/architecture/12-periphery-chasers.md @@ -145,7 +145,11 @@ a full-disk fault**, monitor disk free space and **reload + resume** once space becomes available. This is the only chaser that owns a `network::deadline` timer -directly (separate from any threadpool). +directly. The timer is constructed against the chaser's own strand +(`disk_timer_ = std::make_shared(log, strand(), seconds{1})`, +`chaser_storage.cpp:51`), which runs on the **network threadpool** — +not a separate pool. So the timer adds no new execution context; it +fires callbacks onto the chaser's existing strand. ### 2.2 State diff --git a/docs/architecture/README.md b/docs/architecture/README.md index 087d523c..bd24506d 100644 --- a/docs/architecture/README.md +++ b/docs/architecture/README.md @@ -128,10 +128,19 @@ distilled-for-export view of each subsystem. `protocol_transaction_in_106` are stubs. The docs describe their wiring and intended design but note clearly where current behaviour is "no-op". -6. **Suspect source line**: [`10 §3.3`](10-tx-protocols.md#33-inbound-get_data--streamed-tx-send) - flags `protocol_transaction_out_106.cpp:187` as a possible - off-by-one (`sub1(index)` where `add1(index)` looks intended) — - not asserted as a bug; worth code review. +6. 
**Reviewer-confirmed corrections**: a first pass of reviewer + feedback (evoskuil) corrected several claims; the affected docs + are 02 (milestone semantics, NDEBUG polarity, tree_ DoS), 03 + (`is_current(false)` is candidate-chain), 04 (consensus is split + across multiple stages, not concentrated here), 05 (NDEBUG + polarity, expanded `block_confirmable` description), 06 + (session-template diagram, "recent" vs "current"), 08 + (`superseded_` atomic rationale), 09 (no-handler messages are + ignored, not protocol violations), 10 (the previously-flagged + `sub1`/`add1` at `protocol_transaction_out_106.cpp:187` was an + off-by-one bug, fixed in PR #1007), 11 (order-discipline is the + same as headers-first; BIP130 typo), and 12 (chaser_storage + timer runs on the network threadpool via the chaser's strand). ---