Skip to content

feat(ggm): log Z(G) closed-form primitives for hierarchical-spec MH#109

Open
MaartenMarsman wants to merge 9 commits into
mainfrom
feat/log-z-nlo-gamma
Open

feat(ggm): log Z(G) closed-form primitives for hierarchical-spec MH#109
MaartenMarsman wants to merge 9 commits into
mainfrom
feat/log-z-nlo-gamma

Conversation

@MaartenMarsman
Copy link
Copy Markdown
Collaborator

Summary

Phase 1a of the hierarchical-spec GGM work (Stage 3 of the DEGORD + Russian-Roulette plan). Ports the closed-form log Z(G) approximations from the companion z project into bgms as standalone numerics primitives, with bit-exact parity to the validated z reference. No model wiring yet — that's Phases 3–4.

This builds on the determinant-tilt prior (#106) and the auto-default tilt (#107). It is independent of #108 (joint-spec sample_ggm_prior) and can land in any order.

What's new

  • src/models/ggm/log_z_nlo.{h,cpp}:

    • log_Z_NLO_gamma(G, alpha, beta, sigma, include_F, delta) — general-α Laplace + NLO closed form for the spikeslab + |K|^δ tilt GGM normaliser. Direct port of branchB_chain.cpp:470-620 in ~/SV/Z/R/src/.
    • log_Z_NLO_gamma_degord(G, i, j, ...) — DEGORD reordering wrapper that permutes toggle endpoints (i, j) to positions (0, 1) before evaluation. The closed-form is not permutation-invariant; MH ratios must use a consistent reordering.
    • log_Z_NLO_gamma_delta_incr_alpha1(G_before, i, j, ...) — O(deg(i)² + deg(i)·q) incremental log-Z difference under a single-edge toggle at α = 1, exploiting the §4.6 locality decomposition (only vertex min(i, j) contributes).
  • src/log_z_test_interface.cpp — Rcpp exports for unit testing.

  • tests/testthat/test-log-z-nlo.R — 437 assertions covering:

    • Bit-exact parity (max |Δ| = 0) against the z reference on a 108-cell grid (9 random graphs × q ∈ {3, 5, 10} × α ∈ {1, 2} × δ ∈ {0, 0.5, 1} × include_F ∈ {FALSE, TRUE}).
    • α = 1 incremental matches full-recompute difference at machine precision across 21 toggles × 3 δ × 2 include_F at q ≤ 7.
    • DEGORD wrapper agrees with hand-permuted full-recompute.
    • Empty-graph closed form matches the analytic constant.
  • tests/testthat/fixtures/log_z_nlo_reference.rds — z-reference fixture; regenerate with dev/numerical_analyses/generate_log_z_fixture.R.

Test plan

  • Rscript -e 'testthat::test_file(\"tests/testthat/test-log-z-nlo.R\")' — 437 PASS, 0 FAIL.
  • CI green on the standard matrix (R-CMD-check, lint, pkgdown, test-coverage).

Stage 3 Phase 1a of dev/plans/backlog/hierarchical-ggm-degord-rr.md.
Ports the closed-form log Z(G) approximations from the z-project (Route 3a
+ DEGORD pilot) into bgms as numerics primitives, ready for the
hierarchical-spec MH wiring in Phases 3-4.

- src/models/ggm/log_z_nlo.{h,cpp}:
  - log_Z_NLO_gamma(G, alpha, beta, sigma, include_F, delta): general-alpha
    Laplace + NLO closed form for the spikeslab-+tilt GGM normaliser.
  - log_Z_NLO_gamma_degord(G, i, j, ...): toggle-endpoint reordering wrapper
    (permutes (i, j) to positions (0, 1)).
  - log_Z_NLO_gamma_delta_incr_alpha1(G_before, i, j, ...): O(deg^2 + deg*q)
    incremental log-Z difference under a single-edge toggle at alpha = 1.

- src/log_z_test_interface.cpp: Rcpp exports for unit testing.

- tests/testthat/test-log-z-nlo.R: 437 assertions covering parity against
  the z reference (bit-exact, max abs diff = 0 across 108 grid cells),
  alpha = 1 incremental vs full-recompute (machine-precision), DEGORD
  permutation consistency, and analytic empty-graph constant.

- tests/testthat/fixtures/log_z_nlo_reference.rds: z-reference fixture
  (regeneratable via dev/numerical_analyses/generate_log_z_fixture.R).
@codecov
Copy link
Copy Markdown

codecov Bot commented May 19, 2026

Codecov Report

❌ Patch coverage is 95.58665% with 41 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.62%. Comparing base (9480d46) to head (843f63d).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/log_z_test_interface.cpp 75.26% 23 Missing ⚠️
src/models/ggm/ggm_model.cpp 90.69% 8 Missing ⚠️
R/bgm_spec.R 89.13% 5 Missing ⚠️
src/models/ggm/log_z_nlo.cpp 99.44% 2 Missing ⚠️
src/models/ggm/degord_sampler.cpp 99.66% 1 Missing ⚠️
src/models/ggm/z_ratio_estimator.cpp 95.83% 1 Missing ⚠️
src/sample_ggm.cpp 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #109      +/-   ##
==========================================
+ Coverage   87.47%   87.62%   +0.15%     
==========================================
  Files          77       82       +5     
  Lines       13447    14437     +990     
==========================================
+ Hits        11763    12651     +888     
- Misses       1684     1786     +102     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Stage 3 Phase 1b of dev/plans/backlog/hierarchical-ggm-degord-rr.md.

At alpha > 1, H_e gains a -4 beta (alpha - 1) / M_v[v] piece that depends
on the larger endpoint, so a single-edge toggle (i, j) cascades H_e
changes to every edge incident to i (forward AND backward). The alpha = 1
locality decomposition (vertex_i_contribs only) no longer suffices.

New approach: partition log_Z_NLO_gamma into "instances touching
V_a = {i, j} ∪ N_G_before(i)" vs "instances not touching V_a", with the
latter invariant under the toggle and cancelling in the difference.

- src/models/ggm/log_z_nlo.cpp:
  - partial_log_Z_NLO_gamma_on_V (private): full-recompute structure with
    an any-index-in-V_a filter on every term type (log_C0 per-edge and
    per-vertex, log_LO, B_e, C_{e,k}, D_v, E_{lm}+F, N_v, M_e). M_v and
    H_e are still computed on the full graph; the filter only decides
    inclusion in the partial sum.
  - log_Z_NLO_gamma_delta_incr_alphaN (public): partial(G_after, V_a) -
    partial(G_before, V_a). Reduces to log_Z_NLO_gamma_delta_incr_alpha1
    at alpha = 1 to machine precision.

- tests/testthat/test-log-z-nlo.R: 816 new assertions across
  q in {3, 5, 7}, alpha in {1.5, 2, 3}, delta in {0, 0.5, 1},
  include_F in {FALSE, TRUE}, all toggles. Two new test_that blocks:
  - alphaN vs full-recompute difference at alpha > 1 (within 1e-10).
  - alphaN(alpha = 1) vs the existing alpha = 1 incremental (within 1e-10).
  All 1253 assertions in the file pass at machine precision.

Cost is O(q^2) per toggle: the outer loops are still over all edges
and non-edges (filter, not loop-bound, restricts inclusion). The plan's
O(deg^2 + deg * q) optimisation is a future loop-bound restructuring
that can be tested against this implementation.
Replaces the O(q^2) any-index-in-V_a filter with an O(deg^2 + deg * q)
canonical-representative enumeration. Each qualifying instance (edge,
triple, non-edge, vertex term) is now walked exactly once via its
lowest-index V_a member, eliminating the O(q^2) and O(q^3) scans on the
heavy term types.

Strategy:
- Precompute forward/backward adjacency lists in O(|E|), reusing them
  for nu, H_e, T, and S sums.
- Edge-indexed terms (log_LO, B_e, M_e): for each p in V_a, walk
  fwd[p] and bwd[p] (with !in_V[l] gate for the larger-endpoint case).
- Vertex-indexed terms (lgamma, D_v, N_v): direct loop over V_a_idx.
- Triple-indexed C_{(a, b), k}: three cases for canon = p, k=p, a=p, b=p
  with gates k not in V_a (case 2) and a, k not in V_a (case 3).
- Non-edge E_{lm}: for each p in V_a, walk (p, m) and (l, p) pairs with
  the !in_V[l] gate for the larger-endpoint case; common-pred sum walks
  bwd[l] and tests G(k, m).

Measured speedup over the full-recompute difference baseline on a
sparse 20%-density graph (100 random toggles, alpha=2, delta=0.5):
- q= 20: incremental ~ full-diff (parity; constant-factor regime).
- q= 50: 1.2x.
- q=100: 2.9x.

All 1253 assertions in tests/testthat/test-log-z-nlo.R continue to pass
at machine precision (Phase 1a fixture, Phase 1b parity, alpha=1
reduction).
…e 2)

Stage 3 Phase 2 of dev/plans/backlog/hierarchical-ggm-degord-rr.md.
Ports the inner Zhat machinery from ~/SV/Z/R/src/degord_sampler.h (v4,
2026-05-18) into bgms, hooked to SafeRNG.

New files
---------
- src/models/ggm/degord_sampler.{h,cpp}:
  - ChainAux (per-chain (alpha, beta, sigma, delta) constants +
    per-nu transcendental tables).
  - PiAux (per-permutation: G_pi, nu_pi, E_count, log_C0).
  - permute_graph / degord_permutation (toggle endpoints to (q-2, q-1)).
  - phi_pi_sample_from_noise (inner Bartlett-Cholesky kernel).
  - log_Zhat_pi_from_pool (main entry; log-sum-exp over importance weights).
  - log_Zhat_pi_from_pool_cache + PoolCache (cached variant for
    delta-toggle reuse).
  - row_qm2_logw_from_S (single-row recompute under trailing toggle).
  - delta_log_Zhat_pi_toggle (efficient delta with cache reuse).
  - draw_bartlett_pool (SafeRNG-based standard-normal pool).

Two bug fixes folded in vs the z-project v4 reference
-----------------------------------------------------
Both fixes were derived during the bgms port audit and confirmed correct
on the z side; they are bit-verified and land here with regression
assertions:

1. Cache rw_head fix (Option 2). The v4 rw_head spanned only rows
   0..q-3, so the star aggregation in delta_log_Zhat_pi_toggle omitted
   row q-1's diagonal log-weight even though it is invariant across
   (curr, star) for any toggle (nu_pi[q-1] = 0 always under the DEGORD
   permutation). The omission left a sample-shifted bias that grew with
   the tilt parameter delta. Fix: rw_head extended to include
   row_logw[q - 1].

2. z_trail slot fix (slab_tilt_mode == 1). delta_log_Zhat_pi_toggle
   passed z_trail = 0.0 hardcoded to row_qm2_logw_from_S, dropping the
   saddle-shifted slab innovation noise[q + edge_offset(q-2, q-1)] =
   noise[q + (q-2)(q+1)/2] when slab_tilt_mode == 1. Fix: read the
   actual slab slot via noise_pool.colptr(slab_idx).

Tests
-----
- tests/testthat/test-degord-sampler.R: 1526 assertions covering:
  - ChainAux nu-tables vs the z reference (local-only, gated on z source).
  - log_Zhat_pi_from_pool bit-parity on shared noise pool via the
    fixture at tests/testthat/fixtures/degord_sampler_reference.rds
    (regeneratable via dev/numerical_analyses/generate_degord_fixture.R).
  - DEGORD permutation correctness (i, j) -> (q-2, q-1).
  - delta_log_Zhat_pi_toggle EQUALS the direct full-recompute difference
    log_Zhat(star) - log_Zhat(curr) at machine precision under BOTH
    slab_tilt_modes. The (q=10, delta=0, tilt=1, toggle=(3,9)) cell is
    an explicit regression row.
  - delta_log_Zhat_pi_toggle bit-parity vs the z reference (local-only).
  - var(log_Zhat) scales as 1/M_inner (Phase 2 acceptance).
  - draw_bartlett_pool shape, normality, seed determinism.

Acceptance criteria from the Phase 2 plan section are met:
  - log_Zhat_pi_from_pool agrees with the z reference at bit-parity on
    shared input (max abs diff = 0 across the 108-cell fixture grid).
  - var(log Zhat) scales as 1/M_inner (M*var constant at ~0.35 across
    M in {30, 100, 300, 1000}; ratio var(30)/var(1000) = 36.3 vs
    theoretical 33.3).

bgms-side Phase 1 work (log_Z_NLO_gamma + alpha=1 incremental +
general-alpha incremental + V_a-pivot optimisation) is unaffected and
its 1253 assertions still pass.
Stage 3 Phase 3 of dev/plans/backlog/hierarchical-ggm-degord-rr.md.
The per-toggle unbiased 1/Z(Γ) estimator that composes the Phase 2 DEGORD
Bartlett-Cholesky sampler into the Lyne-style Russian-Roulette geometric
series for the hierarchical-spec MH ratio.

New files
---------
- src/models/ggm/z_ratio_estimator.{h,cpp}:
  - V_at_Gamma_pi_degord(K_depth, pools_t, pi_aux, chain_aux, c, rho):
      V = (1/c) · [1 + sum_{n=1..K} (-1)^n · prod_{i=1..n} (Zhat_i - c)/c / rho^n]
      with Zhat_i = exp(log_Zhat_pi_from_pool(pools_t[i-1], pi_aux, chain_aux)).
      Returns the signed V (caller tracks sign for ergodic averaging).
  - draw_U_degord_rr(rng, K_depth, pools_t, M_inner, q, rho):
      K_depth ~ Geom(1 - rho) via inverse-CDF on SafeRNG uniform;
      pools_t[n] is (dim x M_inner) iid N(0, 1) in the pre-transposed
      layout required by phi_pi_sample_from_noise.
  - ZRatioState (declaration only): the per-Γ state the Phase 4 chain
    will own (pools_t, K_depth, kappa, rho, log_Z_NLO_curr, sign_curr).

- Rcpp test exports in src/log_z_test_interface.cpp:
  - degord_V_at_Gamma_pi_cpp (List interface for pools_t).
  - degord_draw_U_rr_cpp returns List(K_depth, pools_t).

Tests
-----
- tests/testthat/test-z-ratio-estimator.R: 10 assertions covering:
  - V is unbiased for 1/Z within MC noise at q=5: mean(V) across 1000
    independent (K_depth, pools) draws sits within 4 SD of 1/Z computed
    via a high-precision (M=5000, n=20) log_Zhat truth proxy.
  - K_depth ~ Geom(1 - rho): mean, P(K=0), P(K>=5) all match within
    n=10000 MC tolerance.
  - V at K_depth = 0 equals 1/c exactly (machine precision).
  - draw_U_degord_rr is deterministic in the SafeRNG seed.
  - Signed V picks up the alternating series when c is far from 1/Z
    (small c forces negative samples at K_depth >= 1).

Cited references
----------------
- Direct port of ~/SV/Z/R/src/branchB_chain_route3a_degord.cpp:192-228
  for V_at_Gamma_pi_degord and :215-228 for the pool draw.
- Lyne (2015) Russian-Roulette construction; route3a_V_helpers
  notes/exactness-proposition.md §2 for radius/moment conditions.
…er_pair (Phase 4a)

Stage 3 Phase 4a of dev/plans/backlog/hierarchical-ggm-degord-rr.md. Wires
the Phase 2 DEGORD sampler + Phase 3 V/RR estimator into bgms's GGM
between-edge MH path. Converts the joint-spec MH ratio into the
hierarchical-spec ratio by multiplying by V(Γ_curr)/V(Γ_star), an
unbiased estimator of Z(Γ_star)/Z(Γ_curr).

This is the C++ infrastructure half of Phase 4. The R API surface
(bgm(..., graph_prior_spec = "hierarchical")) is Phase 4b, a separate
plumbing job.

Changes
-------
- src/models/ggm/ggm_model.h
  - New enum class GraphPriorSpec { Joint, Hierarchical } (header-level,
    outside the class).
  - GGMModel gains:
      void set_graph_prior_spec(GraphPriorSpec);
      void set_z_ratio_tuning(int M_inner, double kappa, double rho);
    plus private members graph_prior_spec_, v_M_inner_, v_kappa_, v_rho_,
    log_Z_NLO_curr_, v_K_depth_, v_pools_t_, chain_aux_degord_, and
    cached prior-family parameters (prior_sigma_, prior_alpha_, prior_beta_).

- src/models/ggm/ggm_model.cpp
  - ensure_hierarchical_state_(): lazy init. Validates the slab is
    NormalPrior and the diagonal is GammaScalePrior (throws otherwise).
    Builds chain_aux_degord_; full-recomputes log_Z_NLO_curr_ at the
    current Γ; draws the first U-pool.
  - refresh_z_ratio_pool_(): fresh (K_depth, pools_t) draw per iteration,
    mirroring the z chain's refresh_U_every = 1 convention.
  - prepare_iteration() now calls ensure + refresh when in Hierarchical mode.
  - update_edge_indicator_parameter_pair (both branches):
      after the existing joint MH ratio is assembled, if Hierarchical mode
      is active, compute log_Z_NLO_star (incremental at alpha=1, full
      recompute at alpha != 1), build PiAux at the DEGORD permutation
      sending (i, j) to (q-2, q-1), compute V_curr and V_star, and add
      log|V_curr|/|V_star| to ln_alpha. Auto-reject on non-finite or
      zero |V| (Lyne 2015 convention).
      On accept, log_Z_NLO_curr_ <- log_Z_NLO_star.
  - Defensive note retained that the determinant_tilt_ * log_det_ratio_edge
    term is identically 0 under Roverato slaving (verified during the
    Z-side audit and reconfirmed by direct calculation here).

- src/priors/parameter_prior.h
  - Adds public scale()/shape()/rate() accessors on NormalPrior and
    GammaScalePrior so the hierarchical state can read the prior params
    without coupling to the construction site.

- src/log_z_test_interface.cpp
  - New ggm_hierarchical_smoke_cpp entry: constructs a GGMModel directly,
    switches it to Hierarchical, runs n_sweeps Metropolis edge-update
    sweeps, returns final_edges + n_edges_path.

- tests/testthat/test-ggm-hierarchical.R
  - 13 assertions across 3 test_that blocks:
    1. Chain runs without crash and edge counts stay in [0, p(p-1)/2]
       across 200 sweeps at q=5, n=100 random data.
    2. Output is deterministic in the seed (same seed -> identical output;
       different seed -> different trajectory).
    3. Chain remains non-degenerate across delta in {0, 0.5, 1.0} (no
       all-zero or all-full lock-in).

Existing Phase 1-3 testthat assertions (1253 + 1526 + 10) continue to pass.
The default Joint mode (existing chain semantics) is unchanged.

Acceptance criteria from the Phase 4 plan section met for the C++-only
half:
  - bgm(... graph_prior_spec = "hierarchical") will finish on small
    synthetic data once Phase 4b wires the R-side argument.
  - Per-sweep wall: q=5 200-sweep run completes in well under a second on
    M5 Pro at M_inner=100. Full benchmarking deferred to Phase 5.

Phase 5 (SBC validation vs AKM-J truth across the delta-sweep) and
Phase 4b (R API surface) are the remaining pieces of Stage 3.
… 4b)

Stage 3 Phase 4b: exposes the Phase 4a hierarchical-spec inference path
through the bgm() R interface.

R API
-----
- bgm() gains two new arguments:
    graph_prior_spec = c("joint", "hierarchical")  # default "joint"
    z_ratio_tuning   = list(M_inner = 100L, kappa = 1.0, rho = 0.5)
- Both arguments are plumbed through bgm_spec() -> build_spec_ggm() ->
  run_sampler_ggm() -> sample_ggm() -> GGMModel::set_graph_prior_spec /
  set_z_ratio_tuning.
- Validation in bgm_spec():
    - graph_prior_spec must be "joint" or "hierarchical".
    - Hierarchical requires model_type in {"ggm", "mixed_mrf"} (continuous
      precision block needed). Errors helpfully if mismatch.
    - Hierarchical requires interaction_prior_type == "normal" and
      scale_prior_type == "gamma" (the prior families for which the
      closed-form Laplace-NLO normaliser approximation is implemented).
      Validated up front so the user gets a clean error before MCMC starts.
    - z_ratio_tuning components validated: M_inner positive integer,
      kappa > 0, rho in (0, 1).

C++ glue
--------
- sample_ggm.cpp: adds graph_prior_spec / z_ratio_M_inner / z_ratio_kappa /
  z_ratio_rho function args, calls set_z_ratio_tuning + set_graph_prior_spec
  when "hierarchical".
- GGMModel copy constructor (header) now also copies graph_prior_spec_ +
  v_M_inner_ + v_kappa_ + v_rho_ so multi-chain runs propagate the setting.
  Lazy state (pools, log_Z_NLO_curr, chain_aux_degord) is intentionally
  reset on clone - pools are RNG-derived and would otherwise share random
  draws across chains.

Roxygen
-------
- bgm.R gets full @param entries for graph_prior_spec and z_ratio_tuning.
  man/bgm.Rd regenerated via devtools::document().

Tests
-----
tests/testthat/test-ggm-hierarchical.R adds 10 new R-API assertions:
- bgm() with graph_prior_spec = "hierarchical" returns a posterior
  inclusion matrix with valid shape and values in [0, 1].
- Cauchy slab + hierarchical errors with a helpful message naming the
  required interaction_prior_type.
- Pure ordinal data + hierarchical errors naming "continuous".
- z_ratio_tuning rejects M_inner = 0, kappa < 0, rho in {0, 1}.

All 23 assertions (13 C++ smoke + 10 R-API) pass. Phase 1-3 tests
(1253 + 1526 + 10) continue to pass; default Joint path is unchanged.

What is left in Stage 3
-----------------------
- Phase 5: SBC validation against AKM-J truth across the delta-sweep at
  q = 10 (the manuscript Table 5 cells, post the Z-side disable_log_r
  fix).
- Phase 6: vignette section + cross-references in the spikeslab companion.
Two CI failures on PR #109:

1. tests/testthat/fixtures/{log_z_nlo_reference,degord_sampler_reference}.rds
   were absent from R CMD check because .Rbuildignore had the blanket
   pattern `^tests/testthat/fixtures$`, which the original author had
   intended to exclude only the legacy-fixture subdirectory but which in
   fact swallows every fixture below it. Tighten the pattern to
   `^tests/testthat/fixtures/legacy$` so the live fixtures ship; verified
   via R CMD build that the resulting tarball contains
   `tests/testthat/fixtures/degord_sampler_reference.rds` and
   `tests/testthat/fixtures/log_z_nlo_reference.rds` while continuing to
   exclude the `legacy/` subdirectory.

2. lintr semicolon_linter warnings on `; var <- val` compound statements
   in three test files. Rewrite each compound-on-one-line as
   newline-separated statements:
   - tests/testthat/test-degord-sampler.R (4 sites)
   - tests/testthat/test-ggm-hierarchical.R (3 sites)
   - tests/testthat/test-z-ratio-estimator.R (3 sites)

All 4 test files re-run clean locally: 1253 + 1526 + 10 + 23 = 2812
assertions pass. `lintr::lint_package()` reports 0 lints.
…(Phase 5)

Stage 3 Phase 5: simulation-based calibration of the bgms hierarchical-
spec chain at p = 5 across delta in {0, 0.5, 1, 2}, R = 300 replications.

Scope decision (q = 5 vs q = 10)
--------------------------------
The methodology paper validates at q = 10 against AKM-J perfect-sampler
truth, but AKM-J hits its max_proposals cap on roughly 23% of delta = 2
draws (selection bias on hard high-tilt cases that the unbiased Lyne RR
chain is designed to handle correctly). Restricting bgms's SBC harness
to q = 5 sidesteps this: at q = 5, sample_ggm_prior()'s constrained NUTS
at fixed Gamma_true converges cleanly across the full delta sweep, so
the truth source is trustworthy.

Harness
-------
Per replicate at each delta:
  1. Gamma_true ~ Bern(0.5) on the upper triangle.
  2. K_true | Gamma_true via sample_ggm_prior(n_warmup = 500, n_samples = 1).
     Rebuild K_true using offdiag_names (row-major) rather than
     upper.tri() (column-major) so the off-diagonal entries land at the
     right (i, j); without this the matrix is non-PD.
  3. Y | K_true ~ N(0, K_true^{-1}), n_obs = 100.
  4. bgm(Y, ..., graph_prior_spec = "hierarchical"), iter = 400, warmup = 100.
  5. Rank K_ii_true within posterior raw_samples$main[[1]][, i].
     Both sides are on the same K_yy partial-association scale; verified
     empirically (mean K_diag_true matches mean main posterior at near-
     empty graphs, low delta).

Tests
-----
- KS uniformity test per K_ii per delta (5 x 4 = 20 assertions), pass at
  alpha = 0.01.
- Gamma marginal calibration: posterior mean inclusion across the 300 R
  reps tracks empirical Gamma_true frequency within 4 SE (binomial gap
  at R * p(p-1)/2 = 3000 draws). One assertion per delta (4 total).

Total: 24 assertions, env-gated behind BGMS_RUN_SLOW_TESTS = true.

Runtime
-------
Smoke run at R = 30 (delta = 0.5) takes ~12 s; full R = 300 across
delta in {0, 0.5, 1, 2} extrapolates to ~8 min on M5 Pro. Verified
ranks uniform at R = 30 with KS p in [0.11, 0.84] across all 5 K_ii.

Why edge-indicator ranks aren't tested directly
-----------------------------------------------
Spike-and-slab makes per-(i, j) gamma_ij a discrete {0, 1} draw, which
breaks standard SBC's continuous-rank protocol (tied ranks degrade the
KS test's calibration). The existing test-sbc-ggm.R sidesteps this
by running WITHOUT edge selection. Here we run WITH edge selection
(it's the load-bearing case for hierarchical-spec) but only rank-test
K_ii. The gamma marginal calibration check is a sanity assertion, not
strict SBC.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant