Serve singular scalar string eq filters with a GIN @> containment predicate #5138

Open
habdelra wants to merge 2 commits into main from cs-11414-speed-up-indexer-search-queries-gin-served-containment-for

@habdelra habdelra commented Jun 5, 2026

What

On the realm-server search path, a singular-path scalar eq filter — e.g. eq: { 'customer.id': X } from a query-backed linksToMany getter — compiled to:

search_doc -> 'customer' ->> 'id' = $1

That ->> path-extraction cannot use the existing GIN search_doc index, so Postgres narrows to the card type and then heap-scans every row of that type, discarding the non-matches. This PR rewrites such predicates as a containment:

search_doc @> '{"customer":{"id":X}}'::jsonb

which the existing gin (search_doc) index serves directly (bitmap index scan instead of a heap scan). On staging this was a ~33× win on the measured filter (200 ms → 6 ms, cold) for the same rows.
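For illustration, the nested document bound to the `@>` parameter can be built by folding the path segments from the leaf outward. This is a minimal sketch, not the code in this PR: `containmentDoc` is a hypothetical helper name, and it assumes the dotted path has already been split into segments.

```typescript
// Hypothetical helper (not from the PR): builds the JSON document that is
// bound as the `@>` parameter from path segments and a string leaf value.
type JsonObject = { [key: string]: string | JsonObject };

function containmentDoc(segments: string[], value: string): JsonObject {
  if (segments.length === 0) {
    throw new Error("path must have at least one segment");
  }
  // Fold from the leaf outward: ["customer", "id"] -> { customer: { id: value } }
  return segments.reduceRight<string | JsonObject>(
    (inner, key) => ({ [key]: inner }),
    value,
  ) as JsonObject;
}

// The serialized document would be bound as $1 in: search_doc @> $1::jsonb
const param = JSON.stringify(containmentDoc(["customer", "id"], "X"));
```

Binding the whole document as one parameter keeps the SQL text stable across values, which is the shape the Postgres form above relies on.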

How

The rewrite is a two-pass concern, modeled parallel to the existing table-valued (json_tree) machinery:

  • Pass 1 (schema-aware). A new JsonContainsQuery node, emitted by fieldEqFilter. fieldArity still routes on cardinality only (singular → this node, plural → the existing json_tree + fullkey LIKE path). The resolver emits a JsonContains node only when the leaf is a string-valued, non-numeric field; otherwise it falls back to today's ->> extraction.
  • Pass 2 (dialect). expressionToSql renders JsonContains per adapter — Postgres search_doc @> $n::jsonb (nested object from the path segments), SQLite search_doc -> 'a' ->> 'b' = $n. SQLite has no @>, so it keeps the extraction form.
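The dialect pass can be pictured roughly as follows. This is a sketch under assumptions: the node shape, the `Dialect` type, and `renderJsonContains` are illustrative names rather than the actual types in expression.ts, and path keys are assumed to be safe identifiers (real code would need to escape them).

```typescript
// Sketch only: node shape and names are assumptions, not the PR's types.
type Dialect = "pg" | "sqlite";

// Pass 1 output: a resolved node carrying a singular path and a string value.
interface JsonContains {
  kind: "json-contains";
  column: string; // e.g. "search_doc"
  path: string[]; // e.g. ["customer", "id"]
  value: string;
}

interface Rendered {
  text: string;
  params: string[];
}

// Pass 2: render the same node per dialect. `n` is the placeholder index.
function renderJsonContains(node: JsonContains, dialect: Dialect, n = 1): Rendered {
  if (dialect === "pg") {
    // Nest the path segments into one object and bind it as a jsonb parameter.
    const doc = node.path.reduceRight<unknown>(
      (inner, key) => ({ [key]: inner }),
      node.value,
    );
    return { text: `${node.column} @> $${n}::jsonb`, params: [JSON.stringify(doc)] };
  }
  // SQLite has no @>, so keep extraction: -> for interior segments, ->> for the leaf.
  const interior = node.path.slice(0, -1).map((k) => ` -> '${k}'`).join("");
  const leaf = node.path[node.path.length - 1];
  return { text: `${node.column}${interior} ->> '${leaf}' = $${n}`, params: [node.value] };
}
```

Keeping the node dialect-neutral until this point is what lets the SQLite output stay in extraction form while only Postgres gets the containment form.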

Why the scope is exactly these three gates

The rewrite is applied only where @> containment is provably equivalent to the ->> extraction equality it replaces:

  1. Singular paths only. A plural segment anywhere in the path keeps the json_tree machinery: @> array containment is order-insensitive and ignores non-matching elements, so it loses the per-element positional binding that fullkey LIKE '$.…[%]…' enforces.
  2. String, non-numeric leaves only. Numeric leaves (number/big-integer) keep their ::numeric cast — JSON numeric normalization (5 vs 5.0) differs from text equality. Booleans never reach the branch (a boolean value isn't a string).
  3. Positive polarity only. On an absent path ->> yields SQL NULL while @> yields FALSE; identical in a positive filter, but under not they diverge (NOT NULL drops the row, NOT FALSE keeps it). A polarity flag is threaded through filterCondition (flipped at each not), and negated eq keeps extraction.
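The polarity gate can be sketched as below. The `Filter` shape, `planForms`, and the three-valued helpers are illustrative assumptions, not the code in filterCondition; in particular, `typeof value === "string"` stands in for the schema-aware string/non-numeric check, and the plural-path gate is omitted.

```typescript
// Sketch only: the filter shape and function names are assumptions.
type Filter =
  | { eq: Record<string, unknown> }
  | { not: Filter }
  | { every: Filter[] };

type Form = "containment" | "extraction";

// Collects which SQL form each eq leaf would take, flipping polarity at each `not`.
function planForms(filter: Filter, positive = true, out: Form[] = []): Form[] {
  if ("not" in filter) {
    return planForms(filter.not, !positive, out);
  }
  if ("every" in filter) {
    for (const f of filter.every) planForms(f, positive, out);
    return out;
  }
  for (const value of Object.values(filter.eq)) {
    out.push(positive && typeof value === "string" ? "containment" : "extraction");
  }
  return out;
}

// Why negation matters: model SQL three-valued logic, with null as SQL NULL.
type Sql3 = true | false | null;
const not3 = (v: Sql3): Sql3 => (v === null ? null : !v);
const keepsRow = (v: Sql3): boolean => v === true; // WHERE keeps only TRUE

// On an absent path, extraction equality yields NULL and containment yields FALSE.
const negatedExtractionKeeps = keepsRow(not3(null)); // NOT NULL  -> row dropped
const negatedContainmentKeeps = keepsRow(not3(false)); // NOT FALSE -> row kept
```

Because the flag flips at every `not`, a doubly negated `eq` returns to positive polarity and is eligible for containment again.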

@> is also anchored from the document root (object containment is key-matched recursively from the top), so {"customer":{"id":X}} matches exactly search_doc.customer.id == X and never a stray id elsewhere — equivalent to the -> … ->> navigation, at any nesting depth.
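Root anchoring can be checked against a simplified model of object containment. The sketch below covers objects and scalars only and deliberately leaves arrays out, since the rewrite excludes plural paths; `containsObj` is a hypothetical name and this is not Postgres's implementation.

```typescript
// Simplified model of jsonb containment for objects and scalars only.
type Json = string | number | boolean | null | { [k: string]: Json };

function containsObj(doc: Json, pattern: Json): boolean {
  const isObj = (v: Json): v is { [k: string]: Json } =>
    typeof v === "object" && v !== null;
  if (isObj(pattern)) {
    if (!isObj(doc)) return false;
    // Every key in the pattern must exist at the same level of the document.
    return Object.entries(pattern).every(([k, v]) => {
      const child = Object.prototype.hasOwnProperty.call(doc, k) ? doc[k] : undefined;
      return child !== undefined && containsObj(child, v);
    });
  }
  // Scalars must match exactly.
  return doc === pattern;
}

const doc: Json = { customer: { id: "X" }, vendor: { id: "Y" } };
const matchesCustomer = containsObj(doc, { customer: { id: "X" } }); // true
const matchesStrayId = containsObj(doc, { customer: { id: "Y" } }); // false
const absentPath = containsObj({ id: "X" }, { customer: { id: "X" } }); // false
```

The last case also shows the absent-path behavior: the model returns false rather than an unknown value, which mirrors why the rewrite is limited to positive polarity.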

Behavior / compatibility

  • SQLite output is byte-identical to before, so the host test suite (sqlite-backed) is unaffected.
  • Plural-field, numeric, boolean, range, in, contains, matches, and negated filters are unchanged.

Tests

Adds eq-containment-integration-test.ts, a Postgres integration test that asserts both the result set and which SQL form was emitted (by capturing executed SQL), covering: top-level string eq, linksTo .id, nested contains-composite, numeric, boolean, plural-leaf, interior-plural (contacts.email → json_tree), negated (preserving NULL-on-absent-path row exclusion), and double-negation (back to @>).
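One way to capture executed SQL for this kind of assertion is to wrap the query executor. The sketch below is an assumption about the general approach, not the test file's actual helpers: `Executor` and `recordSql` are hypothetical, and this `lastFilterSql` is only a guess at how seeding INSERTs might be skipped.

```typescript
// Sketch only: illustrative stand-ins, not the test file's actual helpers.
type Executor = (sql: string, params: unknown[]) => Promise<unknown[]>;

function recordSql(inner: Executor) {
  const log: string[] = [];
  // Wrap the executor so each statement is logged before it is run.
  const execute: Executor = async (sql, params) => {
    log.push(sql);
    return inner(sql, params);
  };
  // Most recent statement that is not a seeding INSERT.
  const lastFilterSql = (): string | undefined =>
    [...log].reverse().find((s) => !/^\s*insert\b/i.test(s));
  return { execute, lastFilterSql };
}
```

A test could then assert that the captured statement includes `@>` for the containment cases and does not include it for the numeric, boolean, plural, and negated cases.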

…redicate

A singular-path scalar `eq` (e.g. `customer.id`) compiled to
`search_doc -> 'customer' ->> 'id' = $v` — a JSON path-extraction the GIN
`search_doc` index cannot serve, so Postgres heap-scans every row of the
filtered card type and discards the non-matches. Rewrite it as a containment
predicate `search_doc @> {"customer":{"id":$v}}`, which the existing
`gin (search_doc)` index serves directly.

The rewrite is modeled as a two-pass concern, parallel to the existing
table-valued (`json_tree`) machinery:
- Pass 1 (schema-aware): a `JsonContainsQuery` node, routed by `fieldArity` on
  cardinality only. The singular branch resolves it; the resolver emits a
  `JsonContains` node only when the leaf is a string-valued, non-numeric field,
  otherwise it falls back to today's `->>` extraction.
- Pass 2 (dialect): `expressionToSql` renders `JsonContains` per adapter —
  Postgres `search_doc @> $n::jsonb`, SQLite `search_doc -> 'a' ->> 'b' = $n`.

Scoped so the rewrite is exactly equivalent to extraction-equality:
- singular paths only — a plural segment anywhere keeps the `json_tree` +
  `fullkey LIKE` machinery (containment loses per-element positional binding);
- string, non-numeric leaves only — numeric leaves keep their `::numeric` cast;
- positive polarity only — under `not`, `->>` yields NULL on an absent path
  while `@>` yields FALSE, and `NOT NULL` vs `NOT FALSE` diverge, so a
  polarity flag is threaded through `filterCondition` and negated `eq` keeps
  extraction.

SQLite output is byte-identical to before. Adds an `eq-containment`
Postgres integration test covering each routing path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

This PR optimizes realm-server search for singular-path scalar eq filters by rewriting eligible predicates into Postgres JSONB containment (search_doc @> ...::jsonb) so the existing GIN index on search_doc can be used, while preserving existing semantics for cases where containment is not equivalent (e.g., under not, numeric leaves, plural paths, and SQLite).

Changes:

  • Thread filter polarity through filter compilation so the containment rewrite only applies at positive polarity.
  • Add a two-pass JsonContainsQuery → JsonContains expression node that resolves schema/serializer details before emitting adapter-specific SQL (@> on Postgres, ->/->> extraction on SQLite).
  • Add a Postgres integration test that asserts both result sets and that the expected SQL predicate form was emitted.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| packages/runtime-common/index-query-engine.ts | Adds filter polarity tracking and a schema-aware containment rewrite for eligible singular-path string eq filters. |
| packages/runtime-common/expression.ts | Introduces JsonContainsQuery/JsonContains nodes and renders them as @> (pg) or extraction equality (sqlite). |
| packages/realm-server/tests/index.ts | Registers the new Postgres integration test in the realm-server test suite. |
| packages/realm-server/tests/eq-containment-integration-test.ts | Adds coverage verifying containment vs extraction behavior across string/numeric/boolean/plural/negated cases and checks emitted SQL. |

Comment thread packages/realm-server/tests/eq-containment-integration-test.ts

github-actions Bot commented Jun 5, 2026

Host Test Results

    1 files      1 suites   1h 50m 1s ⏱️
2 944 tests 2 929 ✅ 15 💤 0 ❌
2 963 runs  2 948 ✅ 15 💤 0 ❌

Results for commit fa5911b.

Realm Server Test Results

    1 files  ±0      1 suites  ±0   8m 49s ⏱️ +14s
1 553 tests ±0  1 553 ✅ ±0  0 💤 ±0  0 ❌ ±0 
1 644 runs  ±0  1 644 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit fa5911b. ± Comparison against earlier commit 025a5c8.

…query

- Brand code-ref module URLs as RealmResourceIdentifier (via a `ref` helper)
  and match the DefinitionLookup.lookupDefinition signature
  (ResolvedCodeRef -> Promise<Definition>), so glint `lint:types` passes.
- `lastFilterSql()` now excludes the per-test seeding INSERTs so the
  `@>`-present / `@>`-absent assertions reflect only the search query.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@habdelra habdelra requested a review from a team June 5, 2026 23:58