Propagate null through unnest and single() three-valued logic #2406

Open
MuhammadTahaNaveed wants to merge 1 commit into apache:master from MuhammadTahaNaveed:i2393

@MuhammadTahaNaveed
Member

Two related defects in how AGE handled cypher null inside list-iterating constructs.

age_unnest packaged every iterated element as a non-SQL-NULL agtype datum, even AGTV_NULL scalars. SQL IS NULL / IS NOT NULL then couldn't see those nulls, so [x IN [null, 1] WHERE x IS NULL] dropped the null it was meant to keep, and WHERE x IS NOT NULL kept the null it was meant to drop. The same mismatch surfaced in UNWIND. AGE already treats SQL NULL as the row-level representation of cypher null elsewhere (RETURN null AS v yields SQL NULL, strict operators short-circuit on it); age_unnest now does the same by emitting the row with nulls[0] = true when the element is AGTV_NULL.
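The mismatch described above can be illustrated with a small Python model. This is only a sketch, not AGE code: the sentinel `AGTV_NULL` stands in for an agtype null scalar wrapped in a non-NULL datum, Python's `None` stands in for SQL NULL, and the function names are hypothetical.

```python
# Hypothetical model, not AGE's C implementation.
# AGTV_NULL: a cypher null packaged as a non-NULL datum (old behavior).
# None:      SQL NULL, the row-level representation of cypher null.
AGTV_NULL = object()


def unnest_before(elements):
    """Old behavior: every element, even a cypher null, is a non-NULL datum."""
    return [AGTV_NULL if e is None else e for e in elements]


def unnest_after(elements):
    """New behavior: a cypher null element is emitted as SQL NULL (None)."""
    return list(elements)


def where_is_null(rows):
    """SQL `x IS NULL` only sees SQL NULL, modeled here as `is None`."""
    return [r for r in rows if r is None]


def where_is_not_null(rows):
    """SQL `x IS NOT NULL` keeps every row that is not SQL NULL."""
    return [r for r in rows if r is not None]


cypher_list = [None, 1]  # models the cypher literal [null, 1]

before = unnest_before(cypher_list)
after = unnest_after(cypher_list)

# Old: IS NULL finds nothing and IS NOT NULL keeps both rows.
print(len(where_is_null(before)), len(where_is_not_null(before)))
# New: IS NULL keeps the null row and IS NOT NULL keeps only the 1.
print(len(where_is_null(after)), len(where_is_not_null(after)))
```

In this model the fix is simply that the filter and the row producer agree on what "null" means, which mirrors the `nulls[0] = true` change.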

single() was previously transformed to SELECT count(*) FROM unnest(list) AS x WHERE pred IS TRUE, with the grammar wrapping the result as (subquery) = 1. With the unnest fix, single(x IN [null, 5] WHERE x > 0) left one definite true after the WHERE filter, so count = 1 and the result was true. Neo4j returns null because the unknown predicate on the null element could itself be a second match. single() is now rewritten to a CASE built on count(*) FILTER (WHERE pred IS TRUE) and bool_or(pred IS NULL):

    CASE WHEN count(*) FILTER (WHERE pred IS TRUE) >= 2 THEN false
         WHEN bool_or(pred IS NULL)                     THEN NULL
         WHEN count(*) FILTER (WHERE pred IS TRUE) =  1 THEN true
         ELSE false END

The >= 2 arm runs first so that two definite trues dominate any unknowns. The rewrite fits inside the existing make_predicate_case_expr helper alongside all/any/none and removes both the special-case transform branch and the grammar = 1 wrap. A small make_count_star_filter_agg helper mirrors the existing make_bool_or_agg. The new edge cases (one-true-plus-null, two-trues-plus-null, all-nulls, mixed-true-false-null) were verified against Neo4j.
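The arm ordering of the CASE can be checked with a minimal Python model. This is a sketch under assumptions: `single_3vl` and `gt_zero` are hypothetical names, `None` models SQL NULL (unknown), and the function follows the CASE arms as written in this description rather than AGE's generated plan.

```python
def gt_zero(x):
    """Models `x > 0` with SQL semantics: a NULL operand yields NULL."""
    return None if x is None else x > 0


def single_3vl(pred_results):
    """Three-valued single(): each input is True, False, or None (unknown)."""
    results = list(pred_results)
    # count(*) FILTER (WHERE pred IS TRUE)
    true_count = sum(1 for r in results if r is True)
    # bool_or(pred IS NULL)
    any_unknown = any(r is None for r in results)

    if true_count >= 2:   # two definite matches: unknowns cannot rescue it
        return False
    if any_unknown:       # an unknown could be a second (or first) match
        return None
    if true_count == 1:   # exactly one match and nothing unknown
        return True
    return False          # no matches and nothing unknown


# The edge cases named in the description, as seen by this model.
cases = {
    "one-true-plus-null": [None, 5],
    "two-trues-plus-null": [None, 5, 7],
    "all-nulls": [None, None],
    "mixed-true-false-null": [5, -1, None],
}
for name, values in cases.items():
    print(name, single_3vl(gt_zero(v) for v in values))
```

Swapping the first two arms would make two-trues-plus-null return unknown instead of false, which is why the >= 2 check has to come first.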

The predicate_functions regression also picks up the corrected behavior of any/all/none over null elements: null > 0 now yields SQL NULL instead of being silently treated as true, so the three-valued combinators in those functions produce the openCypher results the comments previously documented as buggy.
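For reference, the three-valued behavior of any/all/none over a list containing null can be modeled the same way. The sketch below assumes the usual three-valued definitions of these combinators (a decisive element wins, otherwise any unknown makes the result unknown); the function names are hypothetical and the code does not reproduce AGE's actual SQL.

```python
def gt_zero(x):
    """Models `x > 0` with SQL semantics: a NULL operand yields NULL."""
    return None if x is None else x > 0


def any_3vl(pred_results):
    """True if some element is true; else unknown if any is unknown."""
    results = list(pred_results)
    if any(r is True for r in results):
        return True
    if any(r is None for r in results):
        return None
    return False


def all_3vl(pred_results):
    """False if some element is false; else unknown if any is unknown."""
    results = list(pred_results)
    if any(r is False for r in results):
        return False
    if any(r is None for r in results):
        return None
    return True


def none_3vl(pred_results):
    """False if some element is true; else unknown if any is unknown."""
    results = list(pred_results)
    if any(r is True for r in results):
        return False
    if any(r is None for r in results):
        return None
    return True


# `null > 0` is unknown, so it no longer counts as a match.
values = [None, -1]
preds = [gt_zero(v) for v in values]
print(any_3vl(preds), all_3vl(preds), none_3vl(preds))
```

With `null > 0` evaluating to unknown rather than true, a list such as `[null, -1]` gives an unknown result for any and none, and a definite false for all.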

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Successfully merging this pull request may close these issues.

List-comprehension WHERE filters may mishandle null elements.
