[#11672] improvement(optimizer): optimize trigger-expr evaluation with table metadata & partition short-circuit#11664
Open
roryqi wants to merge 3 commits into
Open
[#11672] improvement(optimizer): optimize trigger-expr evaluation with table metadata & partition short-circuit#11664roryqi wants to merge 3 commits into
roryqi wants to merge 3 commits into
Conversation
Extend the trigger-expr/score-expr evaluation context so that, in addition to partition and table statistics, table metadata is available: column_count, partition_count, sort_order_count, and table properties (numeric values parsed to long, others kept as string). Both partitioned and non-partitioned tables now evaluate against partition statistics (when present), table statistics, and table metadata. The representation of trigger-expr is unchanged (a string in the policy rules). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Port of Pinterest gravitino-pinterest#249: 1. Short-circuit partitioned-table evaluation: try the trigger expression with table-level context only (table stats + metadata + rules); if it resolves without referencing partition variables, skip the per-partition loop. Relies on the QL engine's left-to-right && / || short-circuiting. 2. Precompute the table-level context once per initialize() instead of rebuilding it for every partition. 3. Cache compiled hyphen-to-underscore regex patterns in QLExpressionEvaluator to avoid Pattern.compile on every evaluation. Adds ExpressionEvaluator#tryToEvaluateBool returning Optional<Boolean>. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Code Coverage Report
Files
|
yuqi1129
reviewed
Jun 16, 2026
| .forEach( | ||
| (k, v) -> { | ||
| try { | ||
| context.put(k, Long.parseLong(v)); |
Contributor
There was a problem hiding this comment.
I would suggest we use NumberUtils.isCreatable to replace this one.
| return expressionEvaluator.tryToEvaluateBool(expression, context); | ||
| } catch (RuntimeException e) { | ||
| LOG.warn("Failed to evaluate expression '{}' with context {}", expression, context, e); | ||
| return Optional.of(false); |
Contributor
There was a problem hiding this comment.
It should be Optional.of(null) according to the Java docs in expressionEvaluator.
- TableMetadataTriggerExpressionUtils: replace try/catch around Long.parseLong with NumberUtils.isCreatable, converting via createNumber(v).longValue() to preserve Long semantics. - BaseExpressionStrategyHandler: return Optional.empty() (not Optional.of(false)) on evaluation failure, matching the ExpressionEvaluator#tryToEvaluateBool contract so the caller falls back to per-partition evaluation. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
Two related optimizations to the
maintenance/optimizerrecommender's trigger-expr / score-expr evaluation:column_count,partition_count,sort_order_count, and table properties (numeric values parsed tolong, others kept asstring), in addition to partition and table statistics. Both partitioned and non-partitioned tables now evaluate against partition statistics (when present), table statistics, and table metadata. The trigger-expr string representation is unchanged.&&/||short-circuiting).initialize()instead of rebuilding it for every partition.QLExpressionEvaluatorto avoidPattern.compileon every evaluation.ExpressionEvaluator#tryToEvaluateBoolreturningOptional<Boolean>.Why are the changes needed?
Trigger expressions previously could only reference partition/table statistics, limiting the rules users can write. They also re-evaluated every partition even when a table-level expression already decided the outcome, which is costly for large partitioned tables.
Fixes #11672
Does this PR introduce any user-facing change?
No API changes. Trigger-expr authors gain new referenceable variables (
column_count,partition_count,sort_order_count, and table properties).How was this patch tested?
New/extended unit tests:
TestTableMetadataTriggerExpressionUtils,TestQLExpressionEvaluator, andTestCompactionStrategyHandler../gradlew :maintenance:optimizer:testpasses locally.