feat(tracing): skip Agentex span-start write by default (end-only ingest) #438
Merged
danielmillerp
approved these changes
Jun 23, 2026
…est) The Agentex tracing processor wrote every span twice — a `spans.create` on start and a `spans.update` on end — doubling per-span HTTP/DB writes against the Agentex control plane. Under load this is what timed out span-start activities and pressured the Agentex Postgres connection pool. Default to end-only ingest: skip the start write and persist each span once, as a single `spans.create` on end (a bare `spans.update` would 404 since the row was never created). Gated by a dedicated, default-ON env var `AGENTEX_TRACING_SKIP_AGENTEX_SPAN_START`, independent of the SGP/EGP processor's `AGENTEX_TRACING_SKIP_SPAN_START` so the EGP span path is unchanged. Set to 0/false/no/off to restore start writes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
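A minimal sketch of how a default-ON flag like this might be parsed. Only the env var name and the `0`/`false`/`no`/`off` opt-out values come from the commit message; the function name `skip_span_start_enabled`, the case-insensitive matching, and the whitespace trimming are assumptions for illustration, not the PR's actual code.

```python
import os
from typing import Mapping, Optional

# Env var name from the commit message; the parsing rules below are assumed.
ENV_VAR = "AGENTEX_TRACING_SKIP_AGENTEX_SPAN_START"
_FALSY = {"0", "false", "no", "off"}


def skip_span_start_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True (end-only ingest) unless the flag is explicitly turned off."""
    source = os.environ if env is None else env
    raw = source.get(ENV_VAR)
    if raw is None:
        return True  # unset means default ON
    # Assumption: values are compared case-insensitively after trimming.
    return raw.strip().lower() not in _FALSY
```

Passing the environment as a mapping keeps the helper easy to test without mutating `os.environ`.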
Per Greptile review: re-reading the flag in both on_span_start and on_span_end risks splitting a span's lifecycle if the env toggles between them — start-skip + end-update lands on a non-existent row (404). Unlike the SGP processor (idempotent upsert_batch on both halves), the Agentex backend's create-on-start / update-on-end asymmetry makes this a real failure. Capture _skip_span_start once in __init__ for both sync and async processors and reference it in both handlers, so the decision is always consistent. Adds a regression test (toggle env mid-span -> still end-only INSERT). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
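The hazard described above can be reproduced with a small stand-in backend. Everything in this sketch is hypothetical (`FakeSpans`, `PerCallProcessor`, `CapturedProcessor`, and the `read_flag` callable are illustrative names, and `LookupError` stands in for the HTTP 404); it only demonstrates why reading the flag once at construction keeps both handlers consistent.

```python
from typing import Callable, Dict


class FakeSpans:
    """Stand-in backend: update fails if the row was never created."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}

    def create(self, span_id: str, **fields) -> None:
        self.rows[span_id] = dict(fields)

    def update(self, span_id: str, **fields) -> None:
        if span_id not in self.rows:
            raise LookupError(f"404: span {span_id} not found")
        self.rows[span_id].update(fields)


class PerCallProcessor:
    """Hazard: re-reads the flag in each handler."""

    def __init__(self, spans: FakeSpans, read_flag: Callable[[], bool]) -> None:
        self.spans = spans
        self.read_flag = read_flag

    def on_span_start(self, span_id: str, **fields) -> None:
        if not self.read_flag():
            self.spans.create(span_id, **fields)

    def on_span_end(self, span_id: str, **fields) -> None:
        if self.read_flag():
            self.spans.create(span_id, **fields)  # end-only INSERT
        else:
            self.spans.update(span_id, **fields)  # needs the start row


class CapturedProcessor(PerCallProcessor):
    """Fix: reads the flag once at construction and reuses that value."""

    def __init__(self, spans: FakeSpans, read_flag: Callable[[], bool]) -> None:
        captured = read_flag()
        super().__init__(spans, lambda: captured)


def run_with_mid_span_toggle(processor_cls) -> str:
    flag = {"skip": True}
    spans = FakeSpans()
    proc = processor_cls(spans, lambda: flag["skip"])
    proc.on_span_start("s1", name="demo")
    flag["skip"] = False  # the setting flips between start and end
    try:
        proc.on_span_end("s1", name="demo", status="ok")
    except LookupError:
        return "404"
    return "created" if "s1" in spans.rows else "missing"
```

With the per-call variant the end handler attempts an update on a row that was never inserted; with the captured variant the span is still written once on end.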
e22b496 to
8474cc0
Compare
Summary
The Agentex tracing processor writes every span twice: a `spans.create` on start and a `spans.update` on end. The start row is overwritten by the end write moments later, so persisting it doubles per-span HTTP/DB write volume against the Agentex control plane. Under load this pressures the Agentex Postgres connection pool.
This defaults the Agentex backend to end-only ingest: skip the start write and persist each span once, as a single `spans.create` on end.
Key correctness detail
Unlike the SGP processor (uniform `upsert_batch`), the Agentex processor does `create` on start + `update` on end. Naively skipping start would make the end `update` hit a non-existent row (404). So when the start is skipped, `on_span_end` now does a single `spans.create` carrying the complete span: one INSERT per span instead of INSERT + UPDATE. Applied to both the sync and async processors.
Scope / flag
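Gated by a dedicated, default-ON env var, `AGENTEX_TRACING_SKIP_AGENTEX_SPAN_START`, independent of the SGP/EGP processor's `AGENTEX_TRACING_SKIP_SPAN_START`, so the EGP span path is unchanged.
Set it to `0`/`false`/`no`/`off` to restore the start write.
Trade-off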
Same as the SGP end-only path: in-flight spans aren't visible until they complete, and spans whose process crashes before `end` won't persist. Documented in the docstring.
Tests
`pytest tests/lib/core/tracing/processors/test_agentex_tracing_processor.py`: 7 passed (3 existing + 4 new: default-skip and skip-disabled, sync + async).
`pytest tests/lib/core/tracing`: 85 passed, 2 skipped.
`ruff check` clean.
Related
🤖 Generated with Claude Code
Greptile Summary
This PR introduces an "end-only ingest" mode for the Agentex tracing processor, skipping the per-span `spans.create` call on start and converting the end-of-span write from `spans.update` to a single `spans.create` carrying the complete span, halving the HTTP/DB write volume against the Agentex control plane.
Init-time flag capture: the decision is stored as `self._skip_span_start` once in `__init__`, ensuring the two halves of each span's lifecycle always agree on the mode (avoids the previously flagged 404/double-create split).
`_create_kwargs` helper: extracts the full-span keyword dict shared by the start write (skip=OFF) and the end-only insert (skip=ON), removing the field-list duplication between sync and async paths.
Confidence Score: 5/5
Safe to merge — the change is a default-ON write-reduction flag with a clear opt-out, and both the split-decision hazard and the 404 risk are properly addressed.
The init-time flag capture ensures both halves of every span always agree on the mode, the end-only path correctly switches from spans.update to spans.create, and _create_kwargs passes the same fields as the original start-path code did. Tests cover all three behavioural cases for both processors. No correctness or safety issues found in the changed code.
No files require special attention.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    INIT["__init__(config)\nself._skip_span_start = _skip_span_start_enabled()"]
    INIT --> START["on_span_start(span)"]
    INIT --> END["on_span_end(span)"]
    START --> C1{self._skip_span_start?}
    C1 -- "True (default)" --> NOOP["return (no-op)"]
    C1 -- "False" --> CREATE_START["spans.create(**_create_kwargs(span))"]
    END --> C2{self._skip_span_start?}
    C2 -- "True (default)" --> CREATE_END["spans.create(**_create_kwargs(span))\n(single INSERT — complete span)"]
    C2 -- "False" --> UPDATE_END["spans.update(span.id, **span.model_dump(...))"]
Reviews (3): Last reviewed commit: "fix(tracing): capture skip-span-start de..."
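The flowchart above can be read as the following rough sketch, which counts backend writes per span in each mode. The `Span` dataclass, `RecordingSpans` client, and `SketchProcessor` are hypothetical stand-ins; the real processor's span model and client signatures are not shown in this PR page, and `dataclasses.asdict` is used here in place of the `span.model_dump(...)` call named in the diagram.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Span:
    """Hypothetical span model with a few illustrative fields."""
    id: str
    name: str
    start_time: float
    end_time: Optional[float] = None
    output: Optional[str] = None


class RecordingSpans:
    """Hypothetical client that records calls instead of making HTTP requests."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    def create(self, **kwargs: Any) -> None:
        self.calls.append(("create", kwargs))

    def update(self, span_id: str, **kwargs: Any) -> None:
        self.calls.append(("update", {"id": span_id, **kwargs}))


class SketchProcessor:
    def __init__(self, spans: RecordingSpans, skip_span_start: bool = True) -> None:
        self.spans = spans
        self._skip_span_start = skip_span_start  # decided once, used by both handlers

    @staticmethod
    def _create_kwargs(span: Span) -> Dict[str, Any]:
        # Shared by the start write (skip off) and the end-only insert (skip on).
        return asdict(span)

    def on_span_start(self, span: Span) -> None:
        if self._skip_span_start:
            return  # no-op in end-only mode
        self.spans.create(**self._create_kwargs(span))

    def on_span_end(self, span: Span) -> None:
        if self._skip_span_start:
            self.spans.create(**self._create_kwargs(span))  # single INSERT
        else:
            fields = asdict(span)
            fields.pop("id")
            self.spans.update(span.id, **fields)


def trace_one_span(skip: bool) -> List[str]:
    spans = RecordingSpans()
    proc = SketchProcessor(spans, skip_span_start=skip)
    span = Span(id="s1", name="demo", start_time=0.0)
    proc.on_span_start(span)
    span.end_time, span.output = 1.0, "done"
    proc.on_span_end(span)
    return [op for op, _ in spans.calls]
```

In this sketch, `trace_one_span(True)` records one `create`, while `trace_one_span(False)` records a `create` followed by an `update`, which matches the one-write versus two-write behaviour the PR describes.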