feat(tracing): skip Agentex span-start write by default (end-only ingest)#438

Merged
NiteshDhanpal merged 2 commits into next from feat/agentex-skip-span-start-default on Jun 23, 2026

@NiteshDhanpal NiteshDhanpal commented Jun 23, 2026

Summary

The Agentex tracing processor writes every span twice: a spans.create on start and a spans.update on end. The end write overwrites the start row moments later, so persisting the start row doubles per-span HTTP/DB write volume against the Agentex control plane. Under load this is what timed out span-start activities and pressured the Agentex Postgres connection pool.

This defaults the Agentex backend to end-only ingest: skip the start write and persist each span once, as a single spans.create on end.

Key correctness detail

Unlike the SGP processor (uniform upsert_batch), the Agentex processor does create on start + update on end. Naively skipping start would make the end update hit a non-existent row (404). So when the start is skipped, on_span_end now does a single spans.create carrying the complete span — one INSERT per span instead of INSERT + UPDATE. Applied to both the sync and async processors.
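The create-on-start / update-on-end asymmetry described above can be illustrated with a minimal sketch. It is not the processor's actual code: `FakeSpansClient`, `NotFoundError`, `SketchProcessor`, and the dict-shaped span are all hypothetical stand-ins, chosen only to show why a bare update fails once the start write is skipped and why a single create on end avoids that.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


class NotFoundError(Exception):
    """Hypothetical stand-in for the 404 a real client would raise."""


@dataclass
class FakeSpansClient:
    """Hypothetical in-memory stand-in for the Agentex spans API."""

    rows: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    writes: int = 0

    def create(self, id: str, **fields: Any) -> None:
        self.writes += 1
        self.rows[id] = dict(fields)

    def update(self, id: str, **fields: Any) -> None:
        self.writes += 1
        if id not in self.rows:
            # Updating a row that was never created is the 404 case.
            raise NotFoundError(id)
        self.rows[id].update(fields)


class SketchProcessor:
    """Assumed shape of the branching; field names are illustrative."""

    def __init__(self, spans: FakeSpansClient, skip_span_start: bool = True) -> None:
        self.spans = spans
        self._skip_span_start = skip_span_start

    def on_span_start(self, span: Dict[str, Any]) -> None:
        if self._skip_span_start:
            return  # end-only ingest: nothing is written on start
        self.spans.create(**span)

    def on_span_end(self, span: Dict[str, Any]) -> None:
        if self._skip_span_start:
            # No start row exists, so insert the complete span once.
            self.spans.create(**span)
        else:
            fields = {k: v for k, v in span.items() if k != "id"}
            self.spans.update(span["id"], **fields)
```

With the default, one span costs one write; with `skip_span_start=False` it costs two, matching the INSERT versus INSERT + UPDATE comparison above.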

Scope / flag

  • Gated by a dedicated, default-ON env var AGENTEX_TRACING_SKIP_AGENTEX_SPAN_START.
  • Set to 0/false/no/off to restore the start write.
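A default-ON flag of this kind could be parsed roughly as follows. This is a sketch under assumptions, not the PR's `_skip_span_start_enabled()`: the function name, the optional `environ` parameter, the case-insensitive matching, the whitespace stripping, and the treatment of unrecognised values as ON are all illustrative choices. Only the variable name and the four disabling values come from the description above.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "AGENTEX_TRACING_SKIP_AGENTEX_SPAN_START"
# Values that restore the start write, per the PR description.
_DISABLE_VALUES = frozenset({"0", "false", "no", "off"})


def skip_span_start_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True unless the flag is explicitly set to a disabling value."""
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR)
    if raw is None:
        return True  # unset means the default: skip the start write
    # Assumption: matching ignores case and surrounding whitespace.
    return raw.strip().lower() not in _DISABLE_VALUES
```

For example, `skip_span_start_enabled({ENV_VAR: "off"})` returns `False`, while an empty mapping returns `True`.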

Trade-off

Same as the SGP end-only path: in-flight spans aren't visible until they complete, and spans whose process crashes before end won't persist. Documented in the docstring.

Tests

  • pytest tests/lib/core/tracing/processors/test_agentex_tracing_processor.py — 7 passed (3 existing + 4 new: default-skip and skip-disabled, sync + async).
  • pytest tests/lib/core/tracing — 85 passed, 2 skipped.
  • ruff check clean.

🤖 Generated with Claude Code

Greptile Summary

This PR introduces an "end-only ingest" mode for the Agentex tracing processor, skipping the per-span spans.create call on start and converting the end-of-span write from spans.update to a single spans.create carrying the complete span — halving the HTTP/DB write volume against the Agentex control plane.

  • Skip flag captured at init: both sync and async processors store self._skip_span_start once in __init__, ensuring the two halves of each span's lifecycle always agree on the mode (avoids the previously flagged 404/double-create split).
  • _create_kwargs helper: extracts the full-span keyword dict shared by the start write (skip=OFF) and the end-only insert (skip=ON), removing the field-list duplication between sync and async paths.
  • Test coverage: new test cases (3 per processor) verify default-skip, skip-disabled, and the mid-span toggle regression for both sync and async processors.
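The init-time capture and the mid-span toggle regression can be sketched for the async path like this. Everything here is illustrative: `AsyncSketchProcessor`, `_skip_enabled`, and `toggle_mid_span` are hypothetical names, the spans client is an `AsyncMock` rather than the real Agentex client, and the span is a plain dict rather than the project's Span model.

```python
import asyncio
import os
from unittest.mock import AsyncMock, patch

ENV_VAR = "AGENTEX_TRACING_SKIP_AGENTEX_SPAN_START"


def _skip_enabled() -> bool:
    # Assumed parsing: unset means ON; 0/false/no/off mean OFF.
    raw = os.environ.get(ENV_VAR, "1")
    return raw.strip().lower() not in {"0", "false", "no", "off"}


class AsyncSketchProcessor:
    def __init__(self, spans) -> None:
        self.spans = spans
        # Read the flag once so start and end always agree on the mode.
        self._skip_span_start = _skip_enabled()

    async def on_span_start(self, span: dict) -> None:
        if self._skip_span_start:
            return
        await self.spans.create(**span)

    async def on_span_end(self, span: dict) -> None:
        if self._skip_span_start:
            await self.spans.create(**span)
        else:
            fields = {k: v for k, v in span.items() if k != "id"}
            await self.spans.update(span["id"], **fields)


async def toggle_mid_span() -> AsyncMock:
    """Flip the env var between start and end; the mode must not change."""
    spans = AsyncMock()
    with patch.dict(os.environ, {ENV_VAR: "1"}):
        processor = AsyncSketchProcessor(spans)
        span = {"id": "s1", "name": "step"}
        await processor.on_span_start(span)
        os.environ[ENV_VAR] = "0"  # toggle after the span has started
        await processor.on_span_end(span)
    return spans
```

Running `asyncio.run(toggle_mid_span())` should leave the mock with exactly one awaited `create` and no `update`, because the decision was fixed when the processor was constructed. Had the flag been re-read in `on_span_end`, the same sequence would have issued an update against a row that was never created.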

Confidence Score: 5/5

Safe to merge — the change is a default-ON write-reduction flag with a clear opt-out, and both the split-decision hazard and the 404 risk are properly addressed.

The init-time flag capture ensures both halves of every span always agree on the mode, the end-only path correctly switches from spans.update to spans.create, and _create_kwargs passes the same fields as the original start-path code did. Tests cover all three behavioural cases for both processors. No correctness or safety issues found in the changed code.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/agentex/lib/core/tracing/processors/agentex_tracing_processor.py | Adds _skip_span_start_enabled(), _create_kwargs(), and the init-time flag capture; both processors correctly branch on self._skip_span_start to skip start writes and substitute a single end-of-span INSERT. Logic is sound and consistent across sync/async. |
| tests/lib/core/tracing/processors/test_agentex_tracing_processor.py | Adds TestAgentexSyncSkipSpanStart and TestAgentexAsyncSkipSpanStart with 3 cases each (default, skip-disabled, mid-span toggle regression). New _make_span() helper builds a realistic Span fixture. Async tests rely on project-wide asyncio_mode=auto in keeping with existing test patterns. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    INIT["__init__(config)\nself._skip_span_start = _skip_span_start_enabled()"]

    INIT --> START["on_span_start(span)"]
    INIT --> END["on_span_end(span)"]

    START --> C1{self._skip_span_start?}
    C1 -- "True (default)" --> NOOP["return (no-op)"]
    C1 -- "False" --> CREATE_START["spans.create(**_create_kwargs(span))"]

    END --> C2{self._skip_span_start?}
    C2 -- "True (default)" --> CREATE_END["spans.create(**_create_kwargs(span))\n(single INSERT — complete span)"]
    C2 -- "False" --> UPDATE_END["spans.update(span.id, **span.model_dump(...))"]

Reviews (3): Last reviewed commit: "fix(tracing): capture skip-span-start de..."

Comment thread src/agentex/lib/core/tracing/processors/agentex_tracing_processor.py Outdated
NiteshDhanpal and others added 2 commits June 23, 2026 11:20
…est)

The Agentex tracing processor wrote every span twice — a `spans.create` on
start and a `spans.update` on end — doubling per-span HTTP/DB writes against
the Agentex control plane. Under load this is what timed out span-start
activities and pressured the Agentex Postgres connection pool.

Default to end-only ingest: skip the start write and persist each span once,
as a single `spans.create` on end (a bare `spans.update` would 404 since the
row was never created). Gated by a dedicated, default-ON env var
`AGENTEX_TRACING_SKIP_AGENTEX_SPAN_START`, independent of the SGP/EGP
processor's `AGENTEX_TRACING_SKIP_SPAN_START` so the EGP span path is
unchanged. Set to 0/false/no/off to restore start writes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Per Greptile review: re-reading the flag in both on_span_start and
on_span_end risks splitting a span's lifecycle if the env toggles between
them — start-skip + end-update lands on a non-existent row (404). Unlike the
SGP processor (idempotent upsert_batch on both halves), the Agentex backend's
create-on-start / update-on-end asymmetry makes this a real failure.

Capture _skip_span_start once in __init__ for both sync and async processors
and reference it in both handlers, so the decision is always consistent.
Adds a regression test (toggle env mid-span -> still end-only INSERT).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@NiteshDhanpal force-pushed the feat/agentex-skip-span-start-default branch from e22b496 to 8474cc0 on June 23, 2026 18:20
@NiteshDhanpal merged commit 10d22a2 into next on Jun 23, 2026
57 checks passed
@NiteshDhanpal deleted the feat/agentex-skip-span-start-default branch on June 23, 2026 18:48
@stainless-app bot mentioned this pull request on Jun 23, 2026