[superlog] Use warn level for recoverable AI tool errors to stop false-positive ERROR job logs #490

Open
superlog-app[bot] wants to merge 1 commit into staging from superlog/fix-tool-error-log-level-escalation


@superlog-app superlog-app Bot commented Jun 20, 2026

Summary

Successful insights generation jobs were being logged at ERROR level whenever an intermediate AI tool call (SQL query, annotation creation) failed during the run — even when the agent recovered and completed with job_status: succeeded. This created noisy false-positive incidents that masked real failures.

The root cause is that createToolLogger.error() called requestLogger.error() on the job-scoped evlog logger, which set hasError = true internally. When logger.emit() was called at the end of a successful job, evlog determined the final severity using level = hasError ? "error" : hasWarn ? "warn" : "info" — yielding ERROR even though the job succeeded.
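The escalation described above can be reproduced with a small model. This is a minimal sketch, not evlog's actual implementation: `ScopedLoggerModel` and its field names are hypothetical, and only the final-level expression is taken from the description above.

```typescript
// Hypothetical stand-in for a job-scoped logger. NOT evlog's real code;
// it only mirrors the level rule quoted in the PR description.
type Level = "info" | "warn" | "error";

class ScopedLoggerModel {
  private hasError = false;
  private hasWarn = false;
  private fields: Record<string, unknown> = {};

  set(fields: Record<string, unknown>): void {
    Object.assign(this.fields, fields);
  }

  warn(message: string): void {
    this.hasWarn = true; // sticky: never cleared before emit()
    this.fields["last_warning"] = message;
  }

  error(error: Error): void {
    this.hasError = true; // sticky: never cleared before emit()
    this.fields["last_error"] = error.message;
  }

  emit(): { level: Level; fields: Record<string, unknown> } {
    // Same precedence as the expression quoted in the description.
    const level: Level = this.hasError ? "error" : this.hasWarn ? "warn" : "info";
    return { level, fields: { ...this.fields } };
  }
}

// A job whose SQL tool call fails once, after which the agent recovers.
const job = new ScopedLoggerModel();
job.error(new Error("SQL query failed"));
job.set({ job_status: "succeeded" });

const event = job.emit();
console.log(event.level, event.fields["job_status"]); // prints "error succeeded"
```

Because the flag is sticky, a single intermediate `error()` call decides the level of the whole job event, regardless of the final `job_status`.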

The fix is a one-line change in packages/ai/src/ai/tools/utils/logger.ts: the error() method of createToolLogger now calls requestLogger.warn() instead of requestLogger.error(). This means recoverable tool failures set hasWarn = true (final level WARN) instead of hasError = true (final level ERROR). True job failures still call logger.error() directly in jobs.ts and will continue to emit at ERROR level correctly.
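The shape of the change might look roughly like the following. This is a sketch under assumptions: the real contents of logger.ts are not shown in this PR page, so `createToolLoggerSketch`, `RequestLoggerLike`, `FallbackLogLike`, and the `tool` field are hypothetical names used only for illustration.

```typescript
type Fields = Record<string, unknown>;

// Hypothetical interfaces; the real evlog logger types may differ.
interface RequestLoggerLike {
  warn(message: string, fields?: Fields): void;
  error(error: Error, fields?: Fields): void;
}

interface FallbackLogLike {
  error(message: string, fields?: Fields): void;
}

function createToolLoggerSketch(
  toolName: string,
  requestLogger: RequestLoggerLike | undefined,
  fallbackLog: FallbackLogLike,
) {
  return {
    error(message: string, context: Fields = {}): void {
      const fields: Fields = { tool: toolName, ...context };
      if (requestLogger) {
        // Before the fix (per the PR): requestLogger.error(new Error(message), fields)
        // After the fix: a recoverable tool failure only marks the job as "warn".
        requestLogger.warn(message, fields);
        return;
      }
      // No job scope active: unchanged global fallback at ERROR level.
      fallbackLog.error(message, fields);
    },
  };
}

// Recording fakes to show which method each path reaches.
const recordedCalls: string[] = [];
const fakeRequestLogger: RequestLoggerLike = {
  warn: (message) => { recordedCalls.push(`warn:${message}`); },
  error: (error) => { recordedCalls.push(`error:${error.message}`); },
};
const fakeFallback: FallbackLogLike = {
  error: (message) => { recordedCalls.push(`fallback-error:${message}`); },
};

createToolLoggerSketch("sql", fakeRequestLogger, fakeFallback).error("query timed out");
createToolLoggerSketch("sql", undefined, fakeFallback).error("query timed out");
console.log(recordedCalls);
```

In this sketch the tool-facing method is still named `error()`, so call sites in the tools do not change; only the severity recorded on the job-scoped logger does.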

Alternative: Instead of downgrading the tool logger, the job success path could explicitly reset hasError before calling logger.emit(). However, evlog does not expose a reset API, and the warn approach is cleaner and more semantically correct — tool errors are warnings to the agent, not failures of the job.

Summary by cubic

Use warn level for recoverable tool errors so successful jobs no longer emit ERROR severity logs. This removes noisy false positives while keeping real failures at ERROR.

  • Bug Fixes
    • Updated createToolLogger.error to call requestLogger.warn(...), preventing hasError from being set and avoiding final job-level ERROR on successful runs.
    • True job failures still log via logger.error() in jobs.ts and emit ERROR as expected.

Written for commit 02a2354. Summary will update on new commits.

vercel Bot commented Jun 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| databuddy-status | Ready | Preview, Comment | Jun 20, 2026 1:38am |

2 Skipped Deployments

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| dashboard | Skipped | | Jun 20, 2026 1:38am |
| documentation | Skipped | | Jun 20, 2026 1:38am |

unkey-deploy Bot commented Jun 20, 2026

The latest updates on your projects. Learn more about Unkey Deploy

| Name | Status | Preview | Inspect | Updated (UTC) |
| --- | --- | --- | --- | --- |
| api (preview) | Ready | Visit Preview | Inspect | Jun 20, 2026 1:38am |

greptile-apps Bot commented Jun 20, 2026

Greptile Summary

This PR fixes a log-level escalation bug where recoverable AI tool call failures (e.g., a failed SQL query during an insights job) were causing the final job-level evlog event to emit at ERROR even when the job ultimately succeeded. The fix changes requestLogger.error(err, ...) to requestLogger.warn(message, ...) in createToolLogger, so intermediate tool failures set hasWarn instead of hasError on the scoped logger.

  • The error() method of createToolLogger now calls requestLogger.warn() when a job-scoped logger is active, preventing false-positive ERROR job logs on successful runs.
  • The Error object that was constructed to carry a stack trace is removed; warn-level entries now receive only the message string.
  • True job failures that call logger.error() directly in jobs.ts are unaffected and will still emit at ERROR level.

Confidence Score: 4/5

Safe to merge; the change correctly eliminates false-positive ERROR job logs for successful runs without affecting true failure paths.

The core fix is targeted and correct. The two observations are: the Error object (stack trace) is dropped from warn-level entries, and the fallback path (no requestLogger) still emits at error while the scoped path now emits at warn. Neither is a blocking issue, but the level asymmetry between the two paths of the same method is worth addressing.

Only packages/ai/src/ai/tools/utils/logger.ts changed; the fallback branch (lines 33–38) is the one worth a second look for the log-level consistency question.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/ai/src/ai/tools/utils/logger.ts | One-line fix: requestLogger.error(err, ...) replaced with requestLogger.warn(message, ...) in the error() method to prevent recoverable tool failures from escalating the job-level evlog severity to ERROR. Minor: the Error object (stack trace) and fallback path level consistency are worth noting. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Tool encounters error<br/>toolLogger.error called"] --> B{requestLogger active?}
    B -- Yes --> C["requestLogger.warn(message)<br/>hasWarn = true"]
    B -- No --> D["log.error(message)<br/>global fallback - ERROR level"]
    C --> E{Job completes}
    E -- job_status: succeeded --> F["evlog.emit()<br/>final level = WARN ✓"]
    E -- "job_status: failed<br/>jobs.ts calls logger.error" --> G["evlog.emit()<br/>final level = ERROR ✓"]
    D --> H["Standalone log at ERROR<br/>no job scope - no aggregation effect"]

Comments Outside Diff (2)

  1. packages/ai/src/ai/tools/utils/logger.ts, lines 21-31

    P2 Stack trace lost on downgraded tool errors

    The original requestLogger.error(err, ...) path wrapped the message in a new Error(message) to capture a stack trace. The new requestLogger.warn(message, ...) call passes only the string, so any debugging of why a recoverable tool failure occurred (e.g., which call site triggered it) loses the stack. If a pattern of a specific recoverable failure needs investigation, there will be no trace to follow in the warn-level log.

  2. packages/ai/src/ai/tools/utils/logger.ts, lines 33-38

    P2 Fallback path logs at ERROR while scoped path logs at WARN

    When no requestLogger is active, toolLogger.error() still routes to log.error(...). The same call produces different log levels depending on whether a job scope is active. If a tool failure is semantically a warning (recoverable), the fallback path should also use log.warn(...) for consistency. As-is, the same recoverable tool error appears at different severities in different execution contexts.
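One possible way to address both comments together is sketched below. It is an illustration only, not code from this PR: `reportRecoverableToolError`, `WarnSink`, and the `stack` field name are hypothetical, and whether evlog's warn call accepts extra fields in this form is an assumption.

```typescript
type Ctx = Record<string, unknown>;

// Hypothetical minimal interface for anything that can log at WARN.
interface WarnSink {
  warn(message: string, fields?: Ctx): void;
}

function reportRecoverableToolError(
  message: string,
  context: Ctx,
  scoped: WarnSink | undefined,
  globalLog: WarnSink,
): void {
  // Comment 1: capture a stack for the call site and attach it as a plain
  // field, so the entry stays at WARN but remains traceable.
  const stack = new Error(message).stack;
  const fields: Ctx = { ...context, stack };

  // Comment 2: use WARN on both paths so the same recoverable failure has
  // the same severity with or without an active job scope.
  (scoped ?? globalLog).warn(message, fields);
}

// Recording fakes to show that both paths end up at WARN with a stack field.
const seen: Array<{ sink: string; message: string; fields: Ctx }> = [];
const scopedSink: WarnSink = {
  warn: (message, fields = {}) => { seen.push({ sink: "scoped", message, fields }); },
};
const globalSink: WarnSink = {
  warn: (message, fields = {}) => { seen.push({ sink: "global", message, fields }); },
};

reportRecoverableToolError("annotation failed", { tool: "annotations" }, scopedSink, globalSink);
reportRecoverableToolError("annotation failed", { tool: "annotations" }, undefined, globalSink);
console.log(seen.map((entry) => entry.sink));
```

The trade-off is that the stack travels as an ordinary field rather than as an Error object, so any error-specific formatting a logger applies to Error instances would not apply here.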

Reviews (1): Last reviewed commit: "[superlog] Use warn level for recoverabl..."
