[superlog] Stop tool-call failures from escalating succeeded job logs to ERROR #494

Open
superlog-app[bot] wants to merge 1 commit into staging from superlog/fix-tool-error-escalates-job-log-level-retry-2abcfd47

superlog-app Bot commented Jun 24, 2026

Summary

Insights generation jobs that encounter an intermediate AI tool failure (SQL query error, annotation not found, goal update forbidden) emit their final wide-event log at ERROR level, even though the job completed successfully with all insights generated.

The root cause is in packages/ai/src/ai/tools/utils/logger.ts. When an AI tool fails, createToolLogger(toolName).error() calls requestLogger.error(err, context) on the job-level RequestLogger. This marks the request-scoped wide event as ERROR. When the job succeeds and logger.emit({ job_status: "succeeded" }) fires in jobs.ts, it inherits the ERROR level set by the intermediate failure — producing a log with both "level":"error" and "job_status":"succeeded".
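The escalation described above can be reproduced with a small sketch. `WideEventLogger` here is a hypothetical stand-in, not the actual `RequestLogger`; it assumes the level only ever moves upward and that `emit()` reports whatever level has accumulated. The tool name `execute_sql` is also made up for illustration.

```typescript
// Hypothetical stand-in for a request-scoped wide-event logger.
// Assumption: the event level can be raised but never lowered.
type Level = "info" | "warn" | "error";
type Fields = Record<string, unknown>;

const severity: Record<Level, number> = { info: 0, warn: 1, error: 2 };

class WideEventLogger {
  private level: Level = "info";
  private fields: Fields = {};

  private raise(level: Level): void {
    if (severity[level] > severity[this.level]) this.level = level;
  }

  // Recording an error marks the whole wide event as ERROR.
  error(err: Error, context: Fields = {}): void {
    this.raise("error");
    Object.assign(this.fields, context, {
      error: { name: err.name, message: err.message },
    });
  }

  // Emit the single wide event with the accumulated level and fields.
  emit(extra: Fields): Fields {
    return { level: this.level, ...this.fields, ...extra };
  }
}

const requestLogger = new WideEventLogger();

// An intermediate tool failure that the job later recovers from.
requestLogger.error(new Error("SQL query error"), { tool: "execute_sql" });

// The job finishes successfully, but the event still carries the ERROR level.
const event = requestLogger.emit({ job_status: "succeeded" });
```

Under these assumptions `event` ends up with both `level: "error"` and `job_status: "succeeded"`, which is the contradictory log the PR targets.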

The fix changes requestLogger.error(err, ...) to requestLogger.warn(message, { error: {...}, ...context }). Succeeded jobs with intermediate tool failures now emit at WARN. The error details (error.name, error.message, error.stack) are preserved in the wide event for debugging. Real fatal job failures recorded via captureInsightsError() and the jobs.ts catch block are unaffected — they call emitInsightsEvent("error", ...) directly.
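A minimal sketch of the shape of this change, under stated assumptions: `WideEventLogger` is a hypothetical stand-in for the request-scoped logger, the `error(message, err, context)` signature on the tool logger is assumed, and the logger is passed in explicitly here, whereas the real code resolves it from the active request scope.

```typescript
type Level = "info" | "warn" | "error";
type Fields = Record<string, unknown>;

const severity: Record<Level, number> = { info: 0, warn: 1, error: 2 };

// Hypothetical stand-in for the request-scoped logger (not the real class).
class WideEventLogger {
  private level: Level = "info";
  private fields: Fields = {};

  // Raise the wide event to WARN at most; never touches ERROR.
  warn(message: string, context: Fields = {}): void {
    if (severity.warn > severity[this.level]) this.level = "warn";
    Object.assign(this.fields, context, { message });
  }

  emit(extra: Fields): Fields {
    return { level: this.level, ...this.fields, ...extra };
  }
}

// Assumed signature: the tool logger's error() takes a message, the Error,
// and optional context, and forwards everything at WARN.
function createToolLogger(toolName: string, requestLogger: WideEventLogger) {
  return {
    error(message: string, err: Error, context: Fields = {}): void {
      requestLogger.warn(message, {
        // Error details stay in the wide event for debugging.
        error: { name: err.name, message: err.message, stack: err.stack },
        tool: toolName,
        ...context,
      });
    },
  };
}

const requestLogger = new WideEventLogger();
const toolLogger = createToolLogger("execute_sql", requestLogger);

toolLogger.error("Tool call failed", new Error("SQL query error"));
const event = requestLogger.emit({ job_status: "succeeded" });
```

With this shape a recovered job emits at WARN while still carrying `error.name`, `error.message`, and `error.stack`.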

An alternative approach would be to use requestLogger.set() to merge tool-error context without affecting the log level at all (INFO for fully succeeded jobs). The warn() approach chosen here preserves a visible signal that something failed internally even though the job recovered.
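For comparison, the `set()` alternative might look roughly like the sketch below. The `WideEventLogger` class and the `tool_error` field name are hypothetical; the only behaviour assumed is that `set()` merges fields into the wide event without changing its level.

```typescript
type Fields = Record<string, unknown>;

// Hypothetical stand-in: set() merges context and leaves the level alone.
class WideEventLogger {
  private level: "info" | "warn" | "error" = "info";
  private fields: Fields = {};

  set(context: Fields): void {
    Object.assign(this.fields, context);
  }

  emit(extra: Fields): Fields {
    return { level: this.level, ...this.fields, ...extra };
  }
}

const requestLogger = new WideEventLogger();
const failure = new Error("annotation not found");

// Record the tool failure as plain context; the level is not touched.
requestLogger.set({
  tool_error: { name: failure.name, message: failure.message },
});

const event = requestLogger.emit({ job_status: "succeeded" });
```

Here the succeeded job stays at INFO, so the failure is only discoverable by querying the `tool_error` field rather than by level, which is the trade-off the paragraph above describes.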

Summary by cubic

Stop recoverable tool-call failures from marking succeeded jobs as ERROR by logging them at WARN in packages/ai/src/ai/tools/utils/logger.ts. Error details are still included in the wide event, and real job failures continue to log as ERROR.

Written for commit 27e29ec. Summary will update on new commits.

vercel Bot commented Jun 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| databuddy-status | Ready | Preview, Comment | Jun 24, 2026 4:09pm |

2 Skipped Deployments

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| dashboard | Skipped | | Jun 24, 2026 4:09pm |
| documentation | Skipped | | Jun 24, 2026 4:09pm |

unkey-deploy Bot commented Jun 24, 2026

The latest updates on your projects. Learn more about Unkey Deploy

| Name | Status | Preview | Inspect | Updated (UTC) |
| --- | --- | --- | --- | --- |
| api (preview) | Ready | Visit Preview | Inspect | Jun 24, 2026 4:10pm |

greptile-apps Bot commented Jun 24, 2026

Contributor

Greptile Summary

This PR fixes a log-level escalation bug where a recoverable AI tool failure (e.g. SQL error, annotation-not-found) was calling requestLogger.error(), permanently marking the job's wide-event at ERROR even when the job ultimately succeeded. The fix changes that call to requestLogger.warn() and inlines the structured error fields (error.name, error.message, error.stack) into the log context so debugging context is preserved.

  • The change is scoped to one branch of createToolLogger.error() — the path that holds an active RequestLogger. Fatal failures via captureInsightsError() and the jobs.ts catch block are unaffected.
  • The fallback branch (no active request scope) still emits via log.error() but does not receive the new error: {...} enrichment added to the request-logger path, creating a minor field-schema inconsistency between the two paths.

Confidence Score: 4/5

The change is minimal and correctly targeted — it only affects the in-scope requestLogger path in the tool logger, leaving fatal-failure paths untouched.

The fix is narrow and well-reasoned: swapping requestLogger.error() for requestLogger.warn() stops recoverable tool failures from polluting the job-level wide event, and the structured error fields are preserved. The only gap is that the fallback log.error() branch (no active request scope) was not updated to include the same error: { name, message, stack } fields, so the two code paths emit different field schemas — a minor inconsistency that could affect log queries in non-request contexts.

packages/ai/src/ai/tools/utils/logger.ts — specifically the fallback log.error() branch around line 36 which was not updated to include the new structured error fields.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/ai/src/ai/tools/utils/logger.ts | Downgrades requestLogger.error() → requestLogger.warn() for recoverable tool-call failures and adds structured error fields; fallback path is not updated to include the new error fields. |

Comments Outside Diff (1)

  1. packages/ai/src/ai/tools/utils/logger.ts, lines 36-41

    P2 Fallback path lacks error: enrichment added to the request-logger branch

    The requestLogger path now includes error: { name, message, stack } for structured querying, but the fallback branch (no active AI request scope) still calls log.error() without those fields. Any tool failure that occurs outside a request scope — e.g. during testing or background bootstrap — will be missing error.name/error.message/error.stack in the emitted log, making field-based queries inconsistent across call sites.
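One way to close this gap, sketched under assumptions: route both branches through a shared serializer so they emit the same `error` shape. `serializeError`, `logToolFailure`, the two logger interfaces, and the `get_annotation` tool name are all hypothetical and not taken from the PR.

```typescript
type Fields = Record<string, unknown>;

// Shared helper so both branches emit an identical error schema.
function serializeError(err: Error): Fields {
  return { name: err.name, message: err.message, stack: err.stack };
}

// Hypothetical minimal interfaces for the two logging paths.
interface RequestScopedLogger {
  warn(message: string, context: Fields): void;
}
interface FallbackLogger {
  error(message: string, context: Fields): void;
}

function logToolFailure(
  message: string,
  err: Error,
  context: Fields,
  fallback: FallbackLogger,
  requestLogger?: RequestScopedLogger,
): void {
  const enriched: Fields = { error: serializeError(err), ...context };
  if (requestLogger) {
    // Active request scope: WARN on the wide event.
    requestLogger.warn(message, enriched);
  } else {
    // No request scope: standalone ERROR log, same fields.
    fallback.error(message, enriched);
  }
}

// Recording stubs to compare what each path receives.
const calls: Array<{ via: string; context: Fields }> = [];
const scoped: RequestScopedLogger = {
  warn: (_message, context) => { calls.push({ via: "request", context }); },
};
const fallback: FallbackLogger = {
  error: (_message, context) => { calls.push({ via: "fallback", context }); },
};

const failure = new Error("annotation not found");
logToolFailure("Tool call failed", failure, { tool: "get_annotation" }, fallback, scoped);
logToolFailure("Tool call failed", failure, { tool: "get_annotation" }, fallback);
```

Both recorded calls carry the same `error.name`, `error.message`, and `error.stack` keys, so field-based queries behave the same whether or not a request scope was active.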

Reviews (1): Last reviewed commit: "[superlog] Stop tool-call failures from ..."
