[superlog] Use warn level for recoverable AI tool errors to stop false-positive ERROR job logs #490
Greptile Summary
This PR fixes a log-level escalation bug where recoverable AI tool call failures (e.g., a failed SQL query during an insights job) were causing the final job-level evlog event to emit at
Confidence Score: 4/5
Safe to merge; the change correctly eliminates false-positive ERROR job logs for successful runs without affecting true failure paths.
The core fix is targeted and correct. The two observations are: the Error object (stack trace) is dropped from warn-level entries, and the fallback path (no `requestLogger`) still emits at error while the scoped path now emits at warn. Neither is a blocking issue, but the level asymmetry between the two paths of the same method is worth addressing.
Only `packages/ai/src/ai/tools/utils/logger.ts` changed; the fallback branch (lines 33–38) is the one worth a second look for the log-level consistency question.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Tool encounters error<br/>toolLogger.error called] --> B{requestLogger active?}
    B -- Yes --> C["requestLogger.warn(message)<br/>hasWarn = true"]
    B -- No --> D["log.error(message)<br/>global fallback - ERROR level"]
    C --> E{Job completes}
    E -- job_status: succeeded --> F["evlog.emit()<br/>final level = WARN ✓"]
    E -- job_status: failed<br/>jobs.ts calls logger.error --> G["evlog.emit()<br/>final level = ERROR ✓"]
    D --> H[Standalone log at ERROR<br/>no job scope - no aggregation effect]
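The routing in the chart above can be sketched roughly as follows. This is an illustrative sketch, not the code in `logger.ts`: the `LevelLogger` and `ToolLogger` interfaces, the `describeError` helper, the `recordingLogger` test double, and the parameter order of `createToolLogger` are all assumptions made for the example. It also shows one possible way to keep the stack trace on warn-level entries, which the review notes is currently dropped.

```typescript
// Hypothetical shapes: illustrative stand-ins, not the real evlog or project types.
interface LevelLogger {
  warn(message: string, context?: Record<string, unknown>): void;
  error(message: string, context?: Record<string, unknown>): void;
}

interface ToolLogger {
  error(message: string, err?: unknown): void;
}

// Hypothetical helper: flatten a thrown value so its stack survives a warn-level entry.
function describeError(err: unknown): Record<string, unknown> {
  if (err instanceof Error) {
    return { name: err.name, message: err.message, stack: err.stack };
  }
  return { message: String(err) };
}

// Assumed signature: a global fallback logger plus an optional job-scoped logger.
function createToolLogger(fallback: LevelLogger, requestLogger?: LevelLogger): ToolLogger {
  return {
    error(message: string, err?: unknown): void {
      const context = err === undefined ? undefined : { error: describeError(err) };
      if (requestLogger) {
        // Scoped path: a recoverable tool failure is recorded as a warning.
        requestLogger.warn(message, context);
      } else {
        // Fallback path: no job scope, so the entry is still logged at error level.
        fallback.error(message, context);
      }
    },
  };
}

// Test double that records which level each call used.
function recordingLogger(): LevelLogger & { calls: string[] } {
  const calls: string[] = [];
  return {
    calls,
    warn: (message: string): void => { calls.push(`warn:${message}`); },
    error: (message: string): void => { calls.push(`error:${message}`); },
  };
}

const scopedLog = recordingLogger();
const globalLog = recordingLogger();
createToolLogger(globalLog, scopedLog).error("SQL query failed", new Error("syntax error"));
createToolLogger(globalLog).error("SQL query failed");
// scopedLog.calls is ["warn:SQL query failed"]
// globalLog.calls is ["error:SQL query failed"]
```

The two recorded levels make the asymmetry raised in the review visible: the same `error()` call lands at warn in one path and at error in the other. Whether the fallback should also be downgraded is a judgment call this sketch does not settle.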
Summary
Successful insights generation jobs were being logged at ERROR level whenever an intermediate AI tool call (SQL query, annotation creation) failed during the run, even when the agent recovered and completed with `job_status: succeeded`. This created noisy false-positive incidents that masked real failures.

The root cause is that `createToolLogger.error()` called `requestLogger.error()` on the job-scoped `evlog` logger, which set `hasError = true` internally. When `logger.emit()` was called at the end of a successful job, `evlog` determined the final severity using `level = hasError ? "error" : hasWarn ? "warn" : "info"`, yielding `ERROR` even though the job succeeded.

The fix is a one-line change in `packages/ai/src/ai/tools/utils/logger.ts`: the `error()` method of `createToolLogger` now calls `requestLogger.warn()` instead of `requestLogger.error()`. This means recoverable tool failures set `hasWarn = true` (final level `WARN`) instead of `hasError = true` (final level `ERROR`). True job failures still call `logger.error()` directly in `jobs.ts` and will continue to emit at ERROR level correctly.

Alternative: Instead of downgrading the tool logger, the job success path could explicitly reset `hasError` before calling `logger.emit()`. However, evlog does not expose a reset API, and the warn approach is cleaner and more semantically correct: tool errors are warnings to the agent, not failures of the job.

Incident on Superlog
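To make the severity escalation concrete, here is a minimal model of the behavior described in the summary. `ScopedLogger` and `runJob` are hypothetical names invented for this sketch and are not the evlog API; the only detail taken from the PR is the precedence expression `hasError ? "error" : hasWarn ? "warn" : "info"`.

```typescript
type Level = "info" | "warn" | "error";

// Hypothetical stand-in for a job-scoped logger; NOT the real evlog implementation.
class ScopedLogger {
  private hasError = false;
  private hasWarn = false;

  warn(_message: string): void {
    this.hasWarn = true;
  }

  error(_message: string): void {
    this.hasError = true;
  }

  // Same precedence as the expression quoted in the PR summary.
  emit(): Level {
    return this.hasError ? "error" : this.hasWarn ? "warn" : "info";
  }
}

// Simulates one job: a tool call fails at the given level, then the job
// either recovers or fails outright (in which case the job itself logs an error).
function runJob(toolFailureLevel: "warn" | "error", jobSucceeds: boolean): Level {
  const logger = new ScopedLogger();
  logger[toolFailureLevel]("SQL query failed");
  if (!jobSucceeds) {
    logger.error("job failed");
  }
  return logger.emit();
}

const beforeFix = runJob("error", true);   // "error": false positive on a successful job
const afterFix = runJob("warn", true);     // "warn": recoverable failure stays a warning
const realFailure = runJob("warn", false); // "error": true failures still escalate
```

Because the flags in this model are sticky and there is no way to clear them, a single `error()` call anywhere in the run fixes the final level, which is why the change is made at the call site rather than at emit time.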
Summary by cubic
Use warn level for recoverable tool errors so successful jobs no longer emit ERROR severity logs. This removes noisy false positives while keeping real failures at ERROR.
- Changed `createToolLogger.error` to call `requestLogger.warn(...)`, preventing `hasError` from being set and avoiding a final job-level `ERROR` on successful runs.
- True job failures still call `logger.error()` in `jobs.ts` and emit `ERROR` as expected.

Written for commit 02a2354.