Merged
1 change: 1 addition & 0 deletions packages/db/src/schema.ts
@@ -4361,6 +4361,7 @@ export type CloudAgentSessionRunFailureCode =
| 'assistant_error'
| 'wrapper_error_after_activity'
| 'missing_assistant_reply'
+ | 'payment_required'

@kilocode-bot Why was this removed? What replaced this?

Short answer: 'payment_required' was added, not removed. The + in the diff hunk marks a new member being inserted into the union; nothing above or below it was removed.

Why it was added

The DB enum is bound to the runtime types via a compile-time contract in session-message-state.ts:

type CodeContractMatchesReport = AssertTrue<
  CloudAgentRunFailureCode extends CloudAgentFailureCode ? true : false
>;

This PR adds payment_required in lockstep to all three layers: packages/db/src/schema.ts (here), packages/worker-utils/src/cloud-agent-failure.ts, and safe-failure-projection.ts. If it were added to CLOUD_AGENT_FAILURE_CODES but not to the DB type, the AssertTrue contract would fail to compile. The schema addition is therefore required for the Durable Object to carry a typed failureCode: 'payment_required' at all.
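A minimal sketch of how a compile-time contract of this kind can work, assuming AssertTrue is a type that only accepts `true` (its real definition is not shown in this PR). FAILURE_CODES, DbFailureCode, and isFailureCode are hypothetical stand-ins, not names from the repository:

```typescript
// Hypothetical stand-in for the runtime list of failure codes.
const FAILURE_CODES = ['assistant_error', 'missing_assistant_reply', 'payment_required'] as const;
type FailureCode = (typeof FAILURE_CODES)[number];

// Hypothetical stand-in for the DB-side union in packages/db/src/schema.ts.
type DbFailureCode = 'assistant_error' | 'missing_assistant_reply' | 'payment_required';

// Assumed definition: only `true` satisfies the constraint,
// so passing `false` is a compile-time error.
type AssertTrue<T extends true> = T;

// Compiles only while every runtime code is also a member of the DB union.
// Removing 'payment_required' from DbFailureCode would break this line.
type RuntimeCodesFitDb = AssertTrue<FailureCode extends DbFailureCode ? true : false>;

// Runtime counterpart of the same membership check.
function isFailureCode(value: string): value is FailureCode {
  return FAILURE_CODES.some(code => code === value);
}
```

The type alias has no runtime cost; it exists only so the compiler rejects a code that is present in one list and missing from the other.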

What it does

Previously, insufficient-credits failures were flattened into assistant_error, which made them indistinguishable from any generic error. This PR introduces a typed wrapper signal (WrapperTerminalFailureCodes = ['payment_required', 'model_missing']) so the wrapper can report the precise failure. terminal-error-projector.ts classifies that failure as non-retryable, which feeds the new clientError: { code, message, retryable } contract on callbacks and getMessageResult.
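As a rough illustration of that flow, the sketch below maps a wrapper terminal failure code to the clientError shape. Only the two code values and the { code, message, retryable } shape come from this PR; the function name projectWrapperFailure and the upper-casing rule are assumptions, not the logic in terminal-error-projector.ts:

```typescript
// The two values named in the comment above.
const WRAPPER_TERMINAL_FAILURE_CODES = ['payment_required', 'model_missing'] as const;
type WrapperTerminalFailureCode = (typeof WRAPPER_TERMINAL_FAILURE_CODES)[number];

// Mirrors the clientError wire shape: { code, message, retryable }.
interface ClientError {
  code: string;
  message: string;
  retryable: boolean;
}

// Hypothetical projector: both wrapper terminal codes are treated as non-retryable.
function projectWrapperFailure(code: WrapperTerminalFailureCode, message: string): ClientError {
  // Assumption: the public code is the upper-cased internal code,
  // which keeps it within the /^[A-Z][A-Z0-9_]*$/ pattern.
  return { code: code.toUpperCase(), message, retryable: false };
}
```

Retrying an insufficient-credits failure cannot succeed until the account changes, which is why a dedicated non-retryable code is more useful to clients than a generic assistant_error.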

The subtlety

payment_required is intentionally kept out of the DB column. In telemetry/queue-reports.ts, persistedFailureCode() rewrites it before writing to failure_code whenever the failure stage is agent_activity or post_dispatch_no_activity:

function persistedFailureCode(state) {
  if (state.failureCode !== 'model_missing' && state.failureCode !== 'payment_required')
    return state.failureCode;
  if (state.failureStage === 'agent_activity') return 'assistant_error';
  if (state.failureStage === 'post_dispatch_no_activity') return 'wrapper_error_before_activity';
  return state.failureCode;
}

So if the concern is that this enum value will never actually be written to the DB column, that is accurate for the agent_activity and post_dispatch_no_activity stages, and it is intentional. The value exists in CloudAgentSessionRunFailureCode to satisfy the shared contract the DO runtime depends on, not because the column is expected to store it.
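The normalization quoted above can be exercised in isolation. This is a sketch with narrowed, hypothetical types: the FailureState interface and the 'pre_dispatch' stage are assumptions made for illustration, while the branch order follows the quoted function:

```typescript
// Hypothetical, narrowed types; the real state object carries more fields.
type FailureStage = 'agent_activity' | 'post_dispatch_no_activity' | 'pre_dispatch';
type FailureCode =
  | 'assistant_error'
  | 'wrapper_error_before_activity'
  | 'model_missing'
  | 'payment_required';

interface FailureState {
  failureCode: FailureCode;
  failureStage: FailureStage;
}

// Same branch order as the quoted persistedFailureCode().
function persistedFailureCode(state: FailureState): FailureCode {
  // Codes other than the two wrapper terminal codes are stored unchanged.
  if (state.failureCode !== 'model_missing' && state.failureCode !== 'payment_required') {
    return state.failureCode;
  }
  if (state.failureStage === 'agent_activity') return 'assistant_error';
  if (state.failureStage === 'post_dispatch_no_activity') return 'wrapper_error_before_activity';
  // Any other stage falls through and keeps the original code.
  return state.failureCode;
}

const duringActivity = persistedFailureCode({
  failureCode: 'payment_required',
  failureStage: 'agent_activity',
}); // 'assistant_error'

const beforeActivity = persistedFailureCode({
  failureCode: 'payment_required',
  failureStage: 'post_dispatch_no_activity',
}); // 'wrapper_error_before_activity'
```

Note the final return: for a stage other than the two handled explicitly, the function as quoted returns the code unchanged.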

| 'user_interrupt'
| 'container_shutdown'
| 'system_interrupt'
3 changes: 2 additions & 1 deletion packages/worker-utils/package.json
@@ -23,7 +23,8 @@
"./security-auto-analysis-policy": "./src/security-auto-analysis-policy.ts",
"./security-remediation-policy": "./src/security-remediation-policy.ts",
"./security-notification-policy": "./src/security-notification-policy.ts",
- "./dependabot-dismissal-target": "./src/dependabot-dismissal-target.ts"
+ "./dependabot-dismissal-target": "./src/dependabot-dismissal-target.ts",
+ "./client-error": "./src/client-error.ts"
},
"scripts": {
"test": "vitest run",
26 changes: 26 additions & 0 deletions packages/worker-utils/src/client-error.test.ts
@@ -0,0 +1,26 @@
import { describe, expect, it } from 'vitest';

import { ClientErrorSchema, PublicErrorCodeSchema } from './client-error.js';

describe('ClientErrorSchema', () => {
it('accepts the public client error wire contract', () => {
expect(
ClientErrorSchema.parse({
code: 'PENDING_QUEUE_FULL',
message: 'Queue is full',
retryable: true,
})
).toEqual({
code: 'PENDING_QUEUE_FULL',
message: 'Queue is full',
retryable: true,
});
});

it.each(['', 'lowercase', '_PRIVATE', '9INVALID', 'HAS-DASH'])(
'rejects invalid code %j',
code => {
expect(PublicErrorCodeSchema.safeParse(code).success).toBe(false);
}
);
});
11 changes: 11 additions & 0 deletions packages/worker-utils/src/client-error.ts
@@ -0,0 +1,11 @@
import { z } from 'zod';

export const PublicErrorCodeSchema = z.string().regex(/^[A-Z][A-Z0-9_]*$/);

export const ClientErrorSchema = z.object({
code: PublicErrorCodeSchema,
message: z.string(),
retryable: z.boolean(),
});

export type ClientError = z.infer<typeof ClientErrorSchema>;
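To show which values the wire contract admits, here is a dependency-free sketch that applies the same pattern as PublicErrorCodeSchema by hand. ClientErrorLike and isClientError are hypothetical names, and unlike a zod object schema this check does not strip unknown keys:

```typescript
// Same pattern as PublicErrorCodeSchema, applied without zod.
const PUBLIC_ERROR_CODE = /^[A-Z][A-Z0-9_]*$/;

// Hypothetical mirror of the inferred ClientError type.
interface ClientErrorLike {
  code: string;
  message: string;
  retryable: boolean;
}

// Hypothetical helper: narrows an unknown wire value to the client error shape.
function isClientError(value: unknown): value is ClientErrorLike {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.code === 'string' &&
    PUBLIC_ERROR_CODE.test(candidate.code) &&
    typeof candidate.message === 'string' &&
    typeof candidate.retryable === 'boolean'
  );
}
```

The pattern requires an upper-case first letter followed only by upper-case letters, digits, or underscores, which is why the test file rejects values such as '_PRIVATE' and 'HAS-DASH'.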
1 change: 1 addition & 0 deletions packages/worker-utils/src/cloud-agent-failure.ts
@@ -27,6 +27,7 @@ export const CLOUD_AGENT_FAILURE_CODES = [
'assistant_error',
'wrapper_error_after_activity',
'missing_assistant_reply',
'payment_required',
'user_interrupt',
'container_shutdown',
'system_interrupt',
4 changes: 4 additions & 0 deletions services/cloud-agent-next/src/callbacks/types.ts
@@ -1,3 +1,5 @@
import type { CloudAgentFailureStage } from '@kilocode/worker-utils/cloud-agent-failure';
import type { ClientError } from '@kilocode/worker-utils/client-error';
import type { SafeFailureProjection } from '../session/safe-failure-projection.js';

export type CallbackTarget = {
@@ -20,6 +22,8 @@ export type ExecutionCallbackPayload = {
status: 'completed' | 'failed' | 'interrupted';
errorMessage?: string;
failure?: SafeFailureProjection;
failureStage?: CloudAgentFailureStage;
clientError?: ClientError;
/** Present when errorMessage was shortened to fit the callback queue. */
errorMessageTruncation?: CallbackTextTruncation;
lastSeenBranch?: string;
64 changes: 64 additions & 0 deletions services/cloud-agent-next/src/middleware/auth.test.ts
@@ -0,0 +1,64 @@
import { Hono } from 'hono';
import { beforeEach, describe, expect, it, vi } from 'vitest';
import type { HonoContext } from '../hono-context.js';
import type { Env } from '../types.js';

vi.mock('../auth.js', () => ({
validateKiloToken: vi.fn(),
}));

vi.mock('../logger.js', () => {
const logger = {
setTags: vi.fn(),
info: vi.fn(),
warn: vi.fn(),
error: vi.fn(),
withFields: vi.fn(),
};
logger.withFields.mockReturnValue(logger);
return {
logger,
withLogTags: async (_tags: unknown, fn: () => Promise<unknown>) => fn(),
WithLogTags:
() =>
(
_target: unknown,
_propertyKey: string,
descriptor: PropertyDescriptor
): PropertyDescriptor =>
descriptor,
};
});

const { authMiddleware } = await import('./auth.js');
const { validateKiloToken } = await import('../auth.js');

describe('authMiddleware', () => {
beforeEach(() => {
vi.clearAllMocks();
vi.mocked(validateKiloToken).mockResolvedValue({ success: false, error: 'Invalid token' });
});

it('returns a non-retryable unauthorized client error without changing message or path', async () => {
const app = new Hono<HonoContext>();
app.use('/trpc/*', authMiddleware);
app.post('/trpc/:procedure', c => c.json({ ok: true }));

const response = await app.fetch(
new Request('https://worker.test/trpc/send', { method: 'POST' }),
{ NEXTAUTH_SECRET: 'secret' } as Env
);
const body: any = await response.json();

expect(response.status).toBe(401);
expect(body.error.message).toBe('Invalid token');
expect(body.error.data).toMatchObject({
path: 'send',
clientError: {
code: 'UNAUTHORIZED',
message: 'Invalid token',
retryable: false,
},
});
});
});
34 changes: 34 additions & 0 deletions services/cloud-agent-next/src/middleware/balance.test.ts
@@ -58,6 +58,40 @@ describe('balanceMiddleware', () => {
);
}

it('returns non-retryable clientError for insufficient credits', async () => {
vi.mocked(validateBalanceOnly).mockResolvedValue({
success: false,
status: 402,
message: 'Insufficient credits',
});

const response = await postTrpc('start', {});
const body: any = await response.json();

expect(body.error.data.clientError).toEqual({
code: 'PAYMENT_REQUIRED',
message: 'Insufficient credits',
retryable: false,
});
});

it('returns retryable clientError for balance infrastructure failures', async () => {
vi.mocked(validateBalanceOnly).mockResolvedValue({
success: false,
status: 500,
message: 'Failed to verify balance',
});

const response = await postTrpc('start', {});
const body: any = await response.json();

expect(body.error.data.clientError).toEqual({
code: 'INTERNAL_SERVER_ERROR',
message: 'Failed to verify balance',
retryable: true,
});
});

it('uses nested start organization context for balance validation', async () => {
const orgId = '11111111-2222-3333-4444-555555555555';

@@ -15,6 +15,7 @@ import {
import { readProfileBundle, type SessionProfileBundle } from '../session-profile.js';
import { fitCallbackJobToQueueLimit } from '../callbacks/queue-payload.js';
import type { CallbackJob, CallbackTarget } from '../callbacks/index.js';
import { projectTerminalClientError } from '../session/terminal-error-projector.js';
import { drizzle } from 'drizzle-orm/durable-sqlite';
import { logger } from '../logger.js';
import { BUILTIN_AGENT_MODES, Limits } from '../schema.js';
@@ -345,6 +346,9 @@ export class CloudAgentSession extends DurableObject<WorkerEnv> {
executionId: execution.executionId,
status,
errorMessage: error,
...(status === 'completed'
? {}
: { clientError: projectTerminalClientError({ status, error }) }),
lastSeenBranch: metadata.repository?.upstreamBranch,
kiloSessionId: metadata.auth.kiloSessionId,
gateResult,
95 changes: 95 additions & 0 deletions services/cloud-agent-next/src/router/auth.test.ts
@@ -0,0 +1,95 @@
import { describe, expect, it } from 'vitest';
import { TRPCError } from '@trpc/server';
import { fetchRequestHandler } from '@trpc/server/adapters/fetch';

import type { TRPCContext } from '../types.js';
import { t } from './auth.js';

const testRouter = t.router({
invalid: t.procedure.query(() => {
throw new TRPCError({ code: 'BAD_REQUEST', message: 'Invalid request' });
}),
unavailable: t.procedure.query(() => {
throw new TRPCError({
code: 'SERVICE_UNAVAILABLE',
message: 'Sandbox unavailable',
cause: { error: 'SANDBOX_CONNECT_FAILED', retryable: true },
});
}),
internal: t.procedure.query(() => {
throw new TRPCError({ code: 'INTERNAL_SERVER_ERROR', message: 'Internal failure' });
}),
});

async function requestProcedure(procedure: 'invalid' | 'unavailable' | 'internal') {
return fetchRequestHandler({
endpoint: '/trpc',
req: new Request(`http://localhost/trpc/${procedure}`),
router: testRouter,
createContext: () => ({}) as TRPCContext,
});
}

describe('tRPC client error formatter', () => {
it('adds a non-retryable client error to known request failures', async () => {
const response = await requestProcedure('invalid');

await expect(response.json()).resolves.toMatchObject({
error: {
message: 'Invalid request',
data: {
code: 'BAD_REQUEST',
httpStatus: 400,
path: 'invalid',
clientError: {
code: 'BAD_REQUEST',
message: 'Invalid request',
retryable: false,
},
},
},
});
});

it('preserves explicit legacy retry fields beside the client error', async () => {
const response = await requestProcedure('unavailable');

await expect(response.json()).resolves.toMatchObject({
error: {
message: 'Sandbox unavailable',
data: {
code: 'SERVICE_UNAVAILABLE',
httpStatus: 503,
path: 'unavailable',
error: 'SANDBOX_CONNECT_FAILED',
retryable: true,
clientError: {
code: 'SANDBOX_CONNECT_FAILED',
message: 'Sandbox unavailable',
retryable: true,
},
},
},
});
});

it('defaults generic internal failures to retryable', async () => {
const response = await requestProcedure('internal');

await expect(response.json()).resolves.toMatchObject({
error: {
message: 'Internal failure',
data: {
code: 'INTERNAL_SERVER_ERROR',
httpStatus: 500,
path: 'internal',
clientError: {
code: 'INTERNAL_SERVER_ERROR',
message: 'Internal failure',
retryable: true,
},
},
},
});
});
});
29 changes: 5 additions & 24 deletions services/cloud-agent-next/src/router/auth.ts
@@ -1,34 +1,15 @@
import { initTRPC, TRPCError } from '@trpc/server';
import { timingSafeEqual } from '@kilocode/encryption';
import type { TRPCContext } from '../types.js';
-
- /**
- * Type for error cause data that should be surfaced in the response.
- * Used for structured 409 Conflict and 503 Retryable errors.
- */
- type ErrorCauseData = {
- error?: string;
- message?: string;
- retryable?: boolean;
- };
+ import { projectTrpcErrorData } from '../trpc-error.js';

// Initialize tRPC with context and error formatter
export const t = initTRPC.context<TRPCContext>().create({
errorFormatter({ shape, error }) {
- // Surface cause data in the response for specific error types
- const causeData = error.cause as ErrorCauseData | undefined;
- if (causeData && typeof causeData === 'object') {
- return {
- ...shape,
- data: {
- ...shape.data,
- // Include structured error info from cause
- ...(causeData.error && { error: causeData.error }),
- ...(causeData.retryable !== undefined && { retryable: causeData.retryable }),
- },
- };
- }
- return shape;
+ return {
+ ...shape,
+ data: projectTrpcErrorData(shape.data, shape.message, error.cause),
+ };
},
});
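The body of projectTrpcErrorData is not part of this diff. The sketch below is one way to satisfy the three cases in router/auth.test.ts, using the same argument order as the call above. Every name in it (sketchProjectTrpcErrorData, readCause, RETRYABLE_BY_DEFAULT, the interfaces) is hypothetical, and the rule that only INTERNAL_SERVER_ERROR defaults to retryable is an assumption inferred from those tests:

```typescript
interface ClientError {
  code: string;
  message: string;
  retryable: boolean;
}

interface ErrorCauseData {
  error?: string;
  retryable?: boolean;
}

interface TrpcErrorData {
  code: string;
  httpStatus: number;
  path?: string;
}

interface ProjectedErrorData extends TrpcErrorData {
  error?: string;
  retryable?: boolean;
  clientError: ClientError;
}

// Assumption: only these tRPC codes default to retryable when the cause is silent.
const RETRYABLE_BY_DEFAULT: string[] = ['INTERNAL_SERVER_ERROR'];

// Reads the legacy structured fields from an untyped cause, ignoring anything else.
function readCause(cause: unknown): ErrorCauseData {
  if (typeof cause !== 'object' || cause === null) return {};
  const raw = cause as Record<string, unknown>;
  return {
    error: typeof raw.error === 'string' ? raw.error : undefined,
    retryable: typeof raw.retryable === 'boolean' ? raw.retryable : undefined,
  };
}

function sketchProjectTrpcErrorData(
  data: TrpcErrorData,
  message: string,
  cause: unknown
): ProjectedErrorData {
  const { error, retryable } = readCause(cause);
  const clientError: ClientError = {
    // An explicit cause code wins over the generic tRPC code.
    code: error ?? data.code,
    message,
    retryable: retryable ?? RETRYABLE_BY_DEFAULT.some(code => code === data.code),
  };
  return {
    ...data,
    // Legacy fields stay beside clientError only when the cause supplied them.
    ...(error !== undefined ? { error } : {}),
    ...(retryable !== undefined ? { retryable } : {}),
    clientError,
  };
}
```

Keeping the legacy error and retryable fields next to clientError lets existing consumers keep working while new clients read the single structured object.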

26 changes: 24 additions & 2 deletions services/cloud-agent-next/src/router/schemas.test.ts
@@ -387,17 +387,39 @@ describe('getMessageResult contract', () => {
acceptedAt: 3,
terminalAt: 4,
completionSource: 'wrapper_failure',
- failure: { stage: 'agent_activity', code: 'assistant_error', attempts: 2 },
+ failure: {
+ stage: 'agent_activity',
+ code: 'assistant_error',
+ attempts: 2,
+ retryable: false,
+ },
}).success
).toBe(true);
});

+ it('requires retryability on failed and interrupted failure details', () => {
+ expect(
+ GetMessageResultOutput.safeParse({
+ ...baseOutput,
+ status: 'interrupted',
+ failure: { retryable: true },
+ }).success
+ ).toBe(true);
+ expect(
+ GetMessageResultOutput.safeParse({
+ ...baseOutput,
+ status: 'failed',
+ failure: { code: 'assistant_error' },
+ }).success
+ ).toBe(false);
+ });
+
it('fails closed on contradictory lifecycle result fields', () => {
for (const output of [
{ ...baseOutput, status: 'queued', acceptedAt: 2 },
{ ...baseOutput, status: 'queued', terminalAt: 2 },
{ ...baseOutput, status: 'running', completionSource: 'assistant_message_event' },
- { ...baseOutput, status: 'queued', failure: { attempts: 1 } },
+ { ...baseOutput, status: 'queued', failure: { attempts: 1, retryable: true } },
{ ...baseOutput, status: 'failed', assistant: { messageId: 'assistant_1', text: 'wrong' } },
{ ...baseOutput, status: 'interrupted', gateResult: 'fail' },
]) {
2 changes: 1 addition & 1 deletion services/cloud-agent-next/src/router/schemas.ts
@@ -870,7 +870,7 @@ export const GetMessageResultOutput = z
acceptedAt: z.number().optional(),
terminalAt: z.number().optional(),
completionSource: SessionMessageCompletionSourceSchema.optional(),
- failure: SafeFailureProjectionSchema.optional(),
+ failure: SafeFailureProjectionSchema.extend({ retryable: z.boolean() }).optional(),
gateResult: z.enum(['pass', 'fail']).optional(),
assistant: z
.object({