[superlog] Add queryTimeoutMs to cacheable() to prevent indefinite DB hangs#489
[superlog] Add queryTimeoutMs to cacheable() to prevent indefinite DB hangs#489superlog-app[bot] wants to merge 1 commit into
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
The latest updates on your projects. Learn more about Unkey Deploy
|
|
Found 7 test failures on Blacksmith runners: Failures
|
Greptile SummaryThis PR adds an optional
Confidence Score: 4/5The core fix is correct and safe to merge. The The apps/basket/src/hooks/auth.ts — Important Files Changed
|
|
|
||
| const results = await Promise.allSettled([ | ||
| cached(), | ||
| cached(), | ||
| cached(), | ||
| ]); | ||
|
|
||
| for (const result of results) { | ||
| expect(result.status).toBe("rejected"); | ||
| if (result.status === "rejected") { | ||
| expect(result.reason.message).toBe("Query timeout"); | ||
| } | ||
| } | ||
| }, 5000); | ||
|
|
||
| it("does not time out fast functions", async () => { | ||
| const cached = cacheable(async () => "quick", { | ||
| expireInSec: 60, | ||
| prefix: "qtimeout-fast", | ||
| queryTimeoutMs: 1000, | ||
| }); |
There was a problem hiding this comment.
Missing test for background stale-while-revalidate timeout path
The triggerBackgroundRevalidation function now accepts queryTimeoutMs and uses withTimeout around the fn() call (line 223–225 of cacheable.ts), but the test suite has no case that exercises this path. A test with staleWhileRevalidate: true and a slow function would confirm the background fetch is bounded and that the stale value is still served to the caller (i.e. the timeout in background doesn't surface to callers).
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Summary
The basket
/batchendpoint returned after ~943 seconds (nearly 16 minutes) for a single request. Thecacheable()wrapper inpackages/redishas a 2-second timeout for Redis operations but no timeout for the underlying function call. When the PostgreSQLwebsitestable lookup stalled, the request — and all concurrent requests for the same website ID deduplicated by theinflightRequestsmap — were frozen for the entire duration of the DB stall.The
cacheable()wrapper was extended with an optionalqueryTimeoutMsoption. When set, the underlyingfn()call is wrapped withwithTimeout(..., queryTimeoutMs, "Query timeout")on all execution paths: cache miss, Redis-unavailable bypass, and background stale-while-revalidate.getWebsiteByIdWithOwnerCachedinapps/basket/src/hooks/auth.tsis set toqueryTimeoutMs: 10_000(10 seconds), matching the p99 expectation for a simple PK lookup.An alternative remediation would be to set
statement_timeouton the PostgreSQL connection pool so the database itself enforces the limit — this would also protect all other Drizzle queries that don't go throughcacheable. Both approaches are complementary and can be combined.Incident on Superlog
Was this PR helpful? Leave feedback — goes straight to the Superlog team.
Summary by cubic
Add a per-call timeout to
cacheable()to prevent long DB stalls from freezing requests. Applied a 10s timeout to Basket’s website lookup so requests fail fast instead of hanging.queryTimeoutMstopackages/rediscacheable(), wrapping the underlying function withwithTimeout(..., "Query timeout")on cache miss, Redis-bypass, and background stale-while-revalidate.inflightRequestson timeout; concurrent callers share the timed promise and receive the timeout error.queryTimeoutMs: 10_000forgetWebsiteByIdWithOwnerCachedinapps/basket/src/hooks/auth.ts.Written for commit 24c9674. Summary will update on new commits.