fix(git): detach clone context from HTTP request to survive client disconnect#273
fix(git): detach clone context from HTTP request to survive client disconnect#273joshfriend wants to merge 1 commit intomainfrom
Conversation
…sconnect When a client disconnects (e.g. request timeout) while a clone is in progress, the cancelled request context cascades into exec.CommandContext, which sends SIGKILL to the git process. This leaves an orphaned .clone-* temp directory and forces the next request to start a fresh clone from scratch. Fix by using context.WithoutCancel in startClone so the clone runs independently of the initiating request. The clone's own 30-minute timeout (CloneTimeout) still acts as a safety net.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 34f6a18600
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // (CloneTimeout) as a safety net. Without this, a client timeout | ||
| // cancels the request context, which cascades into cmd.Cancel sending | ||
| // SIGKILL to the git process. | ||
| ctx = context.WithoutCancel(ctx) |
There was a problem hiding this comment.
Keep scheduler cancellation for in-flight clone jobs
startClone now unconditionally calls context.WithoutCancel, which strips cancellation not only for HTTP request contexts but also for scheduler worker contexts. I checked the scheduler contract in internal/jobscheduler/jobs.go (Wait expects workers to exit after context cancellation), and this change means a running clone can ignore shutdown/cancel and continue until CloneTimeout (30 minutes), holding a worker and queue slot the whole time. This creates long shutdown/test drains whenever a clone job is active.
Useful? React with 👍 / 👎.
When a client disconnects while a mirror clone is in progress, the cancelled request context propagates into exec.CommandContext, which kills the git process. This leaves an orphaned .clone-* temp directory and forces the next request to start a full clone from scratch, wasting time and memory.
Fixes this by calling context.WithoutCancel in startClone so the clone runs to completion regardless of client lifecycle. The clone still has its own 30-minute timeout as a safety net.