
fix: streaming database dump with pagination for large databases#140

Open
aadityaranjan01 wants to merge 1 commit into outerbase:main from aadityaranjan01:fix/streaming-dump

Conversation

aadityaranjan01 commented May 2, 2026

/claim #59

Fix: Large database dumps fail due to memory and timeout limits

This PR replaces the current in-memory dump implementation with a streaming-based approach to support large databases reliably.

Problem

The existing /export/dump implementation (sketched after this list):

  • loads entire tables into memory (executeOperation → cursor.toArray())
  • builds a full dump string before responding
  • fails for large databases due to memory limits and timeouts
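
For reference, the current code has roughly the shape below. Only executeOperation and cursor.toArray() are named in this PR; tables and serializeRow are illustrative stand-ins:

// Rough sketch of the in-memory approach being replaced (illustrative,
// not the exact source): every table is fully materialized, and the
// whole dump is concatenated into one string before responding.
let dump = ''
for (const table of tables) {
    const cursor = await executeOperation(`SELECT * FROM "${table}"`)
    for (const row of cursor.toArray()) { // entire table in memory at once
        dump += serializeRow(table, row) // serializeRow is hypothetical
    }
}
return new Response(dump) // full dump built before the response starts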

Solution

This PR introduces a minimal, streaming-based fix (a sketch follows the list):

  • Uses ReadableStream to stream dump output progressively
  • Fetches rows in batches using LIMIT/OFFSET
  • Adds ORDER BY rowid to ensure stable pagination
  • Includes breathing intervals to prevent Durable Object blocking
  • Keeps all changes scoped to src/export/dump.ts
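
A minimal sketch of the streaming approach, assuming a query helper with the shape below. The helper signature, batch size, and INSERT serialization are illustrative stand-ins, not the exact code in src/export/dump.ts:

// Hypothetical shape of the Durable Object's query helper.
type SqlExec = (query: string) => Promise<Record<string, unknown>[]>

const BATCH_SIZE = 1000 // assumed batch size

// Hypothetical serializer; the real code must escape values correctly.
function toInsertStatement(table: string, row: Record<string, unknown>): string {
    const cols = Object.keys(row).join(', ')
    const vals = Object.values(row)
        .map((v) => (v === null ? 'NULL' : `'${String(v).replace(/'/g, "''")}'`))
        .join(', ')
    return `INSERT INTO "${table}" (${cols}) VALUES (${vals});\n`
}

function streamDump(sql: SqlExec, tables: string[]): Response {
    const encoder = new TextEncoder()
    let tableIndex = 0
    let offset = 0

    const stream = new ReadableStream<Uint8Array>({
        // pull() runs once per batch and only when the consumer is ready,
        // so memory stays bounded by the batch size, not the table size.
        async pull(controller) {
            if (tableIndex >= tables.length) {
                controller.close()
                return
            }
            const table = tables[tableIndex]
            // ORDER BY rowid keeps LIMIT/OFFSET pagination stable across batches.
            const rows = await sql(
                `SELECT * FROM "${table}" ORDER BY rowid LIMIT ${BATCH_SIZE} OFFSET ${offset}`
            )
            if (rows.length === 0) {
                tableIndex += 1 // table exhausted, move to the next one
                offset = 0
                return
            }
            controller.enqueue(
                encoder.encode(rows.map((row) => toInsertStatement(table, row)).join(''))
            )
            offset += rows.length

            // Breathing interval: yield to the event loop between batches
            // so the Durable Object can service other requests.
            await new Promise((resolve) => setTimeout(resolve, 0))
        },
    })

    return new Response(stream, {
        headers: {
            'Content-Type': 'application/sql',
            'Content-Disposition': 'attachment; filename="database_dump.sql"',
        },
    })
}

Pulling batches on demand (rather than enqueueing everything up front) gives natural backpressure: rows are only fetched as fast as the client downloads them.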

Key Benefits

  • Works for arbitrarily large databases
  • Avoids memory overflow
  • Avoids timeout failures
  • No new infrastructure required
  • Minimal diff, no changes to shared utilities

Example Usage

curl --location 'https://<your-worker>/export/dump' \
--header 'Authorization: Bearer <token>' \
--output database_dump.sql
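
Because the response body is streamed, curl starts writing database_dump.sql as soon as the first batch arrives rather than waiting for the server to assemble the full dump.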

Notes

  • Does not modify executeOperation or other export routes (JSON/CSV remain unchanged)
  • Designed as a safe, incremental improvement without introducing async jobs or R2 dependencies
