Decode RESP with byte-accurate slicing (fix crash on non-ASCII)#8
Conversation
decode-one computes every offset in bytes (char-at, index-of-from), but extracted values with String.slice, which is codepoint-indexed. On any RESP value containing multi-byte UTF-8 — bulk strings are binary-safe — the byte offsets ran past the decoded codepoint array, segfaulting the decoder; in less extreme cases it silently truncated or garbled the value. It was also quadratic: String.slice rebuilds a codepoint array of the whole accumulated buffer on every call. Switch the seven slices in decode-one to String.byte-slice, and add regression tests for a multi-byte bulk string and a multi-byte element inside an array (both crash the old decoder).
There was a problem hiding this comment.
Build & Tests
Checked out claude/resp-byte-slice, built and ran carp -x test/resp.carp: builds clean, all 79 tests pass. CI is green on both platforms.
I verified the core change is genuinely correct, end to end:
String.length,char-at, andindex-of-fromare all byte-based at the C level (strlen,(*s)[i], and a byte-wise scan inString_index_of_any_from), and the RESP length prefix is a byte count — so every offsetdecode-onecomputes is a byte offset, and extracting with byte-indexedbyte-sliceis the right primitive. The old codepoint-indexedslicewas mismatched.- Confirmed the bug on the old code: decoding
$…\r\nnaïve café\r\nreturned"naïve café\r\n…"at 14 bytes instead of the correct 12 — silent corruption (the codepoint slice ran past the intended end). The segfault in the PR description is just a more extreme manifestation of the same mismatch. - Confirmed the new code round-trips multi-byte payloads correctly:
😀x→$5, embedded-CRLF binary-safe bulk strings, nested arrays withcafé, 4-byte emoji, etc. The encoder's$%dprefix usesString.length(bytes), which now matches the decoder. Good.
So the byte-vs-codepoint fix itself is correct and should be kept.
Findings
1. (blocking) The unbounded bulk-string slice turns a partial/oversized payload into an out-of-bounds heap read + a silently-corrupt "successful" decode.
redis.carp:74 slices the bulk body without first checking the buffer actually contains len (+2 for the trailing CRLF) bytes:
(Str (String.byte-slice s data-start (+ data-start len)))
String.byte-slice is a raw memcpy(ptr, *s + a, b - a) with no bounds check. When data-start + len exceeds (String.length s), it reads past the buffer. This is reachable on the normal streaming path, not just on malicious input:
Redis.read (redis.carp:167) accumulates socket chunks and calls decode-n &acc 1 after each read. So any bulk value that hasn't fully arrived yet is decoded from a partial buffer. I simulated exactly that mid-stream buffer:
decode-n &"$20\r\nhello" 1(20-byte value declared, 5 bytes arrived):- new code: returns 1 result — it treats the incomplete value as complete (after
memcpy-ing 20 bytes out of a 5-byte tail), soreadstops accumulating and hands back a string padded with heap garbage. - old code: aborts on the
Array.unsafe-nthbounds assertion (exit -6).
- new code: returns 1 result — it treats the incomplete value as complete (after
$100\r\nhi\r\n(oversized length, e.g. a truncated/malformed reply): new codememcpys 100 bytes from a 4-byte tail (a ~96-byte over-read; it only looks fine becausestrlenstops at the embedded\0); old code aborts.
So both versions mishandle incomplete bulk data, but the switch to unchecked byte-slice converts a loud crash into a silent out-of-bounds read + corrupted result (CWE-125). For a Redis client this is a real functional regression — any bulk value larger than a single socket read (serialized blobs, large JSON, etc. — routine for Redis) would silently decode to garbage instead of waiting for the rest of the payload. It's also a memory-safety hole reachable from a buggy/hostile server.
Fix — guard the slice with a completeness check and signal "not yet complete" the same way the simple-string path already does (returning Error, which makes decode-n yield 0 results so Redis.read keeps accumulating):
(let [data-start (+ idx 2)]
(if (< (String.length s) (+ data-start (+ len 2)))
(Error @"Incomplete bulk string")
(Success
(Pair.init
(Str (String.byte-slice s data-start (+ data-start len)))
(+ data-start (+ len 2))))))
This removes the over-read and fixes streaming of large bulk values (they currently can't decode at all — old crashes, new corrupts). Worth a regression test that feeds decode-n a deliberately-truncated bulk buffer and asserts 0 results. (The partial-read mishandling pre-dates this PR, but since the PR is rewriting these exact slices and changes the failure mode from crash to silent memory-unsafety, the guard belongs here.)
Verdict: revise
The byte-vs-codepoint fix is correct and well-targeted — keep it. But as written it leaves the bulk-string slice unbounded, so partial/oversized payloads (a normal streaming case) become an out-of-bounds heap read and a silently-corrupt decode. Add the length guard above before merging.
The byte-accurate rewrite still sliced a bulk string using its declared length without checking the bytes had actually arrived, so a partial streaming read (or a truncated/oversized reply) memcpy'd past the buffer — an out-of-bounds read that silently returned a corrupt value. decode-one now guards the slice: an incomplete payload reports as incomplete (Error), so Redis.read keeps accumulating until the whole value arrives, and a malformed negative or overflow-sized length is rejected rather than driving byte-slice past its start. The completeness check compares the declared length against the available bytes without adding to it, so a near-Int-max length can't overflow the guard. Adds regression tests for partial, oversized, huge, and negative lengths.
|
Addressed the blocking finding in 3c4446b — the bulk-string slice is now bounded to the bytes actually received.
Regression tests added to |
There was a problem hiding this comment.
Build & Tests
Checked out claude/resp-byte-slice at 3c4446b and ran the full CI pipeline locally:
- Build:
carp -b examples/simple.carp— clean (exit 0; only the pre-existing benignprn/StringBufwarning). - Tests:
carp -x test/resp.carp— 83 passed, 0 failed, including the 4 new regression tests (partial, oversized, huge/overflow, negative). - Lint/format:
anglerandcarp-fmt --checkon both changed files — both exit 0, no findings. - CI: green on
ubuntu-latestandmacos-latest.
Prior feedback
The one blocking finding from round 1 — the unbounded bulk-string slice turning a partial/oversized payload into an out-of-bounds heap read plus a silently-corrupt "successful" decode — is properly addressed by the cond guard in decode-one (redis.carp:68–83). I re-ran the exact scenario I flagged rather than taking it on faith:
decode-n "$20\r\nhello" 1(20 declared, 5 present): now returns 0 results (Error "Incomplete bulk string") instead ofmemcpy-ing 20 bytes out of a 5-byte tail and reporting success. SoRedis.readkeeps accumulating, and the same buffer completes correctly once the rest arrives ("$5\r\nhel"→ 0, then"$5\r\nhello\r\n"→ 1 with the right value). Streaming of large bulk values — which previously either crashed or corrupted — now works.
The refinement over my sketched fix is correct and genuinely better: keeping len isolated in (> len (- available 2)) instead of (< (String.length s) (+ data-start (+ len 2))) avoids overflowing the comparison on a near-INT_MAX declared length (Int is 32-bit on some targets), which would otherwise wrap negative and slip back into an OOB slice. The $2147483647\r\n… test exercises exactly that and passes. The explicit (< len 0) rejection of non--1 negatives is a good addition too.
Findings
I wrote 16 adversarial cases beyond the suite to try to break the guard — all pass:
- No over-read / no crash on: mid-stream partial, oversized-with-tail (
$100\r\nhi\r\n),"$5\r"(whereavailablegoes negative), and empty-bulk-missing-CRLF ($0\r\n). - Behavior preserved on: exact-fit boundary (
available == len+2decodes — no off-by-one false "incomplete"), empty bulk ($0\r\n\r\n→""), binary-safe payload with an embedded CRLF ($4\r\na\r\nb\r\n→"a\r\nb", sliced by length not delimiter), multi-byte split reassembly (naïve= 5 bytes), two back-to-back bulk strings advancing correctly, null bulk ($-1), andINT_MINlength rejected.
Bounds check out analytically: on the success path available >= len+2, so the slice upper bound data-start + len <= String.length s - 2 and the returned next-position data-start + len + 2 <= String.length s — both in range. The other six byte-slice conversions (+, -, :, the $/* length prefixes, and the default arm) all bound their upper index by a found \return position or pos, so none can over-read. The byte-vs-codepoint fix itself remains correct across all seven slices.
One pre-existing, non-blocking note for the record: the bulk path advances +2 past the payload without verifying those two bytes are actually CRLF — e.g. $2\r\nhiXY decodes to "hi" and swallows XY as the terminator. This is in-bounds and memory-safe, and it's unchanged from before this PR (the old code did the same), so it's out of scope here — just flagging it as a possible future hardening, not something to fix in this PR.
Verdict: merge
The byte-vs-codepoint fix is correct across all seven slices, and the blocking OOB finding from round 1 is now fixed with an overflow-safe guard that's better than what I originally sketched. Build, tests, lint, and format are all clean locally and CI is green. No changelog exists in this repo, so there's nothing to update there.
What
RESP.decode-onecomputes every offset in bytes —String.char-at,String.index-of-fromand the bulk-string length prefix are all byte-based — but it extracted the actual values withString.slice, which is codepoint-indexed (from-chars (Array.slice (chars s) …)).This swaps the seven slices in
decode-onetoString.byte-slice.Why this matters
RESP bulk strings are binary-safe, so a perfectly valid reply can contain multi-byte UTF-8. When it does, the byte offsets index past the decoded codepoint array and the decoder segfaults (I confirmed exit code -11 on the new tests against the old code). In less extreme cases it silently truncates or garbles the value — e.g.
$3\r\n€\r\ndecoding to"€\r\n"instead of"€".It's also a performance fix:
String.slicerebuilds a codepoint array of the whole accumulated read buffer on every call, making array/pipeline decoding O(n²);byte-sliceis a direct byte copy.Tests
Two regression tests added to
test/resp.carp, both of which crash the old decoder and pass with the fix:"naïve café")"héllo"+ an integer)Full suite: 79 passed, 0 failed locally (
carp -x test/resp.carp).