Fix silent acceptance of invalid escape sequences in string parser by carpentry-agent[bot] · Pull Request #2 · carpentry-org/carp-reader

carpentry-agent · 2026-06-10T02:33:31Z

Summary

The string parser silently accepted invalid escape sequences instead of returning parse errors:

Unknown escapes like \z, \q, \0 emitted raw backslash + character bytes, masking bugs in user code
\u and \U escapes only checked that enough bytes were available but didn't validate that those bytes were hex digits — \uGHIJ silently produced a wrong codepoint via read-hex

Changes

The fallthrough case in the escape handler now sets an error instead of emitting raw bytes
\u and \U handlers now validate all required digits are hex before calling read-hex
Changed err from Bool to String so each error site carries a descriptive message (e.g. "valid escape sequence", "hex digits in \u escape")
Added tests for invalid escapes (\z, \q, \0, \x without hex, \xZZ, \uGHIJ, \U0000GHIJ) and valid escapes (\t, \x41, \u0041)
CHANGELOG updated

Opened by the carpentry-org heartbeat agent (Claude). hellerve has not reviewed this yet.

The string parser silently accepted unknown escape sequences (e.g. \z, \0) by emitting raw backslash + character bytes. It also accepted non-hex characters in \u and \U escapes, silently producing wrong codepoints via read-hex. Now all three cases return a parse error with a descriptive message: - Unknown escapes: "expected: valid escape sequence" - \x without hex digits: "expected: hex digits after \x" - \u/\U with non-hex chars: "expected: hex digits in \u/\U escape" Changed err from Bool to String to carry per-error messages.

carpentry-reviewer

Build & Tests

Build passes. All 60 tests pass (macOS + Ubuntu CI green, verified locally). The 9 new tests cover valid escapes (\t, \x41, \u0041), invalid unknowns (\z, \q, \0), and hex-digit validation for \x, \u, and \U.

Findings

The changes are correct across all three escape categories:

Unknown escapes (\z, \q, \0): the cond fallthrough now sets err @"valid escape sequence" instead of silently emitting raw bytes. The error message reads as "expected valid escape sequence" via ParseErr.expected-at — correct for the Parsec-style format.
\u / \U hex validation: the new all-hex loop correctly validates all required digits before calling read-hex. Previously \uGHIJ would silently produce a bogus codepoint.
\x handler: already had hex-digit validation via the while loop with hex-digit? — the PR only changes its error message from (set! err true) to (set! err @"hex digits after \\x"). No logic change needed here.
err type change from Bool to String: the check (Int.= 0 (String.length &err)) is equivalent to the old (not err), and (Int.> (String.length &err) 0) is equivalent to err. Correct.

One minor note (not blocking): invalid Unicode codepoints beyond U+10FFFF (e.g. \U00110000) pass hex validation and reach push-utf8-bytes, which would produce invalid UTF-8. This is pre-existing behavior and out of scope for this PR.

CHANGELOG is updated.

Verdict: merge

Solid bug fix with good test coverage. The silent acceptance of invalid escapes was a real problem — \0 producing literal \0 bytes instead of erroring is the kind of thing that hides bugs for months.

carpentry-reviewer Bot reviewed Jun 10, 2026

View reviewed changes

hellerve approved these changes Jun 10, 2026

View reviewed changes

hellerve merged commit 2abee60 into main Jun 10, 2026
2 checks passed

hellerve deleted the claude/fix-invalid-escape-sequences branch June 10, 2026 07:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix silent acceptance of invalid escape sequences in string parser#2

Fix silent acceptance of invalid escape sequences in string parser#2
hellerve merged 1 commit into
mainfrom
claude/fix-invalid-escape-sequences

carpentry-agent Bot commented Jun 10, 2026

Uh oh!

carpentry-reviewer Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

carpentry-agent Bot commented Jun 10, 2026

Summary

Changes

Uh oh!

carpentry-reviewer Bot left a comment

Choose a reason for hiding this comment

Build & Tests

Findings

Verdict: merge

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant