ARW: add Sony "ARW 6.0" lossy decoder (compression 32766; A7R VI, A7 V) #972

Open
Kab1r wants to merge 3 commits into darktable-org:develop from Kab1r:sony-arw6

Kab1r commented Jun 27, 2026

Summary

Adds SonyArw6Decompressor, implementing Sony's "ARW 6.0" lossy codec
(TIFF Compression 32766) used by the A7R VI (ILCE-7RM6) and A7 V
(ILCE-7M5) in their Compressed / Compressed HQ RAW modes. These formats were
previously undecodable.

What this adds

  • SonyArw6Decompressor (+ SonyArw6LogToLinear LUT) — the codec itself.
  • compression == 32766 dispatch in ArwDecoder, the active-area crop, and a
    dimension-ceiling bump for the 66.8 MP sensor (which also enables the A7R VI
    lossless tiled LJpeg path).
  • ILCE-7RM6 camera entry in cameras.xml. (The ILCE-7M5 entry already
    exists.)
  • A fuzz target for the new decompressor.

Pipeline

Container tiles → per-row GCLI bitplane entropy decode → inverse reversible
5/3 wavelet (LL0 + 3 detail levels) → log→linear LUT → per-line colour
conversion → YCC→RGGB mosaic compose.
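For readers unfamiliar with the wavelet stage, here is a minimal 1-D sketch of a reversible 5/3 (LeGall) lifting step and its inverse. It is not taken from SonyArw6Decompressor: the names (`forward53`, `inverse53`, `predictTerm`, `updateTerm`, `roundTrips`), the mirrored edge handling, the rounding offsets, and the use of int32_t instead of the int16 coefficients mentioned in this PR are all assumptions for illustration. The real codec's band layout and boundary rules may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Floor division by a positive divisor (plain '/' truncates toward zero).
inline int32_t floorDiv(int32_t a, int32_t b) {
  assert(b > 0);
  const int32_t q = a / b;
  return (a % b < 0) ? q - 1 : q;
}

// Prediction for high-pass sample n from its two even neighbours.
// The right neighbour is mirrored at the end of the signal (assumption).
inline int32_t predictTerm(const std::vector<int32_t>& even, std::size_t n) {
  const int32_t left = even[n];
  const int32_t right = even[n + 1 < even.size() ? n + 1 : n];
  return floorDiv(left + right, 2);
}

// Update for low-pass sample n from its two high-pass neighbours,
// mirrored at both ends of the signal (assumption).
inline int32_t updateTerm(const std::vector<int32_t>& high, std::size_t n) {
  assert(!high.empty());
  const int32_t left = high[n > 0 ? n - 1 : 0];
  const int32_t right = high[n < high.size() ? n : high.size() - 1];
  return floorDiv(left + right + 2, 4);
}

// Forward transform: split x into a low-pass and a high-pass band.
inline void forward53(const std::vector<int32_t>& x,
                      std::vector<int32_t>& low, std::vector<int32_t>& high) {
  low.clear();
  high.clear();
  for (std::size_t i = 0; i < x.size(); i += 2)
    low.push_back(x[i]); // even samples
  for (std::size_t n = 0; 2 * n + 1 < x.size(); ++n)
    high.push_back(x[2 * n + 1] - predictTerm(low, n));
  if (high.empty())
    return; // a single sample has nothing to update against
  for (std::size_t n = 0; n < low.size(); ++n)
    low[n] += updateTerm(high, n);
}

// Inverse transform: undo the update step, then undo the prediction step.
inline std::vector<int32_t> inverse53(std::vector<int32_t> low,
                                      const std::vector<int32_t>& high) {
  assert(low.size() == high.size() || low.size() == high.size() + 1);
  if (!high.empty())
    for (std::size_t n = 0; n < low.size(); ++n)
      low[n] -= updateTerm(high, n); // low now holds the even samples
  std::vector<int32_t> x(low.size() + high.size());
  for (std::size_t n = 0; n < low.size(); ++n)
    x[2 * n] = low[n];
  for (std::size_t n = 0; n < high.size(); ++n)
    x[2 * n + 1] = high[n] + predictTerm(low, n);
  return x;
}

// Convenience check: forward followed by inverse reproduces the input.
inline bool roundTrips(const std::vector<int32_t>& x) {
  std::vector<int32_t> low;
  std::vector<int32_t> high;
  forward53(x, low, high);
  return inverse53(low, high) == x;
}
```

Because the inverse recomputes exactly the same integer prediction and update terms as the forward pass, the round trip is lossless; that is the sense of "reversible" here. A decoder would presumably apply the inverse along rows and columns once per detail level, which this 1-D sketch does not show.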

Coefficients are int16; tiles and their (up to four) components decode in
parallel under OpenMP (a no-op when it's unavailable). The per-tile sub-header
marker carries a 1-byte format tag that varies by body ('0' on the A7R VI,
'A' on the A7 V), so only its fixed "000" suffix is matched; the
black/white-level domain (1024/32800) and the DefaultCrop-derived active area
are codec-universal.
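The suffix-only marker match described above could look roughly like the following. This is a hypothetical sketch, not the code in this PR: the helper name `isTileSubHeaderMarker` and the assumption that the marker is exactly four bytes laid out as one tag byte followed by "000" are mine.

```cpp
#include <string_view>

// Hypothetical helper: accept any 1-byte format tag followed by the fixed
// "000" suffix. Assumes a 4-byte marker laid out as [tag]['0']['0']['0'].
constexpr bool isTileSubHeaderMarker(std::string_view marker) {
  return marker.size() == 4 && marker.substr(1) == "000";
}
```

Under these assumptions both "0000" (the A7R VI tag) and "A000" (the A7 V tag) are accepted, while anything with a different suffix or length is rejected.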

Reverse-engineering & validation

The format was reverse-engineered clean-room by observing the decoder's
input/output behavior; the 4096-entry log→linear table was recovered as a
black-box input→output sampling, not transcribed.

The full pipeline is validated byte-exact (MAE 0) against the reference
decoder output (Adobe DNG) on both bodies, full-frame and APS-C, in both lossy
and HQ modes.
Malformed input is rejected via ThrowRDE rather than crashing,
hanging, or invoking undefined behavior (a fuzz target is included).

I have also tested the decoder in a dirty build of darktable built against this branch and was able to edit images successfully.

AI usage

This contribution was produced with
substantial AI assistance. The AI was used as a tool by me; the codec was
reverse-engineered from, and validated stage-by-stage against, the reference
decoder output, and I take full responsibility for the change and stand behind
its correctness.

Adds SonyArw6Decompressor implementing Sony's "ARW 6.0" reversible-5/3-wavelet
lossy codec (TIFF Compression 32766), used by the A7R VI (ILCE-7RM6) and A7 V
(ILCE-7M5) "Compressed" / "Compressed HQ" RAW modes. Also adds the ILCE-7RM6
camera entry, the compression-32766 dispatch, the active-area crop, a dimension
ceiling bump for the 66.8 MP sensor (which also enables the A7R VI lossless tiled
LJpeg path), and a fuzz target. (The ILCE-7M5 entry already exists.)

Pipeline: container tiles -> per-row GCLI bitplane entropy decode -> inverse
reversible 5/3 wavelet (LL0 + 3 detail levels) -> log->linear LUT -> per-line
colour conversion -> YCC->RGGB mosaic compose. Coefficients are int16; tiles and
their (up to four) components decode in parallel under OpenMP (a no-op without
it). The per-tile sub-header marker carries a 1-byte format tag that varies by
body ('0' on the A7R VI, 'A' on the A7 V), so only its fixed "000" suffix is
matched; the black/white level domain (1024/32800) and the DefaultCrop-derived
active area are codec-universal.

The format was reverse-engineered clean-room by observing the decoder's
input/output behaviour; the 4096-entry log->linear table was recovered as a
black-box input->output sampling, not transcribed. The full pipeline is
validated byte-exact (MAE 0) against the reference decoder output (Adobe DNG) on
both bodies, full-frame and APS-C, in both lossy and HQ modes. Malformed input
is rejected via ThrowRDE rather than crashing, hanging, or invoking undefined
behaviour.

AI-usage: produced with substantial AI assistance;
the codec was reverse-engineered from and validated stage-by-stage against the
reference output.
Kab1r requested a review from LebedevRI as a code owner on June 27, 2026 06:41
@github-actions

The proposed diff is not clang-formatted.
To make this check pass, download the following patch
(via browser, you must be logged-in in order for this URL to work),
(NOTE: save it into the repo checkout dir for the snippet to work)
https://github.com/darktable-org/rawspeed/actions/runs/28281503790/artifacts/7922007367
... and run:

cd <path/to/repo/checkout> # NOTE: use your own path here
unzip clang-format.patch.zip
git stash # Temporarily stash away any preexisting diff
git apply clang-format.patch # Apply the diff
git add -u # Stage changed files
git commit -m "Applying clang-format" # Commit the patch
git push
git stash pop # Unstash preexisting diff
rm clang-format.patch.zip clang-format.patch

Kab1r added 2 commits June 26, 2026 23:45
coefDiffDecode (SonyArw6Decompressor.cpp:383) — signed integer overflow:

int32_t s = static_cast<int32_t>(static_cast<uint32_t>(acc + coef[i]) & 0xffffU);

The uint32_t cast was meant to give wraparound, but it's applied after acc + coef[i] is already computed in signed int.
The raw coefficients from decodeCoeffs/decodeSignRev can saturate to ±0x7fffffff, so the add overflows int.

The fix:

Compute the sum in int64_t (can't overflow), then mask — semantics provably identical (only the low 16 bits ever mattered):
int32_t s = static_cast<int32_t>((static_cast<int64_t>(acc) + coef[i]) & 0xffff);
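As a standalone illustration of the fix above, the sketch below isolates the widened add in a small function and compares it with an equivalent unsigned formulation. The helper names `addLow16` and `addLow16Unsigned` are hypothetical and do not appear in the PR; `acc` and `coef` are assumed to be int32_t, as the quoted line suggests.

```cpp
#include <cstdint>

// Widen to 64 bits before adding so the sum cannot overflow,
// then keep only the low 16 bits.
inline int32_t addLow16(int32_t acc, int32_t coef) {
  return static_cast<int32_t>((static_cast<int64_t>(acc) + coef) & 0xffff);
}

// Alternative with the same result: convert each operand to unsigned
// first, so the add wraps modulo 2^32, which is well defined in C++.
inline int32_t addLow16Unsigned(int32_t acc, int32_t coef) {
  return static_cast<int32_t>(
      (static_cast<uint32_t>(acc) + static_cast<uint32_t>(coef)) & 0xffffU);
}
```

Both forms agree on the extreme inputs that triggered the report (for example `INT32_MAX + INT32_MAX`), because only the low 16 bits of the sum are kept. The original line differed in that the cast was applied to the already-overflowed signed sum.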
@github-actions

The proposed diff is not clang-formatted.
To make this check pass, download the following patch
(via browser, you must be logged-in in order for this URL to work),
(NOTE: save it into the repo checkout dir for the snippet to work)
https://github.com/darktable-org/rawspeed/actions/runs/28297440066/artifacts/7926886445
... and run:

cd <path/to/repo/checkout> # NOTE: use your own path here
unzip clang-format.patch.zip
git stash # Temporarily stash away any preexisting diff
git apply clang-format.patch # Apply the diff
git add -u # Stage changed files
git commit -m "Applying clang-format" # Commit the patch
git push
git stash pop # Unstash preexisting diff
rm clang-format.patch.zip clang-format.patch

@LebedevRI
Member

I should probably post an issue documenting the current maintenance stance of this project.
(but the TLDR is: i'm not interested in touching C++ any more. at least for free.)
