ARW: add Sony "ARW 6.0" lossy decoder (compression 32766; A7R VI, A7 V) #972
Kab1r wants to merge 3 commits into
Adds SonyArw6Decompressor implementing Sony's "ARW 6.0" reversible-5/3-wavelet
lossy codec (TIFF Compression 32766), used by the A7R VI (ILCE-7RM6) and A7 V
(ILCE-7M5) "Compressed" / "Compressed HQ" RAW modes. Also adds the ILCE-7RM6
camera entry, the compression-32766 dispatch, the active-area crop, a dimension
ceiling bump for the 66.8 MP sensor (which also enables the A7R VI lossless tiled
LJpeg path), and a fuzz target. (The ILCE-7M5 entry already exists.)
Pipeline: container tiles -> per-row GCLI bitplane entropy decode -> inverse
reversible 5/3 wavelet (LL0 + 3 detail levels) -> log->linear LUT -> per-line
colour conversion -> YCC->RGGB mosaic compose. Coefficients are int16; tiles and
their (up to four) components decode in parallel under OpenMP (a no-op without
it). The per-tile sub-header marker carries a 1-byte format tag that varies by
body ('0' on the A7R VI, 'A' on the A7 V), so only its fixed "000" suffix is
matched; the black/white level domain (1024/32800) and the DefaultCrop-derived
active area are codec-universal.
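As a rough illustration of the "parallel under OpenMP, a no-op without it" point, here is a minimal sketch. `Tile`, `decodeTile` and `decodeAllTiles` are hypothetical stand-ins, not names from this PR, and the per-tile work is a placeholder.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one decoded tile; not a type from the PR.
struct Tile {
  std::vector<int16_t> coeffs;
};

// Placeholder per-tile work: fills the buffer with the tile index.
inline void decodeTile(Tile& t, int index) {
  for (auto& c : t.coeffs)
    c = static_cast<int16_t>(index);
}

// Tiles are independent, so the loop can be split across threads.
// When the compiler is run without OpenMP support, _OPENMP is undefined,
// the pragma is skipped, and the loop simply runs serially.
inline void decodeAllTiles(std::vector<Tile>& tiles) {
  const int n = static_cast<int>(tiles.size());
#ifdef _OPENMP
#pragma omp parallel for schedule(dynamic)
#endif
  for (int i = 0; i < n; ++i)
    decodeTile(tiles[i], i);
}
```

A real decoder also has to keep exceptions from escaping the parallel region, which this sketch does not show.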
The format was reverse-engineered clean-room by observing the decoder's
input/output behaviour; the 4096-entry log->linear table was recovered as a
black-box input->output sampling, not transcribed. The full pipeline is
validated byte-exact (MAE 0) against the reference decoder output (Adobe DNG) on
both bodies, full-frame and APS-C, in both lossy and HQ modes. Malformed input
is rejected via ThrowRDE rather than crashing, hanging, or invoking undefined
behaviour.
AI-usage: produced with substantial AI assistance;
the codec was reverse-engineered from and validated stage-by-stage against the
reference output.
The proposed diff is not clang-formatted. To apply the formatting patch:
cd <path/to/repo/checkout> # NOTE: use your own path here
unzip clang-format.patch.zip
git stash # Temporarily stash away any preexisting diff
git apply clang-format.patch # Apply the diff
git add -u # Stage changed files
git commit -m "Applying clang-format" # Commit the patch
git push
git stash pop # Unstash preexisting diff
rm clang-format.patch.zip clang-format.patch
coefDiffDecode (SonyArw6Decompressor.cpp:383): signed integer overflow.
int32_t s = static_cast<int32_t>(static_cast<uint32_t>(acc + coef[i]) & 0xffffU);
The uint32_t cast was meant to give wraparound, but it is applied after acc + coef[i] has already been computed in signed int. The raw coefficients from decodeCoeffs/decodeSignRev can saturate to ±0x7fffffff, so the addition can overflow int, which is undefined behaviour.
The fix: compute the sum in int64_t (which cannot overflow here), then mask. The semantics are provably identical, since only the low 16 bits ever mattered:
int32_t s = static_cast<int32_t>((static_cast<int64_t>(acc) + coef[i]) & 0xffff);
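To make the suggested fix easy to check in isolation, here is a small sketch. `addAndWrap16` is a hypothetical helper name, and it assumes both operands are `int32_t` as in the quoted line.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the fixed expression from the review comment:
// widen to 64 bits before adding, then keep only the low 16 bits.
// The sum of two int32_t values always fits in int64_t, so no signed
// overflow can occur before the mask is applied.
constexpr int32_t addAndWrap16(int32_t acc, int32_t coef) {
  return static_cast<int32_t>((static_cast<int64_t>(acc) + coef) & 0xffff);
}
```

With both inputs saturated at 0x7fffffff the helper returns 0xfffe, whereas the original expression would already have overflowed in the signed add.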
I should probably post an issue documenting the current maintenance stance of this project.
Summary
Adds SonyArw6Decompressor, implementing Sony's "ARW 6.0" lossy codec (TIFF Compression
32766) used by the A7R VI (ILCE-7RM6) and A7 V (ILCE-7M5) in their Compressed /
Compressed HQ RAW modes. These formats were previously undecodable.
What this adds
- SonyArw6Decompressor (+ SonyArw6LogToLinearLUT): the codec itself.
- compression == 32766 dispatch in ArwDecoder, the active-area crop, and a dimension-ceiling bump for the 66.8 MP sensor (which also enables the A7R VI lossless tiled LJpeg path).
- ILCE-7RM6 camera entry in cameras.xml. (The ILCE-7M5 entry already exists.)
Pipeline
Container tiles → per-row GCLI bitplane entropy decode → inverse reversible
5/3 wavelet (LL0 + 3 detail levels) → log→linear LUT → per-line colour
conversion → YCC→RGGB mosaic compose.
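For readers unfamiliar with the wavelet stage, the following is a sketch of one level of the standard reversible 5/3 lifting transform (as in JPEG 2000) on a 1-D, even-length signal. It is an assumption that the codec's boundary handling and rounding match this textbook form; the function names are illustrative only, and the forward step is included just so the inverse can be checked by round trip.

```cpp
#include <cstddef>
#include <vector>

// Floor division by a positive divisor (plain '/' truncates toward zero).
inline int floorDiv(int a, int b) {
  const int q = a / b;
  return (a % b < 0) ? q - 1 : q;
}

// Forward step: split x into low-pass (s) and high-pass (d) halves.
inline void forward53(const std::vector<int>& x, std::vector<int>& s,
                      std::vector<int>& d) {
  const std::size_t n = x.size() / 2;
  s.assign(n, 0);
  d.assign(n, 0);
  for (std::size_t i = 0; i < n; ++i) {
    // Mirror the last even sample at the right edge.
    const int right = (i + 1 < n) ? x[2 * i + 2] : x[2 * i];
    d[i] = x[2 * i + 1] - floorDiv(x[2 * i] + right, 2);
  }
  for (std::size_t i = 0; i < n; ++i) {
    // Mirror the first detail coefficient at the left edge.
    const int left = (i > 0) ? d[i - 1] : d[0];
    s[i] = x[2 * i] + floorDiv(left + d[i] + 2, 4);
  }
}

// Inverse step: undo the two lifting steps in reverse order.
// Assumes s and d have the same length.
inline std::vector<int> inverse53(const std::vector<int>& s,
                                  const std::vector<int>& d) {
  const std::size_t n = s.size();
  std::vector<int> x(2 * n, 0);
  for (std::size_t i = 0; i < n; ++i) {
    const int left = (i > 0) ? d[i - 1] : d[0];
    x[2 * i] = s[i] - floorDiv(left + d[i] + 2, 4);
  }
  for (std::size_t i = 0; i < n; ++i) {
    const int right = (i + 1 < n) ? x[2 * i + 2] : x[2 * i];
    x[2 * i + 1] = d[i] + floorDiv(x[2 * i] + right, 2);
  }
  return x;
}
```

Because each lifting step only adds or subtracts an integer computed from the other half, the round trip is exact in integer arithmetic, which is what "reversible" refers to.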
Coefficients are int16; tiles and their (up to four) components decode in
parallel under OpenMP (a no-op when it's unavailable). The per-tile sub-header
marker carries a 1-byte format tag that varies by body ('0' on the A7R VI, 'A'
on the A7 V), so only its fixed "000" suffix is matched; the black/white-level
domain (1024/32800) and the DefaultCrop-derived active area are codec-universal.
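The suffix-only match can be pictured roughly as below. This is a sketch under a loud assumption: it treats the marker as a 4-byte field with the tag byte first and "000" after it, which is a guess for illustration and not a layout stated in this PR. `isSubHeaderMarker` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical check: ignore the body-specific tag byte and require only
// the fixed "000" suffix. The 4-byte layout (tag, then "000") is assumed.
inline bool isSubHeaderMarker(const uint8_t* p, std::size_t len) {
  if (p == nullptr || len < 4)
    return false; // too short to hold tag + suffix
  return p[1] == '0' && p[2] == '0' && p[3] == '0';
}
```

Under that assumption both "0000" (A7R VI) and "A000" (A7 V) are accepted, and a future body with a different tag would be too.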
Reverse-engineering & validation
The format was reverse-engineered clean-room by observing the decoder's
input/output behavior; the 4096-entry log→linear table was recovered as a
black-box input→output sampling, not transcribed.
The full pipeline is validated byte-exact (MAE 0) against the reference
decoder output (Adobe DNG) on both bodies, full-frame and APS-C, in both lossy
and HQ modes.
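For reference, "MAE 0" here means the mean absolute error between the decoded samples and the reference samples is exactly zero. A minimal sketch of such a comparison, with a hypothetical function name and assuming both images are available as flat 16-bit buffers:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>
#include <vector>

// Mean absolute error between two equally sized sample buffers.
inline double meanAbsoluteError(const std::vector<uint16_t>& a,
                                const std::vector<uint16_t>& b) {
  if (a.size() != b.size() || a.empty())
    throw std::invalid_argument("buffers must be non-empty and equal in size");
  uint64_t sum = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    // Widen before subtracting so the difference cannot wrap.
    const int32_t diff = static_cast<int32_t>(a[i]) - static_cast<int32_t>(b[i]);
    sum += static_cast<uint64_t>(std::abs(diff));
  }
  return static_cast<double>(sum) / static_cast<double>(a.size());
}
```

Since every term is non-negative, the result is 0 only when the two buffers are identical sample for sample.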
Malformed input is rejected via ThrowRDE rather than crashing, hanging, or
invoking undefined behavior (a fuzz target is included).
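The general pattern behind "reject rather than crash" is to check bounds before every read and raise an error on failure. The sketch below is illustrative only: `CheckedReader` is a hypothetical class, and `std::runtime_error` stands in for ThrowRDE, whose exact form is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical bounds-checked reader over an untrusted byte buffer.
class CheckedReader {
  const uint8_t* data_;
  std::size_t size_;
  std::size_t pos_ = 0; // invariant: pos_ <= size_

public:
  CheckedReader(const uint8_t* data, std::size_t size)
      : data_(data), size_(size) {}

  // Reads a little-endian 16-bit value, or throws if fewer than
  // two bytes remain (std::runtime_error stands in for ThrowRDE).
  uint16_t readU16LE() {
    if (size_ - pos_ < 2)
      throw std::runtime_error("truncated input");
    const uint16_t v = static_cast<uint16_t>(
        data_[pos_] | (static_cast<unsigned>(data_[pos_ + 1]) << 8));
    pos_ += 2;
    return v;
  }
};
```

Writing the check as `size_ - pos_ < 2` rather than `pos_ + 2 > size_` avoids any chance of the addition wrapping.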
I have also personally tested the decoder with a dirty build of Darktable built against this branch and was able to edit images successfully.
AI usage
This contribution was produced with
substantial AI assistance. The AI was used as a tool by me; the codec was
reverse-engineered from, and validated stage-by-stage against, the reference
decoder output, and I take full responsibility for the change and stand behind
its correctness.