fix(gguf): correct mismatched-shape error message in check_quantized_param_shape#13504

Open
Ricardo-M-L wants to merge 1 commit into huggingface:main from Ricardo-M-L:fix/gguf-shape-error-message

Conversation

@Ricardo-M-L
Contributor

What does this PR do?

Fixes the misleading error raised by GGUFQuantizer.check_quantized_param_shape when a loaded GGUF weight doesn't match the model's expected shape.

Before

inferred_shape = _quant_shape_from_byte_shape(loaded_param_shape, type_size, block_size)
if inferred_shape != current_param_shape:
    raise ValueError(
        f"{param_name} has an expected quantized shape of: {inferred_shape}, "
        f"but received shape: {loaded_param_shape}"
    )

The check compares inferred_shape against current_param_shape, but the message reports inferred_shape vs loaded_param_shape. Since inferred_shape is derived from loaded_param_shape, the two values on either side of the reported "mismatch" are effectively the same thing described at different unpacking stages — the shape the model actually expected (current_param_shape) never shows up in the message.
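To see why the two reported values are redundant, here is a minimal sketch of the byte-to-element arithmetic the helper performs (the real _quant_shape_from_byte_shape lives in diffusers; the block parameters below are illustrative):

```python
def quant_shape_from_byte_shape(byte_shape, type_size, block_size):
    # Each block of `block_size` elements is packed into `type_size` bytes,
    # so only the last (byte) dimension needs rescaling back to elements.
    *rest, last = byte_shape
    return (*rest, last // type_size * block_size)

# Illustrative numbers: a (4096, 2304)-byte tensor with 32-element blocks
# stored in 18 bytes each decodes to (4096, 4096) elements.
inferred = quant_shape_from_byte_shape((4096, 2304), type_size=18, block_size=32)
print(inferred)  # (4096, 4096)
```

Both the byte shape and the decoded shape describe the same loaded tensor, so printing one against the other carries no information about current_param_shape.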

Concretely, the 9B Q8 GGUF failure noted in #13001 produced:

ValueError: double_stream_modulation_img.linear.weight has an expected quantized shape of: (24576, 4096), but received shape: torch.Size([24576, 8192])

…even though the model parameter was (36864, 6144), which is the real expected shape and the thing the user needs to see when diagnosing a Klein-vs-Dev/GGUF-variant mix-up.

After

<param_name> has an expected shape of: <current_param_shape>, but the loaded GGUF weight decodes to shape: <inferred_shape>

Now both sides of the comparison are visible, and the "expected" side actually reflects what the model wants.
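In code, the fix amounts to swapping which shapes the message interpolates. A self-contained sketch (the decode helper is inlined for illustration; names mirror the snippet above, but this is not the exact diff):

```python
def check_quantized_param_shape(param_name, current_param_shape, loaded_param_shape,
                                type_size, block_size):
    # Decode the on-disk byte shape back to an element shape (same arithmetic
    # as diffusers' _quant_shape_from_byte_shape, inlined here).
    *rest, last = loaded_param_shape
    inferred_shape = (*rest, last // type_size * block_size)
    if inferred_shape != current_param_shape:
        # Report what the model expects vs. what the GGUF weight decodes to.
        raise ValueError(
            f"{param_name} has an expected shape of: {current_param_shape}, "
            f"but the loaded GGUF weight decodes to shape: {inferred_shape}"
        )
    return True
```

With a message shaped like this, a variant mix-up surfaces the model's actual expected shape directly instead of two views of the loaded tensor.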

Related

Partially addresses the error-message confusion noted by @Vargol in #13001 (comment). This PR only touches the error text — it does not change the detection logic or attempt to resolve the underlying Klein-vs-Dev GGUF shape-inference issue that @DN6 is tracking.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you write any new necessary tests? — N/A; this is a one-line error-message correction with no behavior change.

Who can review?

@DN6 @sayakpaul

check_quantized_param_shape compares inferred_shape against
current_param_shape, but the error message printed inferred_shape
vs loaded_param_shape — and inferred_shape is derived from
loaded_param_shape, so the reported mismatch was effectively
self-referential and gave no signal about the model's expected shape.

Print current_param_shape (what the model expected) vs inferred_shape
(what the quantized weight decodes to) so the two sides of the
comparison are actually visible.

Noted by @Vargol in huggingface#13001.
@github-actions github-actions Bot added quantization size/S PR with diff < 50 LOC labels Apr 19, 2026
@sayakpaul sayakpaul requested a review from DN6 April 21, 2026 14:25
@sayakpaul
Member

@Ricardo-M-L I am seeing that you're opening a lot of PRs in a very short period of time. I politely ask you to reduce that pace a bit.

@Ricardo-M-L
Contributor Author

Friendly ping — this PR has been approved. Is there anything else needed before merging? Happy to make any requested changes.


