fix(gguf): correct mismatched-shape error message in check_quantized_param_shape#13504
Open
Ricardo-M-L wants to merge 1 commit into huggingface:main
check_quantized_param_shape compares inferred_shape against current_param_shape, but the error message printed inferred_shape vs loaded_param_shape — and inferred_shape is derived from loaded_param_shape, so the reported mismatch was effectively self-referential and gave no signal about the model's expected shape. Print current_param_shape (what the model expected) vs inferred_shape (what the quantized weight decodes to) so the two sides of the comparison are actually visible. Noted by @Vargol in huggingface#13001.
Member
@Ricardo-M-L I am seeing that you're opening a lot of PRs in a very short period of time. I politely ask you to reduce that pace a bit.
DN6
approved these changes
Apr 22, 2026
Contributor
Author
Friendly ping: this PR has been approved. Is there anything else needed before merging? Happy to make any requested changes.
What does this PR do?
Fixes the misleading error raised by GGUFQuantizer.check_quantized_param_shape when a loaded GGUF weight doesn't match the model's expected shape.
Before
The check compares inferred_shape against current_param_shape, but the message reports inferred_shape vs loaded_param_shape. Since inferred_shape is derived from loaded_param_shape, the two values on either side of the reported "mismatch" are effectively the same thing described at different unpacking stages; the shape the model actually expected (current_param_shape) never shows up in the message.
Concretely, the 9B Q8 GGUF failure noted in #13001 produced:
…even though the model parameter was (36864, 6144), which is the real expected shape and the thing the user needs to see when diagnosing a Klein-vs-Dev/GGUF-variant mix-up.
After
Now both sides of the comparison are visible, and the "expected" side actually reflects what the model wants.
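For readers without the diff at hand, the corrected behaviour can be sketched roughly as below. This is a self-contained illustration, not the code in the PR: the function names, the exact message wording, and the mismatching shape in the usage note are hypothetical. The unpacking rule assumes a Q8_0-style layout (32 values packed into 34 bytes per block).

```python
# Illustrative sketch only; names and message text are assumptions,
# not the actual diffusers implementation.

def infer_unpacked_shape(packed_shape, type_size=34, block_size=32):
    """Map a packed byte shape to the logical (dequantized) shape.

    Defaults assume Q8_0-style blocks: 32 values stored in 34 bytes.
    """
    *leading, last = packed_shape
    return (*leading, last // type_size * block_size)


def check_quantized_shape(param_name, current_param_shape, loaded_param_shape,
                          type_size=34, block_size=32):
    """Compare what the model expects with what the GGUF weight decodes to."""
    inferred_shape = infer_unpacked_shape(loaded_param_shape, type_size, block_size)
    expected_shape = tuple(current_param_shape)
    if inferred_shape != expected_shape:
        # Report the two values that were actually compared:
        # the model's expected shape and the shape inferred from the checkpoint.
        raise ValueError(
            f"{param_name} has an expected shape of {expected_shape}, "
            f"but the loaded GGUF weight decodes to {inferred_shape}"
        )
    return True
```

With these assumptions, a packed weight of shape (36864, 6528) decodes to (36864, 6144) and passes, while a made-up packed shape such as (12288, 4352) raises an error that names both (36864, 6144) and (12288, 4096), so the model's side of the comparison is no longer hidden.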
Related
Partially addresses the error-message confusion noted by @Vargol in #13001 (comment). This PR only touches the error text — it does not change the detection logic or attempt to resolve the underlying Klein-vs-Dev GGUF shape-inference issue that @DN6 is tracking.
Before submitting
Who can review?
@DN6 @sayakpaul