[DO NOT MERGE] Qualcomm AI Engine Direct - Reproduce Cat Error #19182
winskuo-quic wants to merge 2 commits into pytorch:main
…h#19159) Summary: Replace the Qualcomm concat observer path with an explicit same-domain-or-requantize model for `aten.cat`. Preserve shared qparams for `pixel_shuffle` and `pixel_unshuffle`, extend `split_with_sizes_copy` coverage, and add regressions for mismatched `cat` branches plus value-preserving ops that must use `SharedQuantizationSpec`. Differential Revision: D102626539
Summary
Please reproduce with the following command:
python backends/qualcomm/tests/test_qnn_delegate.py -k TestQNNQuantizedOperator.test_qnn_backend_cat_fixed_input --model SM8750 --device $DEVICE --build_folder build-android
When concatenating these 2 input tensors:
torch.tensor([[[[-10.0, 2.0], [3.0, 4.0]]]])
torch.tensor([[[[1.0, 3.0], [8.0, 10.0]]]])
we should ensure that the smallest and largest values across all inputs are representable with the quant_spec, which is why we cannot reuse the quant_spec of input[0] and apply it to the rest of the inputs.
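The clipping can be illustrated with a small standalone sketch. It assumes 8-bit unsigned asymmetric per-tensor quantization, which this PR does not state, and the helpers `qparams` and `fake_quant` are hypothetical names for illustration, not the ExecuTorch or QNN API. It compares reusing the range of input[0] against using the range of both inputs.

```python
def qparams(lo, hi, qmin=0, qmax=255):
    """Derive (scale, zero_point) from an observed min/max range.

    Assumption: 8-bit unsigned asymmetric quantization; the range is
    widened to include 0.0 so that zero is exactly representable.
    """
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def fake_quant(x, scale, zero_point, qmin=0, qmax=255):
    """Quantize x, clamp to [qmin, qmax], then dequantize."""
    q = max(qmin, min(qmax, round(x / scale) + zero_point))
    return (q - zero_point) * scale


# Flattened values of the two cat inputs from the description.
a = [-10.0, 2.0, 3.0, 4.0]
b = [1.0, 3.0, 8.0, 10.0]

# Case 1: qparams taken from input[0] only, covering [-10, 4].
reused = qparams(min(a), max(a))
# Case 2: qparams covering both inputs, i.e. [-10, 10].
merged = qparams(min(a + b), max(a + b))

out_reused = [fake_quant(v, *reused) for v in a + b]
out_merged = [fake_quant(v, *merged) for v in a + b]

print("reused input[0] range:", [round(v, 2) for v in out_reused])
print("range of all inputs:  ", [round(v, 2) for v in out_merged])
```

Under these assumptions, the last value (10.0) saturates at the top of the input[0] range and dequantizes to roughly 4, while the merged range keeps it close to 10.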
If we lower this model with mainline, we get a reasonable quant_spec for the graph and a reasonable output.

Mainline:
However, if we lower it with this PR, we get a quant_spec that cannot represent values from -10 to 10.

This PR (10 got clipped to 4 for last value):