Changing logic to deal with graphs with derived quantization spec #16357
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16357

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 unrelated failure) As of commit bafed32 with merge base 6785e1f, one job is marked as UNSTABLE, possibly due to flakiness on trunk.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
…torch#16357)

Summary:
We want to add a test for `default_addmm_A8W8` to fully finish testing `CadenceDefaultQuantizer`. However, there are a couple of changes we need to make to the testing function.

## Change 1: We allow passing `None` in the list of `QuantizationSpec`

This is because the addmm op has 3 inputs: `bias`, `mat1`, `mat2`. The bias uses a `DerivedQuantizationSpec`, which is dynamically constructed with references to the actual graph nodes (`mat1` and `mat2`). We can't construct an identical `DerivedQuantizationSpec` in the test because we'd need to reference the exact same node objects that the quantizer creates internally. Since we can't compare it directly, we use `None` to skip validation for that input. If `mat1` and `mat2` are quantized correctly, the derived bias spec will be correct too (see the sketch after this description).

https://www.internalfb.com/code/fbsource/[2cfdb40fd8b628da2f46366115516408cfb9f50f]/xplat/executorch/backends/cadence/aot/quantizer/patterns.py?lines=91-103

## Change 2: We changed how we iterate through `input_qspec_map`

`input_qspec_map` is a dictionary mapping input nodes to their qspecs. The iteration order depends on insertion order, which follows how the quantizer processes `PartitionAnchors`. Each `QuantizationPattern` implements a `get_anchors()` method that returns a `PartitionAnchors` describing which arguments are inputs, weights, biases, and nodes. This is relevant because for `addmm`, the `PartitionAnchors` lists them as `inputs=[(node, 1)], weights=[(node, 2)], biases=[(node, 0, ...)]`. So the map might iterate in the order `mat1, mat2, bias` (arg indices 1, 2, 0) rather than `bias, mat1, mat2` (arg indices 0, 1, 2). This means that our previous way of iterating wouldn't work. Thus, we now iterate as follows:

```python
for input_node, input_qspec in annotation.input_qspec_map.items():
    # Find the index of this input node in the op's args
    arg_index = None
    for i, arg in enumerate(op_node.args):
        if arg is input_node:
            arg_index = i
            break
    self.assertIsNotNone(
        arg_index,
        f"Input node {input_node} not found in op_node.args",
    )
    # Skip comparison if expected qspec is None (e.g., for DerivedQuantizationSpec)
    if expected_input_qspecs[arg_index] is not None:
        self.assertEqual(
            input_qspec,
            expected_input_qspecs[arg_index],
            f"Input qspec mismatch at arg index {arg_index}",
        )
```

The new code looks up which argument index each `input_node` corresponds to by searching through `op_node.args`, rather than assuming the enumeration index `i` matches the argument position.

Differential Revision: D88955761
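For context on why the bias spec can't be rebuilt in the test, here is a minimal sketch of how a derived bias spec is typically constructed, assuming the `DerivedQuantizationSpec` API from `torch.ao.quantization.quantizer`. The helper name `make_bias_qspec` and the qparams derivation are illustrative, not the actual code in patterns.py:

```python
import torch
from torch.ao.quantization.quantizer import DerivedQuantizationSpec


def make_bias_qspec(addmm_node):
    # addmm's args are (bias, mat1, mat2); the bias spec is derived from
    # the observers attached to the mat1 and mat2 edges.
    mat1, mat2 = addmm_node.args[1], addmm_node.args[2]

    def derive_qparams(obs_or_fqs):
        # Common convention: bias scale = activation scale * weight scale,
        # with zero point fixed at 0 for a symmetric int32 bias.
        act_scale, _ = obs_or_fqs[0].calculate_qparams()
        weight_scale, _ = obs_or_fqs[1].calculate_qparams()
        return act_scale * weight_scale, torch.zeros_like(act_scale, dtype=torch.int32)

    # derived_from holds references to the *live* graph nodes (mat1, mat2),
    # which is exactly why an identical spec cannot be reconstructed in the
    # test: the test never sees those node objects.
    return DerivedQuantizationSpec(
        derived_from=[(mat1, addmm_node), (mat2, addmm_node)],
        derive_qparams_fn=derive_qparams,
        dtype=torch.int32,
        quant_min=-(2**31),
        quant_max=2**31 - 1,
        qscheme=torch.per_tensor_affine,
    )
```

Since equality of a `DerivedQuantizationSpec` hinges on those node references, skipping it with `None` and asserting on `mat1`/`mat2` instead checks everything the test can meaningfully check.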
Pull request overview
This PR adds test coverage for several Cadence quantizers, including `CadenceDefaultQuantizer`, by enhancing the testing framework to handle graphs with derived quantization specifications. The key change is allowing `None` in the expected input quantization specs to skip validation of dynamically constructed `DerivedQuantizationSpec` objects, and updating the iteration logic to handle non-sequential argument ordering in `PartitionAnchors`.
- Added test coverage for 5 previously untested quantizer classes: `CadenceDefaultQuantizer`, `CadenceWakeWordQuantizer`, `CadenceWith16BitConvActivationsQuantizer`, `CadenceWithLayerNormQuantizer`, and `CadenceWithSoftmaxQuantizer`
- Enhanced the testing framework to support `DerivedQuantizationSpec` by allowing `None` values in expected input specs (see the sketch after this list)
- Updated iteration logic to map input nodes to argument positions dynamically, handling cases where the quantizer's internal ordering differs from argument position order
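To make the `None` convention concrete, here is a minimal sketch of what the expected-spec list for an `addmm` test could look like. The observer choice and quantization ranges are assumptions for illustration, not the actual specs the Cadence quantizer uses:

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver
from torch.ao.quantization.quantizer import QuantizationSpec

# Expected qspecs indexed by addmm's argument positions:
#   args[0] = bias -> DerivedQuantizationSpec, skipped via None
#   args[1] = mat1 -> int8 spec
#   args[2] = mat2 -> int8 spec
int8_qspec = QuantizationSpec(
    dtype=torch.int8,
    observer_or_fake_quant_ctr=MinMaxObserver,
    quant_min=-128,
    quant_max=127,
    qscheme=torch.per_tensor_affine,
)
expected_input_qspecs = [None, int8_qspec, int8_qspec]
```

The iteration logic quoted in the description then skips index 0 and compares indices 1 and 2, regardless of the order in which `input_qspec_map` yields the inputs.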
Summary: Add annotation tests for CadenceWith16BitConvActivationsQuantizer covering both conv1d and conv2d operations. Differential Revision: D88895865
Differential Revision: D88896712
Differential Revision: D88898823
Differential Revision: D88898933
Differential Revision: D88899457
…torch#16357)

Summary: Pull Request resolved: pytorch#16357 (duplicate of the PR description above)

Reviewed By: hsharma35

Differential Revision: D88955761
9f4be46 to bafed32 (compare)