2 changes: 2 additions & 0 deletions .readthedocs.yaml
@@ -9,6 +9,8 @@ build:
os: ubuntu-22.04
tools:
python: "3.10"
apt_packages:
- cmake # Install CMake system-wide

# Build documentation in the docs/ directory with Sphinx
sphinx:
9 changes: 8 additions & 1 deletion docs/changelog.rst
@@ -1 +1,8 @@
.. _changes:
========================
Release Notes
========================

.. changelog::
:changelog-url: https://fastmachinelearning.org/qonnx/release_notes.html
:github: https://github.com/fastmachinelearning/qonnx/releases/
:pypi: https://pypi.org/project/qonnx/
10 changes: 7 additions & 3 deletions docs/conf.py
@@ -72,13 +72,17 @@
extensions = ['sphinx.ext.autodoc', 'sphinx.ext.intersphinx', 'sphinx.ext.todo',
'sphinx.ext.autosummary', 'sphinx.ext.coverage', #'sphinx.ext.viewcode',
'sphinx.ext.doctest', 'sphinx.ext.ifconfig', 'sphinx.ext.mathjax',
'sphinx.ext.napoleon']
'sphinx.ext.napoleon', 'myst_parser', 'sphinx_github_changelog']

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# The suffix of source filenames.
source_suffix = '.rst'
source_suffix = {
'.rst': 'restructuredtext',
'.txt': 'restructuredtext',
'.md': 'markdown',
}

# The encoding of source files.
# source_encoding = 'utf-8-sig'
@@ -88,7 +92,7 @@

# General information about the project.
project = u'qonnx'
copyright = u'2021-2022 QONNX Contributors'
copyright = u'2021-2025 QONNX Contributors'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
20 changes: 15 additions & 5 deletions docs/index.rst
@@ -4,9 +4,10 @@ QONNX

.. note:: **QONNX** is currently under active development. APIs will likely change.

QONNX (Quantized ONNX) introduces three new custom operators -- `Quant <docs/qonnx-custom-ops/quant_op.md>`_, `BipolarQuant <docs/qonnx-custom-ops/bipolar_quant_op.md>`_ and `Trunc <docs/qonnx-custom-ops/trunc_op.md>`_ -- in order to represent arbitrary-precision uniform quantization in ONNX. This enables:
QONNX (Quantized ONNX) introduces four new custom operators -- `IntQuant`_, `BipolarQuant`_, `FloatQuant`_, and `Trunc`_
-- in order to represent arbitrary-precision uniform quantization in ONNX. This enables:

* Representation of binary, ternary, 3-bit, 4-bit, 6-bit or any other quantization.
* Representation of binary, ternary, 3-bit, 4-bit, 6-bit or any other quantization, or quantized floating-point values.

* Quantization is an operator itself, and can be applied to any parameter or layer input.

@@ -33,11 +34,13 @@ Quickstart
Operator definitions
+++++++++++++++++++++

* `Quant <docs/qonnx-custom-ops/quant_op.md>`_ for 2-to-arbitrary-bit quantization, with scaling and zero-point
* `IntQuant`_ for 2-to-arbitrary-bit quantization, with scaling and zero-point

* `BipolarQuant <docs/qonnx-custom-ops/bipolar_quant_op.md>`_ for 1-bit (bipolar) quantization, with scaling and zero-point
* `BipolarQuant`_ for 1-bit (bipolar) quantization, with scaling and zero-point

* `Trunc <docs/qonnx-custom-ops/trunc_op.md>`_ for truncating to a specified number of bits, with scaling and zero-point
* `FloatQuant`_ for arbitrary-precision-float-quantized values

* `Trunc`_ for truncating to a specified number of bits, with scaling and zero-point
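The uniform-quantization semantics these operators express can be sketched in plain Python. This is a simplified scalar illustration only, not the normative operator definition; the function name `int_quant` is hypothetical, and the rounding mode is fixed to Python's round-half-even for brevity:

```python
def int_quant(x: float, scale: float, zero_point: float,
              bitwidth: int, signed: bool = True, narrow: bool = False) -> float:
    # Derive the representable integer range from bitwidth/signed/narrow.
    if signed:
        q_min = -(2 ** (bitwidth - 1)) + (1 if narrow else 0)
        q_max = 2 ** (bitwidth - 1) - 1
    else:
        q_min = 0
        q_max = 2 ** bitwidth - 1 - (1 if narrow else 0)
    # Quantize: scale, shift by zero-point, round, clip to the integer range.
    q = round(x / scale) + zero_point
    q = min(max(q, q_min), q_max)
    # Dequantize back to the real-valued domain.
    return (q - zero_point) * scale


# 0.7 with a 4-bit signed grid of step 0.25 snaps to 0.75;
# out-of-range values saturate at the clipping bounds.
print(int_quant(0.7, 0.25, 0, 4))
print(int_quant(100.0, 0.25, 0, 4))
```

Because quantization is modeled as an ordinary operator, the same function applies equally to weights, biases, or activations.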

Installation
+++++++++++++
@@ -90,11 +93,18 @@ QONNX also uses GitHub actions to run the full test suite on PRs.

ONNX-Based Compiler Infrastructure <overview>
Tutorials <tutorials>
qonnx-custom-ops/overview
API <api/modules>
License <license>
Contributors <authors>
Change log <changelog>
Index <genindex>


* :ref:`modindex`
* :ref:`search`

.. _IntQuant: qonnx-custom-ops/intquant_v1.html
.. _BipolarQuant: qonnx-custom-ops/bipolarquant_v1.html
.. _FloatQuant: qonnx-custom-ops/floatquant_v1.html
.. _Trunc: qonnx-custom-ops/trunc_v2.html
2 changes: 1 addition & 1 deletion docs/license.rst
@@ -4,4 +4,4 @@
License
========

.. include:: ../LICENSE
.. literalinclude:: ../LICENSE
3 changes: 2 additions & 1 deletion docs/readme.rst
@@ -1,2 +1,3 @@
.. _readme:
.. include:: ../README.rst
.. include:: ../README.md

   :parser: myst_parser.sphinx_
2 changes: 2 additions & 0 deletions docs/requirements.txt
@@ -12,3 +12,5 @@ sigtools==2.0.3
sphinx==4.0.3
sphinx_rtd_theme==1.1.1
toposort==1.7.0
myst_parser
sphinx_github_changelog
12 changes: 4 additions & 8 deletions docs/tutorials.rst
@@ -10,17 +10,13 @@ All Jupyter notebooks can be found under the `notebook folder <https://github.co


* 0_how_to_work_with_onnx

* This notebook can help you to learn how to create and manipulate a simple ONNX model, also by using QONNX
This notebook can help you to learn how to create and manipulate a simple ONNX model, also by using QONNX

* 1_custom_analysis_pass

* Explains what an analysis pass is and how to write one for QONNX.
Explains what an analysis pass is and how to write one for QONNX.

* 2_custom_transformation_pass

* Explains what a transformation pass is and how to write one for QONNX.
Explains what a transformation pass is and how to write one for QONNX.

* 3_custom_op

* Explains the basics of QONNX custom ops and how to define a new one.
Explains the basics of QONNX custom ops and how to define a new one.
9 changes: 6 additions & 3 deletions src/qonnx/custom_op/registry.py
@@ -469,9 +469,12 @@ def get_ops_in_domain(domain: str) -> List[Tuple[str, Type[CustomOp]]]:
List of (op_type, op_class) tuples

Example:
ops = get_ops_in_domain("qonnx.custom_op.general")
for op_name, op_class in ops:
print(f"{op_name}: {op_class}")
::

ops = get_ops_in_domain("qonnx.custom_op.general")
for op_name, op_class in ops:
print(f"{op_name}: {op_class}")

"""
module_path = resolve_domain(domain)
ops_dict = {}
2 changes: 2 additions & 0 deletions src/qonnx/transformation/fixedpt_quantize.py
@@ -43,6 +43,7 @@ class FixedPointQuantizeParamsFromDict(Transformation):
"""
Quantize model parameters to a given fixed-point representation.
The self.max_err dictionary stores the maximum error for each quantized input after calling.

Parameters:
fixedpt_dict: Dictionary containing tensor names and their corresponding target fixed-point
data type or its canonical name
@@ -91,6 +92,7 @@ class FixedPointQuantizeParams(Transformation):
Identifies specific operations in a model (e.g., "Add", "Mul") using a filter function,
and quantizes any non-quantized input initializers to the given fixed-point representation.
The self.max_err dictionary stores the maximum error for each quantized input after calling.

Parameters:
fixedpt_dtype: The fixed-point data type or its canonical name to use for quantization.
op_filter: A lambda function to filter operations in the model graph
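The fixed-point rounding and error tracking these transformations describe can be sketched as follows. This is a hypothetical standalone helper, not the qonnx implementation: `to_fixedpt` rounds to the nearest representable step and saturates, assuming a signed representation where `int_bits` includes the sign bit.

```python
def to_fixedpt(x: float, int_bits: int, frac_bits: int) -> float:
    # Round to the nearest multiple of 2**-frac_bits ...
    step = 2.0 ** frac_bits
    q = round(x * step) / step
    # ... then saturate to the signed fixed-point range.
    lo = -(2.0 ** (int_bits - 1))
    hi = 2.0 ** (int_bits - 1) - 2.0 ** -frac_bits
    return min(max(q, lo), hi)


# Quantize a handful of values and track the worst-case error,
# analogous to the max_err bookkeeping described above.
vals = [0.3, -1.75, 5.0]
quantized = [to_fixedpt(v, int_bits=3, frac_bits=4) for v in vals]
max_err = max(abs(v - q) for v, q in zip(vals, quantized))
print(quantized, max_err)
```

Values already on the grid (like -1.75) pass through exactly, while 5.0 saturates at the largest representable value, 3.9375.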
4 changes: 2 additions & 2 deletions src/qonnx/transformation/insert.py
@@ -57,8 +57,8 @@ class InsertIdentity(Transformation):
the graph output will be replaced with a new tensor name <old_name>_identity

Parameters:
tensor_name (str): The name of the tensor where the Identity node will be inserted.
producer_or_consumer (str): Indicates whether the Identity node will be inserted before ('producer')
tensor_name (str): The name of the tensor where the Identity node will be inserted.
producer_or_consumer (str): Indicates whether the Identity node will be inserted before ('producer')
or after ('consumer') the tensor_name.

"""
24 changes: 15 additions & 9 deletions src/qonnx/transformation/qcdq_to_qonnx.py
@@ -41,7 +41,10 @@ def extract_elem_type(elem_type: int, clip_range=None) -> Tuple[int, int, bool]:
"""
Return Quant attribute specification based on element type and (optional)
clipping range.
Returns: (bitwidth, signed, is_narrow_qnt)

Returns:
(bitwidth, signed, is_narrow_qnt)

"""
is_narrow = False
# pylint: disable=no-member
@@ -82,14 +85,17 @@ class QCDQToQuant(Transformation):
during the quantization process into a QONNX Quant node. If a Clip node is
found between the QuantizeLinear+DequantizeLinear, this will be taken into
account for the Quant bitwidth calculation.
Input
-----
A model potentially quantized with QuantizeLinear, (optional) Clip and
DequantizeLinear nodes.
Output
------
A model with QuantizeLinear, Clip and DequantizeLinear nodes re-fused back into QONNX
Quant nodes.

Input:

A model potentially quantized with QuantizeLinear, (optional) Clip and
DequantizeLinear nodes.

Output:

A model with QuantizeLinear, Clip and DequantizeLinear nodes re-fused back into QONNX
Quant nodes.

"""

def __init__(self) -> None:
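The way a Clip range constrains the inferred Quant attributes can be sketched in a few lines. This is an illustrative helper under assumptions, not the actual `extract_elem_type` logic: the Clip bounds are taken to be exact integer quantization limits, and narrow-range quantization is detected when the range holds one fewer level than the bitwidth allows.

```python
import math


def quant_attrs_from_clip(clip_min: int, clip_max: int):
    # Signedness follows from whether the range dips below zero.
    signed = clip_min < 0
    # Number of representable integer levels in the clipped range.
    levels = clip_max - clip_min + 1
    # Smallest bitwidth that can hold that many levels.
    bitwidth = math.ceil(math.log2(levels))
    # Narrow range: one level fewer than the full 2**bitwidth grid,
    # e.g. [-127, 127] instead of [-128, 127] for 8 bits.
    is_narrow = levels == 2 ** bitwidth - 1
    return bitwidth, signed, is_narrow


print(quant_attrs_from_clip(-127, 127))  # narrow signed 8-bit
print(quant_attrs_from_clip(-128, 127))  # full-range signed 8-bit
```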
4 changes: 4 additions & 0 deletions src/qonnx/transformation/qonnx_to_qcdq.py
@@ -120,12 +120,14 @@ def qcdq_pattern(op, x, scale, zero_point, bitwidth, signed, narrow, rounding_mo
def is_valid_qcdq_transformation(context, x, scale, zero_point, bitwidth, signed, narrow, rounding_mode, **_) -> bool:
"""Condition to check if the Quant node can be replaced.
The following conditions must be satisfied:

- the scale, zero-point and bitwidth inputs for Quant must be statically specified
by an initializer
- the bitwidth must be an integer in the range [2, 8] # TODO: Change max bitwidth to 16 for opset >= 21
- the zero-point tensor must be zero
- the scale must be a scalar value or 1D tensor
- the rounding_mode attribute must be ROUND

"""

# Check scale
@@ -158,12 +160,14 @@ class QuantToQCDQ(Transformation):
"""Replace QONNX Quant-style quantization nodes with QuantizeLinear
-> Clip -> DequantizeLinear (QCDQ)-style quantization nodes. The following
restrictions apply on the Quant:

- the scale, zero-point and bitwidth inputs for Quant must be statically specified
by an initializer
- the bitwidth must be an integer in the range [2, 8]
- the zero-point tensor must be zero
- the scale must be a scalar value or 1D tensor
- the rounding_mode attribute must be ROUND

BipolarQuant is not (yet) supported.
"""

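The bulleted restrictions above can be condensed into a single predicate. This is a hypothetical restatement (`can_lower_to_qcdq` is not the actual qonnx function) that assumes scale, zero-point, and bitwidth are already statically known:

```python
def can_lower_to_qcdq(bitwidth, zero_point, scale_shape, rounding_mode) -> bool:
    return (
        # Bitwidth must be an integer in [2, 8].
        float(bitwidth).is_integer() and 2 <= bitwidth <= 8
        # Every zero-point element must be zero.
        and all(z == 0 for z in zero_point)
        # Scale must be a scalar or a 1D tensor.
        and len(scale_shape) <= 1
        # Only ROUND rounding is supported.
        and rounding_mode == "ROUND"
    )


print(can_lower_to_qcdq(8, [0], (), "ROUND"))     # eligible
print(can_lower_to_qcdq(16, [0], (), "ROUND"))    # bitwidth too large
```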
26 changes: 20 additions & 6 deletions src/qonnx/transformation/quantize_graph.py
@@ -144,33 +144,45 @@ class QuantizeGraph(Transformation):
as the parameters.

1) Expectations:

a) ONNX model in the ModelWrapper format.
b) Model must be cleaned using qonnx.util.cleanup.cleanup_model()
c) Batchsize to be set.

2) Steps to transform are:

Step1: Finding the input for the quant node.

Step2: Finding the consumer of the quant node output.

Step3: Finding the shape for the output tensor of quant node.

Note: The output tensor of the quant node must have the same shape as the consumer of the input
to the quant node.
to the quant node.

3) Input:

A dict "quantnode_map" specifying the criterion, positions, and input parameters like
scale, bitwidth, zeropoint, and others for a specific quantnode.

Criterion:
a) name: This will allow users to add quant nodes for specific node like "Conv_0" and "Gemm_0".

a) name:
This will allow users to add quant nodes for specific node like "Conv_0" and "Gemm_0".
Note: using this users can have quant nodes with different parameters. Ex: quantizing
"Conv_0" and "Conv_1" with bitwidth of 4 and 6, respectively.
b) op_type: This will allow users to add quant nodes for all nodes of a particular op_type such

b) op_type:
This will allow users to add quant nodes for all nodes of a particular op_type such
as, "Conv", "Gemm", and others.
Note: All quant nodes created using op_type criterion will have the same input
parameters (scale, zeropoint, bitwidth, and others.)
c) name and op_type: In this case, quant nodes will be added with precedence to "Name"
in comparison to "op_type".

c) name and op_type:
In this case, quant nodes will be added with precedence to "Name" in comparison to "op_type".

Positions: ("input", index) or ("output", index)

a) "input": indicates that the user wants to quantize the input of the selected node.
b) "output": indicates that the user wants to quantize the output of the selected node.
c) index: refers to the input/output index to quantize (a node can have multiple inputs and outputs)
@@ -188,7 +200,8 @@ class QuantizeGraph(Transformation):
5) Return:
Returns a model with new quant nodes created at the positions specified using the "quantnode_map".

6) Example:
6) Example::

quantnode_map = {"name": {"Conv_0": [(("input", 0), (1, 0, 8, 0, 1, "ROUND")),
(("input", 1), (1, 0, 8, 0, 1, "ROUND")),
(("output", 0), (1, 0, 8, 0, 1, "ROUND"))],
Expand All @@ -200,6 +213,7 @@ class QuantizeGraph(Transformation):
(("input", 1), (1, 0, 8, 0, 1, "ROUND")),
(("input", 2), (1, 0, 8, 0, 1, "ROUND")),
(("output", 0), (1, 0, 8, 0, 1, "ROUND"))]}}

"""

def __init__(self, quantnode_map):