PyTorch quantization observer
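base classes
name | inherits from | description |
ObserverBase | ABC, nn.Module | abstract base observer module; subclasses implement forward() and calculate_qparams() |
UniformQuantizationObserverBase | ObserverBase | common base for observers that perform uniform quantization and compute scale and zero_point |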
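As a rough illustration of the contract these base classes define, the sketch below subclasses ObserverBase directly. The class name `AbsMaxObserver`, its `abs_max` buffer, and the fixed divisor 127 are hypothetical choices made for this example; they are not part of PyTorch. It assumes a PyTorch version that exposes `torch.ao.quantization.observer`.

```python
import torch
from torch.ao.quantization.observer import ObserverBase


class AbsMaxObserver(ObserverBase):
    """Hypothetical observer: tracks the largest absolute value seen so far."""

    def __init__(self, dtype=torch.qint8):
        super().__init__(dtype=dtype)
        # Buffers are saved in the state_dict and move with the module.
        self.register_buffer("abs_max", torch.tensor(0.0))

    def forward(self, x):
        # Record statistics, then pass the input through unchanged.
        self.abs_max = torch.maximum(self.abs_max, x.detach().abs().max())
        return x

    def calculate_qparams(self):
        # Symmetric signed 8-bit mapping (assumption): scale = abs_max / 127, zero_point = 0.
        eps = torch.finfo(torch.float32).eps
        scale = torch.clamp(self.abs_max / 127.0, min=eps)
        zero_point = torch.zeros_like(scale, dtype=torch.int64)
        return scale, zero_point


x = torch.tensor([-2.0, 1.0])
custom_obs = AbsMaxObserver()
y = custom_obs(x)
custom_scale, custom_zp = custom_obs.calculate_qparams()
```

In practice the built-in observers listed in the next table cover most cases; a custom subclass is mainly useful when the statistic being tracked is not a plain min/max or histogram.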
standard observers
name | inherits from | description |
MinMaxObserver | UniformQuantizationObserverBase | computes the quantization parameters from the running min and max values |
MovingAverageMinMaxObserver | MinMaxObserver | computes the quantization parameters from the moving average of the min and max values |
PerChannelMinMaxObserver | UniformQuantizationObserverBase | computes the quantization parameters from the running per-channel min and max values |
MovingAveragePerChannelMinMaxObserver | PerChannelMinMaxObserver | computes the quantization parameters from the moving average of the per-channel min and max values |
HistogramObserver | UniformQuantizationObserverBase | records a running histogram of tensor values along with the min/max values |
PlaceholderObserver | ObserverBase | does nothing except pass its configuration to the quantized module's .from_float() |
RecordingObserver | ObserverBase | mainly for debugging; records the tensor values seen at runtime |
NoopObserver | ObserverBase | does nothing except pass its configuration to the quantized module's .from_float() |
FixedQParamsObserver | ObserverBase | returns a fixed, user-specified scale and zero_point instead of computing them from observed data |
ReuseInputObserver | ObserverBase | reuses the observer of the operator that produces the input tensor |
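To make the difference between the running and the moving-average variants concrete, here is a minimal sketch. It assumes a PyTorch version that exposes `torch.ao.quantization.observer`; the input tensors are made up for illustration.

```python
import torch
from torch.ao.quantization.observer import (
    MinMaxObserver,
    MovingAverageMinMaxObserver,
)

batch1 = torch.tensor([-1.0, 0.0, 2.0])
batch2 = torch.tensor([0.5, 3.0])

# Running min/max: the range covers every tensor seen so far.
obs = MinMaxObserver(dtype=torch.quint8, qscheme=torch.per_tensor_affine)
obs(batch1)  # observers return their input unchanged
obs(batch2)
print(obs.min_val.item(), obs.max_val.item())  # -1.0 3.0

# scale and zero_point are derived from the recorded range.
scale, zero_point = obs.calculate_qparams()

# Moving average: after the first batch, each bound moves only a fraction
# (averaging_constant) of the way toward the new batch's min/max.
ema = MovingAverageMinMaxObserver(averaging_constant=0.01)
ema(batch1)  # initialises the range to [-1.0, 2.0]
ema(batch2)  # min moves 1% of the way from -1.0 to 0.5, max 1% from 2.0 to 3.0
print(ema.min_val.item(), ema.max_val.item())
```

The moving-average observers react slowly to outliers in a single batch, which is why they are commonly paired with quantization-aware training, while the plain running observers are typical for post-training calibration.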
predefined observers
name | based on | arguments / description |
default_observer | MinMaxObserver | quant_min=0, quant_max=127 |
default_placeholder_observer | PlaceholderObserver | Default placeholder observer, usually used for quantization to torch.float16. |
default_debug_observer | RecordingObserver | Default debug-only observer. |
default_weight_observer | MinMaxObserver | dtype=torch.qint8, qscheme=torch.per_tensor_symmetric |
default_histogram_observer | HistogramObserver | quant_min=0, quant_max=127 |
default_per_channel_weight_observer | PerChannelMinMaxObserver | dtype=torch.qint8, qscheme=torch.per_channel_symmetric |
default_dynamic_quant_observer | PlaceholderObserver | dtype=torch.float, compute_dtype=torch.quint8 |
default_float_qparams_observer | PerChannelMinMaxObserver | dtype=torch.quint8, qscheme=torch.per_channel_affine_float_qparams, ch_axis=0 |
weight_observer_range_neg_127_to_127 | MinMaxObserver | dtype=torch.qint8, qscheme=torch.per_tensor_symmetric, quant_min=-127, quant_max=127, eps=2 ** -12 |
per_channel_weight_observer_range_neg_127_to_127 | PerChannelMinMaxObserver | dtype=torch.qint8, qscheme=torch.per_channel_symmetric, quant_min=-127, quant_max=127, eps=2 ** -12 |
default_float_qparams_observer_4bit | PerChannelMinMaxObserver | dtype=torch.quint4x2, qscheme=torch.per_channel_affine_float_qparams, ch_axis=0 |
default_fixed_qparams_range_neg1to1_observer | FixedQParamsObserver | scale=2.0 / 256.0, zero_point=128, dtype=torch.quint8, quant_min=0, quant_max=255 |
default_fixed_qparams_range_0to1_observer | FixedQParamsObserver | scale=1.0 / 256.0, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255 |
default_symmetric_fixed_qparams_observer | default_fixed_qparams_range_neg1to1_observer | alias; same arguments |
default_affine_fixed_qparams_observer | default_fixed_qparams_range_0to1_observer | alias; same arguments |
default_reuse_input_observer | ReuseInputObserver | no extra arguments |
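The predefined names in this table are observer factories (a class with preset arguments), not observer instances. The sketch below shows how one might instantiate them and plug them into a QConfig. It assumes a PyTorch version that exposes `torch.ao.quantization`; `my_weight_observer` is a hypothetical name that rebuilds the arguments of the weight_observer_range_neg_127_to_127 row via `with_args`, minus eps.

```python
import torch
from torch.ao.quantization import QConfig
from torch.ao.quantization.observer import (
    MinMaxObserver,
    default_observer,
    default_weight_observer,
)

# Calling a predefined factory constructs a fresh observer module.
act_obs = default_observer()
w_obs = default_weight_observer()
print(type(act_obs).__name__)                # MinMaxObserver
print(act_obs.quant_min, act_obs.quant_max)  # 0 127
print(w_obs.dtype, w_obs.qscheme)            # torch.qint8 torch.per_tensor_symmetric

# A custom factory built the same way the predefined ones are
# (hypothetical name; arguments copied from the table above).
my_weight_observer = MinMaxObserver.with_args(
    dtype=torch.qint8,
    qscheme=torch.per_tensor_symmetric,
    quant_min=-127,
    quant_max=127,
)

# QConfig expects factories, not instances, so that every layer
# gets its own observer when the model is prepared.
qconfig = QConfig(activation=default_observer, weight=my_weight_observer)
```

Passing factories rather than instances keeps statistics separate per layer; sharing one observer instance across layers would mix their ranges.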