PyTorch quantization observer


Base classes

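| Name | Inherits from | Description |
| --- | --- | --- |
| ObserverBase | ABC, nn.Module | Base observer module. |
| UniformQuantizationObserverBase | ObserverBase | Common base for observers that use uniform quantization to compute scale and zero_point. |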
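
To make the base-class contract concrete, here is a minimal sketch of a custom observer. It assumes that a subclass of `ObserverBase` only needs to implement `forward` and `calculate_qparams`. The class name `AbsMaxObserver` and its `abs_max` buffer are hypothetical and are not part of PyTorch.

```python
import torch
from torch.ao.quantization.observer import ObserverBase


class AbsMaxObserver(ObserverBase):
    """Hypothetical observer: tracks the largest absolute value seen so far."""

    def __init__(self, dtype=torch.qint8):
        super().__init__(dtype=dtype)
        # A buffer so the statistic moves with the module and is saved in state_dict.
        self.register_buffer("abs_max", torch.tensor(0.0))

    def forward(self, x):
        # Record statistics, then return the input unchanged.
        self.abs_max = torch.maximum(self.abs_max, x.detach().abs().max())
        return x

    def calculate_qparams(self):
        # Symmetric int8 mapping: the largest magnitude maps to 127, zero_point is 0.
        scale = torch.clamp(self.abs_max / 127.0, min=1e-8)
        zero_point = torch.zeros((), dtype=torch.int64)
        return scale, zero_point


obs = AbsMaxObserver()
obs(torch.tensor([-2.0, 1.0]))
scale, zero_point = obs.calculate_qparams()
print(scale.item(), zero_point.item())
```

The built-in observers listed below follow the same pattern: `forward` collects statistics and `calculate_qparams` turns them into a scale and a zero point.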

Standard observers

| Name | Inherits from | Description |
| --- | --- | --- |
| MinMaxObserver | UniformQuantizationObserverBase | Computes the quantization parameters from the running min and max values. |
| MovingAverageMinMaxObserver | MinMaxObserver | Computes the quantization parameters from the moving average of the min and max values. |
| PerChannelMinMaxObserver | UniformQuantizationObserverBase | Computes the quantization parameters from the running per-channel min and max values. |
| MovingAveragePerChannelMinMaxObserver | PerChannelMinMaxObserver | Computes the quantization parameters from the moving average of the per-channel min and max values. |
| HistogramObserver | UniformQuantizationObserverBase | Records a running histogram of tensor values along with the min/max values. |
| PlaceholderObserver | ObserverBase | Does nothing; only passes its configuration to the quantized module's .from_float(). |
| RecordingObserver | ObserverBase | Mainly for debugging; records the tensor values seen at runtime. |
| NoopObserver | ObserverBase | Does nothing; only passes its configuration to the quantized module's .from_float(). |
| FixedQParamsObserver | ObserverBase | Returns a fixed, user-supplied scale and zero_point instead of computing them from data. |
| ReuseInputObserver | ObserverBase | Reuses the observer of the operator that produced the input tensor (for example, for reshape-like operators). |
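
The difference between a running min/max and a moving average is easier to see on a small example. The following sketch feeds two made-up batches to `MinMaxObserver` and `MovingAverageMinMaxObserver`. The tensors and the `averaging_constant=0.5` value are chosen only for illustration.

```python
import torch
from torch.ao.quantization.observer import (
    MinMaxObserver,
    MovingAverageMinMaxObserver,
)

# Two batches of fake activations; the second has a larger maximum.
batch1 = torch.tensor([-1.0, 0.0, 1.0])
batch2 = torch.tensor([-1.0, 0.0, 2.0])

# MinMaxObserver keeps the global min/max over everything it has seen.
minmax = MinMaxObserver()  # defaults: torch.quint8, per-tensor affine
for batch in (batch1, batch2):
    minmax(batch)  # forward() records statistics and returns the input
scale, zero_point = minmax.calculate_qparams()
print("running min/max:", minmax.min_val.item(), minmax.max_val.item())
print("scale, zero_point:", scale.item(), zero_point.item())

# MovingAverageMinMaxObserver moves its min/max toward each new batch
# by a fraction given by averaging_constant.
moving = MovingAverageMinMaxObserver(averaging_constant=0.5)
for batch in (batch1, batch2):
    moving(batch)
print("moving-average max:", moving.max_val.item())
```

With these inputs the running maximum ends at the largest value seen, while the moving-average maximum ends halfway between the two batch maxima, which makes it less sensitive to a single outlier batch.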

Predefined observers


| Name | Based on | Arguments / description |
| --- | --- | --- |
| default_observer | MinMaxObserver | `quant_min=0, quant_max=127` |
| default_placeholder_observer | PlaceholderObserver | Default placeholder observer, usually used for quantization to torch.float16. |
| default_debug_observer | RecordingObserver | Default debug-only observer. |
| default_weight_observer | MinMaxObserver | `dtype=torch.qint8, qscheme=torch.per_tensor_symmetric` |
| default_histogram_observer | HistogramObserver | `quant_min=0, quant_max=127` |
| default_per_channel_weight_observer | PerChannelMinMaxObserver | `dtype=torch.qint8, qscheme=torch.per_channel_symmetric` |
| default_dynamic_quant_observer | PlaceholderObserver | `dtype=torch.float, compute_dtype=torch.quint8` |
| default_float_qparams_observer | PerChannelMinMaxObserver | `dtype=torch.quint8, qscheme=torch.per_channel_affine_float_qparams, ch_axis=0` |
| weight_observer_range_neg_127_to_127 | MinMaxObserver | `dtype=torch.qint8, qscheme=torch.per_tensor_symmetric, quant_min=-127, quant_max=127, eps=2 ** -12` |
| per_channel_weight_observer_range_neg_127_to_127 | PerChannelMinMaxObserver | `dtype=torch.qint8, qscheme=torch.per_channel_symmetric, quant_min=-127, quant_max=127, eps=2 ** -12` |
| default_float_qparams_observer_4bit | PerChannelMinMaxObserver | `dtype=torch.quint4x2, qscheme=torch.per_channel_affine_float_qparams, ch_axis=0` |
| default_fixed_qparams_range_neg1to1_observer | FixedQParamsObserver | `scale=2.0 / 256.0, zero_point=128, dtype=torch.quint8, quant_min=0, quant_max=255` |
| default_fixed_qparams_range_0to1_observer | FixedQParamsObserver | `scale=1.0 / 256.0, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255` |
| default_symmetric_fixed_qparams_observer | default_fixed_qparams_range_neg1to1_observer | Alias. |
| default_affine_fixed_qparams_observer | default_fixed_qparams_range_0to1_observer | Alias. |
| default_reuse_input_observer | ReuseInputObserver | |
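
The predefined names above are not observer instances. They are observer classes with arguments already bound through `with_args`, so calling one creates a fresh observer. The sketch below shows this for two of them and builds an equivalent constructor by hand. The variable names are illustrative, and the exact default arguments may differ between PyTorch versions.

```python
import torch
from torch.ao.quantization import QConfig
from torch.ao.quantization.observer import (
    MinMaxObserver,
    default_observer,
    default_weight_observer,
)

# Calling a predefined observer constructor returns a new observer instance.
act_observer = default_observer()
weight_observer = default_weight_observer()
print(type(act_observer).__name__, act_observer.quant_min, act_observer.quant_max)
print(type(weight_observer).__name__, weight_observer.dtype, weight_observer.qscheme)

# The same kind of constructor can be built by hand with with_args().
custom_weight_observer_ctor = MinMaxObserver.with_args(
    dtype=torch.qint8, qscheme=torch.per_tensor_symmetric
)
custom_weight_observer = custom_weight_observer_ctor()

# A QConfig takes the constructors (not instances), so that every
# quantized layer gets its own observer.
qconfig = QConfig(activation=default_observer, weight=default_weight_observer)
```

Passing constructors rather than instances is what lets each module in a model keep separate statistics.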