6. Tensor Views
7. torch.autograd
The automatic differentiation package. It can differentiate any scalar-valued function (a neural network works, and so does an expression over some matrix).
7.1 Functional higher level API
7.2 Locally disabling gradient computation
7.3 Default gradient layouts
7.4 In-place operations on Tensors
7.5 Variable (deprecated)
7.6 Tensor autograd functions
CLASS torch.Tensor
1. backward(gradient=None, retain_graph=None, create_graph=False, inputs=None)
Computes the gradient of the current Tensor with respect to the leaf nodes of the graph.
The graph is differentiated using the chain rule.
If the current Tensor is not a scalar and requires gradients, the gradient argument must be specified. It is a tensor with the same shape as the current Tensor, holding the gradient with respect to the current Tensor.
2. detach()
Returns a new Tensor detached from the current graph.
Used to cut off backpropagation.[6] (A short usage sketch for both methods follows below.)
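A minimal sketch of both methods (the tensor names here are made up for illustration): since y below is not a scalar, backward() needs an explicit gradient of the same shape; detach() returns a tensor that shares data but no longer tracks history.

import torch

x = torch.randn(3, requires_grad=True)
y = x * 2                              # non-scalar tensor

# y is not a scalar, so pass a gradient with the same shape as y
y.backward(gradient=torch.ones_like(y))
print(x.grad)                          # tensor([2., 2., 2.])

# detach() shares data with y but is cut off from the graph,
# so backpropagation stops here
z = y.detach()
print(z.requires_grad)                 # False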
7.7 Function
CLASS torch.autograd.Function(*args, **kwargs)
To define a custom autograd Function, subclass autograd.Function, implement forward() and backward() (both receive a ctx context object, e.g. for ctx.save_for_backward()), and invoke it through apply() (never call forward() directly).
I don't fully understand the details yet; I'll study this further and fill it in later.
Example code:
import torch
from torch.autograd import Function

class Exp(Function):
    @staticmethod
    def forward(ctx, i):
        result = i.exp()
        ctx.save_for_backward(result)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        result, = ctx.saved_tensors
        return grad_output * result

# Use it by calling the apply method:
input = torch.randn(3)   # placeholder input
output = Exp.apply(input)
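As a quick sanity check (anticipating 7.9 Numerical gradient checking), a custom Function like this can be compared against numerical gradients with torch.autograd.gradcheck; a minimal sketch, noting that gradcheck expects double-precision inputs:

from torch.autograd import gradcheck

test_inp = torch.randn(4, dtype=torch.double, requires_grad=True)
# Compares the analytical backward() above with finite differences
print(gradcheck(Exp.apply, (test_inp,)))   # prints True when the gradients match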
7.8 Context method mixins
7.9 Numerical gradient checking
7.10 Profiler
7.11 Anomaly detection
CLASS torch.autograd.detect_anomaly
Context manager that enables anomaly detection in the autograd engine.
What it does:
- If detection is enabled when the forward pass runs, then when the backward pass fails, the traceback of the forward operation that caused the failure is printed.
- Any backward computation that produces NaN values raises an error.
Note: this should only be enabled while debugging, because the extra checks slow the program down.
Example code, without detect_anomaly:
import torch
from torch import autograd

class MyFunc(autograd.Function):
    @staticmethod
    def forward(ctx, inp):
        return inp.clone()

    @staticmethod
    def backward(ctx, gO):
        # Error during the backward pass
        raise RuntimeError("Some error in backward")
        return gO.clone()

def run_fn(a):
    out = MyFunc.apply(a)
    return out.sum()

inp = torch.rand(10, 10, requires_grad=True)
out = run_fn(inp)
out.backward()
Output:
Traceback (most recent call last):
  File "randomly_try2.py", line 17, in <module>
    out.backward()
  File "virtual_env/lib/python3.8/site-packages/torch/_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "virtual_env/lib/python3.8/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "virtual_env/lib/python3.8/site-packages/torch/autograd/function.py", line 253, in apply
    return user_fn(self, *args)
  File "randomly_try2.py", line 10, in backward
    raise RuntimeError("Some error in backward")
RuntimeError: Some error in backward
As you can see, the output only reports the error inside backward().
Example code, with detect_anomaly:
import torch
from torch import autograd

class MyFunc(autograd.Function):
    @staticmethod
    def forward(ctx, inp):
        return inp.clone()

    @staticmethod
    def backward(ctx, gO):
        # Error during the backward pass
        raise RuntimeError("Some error in backward")
        return gO.clone()

def run_fn(a):
    out = MyFunc.apply(a)
    return out.sum()

with autograd.detect_anomaly():
    inp = torch.rand(10, 10, requires_grad=True)
    out = run_fn(inp)
    out.backward()
Output:
randomly_try2.py:15: UserWarning: Anomaly Detection has been enabled. This mode will increase the runtime and should only be enabled for debugging.
  with autograd.detect_anomaly():
virtual_env/lib/python3.8/site-packages/torch/autograd/__init__.py:173: UserWarning: Error detected in MyFuncBackward. Traceback of forward call that caused the error:
  File "randomly_try2.py", line 17, in <module>
    out = run_fn(inp)
  File "randomly_try2.py", line 13, in run_fn
    out = MyFunc.apply(a)
 (Triggered internally at /opt/conda/conda-bld/pytorch_1646755853042/work/torch/csrc/autograd/python_anomaly_mode.cpp:104.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
Traceback (most recent call last):
  File "randomly_try2.py", line 18, in <module>
    out.backward()
  File "virtual_env/lib/python3.8/site-packages/torch/_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "virtual_env/lib/python3.8/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "virtual_env/lib/python3.8/site-packages/torch/autograd/function.py", line 253, in apply
    return user_fn(self, *args)
  File "randomly_try2.py", line 10, in backward
    raise RuntimeError("Some error in backward")
RuntimeError: Some error in backward
As you can see, the error can now be traced back to the MyFunc.apply() call in the forward pass.
CLASS torch.autograd.set_detect_anomaly(mode)
Context manager that turns anomaly detection in the autograd engine on or off.
Detection is on when mode=True. Can be used either as a context manager or as a plain function. See detect_anomaly above for what anomaly detection does.
Following [7], the code below raises an error as soon as a NaN appears, which helps locate the offending code:
import torch

# Forward pass: enable anomaly detection for autograd
torch.autograd.set_detect_anomaly(True)

# Backward pass: keep detection on while computing gradients
with torch.autograd.detect_anomaly():
    loss.backward()
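A self-contained sketch of the same pattern (the tensors are made up so that the backward pass actually produces a NaN):

import torch

torch.autograd.set_detect_anomaly(True)

x = torch.tensor([-1.0], requires_grad=True)
loss = torch.sqrt(x).sum()      # sqrt of a negative number is NaN

with torch.autograd.detect_anomaly():
    loss.backward()             # raises a RuntimeError reporting NaN values in SqrtBackward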
7.12 Saved tensors default hooks
8. torch.cuda
- Imported automatically.
- is_available(): checks whether CUDA can be used; returns a bool.
- current_device(): returns the index of the currently selected device.
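A minimal usage sketch:

import torch

if torch.cuda.is_available():
    print(torch.cuda.current_device())     # e.g. 0
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

x = torch.randn(2, 3, device=device)       # allocate directly on the chosen device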
8.1 Random Number Generator
1. manual_seed(seed)
Sets the random seed for the current GPU; silently ignored if CUDA is unavailable.
2. manual_seed_all(seed)
Sets the random seed for all GPUs; silently ignored if CUDA is unavailable.
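A one-line-each sketch (the seed value is arbitrary):

import torch

torch.cuda.manual_seed(42)       # seed the RNG of the current GPU; no-op without CUDA
torch.cuda.manual_seed_all(42)   # seed the RNGs of all GPUs; no-op without CUDA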
9. torch.cuda.amp
10. torch.backends
10.1 torch.backends.cuda
10.2 torch.backends.cudnn
- torch.backends.cudnn.deterministic
A bool; if True, causes cuDNN to only use deterministic convolution algorithms.
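For example (a minimal sketch):

import torch

# Restrict cuDNN to deterministic convolution algorithms (may be slower)
torch.backends.cudnn.deterministic = True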
11. torch.distributed
12. torch.distributions
13. torch.fft
14. torch.futures
15. torch.fx
16. torch.hub
17. torch.jit
18. torch.linalg
19. torch.overrides
20. torch.profiler
21. torch.nn.init
The main purpose of this module is initializing the parameters of a neural network.
Most layers in torch.nn come with a reset_parameters() method and initialize their parameters automatically.
The question "python 3.x - Reset parameters of a neural network in pytorch - Stack Overflow" gives example code for resetting a network's parameters:
for layer in model.children():
    if hasattr(layer, 'reset_parameters'):
        layer.reset_parameters()
For more basics on parameter initialization, see these blog posts:
【pytorch参数初始化】 pytorch默认参数初始化以及自定义参数初始化_华仔的博客-CSDN博客_pytorch参数初始化
【Pytorch】各网络层的默认初始化方法_guofei_fly的博客-CSDN博客_pytorch 默认初始化
【Pytorch】模型权重的初始化函数_guofei_fly的博客-CSDN博客_pytorch模型权重初始化
pytorch系列 – 9 pytorch nn.init 中实现的初始化函数 uniform, normal, const, Xavier, He initialization_墨氲的博客-CSDN博客_nn.init.normal_
1. calculate_gain(nonlinearity, param=None): returns the recommended gain value for the given nonlinearity.
Example code:
>>> gain = nn.init.calculate_gain('leaky_relu', 0.2) # leaky_relu with negative_slope=0.2
Output: 1.3867504905630728
2. xavier_uniform_(tensor, gain=1.0): also known as Glorot initialization
Example code:
w = torch.empty(3, 5)
nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu'))
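The init functions modify tensors in place, so a common pattern (a sketch, not taken verbatim from the posts above) is to wrap them in a function and apply it to every sub-module with Module.apply():

import torch.nn as nn

def init_weights(m):
    # initialize every Linear layer; other layer types keep their defaults
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight, gain=nn.init.calculate_gain('relu'))
        nn.init.zeros_(m.bias)

model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 1))
model.apply(init_weights)        # recursively applies init_weights to every sub-module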
22. torch.onnx
23. torch.optim
23.1 How to use an optimizer
Example code:
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer = optim.Adam([var1, var2], lr=0.0001)
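A typical per-iteration usage sketch (the model, data, and target here are placeholders):

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 1)                                 # placeholder model
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.MSELoss()

data, target = torch.randn(4, 10), torch.randn(4, 1)     # placeholder batch

optimizer.zero_grad()        # clear gradients accumulated in the previous step
loss = criterion(model(data), target)
loss.backward()              # compute gradients
optimizer.step()             # update the parameters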
23.2 Algorithms
1. class torch.optim.Optimizer(params, defaults)
Base class for all optimizers.
- zero_grad(set_to_none=False)
Sets the gradients of all parameter Tensors handed to the optimizer to zero.
2. class torch.optim.SGD(params, lr=<required>, momentum=0, dampening=0, weight_decay=0, nesterov=False)
Stochastic gradient descent.
24. Complex Numbers
25. DDP Communication Hooks
26. Pipeline Parallelism
27. Quantization
28. Distributed RPC Framework
29. torch.random
30. torch.sparse
31. torch.Storage
32. torch.utils.benchmark
33. torch.utils.bottleneck
34. torch.utils.checkpoint
35. torch.utils.cpp_extension
36. torch.utils.data
- class Dataset
An abstract class representing a dataset.
You must override __getitem__() (fetch one sample by key) and may optionally override __len__() (see the sketch after this list).
- class DataLoader
Used to iterate over a dataset in mini-batches.
Arguments:
- dataset
- batch_size
- shuffle: if True, the data are reshuffled on every pass over the dataset
- collate_fn: function applied to each batch (merges a list of samples into a batch)
- pin_memory: defaults to False
- drop_last: if True and the last batch has fewer than batch_size samples, that batch is dropped. Defaults to False
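A minimal sketch tying the two classes together (the random tensors are placeholders):

import torch
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self, n=100):
        self.x = torch.randn(n, 5)
        self.y = torch.randint(0, 2, (n,))

    def __len__(self):
        return len(self.x)                 # optional, but needed by most samplers

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]    # one sample, fetched by index

loader = DataLoader(MyDataset(), batch_size=16, shuffle=True, drop_last=False)
for batch_x, batch_y in loader:
    print(batch_x.shape, batch_y.shape)    # torch.Size([16, 5]) torch.Size([16])
    break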