DL: An Overview of Deep Learning Algorithms (a Collection of Neural Network Models): Notes and Reflections on "THE NEURAL NETWORK ZOO" (Part 5)

Summary: An overview of deep learning algorithms (a collection of neural network models), with notes and reflections on "THE NEURAL NETWORK ZOO".

MC




      Markov chains (MC, or discrete-time Markov chains, DTMC) can be seen as predecessors to BMs and HNs. They can be understood as follows: from the node where I am now, what are the odds of me going to any of my neighbouring nodes? They are memoryless (the Markov property), which means that the state you end up in depends solely on the previous state. While not really neural networks, they do resemble neural networks and form the theoretical basis for BMs and HNs. MCs aren't always considered neural networks, and the same goes for BMs, RBMs and HNs. Markov chains aren't always fully connected either.
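
      As a toy illustration of the transition idea above ("from this node, what are the odds of moving to a neighbour?"), here is a minimal sketch in Python with NumPy. The three-state chain and its probabilities are made-up assumptions for illustration, not code from the original article:

```python
import numpy as np

# A toy 3-state discrete-time Markov chain (states 0, 1, 2).
# Row i of P gives the probabilities of moving from state i to
# each state; each row sums to 1. (Illustrative values only.)
P = np.array([
    [0.1, 0.6, 0.3],
    [0.4, 0.2, 0.4],
    [0.5, 0.3, 0.2],
])

def sample_chain(P, start, steps, rng=None):
    """Walk the chain: the next state depends only on the current one."""
    rng = rng or np.random.default_rng(0)
    state, path = start, [start]
    for _ in range(steps):
        state = rng.choice(len(P), p=P[state])  # memoryless transition
        path.append(state)
    return path

print(sample_chain(P, start=0, steps=10))
```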


Hayes, Brian. “First links in the Markov chain.” American Scientist 101.2 (2013): 252.

Original Paper PDF



HN




      A Hopfield network (HN) is a network in which every neuron is connected to every other neuron; it is a completely entangled plate of spaghetti, since every node serves every role. Each node is an input before training, hidden during training, and an output afterwards. The network is trained by setting the values of the neurons to the desired pattern, after which the weights can be computed. The weights do not change after this.

      Once trained on one or more patterns, the network will always converge to one of the learned patterns, because the network is only stable in those states. Note that it does not always conform to the desired state (it's not a magic black box, sadly). It stabilises in part because the total "energy" or "temperature" of the network is reduced incrementally during training. Each neuron has an activation threshold that scales with this temperature; if the sum of a neuron's inputs surpasses it, the neuron takes one of two states (usually -1 or 1, sometimes 0 or 1). The network can be updated synchronously or, more commonly, one neuron at a time.

      If updated one by one, a fair random sequence is created to decide which cells update in what order (fair random meaning each of the n options occurs exactly once every n items). This lets you tell when the network is stable (done converging): once every cell has been updated and none of them changed, the network is stable (annealed). These networks are often called associative memory because they converge to the state most similar to the input; if humans see half a table we can imagine the other half, and this network will converge to a table if presented with half noise and half a table.
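
      The training rule and the one-by-one update procedure described above can be sketched in a few lines. The following Python snippet assumes bipolar (-1/1) patterns and Hebbian one-shot training; the toy patterns and function names are illustrative, not code from the source article:

```python
import numpy as np

def train(patterns):
    """Compute fixed weights from the desired patterns (Hebbian rule)."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)            # strengthen co-active connections
    np.fill_diagonal(W, 0)             # no self-connections
    return W / len(patterns)

def recall(W, state, rng=None):
    """Update neurons one by one in a fair random order until stable."""
    rng = rng or np.random.default_rng(0)
    state = state.copy()
    while True:
        changed = False
        for i in rng.permutation(len(state)):      # fair random sequence
            new = 1 if W[i] @ state >= 0 else -1   # threshold on summed input
            if new != state[i]:
                state[i], changed = new, True
        if not changed:                # a full sweep with no change: annealed
            return state

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
W = train(patterns)
cue = np.array([1, -1, 1, 1, 1, -1])   # corrupted version of the first pattern
print(recall(W, cue))                  # converges back to a stored pattern
```

      This mirrors the associative-memory behaviour in the text: the corrupted cue plays the role of "half a table plus half noise", and the update sweeps pull it back to the nearest learned pattern.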


Hopfield, John J. “Neural networks and physical systems with emergent collective computational abilities.” Proceedings of the National Academy of Sciences 79.8 (1982): 2554-2558.

Original Paper PDF



BM




      Boltzmann machines (BM) are a lot like HNs, except that some neurons are marked as input neurons while others remain "hidden". The input neurons become output neurons at the end of a full network update. A BM starts with random weights and learns through back-propagation, or more recently through contrastive divergence (where a Markov chain is used to determine the gradients between two informational gains).

      Compared to an HN, the neurons mostly have binary activation patterns. As hinted by their being trained with MCs, BMs are stochastic networks. Training and running a BM is fairly similar to an HN: one sets the input neurons to certain clamped values, after which the network is set free (it doesn't get a sock). While free, the cells can take any value, and we repeatedly go back and forth between the input and hidden neurons.

      The activation is controlled by a global temperature value; lowering it lowers the energy of the cells, and this lower energy causes their activation patterns to stabilise. The network reaches an equilibrium given the right temperature.
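
      A minimal sketch of the stochastic, temperature-controlled updates described above, assuming binary 0/1 units, a symmetric weight matrix, and an illustrative annealing schedule (the sizes and clamped values are made up; biases are omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8                                  # total units (visible + hidden)
W = rng.normal(0, 0.1, (n, n))
W = (W + W.T) / 2                      # symmetric weights
np.fill_diagonal(W, 0)                 # no self-connections

def step(state, T):
    """One sweep: each unit turns on with a probability set by its
    energy gap, scaled by the global temperature T."""
    for i in rng.permutation(n):
        gap = W[i] @ state             # energy difference for flipping unit i
        p_on = 1.0 / (1.0 + np.exp(-gap / T))
        state[i] = 1 if rng.random() < p_on else 0
    return state

state = rng.integers(0, 2, n).astype(float)
state[:3] = [1, 0, 1]                  # clamp three "input" units
for T in np.linspace(2.0, 0.1, 50):    # lower the temperature gradually
    state = step(state, T)
    state[:3] = [1, 0, 1]              # keep the clamped values fixed
print(state)
```

      As the temperature drops, the sigmoid sharpens and the random flips die out, which is the stabilisation-toward-equilibrium behaviour the paragraph describes.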


Hinton, Geoffrey E., and Terrence J. Sejnowski. “Learning and relearning in Boltzmann machines.” Parallel distributed processing: Explorations in the microstructure of cognition 1 (1986): 282-317.

Original Paper PDF



RBM




     Restricted Boltzmann machines (RBM) are remarkably similar to BMs (surprise) and therefore also similar to HNs. The biggest difference between BMs and RBMs is that RBMs are more usable, precisely because they are more restricted. They don't trigger-happily connect every neuron to every other neuron, but only connect each group of neurons to every other group, so no input neurons are directly connected to other input neurons and no hidden-to-hidden connections are made either.

      RBMs can be trained like FFNNs with a twist: instead of passing data forward and then back-propagating, you forward-pass the data and then backward-pass the data (back to the first layer). After that you train with forward-and-back-propagation.
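
      The forward-then-backward pass described above corresponds to one step of contrastive divergence (CD-1). Below is a minimal sketch under the assumption of binary units and a single weight matrix between the visible and hidden layers; the sizes and learning rate are illustrative, and biases are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 3, 0.1
W = rng.normal(0, 0.1, (n_visible, n_hidden))  # no v-v or h-h connections

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0):
    """Forward pass, backward pass, and a weight update from the difference."""
    h0 = sigmoid(v0 @ W)                        # forward: visible -> hidden
    h0_sample = (rng.random(n_hidden) < h0) * 1.0
    v1 = sigmoid(h0_sample @ W.T)               # backward: hidden -> visible
    h1 = sigmoid(v1 @ W)                        # forward again
    # positive phase minus negative phase approximates the gradient
    return lr * (np.outer(v0, h0) - np.outer(v1, h1))

v0 = np.array([1.0, 0, 1, 0, 1, 0])             # one training example
W += cd1_step(v0)
```

      The bipartite restriction is what makes this cheap: given the visible layer, all hidden units can be sampled in one matrix product, and vice versa.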


Smolensky, Paul. Information processing in dynamical systems: Foundations of harmony theory. No. CU-CS-321-86. University of Colorado at Boulder, Dept. of Computer Science, 1986.

Original Paper PDF



