1. 简介
鞍点是一种特殊的驻点。对于多变量函数,在鞍点位置,函数沿任意方向的导数都为0,但函数并不是最大值或者最小值。我们关注一类特殊的鞍点,在这个位置,函数在某一方向上是最大值,但是在剩余所有方向上是极小值。
寻找鞍点在科学和工程研究中有很多应用。一个常用的例子是地形图,地势高度取决于水平坐标,因此这是一个双变量函数。假设在起伏的地势中有两个盆地(对应于函数的局部极小值)A和B。一个人想要从A出发到达B,在连接A和B的所有可能的路径中,哪一条路径走过的地势最高点最低?这个问题的实质就是寻找这个双变量函数的鞍点(或者一个更常见的名称,过渡态)。
2. 作业题目
考虑一个三变量函数(见下方代码),寻找这个函数的在(0.5, 0.5, 0.5)和(-0.5, -0.5, -0.5)附近的两个局部极小值,以及两个极小值之间路径上(0.7, 0.3, 0.5)附近的鞍点。
2.1 要求
数值结果误差小于1e-3。
不允许使用numpy和scipy之外的数学库,需手动实现算法。
不允许对提供的data的生成函数做符号分析(例如解析求导),data只能作为一个黑盒子使用。
为了模拟真实场景(data中数据点需要借助实验或者模拟仿真高成本获得),不允许对data进行遍历或暴力搜索。
代码优越性的评价取决于data中数据的读取次数,越低越好(读取次数越低,对应于真实场景中解决问题速度越快)。
为了方便大家对函数值的分布有直观认识,这里使用ipyvolume进行了3D可视化。
import numpy as np import ipyvolume as ipv x = np.linspace(-1, 1, 100) y = np.linspace(-1, 1, 100) z = np.linspace(-1, 1, 100) Y, Z, X = np.meshgrid(x, y, z) def gaussian(x0, y0, z0): return np.exp(-(X - x0)**2 - (Y - y0)**2 - (Z - z0)**2) data = gaussian(0.1, -0.1, -0.1) - gaussian(-0.5, -0.5, -0.5) - gaussian(0.5, 0.5, 0.5) ipv.figure() ipv.volshow(data) ipv.view(-40) ipv.show()
借助IPython提供的交互控件,可以使用下方的拖动条查看data的在xy方向上的二维数据切片。
from ipywidgets import interactive import matplotlib.pyplot as plt def slice_z(i): fig, ax = plt.subplots(figsize=(8, 8)) im = ax.imshow(data[i,:,:], vmin=-1, vmax=1, cmap=plt.get_cmap('gist_rainbow')) ct = ax.contour(data[i,:,:]) bar = plt.colorbar(im) plt.show() interact_plot = interactive(slice_z, i=(0, 99)) interact_plot
2.2 提示
寻找过渡态的一个常用算法是微动弹性带(Nudged Elastic Band)。它的核心思想是,将初始坐标和终态坐标用若干个中间态(例如11个)连接起来,然后让这些中间态沿着函数梯度的反方向移动(类似于小球在地形图中沿着山坡向下移动);为了避免这些中间态都收敛到附近的局部极小(类似于小球都落入了盆地),相邻中间态之间用一根虚拟的弹簧连接,从而迫使相邻中间态有一定的间距。当这个小球弹簧链(微动弹性带)移动停止时,其所在位置就是所谓的最低能量路径(minimal energy path),其中间函数值最大的位置就是鞍点或者过渡态。
在迭代计算过程中,中间态的移动同时受函数梯度和弹簧弹力的调控。为了保持中间态的间距尽量不改变,以及虚拟弹簧不影响路径正确性,可以对函数梯度和弹簧弹力进行矢量分解。其中,函数梯度只保留垂直于路径的分量;弹簧弹力只保留沿着路径的分量。
3. 解答
3.1 建立数据来源
import numpy as np import ipyvolume as ipv x = np.linspace(-1, 1, 100) y = np.linspace(-1, 1, 100) z = np.linspace(-1, 1, 100) Y, Z, X = np.meshgrid(x, y, z) def gaussian(x,y,z,x0, y0, z0): return np.exp(-(x- x0)**2 - (y - y0)**2 - (z - z0)**2) def data(x,y,z): return gaussian(x,y,z,0.1, -0.1, -0.1) - gaussian(x,y,z,-0.5, -0.5, -0.5) - gaussian(x,y,z,0.5, 0.5, 0.5) data(0,0,0)
返回查找的data数据
3.2 设计当前点梯度
三个方向分别采用相邻两点的差求解对应的偏导
##梯度 def gradient(x,y,z,delta=0.01): gradient_x = (data(x+delta,y,z) - data(x-delta,y,z))/2/delta gradient_y = (data(x,y+delta,z) - data(x,y-delta,z))/2/delta gradient_z = (data(x,y,z+delta) - data(x,y,z-delta))/2/delta return gradient_x,gradient_y,gradient_z gradient(0,0,0)
返回对应的点的梯度
3.3 寻找最小值
np.random.seed(0)#定义随机种子数 pos = np.random.rand(3)*2-1#生成点的坐标(-1,1)之间三个随机数 print(pos,data(*pos),gradient(*pos)) rate=0.1#下降速率 pos_new=pos-np.array(gradient(*pos))*rate#采用梯度下降法 print(pos_new,data(*pos_new)) def find_minimum(pos,rate,err):#寻找局部最小值的点 data_diff = 1.0 while np.abs(data_diff) > err: pos_new = pos - np.array(gradient(*pos))*rate data_diff = data(*pos_new) - data(*pos) pos = pos_new return pos pos = find_minimum(pos,rate,err = 1e-6) print(pos,data(*pos))
返回
可以随机寻找多次得到局部最优点,原则采用相似的取最小值的位置
for i in range(10): pos = np.random.rand(3)*2-1 pos = find_minimum(pos,rate,err = 1e-6) print(pos,data(*pos))
返回
选择第四个和第五个最小值点, 然后保存A和B两点
pos_A = np.array([0.60501534,0.67242841,0.67002837])#三个坐标用逗号 pos_B = np.array([-0.73177095,-0.64662581,-0.64406625])
4. 寻找鞍点
4.1 珠子等间距连接
首先采用9个珠子等间距连接两点
import ipyvolume as ipy#导入画图库 n = 9#采用9粒珠子将两个局部最小值点圈连起来 images = np.zeros([n,3])#设置9个位置为零清空内存 for i in range(n):#连接A\B两点 images[i] = (pos_B - pos_A)/(n - 1) * i + pos_A ipy.quickscatter(images[:,0], images[:,1], images[:,2], size = 5, marker = "sphere")#可视化9个珠子
显示:
4.2 设置弹簧弹力
两端珠子已经固定,中间的1 一个珠子分别连接两个,先算单个弹力(与弹簧长度相关)和方向(采用单位方向),direction = np.zeros_like(force)其维度与矩阵force一致,并为其初始化为全0,后期叠加。
dist_AB = np.sqrt(np.sum((pos_A - pos_B)**2))#AB两点之间距离 dist_avg = dist_AB / (n - 1)#两珠子之间平均距离 n = 9 images = np.zeros([n,3]) for i in range(n):#用9粒珠子均匀连接A\B两点 images[i] = (pos_B - pos_A)/(n - 1) * i + pos_A def spring_force(image_before, image, image_next, k = 2.0):#前三个三参数是相邻的位置点,k为弹力增益系数 dist_before = np.sqrt(np.sum((image - image_before)**2)) force_before = (dist_before - dist_avg)*k direction_before = (image - image_before)/dist_before dist_next = np.sqrt(np.sum((image - image_next)**2)) force_next = (dist_next - dist_avg)*k direction_next = (image - image_next)/dist_next force = force_before*direction_before + force_next*direction_next if np.sum(force**2)< 1e-8: direction = np.zeros_like(force) else: direction = force / np.sqrt(np.sum(force**2)) return force,direction spring_force(images[4], images[5], images[6], k = 2.0)#测试4/5/6点力和方向
返回,没有变换之前力和方向都几乎为零
4.3 微动弹性带
珠子在重力和弹力限制下运动,判断所有合内力是否在允许范围,允许结束,反而言之。
n = 9 images = np.zeros([n,3]) for i in range(n): images[i] = (pos_B - pos_A)/(n - 1) * i + pos_A s_force = np.zeros_like(images) direction = np.zeros_like(images) g_force = np.zeros_like(images) Saddle_Point = np.zeros([n]) rate = 0.1#滚动速率 err =1e-5 print(data(*images[5])) def Saddle(rate, err): n_step = 0 data_diff = 1.0 while np.abs(data_diff) > err:#判断所有合内力是否在允许范围 data_diff = 0 for i in range(1,n-1): old_data = data(*images[i]) s_force[i],direction[i] = spring_force(images[i-1], images[i], images[i+1], k = 2.0) s_force[i] = np.dot(s_force[i], direction[i]) * direction[i]#力矢量分解 g_force[i] = gradient(*images[i])#重力与梯度相关 g_force[i] = g_force[i] - np.dot(g_force[i], direction[i]) * direction[i] images[i] -= (s_force[i] + g_force[i]) * rate#珠子在重力和弹力限制下运动 new_data = data(*images[i]) data_diff=data_diff + (new_data - old_data)#计算所有合力 n_step+= 1 return n_step n_step = Saddle(rate, err) for i in range(n): Saddle_Point[i]=data(*images[i]) print(Saddle_Point[i]) print(n_step) print('Saddle Point value:'+str(max(Saddle_Point)))#打印鞍点值 ipy.quickscatter(images[:,0], images[:,1], images[:,2], size = 5, marker = "sphere")
经过2158次迭代,得到“Saddle Point value:-0.09851946983280313”。返回如下
完整代码,需要修改地方是pos_A,pos_B的坐标
#导入基本库 import numpy as np import ipyvolume as ipv #定义三坐标线性矩阵系列 x = np.linspace(-1, 1, 100) y = np.linspace(-1, 1, 100) z = np.linspace(-1, 1, 100) np.random.seed(0)#定义随机种子数 pos = np.random.rand(3)*2-1#生成点的坐标(-1,1)之间三个随机数 rate=0.1#下降速率 #定义高斯函数 def gaussian(x,y,z,x0, y0, z0): return np.exp(-(x- x0)**2 - (y - y0)**2 - (z - z0)**2) #定义数据库 def data(x,y,z): return gaussian(x,y,z,0.1, -0.1, -0.1) - gaussian(x,y,z,-0.5, -0.5, -0.5) - gaussian(x,y,z,0.5, 0.5, 0.5) data(0,0,0) ##定义梯度函数 def gradient(x,y,z,delta=0.01): gradient_x = (data(x+delta,y,z) - data(x-delta,y,z))/2/delta gradient_y = (data(x,y+delta,z) - data(x,y-delta,z))/2/delta gradient_z = (data(x,y,z+delta) - data(x,y,z-delta))/2/delta return gradient_x,gradient_y,gradient_z gradient(0,0,0) #寻找局部最小值函数 def find_minimum(pos,rate,err): data_diff = 1.0 while np.abs(data_diff) > err: pos_new = pos - np.array(gradient(*pos))*rate data_diff = data(*pos_new) - data(*pos) pos = pos_new return pos #随机搜索10次寻找局部最小值 for i in range(10): pos = np.random.rand(3)*2-1 pos = find_minimum(pos,rate,err = 1e-6) print(pos,data(*pos)) #注意此次需要根据自己计算的差异性修改为局部最小值,保存两个最小值 pos_A = np.array([0.60501534,0.67242841,0.67002837])#三个坐标用逗号 pos_B = np.array([-0.73177095,-0.64662581,-0.64406625]) #准备计算弹力 dist_AB = np.sqrt(np.sum((pos_A - pos_B)**2))#AB两点之间距离 dist_avg = dist_AB / (n - 1)#两珠子之间平均距离 n = 9 images = np.zeros([n,3]) for i in range(n):#用9粒珠子均匀连接A\B两点 images[i] = (pos_B - pos_A)/(n - 1) * i + pos_A #定义弹力大小和方向 def spring_force(image_before, image, image_next, k = 2.0):#前三个三参数是相邻的位置点,k为弹力增益系数 dist_before = np.sqrt(np.sum((image - image_before)**2)) force_before = (dist_before - dist_avg)*k direction_before = (image - image_before)/dist_before dist_next = np.sqrt(np.sum((image - image_next)**2)) force_next = (dist_next - dist_avg)*k direction_next = (image - image_next)/dist_next force = force_before*direction_before + force_next*direction_next if np.sum(force**2)< 1e-8: direction = np.zeros_like(force) else: direction = force / np.sqrt(np.sum(force**2)) return force,direction #定义珠子之间的弹力、方向、重力、珠子目标函数值 s_force = np.zeros_like(images) direction = np.zeros_like(images) g_force = np.zeros_like(images) Saddle_Point = np.zeros([n]) rate = 0.1#滚动速率 err =1e-5 print(data(*images[5])) def Saddle(rate, err): n_step = 0 data_diff = 1.0 while np.abs(data_diff) > err:#判断所有合内力是否在允许范围 data_diff = 0 for i in range(1,n-1): old_data = data(*images[i]) s_force[i],direction[i] = spring_force(images[i-1], images[i], images[i+1], k = 2.0) s_force[i] = np.dot(s_force[i], direction[i]) * direction[i]#力矢量分解 g_force[i] = gradient(*images[i])#重力与梯度相关 g_force[i] = g_force[i] - np.dot(g_force[i], direction[i]) * direction[i] images[i] -= (s_force[i] + g_force[i]) * rate#珠子在重力和弹力限制下运动 new_data = data(*images[i]) data_diff=data_diff + (new_data - old_data)#计算所有合力 n_step+= 1 return n_step n_step = Saddle(rate, err) for i in range(n): Saddle_Point[i]=data(*images[i]) print(Saddle_Point[i]) print(n_step) print('Saddle Point value:'+str(max(Saddle_Point)))#打印鞍点值 ipy.quickscatter(images[:,0], images[:,1], images[:,2], size = 5, marker = "sphere")
升级版,不需要手动更新,直接一键运行
#导入基本库 import numpy as np import ipyvolume as ipy #定义三坐标线性矩阵系列 x = np.linspace(-1, 1, 100) y = np.linspace(-1, 1, 100) z = np.linspace(-1, 1, 100) n = 9#珠子个数 #np.random.seed(0)#定义随机种子数 #pos = np.random.rand(3)*2-1#生成点的坐标(-1,1)之间三个随机数 poss = np.zeros([n+1, 4])# gradient_rate=0.1#梯度下降速率 find_minimum_err = 1e-6#局部最小值误差 #定义高斯函数 def gaussian(x,y,z,x0, y0, z0): return np.exp(-(x- x0)**2 - (y - y0)**2 - (z - z0)**2) #定义数据库 def data(x,y,z): return gaussian(x,y,z,0.1, -0.1, -0.1) - gaussian(x,y,z,-0.5, -0.5, -0.5) - gaussian(x,y,z,0.5, 0.5, 0.5) data(0,0,0) ##定义梯度函数 def gradient(x,y,z,delta=0.01): gradient_x = (data(x+delta,y,z) - data(x-delta,y,z))/2/delta gradient_y = (data(x,y+delta,z) - data(x,y-delta,z))/2/delta gradient_z = (data(x,y,z+delta) - data(x,y,z-delta))/2/delta return gradient_x,gradient_y,gradient_z #寻找局部最小值函数 def find_minimum(pos,gradient_rate,find_minimum_err): data_diff = 1.0 while np.abs(data_diff) > find_minimum_err: pos_new = pos - np.array(gradient(*pos))*gradient_rate data_diff = data(*pos_new) - data(*pos) pos = pos_new return pos #定义弹力大小和方向 def spring_force(image_before, image, image_next, k = 2.0):#前三个三参数是相邻的位置点,k为弹力增益系数 dist_before = np.sqrt(np.sum((image - image_before)**2)) force_before = (dist_before - dist_avg)*k direction_before = (image - image_before)/dist_before dist_next = np.sqrt(np.sum((image - image_next)**2)) force_next = (dist_next - dist_avg)*k direction_next = (image - image_next)/dist_next force = force_before*direction_before + force_next*direction_next if np.sum(force**2)< 1e-8: direction = np.zeros_like(force) else: direction = force / np.sqrt(np.sum(force**2)) return force,direction #采用微动弹性带寻找过渡态 def Saddle(roll_rate, roll_err): n_step = 0 data_diff = 1.0 while np.abs(data_diff) > roll_err:#判断所有合内力是否在允许范围 data_diff = 0 for i in range(1,n-1): old_data = data(*images[i]) s_force[i],direction[i] = spring_force(images[i-1], images[i], images[i+1], k = 2.0) s_force[i] = np.dot(s_force[i], direction[i]) * direction[i]#力矢量分解 g_force[i] = gradient(*images[i])#重力与梯度相关 g_force[i] = g_force[i] - np.dot(g_force[i], direction[i]) * direction[i] images[i] -= (s_force[i] + g_force[i]) * roll_rate#珠子在重力和弹力限制下运动 new_data = data(*images[i]) data_diff=data_diff + (new_data - old_data)#计算所有合力 n_step+= 1 return n_step np.random.seed(0)#定义随机种子数 #pos = np.random.rand(3)*2-1 #随机搜索10次寻找局部最小值 for i in range(10): #np.random.seed(0)#定义随机种子数 pos = np.random.rand(3)*2-1 pos = find_minimum(pos,gradient_rate,find_minimum_err) print(pos,data(*pos)) poss[i, 0:3] = pos poss[i, 3] = data(*pos) #注意此次需要根据自己计算的差异性修改为局部最小值,保存两个最小值 index_a = np.squeeze(np.where(poss[:, 3] == min(poss[:, 3]))) pos_A = poss[index_a, 0:3] pos_A = np.array([round(i,8) for i in pos_A]) for i in range(10): if abs(data(*pos_A) - poss[i, 3]) < 1e-3: poss[i, 3] = 0 index_b = np.squeeze(np.where(poss[:, 3] == min(poss[:, 3]))) pos_B = poss[index_b, 0:3] pos_B = np.array([round(i,8) for i in pos_B]) print(pos_A,pos_B) #pos_A = np.array([0.60501534,0.67242841,0.67002837])#三个坐标用逗号 #pos_B = np.array([-0.73177095,-0.64662581,-0.64406625]) #print(pos_A,pos_B) dist_AB = np.sqrt(np.sum((pos_A - pos_B)**2))#AB两点之间距离 dist_avg = dist_AB / (n - 1)#两珠子之间平均距离 images = np.zeros([n,3])#清空所以珠子的位置信息 s_force = np.zeros_like(images)#定义珠子之间的弹力 direction = np.zeros_like(images)#定义珠子之间的方向 g_force = np.zeros_like(images)#定义珠子之间的重力 Saddle_Point = np.zeros([n])#定义珠子目标函数值 roll_rate = 0.1#滚动速率 roll_err =1e-5 #准备计算弹力 for i in range(n):#用9粒珠子均匀连接A\B两点 images[i] = (pos_B - pos_A)/(n - 1) * i + pos_A n_step = Saddle(roll_rate, roll_err) for i in range(n): Saddle_Point[i]=data(*images[i]) print(Saddle_Point[i]) print(n_step) print('Saddle Point value:'+str(max(Saddle_Point)))#打印鞍点值 ipy.quickscatter(images[:,0], images[:,1], images[:,2], size = 5, marker = "sphere")
详细代码可见:考虑一个三变量函数(见下方代码),寻找这个函数的在(0.5,0.5,0.5)和(-0.5,-0.5,-0.5)附近的两个-Python文档类资源-CSDN下载
5. 总结
本文完成了鞍点Saddle Point Locator求解,后期会分享更多有趣的操作从而实现对外部世界进行感知,充分认识这个有机与无机的环境,科学地合理地进行创作和发挥效益,然后为人类社会发展贡献一点微薄之力。