1. 鞍点
鞍点是一种特殊的驻点。对于多变量函数,在鞍点位置,函数沿任意方向的导数都为0,但函数并不是最大值或者最小值。我们关注一类特殊的鞍点,在这个位置,函数在某一方向上是最大值,但是在剩余所有方向上是极小值。
寻找鞍点在科学和工程研究中有很多应用。一个常用的例子是地形图,地势高度取决于水平坐标,因此这是一个双变量函数。假设在起伏的地势中有两个盆地(对应于函数的局部极小值)A和B。一个人想要从A出发到达B,在连接A和B的所有可能的路径中,哪一条路径走过的地势最高点最低?这个问题的实质就是寻找这个双变量函数的鞍点(或者一个更常见的名称,过渡态)。
2. 微动弹性带方法
寻找过渡态的一个常用算法是微动弹性带(Nudged Elastic Band)。它的核心思想是,将初始坐标和终态坐标用若干个中间态(例如11个)连接起来,然后让这些中间态沿着函数梯度的反方向移动(类似于小球在地形图中沿着山坡向下移动);为了避免这些中间态都收敛到附近的局部极小(类似于小球都落入了盆地),相邻中间态之间用一根虚拟的弹簧连接,从而迫使相邻中间态有一定的间距。当这个小球弹簧链(微动弹性带)移动停止时,其所在位置就是所谓的最低能量路径(minimal energy path),其中间函数值最大的位置就是鞍点或者过渡态。
在迭代计算过程中,中间态的移动同时受函数梯度和弹簧弹力的调控。为了保持中间态的间距尽量不改变,以及虚拟弹簧不影响路径正确性,可以对函数梯度和弹簧弹力进行矢量分解。其中,函数梯度只保留垂直于路径的分量;弹簧弹力只保留沿着路径的分量。
参考资料:Nudged Elastic Band Method
3. 例题
考虑一个三变量函数(见下方代码),坐标区间为[-1, 1]。
寻找这个函数的在(0.5, 0.5, 0.5)和(-0.5, -0.5, -0.5)附近的两个局部极小值,以及两个极小值之间最低能量路径上的鞍点。data是目标函数
import numpy as np def gaussian(pos, pos0): return np.exp(-np.sum((pos-pos0)**2)) def data(pos): return gaussian(pos, np.array([0.1, -0.1, -0.1])) - gaussian(pos, np.array([-0.5, -0.5, -0.5])) - gaussian(pos, np.array([0.5, 0.5, 0.5]))
3.1 SciPy求解最小值
采用BFGS求解两点附近最小值。
from scipy import optimize pos = np.array([0.5, 0.5, 0.5]) print(optimize.minimize(data, pos, method='BFGS')) pos = np.array([-0.5, -0.5, -0.5]) print(optimize.minimize(data, pos, method='BFGS'))
3.2 IPyVolume可视化NumPy数组
打印梯度
N = 50 X = Y = Z = np.linspace(-1, 1, N) data_grid = np.zeros([N, N, N]) for i in range(N): for j in range(N): for k in range(N): data_grid[k, j, i] = data(np.array([X[i], Y[j], Z[k]])) print(np.min(data_grid))
3D Volume
三维可视化
2D Slice
二维展平
from ipywidgets import interactive import matplotlib.pyplot as plt def slice_z(i): fig, ax = plt.subplots(figsize=(8, 8)) im = ax.imshow(data_grid[i,:,:], vmin=-1, vmax=1, cmap=plt.get_cmap('gist_rainbow')) ct = ax.contour(data_grid[i,:,:]) bar = plt.colorbar(im) plt.show() interact_plot = interactive(slice_z, i=(0, 49)) interact_plot
3.3 梯度
def gradient(pos, delta = 0.01): gradient_x = (data(pos + np.array([delta, 0, 0])) - data(pos - np.array([delta, 0, 0]))) / 2 / delta gradient_y = (data(pos + np.array([0, delta, 0])) - data(pos - np.array([0, delta, 0]))) / 2 / delta gradient_z = (data(pos + np.array([0, 0, delta])) - data(pos - np.array([0, 0, delta]))) / 2 / delta return np.array([gradient_x, gradient_y, gradient_z])
3.4 寻找最小值
随机生成三维点
np.random.seed(0) pos = np.random.rand(3) * 2 - 1 print(pos,data(pos),gradient(pos))
rate = 0.1 pos_new = pos - np.array(gradient(pos)) * rate print(pos_new,data(pos_new))
寻找最小值函数
def find_minimal(pos, rate, err): pos_diff = 1 data_diff = 1 while np.abs(data_diff) > err: pos_new = pos - np.array(gradient(pos)) * rate data_diff = data(pos_new) - data(pos) pos_diff = np.max(np.abs(np.array(gradient(pos)) * rate)) pos = pos_new return pos pos = find_minimal(pos, 0.01, 1e-10) print(pos, data(pos))
for i in range(5): pos = np.random.rand(3) * 2 - 1 pos = find_minimal(pos, 0.001, 1e-10) print(pos, data(pos))
记录两个局部最优点
pos_A = np.array([0.60486467, 0.67087103, 0.67164885]) pos_B = np.array([-0.73136024, -0.64616776, -0.64521874])
3.5 寻找鞍点
n = 11 images = np.zeros([n, 3]) for i in range(n): images[i] = (pos_B - pos_A)/(n - 1) * i + pos_A print(images)
可视化等间距
ipv.quickscatter(images[:,0], images[:,1], images[:,2], size=5, marker="sphere")
定义弹力的合力大小和方向
dist_AB = np.sqrt(np.sum((pos_A - pos_B)**2)) dist_image = dist_AB / (n - 1) def spring_force(image_before, image, image_next, k = 2.0): dist_before = np.sqrt(np.sum((image - image_before)**2)) force_before = (dist_before - dist_image) * k direction_before = (image - image_before)/dist_before dist_next = np.sqrt(np.sum((image_next - image)**2)) force_next = (dist_image - dist_next) * k direction_next = (image_next - image)/dist_next force = force_before*direction_before + force_next*direction_next direction = (image_next - image_before) / np.sqrt(np.sum((image_next - image_before)**2)) return force, direction
采用弹力重力的合力进行迭代,返回鞍点位置索引,判断条件是珠子位置导数和目标函数导数的最大值
for i in range(n): images[i] = (pos_B - pos_A)/(n - 1) * i + pos_A s_force = np.zeros_like(images) direction = np.zeros_like(images) g_force = np.zeros_like(images) rate = 0.1 err = 1e-8 def NEB(rate, err): n_step = 0 pos_diff = 1 data_diff = 1 while pos_diff > err or data_diff > err: old_pos = images old_saddle = np.max([data(images[i]) for i in range(n)]) for i in range(1, n-1): s_force[i], direction[i] = spring_force(images[i-1], images[i], images[i+1]) s_force[i] = np.dot(s_force[i], direction[i]) * direction[i] # Vector decomposition g_force[i] = gradient(images[i]) g_force[i] = g_force[i] - np.dot(g_force[i], direction[i]) * direction[i] # Vector decomposition images[i] -= (s_force[i]+g_force[i]) * rate new_pos = images new_saddle = np.max([data(images[i]) for i in range(n)]) idx_saddle = np.argmax([data(images[i]) for i in range(n)]) pos_diff = np.max(np.abs(new_pos - old_pos)) data_diff = np.abs(new_saddle - old_saddle) n_step += 1 return n_step, idx_saddle n_step, idx_saddle = NEB(rate, err) print("n_step ", n_step) print("idx_saddle", idx_saddle) print("images ", images[idx_saddle]) print("data ", data(images[idx_saddle])) ipv.quickscatter(images[:,0], images[:,1], images[:,2], size=5, marker="sphere") gradient(images[idx_saddle])
中间构型仍然存在沿着最低能量路径方向的梯度,说明其并不在鞍点位置。我们需要进一步精修其位置。“生活就是这样给你一个不怎么精确的解,还需要优化迭代出最优解”
3.6 Climbing Image Nudged Elastic Band
采用单纯的重力更新位置,判断条件是珠子受重力方向的最大值
def cNEB(image, direction): g_force = 1 while np.max(np.abs(g_force)) > err: g_force = gradient(image) g_force = np.dot(g_force, direction) * direction image += g_force * rate return image saddle_direction = (images[idx_saddle+1] - images[idx_saddle-1]) / np.sqrt(np.sum((images[idx_saddle+1] - images[idx_saddle-1])**2)) saddle_point = cNEB(images[idx_saddle], saddle_direction) print(saddle_point) print(gradient(saddle_point)) print(data(saddle_point))
精度还是不错的,梯度非常接近零啦!