Creating the neural network class
Our network class receives the variational_estimator decorator, which eases sampling the loss of Bayesian neural networks. The network has one Bayesian LSTM layer with in_features = 1 and out_features = 10, followed by an nn.Linear(10, 1) layer that outputs the normalized price of the stock.
@variational_estimator
class NN(nn.Module):
    def __init__(self):
        super(NN, self).__init__()
        self.lstm_1 = BayesianLSTM(1, 10)
        self.linear = nn.Linear(10, 1)

    def forward(self, x):
        x_, _ = self.lstm_1(x)

        # gathering only the latent end-of-sequence for the linear layer
        x_ = x_[:, -1, :]
        x_ = self.linear(x_)
        return x_
As you can see, the network works like a normal Torch module. The only differences are the BayesianLSTM layer and the variational_estimator decorator; otherwise it behaves like an ordinary Torch object.
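As a quick sanity check, we can pass a dummy batch through the network and confirm that the output has the right shape: one normalized price per window. This is a minimal sketch; the batch size and sequence length below are arbitrary placeholders, not values prescribed by the tutorial.

import torch

net_check = NN()
dummy_batch = torch.randn(8, 21, 1)   # (batch, timesteps, features); sizes chosen arbitrarily
out = net_check(dummy_batch)
print(out.shape)                      # torch.Size([8, 1]): one prediction per window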
With that done, we can create our neural network object, split the dataset, and move on to the training loop:
Creating the objects
We can now create our loss function, neural network, optimizer, and dataloader. Note that we are not splitting the dataset at random, because we will use the last batch of timestamps to evaluate the model. Since our dataset is small, we won't create a dataloader for the test set.
Xs, ys = create_timestamps_ds(close_prices)
X_train, X_test, y_train, y_test = train_test_split(Xs,
                                                    ys,
                                                    test_size=.25,
                                                    random_state=42,
                                                    shuffle=False)

ds = torch.utils.data.TensorDataset(X_train, y_train)
dataloader_train = torch.utils.data.DataLoader(ds, batch_size=8, shuffle=True)

net = NN()
criterion = nn.MSELoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)
We will use an MSE loss function and an Adam optimizer with a learning rate of 0.001.
The training loop
For the training loop, we will use the sample_elbo method that the variational_estimator decorator adds. It averages the loss over X samples, helping us to compute the loss easily with a Monte Carlo estimate.
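To make the Monte Carlo estimate concrete, here is a rough sketch of what such an ELBO sampler does conceptually. This is not BLiTZ's actual implementation: the kl_weight argument is a simplification, and nn_kl_divergence() stands in for the KL-term accessor that variational_estimator also attaches in BLiTZ.

# conceptual sketch of an ELBO Monte Carlo estimate -- not BLiTZ's real code
def sample_elbo_sketch(model, inputs, labels, criterion, sample_nbr, kl_weight=1.0):
    loss = 0.0
    for _ in range(sample_nbr):
        outputs = model(inputs)                             # each pass draws fresh weights
        loss = loss + criterion(outputs, labels)            # data-fit term
        loss = loss + kl_weight * model.nn_kl_divergence()  # complexity (KL) term
    return loss / sample_nbr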
For the network to work properly, the output of the network's forward method must have the same shape as the labels passed into the loss-function object.
iteration = 0
for epoch in range(10):
    for i, (datapoints, labels) in enumerate(dataloader_train):
        optimizer.zero_grad()

        loss = net.sample_elbo(inputs=datapoints,
                               labels=labels,
                               criterion=criterion,
                               sample_nbr=3)
        loss.backward()
        optimizer.step()

        iteration += 1
        if iteration % 250 == 0:
            preds_test = net(X_test)[:, 0].unsqueeze(1)
            loss_test = criterion(preds_test, y_test)
            print("Iteration: {} Val-loss: {:.4f}".format(str(iteration), loss_test))
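Because the BayesianLSTM draws new weights at every forward pass, repeated calls on the same input yield slightly different predictions; this stochasticity is exactly what the confidence intervals below are built on. A minimal illustration (five passes on the first test window, both choices arbitrary):

# repeated forward passes on the same window give different outputs,
# because the Bayesian layer samples new weights each time
with torch.no_grad():
    sample_preds = torch.stack([net(X_test[:1]) for _ in range(5)])
print(sample_preds.squeeze())   # five slightly different normalized prices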
Evaluating the model and computing confidence intervals
We will first create a dataframe holding the real data to be plotted:
original = close_prices_unscaled[1:][window_size:]

df_pred = pd.DataFrame(original)
df_pred["Date"] = df.Date
# parse the dates so the column has a proper datetime dtype
df_pred["Date"] = pd.to_datetime(df_pred["Date"])
df_pred = df_pred.reset_index()
To predict confidence intervals, we must create a function that predicts on the same data X times and then gathers the means and standard deviations of those predictions. At the same time, we have to set the size of the window the model will try to predict before consulting the real data again.

Let's look at the code of the prediction function:
def pred_stock_future(X_test, future_length, sample_nbr=10):
    # sorry for that: window_size is a global variable, and so are X_train and Xs
    global window_size
    global X_train
    global Xs
    global scaler

    # creating auxiliary variables for the future prediction
    preds_test = []
    test_begin = X_test[0:1, :, :]
    test_deque = deque(test_begin[0, :, 0].tolist(), maxlen=window_size)

    idx_pred = np.arange(len(X_train), len(Xs))

    # predict and append to the list
    for i in range(len(X_test)):
        as_net_input = torch.tensor(test_deque).unsqueeze(0).unsqueeze(2)
        pred = [net(as_net_input).cpu().item() for _ in range(sample_nbr)]

        test_deque.append(torch.tensor(pred).mean().cpu().item())
        preds_test.append(pred)

        if i % future_length == 0:
            # our inputs become the i-th window of X_test;
            # that tweak just helps us with shape issues
            test_begin = X_test[i:i+1, :, :]
            test_deque = deque(test_begin[0, :, 0].tolist(), maxlen=window_size)

    return idx_pred, preds_test
We then gather the confidence intervals from those samples: for each timestep, the upper and lower bounds are the prediction mean plus and minus the standard deviation times a multiplier, which sets how wide the interval is.
def get_confidence_intervals(preds_test, ci_multiplier):
    global scaler

    preds_test = torch.tensor(preds_test)

    pred_mean = preds_test.mean(1)
    pred_std = preds_test.std(1).detach().cpu().numpy()
    pred_std = torch.tensor(pred_std)

    upper_bound = pred_mean + (pred_std * ci_multiplier)
    lower_bound = pred_mean - (pred_std * ci_multiplier)

    # gather the unscaled confidence intervals
    pred_mean_final = pred_mean.unsqueeze(1).detach().cpu().numpy()
    pred_mean_unscaled = scaler.inverse_transform(pred_mean_final)

    upper_bound_unscaled = upper_bound.unsqueeze(1).detach().cpu().numpy()
    upper_bound_unscaled = scaler.inverse_transform(upper_bound_unscaled)

    lower_bound_unscaled = lower_bound.unsqueeze(1).detach().cpu().numpy()
    lower_bound_unscaled = scaler.inverse_transform(lower_bound_unscaled)

    return pred_mean_unscaled, upper_bound_unscaled, lower_bound_unscaled
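As a rough sanity check, we can feed toy per-timestep samples through the function and verify that the bounds bracket the mean. This sketch assumes the global scaler fitted earlier in the tutorial is an sklearn MinMaxScaler (or any scaler whose inverse transform preserves ordering); the numbers are made up.

# toy check: 3 timesteps with 4 sampled predictions each (made-up numbers)
toy_preds = [[0.10, 0.20, 0.15, 0.12],
             [0.30, 0.25, 0.35, 0.28],
             [0.50, 0.55, 0.45, 0.52]]
mean, upper, lower = get_confidence_intervals(toy_preds, ci_multiplier=2)
assert (upper > mean).all() and (lower < mean).all()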
Since we draw only a few samples per timestep, we compensate with a high standard-deviation multiplier, so each interval is the mean plus or minus ten standard deviations. Our network will try to predict 7 days ahead and then consult the real data:
future_length = 7
sample_nbr = 4
ci_multiplier = 10

idx_pred, preds_test = pred_stock_future(X_test, future_length, sample_nbr)
pred_mean_unscaled, upper_bound_unscaled, lower_bound_unscaled = get_confidence_intervals(preds_test, ci_multiplier)
We can check the confidence intervals by verifying that the real values lie below the upper bound and above the lower bound. With the parameters set as above, you should end up with a confidence interval that covers about 95% of the real values, as shown below:
y = np.array(df.Close[-750:]).reshape(-1, 1)

under_upper = upper_bound_unscaled > y
over_lower = lower_bound_unscaled < y

# a point cannot be above the upper bound and below the lower bound at once,
# so the equality below is True exactly when both conditions hold
total = (under_upper == over_lower)

print("{} of our predictions are in our confidence interval".format(np.mean(total)))
Checking the output graphs
Now we will plot the predictions to visually check how well our network is doing. We will plot the real values against the predicted ones, together with the confidence interval:
params = {"ytick.color": "w",
          "xtick.color": "w",
          "axes.labelcolor": "w",
          "axes.edgecolor": "w"}
plt.rcParams.update(params)

plt.title("IBM Stock prices", color="white")

plt.plot(df_pred.index,
         df_pred.Close,
         color='black',
         label="Real")

plt.plot(idx_pred,
         pred_mean_unscaled,
         label="Prediction for {} days, then consult".format(future_length),
         color="red")

plt.fill_between(x=idx_pred,
                 y1=upper_bound_unscaled[:, 0],
                 y2=lower_bound_unscaled[:, 0],
                 facecolor='green',
                 label="Confidence interval",
                 alpha=0.5)

plt.legend()
Finally, let's zoom in to take a closer look at the predicted region.
params = {"ytick.color": "w",
          "xtick.color": "w",
          "axes.labelcolor": "w",
          "axes.edgecolor": "w"}
plt.rcParams.update(params)

plt.title("IBM Stock prices", color="white")

plt.fill_between(x=idx_pred,
                 y1=upper_bound_unscaled[:, 0],
                 y2=lower_bound_unscaled[:, 0],
                 facecolor='green',
                 label="Confidence interval",
                 alpha=0.75)

plt.plot(idx_pred,
         df_pred.Close[-len(pred_mean_unscaled):],
         label="Real",
         alpha=1,
         color='black',
         linewidth=0.5)

plt.plot(idx_pred,
         pred_mean_unscaled,
         label="Prediction for {} days, then consult".format(future_length),
         color="red",
         alpha=0.5)

plt.legend()
Conclusion
We saw that BLiTZ's built-in Bayesian LSTM makes it very easy to bring all the power of Bayesian deep learning to time-series work, and that it iterates over the data smoothly. We also saw that the Bayesian LSTM is well integrated with Torch and easy to use, so you can apply it in any work or research.
We were also able to predict confidence intervals for the IBM stock price fairly accurately, and this can be far more useful than a simple point estimate.