TensorFlow 机器学习秘籍第二版：1~5（3）-阿里云开发者社区

TensorFlow 机器学习秘籍第二版：1~5（2）https://developer.aliyun.com/article/1426832

工作原理

与之前的秘籍或本书中的大多数秘籍不同，此处的解决方案仅通过矩阵运算找到。我们将使用的大多数 TensorFlow 算法都是通过训练循环实现的，并利用自动反向传播来更新模型变量。在这里，我们通过实现将模型拟合到数据的直接解决方案来说明 TensorFlow 的多功能性。

我们在这里使用了一个二维数据示例来显示与数据拟合的图。值得注意的是，用于求解系数的公式

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-wYgUiXzw-1681566813067)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/32e2728d-bb84-491b-a4e1-e28108f5fff1.png)]

将根据需要扩展到数据中的许多特征（除非存在任何共线性问题）。

实现分解方法

对于这个秘籍，我们将实现一个用于线性回归的矩阵分解方法。具体来说，我们将使用 Cholesky 分解，TensorFlow 中存在相关函数。

准备

在大多数情况下，实现前一个秘籍中的逆方法在数值上效率低，尤其是当矩阵变得非常大时。另一种方法是分解A矩阵并对分解执行矩阵运算。一种方法是在 TensorFlow 中使用内置的 Cholesky 分解方法。

人们对将矩阵分解为更多矩阵如此感兴趣的一个原因是，所得到的矩阵将具有允许我们有效使用某些方法的保证属性。 Cholesky 分解将矩阵分解为下三角矩阵和上三角矩阵，比如L和L'，使得这些矩阵是彼此的转置。有关此分解属性的更多信息，有许多可用资源来描述它以及如何到达它。在这里，我们将通过将其写为LL'x = b来解决系统Ax = b。我们首先解决Ly = b的y，然后求解L'x = y得到我们的系数矩阵x。

操作步骤

我们按如下方式处理秘籍：

我们将以与上一个秘籍完全相同的方式设置系统。我们将导入库，初始化图并创建数据。然后，我们将以之前的方式获得我们的A矩阵和b矩阵：

import matplotlib.pyplot as plt 
import numpy as np 
import tensorflow as tf 
from tensorflow.python.framework import ops 
ops.reset_default_graph() 
sess = tf.Session() 
x_vals = np.linspace(0, 10, 100) 
y_vals = x_vals + np.random.normal(0, 1, 100) 
x_vals_column = np.transpose(np.matrix(x_vals)) 
ones_column = np.transpose(np.matrix(np.repeat(1, 100))) 
A = np.column_stack((x_vals_column, ones_column)) 
b = np.transpose(np.matrix(y_vals)) 
A_tensor = tf.constant(A) 
b_tensor = tf.constant(b)

接下来，我们找到方阵的 Cholesky 分解，A^T A：

tA_A = tf.matmul(tf.transpose(A_tensor), A_tensor) 
L = tf.cholesky(tA_A) 
tA_b = tf.matmul(tf.transpose(A_tensor), b) 
sol1 = tf.matrix_solve(L, tA_b) 
sol2 = tf.matrix_solve(tf.transpose(L), sol1)

请注意，TensorFlow 函数cholesky()仅返回分解的下对角线部分。这很好，因为上对角矩阵只是下对角矩阵的转置。

现在我们有了解决方案，我们提取系数：

solution_eval = sess.run(sol2) 
slope = solution_eval[0][0] 
y_intercept = solution_eval[1][0] 
print('slope: ' + str(slope)) 
print('y_intercept: ' + str(y_intercept)) 
slope: 0.956117676145 
y_intercept: 0.136575513864 
best_fit = [] 
for i in x_vals: 
    best_fit.append(slope*i+y_intercept) 
plt.plot(x_vals, y_vals, 'o', label='Data') 
plt.plot(x_vals, best_fit, 'r-', label='Best fit line', linewidth=3) 
plt.legend(loc='upper left') 
plt.show()

绘图如下：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-NpZoY504-1681566813067)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/57a190f8-110c-4cd2-902f-3668dc603f65.png)]

图 2：通过 Cholesky 分解获得的数据点和最佳拟合线

工作原理

如您所见，我们得出了与之前秘籍非常相似的答案。请记住，这种分解矩阵然后对碎片执行操作的方式有时会更加高效和数值稳定，尤其是在考虑大型数据矩阵时。

学习 TensorFlow 线性回归方法

虽然使用矩阵和分解方法非常强大，但 TensorFlow 还有另一种解决斜率和截距的方法。 TensorFlow 可以迭代地执行此操作，逐步学习最小化损失的线性回归参数。

准备

在这个秘籍中，我们将遍历批量数据点并让 TensorFlow 更新斜率和y截距。我们将使用内置于 scikit-learn 库中的鸢尾花数据集，而不是生成的数据。具体来说，我们将通过数据点找到最佳线，其中x值是花瓣宽度，y值是萼片长度。我们选择了这两个，因为它们之间似乎存在线性关系，我们将在最后的绘图中看到。我们还将在下一节中详细讨论不同损失函数的影响，但对于这个秘籍，我们将使用 L2 损失函数。

操作步骤

我们按如下方式处理秘籍：

我们首先加载必要的库，创建图并加载数据：

import matplotlib.pyplot as plt 
import numpy as np 
import tensorflow as tf 
from sklearn import datasets 
from tensorflow.python.framework import ops 
ops.reset_default_graph() 
sess = tf.Session() 
iris = datasets.load_iris() 
x_vals = np.array([x[3] for x in iris.data]) 
y_vals = np.array([y[0] for y in iris.data])

然后我们声明我们的学习率，批量大小，占位符和模型变量：

learning_rate = 0.05 
batch_size = 25 
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32) 
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32) 
A = tf.Variable(tf.random_normal(shape=[1,1])) 
b = tf.Variable(tf.random_normal(shape=[1,1]))

接下来，我们编写线性模型的公式y = Ax + b：

model_output = tf.add(tf.matmul(x_data, A), b)

然后，我们声明我们的 L2 损失函数（包括批量的平均值），初始化变量，并声明我们的优化器。请注意，我们选择0.05作为我们的学习率：

loss = tf.reduce_mean(tf.square(y_target - model_output)) 
init = tf.global_variables_initializer() 
sess.run(init) 
my_opt = tf.train.GradientDescentOptimizer(learning_rate) 
train_step = my_opt.minimize(loss)

我们现在可以在随机选择的批次上循环并训练模型。我们将运行 100 个循环并每 25 次迭代打印出变量和损失值。请注意，在这里，我们还保存了每次迭代的损失，以便我们以后可以查看它们：

loss_vec = [] 
for i in range(100): 
    rand_index = np.random.choice(len(x_vals), size=batch_size) 
    rand_x = np.transpose([x_vals[rand_index]]) 
    rand_y = np.transpose([y_vals[rand_index]]) 
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y}) 
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y}) 
    loss_vec.append(temp_loss) 
    if (i+1)%25==0: 
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)) + ' b = ' + str(sess.run(b))) 
        print('Loss = ' + str(temp_loss)) 
Step #25 A = [[ 2.17270374]] b = [[ 2.85338426]] 
Loss = 1.08116 
Step #50 A = [[ 1.70683455]] b = [[ 3.59916329]] 
Loss = 0.796941 
Step #75 A = [[ 1.32762754]] b = [[ 4.08189011]] 
Loss = 0.466912 
Step #100 A = [[ 1.15968263]] b = [[ 4.38497639]] 
Loss = 0.281003

接下来，我们将提取我们找到的系数并创建一个最合适的线以放入图中：

[slope] = sess.run(A) 
[y_intercept] = sess.run(b) 
best_fit = [] 
for i in x_vals: 
    best_fit.append(slope*i+y_intercept)

在这里，我们将创建两个图。第一个是覆盖拟合线的数据。第二个是 100 次迭代中的 L2 损失函数。这是生成两个图的代码。

plt.plot(x_vals, y_vals, 'o', label='Data Points') 
plt.plot(x_vals, best_fit, 'r-', label='Best fit line', linewidth=3) 
plt.legend(loc='upper left') 
plt.title('Sepal Length vs Petal Width') 
plt.xlabel('Petal Width') 
plt.ylabel('Sepal Length') 
plt.show() 
plt.plot(loss_vec, 'k-') 
plt.title('L2 Loss per Generation') 
plt.xlabel('Generation') 
plt.ylabel('L2 Loss') 
plt.show()

此代码生成以下拟合数据和损失图。

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-18aoJlI7-1681566813067)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/e3f807db-f6b3-4808-8b2e-990ed0c444b8.png)]

图 3：来自鸢尾数据集的数据点（萼片长度与花瓣宽度）重叠在 TensorFlow 中找到的最佳线条拟合。

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-2oU3jpIn-1681566813067)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/3d3d7762-7528-4101-bd3b-8175f11e0c83.png)]

图 4：用我们的算法拟合数据的 L2 损失；注意损失函数中的抖动，可以通过较大的批量大小减小抖动，或者通过较小的批量大小来增加。

Here is a good place to note how to see whether the model is overfitting or underfitting the data. If our data is broken into test and training sets, and the accuracy is greater on the training set and lower on the test set, then we are overfitting the data. If the accuracy is still increasing on both test and training sets, then the model is underfitting and we should continue training.

工作原理

找到的最佳线不保证是最合适的线。最佳拟合线的收敛取决于迭代次数，批量大小，学习率和损失函数。随着时间的推移观察损失函数总是很好的做法，因为它可以帮助您解决问题或超参数变化。

理解线性回归中的损失函数

了解损失函数在算法收敛中的作用非常重要。在这里，我们将说明 L1 和 L2 损失函数如何影响线性回归中的收敛。

准备

我们将使用与先前秘籍中相同的鸢尾数据集，但我们将更改损失函数和学习率以查看收敛如何变化。

操作步骤

我们按如下方式处理秘籍：

程序的开始与上一个秘籍相同，直到我们达到我们的损失函数。我们加载必要的库，启动会话，加载数据，创建占位符，并定义我们的变量和模型。需要注意的一点是，我们正在提取学习率和模型迭代。我们这样做是因为我们希望显示快速更改这些参数的效果。使用以下代码：

import matplotlib.pyplot as plt 
import numpy as np 
import tensorflow as tf 
from sklearn import datasets 
sess = tf.Session() 
iris = datasets.load_iris() 
x_vals = np.array([x[3] for x in iris.data]) 
y_vals = np.array([y[0] for y in iris.data]) 
batch_size = 25 
learning_rate = 0.1 # Will not converge with learning rate at 0.4 
iterations = 50 
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32) 
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32) 
A = tf.Variable(tf.random_normal(shape=[1,1])) 
b = tf.Variable(tf.random_normal(shape=[1,1])) 
model_output = tf.add(tf.matmul(x_data, A), b)

我们的损失函数将变为 L1 损失（loss_l1），如下所示：

loss_l1 = tf.reduce_mean(tf.abs(y_target - model_output))

现在，我们通过初始化变量，声明我们的优化器以及通过训练循环迭代数据来恢复。请注意，我们也在节省每一代的损失来衡量收敛。使用以下代码：

init = tf.global_variables_initializer() 
sess.run(init) 
my_opt_l1 = tf.train.GradientDescentOptimizer(learning_rate) 
train_step_l1 = my_opt_l1.minimize(loss_l1) 
loss_vec_l1 = [] 
for i in range(iterations): 
    rand_index = np.random.choice(len(x_vals), size=batch_size) 
    rand_x = np.transpose([x_vals[rand_index]]) 
    rand_y = np.transpose([y_vals[rand_index]]) 
    sess.run(train_step_l1, feed_dict={x_data: rand_x, y_target: rand_y}) 
    temp_loss_l1 = sess.run(loss_l1, feed_dict={x_data: rand_x, y_target: rand_y}) 
    loss_vec_l1.append(temp_loss_l1) 
    if (i+1)%25==0: 
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)) + ' b = ' + str(sess.run(b))) 
plt.plot(loss_vec_l1, 'k-', label='L1 Loss') 
plt.plot(loss_vec_l2, 'r--', label='L2 Loss') 
plt.title('L1 and L2 Loss per Generation') 
plt.xlabel('Generation') 
plt.ylabel('L1 Loss') 
plt.legend(loc='upper right') 
plt.show()

工作原理

在选择损失函数时，我们还必须选择适合我们问题的相应学习率。在这里，我们将说明两种情况，一种是首选 L2，另一种是首选 L1。

如果我们的学习率很小，我们的收敛会花费更多时间。但是如果我们的学习速度太大，我们的算法就会遇到问题从不收敛。下面是当学习率为 0.05 时，鸢尾线性回归问题的 L1 和 L2 损失的损失函数图：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-P0bdRUn6-1681566813068)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/bf44e506-bc88-41c2-b566-ddf2fe68bbd6.png)]

图 5：鸢尾线性回归问题的学习率为 0.05 的 L1 和 L2 损失

学习率为 0.05 时，似乎 L2 损失是首选，因为它会收敛到较低的损失。下面是我们将学习率提高到 0.4 时的损失函数图：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-LbgISEgU-1681566813068)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/729d53ef-d7e2-4068-8ae7-2fc6e91c572b.png)]

图 6：鸢尾线性回归问题的 L1 和 L2 损失，学习率为 0.4;请注意，由于 y 轴的高比例，L1 损失不可见

在这里，我们可以看到高学习率可以在 L2 范数中超调，而 L1 范数收敛。

为了理解正在发生的事情，我们应该看看大学习率和小学习率如何影响 L1 范数和 L2 范数。为了使这个可视化，我们查看两个规范的学习步骤的一维表示，如下所示：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-TxezJnvf-1681566813068)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/f15420de-a851-4e7f-8b87-434e1eba5e2d.png)]

图 7：学习率越来越高的 L1 和 L2 范数会发生什么

实现戴明回归

在这个秘籍中，我们将实现戴明回归，这意味着我们需要一种不同的方法来测量模型线和数据点之间的距离。

戴明回归有几个名字。它也称为总回归，正交距离回归（ODR）和最短距离回归。

准备

如果最小二乘线性回归最小化到线的垂直距离，则戴明回归最小化到线的总距离。这种类型的回归可以最小化y和x值的误差。

请参阅下图进行比较：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-s8G0k7Zm-1681566813068)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/7b0c20a1-43c3-4a07-9785-2f7a5706ad6b.png)]

图 8：常规线性回归和戴明回归之间的差异；左边的线性回归最小化了到线的垂直距离，右边的变形回归最小化了到线的总距离

要实现戴明回归，我们必须修改损失函数。常规线性回归中的损失函数使垂直距离最小化。在这里，我们希望最小化总距离。给定线的斜率和截距，到点的垂直距离是已知的几何公式。我们只需要替换此公式并告诉 TensorFlow 将其最小化。

操作步骤

我们按如下方式处理秘籍：

代码与之前的秘籍非常相似，除非我们进入损失函数。我们首先加载库；开始一个会议；加载数据；声明批量大小；并创建占位符，变量和模型输出，如下所示：

import matplotlib.pyplot as plt 
import numpy as np 
import tensorflow as tf 
from sklearn import datasets 
sess = tf.Session() 
iris = datasets.load_iris() 
x_vals = np.array([x[3] for x in iris.data]) 
y_vals = np.array([y[0] for y in iris.data]) 
batch_size = 50 
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32) 
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32) 
A = tf.Variable(tf.random_normal(shape=[1,1])) 
b = tf.Variable(tf.random_normal(shape=[1,1])) 
model_output = tf.add(tf.matmul(x_data, A), b)

损失函数是由分子和分母组成的几何公式。为清楚起见，我们将分别编写这些内容。给定一条线y = mx + b和一个点(x0, y0)，两者之间的垂直距离可以写成如下：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-lGXMMZqu-1681566813069)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/9537f564-ab2b-4b6a-936e-2432e21e2859.png)]

deming_numerator = tf.abs(tf.sub(y_target, tf.add(tf.matmul(x_data, A), b))) 
deming_denominator = tf.sqrt(tf.add(tf.square(A),1)) 
loss = tf.reduce_mean(tf.truediv(deming_numerator, deming_denominator))

我们现在初始化变量，声明我们的优化器，并循环遍历训练集以获得我们的参数，如下所示：

init = tf.global_variables_initializer() 
sess.run(init) 
my_opt = tf.train.GradientDescentOptimizer(0.1) 
train_step = my_opt.minimize(loss) 
loss_vec = [] 
for i in range(250): 
    rand_index = np.random.choice(len(x_vals), size=batch_size) 
    rand_x = np.transpose([x_vals[rand_index]]) 
    rand_y = np.transpose([y_vals[rand_index]]) 
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y}) 
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y}) 
    loss_vec.append(temp_loss) 
    if (i+1)%50==0: 
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)) + ' b = ' + str(sess.run(b))) 
        print('Loss = ' + str(temp_loss))

我们可以使用以下代码绘制输出：

[slope] = sess.run(A) 
[y_intercept] = sess.run(b) 
best_fit = [] 
for i in x_vals: 
    best_fit.append(slope*i+y_intercept)
plt.plot(x_vals, y_vals, 'o', label='Data Points') 
plt.plot(x_vals, best_fit, 'r-', label='Best fit line', linewidth=3) 
plt.legend(loc='upper left') 
plt.title('Sepal Length vs petal Width') 
plt.xlabel('petal Width') 
plt.ylabel('Sepal Length') 
plt.show()

我们得到上面代码的以下图：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-XGPoGCFc-1681566813069)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/e53cec07-65ac-4af7-8a26-f9d5a1c27642.png)]

图 9：对鸢尾数据集进行戴明回归的解决方案

工作原理

戴明回归的方法几乎与常规线性回归相同。关键的区别在于我们如何衡量预测和数据点之间的损失。而不是垂直损失，我们对y和x值有垂直损失（或总损失）。

当我们假设x和y值中的误差相似时，使用这种类型的回归。根据我们的假设，我们还可以根据误差的差异在距离计算中缩放x和y轴。

实现套索和岭回归

还有一些方法可以限制系数对回归输出的影响。这些方法称为正则化方法，两种最常见的正则化方法是套索和岭回归。我们将介绍如何在本文中实现这两个方面。

准备

套索和岭回归与常规线性回归非常相似，除了我们添加正则化项以限制公式中的斜率（或部分斜率）。这可能有多种原因，但一个常见的原因是我们希望限制对因变量产生影响的特征。这可以通过在损失函数中添加一个取决于我们的斜率值A的项来实现。

对于套索回归，如果斜率A超过某个值，我们必须添加一个能大大增加损失函数的项。我们可以使用 TensorFlow 的逻辑运算，但它们没有与之关联的梯度。相反，我们将使用称为连续重阶函数的阶梯函数的连续近似，该函数按比例放大到我们选择的正则化截止值。我们将展示如何在此秘籍中进行套索回归。

对于岭回归，我们只是在 L2 范数中添加一个项，这是斜率系数的缩放 L2 范数。这种修改很简单，并在本秘籍末尾的“更多”部分中显示。

操作步骤

我们按如下方式处理秘籍：

我们将再次使用鸢尾花数据集并以与以前相同的方式设置我们的脚本。我们先加载库；开始一个会议；加载数据；声明批量大小；并创建占位符，变量和模型输出，如下所示：

import matplotlib.pyplot as plt 
import numpy as np 
import tensorflow as tf 
from sklearn import datasets 
from tensorflow.python.framework import ops 
ops.reset_default_graph() 
sess = tf.Session() 
iris = datasets.load_iris() 
x_vals = np.array([x[3] for x in iris.data]) 
y_vals = np.array([y[0] for y in iris.data]) 
batch_size = 50 
learning_rate = 0.001 
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32) 
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32) 
A = tf.Variable(tf.random_normal(shape=[1,1])) 
b = tf.Variable(tf.random_normal(shape=[1,1])) 
model_output = tf.add(tf.matmul(x_data, A), b)

我们添加了损失函数，它是一个改进的连续 Heaviside 阶梯函数。我们还为0.9设定了套索回归的截止值。这意味着我们希望将斜率系数限制为小于0.9。使用以下代码：

lasso_param = tf.constant(0.9) 
heavyside_step = tf.truediv(1., tf.add(1., tf.exp(tf.multiply(-100., tf.subtract(A, lasso_param))))) 
regularization_param = tf.mul(heavyside_step, 99.) 
loss = tf.add(tf.reduce_mean(tf.square(y_target - model_output)), regularization_param)

我们现在初始化变量并声明我们的优化器，如下所示：

init = tf.global_variables_initializer() 
sess.run(init) 
my_opt = tf.train.GradientDescentOptimizer(learning_rate) 
train_step = my_opt.minimize(loss)

我们将训练循环延长了一段时间，因为它可能需要一段时间才能收敛。我们可以看到斜率系数小于0.9。使用以下代码：

loss_vec = [] 
for i in range(1500): 
    rand_index = np.random.choice(len(x_vals), size=batch_size) 
    rand_x = np.transpose([x_vals[rand_index]]) 
    rand_y = np.transpose([y_vals[rand_index]]) 
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y}) 
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y}) 
    loss_vec.append(temp_loss[0]) 
    if (i+1)%300==0: 
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)) + ' b = ' + str(sess.run(b))) 
        print('Loss = ' + str(temp_loss)) 
Step #300 A = [[ 0.82512331]] b = [[ 2.30319238]] 
Loss = [[ 6.84168959]] 
Step #600 A = [[ 0.8200165]] b = [[ 3.45292258]] 
Loss = [[ 2.02759886]] 
Step #900 A = [[ 0.81428504]] b = [[ 4.08901262]] 
Loss = [[ 0.49081498]] 
Step #1200 A = [[ 0.80919558]] b = [[ 4.43668795]] 
Loss = [[ 0.40478843]] 
Step #1500 A = [[ 0.80433637]] b = [[ 4.6360755]] 
Loss = [[ 0.23839757]]

工作原理

我们通过在线性回归的损失函数中添加连续的 Heaviside 阶跃函数来实现套索回归。由于阶梯函数的陡峭性，我们必须小心步长。步长太大而且不会收敛。对于岭回归，请参阅下一节中所需的更改。

对于岭回归，我们将损失ss函数更改为如下：

ridge_param = tf.constant(1.) 
ridge_loss = tf.reduce_mean(tf.square(A)) 
loss = tf.expand_dims(tf.add(tf.reduce_mean(tf.square(y_target - model_output)), tf.multiply(ridge_param, ridge_loss)), 0)

实现弹性网络回归

弹性网络回归是一种回归类型，通过将 L1 和 L2 正则化项添加到损失函数，将套索回归与岭回归相结合。

准备

在前两个秘籍之后实现弹性网络回归应该是直截了当的，因此我们将在鸢尾数据集上的多元线性回归中实现这一点，而不是像以前那样坚持二维数据。我们将使用花瓣长度，花瓣宽度和萼片宽度来预测萼片长度。

操作步骤

我们按如下方式处理秘籍：

首先，我们加载必要的库并初始化图，如下所示：

import matplotlib.pyplot as plt 
import numpy as np 
import tensorflow as tf 
from sklearn import datasets 
sess = tf.Session()

现在，我们加载数据。这次，x数据的每个元素将是三个值的列表而不是一个。使用以下代码：

iris = datasets.load_iris() 
x_vals = np.array([[x[1], x[2], x[3]] for x in iris.data]) 
y_vals = np.array([y[0] for y in iris.data])

接下来，我们声明批量大小，占位符，变量和模型输出。这里唯一的区别是我们更改x数据占位符的大小规范，取三个值而不是一个，如下所示：

batch_size = 50 
learning_rate = 0.001 
x_data = tf.placeholder(shape=[None, 3], dtype=tf.float32) 
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32) 
A = tf.Variable(tf.random_normal(shape=[3,1])) 
b = tf.Variable(tf.random_normal(shape=[1,1])) 
model_output = tf.add(tf.matmul(x_data, A), b)

对于弹性网络，损失函数具有部分斜率的 L1 和 L2 范数。我们创建这些项，然后将它们添加到损失函数中，如下所示：

elastic_param1 = tf.constant(1.) 
elastic_param2 = tf.constant(1.) 
l1_a_loss = tf.reduce_mean(tf.abs(A)) 
l2_a_loss = tf.reduce_mean(tf.square(A)) 
e1_term = tf.multiply(elastic_param1, l1_a_loss) 
e2_term = tf.multiply(elastic_param2, l2_a_loss) 
loss = tf.expand_dims(tf.add(tf.add(tf.reduce_mean(tf.square(y_target - model_output)), e1_term), e2_term), 0)

现在，我们可以初始化变量，声明我们的优化函数，运行训练循环，并拟合我们的系数，如下所示：

init = tf.global_variables_initializer() 
sess.run(init) 
my_opt = tf.train.GradientDescentOptimizer(learning_rate) 
train_step = my_opt.minimize(loss) 
loss_vec = [] 
for i in range(1000): 
    rand_index = np.random.choice(len(x_vals), size=batch_size) 
    rand_x = x_vals[rand_index] 
    rand_y = np.transpose([y_vals[rand_index]]) 
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y}) 
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y}) 
    loss_vec.append(temp_loss[0]) 
    if (i+1)%250==0: 
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)) + ' b = ' + str(sess.run(b))) 
        print('Loss = ' + str(temp_loss))

这是代码的输出：

Step #250 A = [[ 0.42095602] 
 [ 0.1055888 ] 
 [ 1.77064979]] b = [[ 1.76164341]] 
Loss = [ 2.87764359] 
Step #500 A = [[ 0.62762028] 
 [ 0.06065864] 
 [ 1.36294949]] b = [[ 1.87629771]] 
Loss = [ 1.8032167] 
Step #750 A = [[ 0.67953539] 
 [ 0.102514 ] 
 [ 1.06914485]] b = [[ 1.95604002]] 
Loss = [ 1.33256555] 
Step #1000 A = [[ 0.6777274 ] 
 [ 0.16535147] 
 [ 0.8403284 ]] b = [[ 2.02246833]] 
Loss = [ 1.21458709]

现在，我们可以观察训练迭代的损失，以确保算法收敛，如下所示：

plt.plot(loss_vec, 'k-') 
plt.title('Loss per Generation') 
plt.xlabel('Generation') 
plt.ylabel('Loss') 
plt.show()

我们得到上面代码的以下图：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-W5AMZU9L-1681566813069)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/12a7c632-d523-401b-b9bd-769b8b765f67.png)]

图 10：在 1,000 次训练迭代中绘制的弹性净回归损失

工作原理

这里实现弹性网络回归以及多元线性回归。我们可以看到，利用损失函数中的这些正则化项，收敛速度比先前的秘籍慢。正则化就像在损失函数中添加适当的项一样简单。

实现逻辑回归

对于这个秘籍，我们将实现逻辑回归来预测样本人群中低出生体重的概率。

准备

逻辑回归是将线性回归转换为二元分类的一种方法。这是通过将线性输出转换为 Sigmoid 函数来实现的，该函数将输出在 0 和 1 之间进行缩放。目标是零或一，表示数据点是在一个类还是另一个类中。由于我们预测 0 和 1 之间的数字，如果预测高于指定的截止值，则预测被分类为类值 1，否则分类为 0。出于此示例的目的，我们将指定截断为 0.5，这将使分类像舍入输出一样简单。

我们将用于此示例的数据将是从作者的 GitHub 仓库获得的低出生体重数据。我们将从其他几个因素预测低出生体重。

操作步骤

我们按如下方式处理秘籍：

我们首先加载库，包括request库，因为我们将通过超链接访问低出生体重数据。我们还发起了一个会议：

import matplotlib.pyplot as plt 
import numpy as np 
import tensorflow as tf 
import requests 
from sklearn import datasets 
from sklearn.preprocessing import normalize 
from tensorflow.python.framework import ops 
ops.reset_default_graph() 
sess = tf.Session()

接下来，我们通过请求模块加载数据并指定我们要使用的特征。我们必须具体，因为一个特征是实际出生体重，我们不想用它来预测出生体重是大于还是小于特定量。我们也不想将 ID 列用作预测器：

birth_weight_file = 'birth_weight.csv'
# Download data and create data file if file does not exist in current directory
if not os.path.exists(birth_weight_file):
    birthdata_url = 'https://github.com/nfmcclure/tensorflow_cookbook/raw/master/01_Introduction/07_Working_with_Data_Sources/birthweight_data/birthweight.dat'
    birth_file = requests.get(birthdata_url)
    birth_data = birth_file.text.split('\r\n')
    birth_header = birth_data[0].split('\t')
    birth_data = [[float(x) for x in y.split('\t') if len(x)>=1] for y in birth_data[1:] if len(y)>=1]
    with open(birth_weight_file, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(birth_header)
        writer.writerows(birth_data)
# Read birth weight data into memory
birth_data = []
with open(birth_weight_file, newline='') as csvfile:
    csv_reader = csv.reader(csvfile)
    birth_header = next(csv_reader)
    for row in csv_reader:
        birth_data.append(row)
    birth_data = [[float(x) for x in row] for row in birth_data]
# Pull out target variable
y_vals = np.array([x[0] for x in birth_data])
# Pull out predictor variables (not id, not target, and not birthweight)
x_vals = np.array([x[1:8] for x in birth_data])

首先，我们将数据集拆分为测试和训练集：

train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False) 
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices))) 
x_vals_train = x_vals[train_indices] 
x_vals_test = x_vals[test_indices] 
y_vals_train = y_vals[train_indices] 
y_vals_test = y_vals[test_indices]

当特征在 0 和 1 之间缩放（最小 - 最大缩放）时，逻辑回归收敛效果更好。那么，接下来我们将扩展每个特征：

def normalize_cols(m, col_min=np.array([None]), col_max=np.array([None])):
    if not col_min[0]:
        col_min = m.min(axis=0)
    if not col_max[0]:
        col_max = m.max(axis=0)
    return (m-col_min) / (col_max - col_min), col_min, col_max
x_vals_train, train_min, train_max = np.nan_to_num(normalize_cols(x_vals_train))
x_vals_test = np.nan_to_num(normalize_cols(x_vals_test, train_min, train_max))

请注意，在缩放数据集之前，我们将数据集拆分为训练和测试。这是一个重要的区别。我们希望确保测试集完全不影响训练集。如果我们在分裂之前缩放整个集合，那么我们不能保证它们不会相互影响。我们确保从训练组中保存缩放以缩放测试集。

接下来，我们声明批量大小，占位符，变量和逻辑模型。我们不将输出包装在 sigmoid 中，因为该操作内置于损失函数中。另请注意，每次观察都有七个输入特征，因此x_data占位符的大小为[None, 7]。

batch_size = 25 
x_data = tf.placeholder(shape=[None, 7], dtype=tf.float32) 
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32) 
A = tf.Variable(tf.random_normal(shape=[7,1])) 
b = tf.Variable(tf.random_normal(shape=[1,1])) 
model_output = tf.add(tf.matmul(x_data, A), b)

现在，我们声明我们的损失函数，它具有 sigmoid 函数，初始化我们的变量，并声明我们的优化函数：

loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(model_output, y_target)) 
init = tf.global_variables_initializer() 
sess.run(init) 
my_opt = tf.train.GradientDescentOptimizer(0.01) 
train_step = my_opt.minimize(loss)

在记录损失函数的同时，我们还希望在训练和测试集上记录分类准确率。因此，我们将创建一个预测函数，返回任何大小的批量的准确率：

prediction = tf.round(tf.sigmoid(model_output)) 
predictions_correct = tf.cast(tf.equal(prediction, y_target), tf.float32) 
accuracy = tf.reduce_mean(predictions_correct)

现在，我们可以开始我们的训练循环并记录损失和准确率：

loss_vec = [] 
train_acc = [] 
test_acc = [] 
for i in range(1500): 
    rand_index = np.random.choice(len(x_vals_train), size=batch_size) 
    rand_x = x_vals_train[rand_index] 
    rand_y = np.transpose([y_vals_train[rand_index]]) 
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y}) 
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y}) 
    loss_vec.append(temp_loss) 
    temp_acc_train = sess.run(accuracy, feed_dict={x_data: x_vals_train, y_target: np.transpose([y_vals_train])}) 
    train_acc.append(temp_acc_train) 
    temp_acc_test = sess.run(accuracy, feed_dict={x_data: x_vals_test, y_target: np.transpose([y_vals_test])}) 
    test_acc.append(temp_acc_test)

以下是查看损失和准确率图的代码：

plt.plot(loss_vec, 'k-') 
plt.title('Cross' Entropy Loss per Generation') 
plt.xlabel('Generation') 
plt.ylabel('Cross' Entropy Loss') 
plt.show() 
plt.plot(train_acc, 'k-', label='Train Set Accuracy') 
plt.plot(test_acc, 'r--', label='Test Set Accuracy') 
plt.title('Train and Test Accuracy') 
plt.xlabel('Generation') 
plt.ylabel('Accuracy') 
plt.legend(loc='lower right') 
plt.show()

工作原理

这是迭代和训练和测试精度的损失。由于数据集仅为 189 次观测，因此随着数据集的随机分裂，训练和测试精度图将发生变化。第一个数字是交叉熵损失：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-RvTYHVDR-1681566813069)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/0efa3298-7046-432d-8393-ca1765b44175.png)]

图 11：在 1,500 次迭代过程中绘制的交叉熵损失

第二个图显示了训练和测试装置的准确率：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-KO0IYaLB-1681566813070)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/a6f50907-4950-402e-b327-053a02deb6b3.png)]Figure 12: Test and train set accuracy plotted over 1,500 generations

四、支持向量机

本章将介绍有关如何在 TensorFlow 中使用，实现和评估支持向量机（SVM）的一些重要秘籍。将涵盖以下领域：

使用线性 SVM
回退到线性回归
在 TensorFlow 中使用核
实现非线性 SVM
实现多类 SVM

本章中先前涵盖的逻辑回归和大多数 SVM 都是二元预测变量。虽然逻辑回归试图找到最大化距离的任何分离线（概率地），但 SVM 还尝试最小化误差，同时最大化类之间的余量。通常，如果问题与训练示例相比具有大量特征，请尝试逻辑回归或线性 SVM。如果训练样本的数量较大，或者数据不是线性可分的，则可以使用具有高斯核的 SVM。

另外，请记住本章的所有代码都可以在 Github 和 Packt 仓库中找到。

介绍

SVM 是二分类的方法。基本思想是在两个类之间找到二维的线性分离线（或更多维度的超平面）。我们首先假设二元类目标是 -1 或 1，而不是先前的 0 或 1 目标。由于可能有许多行分隔两个类，我们定义最佳线性分隔符，以最大化两个类之间的距离：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-VwoJbYof-1681566813070)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/bc479121-6689-4836-9221-2f694919ce64.png)]

图 1

给定两个可分类o和x，我们希望找到两者之间的线性分离器的等式。左侧绘图显示有许多行将两个类分开。右侧绘图显示了唯一的最大边际线。边距宽度由2 / ||A||给出。通过最小化A的 L2 范数找到该线。

我们可以编写如下超平面：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-4FLfWZZT-1681566813070)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/d459df75-35ba-46f7-b3a0-2e6d934d5636.png)]

这里，A是我们部分斜率的向量，x是输入向量。最大边距的宽度可以显示为 2 除以A的 L2 范数。这个事实有许多证明，但是对于几何思想，求解从 2D 点到直线的垂直距离可以提供前进的动力。

对于线性可分的二元类数据，为了最大化余量，我们最小化A，[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-injRxuHI-1681566813070)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/c8bd7a37-6085-4843-8d01-34b9877ffe39.png)]的 L2 范数。我们还必须将此最小值置于以下约束条件下：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-3DNdvXCF-1681566813070)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/35825b20-6e93-4136-acb4-f51128ed4c7f.png)]

前面的约束确保我们来自相应类的所有点都在分离线的同一侧。

由于并非所有数据集都是线性可分的，因此我们可以为跨越边界线的点引入损失函数。对于n数据点，我们引入了所谓的软边际损失函数，如下所示：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-MPSfKPMp-1681566813071)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/068234be-29b7-4890-96ef-afd771e361f6.png)]

请注意，如果该点位于边距的正确一侧，则乘积y[i](Ax[i] - b)始终大于 1。这使得损失函数的左手项等于 0，并且对损失函数的唯一影响是余量的大小。

前面的损失函数将寻找线性可分的线，但允许穿过边缘线的点。根据α的值，这可以是硬度或软度量。α的较大值导致更加强调边距的扩大，而α的较小值导致模型更像是一个硬边缘，同时允许数据点跨越边距，如果需要的话。

在本章中，我们将建立一个软边界 SVM，并展示如何将其扩展到非线性情况和多个类。

使用线性 SVM

对于此示例，我们将从鸢尾花数据集创建线性分隔符。我们从前面的章节中知道，萼片长度和花瓣宽度创建了一个线性可分的二分类数据集，用于预测花是否是山鸢尾（I）。

准备

要在 TensorFlow 中实现软可分 SVM，我们将实现特定的损失函数，如下所示：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-zOGiOLH9-1681566813071)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/11e1a288-7294-4c7c-bbee-37e854bbf309.png)]

这里，A是部分斜率的向量，b是截距，x[i]是输入向量，y[i]是实际类，（-1 或 1），α是软可分性正则化参数。

操作步骤

我们按如下方式处理秘籍：

我们首先加载必要的库。这将包括用于访问鸢尾数据集的scikit-learn数据集库。使用以下代码：

import matplotlib.pyplot as plt 
import numpy as np 
import tensorflow as tf 
from sklearn import datasets

要为此练习设置 scikit-learn，我们只需要输入$pip install -U scikit-learn。请注意，它也安装了 Anaconda。

接下来，我们启动图会话并根据需要加载数据。请记住，我们正在加载鸢尾数据集中的第一个和第四个变量，因为它们是萼片长度和萼片宽度。我们正在加载目标变量，对于山鸢尾将取值 1，否则为 -1。使用以下代码：

sess = tf.Session() 
iris = datasets.load_iris() 
x_vals = np.array([[x[0], x[3]] for x in iris.data]) 
y_vals = np.array([1 if y==0 else -1 for y in iris.target])

我们现在应该将数据集拆分为训练集和测试集。我们将评估训练和测试集的准确率。由于我们知道这个数据集是线性可分的，因此我们应该期望在两个集合上获得 100% 的准确率。要拆分数据，请使用以下代码：

train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False) 
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices))) 
x_vals_train = x_vals[train_indices] 
x_vals_test = x_vals[test_indices] 
y_vals_train = y_vals[train_indices] 
y_vals_test = y_vals[test_indices]

接下来，我们设置批量大小，占位符和模型变量。值得一提的是，使用这种 SVM 算法，我们需要非常大的批量大小来帮助收敛。我们可以想象，对于非常小的批量大小，最大边际线会略微跳跃。理想情况下，我们也会慢慢降低学习率，但现在这已经足够了。此外，A变量将采用2x1形状，因为我们有两个预测变量：萼片长度和花瓣宽度。要进行此设置，我们使用以下代码：

batch_size = 100 
x_data = tf.placeholder(shape=[None, 2], dtype=tf.float32) 
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32) 
A = tf.Variable(tf.random_normal(shape=[2,1])) 
b = tf.Variable(tf.random_normal(shape=[1,1]))

我们现在声明我们的模型输出。对于正确分类的点，如果目标是山鸢尾，则返回大于或等于 1 的数字，否则返回小于或等于 -1。模型输出使用以下代码：

model_output = tf.subtract(tf.matmul(x_data, A), b)

接下来，我们将汇总并声明必要的组件以获得最大的保证金损失。首先，我们将声明一个计算向量的 L2 范数的函数。然后，我们添加边距参数 [外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-5zigb5h1-1681566813071)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/1f304159-a7af-498c-b746-72d49ecadecb.png)]。然后我们宣布我们的分类损失并将这两项加在一起。使用以下代码：

l2_norm = tf.reduce_sum(tf.square(A)) 
alpha = tf.constant([0.1]) 
classification_term = tf.reduce_mean(tf.maximum(0., tf.subtract(1., tf.multiply(model_output, y_target)))) 
loss = tf.add(classification _term, tf.multiply(alpha, l2_norm))

现在，我们声明我们的预测和准确率函数，以便我们可以评估训练集和测试集的准确率，如下所示：

prediction = tf.sign(model_output) 
accuracy = tf.reduce_mean(tf.cast(tf.equal(prediction, y_target), tf.float32))

在这里，我们将声明我们的优化函数并初始化我们的模型变量；我们在以下代码中执行此操作：

my_opt = tf.train.GradientDescentOptimizer(0.01) 
train_step = my_opt.minimize(loss) 
init = tf.global_variables_initializer() 
sess.run(init)

我们现在可以开始我们的训练循环，记住我们想要在训练和测试集上记录我们的损失和训练准确率，如下所示：

loss_vec = [] 
train_accuracy = [] 
test_accuracy = [] 
for i in range(500): 
   rand_index = np.random.choice(len(x_vals_train), size=batch_size) 
   rand_x = x_vals_train[rand_index] 
    rand_y = np.transpose([y_vals_train[rand_index]]) 
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y}) 
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y}) 
    loss_vec.append(temp_loss) 
    train_acc_temp = sess.run(accuracy, feed_dict={x_data: x_vals_train, y_target: np.transpose([y_vals_train])}) 
    train_accuracy.append(train_acc_temp) 
    test_acc_temp = sess.run(accuracy, feed_dict={x_data: x_vals_test, y_target: np.transpose([y_vals_test])}) 
    test_accuracy.append(test_acc_temp) 
    if (i+1)%100==0: 
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)) + ' b = ' + str(sess.run(b))) 
        print('Loss = ' + str(temp_loss))

训练期间脚本的输出应如下所示：

Step #100 A = [[-0.10763293] 
 [-0.65735245]] b = [[-0.68752676]] 
Loss = [ 0.48756418] 
Step #200 A = [[-0.0650763 ] 
 [-0.89443302]] b = [[-0.73912662]] 
Loss = [ 0.38910741] 
Step #300 A = [[-0.02090022] 
 [-1.12334013]] b = [[-0.79332656]] 
Loss = [ 0.28621092] 
Step #400 A = [[ 0.03189624] 
 [-1.34912157]] b = [[-0.8507266]] 
Loss = [ 0.22397576] 
Step #500 A = [[ 0.05958777] 
 [-1.55989814]] b = [[-0.9000265]] 
Loss = [ 0.20492229]

为了绘制输出（拟合，损失和精度），我们必须提取系数并将x值分成山鸢尾和其它鸢尾，如下所示：

[[a1], [a2]] = sess.run(A) 
[[b]] = sess.run(b) 
slope = -a2/a1 
y_intercept = b/a1 
x1_vals = [d[1] for d in x_vals] 
best_fit = [] 
for i in x1_vals: 
    best_fit.append(slope*i+y_intercept) 
setosa_x = [d[1] for i,d in enumerate(x_vals) if y_vals[i]==1] 
setosa_y = [d[0] for i,d in enumerate(x_vals) if y_vals[i]==1] 
not_setosa_x = [d[1] for i,d in enumerate(x_vals) if y_vals[i]==-1] 
not_setosa_y = [d[0] for i,d in enumerate(x_vals) if y_vals[i]==-1]

以下是使用线性分离器拟合，精度和损耗绘制数据的代码：

plt.plot(setosa_x, setosa_y, 'o', label='I. setosa') 
plt.plot(not_setosa_x, not_setosa_y, 'x', label='Non-setosa') 
plt.plot(x1_vals, best_fit, 'r-', label='Linear Separator', linewidth=3) 
plt.ylim([0, 10]) 
plt.legend(loc='lower right') 
plt.title('Sepal Length vs Petal Width') 
plt.xlabel('Petal Width') 
plt.ylabel('Sepal Length') 
plt.show() 
plt.plot(train_accuracy, 'k-', label='Training Accuracy') 
plt.plot(test_accuracy, 'r--', label='Test Accuracy') 
plt.title('Train and Test Set Accuracies') 
plt.xlabel('Generation') 
plt.ylabel('Accuracy') 
plt.legend(loc='lower right') 
plt.show() 
plt.plot(loss_vec, 'k-') 
plt.title('Loss per Generation') 
plt.xlabel('Generation') 
plt.ylabel('Loss') 
plt.show()

以这种方式使用 TensorFlow 来实现 SVD 算法可能导致每次运行的结果略有不同。其原因包括随机训练/测试集拆分以及每个训练批次中不同批次点的选择。此外，在每一代之后慢慢降低学习率是理想的。

得到的图如下：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-NlwfUGcR-1681566813071)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/9c53587e-f2d7-4e1f-9af9-1bde936d24da.png)]

图 2：最终线性 SVM 与绘制的两个类别拟合

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-jW6NO9wQ-1681566813071)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/fb331bce-9d8e-4d8e-928e-8446022dac2c.png)]

图 3：迭代测试和训练集精度；我们确实获得 100% 的准确率，因为这两个类是线性可分的

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-Vmpx8h0a-1681566813072)(https://gitcode.net/apachecn/apachecn-dl-zh/-/raw/master/docs/tf-ml-cookbook-2e-zh/img/adb5aaf5-d575-4e93-a84c-7f709afcdcb6.png)]

图 4：超过 500 次迭代的最大边际损失图

工作原理

在本文中，我们已经证明使用最大边际损失函数可以实现线性 SVD 模型。

简化为线性回归

SVM 可用于拟合线性回归。在本节中，我们将探讨如何使用 TensorFlow 执行此操作。

TensorFlow 机器学习秘籍第二版：1~5（4）https://developer.aliyun.com/article/1426834

热门

活动广场

任务中心

开发者评测

高校计划

乘风者计划

训练营

阿里云MVP

话题

直播

下载

镜像站

技术资料

插件

TensorFlow 机器学习秘籍第二版：1~5（3）

工作原理

实现分解方法

准备

操作步骤

工作原理

学习 TensorFlow 线性回归方法

准备

操作步骤

工作原理

理解线性回归中的损失函数

准备

操作步骤

工作原理

更多

实现戴明回归

准备

操作步骤

工作原理

实现套索和岭回归

准备

操作步骤

工作原理

更多

实现弹性网络回归

准备

操作步骤

工作原理

实现逻辑回归

准备

操作步骤

工作原理

四、支持向量机

介绍

使用线性 SVM

准备

操作步骤

工作原理

简化为线性回归

热门文章

最新文章

相关课程

相关电子书

相关实验场景