A Tour of Python, Series 2 - NumPy (Part 2) (4)

2023-06-20


This article was first published on the WeChat official account “生信补给站”: https://mp.weixin.qq.com/s/0-9zUTxZC6qefyjLyw1SSA

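One-dimensional array

Analysis:

The total of 1, 2 and 3 is 6
The sum of the elements along axis 0 (the only axis) is also 6

Let's verify with code:

import numpy as np
arr = np.array([1,2,3])
print( 'The total sum is', arr.sum() )
print( 'The sum on axis0 is', arr.sum(axis=0) )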

The total sum is 6
The sum on axis0 is 6

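Summing a one-dimensional array is easy, and it reveals no pattern for "summing along an axis". Let's look at a two-dimensional array.

Two-dimensional array

Analysis:

The total of 1 through 6 is 21
The elements on axis 0 (those enclosed by the one red pair of brackets []) are [1, 2, 3] and [4, 5, 6]; summing them gives [[5, 7, 9]]
The elements on axis 1 (those enclosed by the two blue pairs of brackets []) are 1, 2, 3 and 4, 5, 6; summing them gives [[1+2+3], [4+5+6]] = [[6], [15]]

Let's verify with code:

arr = np.arange(1,7).reshape((2,3))
print( arr )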

[[1 2 3]
 [4 5 6]]

print( 'The total sum is', arr.sum() )
print( 'The sum on axis0 is', arr.sum(axis=0) )
print( 'The sum on axis1 is', arr.sum(axis=1) )

The total sum is 21
The sum on axis0 is [5 7 9]
The sum on axis1 is [ 6 15]

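The results are right, but they seem to have one pair of brackets fewer than the figure above suggests. The reason is that np.sum() has a parameter keepdims ("keep dimensions") whose default is False, so the redundant brackets are dropped: [[5, 7, 9]] becomes [5, 7, 9].

Set keepdims to True and the printed result matches the derivation in the figure above exactly.

print( arr.sum(axis=0, keepdims=True) )
print( arr.sum(axis=1, keepdims=True) )

[[5 7 9]]
[[ 6]
 [15]]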
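Counting brackets gets tedious, so it can help to compare shapes instead. The snippet below is a small sketch of that idea; the variable names s0, s1, k0 and k1 are mine, not from the original post.

```python
import numpy as np

arr = np.arange(1, 7).reshape((2, 3))

# Without keepdims the summed axis disappears from the shape.
s0 = arr.sum(axis=0)
s1 = arr.sum(axis=1)
print(s0.shape, s1.shape)    # (3,) (2,)

# With keepdims=True the summed axis stays, with length 1.
k0 = arr.sum(axis=0, keepdims=True)
k1 = arr.sum(axis=1, keepdims=True)
print(k0.shape, k1.shape)    # (1, 3) (2, 1)
```

In other words, each "missing bracket" is simply an axis of length 1 that was dropped.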

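Three-dimensional array

Analysis:

The total of 1 through 12 is 78
The elements on axis 0 are the two [[ ]] enclosed by the one red pair of brackets []; summing them gives one [ [[ ]] ]
The elements on axis 1 are the two [ ] inside each of the two blue pairs of brackets []; summing them gives two [[ ]], i.e. [ [[ ]], [[ ]] ]
The elements on axis 2 are the three scalars inside each of the four green pairs of brackets []; summing them gives four [ ], i.e. [ [[ ],[ ]], [[ ],[ ]] ]

Let's verify with code:

arr = np.arange(1,13).reshape((2,2,3))
print(arr)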

[[[ 1 2 3]
  [ 4 5 6]]
 [[ 7 8 9]
  [10 11 12]]]

print( 'The total sum is', arr.sum() )
print( 'The sum on axis0 is', arr.sum(axis=0) )
print( 'The sum on axis1 is', arr.sum(axis=1) )
print( 'The sum on axis2 is', arr.sum(axis=2) )

The total sum is 78
The sum on axis0 is [[ 8 10 12] [14 16 18]]
The sum on axis1 is [[ 5 7 9] [17 19 21]]
The sum on axis2 is [[ 6 15] [24 33]]

The printed results again have one pair of brackets fewer than the derivation in the figure above, because keepdims defaults to False.
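For the three-dimensional case the same bookkeeping can be done on shapes. This is only a sketch; the list shapes is a name I introduce to collect the results.

```python
import numpy as np

arr = np.arange(1, 13).reshape((2, 2, 3))

shapes = []
for axis in range(arr.ndim):
    reduced = arr.sum(axis=axis)               # the summed axis is removed
    kept = arr.sum(axis=axis, keepdims=True)   # the summed axis is kept with length 1
    shapes.append((reduced.shape, kept.shape))
    print('axis', axis, reduced.shape, kept.shape)
# axis 0 (2, 3) (1, 2, 3)
# axis 1 (2, 3) (2, 1, 3)
# axis 2 (2, 2) (2, 2, 1)
```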

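Four-dimensional array

No diagram this time; whoever had to draw the colored brackets would go crazy. The general rule: to sum along an axis, first identify the elements on that axis, then add them up. Specifically:

On axis 0 there are two [] to add up
On axis 1 there are two [] to add up
On axis 2 there are two [] to add up
On axis 3 there are three scalars to add up

Let's verify with code:

arr = np.arange(1,25).reshape((2,2,2,3))
print(arr)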

[[[[ 1 2 3]
   [ 4 5 6]]
  [[ 7 8 9]
   [10 11 12]]]
 [[[13 14 15]
   [16 17 18]]
  [[19 20 21]
   [22 23 24]]]]

print( 'The total sum is', arr.sum() )
print( 'The sum on axis0 is', arr.sum(axis=0) )
print( 'The sum on axis1 is', arr.sum(axis=1) )
print( 'The sum on axis2 is', arr.sum(axis=2) )
print( 'The sum on axis3 is', arr.sum(axis=3) )

The total sum is 300
The sum on axis0 is [[[14 16 18] [20 22 24]]
                     [[26 28 30] [32 34 36]]]
The sum on axis1 is [[[ 8 10 12] [14 16 18]]
                     [[32 34 36] [38 40 42]]]
The sum on axis2 is [[[ 5 7 9] [17 19 21]]
                     [[29 31 33] [41 43 45]]]
The sum on axis3 is [[[ 6 15] [24 33]]
                     [[42 51] [60 69]]]

The printed results again have one pair of brackets fewer than the derivation described above, because keepdims defaults to False.
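The general rule ("identify the elements on the axis, then add them up") can be checked by hand. The helper sum_by_slices below is hypothetical, written only for this illustration: it pulls out each slice along an axis with np.take and adds the slices together.

```python
import numpy as np

arr = np.arange(1, 25).reshape((2, 2, 2, 3))

def sum_by_slices(x, axis):
    """Hypothetical helper: add up the slices of x taken along one axis."""
    # np.take with a scalar index returns the slice with that axis removed.
    total = np.zeros_like(np.take(x, 0, axis=axis))
    for i in range(x.shape[axis]):
        total = total + np.take(x, i, axis=axis)
    return total

# The hand-rolled version agrees with arr.sum on every axis.
for axis in range(arr.ndim):
    assert np.array_equal(sum_by_slices(arr, axis), arr.sum(axis=axis))

print(sum_by_slices(arr, 3))
```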

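Recap

Besides sum, the aggregation functions include min, max, mean, std and cumsum, which compute the minimum, maximum, mean, standard deviation and cumulative sum. They pick out the elements to combine in the same way sum does, so I won't go through them one by one (one difference: cumsum keeps the running totals along the axis instead of collapsing it). In short, we can aggregate

all the elements of an array
the elements along a given axis

Aggregation functions = {sum, min, max, mean, std, cumsum}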
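As a quick illustration of the other aggregation functions, here is a sketch on the same 2×3 array used earlier. The choice of axes is arbitrary and only meant to show the axis argument at work.

```python
import numpy as np

arr = np.arange(1, 7).reshape((2, 3))   # [[1 2 3], [4 5 6]]

print(arr.min(axis=0))      # [1 2 3]
print(arr.max(axis=1))      # [3 6]
print(arr.mean(axis=0))     # [2.5 3.5 4.5]
print(arr.std(axis=1))      # standard deviation of each row
print(arr.cumsum(axis=1))   # running totals along each row, shape stays (2, 3)
print(arr.cumsum())         # no axis given: the array is flattened first
```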

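5.4 Broadcasting

When an element-wise operation is applied to two arrays of different shapes, "broadcasting" may kick in. Conceptually, elements are copied as needed until the two arrays have the same shape, and then the operation is applied element-wise. There are two steps:

Broadcast axes: compare the dimensions of the two arrays and pad the missing dimensions (axes) of the smaller one
Copy elements: along the padded axes (and any axes of length 1), copy the elements of the smaller array until its shape matches the other array's

Before stating the precise rules of broadcasting, let's look at a few simple examples.

Example 1: a scalar and a one-dimensional array

arr = np.arange(5)
print( arr )
print( arr + 2 )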

[0 1 2 3 4]
[2 3 4 5 6]

The scalar 2 is broadcast to every element of arr.

Example 2: a one-dimensional array and a two-dimensional array

arr = np.arange(12).reshape((4,3))
print( arr )
print( arr.mean(axis=0) )
print( arr - arr.mean(axis=0) )

[[ 0 1 2]
 [ 3 4 5]
 [ 6 7 8]
 [ 9 10 11]]
[4.5 5.5 6.5]
[[-4.5 -4.5 -4.5]
 [-1.5 -1.5 -1.5]
 [ 1.5 1.5 1.5]
 [ 4.5 4.5 4.5]]

The one-dimensional array of means along axis 0 is broadcast to every row of arr.
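To make the "copying" visible, and to show where keepdims from the previous section becomes useful, here is a hedged sketch. It uses np.broadcast_to to spell out the repeated rows, then tries the same trick with row means; the names col_mean, explicit, failed and row_centered are mine.

```python
import numpy as np

arr = np.arange(12).reshape((4, 3))

# Column means have shape (3,), which lines up with the last axis of arr.
col_mean = arr.mean(axis=0)
explicit = np.broadcast_to(col_mean, arr.shape)   # read-only (4, 3) view, rows repeated
assert np.array_equal(arr - col_mean, arr - explicit)

# Row means have shape (4,), which does not line up with (4, 3) ...
failed = False
try:
    arr - arr.mean(axis=1)
except ValueError as err:
    failed = True
    print('broadcast failed:', err)

# ... but keepdims=True gives shape (4, 1), which does.
row_centered = arr - arr.mean(axis=1, keepdims=True)
print(row_centered)
```

Why (4,) fails against (4, 3) while (4, 1) works follows from the compatibility rules stated next.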

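Now let's look at the rules of broadcasting:

Rules of broadcasting

Key points

When we operate on two arrays, if their shapes are

incompatible, broadcasting cannot happen
compatible, broadcasting can happen

So broadcasting takes two steps:

Check whether the two shapes are compatible: starting from the last element of the two shape tuples and working backwards, check for each pair of sizes whether
1. they are equal, or
2. one of them is 1
Once the shapes are known to be compatible, determine the final shape of the result.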
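The two steps can be written down as a short function. This is a minimal sketch of the rule as stated above, not NumPy's actual implementation; broadcast_shape is a hypothetical helper, and padding the shorter shape with leading 1s is the usual convention for arrays with different numbers of dimensions.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the compatibility rule by hand."""
    # Pad the shorter shape with leading 1s so both have the same length.
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    # Walk from the last axis to the first.
    for m, n in zip(reversed(a), reversed(b)):
        if m == n or m == 1 or n == 1:
            # The size-1 side is stretched to match the other side.
            result.append(n if m == 1 else m)
        else:
            raise ValueError(f'shapes {shape_a} and {shape_b} are not compatible')
    return tuple(reversed(result))

print(broadcast_shape((1, 3), (3, 1)))   # (3, 3)
print(broadcast_shape((5,), ()))         # (5,)

# Cross-check one case against NumPy itself.
assert broadcast_shape((4, 3), (3,)) == np.broadcast(np.empty((4, 3)), np.empty((3,))).shape
```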

Example 3: same number of dimensions, different shapes

Let's apply the rules above to an example.

a = np.array([[1,2,3]])
b = np.array([[4],[5],[6]])
print( 'The shape of a is', a.shape )
print( 'The shape of b is', b.shape )

The shape of a is (1, 3)
The shape of b is (3, 1)

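Recall the two steps of broadcasting:

Check whether the shapes of a and b are compatible: starting from the last elements of the shape tuples (1, 3) and (3, 1), each pair (3 and 1, then 1 and 3) satisfies the "one of them is 1" condition.
So the shapes are compatible, and the final shape of the two arrays is (max(1,3), max(3,1)) = (3, 3)

At this point a and b are both stretched to (3, 3) arrays. Let's see what a + b gives.

c = a + b
print( 'The shape of c is', c.shape )
print( 'a is', a )
print( 'b is', b )
print( 'c = a + b =', c )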

The shape of c is (3, 3)
a is [[1 2 3]]
b is [[4]
      [5]
      [6]]
c = a + b = [[5 6 7]
             [6 7 8]
             [7 8 9]]
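To see the stretched arrays themselves, one option is np.broadcast_to, which returns a read-only view with the requested shape. The sketch below spells out both expansions and confirms the sum matches a + b; a_full and b_full are names I made up for the illustration.

```python
import numpy as np

a = np.array([[1, 2, 3]])        # shape (1, 3)
b = np.array([[4], [5], [6]])    # shape (3, 1)

# Spell out the copies that broadcasting implies.
a_full = np.broadcast_to(a, (3, 3))   # the single row repeated 3 times
b_full = np.broadcast_to(b, (3, 3))   # the single column repeated 3 times
print(a_full)
print(b_full)

# Adding the expanded arrays gives the same result as a + b.
assert np.array_equal(a_full + b_full, a + b)
```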

Example 4: different numbers of dimensions

a = np.arange(5)
b = np.array(2)
print( 'The dimension of a is', a.ndim, 'and the shape of a is', a.shape )
print( 'The dimension of b is', b.ndim, 'and the shape of b is', b.shape )

The dimension of a is 1 and the shape of a is (5,)
The dimension of b is 0 and the shape of b is ()

The shapes of a and b are (5,) and (). First pad the missing dimension with 1 to get (5,) and (1,); running these through the procedure above shows the shapes are compatible, and the final shape is (5,).

Let's check a + b in code:

c = a + b
print( 'The dimension of c is', c.ndim, 'and the shape of c is', c.shape, '\n' )
print( 'a is', a )
print( 'b is', b )
print( 'c = a + b =', c )

The dimension of c is 1 and the shape of c is (5,)
a is [0 1 2 3 4]
b is 2
c = a + b = [2 3 4 5 6]

You should have a feel for broadcasting by now. Strike while the iron is hot: work through the following five examples and you will have understood it completely.

a = np.array( [[[1,2,3], [4,5,6]]] )
b1 = np.array( [[1,1,1], [2,2,2], [3,3,3]] )
b2 = np.arange(3).reshape((1,3))
b3 = np.arange(6).reshape((2,3))
b4 = np.arange(12).reshape((2,2,3))
b5 = np.arange(6).reshape((2,1,3))
print( 'The dimension of a is', a.ndim, 'and the shape of a is', a.shape )
print( 'The dimension of b1 is', b1.ndim, 'and the shape of b1 is', b1.shape, '\n')
print( 'The dimension of a is', a.ndim, 'and the shape of a is', a.shape )
print( 'The dimension of b2 is', b2.ndim, 'and the shape of b2 is', b2.shape, '\n' )
print( 'The dimension of a is', a.ndim, 'and the shape of a is', a.shape )
print( 'The dimension of b3 is', b3.ndim, 'and the shape of b3 is', b3.shape, '\n' )
print( 'The dimension of a is', a.ndim, 'and the shape of a is', a.shape )
print( 'The dimension of b4 is', b4.ndim, 'and the shape of b4 is', b4.shape, '\n' )
print( 'The dimension of a is', a.ndim, 'and the shape of a is', a.shape )
print( 'The dimension of b5 is', b5.ndim, 'and the shape of b5 is', b5.shape )

The dimension of a is 3 and the shape of a is (1, 2, 3)
The dimension of b1 is 2 and the shape of b1 is (3, 3)
The dimension of a is 3 and the shape of a is (1, 2, 3)
The dimension of b2 is 2 and the shape of b2 is (1, 3)
The dimension of a is 3 and the shape of a is (1, 2, 3)
The dimension of b3 is 2 and the shape of b3 is (2, 3)
The dimension of a is 3 and the shape of a is (1, 2, 3)
The dimension of b4 is 3 and the shape of b4 is (2, 2, 3)
The dimension of a is 3 and the shape of a is (1, 2, 3)
The dimension of b5 is 3 and the shape of b5 is (2, 1, 3)

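For a and b1 the shapes are (1, 2, 3) and (3, 3). The last elements of the tuples are both 3: compatible. The second-to-last elements are 2 and 3: they are neither equal nor is either of them 1, so they are incompatible! a and b1 cannot be broadcast together. If you don't believe it, run the code below:

c1 = a + b1
print( c1 )
print( c1.shape )

ValueError: operands could not be broadcast together with shapes (1,2,3) (3,3)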

a can be broadcast with each of b2, b3, b4 and b5; work out why for yourself.

c2 = a + b2
print( c2 )
print( c2.shape )

[[[1 3 5]
  [4 6 8]]]
(1, 2, 3)

c3 = a + b3
print( c3 )
print( c3.shape )

[[[ 1 3 5]
  [ 7 9 11]]]
(1, 2, 3)

c4 = a + b4
print( c4 )
print( c4.shape )

[[[ 1 3 5]
  [ 7 9 11]]
 [[ 7 9 11]
  [13 15 17]]]
(2, 2, 3)

c5 = a + b5
print( c5 )
print( c5.shape )

[[[ 1 3 5]
  [ 4 6 8]]
 [[ 4 6 8]
  [ 7 9 11]]]
(2, 2, 3)
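If you want to check your own analysis of the five cases in one go, a loop like the following may help. It is only a sketch: the dictionaries candidates and results are names I introduce, and an incompatible pair is recorded as None.

```python
import numpy as np

a = np.array([[[1, 2, 3], [4, 5, 6]]])   # shape (1, 2, 3)
candidates = {
    'b1': np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]]),
    'b2': np.arange(3).reshape((1, 3)),
    'b3': np.arange(6).reshape((2, 3)),
    'b4': np.arange(12).reshape((2, 2, 3)),
    'b5': np.arange(6).reshape((2, 1, 3)),
}

results = {}
for name, b in candidates.items():
    try:
        results[name] = (a + b).shape
    except ValueError:
        results[name] = None      # shapes are not compatible
    print(name, b.shape, '->', results[name])
# b1 (3, 3) -> None
# b2 (1, 3) -> (1, 2, 3)
# b3 (2, 3) -> (1, 2, 3)
# b4 (2, 2, 3) -> (2, 2, 3)
# b5 (2, 1, 3) -> (2, 2, 3)
```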

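6 Summary

The NumPy installment is finally done! The previous post covered creating, saving/loading and accessing arrays; this post covered reshaping arrays and computing with arrays.

Reshaping arrays involves these important operations:

Reshaping and flattening, which change the dimensions
Merging and splitting, which join or divide arrays
Repeating and tiling, which are copies in essence
Others: sorting, inserting, deleting, copying

Computing with arrays involves these important operations:

Element-wise: arithmetic, functions, comparisons
Linear algebra: make sure you understand the dot product function dot()
Aggregation: make sure you understand the concept of an axis!
Broadcasting: extremely important, it is everywhere in neural networks!

The next post covers SciPy for scientific computing. Stay Tuned!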
