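Operations on pandas data types
Arithmetic rules
- Operands are aligned on their row and column labels before the operation: a value is computed only where the same row and column label exists in both operands. Because unmatched positions become NaN, the result is float by default.
- Positions that exist in only one operand are filled with NaN during alignment.
- Operations between 2-D and 1-D objects, and between 1-D objects and scalars, are broadcast (the lower-dimensional operand is applied to every element of the higher-dimensional one).
- Binary operations written with the + - * / operators return a new object.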
import pandas as pd
import numpy as np
a = pd.DataFrame(np.arange(12).reshape(3, 4))
a
|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 |
| 1 | 4 | 5 | 6 | 7 |
| 2 | 8 | 9 | 10 | 11 |
b = pd.DataFrame(np.arange(20).reshape(4, 5))
b
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 | 4 |
| 1 | 5 | 6 | 7 | 8 | 9 |
| 2 | 10 | 11 | 12 | 13 | 14 |
| 3 | 15 | 16 | 17 | 18 | 19 |
a + b
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0.0 | 2.0 | 4.0 | 6.0 | NaN |
| 1 | 9.0 | 11.0 | 13.0 | 15.0 | NaN |
| 2 | 18.0 | 20.0 | 22.0 | 24.0 | NaN |
| 3 | NaN | NaN | NaN | NaN | NaN |
a * b
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0.0 | 1.0 | 4.0 | 9.0 | NaN |
| 1 | 20.0 | 30.0 | 42.0 | 56.0 | NaN |
| 2 | 80.0 | 99.0 | 120.0 | 143.0 | NaN |
| 3 | NaN | NaN | NaN | NaN | NaN |
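The examples above use default integer labels, so alignment can look positional. The following minimal sketch uses hypothetical string labels (`x`, `y`, `z`, `p`, `q`, `r` are illustrative and not from the original) to show that matching is by label, and that `+` leaves its operands untouched.

```python
import pandas as pd

# Hypothetical labels, chosen only to make the alignment visible.
left = pd.DataFrame([[1, 2], [3, 4]], index=["x", "y"], columns=["p", "q"])
right = pd.DataFrame([[10, 20], [30, 40]], index=["y", "z"], columns=["q", "r"])

# The result covers the union of row labels and the union of column labels.
total = left + right

# Only the cell (y, q) has a row AND column label present in both operands.
print(total)
print(total.dtypes)  # float columns, because NaN forces a float dtype
print(left)          # the operand itself is unchanged
```

Only `total.loc["y", "q"]` holds a number (4 + 10); the remaining eight cells are NaN.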
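Besides the + - * / operators, the same operations are available as methods. The advantage of the method form is that it accepts optional parameters (for example fill_value and axis, both used below):
- .add(d, **kwargs): addition between objects, with optional parameters
- .sub(d, **kwargs): subtraction between objects, with optional parameters
- .mul(d, **kwargs): multiplication between objects, with optional parameters
- .div(d, **kwargs): division between objects, with optional parameters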
b.add(a,fill_value = 100)
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0.0 | 2.0 | 4.0 | 6.0 | 104.0 |
| 1 | 9.0 | 11.0 | 13.0 | 15.0 | 109.0 |
| 2 | 18.0 | 20.0 | 22.0 | 24.0 | 114.0 |
| 3 | 115.0 | 116.0 | 117.0 | 118.0 | 119.0 |
a.mul(b,fill_value = 0)
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0.0 | 1.0 | 4.0 | 9.0 | 0.0 |
| 1 | 20.0 | 30.0 | 42.0 | 56.0 | 0.0 |
| 2 | 80.0 | 99.0 | 120.0 | 143.0 | 0.0 |
| 3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
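One detail of `fill_value` is easy to miss: it stands in for a value that is missing on one side only. Where a position is missing in both operands, the result stays NaN. The sketch below reuses `a` and `b` from above; `b_missing` is a hypothetical variant of `b` introduced only for this illustration.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(12).reshape(3, 4))
b = pd.DataFrame(np.arange(20).reshape(4, 5))

# Hypothetical variant of b with a real missing value at a position
# that a does not cover either (row 3, column 4).
b_missing = b.astype(float)
b_missing.loc[3, 4] = np.nan

result = a.add(b_missing, fill_value=100)

print(result.loc[0, 4])  # a has no column 4, so 100 stands in: 100 + 4
print(result.loc[3, 4])  # missing on both sides, so the result remains NaN
```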
Operations between objects of different dimensions
b = pd.DataFrame(np.arange(20).reshape(4, 5))
b
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 | 4 |
| 1 | 5 | 6 | 7 | 8 | 9 |
| 2 | 10 | 11 | 12 | 13 | 14 |
| 3 | 15 | 16 | 17 | 18 | 19 |
c = pd.Series(np.arange(4))
c
0 0
1 1
2 2
3 3
dtype: int32
c - 10
0 -10
1 -9
2 -8
3 -7
dtype: int32
b - c
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN |
| 1 | 5.0 | 5.0 | 5.0 | 5.0 | NaN |
| 2 | 10.0 | 10.0 | 10.0 | 10.0 | NaN |
| 3 | 15.0 | 15.0 | 15.0 | 15.0 | NaN |
b.sub(c,axis=0)
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 | 4 |
| 1 | 4 | 5 | 6 | 7 | 8 |
| 2 | 8 | 9 | 10 | 11 | 12 |
| 3 | 12 | 13 | 14 | 15 | 16 |
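The two results above differ because a Series is matched against the DataFrame's column labels by default, while `axis=0` matches it against the row labels. A common use of this is centering data, sketched here under the assumption that `b` is the frame defined above; `col_means`, `row_means`, `centered_cols` and `centered_rows` are illustrative names.

```python
import numpy as np
import pandas as pd

b = pd.DataFrame(np.arange(20).reshape(4, 5))

col_means = b.mean(axis=0)  # one value per column, indexed by column label
row_means = b.mean(axis=1)  # one value per row, indexed by row label

# Default behaviour: the Series is aligned with the column labels.
centered_cols = b - col_means

# axis=0: the Series is aligned with the row labels instead.
centered_rows = b.sub(row_means, axis=0)

print(centered_cols.mean(axis=0))  # column means after centering
print(centered_rows.mean(axis=1))  # row means after centering
```

Both printed Series should contain only zeros, and neither result contains NaN because the Series labels match the chosen axis exactly.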
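Comparison rules
- Comparison operators only compare elements with identical labels; no alignment or filling is performed, and operands with different shapes raise an error.
- Comparisons between 2-D and 1-D objects, and between 1-D objects and scalars, are broadcast.
- Binary operations written with > < >= <= == != return a Boolean object.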
a = pd.DataFrame(np.arange(12).reshape(3, 4))
a
|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 |
| 1 | 4 | 5 | 6 | 7 |
| 2 | 8 | 9 | 10 | 11 |
d = pd.DataFrame(np.arange(12, 0, -1).reshape(3, 4))
d
|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | 12 | 11 | 10 | 9 |
| 1 | 8 | 7 | 6 | 5 |
| 2 | 4 | 3 | 2 | 1 |
a > d
|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | False | False | False | False |
| 1 | False | False | False | True |
| 2 | True | True | True | True |
a == d
|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | False | False | False | False |
| 1 | False | False | True | False |
| 2 | False | False | False | False |
b = pd.DataFrame(np.arange(12).reshape(3, 4))
c = pd.Series(np.arange(4))
a > c
|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | False | False | False | False |
| 1 | True | True | True | True |
| 2 | True | True | True | True |
c > 0
0 False
1 True
2 True
3 True
dtype: bool
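The rule that comparison operators do not align can be checked with the 3x4 and 4x5 frames from the arithmetic section. The sketch below is illustrative rather than part of the original notebook. It also uses the method form `.gt`, which, to the best of my understanding of pandas, aligns on the union of labels in the same way `.add` does; `raised` and `flex` are names introduced here.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(12).reshape(3, 4))
b = pd.DataFrame(np.arange(20).reshape(4, 5))

# Operator form: operands with different labels are rejected.
try:
    a > b
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# Method form: operands are aligned first, and a comparison against
# a missing value evaluates to False.
flex = b.gt(a)
print(flex)
```

In `flex`, row 3 and column 4 are entirely False because `a` has no values there to compare against.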