1. Merging data with merge
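import pandas as pd
import numpy as np
from IPython.display import display  # display() is built in to Jupyter; the import is only needed elsewhere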
1.1 Default merge behavior
By default, merge performs an inner join (how='inner'): only keys present in both DataFrames are kept.
price = pd.DataFrame({'fruit': ['apple', 'grape', 'orange', 'orange'],
                      'price': [8, 7, 9, 11]})
amount = pd.DataFrame({'fruit': ['apple', 'grape', 'orange'],
                       'amount': [5, 11, 8]})
# With no `on` argument, merge joins on the columns both frames share ('fruit').
display(price, amount, pd.merge(price, amount))
#------------------------------------------------------------------------
    fruit  price
0   apple      8
1   grape      7
2  orange      9
3  orange     11

    fruit  amount
0   apple       5
1   grape      11
2  orange       8

    fruit  price  amount
0   apple      8       5
1   grape      7      11
2  orange      9       8
3  orange     11       8
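The inner-join default is easier to see when some keys exist on only one side. The sketch below uses hypothetical data (the 'banana' and 'pear' rows are made up for illustration) and names the key explicitly with on='fruit', which should be equivalent to the default here because 'fruit' is the only shared column.

```python
import pandas as pd

# Hypothetical data: 'banana' appears only in price, 'pear' only in amount.
price = pd.DataFrame({'fruit': ['apple', 'banana', 'orange'],
                      'price': [8, 3, 9]})
amount = pd.DataFrame({'fruit': ['apple', 'orange', 'pear'],
                       'amount': [5, 8, 2]})

# Inner join on an explicitly named key: rows whose key is missing
# from either frame are dropped.
inner = pd.merge(price, amount, on='fruit', how='inner')
print(inner)
```

Only the 'apple' and 'orange' rows remain, since those are the keys found in both frames.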
1.2 Left and right joins
pd.merge(price, amount, how='left')   # left join: keep every row of price
pd.merge(price, amount, how='right')  # right join: keep every row of amount
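With the original data every fruit appears in both frames, so left and right joins return the same rows as the inner join. A minimal sketch with hypothetical, partly non-matching data makes the difference visible; the outer join with indicator=True is added here only to show where each row came from.

```python
import pandas as pd

# Hypothetical data: only 'apple' is present in both frames.
price = pd.DataFrame({'fruit': ['apple', 'banana'], 'price': [8, 3]})
amount = pd.DataFrame({'fruit': ['apple', 'pear'], 'amount': [5, 2]})

left = pd.merge(price, amount, how='left')    # all rows of price; amount is NaN for 'banana'
right = pd.merge(price, amount, how='right')  # all rows of amount; price is NaN for 'pear'

# indicator=True adds a '_merge' column: 'both', 'left_only' or 'right_only'.
outer = pd.merge(price, amount, how='outer', indicator=True)
print(left, right, outer, sep='\n\n')
```

Unmatched rows are filled with NaN, which is why an integer column can come back as float after a left or right join.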
1.3 Merging on multiple keys
left = pd.DataFrame({'key1': ['one', 'one', 'two'],
                     'key2': ['a', 'b', 'a'],
                     'value1': range(3)})
right = pd.DataFrame({'key1': ['one', 'one', 'two', 'two'],
                      'key2': ['a', 'a', 'a', 'b'],
                      'value2': range(4)})
display(left, right, pd.merge(left, right, on=['key1', 'key2'], how='left'))
#--------------------------------------------------------------------------------
  key1 key2  value1
0  one    a       0
1  one    b       1
2  two    a       2

  key1 key2  value2
0  one    a       0
1  one    a       1
2  two    a       2
3  two    b       3

  key1 key2  value1  value2
0  one    a       0     0.0
1  one    a       0     1.0
2  one    b       1     NaN
3  two    a       2     2.0
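In the result above the left frame grows from three rows to four because the key pair ('one', 'a') occurs twice on the right. One way to make that expectation explicit is merge's validate argument. The sketch below reuses the same data; the variable unique_on_both_sides is a name introduced here for illustration.

```python
import pandas as pd

left = pd.DataFrame({'key1': ['one', 'one', 'two'],
                     'key2': ['a', 'b', 'a'],
                     'value1': range(3)})
right = pd.DataFrame({'key1': ['one', 'one', 'two', 'two'],
                      'key2': ['a', 'a', 'a', 'b'],
                      'value2': range(4)})

# A row matches only when both key1 and key2 agree.
# 'one_to_many' asserts that the key pairs are unique in the left frame.
merged = pd.merge(left, right, on=['key1', 'key2'], how='left',
                  validate='one_to_many')
print(len(left), len(merged))  # 3 4

# 'one_to_one' also requires unique key pairs on the right, so it raises
# MergeError here because ('one', 'a') appears twice in right.
try:
    pd.merge(left, right, on=['key1', 'key2'], validate='one_to_one')
    unique_on_both_sides = True
except pd.errors.MergeError:
    unique_on_both_sides = False
print(unique_on_both_sides)  # False
```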
2. Concatenating data with concat
By default, concat stacks data along the row axis (axis=0). To concatenate along the column axis instead, set axis=1.
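As a small illustration of the axis argument, the sketch below concatenates two hypothetical Series both ways. The Series names 's1' and 's2' are assumptions added so that the resulting columns have readable labels.

```python
import pandas as pd

# Hypothetical Series sharing only the label 'b'.
s1 = pd.Series([0, 1], index=['a', 'b'], name='s1')
s2 = pd.Series([2, 3], index=['b', 'c'], name='s2')

rows = pd.concat([s1, s2])          # axis=0 (default): one longer Series
cols = pd.concat([s1, s2], axis=1)  # axis=1: a DataFrame with one column per Series
print(rows, cols, sep='\n\n')
```

With axis=1 the row index is the union of the input indexes, and positions that have no value in a given Series are filled with NaN.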
2.1 Concatenating several Series
s1 = pd.Series([0, 1], index=['a', 'b'])
s2 = pd.Series([2, 3, 4], index=['a', 'd', 'e'])
s3 = pd.Series([5, 6], index=['f', 'g'])
print(pd.concat([s1, s2, s3]))  # row-wise concatenation of the Series
#----------------------------------------------
a    0
b    1
a    2
d    3
e    4
f    5
g    6
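Note that the label 'a' appears twice in the output above: concat keeps the original index labels and does not deduplicate them. Two common ways to deal with this, sketched here on s1 and s2 only, are ignore_index=True and keys. The key labels 's1' and 's2' are chosen for illustration.

```python
import pandas as pd

s1 = pd.Series([0, 1], index=['a', 'b'])
s2 = pd.Series([2, 3, 4], index=['a', 'd', 'e'])

stacked = pd.concat([s1, s2])
print(stacked.index.is_unique)  # False: 'a' occurs in both inputs

# Option 1: discard the old labels and renumber from 0.
renumbered = pd.concat([s1, s2], ignore_index=True)

# Option 2: keep the labels but add an outer level naming the source.
labelled = pd.concat([s1, s2], keys=['s1', 's2'])
print(labelled.loc[('s2', 'a')])  # 2
```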
2.2 Concatenating two DataFrames
data1 = pd.DataFrame(np.arange(6).reshape(2, 3), columns=list('abc'))
data2 = pd.DataFrame(np.arange(20, 26).reshape(2, 3), columns=list('ayz'))
data = pd.concat([data1, data2], axis=0, sort=False)
display(data1, data2, data)
#--------------------------------------------------------------------------
   a  b  c
0  0  1  2
1  3  4  5

    a   y   z
0  20  21  22
1  23  24  25

    a    b    c     y     z
0   0  1.0  2.0   NaN   NaN
1   3  4.0  5.0   NaN   NaN
0  20  NaN  NaN  21.0  22.0
1  23  NaN  NaN  24.0  25.0
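The result above keeps the union of the columns and fills the gaps with NaN, which corresponds to concat's default join='outer'. If only the columns common to both frames are wanted, join='inner' can be used instead. The following is a minimal sketch on the same data; ignore_index=True is an optional addition to avoid the repeated 0, 1 row labels.

```python
import numpy as np
import pandas as pd

data1 = pd.DataFrame(np.arange(6).reshape(2, 3), columns=list('abc'))
data2 = pd.DataFrame(np.arange(20, 26).reshape(2, 3), columns=list('ayz'))

# join='inner' keeps only the columns present in every input (here just 'a').
shared = pd.concat([data1, data2], join='inner', ignore_index=True)
print(shared)
```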
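3. Combining data with combine_first
If the two DataFrames to be combined have overlapping index labels, you can use the combine_first method. It keeps the calling object's values and fills its missing entries (NaN) with the values from the other object.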
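# s5 and s6 are two DataFrames with overlapping index labels; their definitions are not shown here.
s6.combine_first(s5)
#-----------------------
     0    1
a  0.0  0.0
b  1.0  5.0
f  NaN  5.0
g  NaN  6.0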
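Since the inputs of the call above are not shown, here is a self-contained sketch of the same idea. The frames df1 and df2 and their values are hypothetical and are not the s5 and s6 used above.

```python
import numpy as np
import pandas as pd

# Hypothetical frames with overlapping index labels 'a' and 'b'.
df1 = pd.DataFrame({'x': [1.0, np.nan], 'y': [np.nan, 4.0]},
                   index=['a', 'b'])
df2 = pd.DataFrame({'x': [10.0, 20.0, 30.0], 'y': [40.0, 50.0, 60.0]},
                   index=['a', 'b', 'c'])

# Values from df1 win wherever they are not NaN; the holes in df1 and the
# extra row 'c' are taken from df2.
combined = df1.combine_first(df2)
print(combined)
```

The call is not symmetric: df2.combine_first(df1) would return df2 unchanged here, because df2 has no missing values to fill.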