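&: element-wise AND
|: element-wise OR
Each condition must be wrapped in parentheses (), because & and | bind more tightly than comparison operators such as > and ==.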

    import pandas as pd

    # Build the data as a dictionary
    dic = {
        "name": ["shanjialan", "shanyanhong", "luckyapple"],
        "age": [21, 23, 12],
        "hobby": ["sports", "music", "programming"],
    }
    # Create a DataFrame from the dictionary
    df = pd.DataFrame(dic)


    # Simple conditions
    print(df[df["age"] > 18])
    print(df[df["name"].str.len() > 10])
    # Compound condition: each comparison gets its own parentheses
    print(df[(df["age"] > 18) & (df["age"] < 22)])

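The example above only uses &. As a minimal sketch on the same data, the block below also shows | and the ~ (NOT) operator, and what happens when the parentheses are left out. The variable names are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["shanjialan", "shanyanhong", "luckyapple"],
    "age": [21, 23, 12],
    "hobby": ["sports", "music", "programming"],
})

# OR: rows where age is under 18 or the hobby is sports
young_or_sports = df[(df["age"] < 18) | (df["hobby"] == "sports")]

# NOT: ~ inverts a boolean mask
not_sports = df[~(df["hobby"] == "sports")]

# Without parentheses, & is evaluated before the comparisons, so the
# expression becomes the chained comparison
#   df["age"] > (18 & df["age"]) < 22
# which needs the truth value of a Series and raises ValueError.
try:
    df[df["age"] > 18 & df["age"] < 22]
    raised = False
except ValueError:
    raised = True

print(young_or_sports)
print(not_sports)
print(raised)
```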
pandas string methods
[Figure: pandas string methods]
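Since the figure itself is not reproduced here, the following is a small sketch of a few commonly used .str methods, applied to the names from the earlier example. The selection of methods is an assumption and may not match the ones listed in the original figure.

```python
import pandas as pd

names = pd.Series(["shanjialan", "shanyanhong", "luckyapple"])

# Each .str method is applied element by element
upper = names.str.upper()                       # upper-case every string
lengths = names.str.len()                       # length of every string
has_shan = names.str.contains("shan")           # boolean mask: substring test
starts_lucky = names.str.startswith("lucky")    # boolean mask: prefix test
replaced = names.str.replace("shan", "SHAN", regex=False)  # literal replacement

print(upper)
print(lengths)
print(has_shan)
print(starts_lucky)
print(replaced)
```

The boolean results (has_shan, starts_lucky) can be used directly as row filters, for example df[df["name"].str.contains("shan")].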
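Handling missing data in pandas
pd.isnull(df): returns a boolean DataFrame that is True wherever a value in df is missing
pd.notnull(df): returns a boolean DataFrame that is True wherever a value in df is not missing
Missing values include np.nan and None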

    import pandas as pd
    import numpy as np

    # Build the data as a dictionary
    dic = {
        "name": ["shanjialan", "shanyanhong", "luckyapple", "hunvibe", "chenwenhao"],
        "age": [21, 23, 0, np.nan, 21],
        "hobby": ["sports", "music", "programming", "eating", "basketball"],
    }
    # Create a DataFrame from the dictionary
    df = pd.DataFrame(dic)
    print(pd.isnull(df))   # True where a value is missing
    print(pd.notnull(df))  # True where a value is present

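The boolean matrix is usually a starting point rather than the end result. As a rough sketch on a small made-up frame (the data here is illustrative, not from the example above), it can be summed to count missing values per column or used to select the affected rows:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["shanjialan", None, "luckyapple"],
    "age": [21, np.nan, 0],
})

# True counts as 1, so summing the mask counts missing values per column.
# Both None and np.nan are detected; 0 is a normal value, not a missing one.
missing_per_column = df.isnull().sum()

# Rows that contain at least one missing value
rows_with_missing = df[df.isnull().any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```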
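There are two ways to handle missing values: drop them or fill them.
df.dropna(how='any', axis=0, inplace=False):
how: the drop rule. 'all' drops a row or column only if every value in it is NaN; 'any' (the default) drops it if it contains at least one NaN. inplace: whether to modify df in place; True modifies it in place and returns None, False (the default) returns a new DataFrame. axis: the axis to drop along, 0 for rows and 1 for columns.
df.fillna(value): replaces missing values with value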
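
    # Drop every row that contains at least one missing value
    print(df.dropna(how='any', axis=0, inplace=False))
    # Fill missing ages with the mean of the non-missing ages
    print(df["age"].fillna(value=df['age'].mean()))
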
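To make the difference between the dropna options concrete, here is a minimal sketch on a small made-up frame (columns a and b are illustrative) in which one row is partly missing and one is entirely missing:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan],
    "b": [4.0, 5.0, np.nan],
})

# how="any": drop a row if it contains at least one NaN
any_rows = df.dropna(how="any")

# how="all": drop a row only if every value in it is NaN
all_rows = df.dropna(how="all")

# inplace=True modifies the object itself and returns None
df_copy = df.copy()
result = df_copy.dropna(how="any", inplace=True)

# fillna accepts a scalar, or per-column values such as the column means
filled = df.fillna(df.mean())

print(any_rows)
print(all_rows)
print(result)
print(filled)
```

Because inplace=True returns None, writing df = df.dropna(inplace=True) would replace the frame with None; either assign the result or use inplace, not both.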
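Note: pandas reductions such as mean() skip NaN by default (skipna=True), so the mean is computed over the non-missing values only; for sum() this has the same effect as treating NaN as 0. NumPy behaves differently: np.mean returns nan if the array contains any NaN.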
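A quick way to check how each library treats NaN in a mean is to run both on the same values. The sketch below uses the ages from the earlier example:

```python
import numpy as np
import pandas as pd

ages = pd.Series([21, 23, 0, np.nan, 21])

# pandas skips NaN by default: (21 + 23 + 0 + 21) / 4
pandas_mean = ages.mean()

# Turning skipna off makes pandas propagate NaN
pandas_strict = ages.mean(skipna=False)

# NumPy propagates NaN unless the nan-aware variant is used
numpy_mean = np.mean(ages.to_numpy())
numpy_nanmean = np.nanmean(ages.to_numpy())

print(pandas_mean, pandas_strict, numpy_mean, numpy_nanmean)
```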
Handling zero values
t[t==0]=np.nan
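The one-liner above assumes t is an existing DataFrame in which 0 stands for a missing measurement. A self-contained sketch, with made-up columns age and score, might look like this:

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "age": [21, 23, 0, 21],
    "score": [0, 90, 85, 70],
})

# NaN is a float, so cast first to keep the column dtypes explicit
t = raw.astype(float)

# Replace every 0 in the whole frame with NaN
t[t == 0] = np.nan

# Alternative: treat 0 as missing in a single column only
only_age = raw.astype(float)
only_age["age"] = only_age["age"].replace(0, np.nan)

print(t)
print(only_age)
print(t["age"].mean())  # the former 0 no longer drags the mean down
```

Whether 0 really means "missing" depends on the column: an age of 0 is probably a placeholder, while a score of 0 may be a genuine value, which is why the single-column variant is shown as well.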
![3G9H6({DZ[PNT`8HT~0]17E.png 3G9H6({DZ[PNT`8HT~0]17E.png](https://ucc.alicdn.com/pic/developer-ecology/swqrm2evajpu4_4ff6fd92e39348aea8eb9519f357cd93.png?x-oss-process=image/resize,w_1400/format,webp)



![D}5`L]Y8%_BKV62K84~ARTN.png D}5`L]Y8%_BKV62K84~ARTN.png](https://ucc.alicdn.com/pic/developer-ecology/swqrm2evajpu4_75bacf7ba8f04117a4aa05ce3b87026f.png?x-oss-process=image/resize,w_1400/format,webp)
