The header and the data are misaligned
Use df.columns to get the header
dtype='object')
Use df.to_csv to save the data as a new file, then work on the new file
df1.columns = ['cmte_id', 'cand_id', 'cand_nm', 'contbr_nm', 'contbr_city', 'contbr_st', 'contbr_zip', 'contbr_employer', 'contbr_occupation', 'contb_receipt_amt', 'contb_receipt_dt', 'receipt_desc', 'memo_cd', 'memo_text', 'form_tp', 'file_num', 'tran_id', 'election_tp','None']
Do not use {} here (that is a set literal, and sets are unordered), or the column order will get scrambled
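A minimal sketch of the renaming step above, on a tiny made-up frame instead of the real contributions file. The original names x, y, z and the single data row are assumptions for illustration only; the point is that a list keeps the order in which the names are written.

```python
import pandas as pd

# Hypothetical stand-in for the real data: three columns with placeholder names.
df1 = pd.DataFrame([[1, 2, 3]], columns=["x", "y", "z"])

# A list is ordered, so each new name lands on the column at the same position.
new_names = ["cmte_id", "cand_id", "cand_nm"]
df1.columns = new_names

print(list(df1.columns))
```

The list must have exactly as many entries as the frame has columns, otherwise pandas raises a ValueError.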
########################
After a lot more experimenting, I found that the real problem was that I had not understood read_csv
A single parameter fixes it
df = pd.read_csv('D:\\project_codes\\USA_Presidential_Contributor\\2012-P00000001-ALL.csv',index_col=False)
Here is the official explanation
index_col : int or sequence or False, default None. Column to use as the row labels of the DataFrame. If a sequence is given, a MultiIndex is used. If you have a malformed file with delimiters at the end of each line, you might consider index_col=False to force pandas to _not_ use the first column as the index (row names).
The key part: index_col=False forces pandas to _not_ use the first column as the index (row names)
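To see the effect without the real file, here is a small sketch with an in-memory CSV where every data row ends with an extra delimiter. The sample data is made up; I am assuming the real file is malformed in the same way, which is what the documentation text describes.

```python
import io
import pandas as pd

# Made-up sample: 3 header fields, but each data row ends with a trailing comma,
# so the parser sees 4 fields per row.
raw = "a,b,c\n4,apple,bat,\n8,orange,cow,"

# Default: pandas takes the first column as the index, so every value
# slides one column to the left relative to the header.
shifted = pd.read_csv(io.StringIO(raw))

# index_col=False: the first column stays a normal column and the
# header lines up with the data again.
fixed = pd.read_csv(io.StringIO(raw), index_col=False)

print(shifted)
print(fixed)
```

In the first frame the values 4 and 8 end up as row labels and column c is empty; in the second, a, b and c hold the values they were meant to hold.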