Making an overlapping histogram
Counting the laptops released each month
Define a new function
import pandas as pd
import matplotlib.pyplot as plt

def line_chart():
Read the table and change the index
    df = pd.read_csv("笔记本信息.csv", encoding='gb18030')
    df.reset_index(inplace=True)
    df.set_index('上市时间', inplace=True)
Inspect the new index to confirm that the change took effect
    print(df.index)
    print(df.loc[2019.1])
Create a dictionary keyed by month, whose values store the number of laptops released in that month
    times_nums = {
        "2019.1": 0, "2019.2": 0, "2019.3": 0, "2019.4": 0, "2019.5": 0, "2019.6": 0,
        "2019.7": 0, "2019.8": 0, "2019.9": 0, "2019.10": 0, "2019.11": 0, "2019.12": 0,
        "2020.1": 0, "2020.2": 0, "2020.3": 0, "2020.4": 0, "2020.5": 0, "2020.6": 0,
        "2020.7": 0, "2020.8": 0, "2020.9": 0, "2020.10": 0, "2020.11": 0, "2020.12": 0,
        "2021.1": 0, "2021.2": 0, "2021.3": 0, "2021.4": 0, "2021.5": 0, "2021.6": 0,
    }
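The same dictionary can be generated instead of typed out by hand. A minimal sketch, assuming the same key format (`year.month` as a string) and the same range, January 2019 through June 2021; the helper name `build_month_counter` is hypothetical and not part of the original code:

```python
def build_month_counter(start=(2019, 1), end=(2021, 6)):
    """Return a dict mapping 'year.month' strings to 0, from start to end inclusive."""
    counter = {}
    year, month = start
    # Tuples compare element by element, so this walks month by month up to `end`.
    while (year, month) <= end:
        counter[f"{year}.{month}"] = 0
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return counter


times_nums = build_month_counter()
print(len(times_nums))  # 30
```

Generating the keys avoids typos in a long literal and makes it easy to extend the range later.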
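A problem
I originally planned to write it like this:
    for time_num in times_nums:
        print(time_num)
        times_nums[time_num] = df.loc[[time_num]]
    print(times_nums)
But this raised an error, so I ran the following test:
    # Test
    # for i in times_nums:
    #     a = i
    #     break
    # print(a)
    # # print(df.loc[a])  raises an error
    # b = str(a)
    # # print(df.loc[b])  raises an error
    # c = int(a)
    # # print(df.loc[c])  raises an error
    # d = 2019
    # # print(df.loc[d])  works
I could not find the cause of the problem, so I had to switch to a different approach.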
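One possible explanation, offered as a guess rather than a confirmed diagnosis: if pandas parsed the `上市时间` column as floating-point numbers, the index labels are floats while the dictionary keys are strings, so a string key cannot match any label. Separately, `int("2019.1")` fails on its own before pandas is involved. A float column would also make October indistinguishable from January. The sketch below uses only plain Python to show these effects:

```python
# Keys as written in the dictionary are strings.
key = "2019.1"

# A float label and a string label never compare equal.
float_label = 2019.1
labels_match = (key == float_label)
print(labels_match)  # False

# int() cannot parse a string that contains a decimal point.
try:
    int(key)
    int_failed = False
except ValueError:
    int_failed = True
print(int_failed)  # True

# Parsed as floats, "2019.10" (October) collapses into 2019.1 (January).
october_collapses = float("2019.10") == float("2019.1")
october_as_text = str(float("2019.10"))
print(october_collapses, october_as_text)  # True 2019.1
```

If that is indeed the cause, reading the column as text, for example with `pd.read_csv("笔记本信息.csv", encoding='gb18030', dtype={'上市时间': str})`, would keep `2019.10` distinct from `2019.1` and let the string keys match the index.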
Solving it with a two-dimensional list
    df_list = df.values.tolist()
    # print(df_list)
    months = []
    for hang in df_list:
        # hang[2] is the value used as each row's release month
        months.append(hang[2])
    for month in months:
        month = str(month)
        for time_num in times_nums:
            if time_num == month:
                times_nums[time_num] = times_nums.get(time_num) + 1
    time = []
    num = []
    items = times_nums.items()
    for item in items:
        # item[0][5:] drops the "YYYY." prefix and keeps only the month
        time.append(item[0][5:])
        num.append(item[1])
    # print(time)
    # print(num)
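The nested counting loop can be condensed with `collections.Counter`. This is a sketch on made-up sample values, not the author's data; `sample_months` is a hypothetical stand-in for the `months` list and is assumed to already hold strings such as `"2019.1"`:

```python
from collections import Counter

# Hypothetical sample standing in for the `months` list built from the table.
sample_months = ["2019.1", "2019.1", "2019.3", "2020.12", "2018.7"]

# Start every tracked month at zero so months with no releases still appear.
times_nums = {f"{year}.{month}": 0 for year in (2019, 2020) for month in range(1, 13)}

# Count every value once, then copy over only the months being tracked.
counts = Counter(sample_months)
for key in times_nums:
    times_nums[key] = counts[key]

# Split keys like "2019.1" into the month label and its count.
time = [key[5:] for key in times_nums]
num = list(times_nums.values())
print(time[:3], num[:3])  # ['1', '2', '3'] [2, 0, 1]
```

Values outside the tracked range (here `"2018.7"`) are simply ignored, which matches the behaviour of the equality check in the loop version.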
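Slice the lists to get each year's data separately
    nineteen_time = time[0:12]
    nineteen_num = num[0:12]
    # The slice end is exclusive, so 12:24 covers all twelve months of 2020
    twenty_time = time[12:24]
    twenty_num = num[12:24]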
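Python slices exclude the end index, so a full 12-month window that starts at position 12 has to end at 24. A quick check on a hypothetical list of 30 month labels laid out in the same order as `time` (January 2019 through June 2021):

```python
# Hypothetical month labels in the same order as the dictionary keys.
labels = [str(m) for m in range(1, 13)] * 2 + [str(m) for m in range(1, 7)]

first_year = labels[0:12]     # positions 0..11
second_year = labels[12:24]   # positions 12..23
short_slice = labels[12:23]   # positions 12..22, stops at November

print(len(first_year), len(second_year), len(short_slice))  # 12 12 11
print(second_year[-1], short_slice[-1])                     # 12 11
```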
Plotting
    plt.figure(figsize=(18, 12), dpi=300)
    plt.style.use('ggplot')
    # Draw both years at the same x positions; alpha < 1 lets the bars show through each other
    plt.bar(twenty_time, twenty_num, width=1, color='pink', label='2020年', alpha=0.6)
    plt.bar(nineteen_time, nineteen_num, width=1, color='lightblue', label='2019年', alpha=0.6)
    # Title: "Histogram of laptops released per month"
    plt.title("各月份电脑上市数量直方图", fontdict={'fontsize': 40})
    plt.xlabel("月份", fontdict={'fontsize': 30})  # "Month"
    plt.ylabel("数量", fontdict={'fontsize': 30})  # "Count"
    plt.xticks(fontsize=25)
    plt.yticks(fontsize=25)
    plt.legend(fontsize='xx-large')
    plt.savefig('各月份电脑上市数量直方图.png', dpi=300)
    plt.show()
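The overlapping effect relies on drawing two bar series at the same x positions with `alpha` below 1, so the taller bar stays visible behind the shorter one. A minimal sketch with invented counts (the numbers are placeholders, not the dataset's values), using the non-interactive `Agg` backend and an in-memory buffer so it runs without a display or output file:

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen; no window is opened
import matplotlib.pyplot as plt

months = [str(m) for m in range(1, 13)]
# Placeholder counts for illustration only.
counts_2019 = [3, 5, 8, 6, 9, 12, 7, 10, 11, 6, 4, 2]
counts_2020 = [4, 2, 6, 9, 8, 10, 9, 12, 7, 8, 5, 3]

fig, ax = plt.subplots(figsize=(9, 6))
# Both series share the same x categories, so the bars land on top of each other.
ax.bar(months, counts_2020, width=1, color="pink", alpha=0.6, label="2020")
ax.bar(months, counts_2019, width=1, color="lightblue", alpha=0.6, label="2019")
ax.set_xlabel("Month")
ax.set_ylabel("Count")
ax.legend()

# Write the PNG into memory instead of onto disk.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=100)
```

Where the two colours overlap they blend, so the shared height reads as a third shade and the difference between the years shows as the single-colour cap on top.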
Finished figure
The finished figure is shown below.