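Benchmark program
If your submission shows neither a loss nor a rank after you submit, check the submission format carefully against the following points:
1. shop_id: shop_id must be an integer from 1 to 2000. A missing or invalid shop_id causes the submission to fail. See the first column of prediction_example.csv for reference.
2. Predictions: every predicted value must be a non-negative integer and must not be empty.
3. File encoding: finally, make sure the file is saved as UTF-8 without BOM.
So far, most problems come from item 2, so check carefully that every predicted value is a non-negative integer.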
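The three checks above can be run locally before uploading. The following is only a rough sketch, not an official validator: the function name check_submission and its parameters are made up for illustration, and the figure of 14 predictions per row is an assumption taken from the benchmark script in this post, which writes two weeks of values per shop.

```python
import codecs
import csv


def _is_nonneg_int(s):
    # True only for plain ASCII digit strings such as '0' or '37'
    return s.isascii() and s.isdigit()


def check_submission(path, n_shops=2000, n_days=14):
    """Return a list of problems found in the submission file at `path`.

    n_days=14 is an assumption based on the benchmark script.
    """
    problems = []
    with open(path, 'rb') as f:
        raw = f.read()

    # Check 3: the file must be UTF-8 without BOM
    if raw.startswith(codecs.BOM_UTF8):
        problems.append('file starts with a UTF-8 BOM')
        raw = raw[len(codecs.BOM_UTF8):]
    try:
        text = raw.decode('utf-8')
    except UnicodeDecodeError:
        return problems + ['file is not valid UTF-8']

    seen = set()
    for line_no, row in enumerate(csv.reader(text.splitlines()), start=1):
        if not row:
            problems.append('line %d: empty line' % line_no)
            continue
        shop = row[0].strip()
        preds = [p.strip() for p in row[1:]]

        # Check 1: shop_id is an integer in 1..n_shops, each appearing once
        if _is_nonneg_int(shop) and 1 <= int(shop) <= n_shops:
            if int(shop) in seen:
                problems.append('line %d: duplicate shop_id %s' % (line_no, shop))
            seen.add(int(shop))
        else:
            problems.append('line %d: invalid shop_id %r' % (line_no, shop))

        # Check 2: predictions are non-empty, non-negative integers
        if len(preds) != n_days:
            problems.append('line %d: expected %d predictions, got %d'
                            % (line_no, n_days, len(preds)))
        bad = [p for p in preds if not _is_nonneg_int(p)]
        if bad:
            problems.append('line %d: invalid prediction(s) %r' % (line_no, bad[:3]))

    missing = sorted(set(range(1, n_shops + 1)) - seen)
    if missing:
        problems.append('%d shop_id(s) missing, first one: %d'
                        % (len(missing), missing[0]))
    return problems
```

Calling check_submission('prediction_example.csv') returns an empty list when none of these problems is found; otherwise each entry names the offending line. Values such as 3.0, -1, or an empty field are all reported under check 2.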
If the problem persists, send the file as an attachment to
tianchi_ijcai2017@service.alibaba.com and we will reply as soon as possible.
-------------------------
# coding=utf-8
import numpy as np
import pandas as pd

# your path to table user_pay
user_pay = 'user_pay.txt'

# load data: one payment record per row, all columns read as strings
print('loading data...')
user_pay_df = pd.read_table(user_pay, sep=',', header=None,
                            names=['user_id', 'shop_id', 'time_stamp'],
                            dtype={'user_id': 'str', 'shop_id': 'str', 'time_stamp': 'str'})

# generate customer flow: number of payments per shop per day
print('generating customer flow...')
user_pay_df['time_stamp'] = user_pay_df['time_stamp'].str[:10]  # keep only YYYY-MM-DD
customer_flow = user_pay_df.groupby(['shop_id', 'time_stamp']).size()

# predict
with open('prediction_example.csv', 'w') as fid:
    for shop_id in range(1, 2001):  # range also works on Python 3, unlike xrange
        print('predicting: %4d/2000' % shop_id)
        # days without any payment keep a flow of 0
        weekly_flow = pd.Series(np.zeros(7, dtype=int),
                                index=[d.strftime('%Y-%m-%d') for d in pd.date_range('10/25/2016', periods=7)])
        flow = customer_flow.loc[str(shop_id), '2016-10-25':'2016-10-31']
        # assign by position (.values) so pandas does not try to align the two indexes
        weekly_flow[flow.index.get_level_values('time_stamp')] = flow.values
        # use latest week's customer flow to predict following 2 weeks' customer flow
        predictions = ','.join([str(x) for x in list(weekly_flow) * 2])
        fid.write('%d,%s\n' % (shop_id, predictions))
print('Finish')