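The pyrouge installed here is the one on PyPI (pyrouge · PyPI), i.e. this project: bheinzerling/pyrouge: A Python wrapper for the ROUGE summarization evaluation package
I'll fill in the details later; for now, here are the main points.
In short, pyrouge is a real hassle, both to install and to run. You're better off with the rouge package (pltrdy/rouge: A full Python Implementation of the ROUGE Metric (not a wrapper)).
Both ways of installing the rouge package are simple. From source: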
git clone https://github.com/pltrdy/rouge
cd rouge
python setup.py install
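Or install it directly with pip: pip install rouge
The rest of this section covers installing pyrouge: first install ROUGE-1.5.5, then the pyrouge package.
(Absolute paths are recommended for every path below.)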
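The ROUGE-1.5.5.tgz file comes from https://pan.baidu.com/s/1qXQpBp6 (linked from qingjuanzhao's CSDN blog post on ROUGE installation and configuration, which describes finally getting ROUGE installed on two Linux machines and one Mac), because the files in andersjo/pyrouge: An interface to and, in time, a Python reimplementation of the ROUGE package for evaluating summarization are incomplete. Alternatively, use the resources/ROUGE/RELEASE-1.5.5 folder from fastSum (a text summarization project that bundles several models and datasets).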
Administrator privileges are required. If you don't have them, you'll need to find a workaround.
cpan -v
sudo cpan install XML::DOM
runROUGE-test.pl    # this file comes from the archive above
pip install pyrouge
pyrouge_set_rouge_path <path to the RELEASE-1.5.5 folder>
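python -m pyrouge.test fails with errors, and it still fails after applying the fixes suggested online. The code itself runs, though.
Below is a simple example comparing the output of the two packages.
An arbitrary example (note that Chinese characters cause an error here; that is expected):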
Under trys/pyrouge_models (reference summaries):
001_candidate.txt: 0 1 2 3 4
002_candidate.txt: 0 1 2 3 4 5 6 7
Under trys/pyrouge_systems (predicted summaries):
001_reference.txt: 0 1 2 3 4 5
002_reference.txt: 0 1 2 3 4 5
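To recreate the layout above without typing the files by hand, a small script along these lines should do. The directory names, file names and contents are the ones listed above; `write_example_files` is a hypothetical helper name, and the trailing newline written to each file is my own choice.

```python
from pathlib import Path

# File contents copied from the listing above.
MODELS = {  # reference summaries
    "001_candidate.txt": "0 1 2 3 4",
    "002_candidate.txt": "0 1 2 3 4 5 6 7",
}
SYSTEMS = {  # predicted summaries
    "001_reference.txt": "0 1 2 3 4 5",
    "002_reference.txt": "0 1 2 3 4 5",
}


def write_example_files(base="trys"):
    """Create <base>/pyrouge_models and <base>/pyrouge_systems (hypothetical helper)."""
    base = Path(base)
    for sub, files in (("pyrouge_models", MODELS), ("pyrouge_systems", SYSTEMS)):
        folder = base / sub
        folder.mkdir(parents=True, exist_ok=True)
        for name, text in files.items():
            # One summary per file, whitespace-separated tokens.
            (folder / name).write_text(text + "\n", encoding="utf-8")
    return base
```

Calling `write_example_files()` from the working directory creates the two folders under trys/; pass an absolute path instead if you follow the advice about absolute paths.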
Then run the code. I also ran the rouge package (pltrdy/rouge: A full Python Implementation of the ROUGE Metric (not a wrapper)) on the same data for comparison:
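from pyrouge import Rouge155
from rouge import Rouge

# pyrouge: score the files on disk
r = Rouge155()
r.system_dir = 'trys/pyrouge_systems'
r.model_dir = 'trys/pyrouge_models'
# Patterns as run for the output below (raw strings, so \d is not read as a string escape).
# Note that the model pattern contains no '#ID#' placeholder.
r.model_filename_pattern = r'(\d+)_candidate.txt'
r.system_filename_pattern = r'(\d+)_reference.txt'
output = r.convert_and_evaluate()
print(output)
output_dict = r.output_to_dict(output)

# rouge package: score the same strings in memory
rouge = Rouge()
# References. The rouge package handles Chinese; the same code works with
# refs=['我 不 是 黄 蓉','红 橙 黄 绿 青 蓝 紫 。']
refs = ['0 1 2 3 4', '0 1 2 3 4 5 6 7']
# Predictions. Likewise, hyps=['我 不 是 黄 蓉 啊','红 橙 黄 绿 青 蓝'] works.
hyps = ['0 1 2 3 4 5', '0 1 2 3 4 5']
scores = rouge.get_scores(hyps, refs, avg=True)
print(scores)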
Output:
2022-05-14 09:41:13,118 [MainThread ] [INFO ] Writing summaries.
2022-05-14 09:41:13,118 [MainThread ] [INFO ] Processing summaries. Saving system files to /tmp/tmpq7ugz254/system and model files to /tmp/tmpq7ugz254/model.
2022-05-14 09:41:13,118 [MainThread ] [INFO ] Processing files in trys/pyrouge_systems.
2022-05-14 09:41:13,118 [MainThread ] [INFO ] Processing 001_reference.txt.
2022-05-14 09:41:13,118 [MainThread ] [INFO ] Processing 002_reference.txt.
2022-05-14 09:41:13,119 [MainThread ] [INFO ] Saved processed files to /tmp/tmpq7ugz254/system.
2022-05-14 09:41:13,119 [MainThread ] [INFO ] Processing files in trys/pyrouge_models.
2022-05-14 09:41:13,119 [MainThread ] [INFO ] Processing 001_candidate.txt.
2022-05-14 09:41:13,119 [MainThread ] [INFO ] Processing 002_candidate.txt.
2022-05-14 09:41:13,119 [MainThread ] [INFO ] Saved processed files to /tmp/tmpq7ugz254/model.
2022-05-14 09:41:13,119 [MainThread ] [INFO ] Written ROUGE configuration to /tmp/tmp49jc1wnw/rouge_conf.xml
2022-05-14 09:41:13,119 [MainThread ] [INFO ] Running ROUGE with command fastSum/fastSum/resources/ROUGE/RELEASE-1.5.5/ROUGE-1.5.5.pl -e astSum/fastSum/resources/ROUGE/RELEASE-1.5.5/data -c 95 -2 -1 -U -r 1000 -n 4 -w 1.2 -a -m /tmp/tmp49jc1wnw/rouge_conf.xml
---------------------------------------------
1 ROUGE-1 Average_R: 0.84615 (95%-conf.int. 0.84615 - 0.84615)
1 ROUGE-1 Average_P: 0.91667 (95%-conf.int. 0.91667 - 0.91667)
1 ROUGE-1 Average_F: 0.88000 (95%-conf.int. 0.88000 - 0.88000)
---------------------------------------------
1 ROUGE-2 Average_R: 0.81818 (95%-conf.int. 0.81818 - 0.81818)
1 ROUGE-2 Average_P: 0.90000 (95%-conf.int. 0.90000 - 0.90000)
1 ROUGE-2 Average_F: 0.85714 (95%-conf.int. 0.85714 - 0.85714)
---------------------------------------------
1 ROUGE-3 Average_R: 0.77778 (95%-conf.int. 0.77778 - 0.77778)
1 ROUGE-3 Average_P: 0.87500 (95%-conf.int. 0.87500 - 0.87500)
1 ROUGE-3 Average_F: 0.82353 (95%-conf.int. 0.82353 - 0.82353)
---------------------------------------------
1 ROUGE-4 Average_R: 0.71429 (95%-conf.int. 0.71429 - 0.71429)
1 ROUGE-4 Average_P: 0.83333 (95%-conf.int. 0.83333 - 0.83333)
1 ROUGE-4 Average_F: 0.76923 (95%-conf.int. 0.76923 - 0.76923)
---------------------------------------------
1 ROUGE-L Average_R: 0.84615 (95%-conf.int. 0.84615 - 0.84615)
1 ROUGE-L Average_P: 0.91667 (95%-conf.int. 0.91667 - 0.91667)
1 ROUGE-L Average_F: 0.88000 (95%-conf.int. 0.88000 - 0.88000)
---------------------------------------------
1 ROUGE-W-1.2 Average_R: 0.57431 (95%-conf.int. 0.57431 - 0.57431)
1 ROUGE-W-1.2 Average_P: 0.91742 (95%-conf.int. 0.91742 - 0.91742)
1 ROUGE-W-1.2 Average_F: 0.70641 (95%-conf.int. 0.70641 - 0.70641)
---------------------------------------------
1 ROUGE-S* Average_R: 0.65789 (95%-conf.int. 0.65789 - 0.65789)
1 ROUGE-S* Average_P: 0.83333 (95%-conf.int. 0.83333 - 0.83333)
1 ROUGE-S* Average_F: 0.73529 (95%-conf.int. 0.73529 - 0.73529)
---------------------------------------------
1 ROUGE-SU* Average_R: 0.69388 (95%-conf.int. 0.69388 - 0.69388)
1 ROUGE-SU* Average_P: 0.85000 (95%-conf.int. 0.85000 - 0.85000)
1 ROUGE-SU* Average_F: 0.76405 (95%-conf.int. 0.76405 - 0.76405)
{'rouge-1': {'r': 0.875, 'p': 0.9166666666666667, 'f': 0.8831168781885648}, 'rouge-2': {'r': 0.8571428571428572, 'p': 0.9, 'f': 0.8611111062114198}, 'rouge-l': {'r': 0.875, 'p': 0.9166666666666667, 'f': 0.8831168781885648}}
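The two sets of numbers differ quite a bit, which puzzled me. Scoring each pair by hand and then averaging over the two pairs: rouge-1-r should be (5/5+6/8)/2=(1+0.75)/2=0.875, rouge-1-p should be (5/6+6/6)/2=(0.833+1)/2=0.917, and rouge-1-f is then (2*1*0.833/(1+0.833)+2*0.75*1/(0.75+1))/2=0.883, so the rouge package's numbers look right.
The others: rouge-2-r=(1+5/7)/2=0.857, rouge-2-p=(4/5+1)/2=(0.8+1)/2=0.9, rouge-2-f=(2*1*0.8/(1+0.8)+(2*5/7*1)/(5/7+1))/2=0.861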
rouge-L-r=(1+6/8)/2=(1+0.75)/2=0.875, rouge-L-p=(5/6+1)/2=0.917, rouge-L-f=(2*1*5/6/(1+5/6)+2*0.75*1/(0.75+1))/2=0.883
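The hand calculation can be checked with a few lines of plain Python. This is only a sketch of per-pair ROUGE-N followed by a simple average, not the rouge package's own code; `ngrams`, `rouge_n_pair` and `rouge_n_avg` are names made up for illustration, and using sets of n-grams assumes no n-gram repeats within a summary (true for this toy data).

```python
def ngrams(text, n):
    """Set of n-grams of a whitespace-tokenised string."""
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def rouge_n_pair(hyp, ref, n):
    """Recall, precision and F1 for one (prediction, reference) pair."""
    hyp_grams, ref_grams = ngrams(hyp, n), ngrams(ref, n)
    overlap = len(hyp_grams & ref_grams)
    r = overlap / len(ref_grams)
    p = overlap / len(hyp_grams)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return r, p, f


def rouge_n_avg(hyps, refs, n):
    """Score each pair separately, then average r, p and f over the pairs."""
    scores = [rouge_n_pair(h, r, n) for h, r in zip(hyps, refs)]
    return tuple(sum(col) / len(scores) for col in zip(*scores))


refs = ['0 1 2 3 4', '0 1 2 3 4 5 6 7']   # references
hyps = ['0 1 2 3 4 5', '0 1 2 3 4 5']     # predictions
for n in (1, 2):
    r, p, f = rouge_n_avg(hyps, refs, n)
    print(f"rouge-{n}: r={r:.3f} p={p:.3f} f={f:.3f}")
```

This prints r=0.875 p=0.917 f=0.883 for rouge-1 and r=0.857 p=0.900 f=0.861 for rouge-2, the same values as the rouge package up to rounding.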
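So the rouge package matches the hand calculation and pyrouge does not. I can't tell what ROUGE-1.5.5 does internally: I don't know Perl, and it's 2022, who still writes Perl! That said, the pyrouge figures are exactly what you get if each prediction is scored against both reference files with the counts pooled: ROUGE-1 recall (5+6)/(5+8)=11/13=0.84615 and precision (5+6)/(6+6)=11/12=0.91667. The likely cause is the model_filename_pattern in the code above: pyrouge's README uses a #ID# placeholder there to tie each reference file to its prediction, and the pattern used here has none, so it probably matched both files every time. That would make this a configuration problem rather than a wrong algorithm. Either way, I recommend the rouge package!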
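As a numerical check on the discrepancy, the sketch below rests on two assumptions of mine, neither taken from the ROUGE-1.5.5 source: that pyrouge substitutes the document ID for `#ID#` in the model pattern and regex-matches file names in the model directory, and that with several references ROUGE pools hit and length counts across them. `matching_models` and `pooled_rouge_n` are hypothetical helpers.

```python
import re
from collections import Counter

# Reference files and contents from the example above.
REFS = {"001_candidate.txt": "0 1 2 3 4", "002_candidate.txt": "0 1 2 3 4 5 6 7"}


def matching_models(pattern, doc_id, files=tuple(REFS)):
    """Model files selected for one ID after replacing '#ID#' (assumed behaviour)."""
    regex = re.compile(pattern.replace("#ID#", doc_id))
    return [name for name in files if regex.match(name)]


def ngram_counts(text, n):
    tokens = text.split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def pooled_rouge_n(hyp, refs, n):
    """ROUGE-N with hits and lengths summed over all references (assumed pooling)."""
    hyp_counts = ngram_counts(hyp, n)
    hits = ref_total = hyp_total = 0
    for ref in refs:
        ref_counts = ngram_counts(ref, n)
        hits += sum((hyp_counts & ref_counts).values())  # clipped n-gram matches
        ref_total += sum(ref_counts.values())
        hyp_total += sum(hyp_counts.values())
    r, p = hits / ref_total, hits / hyp_total
    return r, p, 2 * p * r / (p + r)


# The pattern without '#ID#' selects both reference files for document 001.
matched = matching_models(r"(\d+)_candidate.txt", "001")
refs = [REFS[name] for name in matched]
for n in (1, 2, 3, 4):
    r, p, f = pooled_rouge_n("0 1 2 3 4 5", refs, n)
    print(f"ROUGE-{n} R={r:.5f} P={p:.5f} F={f:.5f}")
```

Under those assumptions the loop prints the same ROUGE-1 to ROUGE-4 values as the pyrouge output (0.84615 / 0.91667 / 0.88000 for ROUGE-1, and so on), while `matching_models(r"#ID#_candidate.txt", "001")` selects only 001_candidate.txt. Both predictions are identical, so averaging over the two documents does not change the figures.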