Re19: Paper reading: Paragraph-level Rationale Extraction through Regularization: A case study on European Court

1. Background


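  1. The rationalization-by-construction methodology: the model is regularized directly with constraints that reward it when its decisions are based on the correct rationales, instead of inferring explanations post hoc from the model's decisions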

the model is regularized to satisfy additional constraints that reward the model, if its decisions are based on concise rationales it selects, as opposed to inferring explanations from the model’s decisions in a post-hoc manner

  2. Why interpretability matters: the right to explanation
  3. The legal process:

image.png


2. Model


2.1 Novelty

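  1. Previous work extracted word-level rationales for binary classification; this paper moves to paragraph-level rationales
  2. The first work to apply rationale extraction to end-to-end fine-tuning of pre-trained Transformer models
  3. No human-annotated rationales are required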


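2.2 Model

Constraints: the previously proposed sparsity, continuity (shown experimentally to be ineffective), and comprehensiveness (which has to be adapted to the multi-label setting), plus singularity, newly proposed in this paper (it improves results and is robust)

Baseline HIERBERT-HA: text encoder → rationale extraction → prediction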

image.png


The figure shown in the video is:

image.png


Word-level regularizers

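① Encode each paragraph separately: context-unaware paragraph representations

② Use a 2-layer Transformer to encode contextualized paragraph embeddings

③ Fully connected layers (SELU activation)

K → used for classification

Q → used for rationale extraction → each paragraph is passed separately through a fully connected layer + sigmoid, giving soft attention scores → binarize to get hard attention scores

④ Obtain the hard-masked document representation (hard mask + max pooling); binarization is not differentiable, so a training trick is used

⑤ Fully connected layer + sigmoid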
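The masking and pooling in steps ③ and ④ can be sketched in plain Python. This is only an illustrative forward pass, not the authors' implementation: the function name `hard_masked_document`, the 0.5 threshold, and the choice to zero out masked paragraphs before element-wise max pooling are all assumptions, and the training trick for the non-differentiable step is not modelled here.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hard_masked_document(paragraph_vectors, rationale_logits, threshold=0.5):
    """Return (hard_mask, document_vector) for one document.

    paragraph_vectors: list of equal-length float lists, one per paragraph.
    rationale_logits: one pre-sigmoid score per paragraph (hypothetical input).
    threshold: binarization cut-off (assumed value, not taken from the paper).
    """
    # Soft attention scores: one sigmoid score per paragraph.
    soft_scores = [sigmoid(s) for s in rationale_logits]
    # Hard attention scores: binarize the soft scores.
    hard_mask = [1 if s >= threshold else 0 for s in soft_scores]
    # Assumption: masked-out paragraphs are multiplied by 0.
    masked = [[m * v for v in vec] for m, vec in zip(hard_mask, paragraph_vectors)]
    # Element-wise max pooling over paragraphs.
    document_vector = [max(column) for column in zip(*masked)]
    return hard_mask, document_vector

paragraphs = [[0.2, -1.0, 0.5], [0.9, 0.3, -0.4], [-0.1, 0.8, 0.1]]
logits = [2.0, -3.0, 0.5]
mask, doc = hard_masked_document(paragraphs, logits)
print(mask)  # [1, 0, 1]
print(doc)   # [0.2, 0.8, 0.5]
```

The second paragraph has the largest first feature (0.9), but it is masked out, so it no longer contributes to the pooled document vector.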

Baseline HIERBERT-ALL: does not mask any facts

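Constraints:

① Sparsity: limits the number of selected facts

② Continuity: of no use for this paper's model, but tested anyway

③ Comprehensiveness: how much better the predictions from the kept paragraphs are than those from the discarded ones, or alternatively a comparison of the cosine similarity between the two groups of paragraphs

④ Singularity: the selected mask should do better than a random one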
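Two of these constraints are easy to write down as loss terms. The sketch below is a guess at their general shape rather than the paper's exact formulas: it assumes sparsity is the absolute gap between the selected fraction and a target fraction, and that the cosine variant of comprehensiveness penalizes similarity between the kept and discarded document representations. The names `sparsity_loss`, `comprehensiveness_penalty`, and `target_ratio` are hypothetical.

```python
import math

def sparsity_loss(mask, target_ratio):
    """Absolute gap between the fraction of selected paragraphs and a target.

    Assumed form; target_ratio is a hypothetical hyperparameter.
    """
    selected_ratio = sum(mask) / len(mask)
    return abs(target_ratio - selected_ratio)

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for an all-zero vector
    return dot / (norm_u * norm_v)

def comprehensiveness_penalty(kept_vector, discarded_vector):
    """Assumption: the two representations should be dissimilar,
    so their cosine similarity is used directly as the penalty."""
    return cosine_similarity(kept_vector, discarded_vector)

# Two of five paragraphs selected, against a target of 30%.
print(sparsity_loss([1, 0, 1, 0, 0], target_ratio=0.3))
# Orthogonal kept/discarded representations incur no penalty.
print(comprehensiveness_penalty([1.0, 0.0], [0.0, 1.0]))
```

In training, terms like these would be added to the classification loss with their own weights; the notes do not record which weights the paper uses.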

Rationale supervision: noisy rationale supervision

image.png

image.png


3. Experiments


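3.1 Dataset

Introduces the ECtHR dataset: English case texts with silver/gold rationales; the facts (events) are in chronological order, and each decision includes the violated articles and the cited precedents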


3.2 Experimental setup

Hyperparameters:

image.png

Grid search, Adam, learning rate 2e-5

Greedy hyperparameter tuning

LEGAL-BERT-SMALL:

50 paragraphs of 256 words
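A minimal sketch of how such an input budget could be enforced, assuming the limits act as upper bounds and that "words" means whitespace-separated tokens (the real pipeline presumably uses the model's own tokenizer). The function name `truncate_document` is hypothetical.

```python
def truncate_document(paragraphs, max_paragraphs=50, max_words=256):
    """Keep the first max_paragraphs paragraphs and the first
    max_words whitespace-separated words of each one."""
    truncated = []
    for paragraph in paragraphs[:max_paragraphs]:
        words = paragraph.split()
        truncated.append(" ".join(words[:max_words]))
    return truncated

# A toy document with 60 paragraphs of 300 words each.
document = ["fact " * 300 for _ in range(60)]
clipped = truncate_document(document)
print(len(clipped), len(clipped[0].split()))  # 50 256
```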


3.3 Experimental results

Metrics:

micro-F1

Faithfulness: sufficiency and comprehensiveness

Rationale quality: objective / subjective (mean R-Precision (mRP), Precision@k)
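For reference, micro-F1 and the ranking-based rationale metrics follow standard definitions, sketched below in plain Python. The function names and the toy data are illustrative only; the paper's own evaluation code may differ in details such as tie handling.

```python
def micro_f1(gold_labels, predicted_labels):
    """Micro-averaged F1: pool TP/FP/FN over all documents and labels.

    Each element of gold_labels / predicted_labels is a set of label ids.
    """
    tp = fp = fn = 0
    for gold, pred in zip(gold_labels, predicted_labels):
        tp += len(gold & pred)
        fp += len(pred - gold)
        fn += len(gold - pred)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def precision_at_k(ranked_paragraphs, gold_rationales, k):
    """Fraction of the top-k ranked paragraphs that are gold rationales."""
    top_k = ranked_paragraphs[:k]
    return sum(1 for p in top_k if p in gold_rationales) / k

def r_precision(ranked_paragraphs, gold_rationales):
    """Precision@R, where R is the number of gold rationales."""
    r = len(gold_rationales)
    if r == 0:
        return 0.0
    return precision_at_k(ranked_paragraphs, gold_rationales, r)

def mean_r_precision(rankings, gold_sets):
    """Average R-Precision over documents (mRP)."""
    scores = [r_precision(r, g) for r, g in zip(rankings, gold_sets)]
    return sum(scores) / len(scores)

# Paragraph ids ranked by soft attention score, against gold rationales {0, 1}.
print(r_precision([4, 0, 2, 1], {0, 1}))        # 0.5
print(precision_at_k([4, 0, 2, 1], {0, 1}, 4))  # 0.5
```

R-Precision adapts the cut-off to each case (as many paragraphs as there are gold rationales), while Precision@k uses the same k for every case.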

image.png

image.png

image.png

image.png


4. Code reproduction


To be done once my server is back up.
