Re14: Paper Reading: ILLSI Interpretable Low-Resource Legal Decision Making


1. Background


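likelihood of confusion: a new trademark that is too similar to an existing earlier trademark would cause confusion, so it is not allowed.

Low resource: deep learning models are said to perform well with small amounts of labeled data (this seems odd to me...), via (1) transfer learning + fine-tuning (sensitive to hyperparameters) or (2) weak or distant supervision.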


Interpretability


2. Dataset


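(The authors said they would release it, but it has not been released yet.) It consists of two parts:

  1. 525 samples: annotated with intermediate labels that measure similarity from 5 aspects. Split into training / validation / test sets.
  2. 2852 samples: all used as training data.
  3. augment: the augmented dataset.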

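I don't understand how the labels of the augment dataset are obtained. Does it mean that sentences similar to ones in the clean data receive the same intermediate labels? And is the final label then simply taken as the maximum? What is the threshold? This doesn't seem to match the manual filtering rules either: is there no dependency between the features?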


3. Model


3.1 Main model


3.2 curriculum learning

Curriculum learning is used when generating the intermediate labels.
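As a generic illustration of curriculum learning (not the paper's procedure, whose difficulty criterion and schedule are not recorded in these notes), a minimal sketch might order the training examples from easy to hard and expose them in growing stages. The function name `curriculum_stages`, the `difficulty` scoring function, and the equal-sized stage schedule are all assumptions made for this example.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def curriculum_stages(
    examples: Sequence[T],
    difficulty: Callable[[T], float],  # hypothetical difficulty score, lower = easier
    n_stages: int = 3,
) -> Iterator[List[T]]:
    """Yield growing training subsets, easiest examples first."""
    ordered = sorted(examples, key=difficulty)
    for stage in range(1, n_stages + 1):
        # Integer ceiling of len * stage / n_stages, so the last stage is the full set.
        cutoff = (len(ordered) * stage + n_stages - 1) // n_stages
        yield ordered[:cutoff]


if __name__ == "__main__":
    # Toy data: use string length as a stand-in for difficulty.
    data = ["aaaa", "a", "aaa", "aa"]
    for i, subset in enumerate(curriculum_stages(data, difficulty=len, n_stages=2), 1):
        print(f"stage {i}: train on {subset}")
```

In a real training loop each stage's subset would be passed to the usual fine-tuning routine before moving on to the next, harder stage.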



3.3 Why not a multi-task paradigm


(The experiments also use multi-task learning as a baseline.)


4. Experiments


4.1 baseline: RoBERTa

  1. End-to-End
  2. Multi-task

4.2 Experimental setup


4.3 Main results


The paper also reports prediction results for the intermediate labels.



4.4 Calibration (I haven't fully understood this part yet)

Expected Calibration Error (ECE)
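To make the metric concrete, here is a minimal sketch of the standard binned ECE: predictions are grouped into equal-width confidence bins, and the gaps |accuracy - mean confidence| of the bins are averaged, weighted by the fraction of samples in each bin. The function name, the default of 10 bins, and the toy inputs are assumptions for illustration; the notes do not record which binning the paper uses.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin size / n) * |accuracy - mean confidence|."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)  # 1.0 if the prediction was right, else 0.0
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        # Bins are (lo, hi]; a confidence of exactly 0 goes into the first bin.
        in_bin = (confidences > lo) & (confidences <= hi)
        if lo == 0.0:
            in_bin |= confidences == 0.0
        if not in_bin.any():
            continue  # empty bins contribute nothing
        accuracy = correct[in_bin].mean()
        mean_confidence = confidences[in_bin].mean()
        ece += in_bin.mean() * abs(accuracy - mean_confidence)
    return float(ece)


if __name__ == "__main__":
    # Four predictions made with 90% confidence, of which only three are correct.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

In the toy call the model claims 90% confidence but is right 75% of the time, so the ECE comes out at about 0.15; a perfectly calibrated model would score 0.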



5. Thoughts on the paper


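No matter how I look at it, I still think this task should be tackled with a multimodal paradigm, for example by comparing the marks' images and pronunciations. (I saw that Analytics and EU Courts: The Case of Trademark Disputes really does use CV, so I can rest content.)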


6. Code reproduction


Neither the dataset nor the code has been released, and the issue I opened got no reply, so I cannot reproduce it.
