Summary | ACL 2022 Main Conference Papers by Category (Part 2)

Representation Learning



  • A Contrastive Framework for Learning Sentence Representations from Pairwise and Triple-wise Perspective in Angular Space


  • Auto-Debias: Debiasing Masked Language Models with Automated Biased Prompts


  • Compact Token Representations with Contextual Quantization for Efficient Document Re-ranking


  • Contextual Representation Learning beyond Masked Language Modeling


  • Contrastive Visual Semantic Pretraining Magnifies the Semantics of Natural Language Representations


  • Cross-Lingual Contrastive Learning for Fine-Grained Entity Typing for Low-Resource Languages


  • Cross-Modal Discrete Representation Learning


  • Debiased Contrastive Learning of Unsupervised Sentence Representations


  • Enhancing Chinese Pre-trained Language Model via Heterogeneous Linguistics Graph


  • GL-CLeF: A Global–Local Contrastive Learning Framework for Cross-lingual Spoken Language Understanding


  • Improving Event Representation via Simultaneous Weakly Supervised Contrastive Learning and Clustering


  • Just Rank: Rethinking Evaluation with Word and Sentence Similarities


  • Language-agnostic BERT Sentence Embedding


  • Learning Disentangled Representations of Negation and Uncertainty


  • Learning Disentangled Textual Representations via Statistical Measures of Similarity


  • Multilingual Molecular Representation Learning via Contrastive Pre-training


  • Nibbling at the Hard Core of Word Sense Disambiguation


  • Noisy Channel Language Model Prompting for Few-Shot Text Classification


  • Rare and Zero-shot Word Sense Disambiguation using Z-Reweighting


  • Sentence-level Privacy for Document Embeddings


  • Softmax Bottleneck Makes Language Models Unable to Represent Multimode Word Distributions


  • SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer


  • Tackling Fake News Detection by Continually Improving Social Context Representations using Graph Neural Networks


  • The Grammar-Learning Trajectories of Neural Language Models


  • Using Context-to-Vector with Graph Retrofitting to Improve Word Embeddings


Machine Learning for NLP



  • A Rationale-Centric Framework for Human-in-the-loop Machine Learning


  • Bias Mitigation in Machine Translation Quality Estimation


  • Disentangled Sequence to Sequence Learning for Compositional Generalization


  • DoCoGen: Domain Counterfactual Generation for Low Resource Domain Adaptation


  • Domain Adaptation in Multilingual and Multi-Domain Monolingual Settings for Complex Word Identification


  • Domain Knowledge Transferring for Pre-trained Language Model via Calibrated Activation Boundary Distillation


  • Learning Functional Distributional Semantics with Visual Data


  • Leveraging Relaxed Equilibrium by Lazy Transition for Sequence Modeling


  • Local Languages, Third Spaces, and other High-Resource Scenarios


  • Meta-learning via Language Model In-context Tuning


  • MPII: Multi-Level Mutual Promotion for Inference and Interpretation


  • On the Calibration of Pre-trained Language Models using Mixup Guided by Area Under the Margin and Saliency


  • Overcoming a Theoretical Limitation of Self-Attention


  • Rethinking Negative Sampling for Handling Missing Entity Annotations


  • Rethinking Self-Supervision Objectives for Generalizable Coherence Modeling


  • Robust Lottery Tickets for Pre-trained Language Models


  • Sharpness-Aware Minimization Improves Language Model Generalization


  • Skill Induction and Planning with Latent Language


  • The Trade-offs of Domain Adaptation for Neural Language Models


  • Distributionally Robust Finetuning BERT for Covariate Drift in Spoken Language Understanding


  • Learning to Imagine: Integrating Counterfactual Thinking in Neural Discrete Reasoning


Machine Translation and Multilinguality


Translation


  • Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction


  • Alternative Input Signals Ease Transfer in Multilingual Machine Translation


  • BiTIIMT: A Bilingual Text-infilling Method for Interactive Machine Translation


  • Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation


  • Can Transformer be Too Compositional? Analysing Idiom Processing in Neural Machine Translation


  • CipherDAug: Ciphertext based Data Augmentation for Neural Machine Translation


  • Conditional Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation


  • Confidence Based Bidirectional Global Context Aware Training Framework for Neural Machine Translation


  • DEEP: DEnoising Entity Pre-training for Neural Machine Translation


  • DiBiMT: A Novel Benchmark for Measuring Word Sense Disambiguation Biases in Machine Translation


  • Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models


  • EAG: Extract and Generate Multi-way Aligned Corpus for Complete Multilingual Neural Machine Translation


  • Efficient Cluster-Based k-Nearest-Neighbor Machine Translation


  • Flow-Adapter Architecture for Unsupervised Machine Translation


  • From Simultaneous to Streaming Machine Translation by Leveraging Streaming History


  • Improving Word Translation via Two-Stage Contrastive Learning


  • Integrating Vectorized Lexical Constraints for Neural Machine Translation


  • Investigating Failures of Automatic Translation in the Case of Unambiguous Gender


  • Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation


  • Learning Confidence for Transformer-based Neural Machine Translation


  • Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation


  • Learning When to Translate for Streaming Speech


  • Measuring and Mitigating Name Biases in Neural Machine Translation


  • Modeling Dual Read/Write Paths for Simultaneous Machine Translation


  • MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators


  • Multilingual Document-Level Translation Enables Zero-Shot Transfer From Sentences to Documents


  • Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation


  • Neural Machine Translation with Phrase-Level Universal Visual Representations


  • On Vision Features in Multimodal Machine Translation


  • Overcoming Catastrophic Forgetting beyond Continual Learning: Balanced Training for Neural Machine Translation


  • Prediction Difference Regularization against Perturbation for Neural Machine Translation


  • Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation


  • Reducing Position Bias in Simultaneous Machine Translation with Length Aware Framework


  • Scheduled Multi-task Learning for Neural Chat Translation


  • The Paradox of the Compositionality of Natural Language: A Neural Machine Translation Case Study


  • Towards Making the Most of Cross-Lingual Transfer for Zero-Shot Neural Machine Translation


  • Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation


  • Unified Speech-Text Pre-training for Speech Translation and Recognition


  • UniTE: Unified Translation Evaluation


  • Universal Conditional Masked Language Pre-training for Neural Machine Translation


Multilinguality



  • AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages


  • Cross-Lingual Ability of Multilingual Masked Language Models: A Study of Language Structure


  • Domain Adaptation in Multilingual and Multi-Domain Monolingual Settings for Complex Word Identification


  • Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation


  • Match the Script, Adapt if Multilingual: Analyzing the Effect of Multilingual Pretraining on Cross-lingual Transferability


  • mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models


  • Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models


  • Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction


  • Multilingual Knowledge Graph Completion with Self-Supervised Adaptive Graph Alignment


  • Multilingual Molecular Representation Learning via Contrastive Pre-training


  • Multilingual unsupervised sequence segmentation transfers to extremely low-resource languages


  • One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia


  • Prix-LM: Pretraining for Multilingual Knowledge Base Construction


  • Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency


Question Answering (QA and Comprehension)



Reading Comprehension


  • AdaLoGN: Adaptive Logic Graph Network for Reasoning-Based Machine Reading Comprehension


  • Deep Inductive Logic Reasoning for Multi-Hop Reading Comprehension


  • Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge


  • Learning Disentangled Semantic Representations for Zero-Shot Cross-Lingual Transfer in Multilingual Machine Reading Comprehension


  • Lite Unified Modeling for Discriminative Reading Comprehension


  • Modeling Temporal-Modal Entity Graph for Procedural Multimodal Machine Comprehension


  • What Makes Reading Comprehension Questions Difficult?


  • MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data


Question Answering


  • Answer-level Calibration for Free-form Multiple Choice Question Answering


  • Answering Open-Domain Multi-Answer Questions via a Recall-then-Verify Framework


  • CQG: A Simple and Effective Controlled Generation Framework for Multi-hop Question Generation


  • Ditch the Gold Standard: Re-evaluating Conversational Question Answering


  • Generated Knowledge Prompting for Commonsense Reasoning


  • How Do We Answer Complex Questions: Discourse Structure of Long-form Answers


  • Hypergraph Transformer: Weakly-Supervised Multi-hop Reasoning for Knowledge-based Visual Question Answering


  • Hyperlink-induced Pre-training for Passage Retrieval in Open-domain Question Answering


  • Improving Time Sensitivity for Question Answering over Temporal Knowledge Graphs


  • It is AI’s Turn to Ask Humans a Question: Question-Answer Pair Generation for Children’s Story Books


  • KaFSP: Knowledge-Aware Fuzzy Semantic Parsing for Conversational Question Answering over a Large-Scale Knowledge Base


  • KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering


  • MMCoQA: Conversational Question Answering over Text, Tables, and Images


  • Modeling Multi-hop Question Answering as Single Sequence Prediction


  • On the Robustness of Question Rewriting Systems to Questions of Varying Hardness


  • Open Domain Question Answering with A Unified Knowledge Interface


  • Program Transfer for Answering Complex Questions over Knowledge Bases


  • Retrieval-guided Counterfactual Generation for QA


  • RNG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering


  • Sequence-to-Sequence Knowledge Graph Completion and Question Answering


  • Simulating Bandit Learning from User Feedback for Extractive Question Answering


  • Subgraph Retrieval Enhanced Model for Multi-hop Knowledge Base Question Answering


  • Synthetic Question Value Estimation for Domain Adaptation of Question Answering


  • Your Answer is Incorrect… Would you like to know why? Introducing a Bilingual Short Answer Feedback Dataset


Resources and Evaluation



Datasets


  • A Statutory Article Retrieval Dataset in French


  • CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark


  • Chart-to-Text: A Large-Scale Benchmark for Chart Summarization


  • CICERO: A Dataset for Contextualized Commonsense Inference in Dialogues


  • CLUES: A Benchmark for Learning Classifiers using Natural Language Explanations


  • ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers


  • Cree Corpus: A Collection of nêhiyawêwin Resources


  • Detecting Unassimilated Borrowings in Spanish: An Annotated Corpus and Approaches to Modeling


  • DialFact: A Benchmark for Fact-Checking in Dialogue


  • DiBiMT: A Novel Benchmark for Measuring Word Sense Disambiguation Biases in Machine Translation


  • Down and Across: Introducing Crossword-Solving as a New NLP Benchmark


  • e-CARE: a New Dataset for Exploring Explainable Causal Reasoning


  • EntSUM: A Data Set for Entity-Centric Extractive Summarization


  • ePiC: Employing Proverbs in Context as a Benchmark for Abstract Language Understanding


  • FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing


  • Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset for Narrative Comprehension


  • Few-Shot Tabular Data Enrichment Using Fine-Tuned Transformer Architectures


  • French CrowS-Pairs: Extending a challenge dataset for measuring social bias in masked language models to a language other than English


  • From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology


  • HiTab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation


  • IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks


  • Image Retrieval from Contextual Descriptions


  • KQA Pro: A Dataset with Explicit Compositional Programs for Complex Question Answering over Knowledge Base


  • LexGLUE: A Benchmark Dataset for Legal Language Understanding in English


  • M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database


  • MSCTD: A Multimodal Sentiment Chat Translation Dataset


  • NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks


  • QuoteR: A Benchmark of Quote Recommendation for Writing


  • Reports of personal experiences and stories in argumentation: datasets and analysis


  • RNSum: A Large-Scale Dataset for Automatic Release Note Generation via Commit Logs Summarization


  • SciNLI: A Corpus for Natural Language Inference on Scientific Text


  • SummScreen: A Dataset for Abstractive Screenplay Summarization


  • SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities


  • Textomics: A Dataset for Genomics Data Summary Generation


  • The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems


  • ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection


  • VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena


  • WatClaimCheck: A new Dataset for Claim Entailment and Inference


  • Your Answer is Incorrect… Would you like to know why? Introducing a Bilingual Short Answer Feedback Dataset

