Summary | ACL 2022 Main Conference Papers by Category (Part 2)
Representation Learning【表示学习】
- A Contrastive Framework for Learning Sentence Representations from Pairwise and Triple-wise Perspective in Angular Space
- Auto-Debias: Debiasing Masked Language Models with Automated Biased Prompts
- Compact Token Representations with Contextual Quantization for Efficient Document Re-ranking
- Contextual Representation Learning beyond Masked Language Modeling
- Contrastive Visual Semantic Pretraining Magnifies the Semantics of Natural Language Representations
- Cross-Lingual Contrastive Learning for Fine-Grained Entity Typing for Low-Resource Languages
- Cross-Modal Discrete Representation Learning
- Debiased Contrastive Learning of Unsupervised Sentence Representations
- Enhancing Chinese Pre-trained Language Model via Heterogeneous Linguistics Graph
- GL-CLeF: A Global–Local Contrastive Learning Framework for Cross-lingual Spoken Language Understanding
- Improving Event Representation via Simultaneous Weakly Supervised Contrastive Learning and Clustering
- Just Rank: Rethinking Evaluation with Word and Sentence Similarities
- Language-agnostic BERT Sentence Embedding
- Learning Disentangled Representations of Negation and Uncertainty
- Learning Disentangled Textual Representations via Statistical Measures of Similarity
- Multilingual Molecular Representation Learning via Contrastive Pre-training
- Nibbling at the Hard Core of Word Sense Disambiguation
- Noisy Channel Language Model Prompting for Few-Shot Text Classification
- Rare and Zero-shot Word Sense Disambiguation using Z-Reweighting
- Sentence-level Privacy for Document Embeddings
- Softmax Bottleneck Makes Language Models Unable to Represent Multi-mode Word Distributions
- SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
- Tackling Fake News Detection by Continually Improving Social Context Representations using Graph Neural Networks
- The Grammar-Learning Trajectories of Neural Language Models
- Using Context-to-Vector with Graph Retrofitting to Improve Word Embeddings
Machine Learning for NLP【NLP中的机器学习】
- A Rationale-Centric Framework for Human-in-the-loop Machine Learning
- Bias Mitigation in Machine Translation Quality Estimation
- Disentangled Sequence to Sequence Learning for Compositional Generalization
- DoCoGen: Domain Counterfactual Generation for Low Resource Domain Adaptation
- Domain Adaptation in Multilingual and Multi-Domain Monolingual Settings for Complex Word Identification
- Domain Knowledge Transferring for Pre-trained Language Model via Calibrated Activation Boundary Distillation
- Learning Functional Distributional Semantics with Visual Data
- Leveraging Relaxed Equilibrium by Lazy Transition for Sequence Modeling
- Local Languages, Third Spaces, and other High-Resource Scenarios
- Meta-learning via Language Model In-context Tuning
- MPII: Multi-Level Mutual Promotion for Inference and Interpretation
- On the Calibration of Pre-trained Language Models using Mixup Guided by Area Under the Margin and Saliency
- Overcoming a Theoretical Limitation of Self-Attention
- Rethinking Negative Sampling for Handling Missing Entity Annotations
- Rethinking Self-Supervision Objectives for Generalizable Coherence Modeling
- Robust Lottery Tickets for Pre-trained Language Models
- Sharpness-Aware Minimization Improves Language Model Generalization
- Skill Induction and Planning with Latent Language
- The Trade-offs of Domain Adaptation for Neural Language Models
- Distributionally Robust Finetuning BERT for Covariate Drift in Spoken Language Understanding
- Learning to Imagine: Integrating Counterfactual Thinking in Neural Discrete Reasoning
Machine Translation and Multilinguality【机器翻译与多语】
Translation
- Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction
- Alternative Input Signals Ease Transfer in Multilingual Machine Translation
- BiTIIMT: A Bilingual Text-infilling Method for Interactive Machine Translation
- Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation
- Can Transformer be Too Compositional? Analysing Idiom Processing in Neural Machine Translation
- CipherDAug: Ciphertext based Data Augmentation for Neural Machine Translation
- Conditional Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation
- Confidence Based Bidirectional Global Context Aware Training Framework for Neural Machine Translation
- DEEP: DEnoising Entity Pre-training for Neural Machine Translation
- DiBiMT: A Novel Benchmark for Measuring Word Sense Disambiguation Biases in Machine Translation
- Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models
- EAG: Extract and Generate Multi-way Aligned Corpus for Complete Multilingual Neural Machine Translation
- Efficient Cluster-Based k-Nearest-Neighbor Machine Translation
- Flow-Adapter Architecture for Unsupervised Machine Translation
- From Simultaneous to Streaming Machine Translation by Leveraging Streaming History
- Improving Word Translation via Two-Stage Contrastive Learning
- Integrating Vectorized Lexical Constraints for Neural Machine Translation
- Investigating Failures of Automatic Translation in the Case of Unambiguous Gender
- Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation
- Learning Confidence for Transformer-based Neural Machine Translation
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation
- Learning When to Translate for Streaming Speech
- Measuring and Mitigating Name Biases in Neural Machine Translation
- Modeling Dual Read/Write Paths for Simultaneous Machine Translation
- MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators
- Multilingual Document-Level Translation Enables Zero-Shot Transfer From Sentences to Documents
- Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation
- Neural Machine Translation with Phrase-Level Universal Visual Representations
- On Vision Features in Multimodal Machine Translation
- Overcoming Catastrophic Forgetting beyond Continual Learning: Balanced Training for Neural Machine Translation
- Prediction Difference Regularization against Perturbation for Neural Machine Translation
- Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation
- Reducing Position Bias in Simultaneous Machine Translation with Length-Aware Framework
- Scheduled Multi-task Learning for Neural Chat Translation
- The Paradox of the Compositionality of Natural Language: A Neural Machine Translation Case Study
- Towards Making the Most of Cross-Lingual Transfer for Zero-Shot Neural Machine Translation
- Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation
- Unified Speech-Text Pre-training for Speech Translation and Recognition
- UniTE: Unified Translation Evaluation
- Universal Conditional Masked Language Pre-training for Neural Machine Translation
Multilinguality
- AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages
- Cross-Lingual Ability of Multilingual Masked Language Models: A Study of Language Structure
- Domain Adaptation in Multilingual and Multi-Domain Monolingual Settings for Complex Word Identification
- Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation
- Match the Script, Adapt if Multilingual: Analyzing the Effect of Multilingual Pretraining on Cross-lingual Transferability
- mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models
- Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models
- Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction
- Multilingual Knowledge Graph Completion with Self-Supervised Adaptive Graph Alignment
- Multilingual Molecular Representation Learning via Contrastive Pre-training
- Multilingual unsupervised sequence segmentation transfers to extremely low-resource languages
- One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
- Prix-LM: Pretraining for Multilingual Knowledge Base Construction
- Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency
Question Answering【问答与理解】
Reading Comprehension
- AdaLoGN: Adaptive Logic Graph Network for Reasoning-Based Machine Reading Comprehension
- Deep Inductive Logic Reasoning for Multi-Hop Reading Comprehension
- Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge
- Learning Disentangled Semantic Representations for Zero-Shot Cross-Lingual Transfer in Multilingual Machine Reading Comprehension
- Lite Unified Modeling for Discriminative Reading Comprehension
- Modeling Temporal-Modal Entity Graph for Procedural Multimodal Machine Comprehension
- What Makes Reading Comprehension Questions Difficult?
- MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data
Question Answering
- Answer-level Calibration for Free-form Multiple Choice Question Answering
- Answering Open-Domain Multi-Answer Questions via a Recall-then-Verify Framework
- CQG: A Simple and Effective Controlled Generation Framework for Multi-hop Question Generation
- Ditch the Gold Standard: Re-evaluating Conversational Question Answering
- Generated Knowledge Prompting for Commonsense Reasoning
- How Do We Answer Complex Questions: Discourse Structure of Long-form Answers
- Hypergraph Transformer: Weakly-Supervised Multi-hop Reasoning for Knowledge-based Visual Question Answering
- Hyperlink-induced Pre-training for Passage Retrieval in Open-domain Question Answering
- Improving Time Sensitivity for Question Answering over Temporal Knowledge Graphs
- It is AI’s Turn to Ask Humans a Question: Question-Answer Pair Generation for Children’s Story Books
- KaFSP: Knowledge-Aware Fuzzy Semantic Parsing for Conversational Question Answering over a Large-Scale Knowledge Base
- KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering
- MMCoQA: Conversational Question Answering over Text, Tables, and Images
- Modeling Multi-hop Question Answering as Single Sequence Prediction
- On the Robustness of Question Rewriting Systems to Questions of Varying Hardness
- Open Domain Question Answering with A Unified Knowledge Interface
- Program Transfer for Answering Complex Questions over Knowledge Bases
- Retrieval-guided Counterfactual Generation for QA
- RNG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering
- Sequence-to-Sequence Knowledge Graph Completion and Question Answering
- Simulating Bandit Learning from User Feedback for Extractive Question Answering
- Subgraph Retrieval Enhanced Model for Multi-hop Knowledge Base Question Answering
- Synthetic Question Value Estimation for Domain Adaptation of Question Answering
- Your Answer is Incorrect… Would you like to know why? Introducing a Bilingual Short Answer Feedback Dataset
Resources and Evaluation【数据集与评估方法】
Datasets
- A Statutory Article Retrieval Dataset in French
- CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark
- Chart-to-Text: A Large-Scale Benchmark for Chart Summarization
- CICERO: A Dataset for Contextualized Commonsense Inference in Dialogues
- CLUES: A Benchmark for Learning Classifiers using Natural Language Explanations
- ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers
- Cree Corpus: A Collection of nêhiyawêwin Resources
- Detecting Unassimilated Borrowings in Spanish: An Annotated Corpus and Approaches to Modeling
- DialFact: A Benchmark for Fact-Checking in Dialogue
- DiBiMT: A Novel Benchmark for Measuring Word Sense Disambiguation Biases in Machine Translation
- Down and Across: Introducing Crossword-Solving as a New NLP Benchmark
- e-CARE: a New Dataset for Exploring Explainable Causal Reasoning
- EntSUM: A Data Set for Entity-Centric Extractive Summarization
- ePiC: Employing Proverbs in Context as a Benchmark for Abstract Language Understanding
- FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing
- Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset for Narrative Comprehension
- Few-Shot Tabular Data Enrichment Using Fine-Tuned Transformer Architectures
- French CrowS-Pairs: Extending a challenge dataset for measuring social bias in masked language models to a language other than English
- From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology
- HiTab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation
- IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks
- Image Retrieval from Contextual Descriptions
- KQA Pro: A Dataset with Explicit Compositional Programs for Complex Question Answering over Knowledge Base
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
- M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database
- MSCTD: A Multimodal Sentiment Chat Translation Dataset
- NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks
- QuoteR: A Benchmark of Quote Recommendation for Writing
- Reports of personal experiences and stories in argumentation: datasets and analysis
- RNSum: A Large-Scale Dataset for Automatic Release Note Generation via Commit Logs Summarization
- SciNLI: A Corpus for Natural Language Inference on Scientific Text
- SummScreen: A Dataset for Abstractive Screenplay Summarization
- SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
- Textomics: A Dataset for Genomics Data Summary Generation
- The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems
- ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection
- VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena
- WatClaimCheck: A new Dataset for Claim Entailment and Inference
- Your Answer is Incorrect… Would you like to know why? Introducing a Bilingual Short Answer Feedback Dataset