CV - Computer Vision | ML - Machine Learning | RL - Reinforcement Learning | NLP - Natural Language Processing
Subjects: cs.CV, cs.CL, cs.LG
1. SemSup: Semantic Supervision for Simple and Scalable Zero-shot Generalization
Title: SemSup: Semantic Supervision for Simple and Scalable Zero-shot Generalization
Authors: Austin W. Hanjie, Ameet Deshpande, Karthik Narasimhan
Link: https://arxiv.org/abs/2202.13100
Abstract:
Zero-shot learning is the problem of predicting instances over classes not seen during training. One approach to zero-shot learning is providing auxiliary class information to the model. Prior work along this vein has largely used expensive per-instance annotation or singular class-level descriptions, but per-instance descriptions are hard to scale and single class descriptions may not be rich enough. Furthermore, these works have used natural-language descriptions exclusively, simple bi-encoder models, and modality- or task-specific methods. These approaches have several limitations: text supervision may not always be available or optimal, and bi-encoders may only learn coarse relations between inputs and class descriptions. In this work, we present SemSup, a novel approach that uses (1) a scalable multiple-description sampling method which improves performance over single descriptions, (2) alternative description formats such as JSON that are easy to generate and outperform text in certain settings, and (3) hybrid lexical-semantic similarity to leverage fine-grained information in class descriptions. We demonstrate the effectiveness of SemSup across four datasets, two modalities, and three generalization settings. For example, across text and image datasets, SemSup increases unseen-class generalization accuracy by 15 points on average compared to the closest baseline.
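The multiple-description sampling of (1) and the hybrid lexical-semantic similarity of (3) can be sketched roughly as below. This is an illustrative toy, not the paper's formulation: the Jaccard overlap, the fixed mixing weight `alpha`, and all function names are assumptions made for the sketch.

```python
import random

def lexical_overlap(input_tokens, desc_tokens):
    """Jaccard overlap between input and class-description token sets."""
    a, b = set(input_tokens), set(desc_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0

def cosine(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = sum(x * x for x in u) ** 0.5
    nv = sum(x * x for x in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def hybrid_score(input_vec, desc_vec, input_tokens, desc_tokens, alpha=0.5):
    """Hybrid lexical-semantic similarity: blend embedding cosine
    with fine-grained token overlap from the class description."""
    return (alpha * cosine(input_vec, desc_vec)
            + (1 - alpha) * lexical_overlap(input_tokens, desc_tokens))

def predict(input_vec, input_tokens, class_descriptions, rng):
    """Score each class against one randomly sampled description
    (multiple-description sampling) and return the argmax class.
    class_descriptions: {class_name: [(desc_vec, desc_tokens), ...]}"""
    best_cls, best_score = None, float("-inf")
    for cls, descs in class_descriptions.items():
        desc_vec, desc_tokens = rng.choice(descs)
        score = hybrid_score(input_vec, desc_vec, input_tokens, desc_tokens)
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls
```

Because classes are represented only by their descriptions, unseen classes can be scored at test time simply by supplying new entries in `class_descriptions`.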
2. Continual Few-Shot Learning Using HyperTransformers
Title: Continual Few-Shot Learning Using HyperTransformers
Authors: Max Vladymyrov, Andrey Zhmoginov, Mark Sandler
Link: https://arxiv.org/abs/2301.04584
Abstract:
We focus on the problem of learning without forgetting from multiple tasks arriving sequentially, where each task is defined using a few-shot episode of novel or already seen classes. We approach this problem using the recently published HyperTransformer (HT), a Transformer-based hypernetwork that generates specialized task-specific CNN weights directly from the support set. In order to learn from a continual sequence of tasks, we propose to recursively re-use the generated weights as input to the HT for the next task. This way, the generated CNN weights themselves act as a representation of previously learned tasks, and the HT is trained to update these weights so that the new task can be learned without forgetting past tasks. This approach is different from most continual learning algorithms that typically rely on using replay buffers, weight regularization or task-dependent architectural changes. We demonstrate that our proposed Continual HyperTransformer method equipped with a prototypical loss is capable of learning and retaining knowledge about past tasks for a variety of scenarios, including learning from mini-batches, and task-incremental and class-incremental learning scenarios.
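The recursive weight re-use can be illustrated with a toy sketch. Here `hypertransformer_step` is a hypothetical stand-in (a simple blending rule), not the actual Transformer-based hypernetwork; the point is only the recursion, where the weights generated for task t become the input for task t+1.

```python
def hypertransformer_step(prev_weights, support_set):
    """Toy stand-in for the HT update: blends the previous task's
    generated weights with statistics of the new task's support set.
    The real HT is a Transformer hypernetwork emitting full CNN weights."""
    task_repr = [sum(x[i] for x in support_set) / len(support_set)
                 for i in range(len(prev_weights))]
    # Recursive re-use: the new weights are a function of the previous
    # weights, so earlier tasks stay encoded without a replay buffer.
    return [0.5 * w + 0.5 * t for w, t in zip(prev_weights, task_repr)]

def continual_learning(initial_weights, task_stream):
    """Feed each task's generated weights back in for the next task."""
    weights = initial_weights
    history = []
    for support_set in task_stream:
        weights = hypertransformer_step(weights, support_set)
        history.append(list(weights))
    return weights, history
```

Note how the final weights retain a contribution from every earlier task, which is the mechanism the paper relies on in place of replay buffers or weight regularization.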
3. Universal Domain Adaptation for Remote Sensing Image Scene Classification
Title: Universal Domain Adaptation for Remote Sensing Image Scene Classification
Authors: Qingsong Xu, Yilei Shi, Xin Yuan, Xiao Xiang Zhu
Link: https://arxiv.org/abs/2301.11387
Code: https://github.com/zhu-xlab/UniDA
Abstract:
The domain adaptation (DA) approaches available to date are usually not well suited for practical DA scenarios of remote sensing image classification, since these methods (such as unsupervised DA) rely on rich prior knowledge about the relationship between label sets of source and target domains, and source data are often not accessible due to privacy or confidentiality issues. To this end, we propose a practical universal domain adaptation setting for remote sensing image scene classification that requires no prior knowledge on the label sets. Furthermore, a novel universal domain adaptation method without source data is proposed for cases when the source data is unavailable. The architecture of the model is divided into two parts: the source data generation stage and the model adaptation stage. The first stage estimates the conditional distribution of source data from the pre-trained model using the knowledge of class-separability in the source domain and then synthesizes the source data. With this synthetic source data in hand, it becomes a universal DA task to classify a target sample correctly if it belongs to any category in the source label set, or mark it as "unknown" otherwise. In the second stage, a novel transferable weight that distinguishes the shared and private label sets in each domain promotes the adaptation in the automatically discovered shared label set and recognizes the "unknown" samples successfully. Empirical results show that the proposed model is effective and practical for remote sensing image scene classification, regardless of whether the source data is available or not.
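The classify-or-reject decision of the universal DA task can be sketched with a simple confidence threshold. The fixed threshold below is an illustrative assumption standing in for the paper's learned transferable weight, and the label names are made up for the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def classify_with_unknown(logits, source_labels, threshold=0.5):
    """Assign a source-label-set class when the model is confident,
    otherwise mark the target sample as "unknown" (a private target
    class outside the shared label set)."""
    probs = softmax(logits)
    conf = max(probs)
    if conf >= threshold:
        return source_labels[probs.index(conf)]
    return "unknown"
```

A confident (peaked) prediction maps to a shared class, while a near-uniform prediction, typical for target samples from private classes, is rejected as "unknown".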