Paper 5: VGSE: Visually-Grounded Semantic Embeddings for Zero-Shot Learning
- Authors: Wenjia Xu et al.
- Paper link: https://arxiv.org/abs/2203.10444
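Abstract: Researchers from Beijing University of Posts and Telecommunications (BUPT), the Max Planck Institute, and other institutions propose the Visually-Grounded Semantic Embedding Network (VGSE), a network for discovering class embeddings. The paper addresses two questions: how to automatically discover class embeddings that carry both semantic and visual properties from images of seen classes, and how to predict class embeddings for unseen classes that have no training samples.
To fully exploit the visual properties shared across classes, VGSE clusters a large number of local image patches by visual similarity into attribute clusters, deriving from low-level image features the visual properties that instances of different classes have in common. VGSE also introduces a class relation module that learns relations between classes with the help of a small amount of external knowledge. This lets it transfer knowledge from source classes to target classes and predict class embeddings for target classes that have no training images. Compared with other attributes mined automatically from text corpora, VGSE achieves highly competitive results on zero-shot classification datasets such as CUB, SUN, and AWA2.
The paper has been accepted to CVPR 2022.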
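The two steps described above, grouping patches into attribute clusters and transferring embeddings to classes without images, can be illustrated with a toy sketch. It is not the authors' implementation: the tiny k-means, the histogram-style class embedding, the softmax over cosine similarity of word vectors, and every name and number here (`kmeans`, `class_embeddings`, `predict_unseen`, the horse/zebra/donkey data) are assumptions made purely for illustration.

```python
import math

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10):
    """Tiny k-means; centers start at the first k points, so it is deterministic."""
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def class_embeddings(assign, patch_labels, k):
    """Assumed embedding: fraction of a class's patches that fall in each cluster."""
    counts = {}
    for a, label in zip(assign, patch_labels):
        counts.setdefault(label, [0] * k)[a] += 1
    return {c: [v / sum(h) for v in h] for c, h in counts.items()}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def predict_unseen(word_vec, seen_word_vecs, seen_embeds, temperature=0.1):
    """Assumed transfer rule: softmax-weighted mix of seen-class embeddings,
    weighted by similarity in an external knowledge source (word vectors)."""
    names = list(seen_embeds)
    scores = [cosine(word_vec, seen_word_vecs[n]) / temperature for n in names]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    dim = len(seen_embeds[names[0]])
    return [sum(e / total * seen_embeds[n][j] for e, n in zip(exps, names))
            for j in range(dim)]

# Hypothetical 2-D patch features and the class of the image each patch came from.
patches = [[0.0, 0.1], [5.0, 5.1], [0.1, 0.0], [5.1, 5.0], [0.0, 0.0], [5.0, 5.0]]
patch_labels = ["horse", "zebra", "horse", "zebra", "zebra", "zebra"]

assign = kmeans(patches, k=2)
embeds = class_embeddings(assign, patch_labels, k=2)

# Hypothetical word vectors standing in for the external knowledge source.
seen_word_vecs = {"horse": [1.0, 0.0], "zebra": [0.0, 1.0]}
donkey = predict_unseen([0.9, 0.3], seen_word_vecs, embeds)
print(embeds, donkey)
```

In this toy setup the unseen class "donkey" has no patches at all; its embedding is borrowed almost entirely from "horse" because their word vectors are close.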
VGSE model architecture.
Visualization of the discovered attribute clusters.
Comparison of results.
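Recommendation: The Max Planck Institute and BUPT propose class semantic embeddings rich in visual information, substantially reducing the manual annotation that zero-shot learning requires.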
Paper 6: Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification
- Authors: Jingzhou Chen et al.
- Paper link: https://arxiv.org/pdf/2201.03194.pdf
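Abstract: In conventional image recognition datasets, the class labels for a given task, such as generic image classification or fine-grained classification, usually all sit at a single level of the label hierarchy. Such a setup cannot robustly make use of images annotated at different levels, and it places high demands on annotation.
Image quality, background knowledge, and similar factors raise the bar for annotated data. To lower that bar and make full use of samples whose labels have different levels of granularity, it is important for the robustness of deep neural networks to design hierarchical multi-granularity recognition algorithms that model the hierarchical semantic structure of the targets.
To this end, Zhejiang University and Ant Group propose a hierarchical residual multi-granularity classification network based on a label relation tree. The paper has been accepted to CVPR 2022.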
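One common way to train on samples labeled at different levels of a hierarchy is to score a coarse label by the total probability of its leaves. The sketch below illustrates that idea only; it is not the paper's network or loss, and the two-level tree, the class names, and the helper functions (`label_probability`, `marginal_nll`) are hypothetical.

```python
import math

# Hypothetical two-level label tree: coarse class -> fine-grained leaves.
TREE = {
    "albatross": ["laysan_albatross", "sooty_albatross"],
    "sparrow": ["house_sparrow", "song_sparrow"],
}
LEAVES = [leaf for children in TREE.values() for leaf in children]

def softmax(logits):
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_probability(leaf_probs, label):
    """A leaf label keeps its own probability; a coarse label sums its leaves."""
    members = TREE.get(label, [label])
    return sum(leaf_probs[LEAVES.index(m)] for m in members)

def marginal_nll(leaf_logits, label):
    """Negative log-likelihood of a label observed at any level of the tree."""
    return -math.log(label_probability(softmax(leaf_logits), label))

# One prediction over the four leaves, scored against labels of two granularities.
logits = [2.0, 1.0, 0.0, 0.0]
fine_loss = marginal_nll(logits, "laysan_albatross")  # annotator knew the species
coarse_loss = marginal_nll(logits, "albatross")       # annotator only knew the family
print(fine_loss, coarse_loss)
```

With this kind of loss, an image that could only be labeled "albatross" still contributes a training signal without forcing a guess at the species.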
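Hierarchical residual network architecture.
Comparison with SOTA methods on CUB-200-2011.
Average OA/ results of the compared methods on each dataset under different relabeling ratios.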
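Recommendation: A hierarchical residual multi-granularity classification network based on a label relation tree, which models the hierarchical knowledge among multi-granularity labels.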
Paper 7: Zero-Shot Logit Adjustment
- Authors: Dubing Chen et al.
- Paper link: https://arxiv.org/abs/2204.11822
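Abstract: Researchers from Nanjing University of Science and Technology and the University of Oxford propose a plug-and-play classifier module that, by changing only one line of code, substantially improves generation-based zero-shot learning methods and reduces the classifier's dependence on the quality of the generated pseudo-samples.
Guided by the goal of making the training and test objectives consistent, the paper derives a variational lower bound on the evaluation metric for generalized zero-shot learning. A classifier modeled on this bound avoids resampling strategies, so overfitting to generated pseudo-samples does not hurt recognition of real samples. The proposed method also makes embedding-based classifiers effective within generation-based frameworks.
The paper has been accepted to IJCAI 2022.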
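To give a feel for what a one-line logit adjustment can look like, here is a minimal sketch of the generic technique: shifting each logit by the log of a class prior before computing cross-entropy. The exact adjustment term derived in the paper is not given in this summary, so the prior values, the three-class setup, and the function names below are assumptions for illustration.

```python
import math

def log_softmax(logits):
    top = max(logits)
    lse = top + math.log(sum(math.exp(x - top) for x in logits))
    return [x - lse for x in logits]

def cross_entropy(logits, target):
    return -log_softmax(logits)[target]

def adjusted_cross_entropy(logits, target, class_prior):
    # The "one-line" change: shift every logit by the log of its class prior.
    adjusted = [x + math.log(p) for x, p in zip(logits, class_prior)]
    return cross_entropy(adjusted, target)

# Hypothetical setup: classes 0 and 1 are seen (real samples), class 2 is
# unseen (generated samples only), so it is given a smaller assumed prior.
prior = [0.45, 0.45, 0.10]
logits = [1.0, 1.0, 1.0]

plain = cross_entropy(logits, 2)
adjusted = adjusted_cross_entropy(logits, 2, prior)
print(plain, adjusted)
```

For identical raw logits, the adjusted loss on the low-prior class is larger than the plain loss, so training pushes that class's raw score up without resampling the data.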
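Comparison with SOTA methods on GZSL.
Comparison between the plain prototype learner and the generation-based ZLA prototype learner.
Recommendation: Nanjing University of Science and Technology and Oxford propose a plug-and-play classifier module that substantially improves zero-shot learning methods with one line of code.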
ArXiv Weekly Radiostation
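Together with ArXiv Weekly Radiostation, initiated by Hang Chu and Ruotian Luo, Synced (机器之心) selects more of this week's important papers beyond 7 Papers: 10 picks each in NLP, CV, and ML, with audio summaries of the papers. Details follow: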
10 NLP Papers (audio: 20:02)
This week's 10 selected NLP papers are:
1. Few-Shot Fine-Grained Entity Typing with Automatic Label Interpretation and Instance Generation. (from Jiawei Han)
2. Improving the Training Recipe for a Robust Conformer-based Hybrid Model. (from Hermann Ney)
3. Improving Deliberation by Text-Only and Semi-Supervised Training. (from Tara N. Sainath)
4. OPERA: Harmonizing Task-Oriented Dialogs and Information Seeking Experience. (from Jianfeng Gao)
5. Analysis of Individual Conversational Volatility in Tandem Telecollaboration for Second Language Learning. (from Alan F. Smeaton)
6. ConcreteGraph: A Data Augmentation Method Leveraging the Properties of Concept Relatedness Estimation. (from Irwin King)
7. Creation and Analysis of an International Corpus of Privacy Laws. (from Norman Sadeh)
8. Annotated Speech Corpus for Low Resource Indian Languages: Awadhi, Bhojpuri, Braj and Magahi. (from Siddharth Singh)
9. MVP: Multi-task Supervised Pre-training for Natural Language Generation. (from Ji-Rong Wen)
10. Trial2Vec: Zero-Shot Clinical Trial Document Similarity Search using Self-Supervision. (from Jimeng Sun)
10 CV Papers (audio: 24:03)
This week's 10 selected CV papers are:
1. Uncertainty-aware Panoptic Segmentation. (from Wolfram Burgard)
2. Asymmetric Transfer Hashing with Adaptive Bipartite Graph Learning. (from Witold Pedrycz)
3. Text-Driven Stylization of Video Objects. (from Serge Belongie)
4. Neural Annotation Refinement: Development of a New 3D Dataset for Adrenal Gland Analysis. (from Pascal Fua)
5. BIMS-PU: Bi-Directional and Multi-Scale Point Cloud Upsampling. (from Xiaogang Wang, Daniela Rus)
6. RandStainNA: Learning Stain-Agnostic Features from Histology Slides by Bridging Stain Augmentation and Normalization. (from Dinggang Shen)
7. Video Activity Localisation with Uncertainties in Temporal Boundary. (from Shaogang Gong, Yang Liu)
8. PolarFormer: Multi-camera 3D Object Detection with Polar Transformer. (from Weiming Hu)
9. FedRare: Federated Learning with Intra- and Inter-Client Contrast for Effective Rare Disease Classification. (from Kwang-Ting Cheng)
10. The Lighter The Better: Rethinking Transformers in Medical Image Segmentation Through Adaptive Pruning. (from Kwang-Ting Cheng)
10 ML Papers (audio: 21:22)
This week's 10 selected ML papers are:
1. Computer-aided diagnosis and prediction in brain disorders. (from Frederik Barkhof, Wiro J. Niessen)
2. Denoised MDPs: Learning World Models Better Than the World Itself. (from Antonio Torralba, Phillip Isola)
3. Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning. (from Shuicheng Yan)
4. p-Meta: Towards On-device Deep Model Adaptation. (from Lothar Thiele)
5. ZeroC: A Neuro-Symbolic Model for Zero-shot Concept Recognition and Acquisition at Inference Time. (from Jure Leskovec)
6. Learning Iterative Reasoning through Energy Minimization. (from Joshua B. Tenenbaum)
7. From Kernel Methods to Neural Networks: A Unifying Variational Formulation. (from Michael Unser)
8. Topology-aware Generalization of Decentralized SGD. (from Dacheng Tao)
9. RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out Distribution Robustness. (from Philip H.S. Torr)
10. Joint Representation Training in Sequential Tasks with Shared Structure. (from Peter Bartlett)