Abstract: Image-level weakly supervised semantic segmentation (WSSS) is a fundamental yet highly challenging computer vision task that helps advance scene understanding and autonomous driving. Most existing techniques adopt classification-based class activation maps (CAMs) as initial pseudo labels; these pseudo labels tend to concentrate on discriminative image regions and lack features tailored to the segmentation task.
To address this problem, ByteDance's Intelligent Creation team proposes a plug-and-play Activation Modulation and Recalibration (AMR) module that generates segmentation-oriented CAMs. Extensive experiments show that AMR not only achieves state-of-the-art performance on the PASCAL VOC 2012 dataset, but is also plug-and-play: it can serve as a sub-module of other advanced methods to further improve their performance. The paper has been accepted to AAAI 2022, a top AI conference, and the code will be open-sourced soon.
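The classification-based CAMs mentioned in the abstract are computed as a class-specific weighted sum of the network's final convolutional feature maps, which is why they highlight only the most discriminative regions. As a rough illustration, here is a minimal NumPy sketch of a vanilla CAM; the array shapes, random features, and min-max normalization are illustrative assumptions, not the paper's actual AMR module.

```python
import numpy as np

def class_activation_map(features, weights, class_idx):
    """Compute a vanilla class activation map (CAM).

    features: (K, H, W) feature maps from the last conv layer.
    weights:  (C, K) classifier weights, one row per class.
    Returns an (H, W) map, min-max normalized to [0, 1].
    """
    # Weighted sum over the K feature maps using class c's weights.
    cam = np.tensordot(weights[class_idx], features, axes=([0], [0]))  # (H, W)
    cam = np.maximum(cam, 0.0)  # keep only positive class evidence
    rng = cam.max() - cam.min()
    return (cam - cam.min()) / rng if rng > 0 else cam

# Illustrative example with random features and weights.
gen = np.random.default_rng(0)
features = gen.standard_normal((8, 4, 4))  # K=8 maps of size 4x4
weights = gen.standard_normal((3, 8))      # C=3 classes
cam = class_activation_map(features, weights, class_idx=1)
print(cam.shape)  # (4, 4)
```

Thresholding such a normalized map is the usual way to turn it into an initial pseudo label; the peaks sit on the most discriminative regions, which is exactly the limitation AMR targets.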
Method overview.
Results comparison.
Recommended: ByteDance delivers a significant performance gain on image-level weakly supervised semantic segmentation, a difficult CV problem. The paper has been accepted to AAAI 2022.
Paper 7: A Survey of Generalisation in Deep Reinforcement Learning
- Authors: Robert Kirk, Amy Zhang, Edward Grefenstette, et al.
- Paper link: https://arxiv.org/pdf/2111.09794v1.pdf
Abstract: Reinforcement learning (RL) can be applied to autonomous vehicles, robotics, and a range of other applications, but how well does it perform in the real world? The real world is dynamic, open, and ever-changing, so RL algorithms need to be robust to environmental changes and able to transfer and adapt at deployment time to unseen (but similar) environments. However, much current RL research is conducted on benchmarks such as Atari and MuJoCo, which share a key drawback: the evaluation environments are exactly the same as the training environments, and this identical-environment evaluation protocol is unsuitable for real-world settings.
Many researchers have now recognized this problem and begun to focus on improving generalization in RL. Researchers from University College London and UC Berkeley have written "A Survey of Generalisation in Deep Reinforcement Learning", a study of generalization in deep RL.
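The evaluation-protocol problem described above — testing in exactly the training environment versus in unseen but similar variants — can be shown with a toy sketch. The "environment", its `gravity` parameter, and the policy below are all hypothetical, chosen only to illustrate how a policy that looks perfect under identical-environment evaluation can fail under a generalization protocol; none of this comes from the survey itself.

```python
def make_env(gravity):
    """Toy one-step 'environment': reward peaks when the action matches gravity."""
    def step(action):
        return -abs(action - gravity)
    return step

def evaluate(policy, gravities):
    """Average reward of a fixed policy across a set of environment variants."""
    return sum(make_env(g)(policy(g)) for g in gravities) / len(gravities)

# A policy overfit to its training environments: always act as if gravity = 9.8.
train_gravities = [9.8] * 5            # identical training environments
test_gravities = [3.7, 8.9, 24.8]      # unseen (but similar) variants

overfit_policy = lambda obs: 9.8       # ignores the observation entirely
print(evaluate(overfit_policy, train_gravities))  # 0.0: perfect on training
print(evaluate(overfit_policy, test_gravities))   # negative: fails to generalize
```

Under the Atari/MuJoCo-style protocol only the first number would ever be measured; the survey's point is that the second kind of evaluation, on held-out environment variations, is what deployment actually demands.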
Reinforcement learning generalization.
Available environments for testing generalization in RL, 47 in total.
Recommended: University College London and UC Berkeley join forces to survey research on generalization in deep reinforcement learning.
ArXiv Weekly Radiostation
机器之心, together with the ArXiv Weekly Radiostation initiated by 楚航 and 罗若天, selects more of this week's important papers on top of 7 Papers, including 10 selected papers each in NLP, CV, and ML, and provides audio abstracts of the papers. Details are as follows:
10 NLP Papers audio: 00:00 / 19:30
This week's 10 selected NLP papers are:
1. OpenQA: Hybrid QA System Relying on Structured Knowledge Base as well as Non-structured Data. (from Yang Liu)
2. Phrase-level Adversarial Example Generation for Neural Machine Translation. (from Jian Yang)
3. TextRGNN: Residual Graph Neural Networks for Text Classification. (from Meng Wang)
4. Utilizing Wordnets for Cognate Detection among Indian Languages. (from Gholamreza Haffari)
5. A Survey on Using Gaze Behaviour for Natural Language Processing. (from Pushpak Bhattacharyya)
6. Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis. (from Ranit Aharonov)
7. BERN2: an advanced neural biomedical named entity recognition and normalization tool. (from Jaewoo Kang)
8. Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model. (from Yuexian Zou)
9. Zero-shot Commonsense Question Answering with Cloze Translation and Consistency Optimization. (from Nanyun Peng)
10. Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models. (from Alham Fikri Aji)
10 CV Papers audio: 00:00 / 22:11
This week's 10 selected CV papers are:
1. iCaps: Iterative Category-level Object Pose and Shape Estimation. (from Dieter Fox)
2. SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection. (from Dacheng Tao)
3. Quality-aware Part Models for Occluded Person Re-identification. (from Dacheng Tao)
4. PCACE: A Statistical Approach to Ranking Neurons for CNN Interpretability. (from Seth Flaxman)
5. CaFT: Clustering and Filter on Tokens of Transformer for Weakly Supervised Object Localization. (from Ming Li)
6. Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space. (from Kwang-Ting Cheng, Eric Xing)
7. Multi-Dimensional Model Compression of Vision Transformer. (from Sun-Yuan Kung)
8. Aerial Scene Parsing: From Tile-level Scene Classification to Pixel-wise Semantic Labeling. (from Liangpei Zhang)
9. D-Former: A U-shaped Dilated Transformer for 3D Medical Image Segmentation. (from Jian Wu)
10. Scene-Adaptive Attention Network for Crowd Counting. (from Yihong Gong)
10 ML Papers audio: 00:00 / 24:19
This week's 10 selected ML papers are:
1. Learning Agent State Online with Recurrent Generate-and-Test. (from Richard S. Sutton)
2. Randomized Signature Layers for Signal Extraction in Time Series Data. (from Thomas Hofmann)
3. Deconfounded Training for Graph Neural Networks. (from Tat-Seng Chua)
4. Federated Optimization of Smooth Loss Functions. (from Ali Jadbabaie, Devavrat Shah)
5. NumHTML: Numeric-Oriented Hierarchical Transformer Model for Multi-task Financial Forecasting. (from Barry Smyth)
6. Stochastic convex optimization for provably efficient apprenticeship learning. (from John Lygeros)
7. Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation. (from Konstantinos N. Plataniotis)
8. TransLog: A Unified Transformer-based Framework for Log Anomaly Detection. (from Jian Yang)
9. CausalSim: Toward a Causal Data-Driven Simulator for Network Protocols. (from Devavrat Shah)
10. Transformer Embeddings of Irregularly Spaced Events and Their Participants. (from Jason Eisner)