pix2pix: Paper Reading Notes
Paper: Image-to-Image Translation with Conditional Adversarial Networks (ReadPaper)
Paper structure
1. Introduction
2. Related work
3. Method
3.1 Objective
3.2 Network architectures
3.2.1 Generator with skips
3.2.2 Markovian discriminator (PatchGAN)
3.3 Optimization and inference
4. Experiments
4.1 Evaluation metrics
4.2 Analysis of the objective function
4.3 Analysis of the generator architecture
4.4 From PixelGANs to PatchGANs to ImageGANs
4.5 Perceptual validation
4.6 Semantic segmentation
4.7 Community-driven Research
5. Conclusion
Abstract
Original text
We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
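Key points
Investigates conditional generative adversarial networks as a general-purpose solution to image translation tasks.
The network learns not only the mapping from input image to output image, but also the loss function used to train that mapping.
The approach is shown to be effective on a range of image translation tasks, including photo synthesis from label maps and image colorization.
Since the authors released the pix2pix software, many users have successfully run their own experiments with it, further demonstrating how broadly the method applies.
The work suggests that reasonable results can be achieved without hand-engineering the loss function.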
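The "learned loss" in the points above can be made concrete with the objective from Section 3.1 of the paper. The block below restates it as a reference sketch, where x is the input image, y the target image, z a noise vector, G the generator, D the discriminator, and \lambda a weighting hyperparameter. The discriminator term plays the role of the learned loss, while the L1 term is a conventional hand-written one added alongside it.

```latex
% Conditional GAN loss: D sees the input x together with a real or generated output
\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

% L1 reconstruction loss between the target and the generated output
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big]

% Full objective: G minimizes what D maximizes, plus the weighted L1 term
G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```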
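Background
Digital image tasks
Computer Vision
Mimics how the human eye and brain process and understand visual information. Examples: image classification, object detection, face recognition.
Computer Graphics
Simulates visual perception of the physical world in digital space. Examples: animation, 3D modeling, virtual reality.
Digital Image Processing
Transforms how an image is presented, based on prior knowledge. Examples: image enhancement, image restoration, camera ISP.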
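Image Translation
Conversion between images in different forms: given an image from the source domain, generate the corresponding image in the target domain, constraining the distribution of the generated image to match that of the source image as closely as possible along some dimension.
Image inpainting
Video frame interpolation
Image editing
Style transfer
Super-resolution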