Notes on Diffusion-GAN: Training GANs with Diffusion

Editor: rootadmin

Diffusion-GAN: training GANs together with diffusion

paper: https://arxiv.org/abs/2206.02262

code: GitHub, Zhendong-Wang/Diffusion-GAN (official PyTorch implementation of the paper)

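Read left to right, the first row is the forward diffusion process applied step by step to a real image. The third row is the same forward diffusion applied to a generated (fake) image; as the caption below states, it is not a reverse process that recovers an image from noise. The second row is the discriminator D, which discriminates at every timestep.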

 Figure 1: Flowchart for Diffusion-GAN. The top-row images represent the forward diffusion process of a real image, while the bottom-row images represent the forward diffusion process of a generated fake image. The discriminator learns to distinguish a diffused real image from a diffused fake image at all diffusion steps.

As shown in Figure 1, in Diffusion-GAN the input to the diffusion process is either a real or a generated image, and the diffusion process consists of a series of steps that gradually add noise to the image. The number of diffusion steps is not fixed, but depends on the data and the generator. We also design the diffusion process to be differentiable, which means that we can compute the derivative of the output with respect to the input. This allows us to propagate the gradient from the discriminator to the generator through the diffusion process, and update the generator accordingly.

Unlike vanilla GANs, which compare the real and generated images directly, Diffusion-GAN compares the noisy versions of them, which are obtained by sampling from the Gaussian mixture distribution over the diffusion steps, with the help of our timestep-dependent discriminator. This distribution has the property that its components have different noise-to-data ratios, which means that some components add more noise than others. By sampling from this distribution, we can achieve two benefits: first, we can stabilize the training by easing the problem of vanishing gradient, which occurs when the data and generator distributions are too different; second, we can augment the data by creating different noisy versions of the same image, which can improve the data efficiency and the diversity of the generator.

We provide a theoretical analysis to support our method, and show that the min-max objective function of Diffusion-GAN, which measures the difference between the data and generator distributions, is continuous and differentiable everywhere. This means that the generator in theory can always receive a useful gradient from the discriminator, and improve its performance. [Note: G can receive a useful gradient from D, which improves G's performance.]
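The noisy versions described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it assumes the standard closed-form forward diffusion y = sqrt(a_bar_t) * x + sqrt(1 - a_bar_t) * sigma * eps, a linear beta schedule, and a uniform choice of t. The function names, the default schedule endpoints, and `sigma` are all hypothetical choices made for this sketch.

```python
import math
import random

def alpha_bar_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) under an assumed linear beta schedule."""
    alpha_bars = []
    prod = 1.0
    for i in range(T):
        beta = beta_start + (beta_end - beta_start) * i / max(T - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def diffuse(x, t, alpha_bars, sigma=1.0, rng=random):
    """Sample y ~ N(sqrt(a_bar_t) * x, (1 - a_bar_t) * sigma^2 * I); t is 1-indexed."""
    a_bar = alpha_bars[t - 1]
    scale = math.sqrt(a_bar)
    noise_std = math.sqrt(1.0 - a_bar) * sigma
    return [scale * xi + noise_std * rng.gauss(0.0, 1.0) for xi in x]

def sample_diffused_pair(x_real, x_fake, alpha_bars, rng=random):
    """Pick one timestep uniformly (an assumption) and diffuse both samples with it."""
    t = rng.randint(1, len(alpha_bars))
    y_real = diffuse(x_real, t, alpha_bars, rng=rng)
    y_fake = diffuse(x_fake, t, alpha_bars, rng=rng)
    return y_real, y_fake, t
```

Because a_bar_t shrinks as t grows, the ratio sqrt(1 - a_bar_t) / sqrt(a_bar_t) grows with t, which is the "different noise-to-data ratios" property mentioned in the passage. A timestep-dependent discriminator would receive both the diffused sample and t. The mapping is linear in x, so gradients can flow from the discriminator back to the generator through it.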

Main contributions:

1) We show both theoretically and empirically how the diffusion process can be utilized to provide a model- and domain-agnostic differentiable augmentation, enabling data-efficient and leaking-free stable GAN training. [Note: this stabilizes GAN training.]
2) Extensive experiments show that Diffusion-GAN boosts the stability and generation performance of strong baselines, including StyleGAN2, Projected GAN, and InsGen, achieving state-of-the-art results in synthesizing photo-realistic images, as measured by both the Fréchet Inception Distance (FID) and Recall score. [Note: diffusion improves frameworks that were originally built from a GAN alone, such as StyleGAN2 and Projected GAN.]

Figure 2: The toy example inherited from Arjovsky et al. [2017]. The first row plots the distributions of data with diffusion noise injected for t. The second row shows the JS divergence and the optimal discriminator value with and without our noise injection. 
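The effect shown in Figure 2 can be reproduced numerically in one dimension. In the toy example of Arjovsky et al. [2017], the data and generator distributions have disjoint supports whenever the offset theta is non-zero, so the JS divergence is stuck at log 2 and gives no gradient. The sketch below assumes that noise injection simply blurs both distributions with a Gaussian of standard deviation `std`; it ignores the data rescaling of the actual diffusion process, and the function names and grid settings are hypothetical.

```python
import math

def normal_pdf(x, mean, std):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def js_divergence_gaussians(theta, std, n_grid=4001):
    """Numerically integrate JS(N(0, std^2) || N(theta, std^2)) on a uniform grid."""
    lo = min(0.0, theta) - 8.0 * std
    hi = max(0.0, theta) + 8.0 * std
    dx = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        x = lo + i * dx
        p = normal_pdf(x, 0.0, std)
        q = normal_pdf(x, theta, std)
        m = 0.5 * (p + q)
        # Skip terms whose density underflowed to zero; they contribute nothing.
        if p > 0.0:
            total += 0.5 * p * math.log(p / m) * dx
        if q > 0.0:
            total += 0.5 * q * math.log(q / m) * dx
    return total
```

With a very small `std` the value saturates near log 2 for any sizeable theta, which mimics the no-noise case. With a larger `std` the divergence varies smoothly with theta, so a generator parameterized by theta would see a non-zero gradient.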

Figure 4: Plot of adaptively adjusted maximum diffusion steps T and discriminator outputs of Diffusion-GANs. 

To investigate how the adaptive diffusion process works during training, we illustrate in Figure 4 the convergence of the maximum timestep T in our adaptive diffusion and discriminator outputs. We see that T is adaptively adjusted: The T for Diffusion StyleGAN2 increases as the training goes while the T for Diffusion ProjectedGAN first goes up and then goes down. Note that the T is adjusted according to the overfitting status of the discriminator. The second panel shows that trained with the diffusion-based mixture distribution, the discriminator is always well-behaved and provides useful learning signals for the generator, which validates our analysis in Section 3.4 and Theorem 1.

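As the left panel of Figure 4 shows, the maximum diffusion timestep T adapts as training proceeds (T changes according to how strongly the discriminator D is overfitting). As the right panel shows, a discriminator trained on the diffusion-based mixture distribution stays well-behaved and keeps providing a useful learning signal to the generator G.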
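One way to make T follow the overfitting status of the discriminator is sketched below. This is an illustrative assumption, not a transcription of the authors' code: it estimates overfitting as the mean sign of D's output on diffused real samples minus 0.5, compares it to a target, and moves T by a fixed step. The names `update_max_timestep`, `d_target`, `step`, `T_min`, and `T_max`, and their default values, are hypothetical.

```python
def update_max_timestep(T, disc_outputs_real, d_target=0.6, step=1, T_min=5, T_max=1000):
    """Raise T when D looks overconfident on diffused real data, lower it otherwise."""
    # sign(D(y, t) - 0.5) per sample: +1 above 0.5, -1 below, 0 exactly at 0.5.
    signs = [(d > 0.5) - (d < 0.5) for d in disc_outputs_real]
    r_d = sum(signs) / len(signs)
    if r_d > d_target:
        T += step   # D separates real data too easily: add more noise
    elif r_d < d_target:
        T -= step   # D is struggling: reduce the noise
    # Keep T inside the allowed range.
    return max(T_min, min(T_max, T)), r_d
```

In a training loop, this would be called every few minibatches with the discriminator's sigmoid outputs on the most recent diffused real samples. A rule of this shape could produce either trajectory described above: T keeps rising while the discriminator stays overconfident, and falls again once its confidence drops below the target.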

Effectiveness of Diffusion-GAN for domain-agnostic augmentation

25-Gaussians Example.

We conduct experiments on the popular 25-Gaussians generation task. The 25-Gaussians dataset is a 2-D toy dataset, generated by a mixture of 25 two-dimensional Gaussian distributions. Each data point is a 2-dimensional feature vector. We train a small GAN model, whose generator and discriminator are both parameterized by multilayer perceptrons (MLPs), with two 128-unit hidden layers and LeakyReLU nonlinearities.
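A dataset of this kind can be generated as follows. The passage does not give the layout or the variance of the mixture, so the 5x5 grid, the `spacing` of 2.0, and the `std` of 0.05 are assumptions chosen for illustration, as is the function name.

```python
import random

def sample_25_gaussians(n, std=0.05, spacing=2.0, rng=random):
    """Draw n 2-D points from an equal-weight mixture of 25 Gaussians on a 5x5 grid."""
    centers = [(spacing * i, spacing * j) for i in range(-2, 3) for j in range(-2, 3)]
    points = []
    for _ in range(n):
        cx, cy = rng.choice(centers)  # pick a mode uniformly
        points.append((cx + rng.gauss(0.0, std), cy + rng.gauss(0.0, std)))
    return centers, points
```

Passing `rng=random.Random(0)` makes the draw reproducible, which helps when comparing a vanilla GAN and a diffusion-augmented one on the same data.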

Figure 5: The 25-Gaussians example. We show the true data samples, the generated samples from vanilla GANs, the discriminator outputs of the vanilla GANs, the generated samples from our Diffusion-GAN, and the discriminator outputs of Diffusion-GAN. 

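(1) The ground-truth data, spread evenly over the 25 Gaussian modes.
(2) Samples from the vanilla GAN show mode collapse: data is generated on only a few modes.
(3) The vanilla GAN's discriminator outputs quickly separate from each other. This indicates strong discriminator overfitting, so the discriminator stops providing a useful learning signal to the generator.
(4) Samples from Diffusion-GAN are spread evenly over all 25 modes, meaning it has learned to sample from every mode.
(5) The Diffusion-GAN discriminator outputs: D keeps providing a useful learning signal to G.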
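Mode collapse of the kind described in (2) can be quantified by counting how many mixture modes receive generated samples. The helper below is a hypothetical diagnostic, not something taken from the paper; the `radius` and `min_hits` thresholds are assumptions that would need tuning to the mixture's actual variance.

```python
import math

def count_covered_modes(samples, centers, radius=0.2, min_hits=1):
    """Count centers that have at least min_hits samples within radius of them."""
    hits = [0] * len(centers)
    for x, y in samples:
        # Assign each sample to its nearest center.
        dists = [math.hypot(x - cx, y - cy) for cx, cy in centers]
        k = min(range(len(centers)), key=dists.__getitem__)
        # Samples far from every center count for no mode.
        if dists[k] <= radius:
            hits[k] += 1
    return sum(1 for h in hits if h >= min_hits)
```

A collapsed generator would score well below 25 on a 25-mode mixture, while a generator that covers every mode would score 25.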

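We explain this improvement from two angles. First, non-leaking augmentation helps provide more information about the data space. Second, with adaptively adjusted diffusion-based noise injection, the discriminator stays well-behaved.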

On differentiable augmentation

As Diffusion-GAN transforms both the data and generated samples before sending them to the discriminator, we can also relate it to differentiable augmentation proposed for data-efficient GAN training. Karras et al. introduce a stochastic augmentation pipeline with 18 transformations and develop an adaptive mechanism for controlling the augmentation probability. Zhao et al. [2020] propose to use Color + Translation + Cutout as differentiable augmentations for both generated and real images.

While providing good empirical results on some datasets, these augmentation methods are developed with domain-specific knowledge and have the risk of leaking augmentation into generation [Karras et al., 2020a]. As observed in our experiments, they sometimes worsen the results when applied to a new dataset, likely because the risk of augmentation leakage overpowers the benefits of enlarging the training set, which could happen especially if the training set size is already sufficiently large. (Note: when the dataset is already large enough, the harm from data augmentation can outweigh its benefit.)

By contrast, Diffusion-GAN uses a differentiable forward diffusion process to stochastically transform the data and can be considered as both a domain-agnostic and a model-agnostic augmentation method. In other words, Diffusion-GAN can be applied to non-image data or even latent features, for which appropriate data augmentation is difficult to define, and can be easily plugged into an existing GAN to improve its generation performance. Moreover, we prove in theory and show in experiments that augmentation leakage is not a concern for Diffusion-GAN. Tran et al. [2021] provide a theoretical analysis for deterministic non-leaking transformation with differentiable and invertible mapping functions. Bora et al. [2018] show similar theorems to ours for specific stochastic transformations, such as Gaussian Projection, Convolve+Noise, and stochastic Block-Pixels, while our Theorem 2 includes more satisfying possibilities as discussed in Appendix B.
