
Diffusion-GAN: Training GANs with Diffusion, explained

Editor: rootadmin

Diffusion-GAN: training GANs together with diffusion

paper:https://arxiv.org/abs/2206.02262

code: GitHub, Zhendong-Wang/Diffusion-GAN (official PyTorch implementation of the paper)

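Reading Figure 1 from left to right, the top row shows the forward diffusion process applied step by step to a real image, and the bottom row shows the same forward diffusion process applied to a generated (fake) image. The middle row is the discriminator D, which compares the diffused real and diffused fake images at every timestep.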

 Figure 1: Flowchart for Diffusion-GAN. The top-row images represent the forward diffusion process of a real image, while the bottom-row images represent the forward diffusion process of a generated fake image. The discriminator learns to distinguish a diffused real image from a diffused fake image at all diffusion steps.

As illustrated in Figure 1, in Diffusion-GAN the input to the diffusion process is either a real or a generated image, and the diffusion process consists of a series of steps that gradually add noise to the image. The number of diffusion steps is not fixed, but depends on the data and the generator. We also design the diffusion process to be differentiable, which means that we can compute the derivative of the output with respect to the input. This allows us to propagate the gradient from the discriminator to the generator through the diffusion process, and update the generator accordingly.

Unlike vanilla GANs, which compare the real and generated images directly, Diffusion-GAN compares the noisy versions of them, which are obtained by sampling from the Gaussian mixture distribution over the diffusion steps, with the help of our timestep-dependent discriminator. This distribution has the property that its components have different noise-to-data ratios, which means that some components add more noise than others. By sampling from this distribution, we can achieve two benefits: first, we can stabilize the training by easing the problem of vanishing gradient, which occurs when the data and generator distributions are too different; second, we can augment the data by creating different noisy versions of the same image, which can improve the data efficiency and the diversity of the generator.

We provide a theoretical analysis to support our method, and show that the min-max objective function of Diffusion-GAN, which measures the difference between the data and generator distributions, is continuous and differentiable everywhere. This means that the generator in theory can always receive a useful gradient from the discriminator, and improve its performance. [Takeaway: G can always receive a useful gradient from D and thereby improve.]
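The noise injection described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the linear beta schedule, the uniform timestep distribution, the noise scale `sigma`, and the helper names `make_alpha_bar` and `diffuse` are all assumptions made for the example. The point it shows is that the noisy sample is an affine function of the input plus Gaussian noise, which is why a gradient can pass from the discriminator back to the generator.

```python
import numpy as np

def make_alpha_bar(T, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal-retention factors for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)      # alpha_bar[t-1] = prod_{s<=t} (1 - beta_s)

def diffuse(x, alpha_bar, rng, sigma=0.05):
    """Draw one timestep per sample; return the noisy samples y and the timesteps t.

    x: array of shape (batch, ...) holding real OR generated samples.
    """
    T = len(alpha_bar)
    t = rng.integers(1, T + 1, size=x.shape[0])              # uniform over {1..T} (assumed)
    a = alpha_bar[t - 1].reshape(-1, *([1] * (x.ndim - 1)))  # broadcast over sample dims
    eps = rng.standard_normal(x.shape)
    y = np.sqrt(a) * x + np.sqrt(1.0 - a) * sigma * eps      # affine in x, plus noise
    return y, t

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar(T=64)
real = rng.uniform(-1, 1, size=(8, 3, 4, 4))   # stand-in for real images
fake = rng.uniform(-1, 1, size=(8, 3, 4, 4))   # stand-in for generator output
y_real, t_real = diffuse(real, alpha_bar, rng)
y_fake, t_fake = diffuse(fake, alpha_bar, rng)
# A timestep-dependent discriminator would score (y_real, t_real) against (y_fake, t_fake).
```

Because each sample draws its own timestep, a single batch mixes lightly and heavily noised images, which is the mixture over diffusion steps that the text refers to.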

Main contributions:

1) We show both theoretically and empirically how the diffusion process can be utilized to provide a model- and domain-agnostic differentiable augmentation, enabling data-efficient and leaking-free stable GAN training. [Takeaway: diffusion stabilizes GAN training.]

2) Extensive experiments show that Diffusion-GAN boosts the stability and generation performance of strong baselines, including StyleGAN2, Projected GAN, and InsGen, achieving state-of-the-art results in synthesizing photo-realistic images, as measured by both the Fréchet Inception Distance (FID) and Recall score. [Takeaway: adding diffusion improves frameworks that were originally pure GANs, such as StyleGAN2 and Projected GAN.]


Figure 2: The toy example inherited from Arjovsky et al. [2017]. The first row plots the distributions of data with diffusion noise injected for t. The second row shows the JS divergence and the optimal discriminator value with and without our noise injection. 
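The effect summarized in the Figure 2 caption can be reproduced numerically in a simplified setting. The sketch below is an assumption-laden reduction of the toy example to one dimension: the data sit at 0, the generator output sits at `theta`, and noise injection is modeled as adding Gaussian noise with a fixed, assumed standard deviation `noise_std`. Without noise the two point masses never overlap, so the JS divergence is stuck at log 2 for every `theta != 0`; with noise it varies smoothly with `theta`.

```python
import numpy as np

def js_divergence(p, q, dx):
    """Jensen-Shannon divergence between two densities sampled on a uniform grid."""
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0                      # skip points where the density underflows to 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask])) * dx

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def gaussian(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

x = np.linspace(-20.0, 20.0, 40001)
dx = x[1] - x[0]
noise_std = 1.0                           # assumed noise level after diffusion
thetas = [0.0, 0.5, 1.0, 2.0, 4.0]

# With noise: both distributions become overlapping Gaussians.
js_noisy = [js_divergence(gaussian(x, 0.0, noise_std),
                          gaussian(x, th, noise_std), dx) for th in thetas]
# Without noise: disjoint point masses, so JS is log 2 whenever theta != 0.
js_clean = [0.0 if th == 0.0 else np.log(2.0) for th in thetas]
```

Comparing `js_noisy` with `js_clean` shows why the noisy objective can give the generator a gradient with respect to `theta` while the noise-free one cannot.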

Figure 4: Plot of adaptively adjusted maximum diffusion steps T and discriminator outputs of Diffusion-GANs. 

To investigate how the adaptive diffusion process works during training, we illustrate in Figure 4 the convergence of the maximum timestep T in our adaptive diffusion and discriminator outputs. We see that T is adaptively adjusted: the T for Diffusion StyleGAN2 increases as training proceeds, while the T for Diffusion ProjectedGAN first goes up and then goes down. Note that T is adjusted according to the overfitting status of the discriminator. The second panel shows that, when trained with the diffusion-based mixture distribution, the discriminator is always well-behaved and provides useful learning signals for the generator, which validates our analysis in Section 3.4 and Theorem 1.

As the left panel of Figure 4 shows, the maximum diffusion timestep T adapts over the course of training (T changes according to how much the discriminator D is overfitting). As the right panel shows, a discriminator trained on the diffusion-based mixture distribution stays well-behaved and keeps providing useful learning signals to the generator G.
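One way to picture "T is adjusted according to the overfitting status of the discriminator" is a sign-based update rule. The sketch below is illustrative only: the overfitting estimate `r_d`, the target `d_target`, the step size, the bounds `T_min` and `T_max`, and the function name `update_max_timestep` are all assumptions chosen for the example rather than values taken from this article.

```python
import numpy as np

def update_max_timestep(T, d_real_outputs, d_target=0.6, step=8, T_min=5, T_max=1000):
    """Raise T when D looks overconfident on diffused real data, lower it otherwise.

    d_real_outputs: discriminator probabilities in [0, 1] on diffused real samples.
    Returns the new T and the overfitting estimate r_d in [-1, 1].
    """
    d = np.asarray(d_real_outputs, dtype=float)
    r_d = np.mean(np.sign(d - 0.5))                 # near 1 means D is very confident
    T_new = T + int(np.sign(r_d - d_target)) * step
    return int(np.clip(T_new, T_min, T_max)), float(r_d)

# Confident discriminator: more noise is needed, so T goes up.
T_up, r_up = update_max_timestep(100, [0.99, 0.97, 0.95, 0.98])
# Uncertain discriminator: the task is already hard enough, so T goes down.
T_down, r_down = update_max_timestep(100, [0.6, 0.4, 0.55, 0.45])
```

Under this rule T rises while the discriminator separates real data too easily and falls once it stops doing so, which would be consistent with the up-then-down curve described for Diffusion ProjectedGAN.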

Effectiveness of Diffusion-GAN for domain-agnostic augmentation

25-Gaussians Example.

We conduct experiments on the popular 25-Gaussians generation task. The 25-Gaussians dataset is a 2-D toy dataset, generated by a mixture of 25 two-dimensional Gaussian distributions. Each data point is a 2-dimensional feature vector. We train a small GAN model, whose generator and discriminator are both parameterized by multilayer perceptrons (MLPs), with two 128-unit hidden layers and LeakyReLU nonlinearities.
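The setup described above can be sketched with NumPy alone. This is a hedged illustration: the 5x5 grid layout, the `spacing` and `std` values, the latent dimension of 2, the LeakyReLU slope, and the weight initialization are assumptions, and the discriminator here takes no timestep input, so it corresponds to the vanilla baseline rather than the timestep-dependent Diffusion-GAN discriminator. Only the forward pass is shown; no training loop is included.

```python
import numpy as np

def sample_25_gaussians(n, rng, spacing=2.0, std=0.05):
    """Draw n points from an equal-weight mixture of 25 Gaussians on a 5x5 grid."""
    grid = spacing * np.arange(-2, 3)
    centers = np.array([(cx, cy) for cx in grid for cy in grid])   # shape (25, 2)
    idx = rng.integers(0, 25, size=n)                              # mode of each point
    return centers[idx] + std * rng.standard_normal((n, 2)), idx

def init_mlp(rng, in_dim, out_dim, hidden=128):
    """Weights and biases for an MLP with two hidden layers of `hidden` units."""
    dims = [in_dim, hidden, hidden, out_dim]
    return [(rng.standard_normal((a, b)) * np.sqrt(2.0 / a), np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]

def mlp_forward(params, x, slope=0.2):
    """Forward pass with LeakyReLU after each hidden layer, linear output layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.where(x > 0, x, slope * x)
    return x

rng = np.random.default_rng(0)
data, modes = sample_25_gaussians(1000, rng)
G = init_mlp(rng, in_dim=2, out_dim=2)    # generator; latent dimension 2 is an assumption
D = init_mlp(rng, in_dim=2, out_dim=1)    # discriminator without timestep input
fake = mlp_forward(G, rng.standard_normal((1000, 2)))
logits = mlp_forward(D, fake)
```

Counting how many of the 25 modes receive generated samples is a simple way to detect the mode collapse discussed for Figure 5.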

Figure 5: The 25-Gaussians example. We show the true data samples, the generated samples from vanilla GANs, the discriminator outputs of the vanilla GANs, the generated samples from our Diffusion-GAN, and the discriminator outputs of Diffusion-GAN. 

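The five panels of Figure 5 show, in order:
(1) The ground-truth data distribution, spread evenly over the 25 Gaussian modes.
(2) Samples from the vanilla GAN, which suffers mode collapse and generates data on only a few modes.
(3) The vanilla GAN's discriminator outputs, where the outputs on real and fake samples quickly separate from each other. This indicates strong discriminator overfitting, so the discriminator stops providing useful learning signals to the generator.
(4) Samples from Diffusion-GAN, which cover all 25 modes evenly, meaning it has learned to sample from every mode.
(5) Diffusion-GAN's discriminator outputs, where D keeps providing useful learning signals to G.

This improvement can be explained from two perspectives. First, non-leaking augmentation helps provide more information about the data space. Second, with adaptively adjusted diffusion-based noise injection, the discriminator stays well-behaved.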

Differentiable augmentation.

As Diffusion-GAN transforms both the data and generated samples before sending them to the discriminator, we can also relate it to differentiable augmentation proposed for data-efficient GAN training. Karras et al. introduce a stochastic augmentation pipeline with 18 transformations and develop an adaptive mechanism for controlling the augmentation probability. Zhao et al. [2020] propose to use Color + Translation + Cutout as differentiable augmentations for both generated and real images.

While providing good empirical results on some datasets, these augmentation methods are developed with domain-specific knowledge and have the risk of leaking augmentation into generation [Karras et al., 2020a]. As observed in our experiments, they sometimes worsen the results when applied to a new dataset, likely because the risk of augmentation leakage overpowers the benefits of enlarging the training set, which could happen especially if the training set size is already sufficiently large. (In other words, when the dataset is already large enough, the harm from augmentation can outweigh its benefit.)

By contrast, Diffusion-GAN uses a differentiable forward diffusion process to stochastically transform the data and can be considered as both a domain-agnostic and a model-agnostic augmentation method. In other words, Diffusion-GAN can be applied to non-image data or even latent features, for which appropriate data augmentation is difficult to define, and can be easily plugged into an existing GAN to improve its generation performance. Moreover, we prove in theory and show in experiments that augmentation leakage is not a concern for Diffusion-GAN. Tran et al. [2021] provide a theoretical analysis for deterministic non-leaking transformation with differentiable and invertible mapping functions. Bora et al. [2018] show theorems similar to ours for specific stochastic transformations, such as Gaussian Projection, Convolve+Noise, and stochastic Block-Pixels, while our Theorem 2 includes more satisfying possibilities as discussed in Appendix B.
