Diffusion-GAN: Training GANs with Diffusion (Paper Notes)

Editor: rootadmin

Diffusion-GAN: training GANs together with diffusion

paper: https://arxiv.org/abs/2206.02262

code: GitHub, Zhendong-Wang/Diffusion-GAN (official PyTorch implementation of the paper "Diffusion-GAN: Training GANs with Diffusion")

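In Figure 1, the first row, read left to right, is the forward diffusion process applied step by step to a real image. The third row is the same forward diffusion process applied to a generated (fake) image, as the caption states; it is not a reverse process that recovers an image from noise. The second row is the discriminator D, which discriminates between the diffused real and diffused fake images at every timestep.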

 Figure 1: Flowchart for Diffusion-GAN. The top-row images represent the forward diffusion process of a real image, while the bottom-row images represent the forward diffusion process of a generated fake image. The discriminator learns to distinguish a diffused real image from a diffused fake image at all diffusion steps.

As shown in Figure 1, in Diffusion-GAN the input to the diffusion process is either a real or a generated image, and the diffusion process consists of a series of steps that gradually add noise to the image. The number of diffusion steps is not fixed, but depends on the data and the generator. We also design the diffusion process to be differentiable, which means that we can compute the derivative of the output with respect to the input. This allows us to propagate the gradient from the discriminator to the generator through the diffusion process, and update the generator accordingly.

Unlike vanilla GANs, which compare the real and generated images directly, Diffusion-GAN compares the noisy versions of them, which are obtained by sampling from the Gaussian mixture distribution over the diffusion steps, with the help of our timestep-dependent discriminator. This distribution has the property that its components have different noise-to-data ratios, which means that some components add more noise than others. By sampling from this distribution, we can achieve two benefits: first, we can stabilize the training by easing the problem of vanishing gradient, which occurs when the data and generator distributions are too different; second, we can augment the data by creating different noisy versions of the same image, which can improve the data efficiency and the diversity of the generator.

We provide a theoretical analysis to support our method, and show that the min-max objective function of Diffusion-GAN, which measures the difference between the data and generator distributions, is continuous and differentiable everywhere. This means that the generator in theory can always receive a useful gradient from the discriminator, and improve its performance. [Note: G can receive useful gradients from D, which improves G's performance.]
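To make the "Gaussian mixture over diffusion steps" concrete, here is a minimal sketch of the forward diffusion step in plain Python. It assumes the standard closed form y = sqrt(alpha_bar_t) * x + sqrt(1 - alpha_bar_t) * sigma * eps with a linear beta schedule. The function names, the schedule endpoints, the uniform choice of t, and sharing one t between the real and fake sample are all assumptions of this sketch, not details taken from the text above.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products alpha_bar_t for a linear beta schedule (assumed endpoints)."""
    alpha_bars = []
    prod = 1.0
    for i in range(num_steps):
        frac = i / max(num_steps - 1, 1)
        beta = beta_start + frac * (beta_end - beta_start)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def diffuse(x, t, alpha_bars, sigma=1.0, rng=random):
    """Sample y ~ N(sqrt(alpha_bar_t) * x, (1 - alpha_bar_t) * sigma^2 * I)."""
    a = alpha_bars[t]
    scale = math.sqrt(a)                    # how much of the signal survives
    noise_std = math.sqrt(1.0 - a) * sigma  # how much noise is injected
    return [scale * xi + noise_std * rng.gauss(0.0, 1.0) for xi in x]


def sample_noisy_pair(x_real, x_fake, alpha_bars, rng=random):
    """Pick a timestep uniformly (an assumption) and diffuse both samples with it."""
    t = rng.randrange(len(alpha_bars))
    y_real = diffuse(x_real, t, alpha_bars, rng=rng)
    y_fake = diffuse(x_fake, t, alpha_bars, rng=rng)
    return t, y_real, y_fake
```

Because y is an affine function of x for a fixed noise draw, the derivative of y with respect to x is simply sqrt(alpha_bar_t). This is why the discriminator's gradient can pass through the diffusion step back to the generator.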

Main contributions:

1) We show both theoretically and empirically how the diffusion process can be utilized to provide a model- and domain-agnostic differentiable augmentation, enabling data-efficient and leaking-free stable GAN training. [Note: this stabilizes GAN training.]

2) Extensive experiments show that Diffusion-GAN boosts the stability and generation performance of strong baselines, including StyleGAN2, Projected GAN, and InsGen, achieving state-of-the-art results in synthesizing photo-realistic images, as measured by both the Fréchet Inception Distance (FID) and Recall score. [Note: adding diffusion improves frameworks that were originally built from GANs alone, such as StyleGAN2 and Projected GAN.]

Figure 2: The toy example inherited from Arjovsky et al. [2017]. The first row plots the distributions of data with diffusion noise injected for t. The second row shows the JS divergence and the optimal discriminator value with and without our noise injection. 
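The effect shown in Figure 2 can be reproduced numerically with a small sketch. It assumes the usual form of the Arjovsky et al. toy problem, where real and generated points differ only by an offset theta along one coordinate, so that after diffusion the comparison reduces to two 1-D Gaussians with means 0 and sqrt(alpha_bar) * theta and a shared standard deviation. The function names and the grid-based integration are choices made for this sketch.

```python
import math


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def js_divergence_gaussians(mean_p, mean_q, std, num_points=20001):
    """Numerically integrate JS(N(mean_p, std^2) || N(mean_q, std^2)) on a grid."""
    lo = min(mean_p, mean_q) - 10.0 * std
    hi = max(mean_p, mean_q) + 10.0 * std
    dx = (hi - lo) / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        x = lo + i * dx
        p = gaussian_pdf(x, mean_p, std)
        q = gaussian_pdf(x, mean_q, std)
        m = 0.5 * (p + q)
        if p > 0.0:
            total += 0.5 * p * math.log(p / m)
        if q > 0.0:
            total += 0.5 * q * math.log(q / m)
    return total * dx


def diffused_js(theta, alpha_bar, sigma=1.0):
    """JS divergence between the diffused real and fake coordinates of the toy problem."""
    std = math.sqrt(1.0 - alpha_bar) * sigma
    return js_divergence_gaussians(0.0, math.sqrt(alpha_bar) * theta, std)
```

Without noise the JS divergence jumps from 0 at theta = 0 to log 2 everywhere else, so it gives no gradient with respect to theta. With noise injected, `diffused_js` varies smoothly with theta, and a smaller `alpha_bar` (more noise) flattens the curve further.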

Figure 4: Plot of adaptively adjusted maximum diffusion steps T and discriminator outputs of Diffusion-GANs. 

To investigate how the adaptive diffusion process works during training, we illustrate in Figure 4 the convergence of the maximum timestep T in our adaptive diffusion and discriminator outputs. We see that T is adaptively adjusted: The T for Diffusion StyleGAN2 increases as the training goes while the T for Diffusion ProjectedGAN first goes up and then goes down. Note that the T is adjusted according to the overfitting status of the discriminator. The second panel shows that trained with the diffusion-based mixture distribution, the discriminator is always well-behaved and provides useful learning signals for the generator, which validates our analysis in Section 3.4 and Theorem 1.

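As the left panel of Figure 4 shows, the maximum diffusion timestep T changes adaptively over the course of training (T is adjusted according to how strongly the discriminator D is overfitting). As the right panel shows, a discriminator trained with the diffusion-based mixture distribution stays well-behaved and keeps providing useful learning signals to the generator G.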
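The adaptive adjustment of T can be sketched as follows. This is one plausible form of the rule, modeled on ADA-style overfitting heuristics: estimate how confidently the discriminator classifies diffused real samples, then raise T when it is too confident and lower it otherwise. The helper names, the target value, the step size, and the clamping bounds are all hypothetical values chosen for illustration.

```python
def discriminator_overfit_ratio(d_outputs):
    """r_d: mean of sign(D(y, t) - 0.5) over a batch of discriminator outputs in [0, 1]."""
    signs = [(d > 0.5) - (d < 0.5) for d in d_outputs]
    return sum(signs) / len(signs)


def update_max_timestep(t_max, d_outputs, d_target=0.6, step=8, t_lo=5, t_hi=1000):
    """Move T up when D looks overfit (r_d above target), down when it does not."""
    r_d = discriminator_overfit_ratio(d_outputs)
    direction = (r_d > d_target) - (r_d < d_target)  # sign(r_d - d_target)
    return min(max(t_max + direction * step, t_lo), t_hi)
```

For example, `update_max_timestep(100, [0.9, 0.9, 0.9, 0.9])` raises T because every output is confidently above 0.5, while a batch centered on 0.5 lowers it. This matches the behaviour described for Figure 4, where T tracks the overfitting status of the discriminator.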

Effectiveness of Diffusion-GAN for domain-agnostic augmentation

25-Gaussians Example.

We conduct experiments on the popular 25-Gaussians generation task. The 25-Gaussians dataset is a 2-D toy dataset, generated by a mixture of 25 two-dimensional Gaussian distributions. Each data point is a 2-dimensional feature vector. We train a small GAN model, whose generator and discriminator are both parameterized by multilayer perceptrons (MLPs), with two 128-unit hidden layers and LeakyReLU nonlinearities.
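A dataset of this kind can be generated in a few lines. The sketch below assumes the common layout of a 5 x 5 grid of isotropic Gaussians. The grid spacing, the standard deviation, and the `covered_modes` diagnostic are hypothetical choices for illustration, since the excerpt does not specify them.

```python
import random


def sample_25_gaussians(num_points, spacing=2.0, std=0.05, rng=random):
    """Draw 2-D points from an equally weighted mixture of 25 Gaussians on a 5 x 5 grid."""
    centers = [(spacing * i, spacing * j) for i in range(-2, 3) for j in range(-2, 3)]
    points = []
    for _ in range(num_points):
        cx, cy = rng.choice(centers)
        points.append((rng.gauss(cx, std), rng.gauss(cy, std)))
    return centers, points


def covered_modes(points, centers, radius=0.5):
    """Count the modes that have at least one point within `radius` of their center."""
    hit = set()
    for x, y in points:
        nearest = min(
            range(len(centers)),
            key=lambda k: (x - centers[k][0]) ** 2 + (y - centers[k][1]) ** 2,
        )
        cx, cy = centers[nearest]
        if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
            hit.add(nearest)
    return len(hit)
```

Applying `covered_modes` to generator samples gives a rough mode-collapse check: a generator that has collapsed covers only a few of the 25 modes, while one that has learned the full distribution covers all of them.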

Figure 5: The 25-Gaussians example. We show the true data samples, the generated samples from vanilla GANs, the discriminator outputs of the vanilla GANs, the generated samples from our Diffusion-GAN, and the discriminator outputs of Diffusion-GAN. 

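(1) The ground-truth data distribution: samples are spread evenly over the 25 Gaussian modes.

(2) Samples generated by the vanilla GAN show mode collapse: data is generated on only a few modes.

(3) The vanilla GAN's discriminator outputs for real and generated samples quickly separate from each other. This indicates strong discriminator overfitting, so the discriminator stops providing useful learning signals to the generator.

(4) Samples from Diffusion-GAN are spread evenly over all 25 modes, meaning it has learned to sample from every mode.

(5) The Diffusion-GAN discriminator outputs: D keeps providing useful learning signals to G.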

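We explain this improvement from two perspectives. First, non-leaking augmentation helps provide more information about the data space. Second, with the adaptively adjusted diffusion-based noise injection, the discriminator stays well-behaved.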

On differentiable augmentation.

As Diffusion-GAN transforms both the data and generated samples before sending them to the discriminator, we can also relate it to differentiable augmentation proposed for data-efficient GAN training. Karras et al. introduce a stochastic augmentation pipeline with 18 transformations and develop an adaptive mechanism for controlling the augmentation probability. Zhao et al. [2020] propose to use Color + Translation + Cutout as differentiable augmentations for both generated and real images.

While providing good empirical results on some datasets, these augmentation methods are developed with domain-specific knowledge and have the risk of leaking augmentation into generation [Karras et al., 2020a]. As observed in our experiments, they sometimes worsen the results when applied to a new dataset, likely because the risk of augmentation leakage overpowers the benefits of enlarging the training set, which could happen especially if the training set size is already sufficiently large. (Note: when the dataset is already large enough, the negative effects of data augmentation can outweigh its benefits.)

By contrast, Diffusion-GAN uses a differentiable forward diffusion process to stochastically transform the data and can be considered as both a domain-agnostic and a model-agnostic augmentation method. In other words, Diffusion-GAN can be applied to non-image data or even latent features, for which appropriate data augmentation is difficult to define, and easily plugged into an existing GAN to improve its generation performance. Moreover, we prove in theory and show in experiments that augmentation leakage is not a concern for Diffusion-GAN. Tran et al. [2021] provide a theoretical analysis for deterministic non-leaking transformation with differentiable and invertible mapping functions. Bora et al. [2018] show theorems similar to ours for specific stochastic transformations, such as Gaussian Projection, Convolve+Noise, and stochastic Block-Pixels, while our Theorem 2 includes more satisfying possibilities as discussed in Appendix B.
