Diffusion model

From Wikipedia, the free encyclopedia

In machine learning, diffusion models, also known as diffusion probabilistic models, are a class of latent variable models. These models are Markov chains trained using variational inference.[1] The goal of diffusion models is to learn the latent structure of a dataset by modeling the way in which data points diffuse through the latent space. In computer vision, this means that a neural network is trained to denoise images blurred with Gaussian noise by learning to reverse the diffusion process.[2]
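
In the formulation of Ho, Jain and Abbeel,[1] the forward (noising) process adds Gaussian noise to a data point x_0 over T steps according to a variance schedule β_1, …, β_T, and admits a closed-form marginal at any step t; a neural network with parameters θ is then trained to approximate the reverse transitions:

    q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t I\big), \qquad
    q(x_t \mid x_0) = \mathcal{N}\big(x_t;\ \sqrt{\bar\alpha_t}\, x_0,\ (1-\bar\alpha_t) I\big), \quad \bar\alpha_t = \prod_{s=1}^{t} (1-\beta_s),

    p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\big).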

Diffusion models can be applied to a variety of tasks, including image denoising, inpainting, super-resolution, and image generation. For example, an image generation model trained to reverse the diffusion process on natural images can start from a random noise image and progressively denoise it into a new natural image. A recent example of this is OpenAI's text-to-image model DALL-E 2, which uses diffusion models both for its prior (which produces an image embedding given a text caption) and for the decoder that generates the final image.[3]
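
The following sketch illustrates such a sampling loop in the style of the denoising diffusion probabilistic model of Ho et al.;[1] the noise-prediction network eps_theta and its signature are assumptions standing in for a trained model, and betas is the forward-process variance schedule.

    import numpy as np

    def ddpm_sample(eps_theta, shape, betas, rng=None):
        """Ancestral sampling sketch for a DDPM-style model.

        eps_theta(x, t) is assumed to be a trained network that predicts the
        noise added at step t; its name and signature are illustrative.
        betas is the variance schedule beta_1, ..., beta_T of the forward process.
        """
        rng = np.random.default_rng() if rng is None else rng
        alphas = 1.0 - betas
        alpha_bars = np.cumprod(alphas)
        x = rng.standard_normal(shape)               # x_T: pure Gaussian noise
        for t in reversed(range(len(betas))):
            z = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
            eps = eps_theta(x, t)                    # predicted noise at step t
            # mean of the learned reverse transition p_theta(x_{t-1} | x_t)
            x = (x - betas[t] * eps / np.sqrt(1.0 - alpha_bars[t])) / np.sqrt(alphas[t])
            x = x + np.sqrt(betas[t]) * z            # add noise with variance beta_t
        return x

Ho et al. use a linear schedule for β_t, increasing from 10^-4 to 0.02 over T = 1000 steps.[1]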

References

  1. ^ Ho, Jonathan; Jain, Ajay; Abbeel, Pieter (19 June 2020). "Denoising Diffusion Probabilistic Models". doi:10.48550/arXiv.2006.11239.
  2. ^ Song, Yang; Ermon, Stefano (2020). "Improved Techniques for Training Score-Based Generative Models". doi:10.48550/arXiv.2006.09011.
  3. ^ Ramesh, Aditya; Dhariwal, Prafulla; Nichol, Alex; Chu, Casey; Chen, Mark (2022). "Hierarchical Text-Conditional Image Generation with CLIP Latents". doi:10.48550/arXiv.2204.06125.