What is a Diffusion Model?
A diffusion model is a type of generative model that creates new data, such as images, by learning to reverse a gradual noising process applied to training examples.
In the forward process, random noise is slowly added to data over many steps until it becomes pure noise. The model is trained to predict and remove this noise at each step.
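The forward process described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the linear noise schedule, the step count, and the function names `make_schedule` and `add_noise` are assumptions chosen for the example, following the commonly used DDPM-style formulation in which a noisy sample at any step can be drawn directly from the clean data.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice) and its cumulative signal levels."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    # alpha_bars[t] is the fraction of the original signal variance left at step t.
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Jump straight to step t: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    # eps is returned because it is the target a denoising network learns to predict.
    return x_t, eps

rng = np.random.default_rng(0)
betas, alpha_bars = make_schedule()
x0 = np.ones((4, 4))                                # stand-in for a tiny "image"
x_early, _ = add_noise(x0, 10, alpha_bars, rng)     # still close to x0
x_late, _ = add_noise(x0, 999, alpha_bars, rng)     # almost pure noise
```

During training, a network would receive `x_t` and `t` and be asked to output `eps`. The network itself is omitted here.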
During generation, the model starts from random noise and iteratively denoises it, guided by learned patterns, to produce coherent new samples.
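The generation loop above can be sketched as follows, assuming DDPM-style ancestral sampling with the per-step noise scale set to `sqrt(beta_t)` (one common choice among several). A trained neural network would normally supply the noise prediction. To keep the example self-contained, a hypothetical `oracle_predict_noise` stands in for it: it returns the exact noise for a toy dataset that contains a single known point, `target`.

```python
import numpy as np

def sample(predict_noise, shape, betas, rng):
    """Start from Gaussian noise and denoise step by step (DDPM-style sketch)."""
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)                   # begin from pure noise
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = predict_noise(x, t)                # model's guess of the noise in x
        # Remove the predicted noise to get the mean of the previous, cleaner step.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # Re-inject a small amount of fresh noise at every step except the last.
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise
    return x

betas = np.linspace(1e-4, 0.02, 1000)                # assumed linear schedule
alpha_bars = np.cumprod(1.0 - betas)
target = np.full((2, 2), 3.0)                        # toy "dataset" of one point

def oracle_predict_noise(x_t, t):
    """Hypothetical stand-in for a trained network: exact noise for the toy dataset."""
    return (x_t - np.sqrt(alpha_bars[t]) * target) / np.sqrt(1.0 - alpha_bars[t])

rng = np.random.default_rng(0)
result = sample(oracle_predict_noise, target.shape, betas, rng)
# With this oracle, the final step lands on `target` up to floating-point error.
```

With a real model, `predict_noise` would be a learned function and the output would be a new sample from the data distribution rather than a known point.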
Two key ideas are modeling the diffusion steps as a Markov chain, in which each step depends only on the previous one, and optimizing a simple noise-prediction objective, which keeps training stable and yields high-quality samples.
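For readers who want the notation, both ideas can be written compactly. The symbols below follow the widely used DDPM formulation and are assumptions not stated in this entry: \beta_t is the noise added at step t, \bar{\alpha}_t is the cumulative signal retained up to step t, and \epsilon_\theta is the noise-predicting network.

```latex
\begin{align}
  q(x_t \mid x_{t-1}) &= \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right),
  \qquad \bar{\alpha}_t = \prod_{s=1}^{t} (1-\beta_s) \\
  L_{\text{simple}} &= \mathbb{E}_{x_0,\,\epsilon,\,t}
  \left[\left\lVert \epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\right)\right\rVert^2\right]
\end{align}
```

The first line is the Markov step: each noisy sample depends only on the one before it. The second is the noise-prediction loss: the mean squared error between the noise that was actually added and the network's estimate of it.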
Example
Stable Diffusion uses a diffusion model to turn a text prompt like 'a cat astronaut' into a detailed image by starting from noise and gradually refining it into a recognizable picture.
Why it matters
Diffusion models power many of the leading image and video generators used in creative tools, research, and applications such as design and entertainment.
Frequently asked questions
Unlike GANs, which train through an adversarial game between a generator and a discriminator, diffusion models train by learning to reverse noise addition. This often yields more stable training and higher-quality samples.
Related terms
A Generative Adversarial Network (GAN) is a machine learning model made of two neural networks that compete against each other to generate realistic new data, such as images or text.
A Variational Autoencoder (VAE) is a neural network that learns a compressed probabilistic representation of data and can generate new similar examples by sampling from that space. It combines autoencoders with variational inference to enable both reconstruction and generation.
A multimodal model is a generative AI system that can process and create content across multiple data types, such as text, images, audio, or video, within a single model.