What is Stable Diffusion?

Stable Diffusion is a generative AI model that creates images from text prompts by reversing a gradual noising process.

It is based on diffusion models, which learn to remove noise from data step by step. Starting from random noise, the model iteratively denoises the image while being guided by a text embedding.
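The loop described above can be sketched in a few lines. This is a toy illustration, not Stable Diffusion's actual sampler: the trained, text-conditioned network is replaced by a hypothetical `oracle_noise_predictor` that already knows the target, the "image" is a short list of numbers, and the linear noise schedule and the deterministic DDIM-style update are assumptions chosen for simplicity.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars = []
    prod = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def oracle_noise_predictor(x_t, alpha_bar, target):
    """Hypothetical stand-in for the trained network.

    A real model predicts the noise from x_t, the timestep, and a text
    embedding. Here we cheat and compute the noise directly from a known
    target, purely to make the loop runnable.
    """
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [(x - scale * t) / sigma for x, t in zip(x_t, target)]


def sample(target, num_steps=1000, seed=0):
    """Start from random noise and denoise step by step toward `target`."""
    rng = random.Random(seed)
    alpha_bars = make_alpha_bars(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure Gaussian noise
    for i in reversed(range(num_steps)):
        a_t = alpha_bars[i]
        a_prev = alpha_bars[i - 1] if i > 0 else 1.0
        eps = oracle_noise_predictor(x, a_t, target)
        # Estimate the clean sample implied by the predicted noise.
        x0_pred = [(xi - math.sqrt(1.0 - a_t) * e) / math.sqrt(a_t)
                   for xi, e in zip(x, eps)]
        # Move to the previous (less noisy) timestep.
        x = [math.sqrt(a_prev) * x0 + math.sqrt(1.0 - a_prev) * e
             for x0, e in zip(x0_pred, eps)]
    return x


if __name__ == "__main__":
    print(sample([0.5, -1.0, 2.0]))
```

Because the oracle is perfect, the loop lands on the target almost exactly. A trained model only approximates the noise, and it is the text embedding that steers which image the noise resolves into.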

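To keep generation efficient, Stable Diffusion runs the denoising process in a compressed latent space rather than in pixel space. A U-Net denoiser, conditioned on text encodings from a text encoder such as CLIP, works on the latent, and an autoencoder decodes the final latent into an image.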
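A rough calculation shows why working in latent space is cheaper. The figures below are assumptions based on the commonly cited configuration of the original Stable Diffusion v1 releases (512x512 RGB images, 8x spatial downsampling, 4 latent channels); other versions use different sizes.

```python
def tensor_size(height, width, channels):
    """Number of values in a height x width x channels tensor."""
    return height * width * channels


# Assumed Stable Diffusion v1-style settings (not taken from this page).
IMAGE_SIDE = 512
IMAGE_CHANNELS = 3      # RGB
DOWNSAMPLE = 8          # spatial reduction applied by the autoencoder
LATENT_CHANNELS = 4

latent_side = IMAGE_SIDE // DOWNSAMPLE
pixel_values = tensor_size(IMAGE_SIDE, IMAGE_SIDE, IMAGE_CHANNELS)
latent_values = tensor_size(latent_side, latent_side, LATENT_CHANNELS)
compression = pixel_values // latent_values

print(f"pixel space:  {pixel_values} values")
print(f"latent space: {latent_values} values")
print(f"the denoiser handles {compression}x fewer values per step")
```

Under these assumptions each denoising step touches 48 times fewer values than it would in pixel space, which is a large part of why the model can run on modest hardware.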

This approach allows high-quality image synthesis on relatively modest hardware, and the open release of the model's weights enabled widespread community use and fine-tuning.

Example

A user types the prompt "a watercolor painting of a fox in a snowy forest at dusk" and receives a detailed, original image matching the description, typically within seconds on a modern GPU.

Why it matters

Stable Diffusion made high-quality text-to-image generation freely accessible and locally runnable, accelerating creative tools, research, and the broader adoption of generative AI.

Frequently asked questions

Stable Diffusion is free to download and run yourself: the core model weights are publicly available under an open license, though some web services charge for hosted access.