Blog Post · 6 min read
DiT: Replacing the U-Net with a Transformer
DDPM, DDIM, and latent diffusion all use a U-Net backbone. DiT replaces it with a transformer and finds that diffusion models scale with model size much as language models do: larger models produce better samples.