Diffusion Inference with Dynamic Classifier-free Guidance

Published in International Conference on Inventive Computing and Informatics, 2024

The introduction of the diffusion architecture has led to rapid progress in image generation. These models generate images iteratively, starting from an initial Gaussian noise sample and predicting the noise to remove at each step, a denoising process learnt through forward and reverse diffusion. This iterative sampling occurs over a number of inference steps. Classifier guidance is a popular technique for steering the generation process of latent diffusion models (LDMs); it does so by incorporating a separately trained external classifier into the sampling process. This, however, requires additional training and severely limits the diversity of the synthesized data, among other drawbacks. To address this, Classifier-Free Guidance (CFG) was introduced, removing the need for a separate classifier and controlling the sample generation process through a parameter known as the CFG scale. Previously, this scale has been held constant throughout inference. This research work proposes varying the CFG scale across the inference steps using various scheduling functions. Doing so not only yields better images but also unlocks the full potential of the rich latent space representation of diffusion models, allowing different images to be sampled from the same initial conditions by varying only the CFG scale while keeping all other parameters constant.
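
A minimal sketch of the idea: a scheduling function maps each inference step to a CFG scale, which is then used in the standard CFG combination of unconditional and conditional noise predictions. The linear and cosine ramps and the `w_min`/`w_max` bounds below are illustrative assumptions, not necessarily the exact schedules evaluated in the paper.

```python
import math

def cfg_schedule(step, num_steps, w_min=1.0, w_max=9.0, mode="cosine"):
    """Return the CFG scale for a given inference step.

    The linear/cosine shapes and the bounds are illustrative choices;
    mode="constant" corresponds to standard fixed-scale CFG.
    """
    t = step / max(num_steps - 1, 1)  # normalize step index to [0, 1]
    if mode == "linear":
        return w_min + (w_max - w_min) * t
    if mode == "cosine":
        return w_min + (w_max - w_min) * 0.5 * (1.0 - math.cos(math.pi * t))
    return w_max  # constant scale (standard CFG)

def guided_noise(eps_uncond, eps_cond, w):
    """Classifier-free guidance: eps_uncond + w * (eps_cond - eps_uncond)."""
    return eps_uncond + w * (eps_cond - eps_uncond)

if __name__ == "__main__":
    # Inspect how the scale varies across 10 inference steps.
    print([round(cfg_schedule(s, 10, mode="cosine"), 2) for s in range(10)])
```

Inside a sampling loop, `guided_noise` would be applied at every step with `w = cfg_schedule(step, num_steps)` in place of a fixed scale, leaving all other sampling parameters unchanged.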

Keywords: classifier-free guidance, diffusion, image generation, inference, text-to-image

Recommended citation: V. S., A., Kulkarni, A., Chawla, D., Rawther, A., & Rangareddy, J. (2024). Diffusion Inference with Dynamic Classifier-free Guidance. In International Conference on Inventive Computing and Informatics (2nd ed., pp. 53–59). IEEE. https://doi.org/10.1109/icici62254.2024.00018
Download Paper