This is based on the official stable diffusion repository CompVis/stable-diffusion. We have kept the model structure the same so that the open-sourced weights can be loaded directly. Our implementation does not contain training code.
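Because the module layout matches the original repository, a released checkpoint can be loaded with plain PyTorch. The sketch below is illustrative only: the checkpoint path is a placeholder, and `model` is assumed to be a latent diffusion module built with the same sub-module names as CompVis/stable-diffusion.

```python
import torch

# Placeholder path; any CompVis/stable-diffusion style checkpoint works here.
checkpoint = torch.load("sd-v1-4.ckpt", map_location="cpu")

# The original checkpoints keep the weights under the "state_dict" key.
state_dict = checkpoint["state_dict"]

# `model` is assumed to use the same sub-module names as the original
# repository, so the keys line up directly:
# model.load_state_dict(state_dict)
```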
We have deployed a stable-diffusion-based image generation service at promptart.labml.ai.
The core is the Latent Diffusion Model. It consists of an autoencoder and a U-Net with attention; the denoising runs in the autoencoder's latent space rather than in pixel space.
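To make the structure concrete, here is a minimal sketch of how these parts fit together at sampling time; `autoencoder`, `unet`, `sampler`, and `cond` are placeholders for the actual classes, not this repository's API.

```python
import torch

def sample(autoencoder, unet, sampler, cond, latent_shape=(1, 4, 64, 64)):
    """Sketch of latent diffusion sampling: denoise in latent space, then decode."""
    # Start from Gaussian noise in the (much smaller) latent space.
    x = torch.randn(latent_shape)
    # The sampler repeatedly queries the U-Net, which predicts the noise
    # at each timestep conditioned on the embedding `cond`.
    x = sampler.sample(unet, x, cond)
    # Decode the final latent back to pixel space with the autoencoder.
    return autoencoder.decode(x)
```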
We have also (optionally) integrated Flash Attention into the U-Net attention layers, which speeds up generation by close to 50% on an RTX A6000 GPU.
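As an illustration of what an optional flash-attention path can look like, here is a hedged sketch of an attention block with a switchable fused kernel. The class name, the flag name, and the use of PyTorch's `scaled_dot_product_attention` are assumptions for this sketch, not the exact code in this repository.

```python
import torch
import torch.nn.functional as F

class SimpleAttention(torch.nn.Module):
    """Illustrative attention block with an optional fused/flash path."""

    # Hypothetical module-level switch, not the repository's actual flag.
    use_flash_attention: bool = False

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.to_qkv = torch.nn.Linear(d_model, 3 * d_model, bias=False)
        self.to_out = torch.nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, _ = x.shape
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, seq, d_head).
        q, k, v = (t.view(b, n, self.n_heads, self.d_head).transpose(1, 2)
                   for t in (q, k, v))
        if self.use_flash_attention:
            # PyTorch (2.0+) dispatches to a fused/flash kernel when available.
            out = F.scaled_dot_product_attention(q, k, v)
        else:
            # Standard attention: softmax(QK^T / sqrt(d)) V
            attn = (q @ k.transpose(-2, -1)) * (self.d_head ** -0.5)
            out = attn.softmax(dim=-1) @ v
        out = out.transpose(1, 2).reshape(b, n, self.n_heads * self.d_head)
        return self.to_out(out)
```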
The diffusion is conditioned on CLIP embeddings of the text prompt.
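For reference, one common way to obtain such embeddings is shown below using the Hugging Face transformers library; `openai/clip-vit-large-patch14` matches the text encoder used by Stable Diffusion v1, but treat the snippet as an illustrative sketch rather than this repository's loading code.

```python
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

prompts = ["a photograph of an astronaut riding a horse"]
tokens = tokenizer(prompts, padding="max_length", max_length=77,
                   truncation=True, return_tensors="pt")
# The per-token hidden states are what the U-Net cross-attends to
# during denoising.
cond = text_encoder(tokens.input_ids).last_hidden_state
```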
We have implemented DDPM (denoising diffusion probabilistic models) and DDIM (denoising diffusion implicit models) sampling.
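For example, the deterministic DDIM update (η = 0) first predicts the clean latent from the noise estimate and then steps to the earlier timestep. The sketch below is written directly from that update rule and is not copied from this repository's sampler.

```python
import torch

def ddim_step(x_t: torch.Tensor, eps: torch.Tensor,
              alpha_t: torch.Tensor, alpha_prev: torch.Tensor) -> torch.Tensor:
    """One deterministic DDIM update (eta = 0).

    x_t:        current latent at timestep t
    eps:        noise predicted by the U-Net for x_t
    alpha_t:    cumulative product of (1 - beta) up to timestep t
    alpha_prev: the same quantity at the previous (earlier) timestep
    """
    # Predicted clean latent x_0 from the noise estimate.
    pred_x0 = (x_t - (1 - alpha_t).sqrt() * eps) / alpha_t.sqrt()
    # Move towards x_0 while re-adding the noise level the earlier step expects.
    return alpha_prev.sqrt() * pred_x0 + (1 - alpha_prev).sqrt() * eps
```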
The image generation scripts cover text-to-image generation, image-to-image generation guided by a prompt, and in-painting. util.py defines the shared utility functions.
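As an example of the kind of helper such a module typically provides, here is a hedged sketch of an image-loading utility that prepares an input in the [-1, 1] range the autoencoder expects; the function name and exact normalization are assumptions for illustration, not a copy of util.py.

```python
import numpy as np
import torch
from PIL import Image

def load_image(path: str) -> torch.Tensor:
    """Illustrative helper: load an RGB image as a (1, 3, H, W) float tensor
    scaled to [-1, 1]."""
    image = Image.open(path).convert("RGB")
    x = torch.from_numpy(np.array(image)).float() / 255.0  # (H, W, 3) in [0, 1]
    x = x.permute(2, 0, 1).unsqueeze(0)                     # (1, 3, H, W)
    return 2.0 * x - 1.0                                    # map to [-1, 1]
```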