
Commit

update README.md;
Signed-off-by: lawrence-cj <[email protected]>
lawrence-cj committed Jan 12, 2025
1 parent 45c1d1b commit 173eb91
Showing 3 changed files with 6 additions and 5 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -36,6 +36,7 @@ As a result, Sana-0.6B is very competitive with modern giant diffusion model (e.

## 🔥🔥 News

- (🔥 New) \[2025/1/12\] DC-AE tiling enables Sana-4K to generate 4096x4096px images within 22GB of GPU memory. [\[Guidance\]](asset/docs/model_zoo.md#-3-4k-models)
- (🔥 New) \[2025/1/11\] Sana code-base license changed to Apache 2.0.
- (🔥 New) \[2025/1/10\] Inference of Sana with 8-bit quantization. [\[Guidance\]](asset/docs/8bit_sana.md#quantization)
- (🔥 New) \[2025/1/8\] 4K resolution [Sana models](asset/docs/model_zoo.md) are supported in [Sana-ComfyUI](https://github.com/Efficient-Large-Model/ComfyUI_ExtraModels), and a [workflow](asset/docs/ComfyUI/Sana_FlowEuler_4K.json) is also provided. [\[4K guidance\]](asset/docs/ComfyUI/comfyui.md)
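The 8-bit quantization item above trades weight precision for memory. As a rough illustration only (not the quantization code Sana actually uses — see the linked guidance for that), here is a minimal numpy sketch of symmetric per-tensor int8 quantization, which is the basic idea behind 8-bit inference:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: int8 weights plus one float scale."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for computation."""
    return q.astype(np.float32) * scale

w = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(w)

# int8 storage is 4x smaller than float32
assert q.nbytes == w.nbytes // 4
# reconstruction error is bounded by half a quantization step
assert np.abs(dequantize(q, scale) - w).max() <= scale / 2 + 1e-6
```

Real 8-bit inference stacks (e.g. bitsandbytes) add per-channel scales and outlier handling, but the memory arithmetic is the same: weights shrink roughly 4x versus float32.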
9 changes: 4 additions & 5 deletions asset/docs/model_zoo.md
@@ -77,11 +77,9 @@ image = pipe(
image[0].save('sana.png')
```

#### 2). For 4K models
## ❗ 3. 4K models

4K models need [patch_conv](https://github.com/mit-han-lab/patch_conv) to avoid OOM issues (an 80GB GPU is recommended).

Run `pip install patch_conv` first, then
4K models need VAE tiling to avoid OOM issues (a 24GB GPU is recommended).

```python
# run `pip install git+https://github.com/huggingface/diffusers` before using Sana in diffusers
@@ -99,7 +97,8 @@ pipe.vae.to(torch.bfloat16)
pipe.text_encoder.to(torch.bfloat16)

# for OOM issues during 4096x4096 image generation, feel free to adjust the tile size
pipe.vae.enable_tiling(tile_sample_min_height=1024, tile_sample_min_width=1024)
if pipe.transformer.config.sample_size == 128:
pipe.vae.enable_tiling(tile_sample_min_height=1024, tile_sample_min_width=1024)

prompt = 'a cyberpunk cat with a neon sign that says "Sana"'
image = pipe(
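The `enable_tiling` call in the snippet above is what brings the 4K decode under 24GB: instead of decoding the whole 4096x4096 latent in one pass, the VAE decodes tile-by-tile, so peak activation memory scales with the tile area rather than the full image. A minimal numpy sketch of the idea (not the diffusers implementation — a pointwise 8x upsample stands in for the decoder so tiles stitch exactly; a real convolutional decoder has receptive fields that cross tile borders, which is why diffusers blends overlapping tiles):

```python
import numpy as np

def decode(latent: np.ndarray) -> np.ndarray:
    """Stand-in VAE decoder: 8x nearest-neighbor upsample (pointwise, so tiling is exact)."""
    return latent.repeat(8, axis=0).repeat(8, axis=1)

def decode_tiled(latent: np.ndarray, tile: int = 32) -> np.ndarray:
    """Decode tile-by-tile; peak memory per step scales with tile**2, not the full latent."""
    h, w = latent.shape
    out = np.zeros((h * 8, w * 8), dtype=latent.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            out[y * 8:(y + tile) * 8, x * 8:(x + tile) * 8] = decode(latent[y:y + tile, x:x + tile])
    return out

latent = np.random.randn(128, 128).astype(np.float32)
# the tiled decode reproduces the full decode exactly for this pointwise decoder
assert np.array_equal(decode(latent), decode_tiled(latent))
```

Shrinking `tile_sample_min_height`/`tile_sample_min_width` in the real pipeline lowers peak memory further at the cost of more decode passes.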
1 change: 1 addition & 0 deletions diffusion/model/builder.py
@@ -135,6 +135,7 @@ def vae_decode(name, vae, latent):
scaling_factor = ae.cfg.scaling_factor if ae.cfg.scaling_factor else 0.41407
if latent.shape[-1] * vae_scale_factor > 4000 or latent.shape[-2] * vae_scale_factor > 4000:
from patch_conv import convert_model

ae = convert_model(ae, splits=4)
samples = ae.decode(latent.detach() / scaling_factor)
elif "AutoencoderDC" in name:
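The condition in `vae_decode` above decides when to wrap the autoencoder with patch_conv: whenever the decoded image would exceed 4000px on either side. That check can be sketched in isolation as follows (a hedged illustration mirroring the snippet; `vae_scale_factor` is the latent-to-pixel upscale, e.g. 8 for a classic VAE or 32 for DC-AE, and `needs_patch_conv` is a hypothetical helper name, not a function in the repo):

```python
def needs_patch_conv(latent_shape, vae_scale_factor):
    """Mirror of the size check in vae_decode: split convolutions via patch_conv
    when the decoded image would exceed 4000px on either side."""
    h, w = latent_shape[-2], latent_shape[-1]
    return h * vae_scale_factor > 4000 or w * vae_scale_factor > 4000

# a 128x128 DC-AE latent (32x upscale) decodes to 4096x4096 -> needs splitting
assert needs_patch_conv((1, 32, 128, 128), 32)
# a 32x32 latent decodes to 1024x1024 -> fits in one pass
assert not needs_patch_conv((1, 32, 32, 32), 32)
```

`convert_model(ae, splits=4)` then decodes the oversized image in 4 convolution splits instead of one, keeping peak memory bounded.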
