From d4cd93226e12e84d6cb205d58a2dcab3ec637926 Mon Sep 17 00:00:00 2001 From: CharlesCNorton <135471798+CharlesCNorton@users.noreply.github.com> Date: Thu, 16 Jan 2025 07:18:38 -0500 Subject: [PATCH] fix: correct CSS property & missing articles - Correct "border-raduis" to "border-radius" - Insert missing article "a" in "modern giant diffusion model" - Insert missing article "a" in "decoder-only small LLM" --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 3a14731..b779785 100644 --- a/README.md +++ b/README.md @@ -13,7 +13,7 @@   -

+

teaser_page1

@@ -25,10 +25,10 @@ Core designs include: (1) [**DC-AE**](https://hanlab.mit.edu/projects/dc-ae): unlike traditional AEs, which compress images only 8×, we trained an AE that can compress images 32×, effectively reducing the number of latent tokens. \ (2) **Linear DiT**: we replace all vanilla attention in DiT with linear attention, which is more efficient at high resolutions without sacrificing quality. \ -(3) **Decoder-only text encoder**: we replaced T5 with modern decoder-only small LLM as the text encoder and designed complex human instruction with in-context learning to enhance the image-text alignment. \ +(3) **Decoder-only text encoder**: we replaced T5 with a modern decoder-only small LLM as the text encoder and designed complex human instruction with in-context learning to enhance the image-text alignment. \ (4) **Efficient training and sampling**: we propose **Flow-DPM-Solver** to reduce sampling steps, with efficient caption labeling and selection to accelerate convergence. -As a result, Sana-0.6B is very competitive with modern giant diffusion model (e.g. Flux-12B), being 20 times smaller and 100+ times faster in measured throughput. Moreover, Sana-0.6B can be deployed on a 16GB laptop GPU, taking less than 1 second to generate a 1024 × 1024 resolution image. Sana enables content creation at low cost. +As a result, Sana-0.6B is very competitive with modern giant diffusion models (e.g. Flux-12B), being 20 times smaller and 100+ times faster in measured throughput. Moreover, Sana-0.6B can be deployed on a 16GB laptop GPU, taking less than 1 second to generate a 1024 × 1024 resolution image. Sana enables content creation at low cost.

teaser_page2