
stable_diffusion

Module Name stable_diffusion
Category text to image
Network CLIP Text Encoder+UNet+VAE
Dataset -
Fine-tuning supported or not No
Module Size 4.0GB
Latest update date 2022-08-26
Data indicators -

I.Basic Information

Application Effect Display

  • Prompt "in the morning light,Overlooking TOKYO city by greg rutkowski and thomas kinkade,Trending on artstation."

  • Output image


  • Generating process


Module Introduction

Stable Diffusion is a latent diffusion model, a kind of generative model that produces an image by sampling step by step, iteratively denoising from pure noise, and it has achieved remarkable results. Compared with Disco Diffusion, Stable Diffusion iterates in a lower-dimensional latent space instead of the original pixel space, which greatly reduces memory and computational requirements. You can render a desired image within a minute on a V100 GPU; you are welcome to try it out on aistudio.

For more details, please refer to High-Resolution Image Synthesis with Latent Diffusion Models
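
To make the iterative denoising idea concrete, below is a toy, self-contained sketch of a latent-space sampling loop. It is not the module's implementation: the latent shape, step count, and the stand-in noise predictor are illustrative assumptions, and a real pipeline replaces the stand-in with a text-conditioned UNet and decodes the final latent into pixels with the VAE.

  • import numpy as np
    
    def toy_latent_denoise(steps=50, latent_shape=(4, 64, 64), seed=0):
        rng = np.random.default_rng(seed)
        latent = rng.standard_normal(latent_shape)  # start from pure Gaussian noise
        for _ in range(steps):
            # Stand-in for the UNet's noise prediction (illustrative only).
            predicted_noise = 0.1 * latent
            latent = latent - predicted_noise  # one denoising step
        return latent  # a real pipeline decodes this with the VAE into an image
    
    # The latent is far smaller than a 512x512x3 pixel image, which is why
    # iterating in latent space is so much cheaper.
    print(toy_latent_denoise().shape)  # (4, 64, 64)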

II.Installation
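
  • Install the module through PaddleHub. The command below matches the release note at the end of this document; paddlepaddle and paddlehub themselves must already be installed, though their minimum versions are not stated here.

  • $ hub install stable_diffusion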

III.Module API Prediction

  • 1.Command line Prediction

    • $ hub run stable_diffusion --text_prompts "in the morning light,Overlooking TOKYO city by greg rutkowski and thomas kinkade,Trending on artstation." --output_dir stable_diffusion_out
  • 2.Prediction Code Example

    • import paddlehub as hub
      
      module = hub.Module(name="stable_diffusion")
      text_prompts = ["in the morning light,Overlooking TOKYO city by greg rutkowski and thomas kinkade,Trending on artstation."]
      # Output images will be saved in stable_diffusion_out directory.
      # The returned da is a DocumentArray object, which contains all intermediate and final results
      # You can manipulate the DocumentArray object to do post-processing and save images
      # You can set the batch_size parameter to generate batch_size images in one inference step.
      da = module.generate_image(text_prompts=text_prompts, batch_size=3, output_dir='./stable_diffusion_out/')  
      # Show all intermediate results
      da[0].chunks[-1].chunks.plot_image_sprites(skip_empty=True, show_index=True, keep_aspect_ratio=True)
      # Save the generating process as a gif
      da[0].chunks[-1].chunks.save_gif('stable_diffusion_out-merged-result.gif')
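      # Show and save the intermediate results of the first image (image 0) in the batch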
      da[0].chunks[0].chunks.plot_image_sprites(skip_empty=True, show_index=True, keep_aspect_ratio=True)
      da[0].chunks[0].chunks.save_gif('stable_diffusion_out-image-0-result.gif')
  • 3.API

    • def generate_image(
              text_prompts,
              style: Optional[str] = None,
              artist: Optional[str] = None,
              width_height: Optional[List[int]] = [512, 512],
              seed: Optional[int] = None,
              batch_size: Optional[int] = 1,
              output_dir: Optional[str] = 'stable_diffusion_out'):
      • Image generating API, which generates an image corresponding to your prompt.

      • Parameters

        • text_prompts(str): Prompt, used to describe your image content. You can construct a prompt that conforms to the format "content" + "artist/style", such as "in the morning light,Overlooking TOKYO city by greg rutkowski and thomas kinkade,Trending on artstation.". For more details, you can refer to this website.
        • style(Optional[str]): Image style, such as "watercolor" and "Chinese painting". If not provided, the style is entirely determined by your prompt.
        • artist(Optional[str]): Artist name, such as Greg Rutkowski or krenz; the output image will imitate the style of the artist's works. If not provided, the style is entirely determined by your prompt. (See https://weirdwonderfulai.art/resources/disco-diffusion-70-plus-artist-studies/ for artist references.)
        • width_height(Optional[List[int]]): The width and height of the output images; both should preferably be multiples of 64. The larger the size, the longer the computation time.
        • seed(Optional[int]): Random seed, different seeds result in different output images.
        • batch_size(Optional[int]): Number of images generated for one inference step.
        • output_dir(Optional[str]): Output directory, default is "stable_diffusion_out".
      • Return

        • da(DocumentArray): A DocumentArray object containing batch_size Documents; each Document keeps all intermediate results during generation. Please refer to the DocumentArray tutorial for more details. A usage sketch exercising these parameters follows this list.
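
    • As a supplement to the parameter descriptions above, here is a minimal usage sketch. The prompt and the exact values are illustrative assumptions; style and artist reuse the examples from the parameter list.

    • import paddlehub as hub
      
      module = hub.Module(name="stable_diffusion")
      # All parameters below are documented above; the values are examples only.
      da = module.generate_image(
          text_prompts=["a quiet harbor at dawn"],  # hypothetical prompt
          style="watercolor",            # example style from the docs above
          artist="Greg Rutkowski",       # example artist from the docs above
          width_height=[640, 448],       # multiples of 64, as recommended
          seed=1234,                     # fix the seed for reproducible results
          batch_size=2,                  # generate two images per inference step
          output_dir="./stable_diffusion_out/")
      print(len(da))  # batch_size Documents, each holding intermediate results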

IV.Server Deployment

  • PaddleHub Serving can deploy an online service of text-to-image.

  • Step 1: Start PaddleHub Serving

    • Run the startup command:

    • $ hub serving start -m stable_diffusion
    • The serving API is now deployed, and the default port number is 8866.

    • NOTE: If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise, it does not need to be set. A minimal example follows.
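
    • A minimal example, assuming GPU card 0 is the one to use (the card index is an assumption; pick whichever GPU you want):

    • $ export CUDA_VISIBLE_DEVICES=0
      $ hub serving start -m stable_diffusion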

  • Step 2: Send a predictive request

    • With a configured server, use the following lines of code to send the prediction request and obtain the result.

    • import requests
      import json
      from docarray import DocumentArray
      
      # Send an HTTP request
      data = {'text_prompts': 'in the morning light,Overlooking TOKYO city by greg rutkowski and thomas kinkade,Trending on artstation.'}
      headers = {"Content-type": "application/json"}
      url = "http://127.0.0.1:8866/predict/stable_diffusion"
      r = requests.post(url=url, headers=headers, data=json.dumps(data))
      
      # Get results
      da = DocumentArray.from_base64(r.json()["results"])
      # Save final result image to a file
      da[0].save_uri_to_file('stable_diffusion_out.png')
      # Save the generating process as a gif
      da[0].chunks[0].chunks.save_gif('stable_diffusion_out.gif')

V.Release Note

  • 1.0.0

    First release

    $ hub install stable_diffusion==1.0.0