Stable Diffusion XL 1.0

DataCrunch Inference Stable Diffusion XL 1.0 documentation

The inference service provides a Stable Diffusion XL 1.0 endpoint.

Endpoint features

Three generation modes: base SDXL only, Ensemble of Expert Denoisers, and base model followed by the refiner
Safety filter toggle
Support for style templates
Support for loading LoRA (limited)

Examples

Simple base SDXL (no refiner)

To disable the refiner, set is_ensemble=false and refiner=false.

curl -X POST https://<ENDPOINT_URL>/generate \
  -H "Content-Type: application/json" \
  -H <AUTH_HEADERS> \
  -d \
'{
    "prompt": "A cat with a hat",
    "height": 512,
    "width": 512,
    "num_inference_steps": 50,
    "num_images_per_prompt": 1,
    "seed": 42,
    "refiner": false,
    "is_ensemble": false
}'
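The same request can be built from Python. The sketch below uses only the standard library and mirrors the curl call above. The endpoint URL and the auth header are placeholders (assumptions, not values from this documentation), and because the response format is not described here, the sketch stops at constructing the request.

```python
import json
import urllib.request

# Placeholder values (assumptions): substitute your own endpoint and credentials.
ENDPOINT_URL = "https://example.invalid/generate"
AUTH_HEADERS = {"Authorization": "Bearer <TOKEN>"}  # hypothetical header name and value


def build_request(payload, url=ENDPOINT_URL, auth_headers=AUTH_HEADERS):
    """Return a POST request carrying the JSON payload, without sending it."""
    headers = {"Content-Type": "application/json", **auth_headers}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Same parameters as the curl example: base model only, no refiner.
base_only = {
    "prompt": "A cat with a hat",
    "height": 512,
    "width": 512,
    "num_inference_steps": 50,
    "num_images_per_prompt": 1,
    "seed": 42,
    "refiner": False,
    "is_ensemble": False,
}
request = build_request(base_only)
# Sending is left to the caller, for example urllib.request.urlopen(request).
```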

Ensemble of Expert Denoisers

The base model denoises for the first num_inference_steps * (1 - refiner_ratio) steps, and the refiner then continues for the remaining num_inference_steps * refiner_ratio steps.

To run in the Ensemble of Expert Denoisers mode, set is_ensemble=true and refiner=false.

curl -X POST https://<ENDPOINT_URL>/generate \
  -H "Content-Type: application/json" \
  -H <AUTH_HEADERS> \
  -d \
'{
    "prompt": "A cat with a hat",
    "height": 512,
    "width": 512,
    "num_inference_steps": 50,
    "num_images_per_prompt": 1,
    "seed": 42,
    "refiner": false,
    "is_ensemble": true,
    "refiner_ratio": 0.2
}'
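The step split for this mode follows directly from the formula above. As a minimal sketch, the helper below computes how many steps each model runs; the function name is hypothetical, and both the rounding to the nearest whole step and the 0 to 1 range check are assumptions, since the service's exact behavior is not documented here.

```python
def split_steps(num_inference_steps: int, refiner_ratio: float = 0.2) -> tuple[int, int]:
    """Return (base_steps, refiner_steps) for the ensemble mode."""
    if not 0.0 <= refiner_ratio <= 1.0:
        raise ValueError("refiner_ratio must be between 0 and 1")
    # Assumption: round to the nearest step; the service may round differently.
    refiner_steps = round(num_inference_steps * refiner_ratio)
    base_steps = num_inference_steps - refiner_steps
    return base_steps, refiner_steps


print(split_steps(50, 0.2))  # (40, 10)
print(split_steps(40, 0.1))  # (36, 4)
```

With the request above (50 steps, refiner_ratio of 0.2), the base model would run 40 steps and the refiner 10.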

Refine the denoised base image

To run a two-step pipeline, set is_ensemble=false and refiner=true. The base model first denoises the image fully, and the refiner is then applied to its output as an image-to-image pipeline. The step counts for the base model and the refiner are controlled separately with num_inference_steps and num_inference_steps_refiner, and each stage has its own guidance value: guidance_scale and guidance_scale_refiner.

curl -X POST https://<ENDPOINT_URL>/generate \
  -H "Content-Type: application/json" \
  -H <AUTH_HEADERS> \
  -d \
'{
    "prompt": "A cat with a hat",
    "height": 512,
    "width": 512,
    "num_inference_steps": 40,
    "num_inference_steps_refiner": 10,
    "guidance_scale": 4,
    "guidance_scale_refiner": 4,
    "num_images_per_prompt": 1,
    "seed": 42,
    "refiner": true,
    "is_ensemble": false
}'
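The three modes differ only in the refiner and is_ensemble flags. The sketch below collects the flag combinations from the examples above into one lookup table; the mode names and the helper are hypothetical, introduced here for illustration. The combination refiner=true with is_ensemble=true is not covered by the examples, so it is left out.

```python
# Flag combinations taken from the three examples above.
MODE_FLAGS = {
    "base": {"refiner": False, "is_ensemble": False},
    "ensemble": {"refiner": False, "is_ensemble": True},
    "refine": {"refiner": True, "is_ensemble": False},
}


def payload_for_mode(mode: str, prompt: str, **params) -> dict:
    """Merge the prompt, extra parameters, and the flags for the chosen mode."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    # Mode flags are applied last so they override any conflicting keys in params.
    return {"prompt": prompt, **params, **MODE_FLAGS[mode]}


payload = payload_for_mode(
    "refine",
    "A cat with a hat",
    num_inference_steps=40,
    num_inference_steps_refiner=10,
    guidance_scale=4,
    guidance_scale_refiner=4,
)
```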

Parameters

prompt (str, required): Prompt text.
height (int, optional): Height of the output image. Setting aspect_ratio overrides this value. Defaults to 1024.
width (int, optional): Width of the output image. Setting aspect_ratio overrides this value. Defaults to 1024.
num_inference_steps (int, optional): Number of inference (denoising) steps. Defaults to 50.
guidance_scale (float, optional): Scaling factor for guidance. Specifies how much to follow the text prompt. Defaults to 4.0.
num_images_per_prompt (int, optional): Number of images to generate per prompt. Defaults to 1.
seed (int, optional): Seed for random number generator. Defaults to 42.
negative_prompt (str, optional): Negative prompt text.
seed_image (str, optional): Base64-encoded seed image string.
strength (float, optional, Range: [0.05, 1.0]): How much noise is added to the seed_image before generation. Defaults to 0.2.
scheduler (str, optional): Scheduler to use. Supported schedulers: DDIM, K_EULER, EulerA, DPMSolverMultistep, KarrasDPM, PNDM, HeunDiscrete. Defaults to DDIM.
timestep_spacing (str, optional): Timestep spacing for the scheduler. Supported values: linspace, trailing, leading. Defaults to linspace.
guidance_scale_refiner (float, optional): Scaling factor for refiner guidance (corresponds to guidance_scale). Defaults to 1.0.
refiner (bool, optional): Whether to use the refiner model. Defaults to false.
num_inference_steps_refiner (int, optional): Number of inference steps for the refiner. Applies when refiner=true and is_ensemble=false. Defaults to 50.
style_selected (str, optional): Apply the specified style to the provided prompt; see supported styles.
is_ensemble (bool, optional): Whether to use the Ensemble of Expert Denoisers pipeline. Defaults to false.
refiner_ratio (float, optional): Requires is_ensemble=true. The fraction of the num_inference_steps steps to run the refiner for. For example, if num_inference_steps=40, and refiner_ratio=0.1 then the base model will run for 40 * (1-0.1) = 36 steps, and the refiner for 40 * 0.1 = 4 steps. Values over 0.2 start to produce unnatural-looking images. Defaults to 0.2.
aspect_ratio (str, optional): Aspect ratio of the output image. Setting this value overrides the width and height values.
lora_id (str, optional): Finetuned LoRA ID to load (LoRA file must exist on DataCrunch platform).
lora_name (str, optional): Public LoRA to load. Currently the only supported value is lora_name="offset" (corresponding to "sd_xl_offset_example-lora_1.0.safetensors").
safety_filter (bool, optional): Whether to use NSFW filter. Defaults to true.
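Several of the constraints listed above can be checked on the client before a request is sent. The sketch below is one way to do that, not the service's own validation: the helper names are hypothetical, and it is an assumption that seed_image expects plain Base64 of the raw image bytes (as opposed to, say, a data URI).

```python
import base64

# Values copied from the parameter list above.
SCHEDULERS = {"DDIM", "K_EULER", "EulerA", "DPMSolverMultistep",
              "KarrasDPM", "PNDM", "HeunDiscrete"}
TIMESTEP_SPACINGS = {"linspace", "trailing", "leading"}


def encode_seed_image(image_bytes: bytes) -> str:
    """Base64-encode raw image bytes for the seed_image parameter."""
    return base64.b64encode(image_bytes).decode("ascii")


def check_params(params: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if not params.get("prompt"):
        problems.append("prompt is required")
    if "strength" in params and not 0.05 <= params["strength"] <= 1.0:
        problems.append("strength must be in [0.05, 1.0]")
    if "scheduler" in params and params["scheduler"] not in SCHEDULERS:
        problems.append("unsupported scheduler")
    if "timestep_spacing" in params and params["timestep_spacing"] not in TIMESTEP_SPACINGS:
        problems.append("unsupported timestep_spacing")
    if "refiner_ratio" in params and not params.get("is_ensemble", False):
        problems.append("refiner_ratio requires is_ensemble=true")
    if "lora_name" in params and params["lora_name"] != "offset":
        problems.append('only lora_name="offset" is supported')
    return problems


print(check_params({"prompt": "A cat with a hat", "strength": 2.0}))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a payload at once.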

