Best Practice Workflow for Automatic 1111 - Stable Diffusion

AIKnowledge2Go
26 Jun 2023 · 07:59

TLDR: In this video, the presenter shares their preferred Stable Diffusion workflow in Automatic 1111, built around the ReV Animated model, which is known for semi-realistic, high-quality renderings. They cover adding the Clip Skip slider to the UI, crafting prompts, and adjusting settings for image quality. The process involves choosing the right sampler, resizing images to enhance detail, and using the denoising strength to control how much the image changes. The presenter also fixes a common problem, a missing leg, using Inpaint. Finally, they upscale the image with R-ESRGAN 4x+ Anime6B for a crisp, detailed finish. The video concludes with a teaser for upcoming tutorials on related topics.

Takeaways

  • Use the ReV Animated model for semi-realistic renderings; it produces beautiful, detailed images.
  • Set Clip Skip to 2 with the slider in the Quick Settings; it controls how many of the final CLIP text-encoder layers are skipped.
  • Render at 768x432 for a 16:9 aspect ratio; keeping both sides at or below 768 avoids deformations.
  • Increase the batch size to 8 to get a selection of images to choose from.
  • Use Euler a for prompt engineering and experimentation, because it is fast.
  • Switch the sampler to DPM++ 2M Karras for additional detail and set the denoising strength between 0.4 and 0.7.
  • Use the 'Send to img2img' option to retain the composition of the original image.
  • Fix compositional issues, such as a missing leg, using Inpaint.
  • Adjust the denoising strength and the resize scale to refine image details without losing quality.
  • Avoid common mistakes, such as forgetting to delete the old mask before starting to draw.
  • Use the R-ESRGAN 4x+ Anime6B upscaler for a semi-realistic look; it is often preferred for ReV Animated images.
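
The draft-pass settings above can also be written down as a request body for the web UI's optional HTTP API (available when the UI is started with `--api`). This is a minimal sketch and not the presenter's method, since the video works in the UI only. The field names are believed to match the `/sdapi/v1/txt2img` endpoint but should be verified against the `/docs` page of a local install; `build_draft_payload` and the prompt text are hypothetical placeholders.

```python
import json


def build_draft_payload(prompt, negative_prompt=""):
    """Return a txt2img request body for the fast drafting pass.

    Values mirror the takeaways: Euler a, 768x432, batch size 8, Clip Skip 2.
    Field names are assumed from the web UI's HTTP API; verify them locally.
    """
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "sampler_name": "Euler a",  # fast sampler for prompt engineering
        "width": 768,               # longest side kept at 768
        "height": 432,              # 16:9 at that width
        "batch_size": 8,            # eight candidates per run
        # Per-request override of the Clip Skip setting (assumed key name).
        "override_settings": {"CLIP_stop_at_last_layers": 2},
    }


# Placeholder prompt: the video's actual prompt is not reproduced here.
payload = build_draft_payload("female astronaut, exploding space station")
print(json.dumps(payload, indent=2))
```

The function only builds the dictionary; sending it to a running instance is left out on purpose so the sketch runs without a server.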

Q & A

  • What is the best workflow for Stable Diffusion in Automatic 1111 according to the video?

    -The workflow uses the ReV Animated model with Clip Skip set to 2, Euler a for prompt engineering, and a width and height of 768 and 432 for a 16:9 image. The batch size is increased to 8 so there are several images to choose from. For the image-to-image pass, the sampler is changed to DPM++ 2M Karras with a denoising strength between 0.4 and 0.7.

  • How does the video suggest enabling the Clip Skip slider in the settings?

    -Go to Settings, open User interface, click the Quick settings list field, and type in 'CLIP_stop_at_last_layers'. After that, you need to restart your UI.

  • What is the recommended resolution for avoiding deformations?

    -The recommended resolution is 768x432. A side length of 768 is the maximum most models can handle without causing deformations, and a height of 432 gives a 16:9 aspect ratio at that width.

  • Why is the batch size increased to 8 in the rendering process?

    -The batch size is increased to 8 to allow the selection of multiple images, providing more options to choose from for the final output.

  • What is the role of the DPM++ 2M Karras sampler in the workflow?

    -The DPM++ 2M Karras sampler is used in the image-to-image pass to add detail and turn the selected draft into a finished image. Because that pass starts from the draft, the original composition stays intact.

  • What is the recommended denoising strength value for introducing some changes to the image?

    -The recommended denoising strength value is between 0.4 and 0.7, with 0.7 for more changes and 0.4 for minimal changes.

  • How does the video suggest fixing issues with the generated image, such as a missing leg?

    -The video suggests using Inpaint to fix issues like a missing leg. It involves downscaling the image to avoid losing detail, making the necessary adjustments, and then scaling it back up for a crisp image.

  • What is the purpose of downscaling and upscaling the image in the process of fixing errors?

    -Downscaling the image allows for less detail loss when making adjustments. After fixing the errors, upscaling the image provides a crisp and detailed final output.

  • How does the video recommend selecting the final image from the generated options?

    -The video suggests choosing the image that best represents the desired outcome, such as the one with the most appealing explosion or the one that has been corrected for errors like a missing leg.

  • What upscaler is recommended for images rendered with the ReV Animated model?

    -The video recommends the R-ESRGAN 4x+ Anime6B upscaler for images rendered with the ReV Animated model because it suits their semi-realistic look.

  • Why is it important to ensure the mask is deleted before starting to draw in Inpaint?

    -Ensuring the mask is deleted before starting to draw prevents the previous mask from interfering with the new drawing, allowing for accurate and clean corrections.

  • What does the video suggest for those who want to discuss different upscaling options for various models?

    -The video suggests that those interested in discussing different upscaling options and their suitability for various models can engage in online discussions, as there are many debates on this topic.

Outlines

00:00

Optimal Workflow for Stable Diffusion in Auto1111

The speaker shares their personal best workflow for Stable Diffusion in Auto1111 using the semi-realistic model ReV Animated. They show how to add the Clip Skip slider to the user interface, craft a prompt, use Euler a for prompt engineering, and keep the image dimensions small enough to avoid deformations. The speaker also discusses choosing the right sampler and denoising strength to control how much the image changes. They mention sharing prompts on their Patreon page and walk through the rendering process, ultimately selecting the image with their preferred explosion effect for further refinement.

05:02

Refining the Image with Inpainting and Upscaling

This section focuses on refining the generated image, starting with the astronaut's missing leg. The speaker shows how to use Inpaint, adjust the denoising strength, and resize the image to maintain detail. They stress checking for a leftover mask before drawing and demonstrate how to correct the leg. The section concludes with an upscaler recommendation, R-ESRGAN 4x+ Anime6B, for enhancing the semi-realistic look of the rendered image. The speaker also teases upcoming content, including tutorials on inpainting hands and on common mistakes with Auto1111.

Keywords

Stable Diffusion

Stable Diffusion is an artificial intelligence model that generates images from textual descriptions. In the context of the video, it is the core technology that the workflow is built around. The video discusses how to tune its settings to generate images with specific characteristics.

Clip Skip

Clip Skip controls how many of the final layers of the CLIP text encoder are skipped when the prompt is processed, which changes how the prompt is interpreted and therefore the final image. In the video, the presenter shows how to add the Clip Skip slider to the Quick Settings and sets it to 2.

ReV Animated

ReV Animated is a semi-realistic Stable Diffusion model that is capable of producing high-quality renderings. It is highlighted as a preferred choice for images with a more realistic aesthetic, as demonstrated in the video with the rendering of an astronaut and a space station.

Euler a

Euler a (Euler ancestral) is a sampler, one of the algorithms that can carry out the image generation steps. The video suggests using Euler a for prompt engineering and during the experimental phase because it is fast, which allows for quicker iterations and adjustments to the generated images.

Resolution

Resolution refers to the dimensions of the generated image. The video names 768 pixels as the maximum width or height that most models can handle without causing deformations. The presenter renders at 768x432 to get a 16:9 aspect ratio for screen compatibility.
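
The 768x432 figure follows from simple arithmetic, sketched below. The helper name `height_for_16_9` is hypothetical, and rounding down to a multiple of 8 is an assumption based on how the web UI's size sliders step, not something stated in the video.

```python
def height_for_16_9(width, multiple=8):
    """Return the 16:9 height for a given width.

    The result is rounded down to a multiple of `multiple` (assumed to be 8).
    """
    height = width * 9 // 16
    return height - height % multiple


MAX_SIDE = 768  # largest side the video recommends
width = MAX_SIDE
height = height_for_16_9(width)
print(width, height)  # 768 432
```

Starting from the longest allowed side and deriving the other keeps both dimensions at or below the 768-pixel limit.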

Batch Size

Batch Size is the number of images generated in a single rendering process. The video mentions increasing the batch size to 8 to allow for the selection of multiple images from which to choose the best outcome.

Denoising Strength

Denoising Strength controls how much the image-to-image pass is allowed to change the input image. A higher value results in more changes to the image, while a lower value preserves more of the original. The video provides a range of 0.4 to 0.7 for this setting, depending on the desired level of change.

Image-to-Image

Image-to-Image is a mode within the AI art generation process that allows users to base the final image on a provided image, rather than starting from scratch. This mode is chosen in the video to maintain the composition of the initial rendering while introducing changes.

Sampler

A sampler is the algorithm that carries out the step-by-step denoising during image generation. The video discusses changing the sampler to DPM++ 2M Karras, which affects the level of detail and the overall look of the generated images.
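
The image-to-image settings described above (sampler, denoising strength, and the batch count of three mentioned in the highlights) can be sketched as a request body for the web UI's optional HTTP API. This is an illustration under assumptions, not the presenter's method: the field names are believed to match the `/sdapi/v1/img2img` endpoint and should be verified locally, and `build_img2img_payload`, the range check, and the placeholder inputs are hypothetical.

```python
import base64


def build_img2img_payload(image_bytes, prompt, denoising_strength=0.5,
                          batch_count=3):
    """Return an img2img request body for the refinement pass.

    Rejecting values outside 0.4-0.7 is a design choice of this sketch that
    mirrors the range recommended in the video; the UI itself allows more.
    """
    if not 0.4 <= denoising_strength <= 0.7:
        raise ValueError("denoising_strength should be between 0.4 and 0.7")
    return {
        # The API is assumed to expect base64-encoded image data.
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,
        "sampler_name": "DPM++ 2M Karras",  # sampler used for added detail
        "denoising_strength": denoising_strength,
        "n_iter": batch_count,              # "batch count" in the UI
    }


# Placeholder bytes stand in for the PNG of the selected draft image.
payload = build_img2img_payload(b"placeholder image bytes",
                                "female astronaut, exploding space station",
                                denoising_strength=0.6)
print(sorted(payload))
```

Raising the strength toward 0.7 gives the model more freedom to change the draft; lowering it toward 0.4 keeps the result closer to the input.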

Inpainting

Inpainting is a technique for regenerating a masked part of the generated image. In the video, the presenter uses Inpaint to correct an issue with the leg of the astronaut, demonstrating how to fix errors and make specific adjustments to the generated art.

Upscaler

An upscaler is a tool that increases the resolution of an image while keeping it sharp. The video uses the R-ESRGAN 4x+ Anime6B upscaler, which is often paired with images generated by the ReV Animated model, to enhance details and achieve a higher-quality final image.
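
For completeness, the final upscaling step can be sketched as a request body for the web UI's optional HTTP API. The upscaler is the one named in the video, spelled here the way the web UI is believed to list it (`R-ESRGAN 4x+ Anime6B`). The field names are assumed from the `/sdapi/v1/extra-single-image` endpoint and should be checked locally. `build_upscale_payload` is a hypothetical helper, and the scale factor is left to the caller because the summary does not state the one used.

```python
import base64


def build_upscale_payload(image_bytes, factor):
    """Return a request body for upscaling a single image.

    Field names are assumed from the web UI's HTTP API; verify them locally.
    """
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "resize_mode": 0,              # assumed: 0 means "scale by factor"
        "upscaling_resize": factor,    # caller-supplied scale factor
        "upscaler_1": "R-ESRGAN 4x+ Anime6B",
    }


# Placeholder bytes stand in for the PNG of the finished image;
# the factor of 2.0 is an arbitrary example value.
payload = build_upscale_payload(b"placeholder image bytes", 2.0)
print(payload["upscaler_1"], payload["upscaling_resize"])
```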

Highlights

The presenter shares a personal opinion on the best workflow for Stable Diffusion in Automatic 1111.

Use of the ReV Animated model for semi-realistic and beautiful renderings is recommended.

Setting clip skip to 2 is suggested for optimization.

The process to enable the clip skip slider in settings is explained.

A prepared prompt featuring a female astronaut and an exploding space station is mentioned.

Use of Euler a for prompt engineering and its fast processing is highlighted.

The importance of not exceeding width and height of 768 to avoid deformations is noted.

A 16:9 resolution is targeted for screen compatibility.

Batch size is increased to 8 for image selection.

The selection process of the best rendered image is discussed.

Use of Hires. fix is not recommended in this workflow.

Switching the sampler to DPM++ 2M Karras is suggested for detail enhancement.

Denoising strength values between 0.4 and 0.7 are recommended based on desired changes.

Batch count set to three for multiple image options.

The next step involves addressing the leg situation in the image.

Inpainting and adjusting the mask to fix image errors are covered.

Resizing the image by 0.5 for better detail retention is explained.

The use of the DPM++ sampler with a denoising strength of 0.6 is suggested for further image changes.

Choosing the best image from the three rendered options is discussed.

Upscaling the final image using R-ESRGAN 4x+ Anime6B is recommended for semi-realistic looks.

Details and explosion in the final rendered image are praised for their quality.

Upcoming tutorials on inpainting hands and common mistakes with Automatic 1111 are teased.