I Spent 1000 Hours Researching This - You Won't Believe What I Discovered About Stable Diffusion!

PromptGeek
28 Jul 2023 · 18:31

TLDR: In this video, the speaker introduces a comprehensive guide to creating photorealistic images with Stable Diffusion. The guide is a 182-page prompt look book with over 350 images and 200 prompt tags, all tested by the speaker. It is available for free on Gumroad, with an optional $2 donation towards the creator's coffee fund. The video covers the best settings for Stable Diffusion, the models used, and examples from the book. The speaker also discusses LoRAs for realistic skin and eyes, negative prompts to avoid common issues, and specific settings such as the sampling method and upscaler. The guide shows how to structure prompts for AI image generation, covering style, subject, pose, framing, background, lighting, camera angle, and camera properties. The speaker encourages viewers to share their creations and subscribe to the channel for updates.

Takeaways

  • Stable Diffusion can create photorealistic images without the need for expensive camera equipment.
  • The speaker has created a 182-page prompt look book with over 350 images and 200 prompt tags, available for free on Gumroad.
  • The audience is encouraged to like the video, subscribe, and optionally donate to the speaker's coffee fund to support further content creation.
  • The look book includes the settings and models used with Stable Diffusion, such as Universe Stable, Absolute Reality, and Photon.
  • LoRAs (such as detailed eyes and polyhedron New Skin) can enhance the realism of skin textures and eyes in the generated images.
  • Negative prompts, such as 'bad hands' and 'unrealistic dream', help refine the image generation process.
  • The recommended sampling method is DPM++ SDE Karras with 30 sampling steps and high res fix enabled.
  • The 4x UltraSharp upscaler gives fast, high-quality results and is the speaker's choice over the 8x NMKD Superscale upscaler suggested by Absolute Reality's creator.
  • The speaker emphasizes the importance of the prompt structure, which includes elements like style of photo, subject details, pose/action, framing, background, and lighting.
  • Specific styles like 'documentary photography' can lead to more realistic skin tones and textures in the generated images.
  • The speaker's book provides a comprehensive guide on how to build the perfect prompt for Stable Diffusion, including camera properties and styles of famous photographers.
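The settings in the takeaways above can be collected into a single configuration. The following is a minimal sketch, assuming the AUTOMATIC1111 web UI and the field names of its txt2img API; the summary does not say which interface the speaker uses, so the field names, the exact sampler and upscaler strings, and the example prompt are assumptions. Newer versions of that UI may list the sampler and the Karras schedule separately. The denoising value is simply picked from the 0.2 to 0.4 range given in the highlights.

```python
import json


def build_txt2img_settings(prompt: str, negative_prompt: str = "") -> dict:
    """Collect the recommended settings into one request-style dict.

    Field names follow the AUTOMATIC1111 web UI txt2img API (an assumption;
    the summary does not name the interface).
    """
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "sampler_name": "DPM++ SDE Karras",  # sampling method from the video
        "steps": 30,                          # sampling steps
        "enable_hr": True,                    # high res fix switched on
        "hr_upscaler": "4x-UltraSharp",       # assumed name of the 4x upscaler
        "hr_second_pass_steps": 20,           # high res steps
        "denoising_strength": 0.3,            # inside the 0.2-0.4 range
    }


# Placeholder prompt for illustration only; it is not taken from the look book.
settings = build_txt2img_settings("documentary photography, portrait of a fisherman")
print(json.dumps(settings, indent=2))
```

Keeping the settings in one function makes it easy to reuse the same baseline for every prompt and change a single value (for example the denoising strength) per image.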

Q & A

  • What is the main topic of the video?

    -The main topic of the video is using Stable Diffusion to create photorealistic images without the need for expensive camera equipment.

  • What is the resource the speaker has built for the audience?

    -The speaker has built a 182-page prompt look book with over 350 images and over 200 prompt tags, which is available for free on Gumroad.

  • What are the three models the speaker has been using for stable diffusion?

    -The three models the speaker has been using are Universe Stable, Absolute Reality, and Photon.

  • What are LoRAs and how are they used in the process?

    -LoRAs (Low-Rank Adaptation models) are small add-on model files, loaded alongside the main model and typically activated from the prompt, that enhance specific features in the generated images. The speaker uses 'detailed eyes' and 'polyhedron New Skin' for realistic eyes and skin textures.

  • What is the recommended sampling method and steps to use in stable diffusion?

    -The recommended sampling method is DPM++ SDE Karras, with sampling steps set to 30.

  • How does the speaker suggest improving the resolution of the generated images?

    -The speaker suggests using the 4x UltraSharp upscaler and setting the high res steps to 20 for better and faster results.

  • What is the role of negative prompts in the image generation process?

    -Negative prompts are used to avoid unwanted elements in the generated images, such as 'bad hands' or 'unrealistic dream'.

  • Why is the speaker against using ADetailer when creating many images?

    -The speaker found that ADetailer produced the same faces too often when creating a large number of images, which reduced the diversity of the results.

  • What is the structure of the perfect prompt as described in the speaker's guide?

    -The perfect prompt structure includes the style of photo, the subject with its important features, pose or action, framing, background, lighting, camera angle, camera properties, and the name of a photographer whose style should be imitated.

  • How does the speaker suggest using the prompt tags for different photography styles?

    -The speaker suggests using specific photography style tags like 'candid photography', 'documentary photography', or 'surrealist' with appropriate weights to achieve the desired look in the generated images.

  • What is the importance of specifying the camera angle and properties in the prompt?

    -Specifying the camera angle and properties helps to influence the style and quality of the generated image, giving it a more authentic and realistic look based on the chosen camera or lens.

  • How can the audience get access to the speaker's prompt look book?

    -The audience can access the prompt look book for free on Gumroad, with the option to donate $2 towards the speaker's coffee fund if they wish.
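The prompt structure described in the answers above can be sketched as a small helper that joins the components in the order the guide lists them. This is an illustrative sketch only: the class name, field names, and the example values are hypothetical and are not taken from the look book.

```python
from dataclasses import dataclass, fields


@dataclass
class PhotoPrompt:
    """Prompt components in the order given in the guide (names are hypothetical)."""
    style: str = ""
    subject: str = ""
    pose: str = ""
    framing: str = ""
    background: str = ""
    lighting: str = ""
    camera_angle: str = ""
    camera_properties: str = ""
    photographer: str = ""

    def render(self) -> str:
        # fields() yields the attributes in definition order, so the rendered
        # prompt always follows the structure above; empty parts are skipped.
        parts = [getattr(self, f.name).strip() for f in fields(self)]
        return ", ".join(p for p in parts if p)


# Example values are placeholders chosen for illustration.
prompt = PhotoPrompt(
    style="documentary photography",
    subject="elderly fisherman with a weathered face",
    pose="mending a net",
    framing="medium shot",
    background="harbor",
    lighting="overcast",
    camera_angle="eye level",
)
print(prompt.render())
# documentary photography, elderly fisherman with a weathered face, mending a net, medium shot, harbor, overcast, eye level
```

Leaving a component empty simply drops it from the prompt, which makes it easy to test what each part contributes to the image.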

Outlines

00:00

Introduction to Photorealistic Image Creation with Stable Diffusion

The video begins by addressing engineers and photography enthusiasts, suggesting that despite owning expensive cameras and lenses, one can create photorealistic images using stable diffusion without needing to leave their workspace. The speaker humorously encourages getting out of the basement and introduces a free 182-page prompt look book with over 350 images and 200 prompt tags, tested over hundreds of hours. The resource is available on Gumroad, and the speaker asks viewers to like the video, subscribe to the channel, and optionally donate to a coffee fund. The video promises to showcase the best settings for stable diffusion, models used, and examples from the book.

05:03

Optimal Settings and Models for Realistic Image Generation

The speaker discusses the models they've found most successful for creating images with a sci-fi or fantasy twist, as well as for backgrounds. These include Universe Stable, Absolute Reality, and Photon. The importance of using the right prompt and settings is emphasized, and the video outlines specific settings within Stable Diffusion, such as using two LoRAs (detailed eyes and polyhedron New Skin) for realistic skin and eyes, negative prompts like 'bad hands' and 'bad dream', and the sampling method DPM++ SDE Karras with 30 steps. The speaker also covers the use of high res fix, upscalers, high res steps, denoising strength, and aspect ratio adjustments. The section ends with a teaser for the next book and a reminder of the coffee fund.
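How the two LoRAs and the negative prompts might be attached to a prompt can be sketched as follows. This assumes the `<lora:name:weight>` tag syntax of the AUTOMATIC1111 web UI; the file names, the weights, and the spelling of the negative terms are placeholders, since the summary gives none of them.

```python
def lora_tag(name: str, weight: float = 1.0) -> str:
    """Format one LoRA reference in the assumed <lora:name:weight> syntax."""
    return f"<lora:{name}:{weight:g}>"


def with_loras(prompt: str, loras: dict) -> str:
    """Append a LoRA tag for every (name, weight) pair to the prompt."""
    tags = " ".join(lora_tag(name, weight) for name, weight in loras.items())
    return f"{prompt} {tags}".strip()


def negative_prompt(terms: list) -> str:
    """Join negative terms, dropping blanks and duplicates but keeping order."""
    seen = []
    for term in terms:
        term = term.strip()
        if term and term not in seen:
            seen.append(term)
    return ", ".join(seen)


# File names and weights below are hypothetical placeholders.
positive = with_loras(
    "portrait photo",
    {"detailed_eyes": 0.5, "polyhedron_new_skin": 0.3},
)
negative = negative_prompt(["bad hands", "bad dream", "bad hands"])

print(positive)  # portrait photo <lora:detailed_eyes:0.5> <lora:polyhedron_new_skin:0.3>
print(negative)  # bad hands, bad dream
```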

10:04

In-Depth Prompt Construction and Image Generation Process

The video continues with the process of generating the first image using the discussed settings. The speaker highlights the need for a good prompt structure, which includes the style of photo, subject details, pose or action, framing, background, lighting, camera angle, and camera properties. The section gives examples of different photography styles, such as abstract, candid, documentary, and large format, and explains how they affect the final image. The speaker also shares their personal experience of building the prompt guide, emphasizing the effort put into perfecting the prompts and the selection process for the images included in the book.
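The Q & A mentions giving style tags "appropriate weights". A minimal sketch of that idea, assuming the `(tag:weight)` emphasis syntax of the AUTOMATIC1111 web UI; the weight values and the accepted range are illustrative assumptions, not recommendations from the video.

```python
def weighted(tag: str, weight: float = 1.0) -> str:
    """Wrap a tag in the assumed (tag:weight) emphasis syntax.

    A weight of 1.0 means no emphasis, so the tag is returned unchanged.
    The 0-2 limit is an arbitrary sanity check for this sketch.
    """
    if not 0 < weight <= 2:
        raise ValueError("weight should be greater than 0 and at most 2")
    if weight == 1.0:
        return tag
    return f"({tag}:{weight:g})"


# Illustrative weights only.
style_tags = [
    weighted("documentary photography", 1.2),
    weighted("candid photography"),
    weighted("surrealist", 0.8),
]
print(", ".join(style_tags))
# (documentary photography:1.2), candid photography, (surrealist:0.8)
```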

15:07

Advanced Prompt Techniques and Photographer Styles

The final section covers more advanced aspects of prompt construction, including the use of adjectives to describe the character of the subject and the advice to avoid focusing on hands and feet. It discusses the importance of specifying poses and actions, framing, and background settings, and how they contribute to the overall image. Lighting choices such as candlelight, chiaroscuro, cinematic, golden hour, high key, neon, and overcast are explored. The speaker also touches on camera angles and properties, mentioning different cameras and lenses that can be referenced in prompts to achieve specific visual effects. The video concludes with a call to action for viewers to like, subscribe, and share their creations, and an invitation to download the free prompt book for further guidance.

Keywords

Stable Diffusion

Stable Diffusion is an artificial intelligence model that generates images from textual descriptions. In the context of the video, it is the core technology that allows the creation of photorealistic images without the need for traditional photography equipment. The video discusses how to use Stable Diffusion effectively to produce high-quality results.

Prompt Look Book

A Prompt Look Book is a resource that provides examples and guidelines on how to construct prompts for AI image generation. The video mentions a 182-page look book that the speaker has created, which includes over 350 images and 200 prompt tags tested by the speaker. It serves as a guide for users to understand how to generate images using specific prompts and settings in Stable Diffusion.

LoRAs

LoRAs (short for Low-Rank Adaptation) are small add-on weight files that adjust a Stable Diffusion model to influence the generation process. In the video, 'detailed eyes' and 'polyhedron New Skin' are mentioned as LoRAs that help in creating more realistic skin textures and eyes in the generated images.

Negative Prompts

Negative prompts are terms or phrases included in the prompt to guide the AI away from generating certain undesired elements in the image. An example from the video is 'bad hands', which is used to prevent the AI from generating images with poorly rendered hands.

Sampling Method

The Sampling Method refers to the algorithm the AI uses to generate the image step by step. DPM++ SDE Karras is mentioned in the video as the specific sampling method that the speaker uses in Stable Diffusion to achieve better image results.

High Res Fix

High Res Fix is a setting that generates the image at a lower resolution, upscales it, and then refines it in a second pass. The video suggests using this feature to improve the quality of the output, making the images more detailed and closer to photorealism.

Upscale

Upscaling is the process of increasing the resolution of an image. In the context of the video, the speaker mentions using the 4x UltraSharp upscaler to improve the resolution of the generated images, making them clearer and more detailed.

Photorealistic

Photorealistic refers to the quality of an image appearing very similar to a photograph taken by a camera. The entire theme of the video revolves around achieving photorealistic results using Stable Diffusion, with the speaker providing various techniques and settings to accomplish this.

Prompt Structure

Prompt Structure is the arrangement and composition of the textual description (prompt) given to the AI to guide the image generation process. The video emphasizes the importance of a well-structured prompt, which includes elements like the style of photography, subject details, pose, background, and lighting.

Camera Properties

Camera Properties in the context of the video refer to the specific characteristics of cameras and lenses that can be referenced in the prompt to influence the style and quality of the generated image. The speaker discusses how mentioning certain cameras or lenses can add a distinctive look to the AI-generated images.

Style of Photographer

The Style of Photographer refers to the unique aesthetic qualities associated with specific photographers, which can be invoked in the prompt to guide the AI towards generating images in a similar style. The video provides examples of photographers like Tim Walker and Alfred Stieglitz, whose styles can be referenced to achieve particular visual effects.

Highlights

You can create photorealistic images using stable diffusion without expensive camera equipment.

The speaker has developed a 182-page prompt look book with over 350 images and 200 prompt tags for stable diffusion.

The look book is available for free on Gumroad, with an option to donate towards the creator's coffee fund.

The video showcases the best settings for stable diffusion, including models and prompt examples from the book.

Three models discussed are Universe Stable, Absolute Reality, and Photon, suitable for sci-fi, fantasy, and film grain effects.

Popular photorealistic models can yield good results with the right prompt and settings.

LoRAs such as detailed eyes and polyhedron New Skin are used for realistic skin and eye textures.

Negative prompts like 'bad hands' and 'unrealistic dream' are crucial for refining image generation.

The sampling method DPM++ SDE Karras with 30 sampling steps is recommended for high-quality outputs.

High res fix and the 4x UltraSharp upscaler are used for fast, high-quality results.

Denoising strength can be adjusted between 0.2 and 0.4 for optimal image quality.

The aspect ratio and CFG scale can be modified based on the desired image orientation and style.

Using ADetailer can sometimes result in repetitive faces, so manual touch-ups may be preferable.

The structure of an effective prompt includes the style of photo, subject details, pose, framing, background, lighting, camera angle, and photographer's style.

The prompt guide offers a variety of styles like abstract, candid, documentary, and glamour photography for different visual effects.

Adjectives and verbs in prompts can add character and expressiveness to the generated images.

Care should be taken when describing poses to match the desired scene's formality or casualness.

Background descriptions should provide context without being overly specific to allow the AI model to interpret effectively.

Lighting tags like candlelight, chiaroscuro, and overcast can dramatically affect the mood and realism of the image.

Camera angle choices such as Dutch angle, high angle, and eye level can change the perspective and feel of the image.

Camera properties including specific camera models and lenses can influence the style and quality of the generated image.

The inclusion of a photographer's style in the prompt can add a distinct artistic touch to the image.

The speaker encourages the community to use the provided resources to create and share their images, fostering a collaborative learning environment.
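The highlights on denoising strength and aspect ratio can be sketched as two small helpers. This is a hedged illustration: it assumes a model trained around a 512-pixel short side and image dimensions that should be multiples of 8, neither of which is stated in the summary, and the function names are hypothetical.

```python
def clamp_denoise(value: float, low: float = 0.2, high: float = 0.4) -> float:
    """Keep the denoising strength inside the 0.2-0.4 range from the highlights."""
    return min(max(value, low), high)


def dimensions(aspect_w: int, aspect_h: int, short_side: int = 512, multiple: int = 8) -> tuple:
    """Return (width, height) for an aspect ratio.

    The short side is fixed at `short_side` (assumed 512) and the long side is
    rounded to the nearest multiple of `multiple` (assumed 8).
    """
    if aspect_w <= 0 or aspect_h <= 0:
        raise ValueError("aspect ratio parts must be positive")

    def snap(value: float) -> int:
        return max(multiple, round(value / multiple) * multiple)

    if aspect_w >= aspect_h:  # landscape or square
        return snap(short_side * aspect_w / aspect_h), snap(short_side)
    return snap(short_side), snap(short_side * aspect_h / aspect_w)  # portrait


print(dimensions(2, 3))    # (512, 768)
print(dimensions(16, 9))   # (912, 512)
print(clamp_denoise(0.7))  # 0.4
```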