Midjourney v5 - Style Prompt Tips and Reference Tricks

Theoretically Media
21 Mar 2023 11:57

TLDR: In this video, the host discusses the challenges of generating specific images using Midjourney V5 and provides tips to improve the process. They address a common issue where the AI ignores certain instructions, such as posing and style preferences, using a comment from Eric Schlitzbeyer as a starting point. The host then demonstrates various techniques to achieve a desired image output, including adjusting the prompt's structure, emphasizing certain words, and using image prompts. They also experiment with different styles, like those of Frank Miller and Akira Kurosawa, and suggest workarounds for aspect ratio issues. The video concludes with a Photoshop example to illustrate character emotions and a mention of Leonardo's canvas feature for adjusting image dimensions. The host encourages viewer engagement and provides a link to a related video on cinematic prompting in Midjourney.

Takeaways

  • 🎨 **Style Prompting in Midjourney V5**: The video discusses how to use style prompts effectively in Midjourney V5 to achieve desired images.
  • 🗣️ **Natural Language vs. Programming**: While the new prompting is intended to be more natural language, it's still somewhat programming-oriented.
  • 🖼️ **Full Body Shots**: Achieving full body images can be challenging due to the 16:9 aspect ratio's tendency to favor cinematic compositions.
  • 📚 **Comic Book Styles**: The video explores applying comic book styles, like Frank Miller's, to prompts, but notes that Midjourney may not always apply them as expected.
  • 👟 **Specific Details**: Midjourney sometimes ignores specific details like the absence of shoes, likely due to the training data it was exposed to.
  • 🔄 **Prompt Formula**: A prompt formula is introduced to help control the output, which includes elements like 'cinematic still' and 'style by'.
  • 🔍 **Image Prompts**: Using an image as a prompt can be effective, but it may lock the output into the reference image's aspect ratio.
  • 🌐 **Combining Tools**: The video suggests using multiple tools, like Leonardo, to achieve better results by combining their strengths.
  • 🎭 **Emoting Characters**: Eliciting specific emotions in generated characters can be tricky, but using a 'photo bash' technique can help.
  • 📐 **Aspect Ratio Adjustments**: Leonardo's canvas feature can be used to adjust the aspect ratio of an image without distorting it.
  • 🚀 **Creative Experimentation**: The process encourages experimentation with different styles and artists to find unique combinations that work.

Q & A

  • What is the main issue that Eric Schlitzbeyer has with Midjourney V5?

    -Eric Schlitzbeyer is bothered that Midjourney V5 ignores many of his instructions: he asks for a full body picture but gets a portrait from the hip up, and the 'in the style of' addition is often disregarded as well.

  • What is the example prompt provided by Eric in the transcript?

    -Eric's example prompt describes a gloomy, mystical, and somewhat threatening forest with a 10-year-old Viking girl standing in a clearing where fog is wafting. She is seen in full body, wearing torn, dirty clothes and no shoes, threatening with a sword and screaming a loud war cry, with her long blonde hair braided. The prompt closes by asking for the style of Tarantino, Kurosawa, or Frank Miller.

  • What is the initial problem with the generated images from Eric's prompt?

    -The initial problem is that despite the prompt asking for a full body shot, the generated images are waist-up shots. This is likely due to the 16:9 aspect ratio, which tends to result in more cinematic compositions that favor waist-up or close-up shots over full body shots.

  • How does the speaker attempt to address the issue of not getting a Frank Miller style in the generated images?

    -The speaker tries to address this by rearranging the prompt and applying a prompt formula that emphasizes 'Style by Frank Miller'. They also experiment with adding 'Sin City' to the prompt to get closer to the desired Frank Miller style.

  • What is the prompt formula used by the speaker to improve the results?

    -The prompt formula used is: '/imagine cinematic still, film by, scene, subject, action, angle or shot', followed by the '--' parameters. The speaker adapts this formula by replacing 'cinematic still' with 'illustration' and 'film by' with 'Style by Frank Miller'.

  • Why does the speaker think Midjourney V5 might be ignoring the instruction for a shoeless Viking?

    -The speaker suspects that Midjourney V5 is ignoring the instruction for a shoeless Viking because the AI has been trained on a dataset where there might not be many images of shoeless Vikings, leading to the AI's insistence on generating images with shoes.

  • What alternative artist style does the speaker try after struggling with the Frank Miller style?

    -After struggling with the Frank Miller style, the speaker tries the style of Mike Mignola, who is known for his work on Hellboy. This results in images that look quite impressive and are closer to the desired outcome.

  • How does the speaker attempt to get a 16:9 aspect ratio image when the generated images are in a different ratio?

    -The speaker uses Leonardo's canvas feature to take the generated image and manually paint out sections to expand it into a 16:9 format, thus achieving the desired aspect ratio.

  • What trick does the speaker use to get Midjourney V5 to generate images with specific emotions?

    -The speaker uses a photo bashing technique where they manually edit an image in Photoshop to depict the desired emotion and then use that as an image reference in the prompt to generate images with characters expressing specific emotions.

  • What is the final suggestion the speaker makes to viewers regarding Midjourney V5?

    -The speaker suggests that viewers should experiment with different styles and artists when prompting Midjourney V5, as the AI may respond better to some styles over others. They also encourage viewers to use any tools available, like Leonardo, to achieve their desired results.

  • Why does the speaker believe Midjourney V5 has difficulty applying certain styles to specific subjects?

    -The speaker believes that Midjourney V5 has difficulty applying certain styles to specific subjects because the AI has been trained on data where certain styles are strongly associated with certain subjects. When a style is requested that has no historical association with the subject, like Frank Miller's style on a Viking theme, the AI struggles to generate accurate results.
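Several answers above mention emphasizing a style term such as 'Style by Frank Miller'. The summary does not say which mechanism the host used, so the following is only a minimal sketch that assumes Midjourney's multi-prompt `::` weight syntax. The helper name `weighted_prompt` and the example weights are hypothetical.

```python
def weighted_prompt(parts):
    """Join (text, weight) pairs into a multi-prompt string.

    Assumption: each part is followed by '::' and its relative weight,
    e.g. 'style by Frank Miller::2'. Empty parts are skipped.
    """
    chunks = []
    for text, weight in parts:
        text = text.strip()
        if not text:
            continue
        # ':g' prints 2 as '2' and 1.5 as '1.5' (no trailing zeros)
        chunks.append(f"{text}::{weight:g}")
    return " ".join(chunks)


if __name__ == "__main__":
    prompt = weighted_prompt([
        ("style by Frank Miller", 2),  # hypothetical weight: emphasize the style
        ("full body illustration of a Viking girl in a foggy clearing", 1),
    ])
    print(prompt)
```

Giving the style its own weighted part is one way to keep it from being drowned out by a long subject description; the right weight is a matter of experimentation.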

Outlines

00:00

🎨 Enhancing Image Control in Midjourney V5

The speaker discusses the challenges of obtaining specific images using Midjourney V5 and shares tips on improving the output. Inspired by a comment from Eric Schlitzbeyer, the video addresses common issues such as instructions being ignored and the difficulty of achieving full-body images. The speaker also talks about the importance of prompt structure and provides a formula for creating more effective prompts. The initial attempt using Eric's prompt results in waist-up shots instead of the desired full-body image. The video then explores adjusting the prompt to emphasize certain aspects and get closer to the desired Frank Miller style, although this is not entirely successful. The process involves experimenting with different styles and references to guide the AI towards the desired outcome.

05:01

🖼️ Image Prompting and Style Application

The speaker continues the discussion on refining image prompts in Midjourney V5, focusing on style application and the use of image references. They mention the limitations of image prompts, such as being locked into the aspect ratio of the reference image. The speaker shares a trick to overcome this by using Leonardo's canvas feature to expand images into desired formats. They also discuss the difficulty of conveying specific emotions in generated images and demonstrate a method to add character emotion using a photobash technique. The video highlights the importance of combining various tools and techniques to achieve the desired results, emphasizing the need for experimentation and adaptation when working with AI image generation.
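As a rough illustration of the image-prompt step, the sketch below assembles a prompt in the order Midjourney expects for image prompts: image URL first, then the text, then any parameters. The function name `image_prompt` and the example URL are hypothetical; the summary does not show the host's actual prompt.

```python
def image_prompt(image_url: str, text: str, params: str = "") -> str:
    """Build an image prompt: reference image URL, then text, then parameters.

    Assumption: the reference image (for example, a photobashed edit)
    has already been uploaded somewhere and is reachable by URL.
    """
    image_url = image_url.strip()
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image_url must be an http(s) URL")
    pieces = [image_url, text.strip(), params.strip()]
    # Drop empty pieces so no double spaces appear in the result
    return " ".join(p for p in pieces if p)


if __name__ == "__main__":
    print(image_prompt(
        "https://example.com/photobash.png",  # hypothetical URL
        "illustration of a Viking girl screaming a war cry",
        "--ar 16:9",
    ))
```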

10:03

📐 Reshaping Images with Leonardo's Canvas Feature

The speaker concludes the video by addressing the issue of aspect ratio in generated images and how to reshape them using Leonardo's canvas feature. They demonstrate how to take an image that came out in a different ratio and expand it to a format like 16:9 without distorting it. The process involves using the canvas feature to paint out sections of the image and expand it to the desired format. The speaker also shares a personal anecdote about discovering a glitch in Leonardo when using reference images with low resolution. The video ends with a call to action for viewers to like, subscribe, and engage with the content, and a teaser for an upcoming video on cinematic prompting in Midjourney.
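To make the canvas expansion concrete, here is a small sketch of the arithmetic: given an image's pixel size, it computes how large the canvas must be to reach a target ratio without scaling or cropping the original. This only illustrates the calculation and does not call any Leonardo API; the function name `canvas_for_ratio` is hypothetical.

```python
import math
from fractions import Fraction


def canvas_for_ratio(width, height, ratio_w=16, ratio_h=9):
    """Return (new_width, new_height, extra_width, extra_height).

    The original image is kept at its native size; only empty canvas
    is added (to be painted in afterwards) until the target ratio is met.
    """
    target = Fraction(ratio_w, ratio_h)
    current = Fraction(width, height)
    if current < target:
        # Image is too narrow for the target ratio: widen the canvas
        new_w, new_h = math.ceil(height * target), height
    else:
        # Image is too wide (or already matches): add height if needed
        new_w, new_h = width, math.ceil(width / target)
    return new_w, new_h, new_w - width, new_h - height


if __name__ == "__main__":
    # A square 1024x1024 image needs 797 extra pixels of width for 16:9
    print(canvas_for_ratio(1024, 1024))  # (1821, 1024, 797, 0)
```

How the extra width is split between the left and right sides is a composition choice, not something the arithmetic decides.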

Keywords

Midjourney V5

Midjourney V5 is version 5 of Midjourney, a generative AI tool that creates images from textual prompts. In the context of the video, it is central to the discussion as the host explores how to improve the output of images by refining the prompts given to this AI system.

Prompting

Prompting, in the context of AI image generation, involves providing detailed textual instructions to the AI in order to guide the creation of a specific image. The video focuses on strategies for effective prompting to achieve desired results from the Midjourney V5 AI.

Image References

Image references are existing images or visual styles that a user may want the AI to emulate or draw inspiration from when generating a new image. The script discusses the challenges of getting the AI to incorporate specific image references into its output.

Frank Miller

Frank Miller is a renowned comic book artist and writer known for his distinctive style, particularly in works like 'Sin City' and 'The Dark Knight Returns.' In the video, the host attempts to get the AI to mimic Miller's style in generating images, highlighting the difficulties in achieving this due to the AI's training data.

Aspect Ratio

The aspect ratio refers to the proportional relationship between the width and the height of an image. The script mentions the challenges of getting the AI to produce images in a specific aspect ratio, such as 16:9, which is common for cinematic compositions.

Cinematic

Cinematic, in this context, refers to a style of image that resembles a still from a movie, often characterized by dramatic lighting and composition. The host discusses how the AI tends to produce cinematic-style images and how to adjust prompts to achieve different styles.

Viking Girl

The Viking girl is a character concept central to the example prompt provided by a viewer named Eric. The video is focused on generating an image of a Viking girl with specific attributes and in a particular style, which serves as a case study for discussing AI image generation techniques.

Emoting

Emoting refers to the portrayal of emotions in a character or image. The script discusses the difficulty of getting the AI to generate images that depict a character with a specific emotional expression, such as screaming a war cry.

Photobash

A photobash is a technique where a user manually edits or composites elements into an image to guide the AI towards a desired output. The host demonstrates this by pasting a stock image of a screaming character onto an AI-generated image to prompt a more emotive result.

Leonardo

Leonardo is an AI tool mentioned in the script that allows users to upload reference images and use them to guide the generation of new images. The host uses Leonardo to adjust the aspect ratio of an image and to experiment with different styles.

Style Prompt Formula

The style prompt formula is a structured way of constructing prompts for the AI to increase the likelihood of achieving a desired style in the generated image. The host discusses a specific formula and how it can be adapted to emphasize certain aspects of the desired output.
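A minimal sketch of such a formula as code, reading it as medium, style credit, scene, subject, action, and shot, followed by `--` parameters. The exact field boundaries are an assumption, since the summary quotes the formula only loosely, and the names `PromptParts` and `build_prompt` are hypothetical. `--ar` is Midjourney's aspect ratio parameter.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    medium: str   # e.g. "cinematic still" or "illustration"
    style: str    # e.g. "film by Akira Kurosawa" or "style by Frank Miller"
    scene: str
    subject: str
    action: str
    shot: str
    aspect_ratio: str = "16:9"


def build_prompt(p: PromptParts) -> str:
    """Join the non-empty fields in formula order and append --ar."""
    fields = [p.medium, p.style, p.scene, p.subject, p.action, p.shot]
    text = ", ".join(f.strip() for f in fields if f.strip())
    return f"{text} --ar {p.aspect_ratio}"


if __name__ == "__main__":
    print(build_prompt(PromptParts(
        medium="illustration",
        style="style by Frank Miller",
        scene="foggy forest clearing",
        subject="10-year-old Viking girl",
        action="raising a sword and screaming a war cry",
        shot="full body shot",
    )))
```

Keeping the fields separate makes it easy to swap one element at a time, for example changing only the style credit while the scene and subject stay fixed.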

Highlights

Midjourney V5's new prompting system leans more on natural language, although it still requires a programming-like approach.

Eric Schlitzbeyer's comment inspired the video, highlighting the challenge of getting specific images from Midjourney.

The initial output from Eric's prompt did not fully meet the request for a full-body image, instead producing waist-up shots.

The prompt logic in Midjourney V5 prioritizes elements placed earlier in the prompt sequence.

The use of a prompt formula can help achieve closer results to the desired image.

Frank Miller's unique style was not initially captured, suggesting the need to adjust the prompt strategy.

Increasing the weight of 'Frank Miller' in the prompt improved the style match in subsequent images.

Using an image prompt can lock the output to the aspect ratio of the reference image.

Leonardo's canvas feature can be used to adjust the aspect ratio and expand images to desired formats.

Midjourney struggles to apply a style to subjects it is rarely associated with in the training data, such as Frank Miller's style on a Viking theme.

Experimenting with different styles and artists, such as Mike Mignola or Ridley Scott, can yield unexpected but interesting results.

Photo bashing can be used as a trick to introduce specific emotions or actions that are challenging for Midjourney to generate.

Combining tools like Midjourney and Leonardo can lead to more refined and stylized outputs.

The video demonstrates the iterative process of refining prompts and using different strategies to achieve desired image outcomes.

Understanding the limitations of AI training data helps explain why specific styles or themes are hard to generate.

The video concludes with a call to action for viewers to like, subscribe, and engage with the content for channel growth.