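True photo-grade output! Midjourney V6 alpha test video: MJ V6 vs. DALL-E 3, which is better? Using the style raw parameter to produce photo-level images in MJ, an MJ V6 text generation test, and a walkthrough of style raw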

氪學家
1 Jan 2024 09:16

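TLDR: This video looks closely at how the Midjourney (MJ) V6 alpha version competes with DALL-E 3 in AI image generation. MJ V6 is a major-version update that arrives after half a year of technical development. Compared with V5.2, it improves prompt following, model knowledge, image prompting and blending, and adds a minor text-drawing ability. In particular, V6 is optimized for realistic images: adding the style raw parameter produces richly detailed images that come close to real photographs. V6 also shows progress in text generation and can produce images whose text has a handwritten feel. DALL-E 3 is slightly ahead in semantic recognition and spelling accuracy, but MJ V6 is ahead in realism and in the richness of elements in the image. The video closes by encouraging viewers to try MJ V6 themselves.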

Takeaways

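  • 🎉 The MJ tutorial series is updated with an introduction to the MJ V6 alpha version, the first major-version update since V5.2.
  • 🚀 The past six months brought a steady stream of breakthroughs and new products in AI image generation, including DALL-E 3 and the SDXL Turbo model.
  • 📈 Compared with V5.2, MJ V6 improves prompt following, consistency, image prompting and blending, and text drawing.
  • ❌ V6 does not yet support pan (directional outpainting), zoom (zoom-out outpainting), or region-based inpainting.
  • 🔍 V6 is more sensitive to prompt wording. The official guidance is to avoid prompt words such as photorealistic, 4K, and 8K, and to use the style raw parameter instead.
  • 🖼️ With the style raw parameter, V6 images are markedly more realistic than V5.2 images.
  • 📝 V6 has improved text generation and can render text inside images more accurately.
  • 🆚 In realistic image generation, MJ V6 handles detail better than DALL-E 3.
  • 📈 V6 is optimized for realistic images, and adding the style raw parameter gives the output a stronger photographic feel.
  • 📚 The official documentation gives detailed information on V6 styles and prompting, which helps users understand and use the new features.
  • 🌟 The video encourages viewers to try MJ V6 themselves to see the new features and improvements.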
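The prompt guidance in the takeaways (drop words like photorealistic, 4K, and 8K, and rely on style raw instead) can be sketched as a small prompt-cleaning helper. This is a minimal illustration, not an official tool: the function name `clean_prompt_for_v6` and the `JUNK_WORDS` list are assumptions made for this example, and the list only contains the three words named in the video. The `--style raw` flag itself is the Midjourney parameter the video refers to.

```python
import re

# Illustrative list only: these are the words the video names as ones to avoid.
JUNK_WORDS = ["photorealistic", "4k", "8k"]


def clean_prompt_for_v6(prompt: str) -> str:
    """Remove junk quality words and append --style raw (hypothetical helper)."""
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in JUNK_WORDS) + r")\b"
    text = re.sub(pattern, "", prompt, flags=re.IGNORECASE)
    # Collapse extra spaces and drop comma-separated pieces that became empty.
    parts = [" ".join(piece.split()) for piece in text.split(",")]
    text = ", ".join(piece for piece in parts if piece)
    # Only add the flag if the prompt does not already carry it.
    if "--style raw" not in text:
        text = f"{text} --style raw"
    return text


print(clean_prompt_for_v6("portrait of an old fisherman, photorealistic, 4K, 8K"))
```

The helper only edits the prompt string; the result would still be pasted into Midjourney by hand.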

Q & A

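  • What are the notable updates and improvements in Midjourney V6 compared with V5.2?

    -Compared with V5.2, the main updates in Midjourney V6 are more accurate prompt following, support for longer prompts, improved consistency and model knowledge, better image prompting and blending, a minor text-drawing ability, and improved upscalers with subtle and creative modes.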

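  • Why does a prompt word like photorealistic give poor results in Midjourney V6?

    -Midjourney V6 is more sensitive to prompt wording. The official guidance explicitly treats words like photorealistic as junk prompt words in V6: they no longer produce the intended effect and can even backfire, for example by generating dense, trypophobia-inducing images.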

  • How should parameters be set in Midjourney V6 to generate more realistic images?

    -To generate more realistic images in Midjourney V6, use the style raw parameter or set a lower stylize value.

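  • What progress has Midjourney V6 made in text generation?

    -Midjourney V6 can render text that is placed inside double quotes in the prompt. In the tests, it drew text better than V5.2, and the generated text had a handwritten feel with richer surrounding elements.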

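  • How do DALL-E 3 and Midjourney V6 differ in realistic image generation?

    -In realistic image generation, Midjourney V6 is ahead in clothing, skin texture, hair detail, and depth of field. DALL-E 3 has stronger semantic recognition but falls short of Midjourney V6 in realism.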

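  • When generating text with Midjourney V6, how should the text be placed in the prompt?

    -To generate an image that contains text in Midjourney V6, put the text inside English (straight) double quotes.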

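  • Which features does Midjourney V6 not support during the alpha test?

    -During the alpha test, Midjourney V6 does not support pan (directional outpainting), zoom (zoom-out outpainting), region-based inpainting, the Style Tuner, or reverse prompt lookup from an image.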

  • What is the default resolution of images generated by Midjourney V6?

    -Midjourney V6 generates images at 1024x1024 by default. After upscaling, the resolution can reach 2048x2048.

  • How do you switch to the V6 model for image generation in Midjourney?

    -Type a slash (/) in the input box, choose settings, and then select the MJ model V6 alpha option in the dialog that appears.

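  • What update notes for Midjourney V6 were posted in the official community?

    -The update notes posted in the official community cover more accurate prompt following, support for longer prompts, improved consistency and model knowledge, better image prompting and blending, a minor text-drawing ability, and improved upscalers.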

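  • How does Midjourney V6 compare with DALL-E 3 in text generation accuracy?

    -The images Midjourney V6 generated in the test had no spelling problems, but DALL-E 3 is slightly ahead in spelling accuracy overall. On the other hand, DALL-E 3's output is relatively plain, while Midjourney V6's output contains richer elements.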

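  • What effect does adding the style raw parameter have when generating images with Midjourney V6?

    -In Midjourney V6, adding the style raw parameter helps produce more realistic images. It reduces the styling that MJ applies on its own, so the output stays closer to what the prompt describes.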
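The answers above on quoting text and on style raw can be combined into one prompt string. The sketch below is only an illustration under stated assumptions: the helper name `build_v6_prompt` and the phrase `with the text` are made up for this example, while `--v 6`, `--style raw`, and `--stylize` are Midjourney parameters. Whether a given wording yields clean lettering is something to check in Midjourney itself.

```python
from typing import Optional


def build_v6_prompt(description: str,
                    text: Optional[str] = None,
                    raw: bool = True,
                    stylize: Optional[int] = None) -> str:
    """Assemble a Midjourney V6 prompt string (hypothetical helper)."""
    prompt = description.strip()
    if text is not None:
        if '"' in text:
            raise ValueError("text must not contain double quotes")
        # The video says text to be drawn goes inside straight double quotes.
        prompt = f'{prompt} with the text "{text}"'
    flags = ["--v 6"]
    if raw:
        flags.append("--style raw")
    if stylize is not None:
        # A lower stylize value is the alternative the video mentions.
        flags.append(f"--stylize {stylize}")
    return f"{prompt} {' '.join(flags)}"


print(build_v6_prompt("a yellow sticky note on a monitor", text="hello world"))
```

Keeping the description, the quoted text, and the flags as separate arguments makes it easy to rerun the same description with and without style raw when comparing results.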

Outlines

00:00

📈 Introduction to MJ V6 and AI Art Development

The video begins with a greeting and an acknowledgment of the long gap since the last update of the MJ tutorial series. The presenter introduces MJ's new V6 release, an alpha version that is a major-version update from V5.2. The video then reviews AI image generation advances over the past six months, including DALL-E 3 from OpenAI, the SDXL and SDXL Turbo models from Stability AI, and Adobe's Firefly update, and notes that MJ's long update cycle may have cost it paying users. The presenter shows how to switch to the V6 model in Discord and suggests checking the official MJ community for update notes. The new features in V6 are outlined: improved prompt following, better consistency and model knowledge, enhanced image prompting and blending, and a minor text-drawing ability. The section also covers V6's greater sensitivity to prompt wording and the advice to stop using prompt words such as 'photorealistic'.

05:02

🎨 Testing MJ V6's Realism and Text Generation Capabilities

The presenter compares MJ V6 with V5.2 using the same prompt to show the improved realism of V6, especially with the 'style raw' parameter. Side-by-side images from both versions highlight the finer detail and greater realism of V6. The 'style raw' parameter is explained as a way to reduce the styling MJ applies on its own and give the prompt more control over the output, which is particularly useful for realistic images. The presenter also compares MJ V6 with DALL-E 3 using similar prompts, noting that DALL-E 3 has stronger semantic understanding while MJ V6's images are more realistic in clothing, skin texture, hair detail, and depth of field. The video then tests MJ V6's ability to draw text inside images, which improves on V5.2 with more accurate output and richer elements. The presenter closes by encouraging viewers to try MJ V6 and thanking the audience.

Keywords

Midjourney V6

Midjourney V6 is the version of Midjourney's AI image generation model covered in the video, a major-version update from V5.2. The V6 alpha version brings more accurate prompt following, longer prompt support, and higher image quality. It is the central subject of the video's tutorial and comparisons.

DALL-E 3

DALL-E 3 is an AI model developed by OpenAI, known for its powerful semantic recognition capabilities. It is mentioned in the video as a competitor to Midjourney V6, and the video aims to compare the two in terms of their ability to generate realistic images. DALL-E 3 is used as a benchmark to evaluate the improvements in Midjourney's V6 version.

Style raw

'Style raw' is a Midjourney parameter that reduces the styling MJ applies on its own, which in V6 helps produce more photorealistic images. It is a key concept in the video: the tests show that it makes generated images noticeably more realistic than the default settings, with results that closely resemble real photographs.

Photorealistic

Photorealistic refers to the quality of an image that closely resembles a real photograph. The video explains that V6 is more sensitive to prompt wording and that the official guidance treats 'photorealistic' as a junk prompt word to avoid. The recommended way to get realistic results is the 'style raw' parameter or a lower stylize value. Realism is the main criterion in the video's comparison between AI models.

Semantic recognition

Semantic recognition is the ability of an AI to understand the meaning of words and phrases in context. It is highlighted as a strong point of DALL-E 3, which allows it to generate images that closely follow the prompts given to it. The video compares the semantic recognition capabilities of DALL-E 3 with those of Midjourney V6.

Consistency and knowledge

Consistency and knowledge in the context of AI models refer to the ability of the AI to maintain a coherent theme and draw upon a broad base of knowledge when generating images. The video mentions that Midjourney V6 has improved in these areas, which contributes to the quality and believability of the generated images.

Text generation

Text generation within the scope of the video pertains to the AI's ability to include text within the generated images. The video demonstrates how Midjourney V6 can generate images with text, such as 'hello world' on a sticky note, showcasing the AI's capability to understand and render text accurately.

WebUI, ComfyUI and Fooocus

WebUI, ComfyUI and Fooocus are user interfaces for running AI image generation models such as Stable Diffusion (SD). They are mentioned in the video as part of the recent advances in AI image generation, illustrating the range of tools available to improve image quality and efficiency.

Adobe Firefly

Adobe Firefly is Adobe's AI technology that has been updated to the second generation. It is noted in the video as one of the competitors in the AI art space, leveraging the extensive user base of Photoshop to establish a presence in the market. It represents the broader industry's engagement with AI image generation.

AI painting circle

The AI painting circle refers to the community and industry surrounding AI-driven image creation. The video discusses the rapid advancements and product launches within this circle over the past six months, emphasizing the fast-paced evolution of AI art technology.

Discord

Discord, in the context of the video, is a platform where users can access and interact with the Midjourney AI service. It is used as a demonstration space to show how to switch to the latest V6 model and to discuss community updates, indicating its role as a hub for user engagement and software access.

Highlights

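The Midjourney V6 alpha version has been released. It is a major-version update from V5.2 and arrives half a year after the previous update.

The past six months brought multiple technical breakthroughs and new products in AI image generation, such as DALL-E 3 and the SDXL Turbo model.

Midjourney V6 updates its feature set, including more accurate prompt following and support for longer prompts.

V6 improves consistency and model knowledge, and strengthens image prompting and blending.

V6 modestly improves text drawing, for example writing 'hello world' on a sticky note.

V6 improves the upscalers, with subtle and creative modes. The default resolution of generated images is 1024x1024.

The V6 alpha does not yet support features such as pan (directional outpainting) and zoom (zoom-out outpainting).

Prompting in V6 differs from V5: V6 is more sensitive to prompt wording, so old prompt words such as photorealistic should be avoided.

The official recommendation is to use the style raw parameter or a lower stylize value for a more realistic style.

In the test cases, V6 with the style raw parameter produced images far more realistic than V5.2.

V6 is optimized for realistic images, with detail that comes close to real photographs.

V6 makes clear progress in text generation and can draw English text more accurately.

MJ V6 and DALL-E 3 were compared on realistic image generation, and V6 came out ahead on detail.

MJ V6's text drawing shows richer elements and a handwritten feel.

V6 is a clear step up in image quality from V5.2 and is worth trying first-hand.

The video's author encourages viewers to try MJ V6 themselves.

The video's author promises more AI image generation tutorials in the future.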