The Craziest Faceswap I've Seen Yet / Midjourney's Future & Two New AI Video Platforms!

Theoretically Media
25 Apr 2024 · 10:38

TLDR

The video discusses advances in face-swapping technology, showcasing a clip from AI Katana that demonstrates highly realistic face tracking. The host also explores Midjourney's 12-month roadmap, which hints at 3D scene generation and world simulation, and covers Synthesia's new expressive avatar model, which allows for more emotive AI avatars. Two new AI video platforms in beta are introduced: Morph Studios, which offers a node-based UI for creating animated-style videos, and Nim Video, which offers style and character options, lip sync, and motion control. The host also demonstrates a new Midjourney feature called 'style random', which randomizes the style of generated images for creative and potentially useful outcomes.

Takeaways

  • 🤯 The AI face-swapping technology showcased is highly advanced, with realistic tracking even during complex facial movements like eating or touching the face.
  • 🔍 The video suggests that the face-swapping might not be in real-time, as there are still some inconsistencies and morphing issues present.
  • 📈 AI avatars from Synthesia have taken a leap forward with the introduction of the Expressive model, which can convey a range of emotions.
  • 🎭 The new Synthesia avatars do not require personal recording; instead, pre-trained avatars are used, which aligns with the capture rig's capabilities.
  • 🌟 Midjourney's roadmap for the next 12 months focuses on video, 3D, and real-time generation, combined into a non-interactive world simulator, with an interaction layer to be added afterward.
  • 🔮 Speculation suggests that Midjourney's 3D feature might allow for full 360° control over generated scenes, moving beyond image generation.
  • 👩‍💼 Media Molecule co-founder Alex Evans has joined Midjourney, indicating a significant push into 3D capabilities, leveraging his experience with 3D creation engines.
  • 📈 Midjourney has released a new feature called 'Style Random' that randomizes the style of generated images, offering both fun and utility.
  • 🎨 The 'Style Random' feature is particularly useful for discovering new styles that can be applied to future prompts, providing creative flexibility.
  • 🚀 Morph Studios and Nim Video are two new AI video generators in beta, offering animated looks, lip-sync, and unique editing features.
  • 📚 Morph Studios' interface is based on a node structure, allowing for complex style rerolls and connections between different video segments.

Q & A

  • What is the main advancement in face swapping technology discussed in the script?

    -The main advancement discussed is the high level of realism and convincing tracking in the face swap technology provided by AI Katana. It is particularly impressive in how it tracks the subject's face while eating and tugging on cheeks.

  • What is the language used by the person in the face swap video?

    -The person in the face swap video is speaking either Mandarin or Cantonese, but the exact language is not specified in the script.

  • What is the speculation about the real-time capabilities of the face swapping technology?

    -There is speculation that the face swapping technology shown is not running in real-time. The video capture is likely processed through face swapping software after being recorded due to the high quality of the results.

  • What is the new feature in Midjourney called?

    -The new feature in Midjourney is called 'style random', which randomizes the style of the generated images.

  • What does the 'style random' feature in Midjourney allow users to do?

    -The 'style random' feature allows users to generate images with completely different styles, offering a wide range of creative possibilities and making it a useful tool for discovering new styles.

  • What is the future direction of Midjourney for the next 12 months?

    -The future direction of Midjourney for the next 12 months is focused on video, 3D, real-time, and bringing these elements together to create a non-interactive world simulator with an added interaction layer.

  • What is the role of Alex Evans in Midjourney?

    -Alex Evans, one of the co-founders of Media Molecule, has joined Midjourney as a principal research engineer, contributing to the development of 3D features in the platform.

  • What is the significance of the 'orb' in Midjourney?

    -The 'orb' is described as a device that could generate and manage thousands of 3D rooms, indicating that Midjourney is serious about integrating advanced 3D capabilities into their platform.

  • What are the two new AI video generators mentioned in the script?

    -The two new AI video generators mentioned are Morph Studios and Nim Video, both of which are currently in beta and offer features like lip sync, character consistency, and various style options.

  • What is the unique aspect of Morph Studios' user interface?

    -Morph Studios' user interface features a node-based structure that allows for a unique workflow where users can prompt different styles and connect aspects of those styles to subsequent shots or nodes.

  • What is the main appeal of Nim Video's platform?

    -Nim Video's platform is appealing due to its consistent character feature, camera motion options, sound and lip sync capabilities, as well as additional features like image to video conversion, video restyling, upscaling, and motion control.

  • How does the Express-1 model from Synthesia differ from previous AI avatars?

    -The Express-1 model from Synthesia introduces AI avatars that can express emotions, making them more emotive and engaging than previous models.

Outlines

00:00

😲 Advanced Face Swapping and AI Avatars

The video introduces a highly advanced face-swapping technology by AI Katana, which convincingly tracks facial movements even while the subject is eating or tugging on their cheeks. The presenter compares it with Snapchat filters and suggests it is a significant leap over previous face-swapping technology. He also notes the speculation that the technology might not be running in real time and shares a translation of the AI Katana video. Synthesia's new Express-1 model for AI avatars is highlighted, which can express emotions and aligns lips with speech more precisely. The section also touches on Midjourney's 12-month roadmap, which indicates a shift toward video, 3D, and real-time generation in a non-interactive world simulator, with plans to add an interactive layer.

05:01

🚀 Midjourney's 3D and Style Random Feature

The video discusses the next steps for Midjourney, focusing on 3D and real-time aspects. The presenter suggests that Midjourney will move from generating images to creating scenes with full 360° camera control. He also talks about the 'orb' device, which could manage thousands of 3D rooms, and the hiring of Ahmad, a key figure behind the Apple Vision Pro, indicating a serious approach to hardware development. The video then introduces a new Midjourney feature called 'style random', which randomizes the style of generated images, offering both fun and utility. The presenter demonstrates how the feature can produce diverse styles and how a discovered style can be reused in later image generations.

10:02

🎬 New AI Video Generators: Morph Studios and Nim Video

The video concludes with a look at two new AI video generators. Morph Studios is highlighted for its beta release, which offers an animated look and allows for character image uploads for consistency. The user interface is noted for its node-based structure, which enables different styles to be connected for varied outputs. Nim Video is also introduced, another beta tool that provides options for style, character, camera motion, sound, and lip sync. Additional features include image to video conversion, video restyling, upscaling, and layer-based editing. The presenter expresses excitement about trying these tools and shares a link for viewers to sign up for the Nim Video beta.

Keywords

💡Face Swapping

Face swapping is a technology that allows the digital replacement of a person's face in a video or image with another person's face. In the video, it is mentioned as having taken a significant leap with AI Katana, where the face swap appears convincingly as the person speaks and eats, showcasing the technology's advancement.

💡AI Avatars

AI avatars are digital representations of a person that can be controlled or directed by AI algorithms. The video discusses the next generation of AI avatars from Synthesia, which are capable of displaying emotions, making them more lifelike and engaging for various applications.

💡Midjourney

Midjourney is an AI image-generation platform. The video discusses its 12-month roadmap, which points to a shift toward video, 3D, and real-time technologies, a significant change of direction for the company.

💡3D World Simulator

A 3D world simulator is software that can generate and display three-dimensional environments. The video suggests that Midjourney is moving toward a non-interactive world simulator with full 360° camera control over generated scenes, with an interaction layer to be added later for more immersive experiences.

💡Deepfake

Deepfake refers to synthetic media in which a person's likeness is replaced with another's using AI. The video discusses the quality of deepfakes, noting that while they appear realistic, there are still inconsistencies, especially when it comes to real-time applications.

💡Synthesia

Synthesia is a company that creates AI avatars for video. In the video, it is highlighted for its new Express-1 model, which allows for more emotive avatars and more precise lip-syncing, and which provides pre-trained avatars for users to employ.

💡Style Random

Style Random is a feature released by Midjourney that randomizes the style of generated images. It is initially perceived as a fun tool, but the video demonstrates its utility in discovering new styles that can be applied to subsequent image generations, adding a layer of creativity and unpredictability.

💡Morph Studios

Morph Studios is an AI video generator in beta that focuses on creating animated looks with character images for consistent styles. The video mentions its unique node-based UI, which allows for a different workflow in creating and connecting different styles and shots.

💡Nim Video

Nim Video is another AI video generator in beta, offering features like style and character customization, camera motion, sound, and lip-sync. It is noted for its interesting workspace and capabilities, including image to video conversion and video restyling.

💡Media Molecule

Media Molecule is a developer known for creating the 3D creation engine 'Dreams' for PlayStation. The video mentions Alex Evans, a co-founder of Media Molecule, joining Midjourney as a principal research engineer, which signifies a move towards more advanced 3D capabilities.

💡Orb

The Orb is described as a device that could potentially generate and manage thousands of 3D rooms. It is mentioned in the context of Midjourney's serious commitment to 3D technology, with the hiring of Ahmad, a key figure behind the Apple Vision Pro, to lead their hardware division.

Highlights

AI face swapping technology has made significant advancements, with a new example from AI Katana showcasing impressive tracking and realism.

The face swap is speculated to be a post-processed video capture rather than a real-time effect.

AI Katana's model is claimed to have advantages over current face swapping technology.

Synthesia introduces Express-1, a model for AI avatars that can mimic human emotions.

Express-1 uses pre-trained avatars, eliminating the need for users to record themselves.

Midjourney's 12-month roadmap hints at a shift towards video, 3D, and real-time technology integration.

Midjourney's potential 3D feature may allow for 360° camera control over generated scenes.

Alex Evans, co-founder of Media Molecule, has joined Midjourney as a principal research engineer, bringing expertise in 3D creation engines.

The Orb, a device speculated to manage thousands of 3D rooms, is being taken seriously by Midjourney with the hiring of Ahmad, a key figure behind the Apple Vision Pro.

Midjourney has released a new feature called 'style random' which randomizes the style of generated images.

The 'style random' feature is not only fun but also useful for discovering new styles and applying them to future images.

Morph Studios, currently in beta, offers a node-based UI for creating animated-style AI videos with lip sync and sound features.

Nim Video, another AI video generator in beta, provides options for style, character, and camera motion, as well as features like image to video conversion and upscaler.

Nim Video's platform will utilize open source models, and interested users can sign up for the beta.

The host offers a free course on getting started with Midjourney as part of a beginner's course for Semrush.

The transcript discusses the potential and current state of AI in video generation, highlighting the rapid advancements in the field.

AI-generated content is becoming more realistic and expressive, with applications in various fields including entertainment and education.

The future of AI video platforms is expected to include more interactive and immersive 3D experiences.