AI-generated video is heating up. Earlier this month, Chinese company Kuaishou Technology launched Kling AI, a new text-to-video model that enters the market opened up by OpenAI’s Sora.
Kling AI, which accepts inputs in Mandarin, can create two-minute videos with a consistent 30 frames per second and a 1080p resolution. The use of 3D face and body reconstruction technology via 3D VAE also gives the model the capacity to convincingly model movement, as we explore below.
The model now joins a wave of models like Sora, Google Veo, Runway Gen-3 Alpha, and Haiper AI that enable users to create highly detailed videos in a matter of minutes.
Key Takeaways
- Kling AI, a new text-to-video model by Kuaishou Technology, creates two-minute, 1080p videos using inputs in Mandarin.
- It employs 3D face and body reconstruction for realistic animations, adding natural movements to static images.
- Kling AI is currently available as a demo on the Kwaiying app, requiring a Chinese phone number for access.
- It competes with models like OpenAI’s Sora, offering longer video durations but facing language and accessibility barriers.
- Kling AI could challenge Sora with its high-quality outputs and the extensive user base of Kuaishou’s video platform.
What We Know About Kling AI
At the moment, Kling AI hasn’t been released to the public. Instead, it is available to users via the Kwaiying (KwaiCut) app as a demo, but you need a Chinese phone number to sign up.
When it comes to overall capabilities, Kling can not only create videos of up to two minutes in length but also transform input images into five-second videos. A one-click feature can also extend the generated video by an extra 4.5 seconds.
This feature gives users a tool to add motion to static images and generates some interesting results. In one example, Kuaishou Technology showed how it could be used to animate Leonardo da Vinci’s Mona Lisa so that she puts on a pair of sunglasses.
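To put the stated durations in perspective, the short sketch below converts them into frame counts. It assumes the 30 frames per second quoted for Kling's text-to-video output also applies to image-to-video clips, which the company has not confirmed; the function name `frame_count` is our own.

```python
# Frame rate quoted for Kling AI's text-to-video output.
# Assumption: image-to-video clips use the same rate.
FPS = 30

def frame_count(seconds: float, fps: int = FPS) -> int:
    """Return the number of frames in a clip of the given length."""
    return round(seconds * fps)

two_minute_video = frame_count(120)       # maximum text-to-video length
image_to_video = frame_count(5)           # five-second image-to-video clip
extended_clip = frame_count(5 + 4.5)      # after one 4.5-second extension

print(two_minute_video, image_to_video, extended_clip)
```

Under that assumption, a full two-minute video contains 24 times as many frames as a single five-second clip.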
Under the hood, the model uses a range of technologies to help bring high-quality videos to life.
For instance, the use of a diffusion transformer architecture, plus 3D face and body reconstruction, allows the model to simulate many highly convincing animations that conform to real-world physics.
In practice, this helps add natural movements like singing and dancing to static images.
Can Kling AI Unseat Sora?
The biggest question surrounding Kling AI is whether or not it has what it takes to unseat Sora as the king of the AI video generation hype.
When Sora was announced at the start of this year, its impact was unprecedented, not just because of ChatGPT’s success but also because of its extremely high-quality outputs. This was in spite of the fact that the initial footage consisted only of OpenAI’s promotional demos, as the model had not been, and still has not been, released to the public.
Currently, Kling AI is an impressive model, but the fact that prompts need to be translated into Mandarin and that access requires a Chinese mobile number is a barrier.
In any case, there is some reason to think that Kling AI is well-placed to challenge Sora in the market. For a start, it can generate videos of up to two minutes in length, compared to Sora’s one minute.
Thomas Randall, director of AI market research at Info-Tech Research Group, told Techopedia:
“A major advantage here is that Kuaishou hosts a very large video platform with millions of users uploading their content; this is a huge data pool for Kling AI to feed on.”
Randall also notes that Kuaishou has confirmed it uses publicly available data from the Internet for model training. In contrast, many commentators allege that OpenAI has trained Sora on copyrighted material from YouTube.
That being said, Randall doesn’t believe OpenAI should be too concerned just yet.
Randall added:
“There is reason to think that OpenAI will not be overly worried with Kling AI’s release, especially with audiences outside of China. First, Kling AI works best when prompts are made in Mandarin.
“Second, the results have only produced five-second videos so far. This is a long way from Kling AI’s claim that it can generate longer videos (up to two minutes) compared to Sora’s one-minute limit.”
While a two-minute time limit is a plus, Randall says it’s “not stark enough” to market as a competitive differentiator, and there’s always the possibility that OpenAI could extend Sora’s time limit in the future. But what about image quality?
Kling AI vs Sora’s Videos: Head to Head
Both Kling AI and Sora can produce videos that look extremely realistic. We believe that Kling AI produces images that look more real, while Sora’s photorealistic style generates more vibrant content that nonetheless has a more synthetic quality to it.
Arguably, some of the best realistic outputs produced by Kling so far have been its Noodle is the only food on earth, a girl running in a dark tunnel, and POV of a bicycle rider videos.
If we look at OpenAI, arguably its most impressive examples have been the lady in red, Tokyo in the snow, and puppies playing in the snow. While we think the lady in red is the best AI-generated video clip in terms of style, Kling AI’s examples appear less synthetic.
We were particularly impressed with Kling’s ability to depict motion, especially in the noodles and girl running in the dark tunnel examples.
While OpenAI has also produced some extremely high-quality work in this area, some unnatural movement was noticeable in the walking in the ‘lady in red’ clip and the physics in the puppies clip.
Other Examples of Kling AI in Action
Besides the examples listed above, there are plenty of other examples of Kling AI in action. Some of our favorite examples are as follows:
- Godzilla vs King Kong — This clip demonstrates how Kling AI can be used to create movie trailers, and features Godzilla and King Kong confronting each other in dense city environments.
- Mad Max Beer Commercial — Another example that caught our attention was an AI-generated video of a Mad Max-inspired beer commercial. The clip, allegedly made in one hour, showcases Kling’s storytelling capabilities.
Less than 48 hours ago, Sora competitor Kling dropped.
People are already getting access and creating wild AI videos.
1. MadMax Beer commercial made in 1 hour pic.twitter.com/CyKm2aI0It
— Min Choi (@minchoi) June 8, 2024
- Animated Distracted Boyfriend Meme — One user shared an example of the model animating the distracted boyfriend meme. Despite some unnatural motion, the result still appears very high quality.
She wished the two of them well. A perfect ending. #KLING @Kling_ai pic.twitter.com/1CZnMwPf9S
— KellyV (@Kellyv_ai) June 21, 2024
- Squirrel Eating a Strawberry — This example features an AI-generated clip of a squirrel eating a strawberry, which appears very natural.
Now I see any picture that I want Kling AI to generate a video. pic.twitter.com/uI4DdSd9m1
— Zane (@ZaneGallery) June 22, 2024
- Elon Musk Head Turn — In this example, a user has animated a still image of Tesla CEO, Elon Musk, where the model has generated the side of his head to enable further movement.
Kling @Kling_ai now supports image to video pic.twitter.com/i2LpTOMx2T
— el.cine (@EHuanglu) June 21, 2024
- Otter Splashing into Water — Another example shows a short clip of an otter splashing into water, with realistic motion being showcased in both the water splash and the otter’s whiskers.
10. An adorable otter with a happy atmosphere surrounded by splashing water and floating twinkle twinkle little stars. pic.twitter.com/3kgVUNwr3J
— Min Choi (@minchoi) June 8, 2024
The AI Video Generation Market: Where Kling AI Fits In
As it stands, Grand View Research valued the global AI video generator market at $554.9 million in 2023 and predicts it will grow at a compound annual growth rate of 19.9% from 2024 to 2030.
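As a rough illustration of what that growth rate implies, the sketch below compounds the 2023 valuation forward year by year. It assumes the 19.9% rate applies to each of the seven years from 2024 through 2030, starting from the 2023 figure; the function name `project_market_size` is our own, and the result is our back-of-the-envelope estimate rather than the research firm's published forecast.

```python
def project_market_size(base_value: float, cagr: float, years: int) -> float:
    """Compound a starting value at a fixed annual growth rate."""
    return base_value * (1 + cagr) ** years

BASE_2023_USD_M = 554.9   # market size in 2023, in millions of US dollars
CAGR = 0.199              # 19.9% compound annual growth rate

# Assumption: growth compounds once per year for 2024 through 2030 (7 years).
for year in range(2024, 2031):
    size = project_market_size(BASE_2023_USD_M, CAGR, year - 2023)
    print(f"{year}: ${size:,.1f}M")
```

On these assumptions, the market would be a little under $2 billion by 2030, roughly three and a half times its 2023 size.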
Several providers currently offer text-to-video models, including OpenAI Sora, Google Veo, Runway Gen-3 Alpha, Luma AI, Haiper AI, and Kling AI.
It’s worth noting that Sora, Veo, Gen-3 Alpha, and Kling AI have yet to be widely released to the public, but these are still the models attracting the most attention in the market.
The Bottom Line
Kling AI has emerged seemingly from nowhere and presents a Chinese alternative to models that have largely been created in the U.S., such as Sora, Veo, and Gen-3 Alpha.
Given that Kuaishou Technology’s daily active users are expected to increase to 400 million by the end of 2024, the company has the potential to be a force to be reckoned with in the market, particularly given its excellent videos so far.
For a model that hasn’t been widely released yet, Kling AI has lots of potential to gain ground in the market. Its two-minute video length and high-quality motion have made it stand out in a space that is becoming more crowded by the minute.