Runway has shouldered aside Midjourney and Stable Diffusion, debuting the first clips of text-to-video AI art that the company says are generated entirely from a text prompt.
The company said it is offering a waitlist to join what it calls "Gen 2" of its text-to-video AI, after offering a similar waitlist for its first, simpler text-to-video tools, which use a real-world scene as a model.
When AI art emerged last year, it used a text-to-image model: a user would enter a text prompt describing the scene, and the tool would attempt to create an image using what it knew of real-world "seeds," artistic styles, and so on. Services like Midjourney perform these tasks on a cloud server, while Stable Diffusion and Stable Horde use similar AI models running on home PCs.
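If you're curious what the local route looks like in practice, here is a minimal sketch of running Stable Diffusion on a home PC with Hugging Face's diffusers library. The checkpoint name and prompt are illustrative assumptions on our part, not tools Runway itself provides, and it assumes a CUDA-capable GPU.

```python
# Minimal sketch: local text-to-image generation with the diffusers library.
# The checkpoint and prompt below are illustrative assumptions only.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # a commonly used public Stable Diffusion checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # run on the local GPU rather than a cloud server

# The text prompt describes the scene; the model returns a single generated image.
image = pipe("a golden retriever sitting in a sunlit park, photorealistic").images[0]
image.save("golden_retriever.png")
```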
Text-to-video, however, is the next step. There are various ways of accomplishing this: Pollinations.ai has gathered a few models you can try out, one of which simply takes a few related scenes and constructs an animation stringing them together. Another creates a 3D model of an image and lets you zoom around it.
Runway takes a different approach. The company already offers AI-powered video tools: inpainting to remove objects from a video (as opposed to an image), AI-powered bokeh, transcripts and subtitles, and more. The first generation of its text-to-video tools let you compose a real-world scene, then use it as a model to overlay a text-generated video on top of it. This is more commonly done with still images, where you might take a photo of a Golden Retriever and use AI to transform it into a photo of a Doberman, for example.
That was Gen 1. Runway's Gen 2, as the company tweeted, can use existing images or videos as a base. But the technology can also auto-generate a short video clip from a text prompt and nothing more.
As Runway's tweet indicates, the clips are short (just a few seconds at most), awfully grainy, and suffer from a low frame rate. It's not clear when Runway will release the model for early or general access, either. But the examples on the Runway Gen 2 page do show all kinds of video prompts: pure text-to-video AI, text-plus-image to video, and so on. It appears that the more input you give the model, the better your luck. Applying a video "overlay" over an existing object or scene seemed to produce the smoothest video and the highest resolution.
Runway already offers a $12/mo "Standard" plan that allows for unlimited video projects. But certain tools, such as actually training your own portrait or animal generator, require an additional $10 fee. It's unclear what Runway will charge for its new model.
What Runway does demonstrate, however, is that in just a few short months we've moved from text-to-image AI art to text-to-video AI art… and all we can do is shake our heads in amazement.