OpenAI unveils powerful, creepy new text-to-video generator that it calls 'a foundation for models that can understand and simulate the real world'

The generative AI company behind ChatGPT and DALL-E has a new toy: Sora, a text-to-video model that can (sometimes) generate pretty convincing 60-second clips from prompts like “a stylish woman walks down a Tokyo street…” and “a movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet…”

A lot of the AI video generation we’ve seen so far fails to sustain a consistent reality, redesigning faces and clothing and objects from one frame to the next. Sora, however, “understands not only what the user has asked for in the prompt, but also how those things exist in the physical world,” says OpenAI in its announcement post (using the word “understands” loosely).


The Sora clips are impressive. If I weren’t looking closely—say, I was just scrolling past them on social media—I’d probably think many of them were real. The prompt “a Chinese Lunar New Year celebration video with Chinese Dragon” looks at first like typical documentary footage of a parade. But then you realize that the people are oddly proportioned, and seem to be stumbling—it’s like the moment in a dream when you suddenly notice that everything is a little bit wrong. Creepy.

“The current model has weaknesses,” writes OpenAI. “It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark. The model may also confuse spatial details of a prompt, for example, mixing up left and right, and may struggle with precise descriptions of events that take place over time, like following a specific camera trajectory.”

My favorite demonstration of Sora’s weaknesses is a video in which a plastic chair begins morphing into a Cronenberg lifeform. Behold:

[Embedded video: Sora's morphing plastic chair]

Sora is not widely available yet, and OpenAI says it’s assessing social risks of the model and working on mitigating them, for instance with “a detection classifier that can tell when a video was generated by Sora.”

It’s fascinating as a research project, but OpenAI isn’t just interested in doing cool computer science. If it can outmaneuver copyright critics and legislators, it’s here to make bank. The company says it’s currently “granting access [to Sora] to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals.”

One commenter on X optimistically wondered if models like Sora will one day allow the public to wrest control of filmmaking away from Hollywood by making movies purely with prompts—but I wonder where the source material for all this generated video will come from if not, you know, filmmakers? Big Hollywood movies may already look pretty homogenous, but auto-reproducing Marvel Cinematic Universe-style CGI and car commercial drone shots isn’t exactly bringing creative expression to the masses. (The blog post notably doesn’t mention Sora’s training material.)


Despite the often clumsy results of current generative AI models and the legal and ethical quagmire they present, we’re already seeing them used in professional creative media. That includes videogames, both in ways that are directly visible to us, like generating art, voices, and on-the-fly dialogue, and in ways that are less obvious, like producing code snippets or early concept art. A recent survey found that 31% of game development professionals use generative AI in some capacity. Combined with other software, I wonder what this kind of machine learning-driven video simulation could do besides generate slightly-off CG-like clips?

I don’t think anyone really knows how generative AI will be used in five or ten years or what the consequences of continued development will be, but it isn’t slowing down, so it appears we’ll find out. OpenAI and other companies are explicitly working not just toward better image and video and text generators, but toward “artificial general intelligence” or AGI—as in, the science fiction idea of what AI is.

“Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI,” says OpenAI.
