It’s hard to write about Sora without feeling like your mind is melting. But after OpenAI’s surprise artificial intelligence announcement yesterday afternoon, we have our best evidence so far of what an as-yet unregulated, consequence-free tech industry wants to sell you: a suite of energy-hungry black box AI products capable of producing photorealistic media that pushes the boundaries of legality, privacy, and objective reality.
Barring decisive, thoughtful, and comprehensive regulation, the online landscape could very well become virtually unrecognizable and, somehow, even more untrustworthy than ever before. Once the understandable “wow” factor of hyperreal woolly mammoths and paper-art oceanscapes wears off, CEO Sam Altman’s newest distortion project remains concerning.
Introducing Sora, our text-to-video model.
Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. https://t.co/7j2JN27M3W
Prompt: “Beautiful, snowy… pic.twitter.com/ruTEWn87vf
— OpenAI (@OpenAI) February 15, 2024
The concept behind Sora (Japanese for “sky”) is nothing particularly new: It is apparently an AI program capable of generating high-definition video based solely on a user’s descriptive text inputs. To put it simply, Sora reportedly combines the text-to-image diffusion model powering DALL-E with a neural network system known as a transformer. Transformers are generally used to parse massive data sequences such as text, but OpenAI allegedly adapted the technology to handle video frames in a similar fashion.
“Apparently,” “reportedly,” “allegedly.” All these caveats are required when describing Sora, because as MIT Technology Review explains, OpenAI only granted access to yesterday’s example clips after media outlets agreed to wait until after the company’s official announcement to “seek the opinion of outside experts.” And even when OpenAI did…