Reminder that "This is the worst version of the AI video creation we'll have from here on out."
As we mock the new Will Smith Eating Spaghetti video, every update is an improvement on where we're at and a new, higher baseline of the worst functionality.
Doesn't mean it'll get any better though. Most technologies and algorithms have an end point. People thought flying cars were inevitable and imminent because after all, cars got so much better and feature-rich within a very short timespan. And then innovation practically stopped because there's a limit to possible energy storage and utilisation unless some magical new innovation comes along. But it hasn't.
These generated videos are exactly the same nonsense as the spaghet videos; they've just got more data crammed in there. But nothing about them is consistent.
I suspect that this is a compute tradeoff they decided to make. It uses much less compute than Sora, so it's feasible to scale up for the public, but it's that much less coherent. If you look at OpenAI's technical report for Sora, they show examples with "base compute" and "4x compute". This output looks like roughly "2x" to me.
I wonder if they are really talking about the model size.
That's the version they released as an "open" model. Their SD3 model behind an API works just fine; the released medium model has basically been lobotomized to hell and back in the name of "safety", because apparently human anatomy is icky and nipples are the devil. Unfortunately, that sort of heavy-handed model lobotomizing leads to nasty side effects that turn even innocent renders of people into Eldritch nightmares, as the model has been forced to forget that a human doesn't have limbs growing out of random places.