Ah. I use Real-ESRGAN (or rather one of its descendants) as a first-pass upscaler before high-resolution diffusion. If you skip the diffusion step, of course it'll be faster.
Yeah, me too... I've been very negative about the edge; it got overhyped along with the romanticization of local LLMs, but a bunch of things are coming together at once: the Raspberry Pi 5... Mistral 7B Orca is my 20th attempt at a local LLM, and the first that handled simple conversation with RAG. And fresh diffusion output, even only every 2 hours, is a credible product, arguments about power consumption aside...
Apples to oranges: they're comparing 11 hours on a Raspberry Pi Zero to:
- 10 seconds on Intel i7-13700
- 3 seconds on Intel i9-9990XE
- 5 seconds on Ryzen 9 5900X
Additionally, the 2048×2048 output is achieved by upscaling 2× with Real-ESRGAN, which isn't close to the quality a native 2048 diffusion model would produce.
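To make the resolution math concrete: a 2× upscale to 2048×2048 implies the diffuser itself runs at 1024×1024 (an inference from the stated numbers, not confirmed by the project). Here's a toy nearest-neighbor 2× upscale standing in for Real-ESRGAN, which makes the same resolution jump but with a learned model rather than pixel duplication:

```python
import numpy as np

def upscale_2x_nearest(img: np.ndarray) -> np.ndarray:
    """Nearest-neighbor 2x upscale: duplicate every pixel along both
    spatial axes. A crude stand-in for Real-ESRGAN's learned 2x model,
    used here only to illustrate the shape change."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

# Hypothetical 1024x1024 RGB frame from the diffusion stage.
base = np.zeros((1024, 1024, 3), dtype=np.uint8)
hi_res = upscale_2x_nearest(base)
print(hi_res.shape)  # (2048, 2048, 3)
```

The point being: the extra 3 million "pixels" in the 2048 image carry no new diffusion-generated detail, only what the upscaler can hallucinate from the 1024 output.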
It does look interesting and is a real achievement: it's hard to write this stuff from scratch, much less in pure C++ without relying on a GPU.