I think a lot of DALL-E 2 outputs fall into the category of "extremely impressive that a neural network made this" and also "not quite up to the standards of a human expert". If you showed me an output and told me a machine made it, I'd be absolutely fascinated, but if you showed me the same image and told me a human drew it, I'd just scroll past without a second thought. Even so, there are some applications for which being able to generate a pretty okay image for a few cents is a great deal - I use it for things like D&D character portraits.
Of course, DALL-E 2 is not the end of text-to-image research - it'll be interesting to see where we are a year from now.
It creates better images than the vast majority of people could. And cheaply.
As you say, an expert can do far better.
But having something artistic created that far exceeds the average person's ability is gobsmackingly astonishing. And for quickly generating a variety of options, it is world class.