What are objective metrics for generated source code? "It compiles" is just the baseline. You could look at coupling and cyclomatic complexity to start. But optimizing those doesn't necessarily produce great code (though I realize that was never the goal).
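To make "objective metric" concrete, here is a rough sketch of how cyclomatic complexity could be scored automatically for Python source. It assumes the simple definition of 1 plus the number of decision points; the function name `cyclomatic_complexity` and the choice of which node types count as branches are my own assumptions, and real tools differ on those details.

```python
import ast

# Node types treated as decision points. This selection is an assumption
# made for illustration; established tools count some constructs differently.
_BRANCH_NODES = (
    ast.If,
    ast.IfExp,
    ast.For,
    ast.AsyncFor,
    ast.While,
    ast.ExceptHandler,
)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands and adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(n):
    if n < 0 and n % 2:
        return "negative odd"
    for i in range(n):
        if i == 3:
            return "has three"
    return "other"
'''

if __name__ == "__main__":
    print(cyclomatic_complexity(SAMPLE))
```

A score like this is cheap to compute and easy to optimize against, which is exactly why a low number on its own says little about whether the code is any good.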
That's a detail irrelevant to your argument and my counterargument. The point is that there's data beyond human-generated data available for training, so you can't conclude it will forever be restricted to human level.