I'm all for flashy in the academic sense, because we can let engineers sort out the practical aspects, especially by combining flashy academic approaches. The flaw in the LLM architecture can be predicted from the original paper; no amount of engineering can compensate for that.