I'll make a deal with you: I'll agree to turn off speculative execution in hardware if
1) you can solve the VLIW scheduling problem, and
2) you transition the entire computing ecosystem to a JIT model.
These are the things we would need to claw back the performance we would lose through disabling speculation. You can look at speculation (I'm handwaving a lot here, bear with me) as the processor hardware dynamically recompiling your program code depending on observed behavior of the program. That's well and good and it gets us a huge performance boost.
You can, in principle, do the same thing in pure software. But every single attempt to do so has ended in total algorithmic failure.
The last serious attempt I'm aware of to "drive" a uOP scheduler explicitly in software was Itanium, and that failed, in part, because compilers couldn't take advantage of the processor's instruction-level parallelism. There's nothing in mathematics or computer science that forbids a magical compiler of the sort the Itanium people wanted to create. But nobody's made one. Your first task in your project of eliminating speculative execution is to solve this algorithmic problem.
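To make problem #1 a bit more concrete, here is a toy sketch of static scheduling: a greedy list scheduler that packs operations into fixed-width bundles. Everything in it (the `Op` type, `schedule`, the op names, the latencies) is made up for illustration and is not how any real Itanium compiler worked. It only shows one commonly cited difficulty: the compiler has to commit to latencies it can only guess at.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    deps: tuple = ()   # names of ops whose results this op reads
    latency: int = 1   # cycles the compiler ASSUMES the op takes (hypothetical numbers)


def schedule(ops, width=2):
    """Greedy list scheduling: each cycle, issue up to `width` ready ops.

    Assumes the dependency graph is acyclic and every dep names an op in `ops`.
    An empty bundle means the machine stalls for that cycle.
    """
    done_at = {}            # op name -> cycle at which its result becomes available
    pending = list(ops)
    bundles = []
    cycle = 0
    while pending:
        ready = [op for op in pending
                 if all(d in done_at and done_at[d] <= cycle for d in op.deps)]
        issued = ready[:width]
        for op in issued:
            done_at[op.name] = cycle + op.latency
            pending.remove(op)
        bundles.append([op.name for op in issued])
        cycle += 1
    return bundles


def example(load_latency):
    ops = [
        Op("load_a", latency=load_latency),
        Op("load_b", latency=load_latency),
        Op("add", deps=("load_a", "load_b")),
        Op("mul", deps=("add",)),
        Op("inc_i"),
    ]
    return schedule(ops, width=2)


fast_guess = example(load_latency=1)   # compiler assumes the loads hit in cache
slow_guess = example(load_latency=3)   # compiler assumes the loads are slow
print(fast_guess)
print(slow_guess)
```

The two schedules differ only because of the latency the "compiler" assumed for the loads. Whichever one you bake into the binary is wrong some of the time, whereas an out-of-order core just waits for the load that actually missed and keeps issuing whatever else is ready.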
But solving problem #1, while necessary, is insufficient. No static ahead-of-time compiler can adjust the compiled code depending on the actual execution history of the program. To really get back to parity with speculative execution, you have to give your already-magical compiler the ability to recompile code at runtime. That means turning everything into a JIT. Your /bin/ls would actually be LLVM bitcode, not machine code, and some runtime system would be responsible for dynamically generating the machine code and adjusting it depending on execution history. After all, that's what current superscalar CPUs do internally, transparently, all the time. This is problem #2.
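A rough sketch of what "adjusting it depending on execution history" means for problem #2, shrunk down to a few lines. This is not a real JIT: it swaps Python callables instead of generating machine code, and `AdaptiveFunction`, `threshold`, and the specialization table are all hypothetical names I'm inventing here. The shape is the part that matters: profile, specialize on what you observed, guard the assumption, and throw the specialization away when the guard fails.

```python
from collections import Counter


class AdaptiveFunction:
    """Toy runtime: profiles argument types, then installs a guarded fast path."""

    def __init__(self, generic, specializations, threshold=3):
        self.generic = generic                  # always-correct slow path
        self.specializations = specializations  # type -> fast path valid only for that type
        self.threshold = threshold              # calls to observe before specializing
        self.profile = Counter()
        self.hot_type = None
        self.fast = None
        self.deopts = 0

    def __call__(self, x):
        if self.fast is not None:
            if type(x) is self.hot_type:        # guard: is the assumption still true?
                return self.fast(x)
            # Guard failed: discard the specialized code and start profiling again.
            self.deopts += 1
            self.fast = self.hot_type = None
            self.profile.clear()
        self.profile[type(x)] += 1
        if sum(self.profile.values()) >= self.threshold:
            hot, _ = self.profile.most_common(1)[0]
            if hot in self.specializations:
                self.hot_type, self.fast = hot, self.specializations[hot]
        return self.generic(x)


double = AdaptiveFunction(
    generic=lambda x: x + x,                    # works for ints, strings, lists, ...
    specializations={int: lambda x: x << 1},    # only correct for ints
)

results = [double(v) for v in (1, 2, 3, 4, "ab")]
print(results, double.deopts)
```

The first three calls run the generic path while the profile fills up, the fourth takes the int-only fast path, and the string trips the guard and forces a fallback. A branch predictor plus speculative core does the hardware analogue of this on every branch, with the misprediction flush playing the role of the deopt.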
Honestly, I think the world we'd create by solving both these problems would be a better world. I really don't like how we can't program the processor's speculation engine and uOP scheduler. I'd love to be able to do that.
But I don't think we can get there from where we are, so we're going to be stuck with speculation and hardware mitigation forever. Please, prove me wrong.
> Honestly, I think the world we'd create by solving both these problems would be a better world. I really don't like how we can't program the processor's speculation engine and uOP scheduler. I'd love to be able to do that.
Have you started working on any solutions for 1 and 2?