Part of the benefit of compiling bytecode (or anything) is specializing code to the context (types, values, etc.) in which it appears. While I don't doubt your analysis, it could be the case that the compiled C code in question is full of branches that could be folded away if specialized to the context of the query, such as the structure of the rows or the types and values of columns.
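To make that concrete, here's a toy sketch (made-up names, nothing to do with SQLite's actual VDBE code): the generic comparison has to re-test the column's type on every row, while a version specialized for a query where the column is known to be an integer has no branch left to take.

```c
#include <stdio.h>
#include <string.h>

typedef enum { COL_INT, COL_TEXT } ColType;

typedef struct {
    ColType type;          /* runtime type tag the generic path must test */
    long    ival;
    const char *sval;
} Value;

/* Generic handler: re-checks the column type on every call. */
static int compare_generic(const Value *a, const Value *b) {
    if (a->type == COL_INT)
        return (a->ival > b->ival) - (a->ival < b->ival);
    return strcmp(a->sval, b->sval);
}

/* Specialized for an integer column: the type branch is folded away. */
static int compare_int_col(const Value *a, const Value *b) {
    return (a->ival > b->ival) - (a->ival < b->ival);
}

int main(void) {
    Value x = { COL_INT, 3, NULL }, y = { COL_INT, 7, NULL };
    printf("%d %d\n", compare_generic(&x, &y), compare_int_col(&x, &y));
    return 0;
}
```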
Basically all of what you are saying about high-level bytecodes applies to dynamic languages, too. But they benefit greatly from specializing each bytecode for its static and dynamic context, and from shortcutting dataflow through local variables.
There's usually a cost to shuttling data between bytecodes too. When two are fused together the second can lift the data from wherever the first wanted to leave it, as opposed to routing through a fixed location. Might be what you mean by shortcutting dataflow?
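Roughly this, in toy form (the VM struct and opcode names here are invented for illustration): unfused, the result has to round-trip through a fixed slot in the VM's register file; fused, it can stay in a C local that the compiler will likely keep in a machine register.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct {
    int64_t reg[16];   /* the VM's fixed "mailbox" locations */
    int64_t mem[16];
} VM;

/* Unfused: OP_ADD parks its result in reg[dst], OP_STORE reloads it. */
static void op_add(VM *vm, int dst, int a, int b) {
    vm->reg[dst] = vm->reg[a] + vm->reg[b];
}
static void op_store(VM *vm, int addr, int src) {
    vm->mem[addr] = vm->reg[src];
}

/* Fused: the sum stays in a C local and never makes the round trip
 * through the VM's register file. */
static void op_add_store(VM *vm, int addr, int a, int b) {
    int64_t tmp = vm->reg[a] + vm->reg[b];
    vm->mem[addr] = tmp;
}

int main(void) {
    VM vm = { .reg = { [1] = 2, [2] = 3 } };
    op_add(&vm, 0, 1, 2);           /* unfused pair */
    op_store(&vm, 5, 0);
    op_add_store(&vm, 6, 1, 2);     /* fused equivalent */
    printf("%lld %lld\n", (long long)vm.mem[5], (long long)vm.mem[6]);
    return 0;
}
```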
Also, doing control flow in bytecode is usually slower than doing it in native code.
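A toy illustration of the difference (invented opcodes, not SQLite's): the bytecode branch has to go back through the interpreter's dispatch switch, an indirect branch plus pc bookkeeping on every step, while the native version of the same loop is a single conditional jump.

```c
#include <stdint.h>
#include <stdio.h>

enum { OP_JUMP, OP_JUMP_IF_ZERO, OP_DECR, OP_HALT };

typedef struct { uint8_t op; int arg; } Insn;

/* Interpreted loop: each iteration pays the dispatch switch just to
 * decide where control goes next. */
static int64_t run_bytecode(const Insn *code, int64_t x) {
    int pc = 0;
    for (;;) {
        const Insn *i = &code[pc++];
        switch (i->op) {
        case OP_JUMP:         pc = i->arg; break;
        case OP_JUMP_IF_ZERO: if (x == 0) pc = i->arg; break;
        case OP_DECR:         x--; break;
        case OP_HALT:         return x;
        }
    }
}

/* The same control flow compiled to native code is just a branch. */
static int64_t run_native(int64_t x) {
    while (x != 0) x--;
    return x;
}

int main(void) {
    const Insn prog[] = {
        { OP_JUMP_IF_ZERO, 3 },   /* 0: if x == 0 goto 3 */
        { OP_DECR,         0 },   /* 1: x--              */
        { OP_JUMP,         0 },   /* 2: goto 0           */
        { OP_HALT,         0 },   /* 3: done             */
    };
    printf("%lld %lld\n",
           (long long)run_bytecode(prog, 10),
           (long long)run_native(10));
    return 0;
}
```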
I wonder if the context in which the instructions occur is sufficiently finite in sqlite for ahead-of-time specialisation of the bytecode to be better. That is, the program you're operating on isn't known until JIT time, but the bytecode implementations are. SQL should correspond to an unusually specific set of operations relative to a general-purpose language implementation.
The compiler will notice that some values are constant when working with the bytecode. It can know ahead of time which constant arguments let it fold branches inside the bytecode instructions, and specialise accordingly. If that works, you've emitted a sequence of calls into opcodes which are less branchy than they would otherwise be, at which point the opcode implementations start to look like basic blocks and a template JIT to machine code beckons.
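Something like this, very roughly (hypothetical names, not SQLite's actual opcodes): variants of an opcode are compiled ahead of time with their internal type branch already folded, and the constant operands known at bytecode-compile or JIT time pick which variant gets called, leaving a flat sequence of calls into nearly branch-free handlers.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int  is_int;          /* runtime tag */
    long ival;
    const char *sval;
} Cell;

typedef long (*OpFn)(const Cell *cell);

/* Opcode variants compiled ahead of time, each with the internal
 * type branch already folded away. */
static long op_load_int(const Cell *c)     { return c->ival; }
static long op_load_text(const Cell *c)    { return atol(c->sval); }
static long op_load_generic(const Cell *c) { return c->is_int ? c->ival : atol(c->sval); }

/* At bytecode-compile / JIT time the schema supplies the constant
 * context, so we select the pre-folded variant per instruction. */
static OpFn specialize_load(char declared_affinity) {
    switch (declared_affinity) {
    case 'i': return op_load_int;
    case 't': return op_load_text;
    default:  return op_load_generic;
    }
}

int main(void) {
    Cell c = { 1, 42, NULL };
    OpFn load = specialize_load('i');    /* column declared INTEGER */
    printf("%ld\n", load(&c));
    return 0;
}
```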