Upstream QEMU now has TCG plugins (https://qemu.readthedocs.io/en/latest/devel/tcg-plugins.html), which allow for a degree of instrumentation. The implementation is architecture-agnostic and is tested within the code base. There are still features missing, but it does provide a base for dynamic analysis of guest code.
The plugins have access to the instruction stream, so they can make architecture-specific decisions. What I meant by architecture-independent is that it doesn't require per-guest annotations in the frontends: any guest using the common translator loop (which is all of them now) can be instrumented by plugins.
However, I absolutely agree it's not currently as full-featured as we would like. The next step, when I get time, is refactoring the handling of register values in the core QEMU code so we can expose them to the plugins through a clean API.
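
To make the split between the common translator loop and the plugin a bit more concrete, here is a toy model in plain C. It is not QEMU's plugin API: every name in it (`toy_insn`, `toy_translate_block`, `count_x86_ret`, `toy_demo`) is made up for illustration, and the real interface is the one described in the documentation linked above. The only point is that the loop never looks at the instruction bytes, while the callback is free to interpret them for one particular guest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one translated guest instruction.
 * This is NOT a QEMU type, just a toy for illustration. */
typedef struct {
    uint64_t vaddr;        /* guest virtual address */
    const uint8_t *bytes;  /* raw instruction bytes */
    size_t len;            /* instruction length in bytes */
} toy_insn;

/* Hypothetical callback type a "plugin" would register. */
typedef void (*toy_insn_cb)(const toy_insn *insn, void *udata);

/* "Common translator loop": it knows nothing about the guest ISA and
 * simply hands every instruction in the block to the callback. */
static void toy_translate_block(const toy_insn *insns, size_t n,
                                toy_insn_cb cb, void *udata)
{
    assert(cb != NULL);
    for (size_t i = 0; i < n; i++) {
        cb(&insns[i], udata);
    }
}

/* "Plugin" side: an architecture-specific decision made from the raw
 * bytes. Here we count single-byte x86 near-return opcodes (0xC3). */
static void count_x86_ret(const toy_insn *insn, void *udata)
{
    size_t *count = udata;
    if (insn->len == 1 && insn->bytes[0] == 0xC3) {
        (*count)++;
    }
}

/* Feed a tiny hand-assembled x86 block through the loop and return
 * how many RET instructions the callback saw. */
size_t toy_demo(void)
{
    static const uint8_t nop[] = { 0x90 };        /* nop */
    static const uint8_t mov[] = { 0x89, 0xD8 };  /* mov eax, ebx */
    static const uint8_t ret[] = { 0xC3 };        /* ret */
    const toy_insn block[] = {
        { 0x1000, nop, sizeof nop },
        { 0x1001, mov, sizeof mov },
        { 0x1003, ret, sizeof ret },
    };
    size_t rets = 0;

    toy_translate_block(block, sizeof block / sizeof block[0],
                        count_x86_ret, &rets);
    return rets;
}
```

Calling `toy_demo()` from a `main` of your own runs the three-instruction block through the loop. Supporting a different guest in this model would mean swapping the callback, not touching `toy_translate_block`, which is the property I was describing as architecture-independent.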