It’s an interesting idea, and I have a feeling I’ve read somewhere about Google doing something similar. But these sorts of heuristics are a rabbit hole I’m trying not to go down at the moment: my ranking has far more serious problems right now, so I’m prioritizing those rather than getting distracted by whatever interesting algorithm I happen to bump into. (Definitely on my list to investigate in the future, though!)
It sounds like this same data is not only compressible but also carries no useful signal, so it could simply be filtered out.
Have you tried that? Are there articles/approaches that talk about it?