Yes, but a 90% accurate model that's 10x faster than a 99% can be run 3x to achi...

deskamess · 2024-05-31T16:01:09 1717171269

Will that 90% model give you a more accurate answer after 3 tries?

CuriouslyC · 2024-05-31T17:47:49 1717177669

So far experiments say yes, with an asterisk. Taking ensembles of weak models and combining them has been shown to be able to produce arbitrarily strong predictors/generators, but there are still a lot of challenges in learning how to scale the techniques to large language models. Current results have shown that an ensemble of GPT3.5 level models can reach near state of the art by combining ~6-10 shots of the prompt, but the ensemble technique used was very rudimentary and I expect that much better results could be had with tuning.

torginus · 2024-05-31T21:11:21 1717189881

The problem with your premise, is that you don't necessarily know when said 90% accurate model produces the right output.

CuriouslyC · 2024-06-01T14:40:29 1717252829

You can think of multiple runs with the 90% model as fuzzy parity bits in an error correcting code, so you kind of can.