
Yes, but a 90% accurate model that's 10x faster than a 99% model can be run 3x and still come out ahead on cost, and, for most things, on accuracy too, assuming roughly independent errors and some way to pick the right run (a verifier, or voting over the samples): best-of-3 fails only when all three runs fail, i.e. 0.1^3 = 0.001. For the math to be in the big model's favor there would need to be problems it could solve >90% of the time where the smaller model was below 50%, since voting over repeated samples only helps when per-sample accuracy is above chance.
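
To sketch that arithmetic (illustrative only; assumes independent runs, and best-of-k additionally assumes you have a verifier that can spot a correct run; the function names are mine, not from anywhere):

    from math import comb

    def majority_vote_accuracy(p: float, k: int) -> float:
        # Probability that a strict majority of k independent runs,
        # each correct with probability p, lands on the right answer.
        # Optimistically assumes wrong runs never agree with each other.
        need = k // 2 + 1
        return sum(comb(k, i) * p**i * (1 - p)**(k - i)
                   for i in range(need, k + 1))

    def best_of_k_accuracy(p: float, k: int) -> float:
        # Probability at least one of k runs is correct -- only usable
        # if something (a verifier, a test suite) can pick out that run.
        return 1 - (1 - p) ** k

    print(majority_vote_accuracy(0.9, 3))  # 0.972 -- below the 99% model
    print(best_of_k_accuracy(0.9, 3))      # 0.999 -- above it, with a verifier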



Will that 90% model give you a more accurate answer after 3 tries?


So far experiments say yes, with an asterisk. Ensembling weak models has been shown to produce arbitrarily strong predictors/generators (the classic weak-learnability/boosting result), but there are still a lot of challenges in scaling those techniques to large language models. Current results show that an ensemble of GPT-3.5-level models can get near state of the art by combining ~6-10 samples of the same prompt, but the ensemble technique used was very rudimentary, and I expect much better results could be had with tuning.
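
For concreteness, the rudimentary version of that ensembling is just self-consistency-style majority voting over k samples of the same prompt. A minimal sketch, where sample_answer stands in for whatever nondeterministic model call you're using (purely hypothetical, not any particular API):

    from collections import Counter
    from typing import Callable

    def majority_answer(sample_answer: Callable[[str], str],
                        prompt: str, k: int = 8) -> str:
        # Draw k independent samples (temperature > 0) and return the
        # most common answer. Assumes answers are normalized enough
        # that semantically equal answers compare string-equal.
        answers = [sample_answer(prompt).strip().lower() for _ in range(k)]
        winner, _count = Counter(answers).most_common(1)[0]
        return winner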


The problem with your premise is that you don't necessarily know when that 90% accurate model has produced the right output.


You can think of multiple runs of the 90% model as fuzzy parity bits in an error-correcting code, so you kind of can.
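
A toy version of that analogy: treat each run as one symbol of a k-repetition code with a 10% flip rate and majority-decode (assumes independent errors and a binary answer; the numbers below come from the math, not from any model):

    import random

    def noisy_run(truth: int, error_rate: float = 0.1) -> int:
        # One "model run": the true bit, flipped with probability error_rate.
        return truth ^ (random.random() < error_rate)

    def decode(truth: int, k: int = 3) -> int:
        # Majority-decode k independent runs, like a k-repetition code.
        votes = sum(noisy_run(truth) for _ in range(k))
        return int(votes > k / 2)

    trials = 100_000
    errors = sum(decode(1) != 1 for _ in range(trials))
    print(errors / trials)  # ~0.028 for k=3, vs 0.10 for a single run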





