I always ask myself the following pseudo-question: "for this generation/classification task, do I need to be more intelligent than an average high-school student?" For business tasks, the answer is almost always no. So I go with GPT3.5. It's much quicker and usually good enough to accomplish the task.
And then I need to run the task thousands of times, so API rate limits become the binding constraint; they are much higher for the GPT3.5 variants, whereas with GPT4 I have to be more careful about throttling/queueing requests.
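For the throttling/queueing part, a simple token-bucket limiter in front of the API client is usually enough. This is a minimal sketch; the RPM number is a made-up example, not an actual quota, and the client call is a placeholder:

```python
import time
import threading

class RateLimiter:
    """Token-bucket limiter to stay under a requests-per-minute cap."""
    def __init__(self, rpm: int):
        self.capacity = rpm
        self.tokens = float(rpm)      # start with a full bucket
        self.rate = rpm / 60.0        # tokens refilled per second
        self.last = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self) -> None:
        """Block until a request slot is available, then consume one token."""
        while True:
            with self.lock:
                now = time.monotonic()
                # Refill tokens based on elapsed time, capped at capacity.
                self.tokens = min(self.capacity,
                                  self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                wait = (1 - self.tokens) / self.rate
            time.sleep(wait)

limiter = RateLimiter(rpm=120)  # hypothetical cap
# call limiter.acquire() before each API request, e.g.:
# limiter.acquire(); response = client.chat.completions.create(...)
```

Worker threads all share one limiter, so bursts get smoothed out automatically instead of tripping 429 errors.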
I'm patiently waiting for a model efficient enough that I can self-host it alongside my applications with reasonably low server requirements; GPT3.5-level quality would be plenty. No need for GPT-5 for now: for business automations, the lower end of "intelligence" is more than enough, but efficiency/scaling is the real deal.
Do you mind sharing some tasks that you are solving with GPT 3.5? Be very concrete, if you don't mind. I am struggling to make it work for my business use cases (i.e. the ones where I am looking for "reliably helpful") and am very much looking for inspiration to define the limits. The hypothetical framing is interesting but doesn't do much for me on its own.
I second this. For some types of applications, GPT4 can quickly ramp up costs, especially with large contexts, and 3.5 often does the job just fine.
So for many applications it's the real competitor.
Too bad GPT3.5 Turbo is dirt cheap. Open-source models are substantially more expensive once you factor in operating costs. There is no mature ecosystem where you can just plug in a model and spin up robust infrastructure to run a local LLM at scale, aka you need infrastructure/ML engineers to do it, aka it's extremely expensive unless you are using LLMs at a very large scale.
Yet if you want to go cheaper, you totally can by paying for API access. GPT4 is accessible there and you get to use your own app. $20 will last you way longer than a month if you're not a heavy user.
It really depends on usage. If you need long conversations, or are sending huge texts, the all-you-can-eat $20 plan will almost certainly be cheaper than API access.
If you're doing lots of smaller one-shot stuff without much back and forth, the API will be cheaper.
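Some back-of-envelope math shows why the break-even swings so far in both directions. The per-token prices below are illustrative assumptions (roughly GPT-4-era list prices), not quoted rates, and the usage profiles are made up:

```python
# Assumed prices, dollars per token (illustrative, not official rates).
PRICE_IN = 0.03 / 1000   # input tokens
PRICE_OUT = 0.06 / 1000  # output tokens

def api_cost(in_tokens: int, out_tokens: int) -> float:
    """Pay-per-token cost of a single request."""
    return in_tokens * PRICE_IN + out_tokens * PRICE_OUT

# Light one-shot usage: 200 short requests per month.
light = 200 * api_cost(in_tokens=500, out_tokens=300)

# Heavy conversational usage: 1,000 requests where the whole chat
# history is resent each turn, so input context balloons.
heavy = 1000 * api_cost(in_tokens=6000, out_tokens=800)

print(f"light: ${light:.2f}/mo, heavy: ${heavy:.2f}/mo vs $20 flat")
# light: $6.60/mo, heavy: $228.00/mo vs $20 flat
```

The asymmetry comes from resent context: in a long conversation every turn pays again for all earlier tokens, which is exactly where the flat plan wins.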
Depends on whether they know how to use it. A lot of people still think it's just a Google and Wikipedia replacement, and it doesn't do anything super useful in that case.
GPT-3.5 (unfinetuned) has been matched by many open-weights (OW) models now, and with fine-tuning for a specific task (coding, customer care, health advice, etc.) they can exceed it.
It's still useful as a well-known baseline to compare against, since it's the model most people have experience with.