The big feature here is function calling, as it's effectively a replacement for the "Tools" feature of Agents popularized by LangChain, except in theory much more efficient since it may not require an extra call to the API. Whereas LangChain selects Tools and parses their outputs through JSON/Markdown shenanigans (which often fail and cause ParsingErrors), this variant of ChatGPT appears to be fine-tuned for the task, so perhaps it'll be more reliable.
The slight price drop for ChatGPT inputs is of course welcome, since inputs are the bulk of the costs for longer conversations. A 4x context window at 2x the price is a good value too. The notes for the updated ChatGPT also say "more reliable steerability via the system message" which will also be huge if it works as advertised.
As they are accepting a JSON schema for the function calls, it is likely they are using token biasing based on the schema (using some kind of state machine that follows along with the tokens and only allows the next token to be a valid one given the grammar/schema). I have successfully implemented this for JSON Schema (limited subset) on llama.cpp. See also e.g. this implementation: https://github.com/1rgs/jsonformer
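For anyone who hasn't seen the technique, the core idea fits in a few lines. This is a toy sketch, not my llama.cpp implementation or jsonformer's actual code: the schema is boiled down to a single regex, and the partial-match feature of the third-party regex module stands in for the state machine that tracks valid prefixes.

  # Toy sketch of grammar-constrained sampling. Only candidate tokens whose
  # addition leaves the output a valid *prefix* of the schema survive the
  # mask; everything else effectively gets probability zero.
  import regex  # third-party "regex" module; supports partial matching

  SCHEMA = regex.compile(r'\{"unit": "(celsius|fahrenheit)"\}')

  def allowed(prefix: str, candidates: list[str]) -> list[str]:
      """Tokens the decoder is permitted to emit next."""
      return [t for t in candidates
              if SCHEMA.fullmatch(prefix + t, partial=True)]

  # After the model has produced '{"unit": "c', only continuations of
  # "celsius" are still allowed.
  print(allowed('{"unit": "c', ['elsius"', 'entigrade"', 'elsius', 'old"']))
  # -> ['elsius"', 'elsius']

A real implementation applies this mask to the logits at every decoding step, so the model can never emit a token that breaks the grammar.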
As someone also building constrained decoders against JSON [1], I was hopeful to see the same but I note the following from their documentation:
The model can choose to call a function; if so, the content will be a stringified JSON object adhering to your custom schema (note: the model may generate invalid JSON or hallucinate parameters).
So sadly, it is just fine-tuning; there's no hard biasing applied :(. So close, and yet so far, OpenAI!
Good point. Backtracking is certainly possible but it is probably tricky to parallelize at scale if you're trying to coalesce and slam through a bunch of concurrent (unrelated) requests with minimal pre-emption.
This is a really clever approach to tool use. I'll definitely be experimenting with this trick. Previously I had a grotesque cacophony of agents and JSON parsers; I think this will do a lot to help (both the process and my wallet).
Not sure how well it scales if you need to provide a function definition for every conceivable use case of 'external data' (a sketch of the full round trip follows the schema):

  "functions": [
    {
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA"
          },
          "unit": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"]
          }
        },
        "required": ["location"]
      }
    }
  ]
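For reference, the round trip for even this one function looks roughly like this. A sketch based on my reading of the docs, using the openai Python package; the weather lookup is just a local stub and the schema is repeated so it runs standalone:

  # Sketch of the new function-calling flow (not battle-tested;
  # assumes openai.api_key is already set).
  import json
  import openai

  functions = [{
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
          "type": "object",
          "properties": {
              "location": {"type": "string",
                           "description": "The city and state, e.g. San Francisco, CA"},
              "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
          },
          "required": ["location"],
      },
  }]

  def get_current_weather(location, unit="fahrenheit"):
      # local stub standing in for a real weather API
      return json.dumps({"location": location, "temperature": "72", "unit": unit})

  messages = [{"role": "user", "content": "What's the weather in San Francisco, CA?"}]
  response = openai.ChatCompletion.create(
      model="gpt-3.5-turbo-0613",
      messages=messages,
      functions=functions,
      function_call="auto",   # let the model decide whether to call the function
  )
  message = response["choices"][0]["message"]

  if message.get("function_call"):
      # arguments is a *stringified* JSON object, which is why it can still
      # be invalid or contain hallucinated parameters
      args = json.loads(message["function_call"]["arguments"])
      messages.append(message)
      messages.append({
          "role": "function",
          "name": message["function_call"]["name"],
          "content": get_current_weather(**args),
      })
      # second call lets the model turn the function result into a final answer
      final = openai.ChatCompletion.create(model="gpt-3.5-turbo-0613",
                                           messages=messages)
      print(final["choices"][0]["message"]["content"])

So every tool means another schema like the one above in the request, which is where my scaling concern comes from.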
There is also an alternative approach to running code with ChatGPT, the way Nekton (https://nekton.ai) does it: it uses ChatGPT to generate TypeScript code and then just runs it in the cloud.
In the end you get a similar result - AI-generated automation - but you have the option to review what the code will actually do before running it.
While using Auto-GPT I realized that for most use cases, a simple script would have suited my needs better (faster, cheaper, deterministic). Then I realized those scripts can (a) be written by GPT, and (b) call into GPT!
While developing a simpler LangChain alternative (https://github.com/minimaxir/simpleaichat) I discovered a neat trick for allowing ChatGPT to select tools from a list reliably: put the tools into a numbered list, and force the model to return only a single number by using the logit_bias parameter: https://github.com/minimaxir/simpleaichat/blob/main/PROMPTS....
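Roughly, it works like this (a simplified sketch, not the exact simpleaichat code; the tool names are made up): get the token ids for the digits with tiktoken, bias them to +100, and cap the response at one token so the model can only answer with a list index.

  # Sketch of the logit_bias tool-selection trick.
  import openai
  import tiktoken

  tools = ["search_web", "run_python", "lookup_weather"]

  enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
  # each digit "1".."9" is a single token in this encoding
  digit_ids = [enc.encode(str(i))[0] for i in range(1, len(tools) + 1)]

  system = ("Pick the best tool for the user's request. "
            "Reply with its number only.\n"
            + "\n".join(f"{i + 1}. {t}" for i, t in enumerate(tools)))

  resp = openai.ChatCompletion.create(
      model="gpt-3.5-turbo",
      messages=[{"role": "system", "content": system},
                {"role": "user", "content": "What's the weather in Paris?"}],
      max_tokens=1,                               # exactly one token back
      logit_bias={tid: 100 for tid in digit_ids}  # only the digits are allowed
  )
  choice = int(resp["choices"][0]["message"]["content"])
  selected_tool = tools[choice - 1]

Because the output is constrained to the digit tokens, there's nothing to parse and nothing that can come back malformed.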