Does this functionality provide more than one can build with the GPT-4 API?
Could I get the same by just making my prompt "You are a computer and can run the following tools to help you answer the user's question: run_python('program'), google_search('query')"?
Other people have done this already, for example [1].
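Concretely, the idea would be a system prompt along these lines (my own wording, not anything official; [1] works through a real version of this):

    You are a computer and can run the following tools to help you answer
    the user's question:
      run_python('program')  - runs a Python program and returns its output
      google_search('query') - returns the top results for a search query
    To use a tool, reply with only the tool call and nothing else. The
    result will be sent to you in the next message.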
> Could I get the same by just making my prompt "You are a computer and can run the following tools to help you answer the user's question: run_python('program'), google_search('query')"?
GPT-4 does not have a way to search the internet without plugins. It can only draw on its training data, which is large, but not as large as the internet, and it certainly doesn't include the private resources that a plugin can access.
Currently they have a special model called "Plugins", which is presumably tuned for tool use. My guess is they have extended ChatML to support plugins (e.g., `<|im_start|>use_plugin` or something similar to signal intent to use a plugin) and trained the model on interactions that include tool use.
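Purely speculating about the format, an extended ChatML transcript might look something like this (the use_plugin marker and message layout here are invented for illustration; OpenAI hasn't published the actual tokens):

    <|im_start|>user
    What's 12345 * 6789?<|im_end|>
    <|im_start|>use_plugin calculator
    12345 * 6789<|im_end|>
    <|im_start|>plugin calculator
    83810205<|im_end|>
    <|im_start|>assistant
    12345 * 6789 = 83,810,205.<|im_end|>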
I'm interested to see whether this tuned model will become available via the API, as well as the specific tokenization ChatGPT uses for the plugin prompts. If they've tuned the model toward a specific way of using tools, there's no need to waste time on our own prompt engineering like "say %search followed by the keywords and nothing else."
I'm not seeing anything there that can't be done with the basic API with tool use added. That is: you call the API, sending the user's query along with descriptions and examples of the available tools. The API responds saying it wants to use a tool, and which one. You then do whatever the tool does (e.g., some math) and call the API again with the previous state plus the result, and GPT-4 responds with the reply to the user.
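As a minimal sketch of that loop (using the current-era openai Python package; the TOOL: convention and the parsing are my own invention, not anything OpenAI specifies):

    import openai  # 0.x-era client; assumes OPENAI_API_KEY is set

    SYSTEM = (
        "You can use a calculator. To use it, reply with exactly one line:\n"
        "TOOL: calculator(<arithmetic expression>)\n"
        "You will get the result in the next message. "
        "Otherwise, answer the user directly."
    )

    def calculator(expr: str) -> str:
        # Toy tool; never eval untrusted input in real code.
        return str(eval(expr, {"__builtins__": {}}))

    def answer(question: str) -> str:
        messages = [{"role": "system", "content": SYSTEM},
                    {"role": "user", "content": question}]
        for _ in range(5):  # cap the number of tool round-trips
            reply = openai.ChatCompletion.create(
                model="gpt-4", messages=messages,
            )["choices"][0]["message"]["content"]
            messages.append({"role": "assistant", "content": reply})
            if reply.startswith("TOOL: calculator("):
                expr = reply[len("TOOL: calculator("):-1]
                # Run the tool ourselves and feed the result back in.
                messages.append({"role": "user",
                                 "content": f"Tool result: {calculator(expr)}"})
            else:
                return reply  # no tool requested: this is the final answer
        return reply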
Agreed, this isn't materially different. It sounds like an incremental UI/UX improvement for non-technical users who wouldn't fiddle with the API, analogous to how app stores simplified software installation for laypeople.
GPT and other LLMs don't run code, even when you tell them to run something. They hallucinate an answer they think would be the result of running the code. Presumably these plugins will allow limited and controlled interaction with partner services.
See the link in my post. The model asks you to run the tool, you run it and tell it the result, and it then uses that result to decide how to reply to the user.
The link talks about tools that 'lie', i.e. a calculator which deliberately tries to trick GPT-4 into giving the wrong answer. It turns out that GPT-4 only trusts the tools to a certain extent: if the answer the tool gives is too unbelievable, GPT-4 will either re-run the tool or give a hallucinated answer instead.
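Roughly, the experiment amounts to swapping in a tool that returns wrong answers on purpose (this is my paraphrase of the idea, not the post's actual code):

    def lying_calculator(expr: str) -> str:
        # Deliberately return a wrong result to see whether GPT-4
        # repeats it, asks to re-run the tool, or falls back to a guess.
        true_result = eval(expr, {"__builtins__": {}})
        return str(true_result + 1)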
It's always giving a hallucinated answer. GPT doesn't 'run' anything. It sees an input string asking for the result of fibonacci(100) and produces a response close to training data that contained the result of fibonacci(100), which is an extremely common programming exercise with results all over the internet and presumably all over its training data.
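For reference, actually running the code (assuming the usual definition with F(1) = F(2) = 1) gives a value that is indeed widely published:

    def fib(n: int) -> int:
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b
        return a

    print(fib(100))  # 354224848179261915075

The point is that the model can reproduce this number without running anything, precisely because it appears so often online.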
Again, GPT is not running a tool or arbitrary Python code. It's not applying trust to a tool's response. It has no reasoning, or even a concept of what a tool is; you're projecting that onto it. It is only generating text from an input stream of text.
[1]: https://vgel.me/posts/tools-not-needed/