Exactly. You can even have several such entities in one chat.

One issue with this approach, especially in production, is latency. You’ve got to run the entire chat through one of the big models, curie or davinci, which is not only expensive at scale, but also slow.

Then again, if you just have one or two external tools, using those big models to decide which tool to call (if any) is overkill anyway. So you just fine-tune a smaller model on the task. That not only reduces costs by a factor of 100 or more, but also speeds up your pipeline considerably.
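
A rough sketch of what that routing step could look like, in Python. Everything named here is hypothetical: the two tools, the label names, and `fake_classifier`, which only stands in for the fine-tuned small model so the example runs without an API call. In a real pipeline `classify` would send the message to your fine-tuned model and return its label.

```python
from typing import Callable, Dict, Optional


# Hypothetical tools; the names and behaviour are illustrative only.
def calculator(query: str) -> str:
    return f"[calculator would handle: {query}]"


def web_search(query: str) -> str:
    return f"[web search would handle: {query}]"


TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": calculator,
    "search": web_search,
}


def route(message: str, classify: Callable[[str], str]) -> Optional[str]:
    """Ask the small model for a label, then call the matching tool, if any."""
    label = classify(message).strip().lower()
    tool = TOOLS.get(label)
    # Any label that is not a known tool (e.g. "none") means no tool call.
    return tool(message) if tool else None


def fake_classifier(message: str) -> str:
    """Stand-in for the fine-tuned small model (assumption, not a real API)."""
    if any(ch.isdigit() for ch in message):
        return "calculator"
    if message.lower().startswith(("who", "what", "when", "where")):
        return "search"
    return "none"


if __name__ == "__main__":
    for msg in ["What is 2 + 2?", "Who wrote Dune?", "Thanks, that helps!"]:
        print(repr(msg), "->", route(msg, fake_classifier))
```

The point of the design is that only the short routing decision goes through the small model. The big model is called only if you still need it for the final reply.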