Tool use with the Claude 3 model family

Claude 3 models now feature "tool use," also known as function calling, which lets them call external tools that are described to the model by JSON schemas.
This enables models like Haiku to perform actions such as fetching web pages or running code in a sandboxed Python environment.
Advanced models like Opus can orchestrate multiple smaller models (sub-agents) in parallel to tackle complex, large-scale tasks by dispatching prompt templates and aggregating results.

Claude 3 models can use tools (function calling) to go beyond text generation by calling external functions.
Tools are described to the model using a JSON schema, which details the tool's capabilities and the arguments it accepts.
Models can make calls to these tools during generation, and a client then dispatches the tool call and returns its results to the model.
In the demo, Haiku is given two tools: a fetch web page tool for retrieving information from the internet and a sandboxed Python REPL tool for running code.
Advanced models like Opus can orchestrate "sub-agents" (other models) using a dispatch sub-agents tool to parallelize work.
To parallelize, the orchestrating model writes a prompt template and provides a list of arguments; each sub-agent receives the template filled with a respective argument.
All answers from the sub-agents are returned to the orchestrating model, which then processes and synthesizes the final result.
This approach combines the intelligence of advanced models (e.g., Opus) with the speed and affordability of smaller models (e.g., Haiku) to efficiently process large amounts of information at scale.
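
The tool description and client-side dispatch steps above can be sketched roughly as follows. This is a minimal, offline illustration and not the demo's actual code: the tool name `fetch_web_page`, the stub implementation, the `TOOL_HANDLERS` registry, and the `dispatch_tool_call` helper are all hypothetical. The field names (`input_schema`, `tool_use`, `tool_result`) follow the general shape of Anthropic's tool use format, but treat the details as assumptions and check the official documentation before relying on them.

```python
import json

# Hypothetical tool description. The JSON schema under "input_schema"
# tells the model which arguments the tool accepts.
FETCH_WEB_PAGE_TOOL = {
    "name": "fetch_web_page",
    "description": "Retrieve the text content of a web page.",
    "input_schema": {
        "type": "object",
        "properties": {
            "url": {"type": "string", "description": "Address of the page to fetch."}
        },
        "required": ["url"],
    },
}


def fetch_web_page(url: str) -> str:
    # Stand-in implementation so the sketch runs without network access.
    return f"<contents of {url}>"


# Client-side registry mapping tool names to local functions (an assumption).
TOOL_HANDLERS = {"fetch_web_page": fetch_web_page}


def dispatch_tool_call(tool_call: dict) -> dict:
    """Run the tool the model asked for and package the result for the model."""
    handler = TOOL_HANDLERS.get(tool_call["name"])
    if handler is None:
        return {
            "type": "tool_result",
            "tool_use_id": tool_call["id"],
            "content": f"Unknown tool: {tool_call['name']}",
            "is_error": True,
        }
    output = handler(**tool_call["input"])
    return {"type": "tool_result", "tool_use_id": tool_call["id"], "content": output}


# A tool call as the model might emit it during generation (made-up values).
example_call = {
    "type": "tool_use",
    "id": "call_1",
    "name": "fetch_web_page",
    "input": {"url": "https://example.com"},
}
result = dispatch_tool_call(example_call)
print(json.dumps(result, indent=2))
```

In a real client, the returned `tool_result` would be sent back to the model in the next turn so that generation can continue with the tool's output in context.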

tool use: A feature that allows large language models to interact with external functions or services beyond just generating text.
function calling: An alternative term for tool use, specifically referring to the model's ability to call pre-defined functions.
JSON schema: A standard format for describing the structure and validation rules for JSON data, used here to define the capabilities and parameters of a tool.
model generation: The process where a large language model produces output, which can include text, code, or a call to an external tool.
dispatch: In the context of tool use, to send off or execute a tool call and then return its results to the calling model.
sandboxed Python REPL: A secure, isolated environment for interactively executing Python code, often used by models to run and test code.
sub-agents: Secondary language models or processes that are managed and directed by a primary, more advanced model to perform specific, often parallelized, tasks.
prompt template: A pre-defined structure or pattern for a language model input, which is filled with specific arguments or information to generate a complete prompt.
parallelize: To divide a task into multiple independent sub-tasks that can be executed simultaneously, often to significantly improve processing speed and efficiency.

One of the exciting new features of the Claude 3 model family is tool use, also known as function calling. Tools that Claude can use are represented by a JSON schema that tells the model about the capabilities of the tool and the arguments it accepts. During generation, the model can make a call to any of its tools, which the client can then dispatch, returning the results to the model.
For example, this Haiku model, which is our fastest and most affordable model, has access to a fetch web page tool and a sandboxed Python REPL tool, so it can retrieve information from the internet and run code. We're going to use it to retrieve an implementation of quicksort, one of the most popular sorting algorithms, and check how fast it runs on a sample input. Now, because Haiku is pretty fast, I've actually slowed down this demo by 5x so that we can see the tokens being generated. You can see that Haiku is able to link together several different tools to accomplish a task.
Now, things get even more interesting when models can call other models as tools. For example, let's say I want to find the fastest implementation of quicksort online. Here, I'm asking Opus, our most advanced model, to find 100 permissively licensed quicksort implementations on GitHub. Then, 100 Haiku models write tests to determine how fast each implementation is, and then we'll be able to determine which is the quickest quicksort.
While we let this run, here's how it works under the hood. We've given Opus a dispatch sub-agents tool to parallelize this work, where it can write a prompt template and provide a list of arguments. The Haiku sub-agents each get the template filled in with their respective argument. Then, all of the answers get returned to Opus, which returns the fastest implementation. And here we see the fastest result, and it has some additional optimizations that some of the other implementations don't have.
Tool use with sub-agents is a great way to combine the intelligence of Opus and the speed and affordability of Haiku to take action on large amounts of information at scale. Hope you try it out soon.
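
The "under the hood" description of the dispatch sub-agents tool can be illustrated with a small sketch. It is an assumption-laden stand-in rather than the code behind the demo: `run_sub_agent` is a hypothetical stub that echoes its prompt instead of calling a model, `dispatch_sub_agents` and the `{argument}` placeholder are invented for illustration, and a thread pool is just one plausible way to run the sub-agents in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def run_sub_agent(prompt: str) -> str:
    # Hypothetical stand-in for a call to a smaller model such as Haiku.
    # It echoes the prompt so the sketch runs offline.
    return f"answer for: {prompt}"


def dispatch_sub_agents(prompt_template: str, arguments: List[str],
                        max_workers: int = 8) -> List[str]:
    """Fill the template once per argument and run the sub-agents in parallel."""
    prompts = [prompt_template.format(argument=arg) for arg in arguments]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input prompts.
        return list(pool.map(run_sub_agent, prompts))


# The orchestrating model would supply the template and the argument list.
template = "Write a timing test for the quicksort implementation at {argument}."
repos = ["repo-a", "repo-b", "repo-c"]  # made-up placeholders, not real repositories

answers = dispatch_sub_agents(template, repos)
for answer in answers:
    print(answer)
```

All answers come back as one list, which matches the step in which the orchestrating model receives every sub-agent's result and picks the fastest implementation.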
