What is Tool Use in AI? Function Calling, APIs and Real-World Action
Tool use (function calling) is what transforms LLMs from text generators into systems that take real-world actions — searching the web, querying databases, calling APIs, running code. This guide explains the complete mechanism with code examples.
The Limitation Tool Use Solves
A raw LLM lives entirely in text space. It can write, reason, summarise, and translate — but it cannot check real-time data, query your database, send an email, or run a computation. Without tool use, every LLM answer is based solely on patterns learned during training, with a knowledge cutoff date and no access to external systems.
Tool use — also called function calling — bridges this gap. It allows an LLM to decide, during generation, that it needs to call an external function to proceed, output a structured request for that function call, receive the result, and incorporate it into its response.
This single capability transforms LLMs from sophisticated autocomplete systems into the reasoning engines of autonomous agents and production AI applications.
How Function Calling Works Mechanically
OpenAI introduced function calling in the Chat Completions API in June 2023. The mechanism is now standardised across providers — Anthropic (as "tool use"), Google (as "function declarations"), and open-source models via the tool_call format.
Step 1: Define Your Tools
You provide the model with a JSON Schema description of each available function. This includes the function name, a natural-language description (critical — this is how the model decides when to call the function), and the schema of its parameters.
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location. Call this whenever the user asks about weather.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and country, e.g. London, UK"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    }
]
```
Step 2: Send the Request with Tools
Include the tools array in your API request. The model receives both the user message and the tool definitions:
```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in London right now?"}],
    tools=tools,
    tool_choice="auto"  # let the model decide whether to call a tool
)
```
Step 3: The Model Outputs a Tool Call (Not Text)
When the model decides a tool call is appropriate, it does not generate a text response. Instead, it outputs a structured tool_call object:
```json
{
  "role": "assistant",
  "content": null,
  "tool_calls": [
    {
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"London, UK\", \"unit\": \"celsius\"}"
      }
    }
  ]
}
```
Note that content is null — the model paused text generation to request external data. The arguments field is a JSON string (not object) that you must parse.
Step 4: Execute the Function
Your application code (the executor) parses the tool call, looks up and calls the corresponding function with the provided arguments, and captures the return value:
```python
import json

tool_call = response.choices[0].message.tool_calls[0]
function_name = tool_call.function.name  # "get_current_weather"
arguments = json.loads(tool_call.function.arguments)

# Your actual function
result = get_current_weather(
    location=arguments["location"],
    unit=arguments.get("unit", "celsius")
)
# result = {"temperature": 14, "condition": "Cloudy", "humidity": 78}
```
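With more than one tool, the lookup step generalises to a dispatch table keyed by function name. A minimal sketch of that pattern — the registry and the stand-in weather function here are illustrative, not part of any provider's SDK:

```python
import json

def get_current_weather(location, unit="celsius"):
    # Stand-in implementation for illustration
    return {"temperature": 14, "condition": "Cloudy", "humidity": 78}

# Hypothetical registry mapping tool names to Python callables
TOOL_REGISTRY = {
    "get_current_weather": get_current_weather,
}

def execute_tool_call(name, arguments_json):
    """Parse the arguments string and dispatch to the registered function."""
    if name not in TOOL_REGISTRY:
        # Returning the error as data lets the model recover in the next turn
        return {"error": f"Unknown tool: {name}"}
    args = json.loads(arguments_json)  # arguments arrive as a JSON string
    return TOOL_REGISTRY[name](**args)

result = execute_tool_call(
    "get_current_weather", '{"location": "London, UK", "unit": "celsius"}'
)
```

Returning errors as data, rather than raising, keeps the agent loop alive: the model sees the failure in the tool message and can retry or choose another tool.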
Step 5: Return the Result to the Model
Append the original assistant message (with tool_calls) and a new tool message (with the function result) to the conversation, then call the API again:
```python
messages.append(response.choices[0].message)  # assistant message with tool_call
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": json.dumps(result)
})

# Second API call — model now generates text using the tool result
final_response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools
)
```
Step 6: Final Text Response
The model reads the tool result and generates a natural language response grounded in real data:
"The current weather in London is 14°C and cloudy, with 78% humidity. You might want to bring a jacket!"
This entire exchange — from user query to final response — typically takes 500ms–1.5s end-to-end.
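The six steps above can be wrapped in a single loop that keeps executing tool calls until the model answers in plain text. A sketch of that loop, assuming an OpenAI-style client and a name-to-callable registry like the one in Step 4 (both are parameters here, not fixed APIs):

```python
import json

def run_tool_loop(client, model, messages, tools, registry, max_turns=10):
    """Call the model repeatedly, executing tool calls until it answers in text."""
    for _ in range(max_turns):
        response = client.chat.completions.create(
            model=model, messages=messages, tools=tools
        )
        message = response.choices[0].message
        if not getattr(message, "tool_calls", None):
            return message.content  # plain text response: we're done
        messages.append(message)  # keep the assistant's tool_calls in history
        for call in message.tool_calls:
            args = json.loads(call.function.arguments)
            result = registry[call.function.name](**args)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(result),
            })
    raise RuntimeError("Agent did not finish within max_turns")
```

The `max_turns` cap is a safety net: without it, a model that keeps requesting tools could loop indefinitely and burn tokens.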
Parallel Tool Calls
GPT-4o and Claude 3.5 support parallel tool calling — outputting multiple tool calls in a single response when they are independent and can be executed concurrently:
```json
"tool_calls": [
  {"id": "call_1", "function": {"name": "search_crm", "arguments": "{\"email\": \"john@acme.com\"}"}},
  {"id": "call_2", "function": {"name": "search_linkedin", "arguments": "{\"company\": \"Acme Corp\"}"}},
  {"id": "call_3", "function": {"name": "get_company_news", "arguments": "{\"company\": \"Acme Corp\"}"}}
]
```
Your executor runs all three functions concurrently (using asyncio.gather in Python or parallel Promise resolution in JavaScript), appends all three results, then re-prompts the model. This dramatically reduces latency for multi-tool agent tasks — three serial calls at 300ms each (900ms total) become three parallel calls (300ms total).
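A sketch of that concurrent execution with asyncio.gather — the three async tool implementations are stand-ins for the real CRM, LinkedIn, and news lookups:

```python
import asyncio
import json

# Stand-in async tool implementations for illustration
async def search_crm(email):
    await asyncio.sleep(0.01)  # simulates an API round trip
    return {"contact": email}

async def search_linkedin(company):
    await asyncio.sleep(0.01)
    return {"company": company}

async def get_company_news(company):
    await asyncio.sleep(0.01)
    return {"headlines": ["Acme Corp raises funding"]}

REGISTRY = {
    "search_crm": search_crm,
    "search_linkedin": search_linkedin,
    "get_company_news": get_company_news,
}

async def run_parallel(tool_calls):
    """Execute independent tool calls concurrently, pairing results with ids."""
    coros = [
        REGISTRY[c["function"]["name"]](**json.loads(c["function"]["arguments"]))
        for c in tool_calls
    ]
    results = await asyncio.gather(*coros)  # runs all calls concurrently
    return [
        {"role": "tool", "tool_call_id": c["id"], "content": json.dumps(r)}
        for c, r in zip(tool_calls, results)
    ]

tool_calls = [
    {"id": "call_1", "function": {"name": "search_crm", "arguments": "{\"email\": \"john@acme.com\"}"}},
    {"id": "call_2", "function": {"name": "search_linkedin", "arguments": "{\"company\": \"Acme Corp\"}"}},
    {"id": "call_3", "function": {"name": "get_company_news", "arguments": "{\"company\": \"Acme Corp\"}"}},
]
tool_messages = asyncio.run(run_parallel(tool_calls))
```

Because asyncio.gather preserves argument order, each result can be zipped back to its originating call id, which is exactly what the API needs in the tool messages.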
Tool Design Principles
The quality of tool design determines agent reliability more than the choice of LLM. Poor tool descriptions lead to wrong tool selections. Overly broad tools give the model too many wrong paths. Vague parameter schemas lead to malformed calls.
Write Precise, Informative Descriptions
The description is the primary signal the model uses to decide whether to call a tool. Bad description: "Gets customer info." Good description: "Retrieve a customer record from the CRM by email address or phone number. Returns customer ID, name, company, lifetime value, last contact date, and open deal count. Call this before taking any action involving a specific customer."
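Expressed as tool definitions, the difference looks like this (the CRM field names and parameter schema are illustrative, not from a real system):

```python
# Bad: the model has almost nothing to decide with
bad_tool = {
    "name": "get_customer",
    "description": "Gets customer info.",
}

# Good: states what it returns, how to identify the customer, and when to call it
good_tool = {
    "name": "get_customer",
    "description": (
        "Retrieve a customer record from the CRM by email address or phone "
        "number. Returns customer ID, name, company, lifetime value, last "
        "contact date, and open deal count. Call this before taking any "
        "action involving a specific customer."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "email": {"type": "string", "description": "Customer email address"},
            "phone": {"type": "string", "description": "Customer phone number"},
        },
    },
}
```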
Use Constrained Parameter Types
Use enum to constrain parameter values. Instead of "date": "string", use "date": {"type": "string", "format": "date", "description": "ISO 8601 date, e.g. 2024-03-28"}. This prevents the model from outputting ambiguous values like "tomorrow" or "next week."
One Tool, One Responsibility
Tools should do exactly one thing. send_email_and_create_crm_record is a bad tool. send_email and create_crm_record are two good tools. Granular tools give the model precise control and make errors easier to diagnose.
Always Include an ask_human Tool for High-Stakes Actions
For agents that can send emails, charge payments, or modify important records, include an ask_human_for_confirmation tool. When the model is about to take an irreversible action, it should ask the user "I'm about to send this email to john@acme.com. Confirm?" before executing. This is the single most important safety measure in production agents.
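One possible shape for such a confirmation tool — the schema and the minimal CLI executor below are assumptions for illustration, not a provider-defined interface:

```python
confirmation_tool = {
    "type": "function",
    "function": {
        "name": "ask_human_for_confirmation",
        "description": (
            "Ask the user to approve an irreversible action before executing "
            "it. Always call this before sending emails, charging payments, "
            "or modifying important records."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "action_summary": {
                    "type": "string",
                    "description": "One sentence describing exactly what will happen",
                },
            },
            "required": ["action_summary"],
        },
    },
}

def ask_human_for_confirmation(action_summary, input_fn=input):
    """Minimal CLI executor: block until the user approves or rejects."""
    answer = input_fn(f"{action_summary} Confirm? [y/N] ")
    return {"approved": answer.strip().lower() == "y"}
```

Because the gate is itself a tool, the model decides when confirmation is needed, while your executor guarantees that nothing irreversible runs without an approved result in the conversation.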
Tool Use vs RAG: When to Use Each
Both tool use and RAG give the model access to external information, but they serve different purposes:
- RAG is for retrieving information from a static or slowly-changing knowledge base (documents, FAQs, policies). The information is embedded and searched semantically.
- Tool use is for retrieving real-time or dynamic data (current database state, live API responses, user-specific information) or for taking actions (writing, creating, sending).
In production systems, both are used together. RAG retrieves background knowledge; tool calls retrieve live data and execute actions.
Structured Output: A Related but Distinct Feature
Structured output (JSON mode) forces the LLM to generate a response that conforms to a specified JSON Schema. Unlike function calling (where the model calls an external function), structured output is used when you want the model's final answer in a machine-parseable format:
```json
{
  "lead_score": 87,
  "priority": "high",
  "recommended_action": "Schedule a discovery call within 48 hours",
  "reasoning": "Enterprise company, recent funding round, active on LinkedIn discussing CRM pain points"
}
```
This is essential for AI systems that feed output to downstream processes — CRMs, dashboards, automation pipelines — where human-readable prose is unusable.
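With the OpenAI API, this is requested through the response_format parameter. A sketch of the request for the lead-scoring output above — the schema is built from the example, the API call itself is shown but not executed, and the sample response string is taken from the JSON above:

```python
import json

# JSON Schema matching the lead-scoring output shown above
lead_schema = {
    "type": "object",
    "properties": {
        "lead_score": {"type": "integer"},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        "recommended_action": {"type": "string"},
        "reasoning": {"type": "string"},
    },
    "required": ["lead_score", "priority", "recommended_action", "reasoning"],
    "additionalProperties": False,
}

# Arguments for client.chat.completions.create (call not executed here)
request_kwargs = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Score this lead."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "lead_score", "strict": True, "schema": lead_schema},
    },
}

# The returned message content parses as JSON conforming to the schema
sample_content = (
    '{"lead_score": 87, "priority": "high", '
    '"recommended_action": "Schedule a discovery call within 48 hours", '
    '"reasoning": "Enterprise company, recent funding round"}'
)
lead = json.loads(sample_content)
```

With strict mode, the schema must set additionalProperties to false and list every property as required; in exchange, the output is guaranteed to match, so downstream code can parse it without defensive checks.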
Real-World Tool Use Architecture
In a production AI CRM I built, the sales assistant agent has 12 tools:
- search_crm_contacts — semantic search over contact records
- get_contact_history — full interaction timeline for a contact
- create_contact, update_contact — CRM write operations
- search_email_threads — semantic search over all email history
- draft_email — generate a personalised email draft (requires human approval before sending)
- send_email — send email (requires prior ask_human_for_confirmation)
- create_task, schedule_followup — task management
- search_company_news — real-time news lookup via API
- score_lead — run ML scoring model
- ask_human_for_confirmation — safety gate for destructive actions
- finish — end the agent loop
This agent handles: "Research the Acme Corp opportunity, draft a personalised follow-up email based on their recent funding news, and schedule a reminder for 3 days from now." It takes 4–6 tool calls, runs in 8–12 seconds, and requires one human confirmation before sending the email. It eliminates 20–30 minutes of manual work per sales rep per day.