Foundations 4 min

Tool Calling Fundamentals

How LLMs invoke external functions, what a tool definition looks like, and how the model decides when and how to call one.

The previous lesson established that frozen training knowledge is not enough for real-world applications. Tool calling is the most powerful runtime solution to that problem. Instead of trying to inject all possible information into the context, you give the model the ability to go get what it needs - to call a function, query a database, fetch a web page, or run a calculation. The result comes back into the context, and the model uses it to answer.

The brilliant analyst with a phone and a team of specialists

Imagine an analyst who is exceptional at interpreting data but cannot personally access every system in the company. Instead of memorising everything, they have a phone and a directory of specialists they can call. Need the latest sales figures? They call the data team: 'Run me the Q3 revenue by region.' Need a legal interpretation? They call the legal team: 'What does clause 7.2 actually mean in practice?' Need to check a competitor's pricing? They call the market research team. The analyst does not know the answers in advance. They know *who to call and when*, and they synthesise the results into a final recommendation. That is exactly what a model with tool calling does - it decides which tools to invoke, passes the right arguments, gets the results, and synthesises a final response. Without tools, the analyst would have to guess. With tools, they can produce verified, current, precise answers.

How it works mechanically. You define a set of tools (functions) with names, descriptions, and parameter schemas. You pass these definitions to the model alongside the user's message. Instead of generating a text response immediately, the model may decide to return a tool call - a structured JSON object saying 'call this function with these arguments.' Your application executes the function, gets the result, and sends it back to the model. The model then incorporates the result and either calls another tool or returns the final response.

python
import google.generativeai as genai
import json

genai.configure(api_key="YOUR_API_KEY")

# Step 1: Define the tools - name, description, and parameter schema
tools = [
    {
        "name": "get_stock_price",
        "description": "Fetches the current stock price for a given ticker symbol. Call this whenever the user asks about stock prices, market values, or share prices.",
        "parameters": {
            "type": "object",
            "properties": {
                "ticker": {
                    "type": "string",
                    "description": "Stock ticker symbol, e.g. 'AAPL' for Apple, 'GOOGL' for Google"
                }
            },
            "required": ["ticker"]
        }
    }
]

# Step 2: Simulate the actual function your code would run
def get_stock_price(ticker: str) -> dict:
    # In production this calls a real API like Yahoo Finance or Alpha Vantage
    prices = {"AAPL": 189.43, "GOOGL": 175.20, "MSFT": 412.78}
    price = prices.get(ticker.upper(), None)
    if price is None:
        return {"error": f"Ticker {ticker} not found"}
    return {"ticker": ticker.upper(), "price": price, "currency": "USD"}

# Step 3: Send the user message - model decides to use a tool
model = genai.GenerativeModel("gemini-1.5-flash", tools=tools)
response = model.generate_content("What is Apple's stock price right now?")

# Step 4: Model returns a function call, not plain text
part = response.candidates[0].content.parts[0]
if hasattr(part, "function_call"):
    fn = part.function_call
    print(f"Model wants to call: {fn.name}({dict(fn.args)})")
    # → Model wants to call: get_stock_price({'ticker': 'AAPL'})
    
    # Step 5: Execute the real function
    result = get_stock_price(**dict(fn.args))
    print(f"Function result: {result}")
    # → Function result: {'ticker': 'AAPL', 'price': 189.43, 'currency': 'USD'}
    
    # Step 6: Send result back - model synthesises the final answer
    # (conversation continuation with function response)
The most important thing about tool descriptions
The model decides whether and how to use a tool based entirely on reading its description. There is no other signal.

Vague description: `'get_data'` - 'Gets data from the system.' The model will not know when to use this or what arguments to pass.

Good description: `'get_stock_price'` - 'Fetches the current real-time stock price for a given ticker symbol from the live market. Call this when the user asks about current stock prices, share values, or market cap. Do NOT use this for historical prices - use get_historical_price for that.'

Write descriptions as if you are briefing a capable colleague who has never seen your codebase. Tell them what the function does, when to use it, and crucially - when NOT to use it.
Interactive: Tool Calling Loop Visualizer
API Flow

Tool calling is a 3-step loop between the **User Application**, the **AI Model**, and **External APIs**. Run the weather request flow to see how parameters are passed.

Tool Description Setting
User Query: "What is the weather in Tokyo?"
Step 1
Query & Tool Schema

Application sends the prompt and the `get_weather` tool definition.

Step 2
Model Decision

Model evaluates whether query matches tool descriptions.

Step 3
API Execution

Application handles execution of function request.

Final Text Output from Model:

Inference output returns here...

Parallel and sequential tool calling. Modern models can call multiple tools in the same turn (parallel) or chain them sequentially - using the result of one tool to decide what to call next. This sequential chaining is the foundation of agentic behaviour. When a model plans a multi-step task, retrieves information, acts on it, retrieves more, and eventually completes a goal - that is tool calling applied in a loop. The MCP module formalises this pattern into a protocol. The Agentic AI module shows how to build systems that run it reliably in production.

You have completed Foundations
You now have the full mental model:

→ LLMs predict the next token using learned probability distributions (Lesson 1)
→ Tokens are not words - they are variable-length chunks that affect cost and limits (Lesson 2)
→ The context window is the model's working memory, with a hard token budget (Lesson 3)
→ Embeddings encode meaning as vectors, enabling semantic search (Lesson 4)
→ Training bakes in knowledge permanently; inference applies it without updating it (Lesson 5)
→ Prompts are the only interface you have at inference time - craft them deliberately (Lesson 6)
→ Hallucination is structural - fix it with architecture, not better wording (Lesson 7)
→ External knowledge (RAG, tools, context injection) bridges the knowledge gap (Lesson 8)
→ Tool calling gives the model the ability to act on the world, not just describe it (Lesson 9)

Next up: RAG - the most widely deployed pattern for external knowledge, from indexing to retrieval to generation.