Course 3 · Module 08 · Tool Use & Function Calling

Part 01 · What tools fix

Five things an LLM
genuinely cannot do.

A frozen-weights language model has hard limits. Tool use lets it route around them — by deferring to systems that can do those things and treating the result as new context.

Know the time

The model's knowledge cuts off at training. Real-time info, current weather, news today — invisible to it.

tool: get_time, get_weather, search

Do precise math

LLMs are bad at long arithmetic. They hallucinate digits. A calculator tool fixes this for free.

tool: calculator, run_python

Query your data

Your company database, your spreadsheets, your private docs — outside the model's training set.

tool: sql_query, search_docs

Run code safely

Need to actually execute Python, parse JSON, fit a regression? A code-execution tool does it for real.

tool: execute_python, repl

Take actions

Send an email. Book a flight. Make an API call. Effects in the world, not just outputs.

tool: send_email, book_flight, ...

Part 02 · Hands on · The loop in action

Watch the model
reach out.

Pick a scenario. Press play. Watch the full agent loop unfold: model decides → tool call(s) → tool result(s) → model continues → final answer. This is exactly what Claude, GPT-4, and Gemini do under the hood when you give them tools.

The protocol.

The model is given a list of available tools with JSON schemas. It outputs either a regular text reply, OR a special tool_use block containing the function name and arguments. Your runtime executes the tool and sends back a tool_result block with the response. The model sees the result, reasons over it, then either calls more tools or gives the final answer.

// Pick a scenario

Press play to see the agent loop animate

Part 03 · Tools are JSON schemas

How the model knows
what's callable.

Before the model can use a tool, you give it the tool's JSON schema — name, description, parameters, types, required fields. The model reads this list (as part of its system prompt) and decides when each tool is appropriate. Below: three canonical examples.

What a schema actually contains.

The description field is the most important — it's how the model knows when to use the tool. Vague description → model uses the tool wrong (or doesn't use it). Precise description → reliable behavior. This is the most underrated skill in building with tool use.

—

—

Part 04 · Hands on · Parallel vs sequential

Some calls wait.
Most don't have to.

If you need the weather in Tokyo AND Paris, those calls are independent — they can run at the same time. If you need to find a flight first, then book it, those are dependent — second waits for first. Modern models (Claude, GPT-4) emit multiple tool_use blocks in a single response when they detect independence, letting your runtime fan them out.

The model handles dependency detection itself.

You don't tell the model "these are parallel." It looks at the user's query, decides what tools to call, and emits them as multiple separate tool_use blocks within one response if they're independent. Your runtime executes them concurrently. If they're dependent, the model emits one tool call, sees the result, then emits the next.

Sequential (naïve)

// each call waits for the previous

0s1s2s3s4s

Total time:—

Parallel

// independent calls run together

0s1s2s3s4s

Total time:—

Part 05 · Tool use in the wild

The patterns running in
every real system.

Once you have tool use, you have agents. Here's how teams actually deploy it.

Web search

Search-augmented chat

// the most common pattern

The model has a web_search tool. For any question that smells time-sensitive ("latest", "current", "today"), it searches first, then synthesizes the result.

Used by: Claude, ChatGPT, Perplexity, Gemini. The reason they can answer "what happened today" despite a stale knowledge cutoff.

Example flow User: "What's the latest on the AI Act?"
→ tool_use: web_search(query="EU AI Act 2024 latest")
→ tool_result: [3 article summaries]
→ assistant: "Based on recent news..."

Code execution

Agentic coding

// Claude Code, Cursor, Devin

The model gets read_file, write_file, run_bash, and search_codebase tools. It reads, edits, runs tests, iterates — all autonomously.

This is what makes Claude Code, Cursor, and Devin work. The loop runs dozens or hundreds of tool calls per task without user intervention.

Example loop → read_file("src/auth.py")
→ search_codebase("login flow")
→ write_file (with fix)
→ run_bash("pytest tests/")
→ ...continues until tests pass

RAG via tools

Retrieval as a function

// modern alternative to fixed RAG

Instead of stuffing retrieved docs into context up front, give the model a search_knowledge_base tool and let it decide when (and how often) to query.

Smarter than fixed RAG: the model can refine its query based on partial results, search again with better keywords, only retrieve what's needed. "Agentic RAG" is the standard for serious deployments now.

Example flow → search_knowledge_base("Q1 sales")
→ [results too generic, refine]
→ search_knowledge_base("Q1 2024 Acme sales by region")
→ [perfect results]
→ answer

Computer use

Vision + action loop

// the most powerful frontier

Anthropic's Computer Use gives Claude tools to take screenshots, move the mouse, type, click. The model sees the screen, decides what to do, the action runs, takes another screenshot, repeats.

Same protocol as text tool use — just with vision input and GUI-action tools. Lets the model operate any app it has access to. The bridge from chatbot to assistant that actually does things.

Example loop → screenshot()
→ [sees a Google form]
→ mouse_click(x=380, y=240)
→ type("John Doe")
→ screenshot() ...

Course 3 · Module 08 complete

The LLM stops being a
text generator. It becomes an agent.

You watched the full loop: query → decide → tool call → result → continue → answer. You read real JSON schemas. You understand why some calls go parallel and others can't. You know that "agent" mostly just means "LLM in a tool-use loop". The mechanics are simple — the implications are not.

Up next · Course 3 · Module 09

Agents & Multi-step Reasoning

Tool use is the mechanism. Agents are what you build on top. ReAct, chain-of-thought, planning, reflection, self-consistency — the prompting and architectural patterns that turn a single tool call into a 50-step autonomous task. Interactive: watch an agent solve a multi-step problem step by step.

Continue to Module 09

The model can't
tell time.
So it asked.

Five things an LLM
genuinely cannot do.

Know the time

Do precise math

Query your data

Run code safely

Take actions

Watch the model
reach out.

How the model knows
what's callable.

Some calls wait.
Most don't have to.

The patterns running in
every real system.

Search-augmented chat

Agentic coding

Retrieval as a function

Vision + action loop

Five questions on what
you just wired up.

The LLM stops being a
text generator. It becomes an agent.

Agents & Multi-step Reasoning

The model can'ttell time.So it asked.

Five things an LLMgenuinely cannot do.

Know the time

Do precise math

Query your data

Run code safely

Take actions

Watch the modelreach out.

How the model knowswhat's callable.

Some calls wait.Most don't have to.

The patterns running inevery real system.

Search-augmented chat

Agentic coding

Retrieval as a function

Vision + action loop

Five questions on whatyou just wired up.

The LLM stops being atext generator. It becomes an agent.

Agents & Multi-step Reasoning

The model can't
tell time.
So it asked.

Five things an LLM
genuinely cannot do.

Watch the model
reach out.

How the model knows
what's callable.

Some calls wait.
Most don't have to.

The patterns running in
every real system.

Five questions on what
you just wired up.

The LLM stops being a
text generator. It becomes an agent.