Search as Code Turns Agent Search Into an Execution Surface

A new pattern in agent design lets an agent write and run code — typically Python — to orchestrate search directly, instead of issuing a sequence of typed tool calls. It is faster and more expressive. It also quietly changes the governance problem: when agents compose executable workflows rather than calling predefined tools, tool governance becomes code-execution governance, and the audit question shifts from “which tool did the agent call?” to “what code did the agent run?”

From tool calls to executable workflows

The dominant model for agent actions has been tool calling: the agent picks from a set of predefined tools and fills in their arguments, and a runtime executes each call. It is constrained by design, because the agent can only do what the available tools allow. A newer pattern inverts that. Perplexity’s Search as Code, introduced in 2026, has the agent write Python that calls the search stack directly instead of looping through tool calls one at a time, and Perplexity reports an 85% drop in token usage as a result. Instead of choosing a tool, the agent composes a program.
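To make the contrast concrete, here is a minimal sketch of the two styles side by side. The `search` function, its canned index, and the `TOOLS` registry are hypothetical stand-ins invented for illustration; they are not Perplexity's API, and only the shape of the two approaches matters.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical in-memory stand-in for a search backend (not a real API).
_FAKE_INDEX = {
    "agent governance": ["doc-a", "doc-b"],
    "code execution sandbox": ["doc-b", "doc-c"],
    "audit trail": ["doc-d"],
}


def search(query: str) -> list[str]:
    """Return canned document ids for a query."""
    return _FAKE_INDEX.get(query, [])


# Style 1: typed tool calls. The runtime dispatches one call per model turn,
# and only names registered in TOOLS can ever run.
TOOLS: dict[str, Callable[[str], list[str]]] = {"search": search}


def run_tool_call(name: str, argument: str) -> list[str]:
    if name not in TOOLS:
        raise PermissionError(f"unknown tool: {name}")
    return TOOLS[name](argument)


# Style 2: search as code. The agent emits one program that loops over
# queries, merges the results, and de-duplicates them in a single execution.
def agent_program() -> list[str]:
    seen: set[str] = set()
    merged: list[str] = []
    for query in ("agent governance", "code execution sandbox", "audit trail"):
        for doc in search(query):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged
```

In the first style, three queries would cost three round trips through the model and the merge logic would live in the model's context. In the second, the loop and the de-duplication are ordinary code, which is where the expressiveness comes from.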

For capability, this is a clear win. A program can branch, loop, combine results, and adapt in ways a fixed sequence of tool calls cannot. But the constraint that made tool calling governable — a closed set of typed operations — is exactly what gets removed.

Search as code is not just faster search. It moves agents from selecting operations to writing them — which moves the risk surface from typed tool calls to generated code with real execution power.

Tool governance becomes code-execution governance

When an agent calls a typed tool, you can govern it at the tool boundary: this tool, these arguments, these permissions. The surface is small and enumerable. When an agent writes and runs code, that boundary dissolves. The same governance question — is this action allowed? — now has to be asked about an arbitrary program with network access, the ability to call other services, and the ability to compose operations the tool designer never anticipated.
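Governing at the tool boundary can be sketched in a few lines. The `ToolCall` shape, the `POLICY` table, and the `max_results` cap below are assumptions made up for this example, not any particular framework's interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    name: str
    arguments: dict


# Hypothetical policy: one entry per tool, listing the argument names the
# tool may receive and a cap on how many results it may request.
POLICY = {
    "search": {"allowed_args": {"query", "max_results"}, "max_results": 20},
}


def is_allowed(call: ToolCall) -> bool:
    """Check a single typed call against the policy table."""
    rule = POLICY.get(call.name)
    if rule is None:
        return False  # tool is not in the enumerated set
    if not set(call.arguments) <= rule["allowed_args"]:
        return False  # unexpected argument name
    return call.arguments.get("max_results", 1) <= rule["max_results"]
```

The whole check is a dictionary lookup and two comparisons because the action space is closed. There is no equivalent lookup for an arbitrary program, which is the gap the rest of this piece is about.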

This is the same structural shift we have tracked across infrastructure: as agents move from constrained interfaces to general execution, governance has to move with them, from the interface down to the runtime where the code actually runs.

Property	Typed tool calls	Search as code
Action space	Closed, enumerable	Open, arbitrary
Governed at	The tool boundary	Code execution
Audit question	Which tool was called?	What code was run?
Predictable side effects	Mostly	Not by default
Governance surface	Small	The whole runtime

The audit question changes

Under tool calling, an audit trail is a list of tool invocations and arguments: legible and bounded. Under search as code, the audit trail is a sequence of programs the agent wrote and executed. Answering “what did the agent do?” now requires understanding code, not reading a call log. And answering “was it allowed?” requires evaluating that code against policy before it runs, because once it has run, the side effects have already happened.
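One way to keep a code-based audit trail legible is to store, alongside each program, a short structured summary of what it imports and calls. The sketch below uses Python's standard `ast` and `hashlib` modules (`ast.unparse` needs Python 3.9 or later); the record layout and the `SAMPLE` program are assumptions chosen for illustration.

```python
import ast
import hashlib


def audit_record(source: str) -> dict:
    """Summarize a generated program without executing it."""
    tree = ast.parse(source)
    imports = set()
    calls = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports like "from . import x"
            imports.add(node.module or "")
        elif isinstance(node, ast.Call):
            # Render the callee as written, e.g. "search" or "json.dumps"
            calls.add(ast.unparse(node.func))
    return {
        "sha256": hashlib.sha256(source.encode("utf-8")).hexdigest(),
        "imports": sorted(imports),
        "calls": sorted(calls),
    }


# Hypothetical agent-written program; it is parsed here, never run.
SAMPLE = (
    "import json\n"
    "results = search('agent governance')\n"
    "print(json.dumps(results))\n"
)
record = audit_record(SAMPLE)
```

A summary like this does not replace reading the code, and it records only the names that appear in the source, not what they resolve to at run time. It does give a reviewer something closer to a call log to start from, with the hash tying the entry to the exact program text.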

Governance has to move closer to the runtime

The conclusion follows the pattern of every other expansion of agent capability we have written about: as agents gain more general power, governance has to move closer to the point of execution. For search as code specifically, that means policy and architectural decisions enforced at the boundary where the agent’s generated code is about to run — a checkpoint that can read the program, evaluate it against the rules, and refuse to execute what violates them.
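A checkpoint of this kind might look like the following sketch, which parses the generated program, compares its imports and direct calls against a policy, and refuses to execute on any violation. The `ALLOWED_IMPORTS` and `BLOCKED_CALLS` sets are hypothetical policy choices. Static inspection like this is illustrative only: it can be bypassed by indirect lookups and is not a sandbox, so a real deployment would pair it with process or container isolation.

```python
import ast

# Hypothetical policy for this example only.
ALLOWED_IMPORTS = {"json", "math"}
BLOCKED_CALLS = {"eval", "exec", "open", "compile", "__import__"}


def violations(source: str) -> list:
    """Return a list of policy violations found in the program text."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        return [f"syntax error: {err.msg}"]
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative imports have no module name
        else:
            names = []
        for name in names:
            # Compare the top-level package, so "os.path" is judged as "os"
            if name.split(".")[0] not in ALLOWED_IMPORTS:
                found.append(f"import not allowed: {name}")
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BLOCKED_CALLS
        ):
            found.append(f"call not allowed: {node.func.id}")
    return found


def run_if_allowed(source: str, env: dict) -> dict:
    """Execute the program only if the policy check finds nothing."""
    problems = violations(source)
    if problems:
        raise PermissionError("; ".join(problems))
    exec(compile(source, "<agent>", "exec"), env)
    return env
```

The ordering is the point: `violations` runs on the program text before `exec` is ever reached, so a refused program produces no side effects at all.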

The capability is genuinely useful, and it is not going away. The governance answer is not to forbid executable workflows; it is to treat the agent’s code as the thing being governed, with verification at the runtime boundary.

When agents stop calling tools and start writing code, the governable unit changes. The question is no longer which tool — it is what code, and whether it should be allowed to run.

Frequently asked questions

What is 'search as code'?

It is an agent pattern where the agent writes and runs code — typically Python — to orchestrate search and related operations directly, instead of issuing a sequence of predefined, typed tool calls. It is more expressive than tool calling because a program can branch, loop, and combine results, but it removes the closed, enumerable action space that made tool calling straightforward to govern.

Why does this change agent governance?

Because tool calling can be governed at the tool boundary — a small, enumerable set of typed operations. When an agent writes and runs code, that boundary dissolves: the governance question now applies to an arbitrary program with execution power. Tool governance becomes code-execution governance, and the surface to govern grows from a list of tools to the whole runtime.

How does the audit trail change?

Under tool calling, the audit trail is a legible list of tool invocations and arguments. Under search as code, it is a sequence of programs the agent wrote and ran. Answering 'what did the agent do?' now requires understanding code rather than reading a call log, and answering 'was it allowed?' requires evaluating that code before it runs, since side effects happen at execution.

What is the governance answer?

Not to forbid executable workflows — the capability is genuinely useful — but to treat the agent's generated code as the thing being governed. That means enforcing policy and architectural decisions at the runtime boundary where the code is about to execute: a checkpoint that reads the program, evaluates it against the rules, and refuses to run what violates them.