From tool calls to executable workflows
The dominant model for agent actions has been tool calling: the agent picks from a set of predefined tools and fills in their arguments, and a runtime executes each call. It is constrained by design: the agent can only do what the available tools allow. A newer pattern inverts that. Perplexity’s Search as Code, introduced in 2026, has the agent write Python that calls the search stack directly instead of looping through tool calls one at a time, with a reported 85% drop in token usage. Instead of choosing a tool, the agent composes a program.
For capability, this is a clear win. A program can branch, loop, combine results, and adapt in ways a fixed sequence of tool calls cannot. But the constraint that made tool calling governable — a closed set of typed operations — is exactly what gets removed.
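To make the contrast concrete, here is a minimal sketch. The `search` function and its tiny corpus are hypothetical stand-ins invented for illustration, not Perplexity's actual API. The first half shows tool calls as fixed, typed requests; the second shows one program that loops, deduplicates, and stops early.

```python
# Hypothetical stand-in for a search backend; not Perplexity's actual API.
def search(query: str) -> list[str]:
    corpus = {
        "agent governance": ["doc-1", "doc-2"],
        "runtime policy": ["doc-2", "doc-3"],
    }
    return corpus.get(query, [])


# Tool-calling style: each step is a separate, fixed request for a runtime to execute.
tool_calls = [
    {"tool": "search", "arguments": {"query": "agent governance"}},
    {"tool": "search", "arguments": {"query": "runtime policy"}},
]


# Search-as-code style: one program loops, deduplicates, and branches on results.
def gather(queries: list[str], minimum: int) -> list[str]:
    seen: list[str] = []
    for query in queries:
        for doc in search(query):
            if doc not in seen:
                seen.append(doc)
        if len(seen) >= minimum:
            break  # stop early once enough distinct results are collected
    return seen


print(gather(["agent governance", "runtime policy"], minimum=3))
```

The list of tool calls is plain data that can be inspected entry by entry. The `gather` function is logic, and what it does depends on what the searches return.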
Search as code is not just faster search. It moves agents from selecting operations to writing them — which moves the risk surface from typed tool calls to generated code with real execution power.
Tool governance becomes code-execution governance
When an agent calls a typed tool, you can govern it at the tool boundary: this tool, these arguments, these permissions. The surface is small and enumerable. When an agent writes and runs code, that boundary dissolves. The same governance question — is this action allowed? — now has to be asked about an arbitrary program with network access, the ability to call other services, and the ability to compose operations the tool designer never anticipated.
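Governance at the tool boundary can be as simple as a lookup. The sketch below assumes a hypothetical policy table (`ALLOWED_TOOLS`) and hypothetical tool names. It is meant only to show how small and enumerable the check is when every action is a typed call.

```python
from dataclasses import dataclass

# Hypothetical policy: each permitted tool maps to the argument names it may receive.
ALLOWED_TOOLS = {
    "web_search": {"query", "max_results"},
    "fetch_page": {"url"},
}


@dataclass
class ToolCall:
    name: str
    arguments: dict


def is_allowed(call: ToolCall) -> bool:
    """Allow a call only if the tool is known and every argument is expected."""
    allowed_args = ALLOWED_TOOLS.get(call.name)
    if allowed_args is None:
        return False  # unknown tool: outside the closed action space
    return set(call.arguments) <= allowed_args


print(is_allowed(ToolCall("web_search", {"query": "agent governance"})))
print(is_allowed(ToolCall("run_shell", {"cmd": "ls"})))
```

There is no equivalent one-line check for a generated program, which is the gap the rest of this piece is about.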
This is the same structural shift we have tracked across infrastructure: as agents move from constrained interfaces to general execution, governance has to move with them, from the interface down to the runtime where the code actually runs.
| Property | Typed tool calls | Search as code |
|---|---|---|
| Action space | Closed, enumerable | Open, arbitrary |
| Governed at | The tool boundary | Code execution |
| Audit question | Which tool was called? | What code was run? |
| Predictable side effects | Mostly | Not by default |
| Governance surface | Small | The whole runtime |
The audit question changes
Under tool calling, an audit trail is a list of tool invocations and arguments: legible and bounded. Under search as code, the audit trail is a sequence of programs the agent wrote and executed. Answering “what did the agent do?” now requires understanding code, not reading a call log. And answering “was it allowed?” requires evaluating that code against policy before it runs, because once it has run, the side effects have already happened.
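One way to keep such a trail legible is to store a summary alongside each program. This is a rough sketch using Python's standard `ast` and `hashlib` modules. The `audit_record` function and the `search_stack` module in the sample are hypothetical names. The summary is static, so it will miss anything resolved dynamically at run time (for example via `getattr`).

```python
import ast
import hashlib


def audit_record(source: str) -> dict:
    """Summarize a generated program: content hash, imported modules, called names."""
    tree = ast.parse(source)  # parses only; the program is not executed
    imports = set()
    calls = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            if node.module:
                imports.add(node.module)
        elif isinstance(node, ast.Call):
            calls.add(ast.unparse(node.func))  # requires Python 3.9+
    return {
        "sha256": hashlib.sha256(source.encode("utf-8")).hexdigest(),
        "imports": sorted(imports),
        "calls": sorted(calls),
    }


# Sample of agent-written code; `search_stack` is a hypothetical module name.
generated = """
import json
from search_stack import search

results = [search(q) for q in ["a", "b"]]
print(json.dumps(results))
"""

record = audit_record(generated)
print(record["imports"])
print(record["calls"])
```

The hash ties the log entry to the exact program text. The import and call lists give a reviewer something closer to the old call log to scan first.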
Governance has to move closer to the runtime
The conclusion follows the pattern of every other expansion of agent capability we have written about: as agents gain more general power, governance has to move closer to the point of execution. For search as code specifically, that means policy and architectural decisions enforced at the boundary where the agent’s generated code is about to run — a checkpoint that can read the program, evaluate it against the rules, and refuse to execute what violates them.
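A checkpoint of this kind could look roughly like the following. This is a minimal sketch, not a complete defense: the allowlist, the banned names, and the `runner` callback are all assumptions made for illustration. A static check like this can be bypassed by sufficiently indirect code, so it would normally sit in front of an isolated runtime rather than replace one.

```python
import ast

# Hypothetical policy values, chosen only for this example.
ALLOWED_IMPORTS = {"json", "math", "search_stack"}
BANNED_CALLS = {"eval", "exec", "compile", "open", "__import__"}


class PolicyViolation(Exception):
    """Raised when a generated program fails the pre-execution check."""


def find_violations(source: str) -> list[str]:
    """Read the program without running it and list every rule it breaks."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"unparseable program: {exc.msg}"]
    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    violations.append(f"import of '{alias.name}' is not allowed")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] not in ALLOWED_IMPORTS:
                violations.append(f"import from '{node.module}' is not allowed")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BANNED_CALLS:
                violations.append(f"call to '{node.func.id}' is not allowed")
    return violations


def run_if_allowed(source: str, runner):
    """Hand the program to `runner` only if it passes; otherwise refuse."""
    violations = find_violations(source)
    if violations:
        raise PolicyViolation("; ".join(violations))
    return runner(source)


print(find_violations("import subprocess"))
```

Here `runner` stands for whatever actually executes the code. The point of the shape is that the refusal happens before execution, while it can still prevent side effects.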
The capability is genuinely useful, and it is not going away. The governance answer is not to forbid executable workflows; it is to treat the agent’s code as the thing being governed, with verification at the runtime boundary.
When agents stop calling tools and start writing code, the governable unit changes. The question is no longer which tool — it is what code, and whether it should be allowed to run.