ep 02 field notes
Show Us Your Agent Skills / EP 02 / guest dossier
TOMASZ TUNGUZ · THEORY VENTURES · LOCAL-FIRST · STRIPPED TO THE BONE · SILICON MIND

TOMASZ TUNGUZ

Tom runs agents as parallel local workers. A model on his own laptop, a harness stripped to the bone, skills for the repeatable analysis. He reaches for the cloud only when he can name the reason: a hard multi-file refactor, or a model that's distinctly better at one thing.

EP 02 · TOMASZ TUNGUZ · local-first agents in Pi, live on stream

LOCAL FIRST

"I live in Pi and I run local models." The first guest on the show to run local for the majority of his workflow. On a Mac M5 with Qwen 35B he gets 120 to 140 tokens a second at a 256K context window, fast enough that cloud tools feel slow next to it. He switches to the cloud only when he can name why: a multi-file refactor, or a model with a distinct edge like Kimi K26 for creative writing.

The catch is the harness. A system prompt of 25 to 40 thousand tokens erases the local speed advantage, so Tom keeps everything thin: a tiny AGENTS.md, skills for the repeatable analysis, and not much else. "You can literally strip almost to the bone and it'll still work."
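
A rough back-of-the-envelope sketch of that overhead. The 256K window and the 25K to 40K prompt sizes come from the notes above; the prompt-processing rate and the 2K "thin" prompt are hypothetical placeholders, not figures from the episode, and the estimate assumes the prompt is read from scratch with no caching.

```python
# Back-of-the-envelope cost of a heavy system prompt on a local model.
# CONTEXT_WINDOW and the 25K/40K prompt sizes come from the notes.
# PREFILL_TOKENS_PER_SEC and the 2K "thin" prompt are ASSUMED placeholders.

CONTEXT_WINDOW = 256_000          # tokens (256K, treated as 256,000)
PREFILL_TOKENS_PER_SEC = 1_000    # hypothetical prompt-processing rate


def prompt_overhead(prompt_tokens: int) -> dict:
    """Return the share of the window used and the assumed time to read the prompt."""
    return {
        "window_share": prompt_tokens / CONTEXT_WINDOW,
        "prefill_seconds": prompt_tokens / PREFILL_TOKENS_PER_SEC,
    }


if __name__ == "__main__":
    for tokens in (2_000, 25_000, 40_000):
        o = prompt_overhead(tokens)
        print(f"{tokens:>6} tokens: {o['window_share']:.1%} of window, "
              f"~{o['prefill_seconds']:.0f}s to read at the assumed rate")
```

Swap in a measured prompt-processing rate for your own machine before drawing conclusions; only the window-share figure follows directly from the numbers in the notes.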

Inside the write-up: eight principles, the session shape, anti-patterns, and the exact hardware and model setup you need.

Local Qwen 35B in Pi: 120 to 140 tokens a second on a Mac M5. Thomas called Tom the first guest to run local for most of his workflow. [02:07:42]

"The great power of agents is parallelization."

But it's an orchestration problem: "you have to think differently about how you equip parallel processes with enough context and planning that they actually save you time." 01:49:36

WHEN LOCAL, WHEN CLOUD

how Tom decides, and keeps it fast
Default to local
"I do it for convenience." Local keeps working on a plane, cuts latency, and lets him paste secrets without sending them anywhere. 02:08:43

Strip the harness to the bone
A 40K-token system prompt kills the speed advantage. Tom's AGENTS.md is "really thin": skills, a Gmail rule, Theory's MCP, a QMD lookup. 02:11:12

Put the repeatable work in a skill
His public-company analysis skill pulls data, finds the transcript with Exa, charts it in R, and assembles an HTML briefing. 02:00:45

Name the reason to go cloud
"Large-scale coding, multi-file rearchitectures, or bugs that can't be solved" locally, or a model with a specific strength. Otherwise, stay local. 02:09:17

Cap package age for safety
He panicked over unpinned npm installs, so the rule is to avoid very new packages. Where that rule should live is still an open question. 02:04:57

Resume with ripgrep
"Run ripgrep over the Pi sessions for the last six hours and find where we were, then pick it up." Saved sessions become memory. 02:13:11
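
The resume step can be sketched in a few lines. This is a pure-Python stand-in for ripgrep, not Tom's actual command: it filters files by modification time and then searches them line by line. The session directory and the file layout are assumptions; the notes do not say where Pi stores sessions.

```python
import re
import time
from pathlib import Path


def recent_matches(session_dir, pattern, hours=6.0):
    """Search files modified in the last `hours` for `pattern` (case-insensitive).

    Returns (path, line_number, line) tuples, oldest file first.
    `session_dir` is a hypothetical location for saved sessions.
    """
    cutoff = time.time() - hours * 3600
    regex = re.compile(pattern, re.IGNORECASE)
    # Keep only regular files touched since the cutoff.
    recent = [p for p in Path(session_dir).rglob("*")
              if p.is_file() and p.stat().st_mtime >= cutoff]
    hits = []
    for path in sorted(recent, key=lambda p: p.stat().st_mtime):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Something like `recent_matches("path/to/sessions", "refactor")` would return the lines to hand back to the agent as a "where were we" summary.
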
The thin AGENTS.md: skills, a Gmail-API note, Theory's MCP, a QMD lookup, and little else. Context overhead is the enemy of local speed. [02:11:12]
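
One way the package-age rule from the list above could be enforced, as a minimal sketch. It assumes the input is the version-to-publish-time map that `npm view <package> time --json` prints; the 14-day threshold is a hypothetical value, since the notes give no number.

```python
from datetime import datetime, timedelta, timezone


def old_enough(published_iso, min_age_days=14, now=None):
    """True if a version published at `published_iso` is at least `min_age_days` old."""
    now = now or datetime.now(timezone.utc)
    # Registry timestamps end in "Z"; normalize so fromisoformat accepts them.
    published = datetime.fromisoformat(published_iso.replace("Z", "+00:00"))
    return now - published >= timedelta(days=min_age_days)


def allowed_versions(times, min_age_days=14, now=None):
    """Filter a version -> publish-time map down to versions past the age cap.

    `min_age_days=14` is an ASSUMED threshold, not one stated in the notes.
    """
    return [
        version
        for version, stamp in times.items()
        if version not in ("created", "modified")  # bookkeeping keys, not versions
        and old_enough(stamp, min_age_days, now)
    ]
```

Where a check like this should run (a skill, AGENTS.md, or a wrapper around the install command) is the part the notes leave open.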

EARNINGS ON AUTOPILOT

his public-company-analysis skill, running locally in Pi
fetch

pull the numbers

The skill pulls company CSVs from a data source into a folder, the raw material for the briefing. 02:00:45

search

find the transcript

It locates the earnings call transcript with Exa, so the analysis is grounded in what management actually said. 02:01:12

chart

chart it in R

A set of R libraries renders the figures (revenue growth, net dollar retention, cash, guidance) against his own style sheets. 02:01:16

ship

brief him daily

It assembles an HTML deck with charts and quotes, and a launch daemon runs it every morning on whichever companies just reported. 02:03:30
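
A minimal sketch of the scheduling piece, assuming the "launch daemon" is a macOS launchd job. `Label`, `ProgramArguments`, and `StartCalendarInterval` are standard launchd keys; the label, the command, and the 7:00 run time are hypothetical placeholders, not Tom's configuration.

```python
import plistlib


def morning_job_plist(label, program_args, hour=7, minute=0):
    """Build launchd property-list bytes for a job that runs once a day.

    `label` and `program_args` are placeholders for whatever starts the
    briefing skill; the default 07:00 run time is an assumption.
    """
    job = {
        "Label": label,
        "ProgramArguments": list(program_args),
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
    }
    return plistlib.dumps(job)  # XML plist, returned as bytes


if __name__ == "__main__":
    xml = morning_job_plist("com.example.earnings-briefing",
                            ["/bin/sh", "run_briefing.sh"])
    print(xml.decode("utf-8"))
```

The resulting file would typically be saved under `~/Library/LaunchAgents/` and loaded with `launchctl`; picking "whichever companies just reported" would be the job of the script it launches.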

The output: the Figma earnings deck, built locally the day Figma reported. "Within two and a half minutes I can get a sense of exactly what happened in the business." [02:02:20]

"They demonstrate such a level of intelligence and then they forget."

Tom's open problem: memory. Across skills, plugins, QMD, AGENTS.md, and sessions, "they don't have clean demarcated lines of what goes in which." He doesn't have a settled answer. 01:52:20