SEASON 01 · SHIPPED 04 · UPCOMING 01 · VENUE YOUTUBE LIVE

SHOW US YOUR [ AGENT ] SKILLS

[ vanishing / gradients ] × PyMC LABS

HOSTED BY HUGO BOWNE-ANDERSON × THOMAS WIECKI

EXCEL WORLD CHAMPIONSHIPS × EUROVISION

Long-time Python, ML, AI and data builders show how they're actually using agents today. Not vibe coding. Not demos. The real workflows: agent skills, harnesses, voice-memo memory, background reviewers, from people whose software you've been using for years.


SEASON 01

EPISODES 04

BUILDERS 17

SKILLS · WORKFLOWS 34 · 68

NEXT EPISODE · JUN 18, 2026 · thursday · live on youtube

EP 05 · COPILOTS & CODING AGENTS

Three builders on coding agents, search, and context engineering

JOHN BERRYMAN · arcturus labs · ex-github copilot

ISAAC FLATH · kentro tech · ex-answer.ai

MATT PALMER · conductor · ex-replit

HOSTS HUGO BOWNE-ANDERSON · THOMAS WIECKI

EP 04 · MAY 2026 · 03 GUESTS · FULL EPISODE

HOW TO EVALUATE AGENTIC WORKFLOWS

Skill scepticism, plan review, implementation review, agentic search, and hidden holdout tests.


HAMEL HUSAIN · parlance labs

CHRIS FONNESBECK · pymc labs · mets / brewers / yankees

DOUG TURNBULL · search · shopify / reddit

▣ HIGHLIGHT REEL

01

skill-scepticism workflow

Read public skills like code: check provenance, maintenance, and constraints, then fork the idea instead of importing a shortcut. · Hamel Husain

00:22:32

02

plan-review-implementation-review workflow

Turn cheap experimentation into checkpoints: plan first, review red and yellow flags, implement, then review the code before trust. · Chris Fonnesbeck

01:05:53

03

auto-research-agentic-search workflow

Give the agent room to mutate a search ranker, but keep the final score hidden so improvements have to survive a real holdout. · Doug Turnbull

01:41:07
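
The hidden-holdout idea in highlight 03 can be illustrated with a toy sketch. This is not the code shown in the episode: the documents, queries, rankers, and the `review_candidate` gate below are all hypothetical, chosen only to show why the agent should see a development score while acceptance depends on a score it never sees.

```python
from typing import Callable, Dict, List, Set, Tuple

Docs = Dict[str, str]
Judgments = Dict[str, Set[str]]
Ranker = Callable[[str, Docs], List[str]]

# Hypothetical toy corpus and relevance judgments (not from the episode).
DOCS: Docs = {
    "d1": "red running shoes",
    "d2": "blue running jacket",
    "d3": "red wool scarf",
    "d4": "trail running shoes",
}
DEV: Judgments = {"red shoes": {"d1"}, "shoes": {"d4"}}            # visible to the agent
HOLDOUT: Judgments = {"running jacket": {"d2"}, "wool scarf": {"d3"}}  # never shown


def overlap_ranker(query: str, docs: Docs) -> List[str]:
    """Baseline: rank by number of shared terms, ties broken by doc id."""
    terms = set(query.split())
    return sorted(docs, key=lambda d: (-len(terms & set(docs[d].split())), d))


# An overfit "mutation": it memorises the dev answers and ignores everything else.
MEMORISED = {"red shoes": ["d1", "d2", "d3", "d4"], "shoes": ["d4", "d1", "d2", "d3"]}


def memorising_ranker(query: str, docs: Docs) -> List[str]:
    return MEMORISED.get(query, sorted(docs, reverse=True))


def precision_at_1(ranker: Ranker, judgments: Judgments, docs: Docs) -> float:
    """Fraction of queries whose top-ranked document is relevant."""
    hits = sum(1 for q, rel in judgments.items() if ranker(q, docs)[0] in rel)
    return hits / len(judgments)


def review_candidate(candidate: Ranker, baseline: Ranker) -> Tuple[float, bool]:
    """Return the dev score (agent feedback) and an accept/reject decision.

    The holdout score itself is never returned, only whether it beat the baseline.
    """
    dev_feedback = precision_at_1(candidate, DEV, DOCS)
    accepted = precision_at_1(candidate, HOLDOUT, DOCS) > precision_at_1(baseline, HOLDOUT, DOCS)
    return dev_feedback, accepted


if __name__ == "__main__":
    print("baseline dev score:", precision_at_1(overlap_ranker, DEV, DOCS))
    print("candidate (dev score, accepted):", review_candidate(memorising_ranker, overlap_ranker))
```

In this toy setup the memorising ranker looks better than the baseline on the visible queries and is still rejected, because the decision is made on queries it could not have tuned against.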


EP 03 · MAY 2026 · 06 GUESTS · RUNTIME 3H 18M

FROM SKILLS TO AGENT HARNESSES

Research memory, local boxes, debug panes, live notebooks, video generation, and code repair.


PAUL IUSZTIN · decoding ai

ELEANOR BERGER · elite ai coding

ALAN NICHOL · rasa

VINCENT WARMERDAM · marimo

NICOLAY GEROLD · amp

MATTHEW HONNIBAL · spacy / explosion

INES MONTANI · spacy / explosion

▣ HIGHLIGHT REEL

01

try-except skill

A narrow audit pass for the failure mode agents love: broad try blocks and exception handlers that make bad code look green. · Matthew Honnibal

00:12:09

02

pre-mortem skill

Instead of hunting today's bugs, write incident reports for the failures a future reasonable edit could cause. · Matthew Honnibal

00:14:10

03

mutation-testing skill

Deliberately break the code, one mutation at a time, to find the bugs your test suite would let through. · Matthew Honnibal

00:14:10

04

Collapse publishing into one instruction: the agent ships HTML pages and small sites to live URLs without a GitHub Pages detour. · Eleanor Berger

00:45:55

05

anki-connect skill

Drive Anki through its local API with confirmation checks, so an agent can maintain flashcards without silently mutating memory. · Eleanor Berger

00:49:46

06

impeccable skill

Give coding agents a design language: fewer generic panels, more interfaces that look like someone meant it. · Eleanor Berger

00:50:02

07

youtube-watch-later-gist-summaries skill

Asked once from a phone; the agent invented browser login, transcript fetching, caching, and secret gist summaries. · Eleanor Berger

00:52:57

08

thread-postmortem skill

Use failed threads as harness training data: trace missteps to instructions, then delete or sharpen the rule that caused them. · Nicolay Gerold

01:59:04

09

remotion-video skill

Record a few minutes of audio; let the skill carry video taste, timing rules, frame checks, and avatar compositing. · Alan Nichol

02:46:00

10

Turn trusted sources into a durable research wiki, so future agents query accumulated context instead of starting over. · Paul Iusztin

02:19:52

11

personal-agent-harness workflow

Run a personal agent on a separate Mac mini, with Discord as the front door and autonomy earned inside a hard perimeter. · Eleanor Berger

00:47:50
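
As a rough illustration of the mutation-testing idea in highlight 03 above, here is a minimal sketch. It is not the skill from the episode: the `is_adult` function, the hand-written mutation list, and both test suites are hypothetical, and real mutation-testing tools generate mutants far more systematically.

```python
from typing import Callable, List, Tuple

# Hypothetical code under test, kept as source text so it can be mutated.
SOURCE = """
def is_adult(age):
    return age >= 18
"""

# Each mutation is one small textual change applied to the original source.
MUTATIONS: List[Tuple[str, str]] = [(">=", ">"), (">=", "<="), ("18", "19")]


def load(source: str) -> Callable[[int], bool]:
    """Execute the source in a fresh namespace and return is_adult."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["is_adult"]


def weak_suite(fn: Callable[[int], bool]) -> bool:
    """Only checks values far from the boundary."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn: Callable[[int], bool]) -> bool:
    """Adds checks on both sides of the boundary."""
    return weak_suite(fn) and fn(18) is True and fn(17) is False


def surviving_mutants(suite: Callable[[Callable[[int], bool]], bool]) -> List[Tuple[str, str]]:
    """Apply one mutation at a time; a mutant survives if the suite still passes."""
    assert suite(load(SOURCE)), "suite must pass on the original code"
    survivors = []
    for old, new in MUTATIONS:
        mutant = load(SOURCE.replace(old, new, 1))
        if suite(mutant):
            survivors.append((old, new))
    return survivors


if __name__ == "__main__":
    print("weak suite survivors:", surviving_mutants(weak_suite))
    print("strong suite survivors:", surviving_mutants(strong_suite))
```

Each surviving mutant points at a behaviour the suite does not pin down; here the weak suite never exercises the boundary at 18, so two of the three mutants slip through.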


EP 02 · MAY 2026 · 04 GUESTS · RUNTIME 2H 14M

BUILDING AGENTS THAT IMPROVE THE WORKFLOW

Prompt refinement, eval-driven charts, human-in-the-loop EDA, and local-first inference.


HILARY MASON · hidden door

BRYAN BISCHOF · theory ventures

ERIC MA

TOMASZ TUNGUZ · theory ventures

▣ HIGHLIGHT REEL

01

prompt-refinement skill

Interview intent first, then generate risky variants and score them against a rubric written before the run. · Hilary Mason

01:01:00

02

marimo-pair skill

Drop the agent inside a live Marimo kernel, so plots, widgets, markdown, and corrections happen in one reactive notebook. · Eric Ma

00:11:57

03

agentic-eda workflow

The human chooses the next scientific question; the agent renders evidence fast enough to keep exploration one plot at a time. · Eric Ma

00:23:27

04

eval-driven-charts workflow

Turn every failed chart eval into a library feature, so the package cannot regress on a case it already learned. · Bryan Bischof

01:25:11

05

weekly-gremlins workflow

Schedule three bad-idea personas to pitch, critique, and write moonshot docs no product roadmap would allow. · Hilary Mason

01:14:20

06

local-first-agents workflow

Default local: fast Qwen on a laptop, private workflows, offline flights, and cloud calls only for named exceptions. · Tomasz Tunguz

02:07:42


EP 01 · APR 2026 · 03 GUESTS · RUNTIME 1H 32M

THE AGENTIC SOFTWARE FACTORY

RoboRev, agent memory, personal commands, and LLM-as-judge chart checks.


WES MCKINNEY · pandas / posit

JEREMIAH LOWIN · prefect / fastmcp

RANDY OLSON · goodeye labs

▣ HIGHLIGHT REEL

01

When ten agents are running, each one explains the change like a colleague, not a diff bot. · Jeremiah Lowin

00:46:14

02

github-reply skill

A tiny etiquette layer for OSS maintenance: reject clearly, sound human, and stop wrapping "no" in fake praise. · Jeremiah Lowin

00:54:08

03

One phrase, one override: in Jeremiah's world "ship it" means open the PR, never merge it. · Jeremiah Lowin

00:54:52

04

high-signal-chart-workflow skill

Run search, chart variants, linting, and an LLM-as-judge Tufte test until the graphic actually carries the story. · Randy Olson

01:12:37

05

8-bit-video-gen skill

The show's own production skill: turn guest photos into pixel art, animate them, and feed the retro livestream system. · Show Us Your Agent Skills

ep 01

06

agentic-software-factory workflow

Scale agentic engineering with commits every turn, RoboRev reading every line, and a review queue agents must drain. · Wes McKinney

00:27:14

07

second-brain workflow

Feed daily voice memos into editable agent memory, turning personal context into a substrate future sessions can use. · Jeremiah Lowin

00:35:50


GUEST DOSSIERS

one page per builder: segment video, field notes, artifacts, and the workflow they showed

HAMEL HUSAIN · EP 04 · skill scepticism, internal APIs, and constraints over prose

CHRIS FONNESBECK · EP 04 · review loops for Bayesian modeling and agent-written code

DOUG TURNBULL · EP 04 · agentic search, hidden validation, and anti-overfit evals

PAUL IUSZTIN · EP 03 · research skills, trusted sources, and a writing knowledge base

ELEANOR BERGER · EP 03 · Hermes, fnord, local boxes, and agent boundaries

ALAN NICHOL · EP 03 · programmatic video, Remotion, and command-line production

VINCENT WARMERDAM · EP 03 · marimo pair, widgets, and notebooks as shared state

NICOLAY GEROLD · EP 03 · debug surfaces, thread postmortems, and reusable context

MATTHEW HONNIBAL · EP 03 · try-except repair, mutation testing, and self-review loops

HILARY MASON · EP 02 · prompt refinement, weekly gremlins, and taste loops

BRYAN BISCHOF · EP 02 · eval-driven charts, ratchets, and chart judgment

ERIC MA · EP 02 · agentic EDA, marimo, and human-in-the-loop plots

TOMASZ TUNGUZ · EP 02 · local-first agents, earnings analysis, and fast local models

WES MCKINNEY · EP 01 · RoboRev, long-running agents, and agentic engineering

JEREMIAH LOWIN · EP 01 · personal software, explain, ship-it, and GitHub replies

RANDY OLSON · EP 01 · Tufte checks, chart workflows, and LLM-as-judge review

SELECTED SKILLS & WORKFLOWS

A selection of skills and workflows from the streams, packaged into the companion repo · hugobowne/show-us-your-agent-skills

$ npx skills add https://github.com/hugobowne/show-us-your-agent-skills

skill

EP 01

Agent narrates what it just did, like a teammate handing off.

Jeremiah Lowin · Prefect / FastMCP ↗ youtube · 00:46:14

skill

EP 01

Replies to GitHub contributors in your voice. No "great work, but rejected" sandwiches.

Jeremiah Lowin · Prefect / FastMCP ↗ youtube · 00:54:08

skill

EP 01

Re-trains "ship it" to mean open a PR, not merge.

Jeremiah Lowin · Prefect / FastMCP ↗ youtube · 00:54:52

skill

high-signal-chart-workflow

EP 01

One-line idea → Tufte-style chart with an LLM-as-judge verifier loop.

Randy Olson · Goodeye Labs · r/dataisbeautiful ↗ youtube · 01:12:37

skill

8-bit-video-gen

EP 01

Turns guest headshots into 8-bit pixel-art video clips for livestream intros and cutaways.

Show Us Your Agent Skills ↗ ep 01 on youtube

skill

prompt-refinement

EP 02

Interview intent, ask for three variations at different magnitudes, score against a rubric.

Hilary Mason · Hidden Door ↗ youtube · 01:01:00

skill

EP 02

Agent drives a reactive Marimo notebook through a bash bridge into the Python kernel.

Eric Ma · Moderna ↗ youtube · 00:11:57

workflow

EP 02

Human-in-the-loop EDA. Every claim backed by an artifact.

Eric Ma · Moderna ↗ youtube · 00:23:27

workflow

eval-driven-charts

EP 02

Build an agent-facing chart library by generalising eval failures into features; the package can never regress on an eval it once passed.

Bryan Bischof · Theory Ventures ↗ youtube · 01:25:11

workflow

weekly-gremlins

EP 02

Three agent personas pull from a bad-ideas backlog, pitch and critique each other, and write design docs for moonshots no roadmap would schedule.

Hilary Mason · Hidden Door ↗ youtube · 01:14:20

YOUR HOSTS

two builders · two hosts · on a mission to find out what people at the top of the game are actually doing

HUGO BOWNE-ANDERSON

host · vanishing gradients

AI builder, consultant, educator of 6+ million students; ex-Yale, ex-Max Planck.

THOMAS WIECKI

host · pymc labs

Co-creator of PyMC. Founder of PyMC Labs. Has built Bayesian models for hedge funds, Fortune 500s, and indie tinkerers for over a decade.