Small models + smart routing: the workflow pattern behind “mini/nano” AI (and how to use it)

Published 2026-03-18 • Tags: AI trends, operations, cost, governance, workflows

A quiet trend is becoming the loudest business lesson in AI: most of your work is not “frontier reasoning” work. It’s classification, rewriting, extraction, summarisation, policy checks, and glue.

Thesis: “mini/nano” models are a forcing function for a better architecture. Build tiered routing: send 80–95% of tasks to small, fast models; escalate only when risk/ambiguity is high. Your wins are latency, cost, and reliability — without sacrificing safety.

Fresh signals (why this matters this week)

The practical pattern: tiered routing

Tiered routing is simple: use the cheapest model that can reliably meet your task’s quality and safety constraints. If the job is uncertain, high-impact, or policy-sensitive, route it up.

Think in three tiers

Rule: T0 defaults to small/fast models. T1 escalates based on confidence. T2 is always gated (human approval, tight tool scopes, audit logs), regardless of model.
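To make the rule concrete, here is a minimal sketch of a tier chooser. The `Task` fields, the `high_impact` flag, and the 0.8 confidence threshold are all assumptions made for illustration; the rule above does not prescribe how confidence is scored or what counts as high-impact in your workflow.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    T0 = "small_fast"   # default: small/fast model
    T1 = "escalated"    # reached only when confidence is low
    T2 = "gated"        # human approval, tight tool scopes, audit logs


@dataclass
class Task:
    confidence: float   # hypothetical score in [0, 1] from the first attempt
    high_impact: bool   # hypothetical flag: high-impact or policy-sensitive


def choose_tier(task: Task, min_confidence: float = 0.8) -> Tier:
    """Gate first, then escalate on low confidence, otherwise stay on the default."""
    if task.high_impact:
        return Tier.T2  # always gated, regardless of model or confidence
    if task.confidence < min_confidence:
        return Tier.T1
    return Tier.T0
```

Checking the gate before the confidence score is deliberate: a confident answer on a high-impact task should still go through approval.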

How routing decisions should actually be made (not vibes)

Don’t route based on “this feels hard”. Route based on observables your workflow can measure.
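As a rough illustration of what "observables" can look like, the sketch below derives a few signals from a task's input and a small model's output. The field names, the sensitive-term list, and the 4000-character limit are hypothetical placeholders, not a recommended set; swap in whatever your workflow can actually measure.

```python
import json
from dataclasses import dataclass

REQUIRED_FIELDS = ("name", "email", "intent")       # hypothetical output schema
SENSITIVE_TERMS = ("refund", "legal", "contract")   # hypothetical policy list


@dataclass
class Signals:
    valid_json: bool      # did the model return a parseable JSON object?
    missing_fields: int   # required fields that are absent or empty
    sensitive_hits: int   # policy-sensitive terms found in the input
    input_chars: int      # input length in characters


def measure(input_text: str, model_output: str) -> Signals:
    """Derive routing signals from things the workflow can observe directly."""
    try:
        parsed = json.loads(model_output)
    except json.JSONDecodeError:
        parsed = None
    valid = isinstance(parsed, dict)
    fields = parsed if valid else {}
    missing = sum(1 for name in REQUIRED_FIELDS if not fields.get(name))
    lowered = input_text.lower()
    hits = sum(1 for term in SENSITIVE_TERMS if term in lowered)
    return Signals(valid, missing, hits, len(input_text))


def should_escalate(s: Signals, max_chars: int = 4000) -> bool:
    """Escalate if any measurable signal falls outside the safe default."""
    return (
        not s.valid_json
        or s.missing_fields > 0
        or s.sensitive_hits > 0
        or s.input_chars > max_chars
    )
```

Every condition here is something you can log and audit later, which is the point: a routing decision should be explainable from recorded signals.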

Routing signals you can implement this week

Workflow example: “Inbound leads → next-step email” (SMB-friendly)

The win: the expensive model is now exception handling, not the default. Your AI spend becomes predictable — and your response time gets faster.
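A back-of-the-envelope sketch shows why spend becomes predictable. It assumes every task is first attempted on the small model and that only escalated tasks also pay for the large one. The per-task costs in the usage note are made-up numbers for illustration, not real prices.

```python
def blended_cost_per_task(
    small_cost: float,
    large_cost: float,
    escalation_rate: float,
) -> float:
    """Expected cost per task when escalated tasks pay for both attempts."""
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    return small_cost + escalation_rate * large_cost
```

With a hypothetical small-model cost of 0.001, a large-model cost of 0.02, and a 10% escalation rate, the blended cost works out to roughly 0.003 per task, against 0.02 if everything went to the large model. The escalation rate is the one number that drives the bill, which is why it is worth measuring.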

The missing piece: evaluation gates for routing

Routing is only safe if you can test it. Create a tiny eval set (20–50 real examples) and track: task success, policy violations, hallucinations, and time-to-complete.
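The four tracked measures can be rolled up with very little code. The sketch below assumes each example in the eval set has already been run and labeled by a reviewer; the `EvalResult` fields and the gate thresholds are illustrative assumptions, not recommended values.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalResult:
    success: bool           # output met the task's acceptance check
    policy_violation: bool  # reviewer flagged a policy violation
    hallucination: bool     # reviewer flagged an unsupported claim
    seconds: float          # time-to-complete for this example


def summarize(results: list[EvalResult]) -> dict[str, float]:
    """Aggregate per-example labels into the four tracked metrics."""
    n = len(results)
    if n == 0:
        raise ValueError("eval set is empty")
    return {
        "task_success_rate": sum(r.success for r in results) / n,
        "policy_violation_rate": sum(r.policy_violation for r in results) / n,
        "hallucination_rate": sum(r.hallucination for r in results) / n,
        "mean_seconds": mean(r.seconds for r in results),
    }


def passes_gate(
    summary: dict[str, float],
    min_success: float = 0.9,      # hypothetical threshold
    max_violations: float = 0.0,   # hypothetical threshold
) -> bool:
    """Block a routing change unless the eval set clears both thresholds."""
    return (
        summary["task_success_rate"] >= min_success
        and summary["policy_violation_rate"] <= max_violations
    )
```

Run `summarize` once per tier or per routing configuration, then compare the summaries before changing what goes where.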

Minimum viable “routing eval”

A rollout checklist (1–2 weeks)

Sources consulted via RSS for the fresh signals: the OpenAI News feed (e.g. “Introducing GPT-5.4 mini and nano”) and the Google Research feed (e.g. “Introducing Groundsource…”).

Where Workflow ADL fits

Workflow ADL is built around the idea that AI is a workflow system: queues, tools, approvals, and auditability. Tiered routing is how you make that system economical.

If you want to adopt this pattern quickly, start with one workflow and measure escalation rate. Your first goal is not “perfect AI” — it’s a predictable, safe default.
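Measuring escalation rate can be as simple as counting tier labels in your workflow log. This sketch assumes each completed task is logged with a tier label such as "T0", "T1", or "T2"; the log format is a hypothetical stand-in for whatever your system records.

```python
from collections import Counter


def escalation_rate(tiers: list[str], default_tier: str = "T0") -> float:
    """Fraction of tasks that left the default tier."""
    if not tiers:
        return 0.0
    counts = Counter(tiers)
    return 1 - counts[default_tier] / len(tiers)


def tier_breakdown(tiers: list[str]) -> dict[str, float]:
    """Share of tasks handled by each tier, for a quick weekly review."""
    total = len(tiers)
    return {tier: count / total for tier, count in Counter(tiers).items()}
```

A rate that creeps upward over time is a useful early warning: either the inputs have shifted or the default tier is being asked to do work it cannot handle reliably.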