82W Digital

Shipping AI into places it usually doesn't go.

The models work. Most deployments don't. 82W runs a phase-gated delivery discipline: evals (the tests that define what working means) before the build, transfer at the end. We apply it to regulated-industry AI and to turning committed AI spend into shipped, measured value. Senior hands on every engagement, end to end.


What we do

The models work. The deployments don't. We close that gap in the places where the system has to hold up the first time.

The chasm between a promising pilot and a production system isn't a model problem. It's integration, quality at volume, observability, and ownership. A better model fixes none of those. A delivery method fixes all of them: define what working means before the build, gate every phase on a written exit condition, and hand over a system your team can run.

Senior hands from first call to handoff. No junior swap-outs, no staff augmentation. The work is architected to be handed off, not to create dependency.

Practice areas

Two practices. One way of working.

Everything we take on sits in one of two lanes. Both lean AI-first, both ship working code, and both run the same phase-gated discipline end to end, with the same senior hands.

Eval-defined. Phase-gated. Shipped.

Applied AI Engineering

We build AI systems for places where it has to work the first time: clinical settings, behavioral systems, regulated workflows. The eval suite is the spec. Passing behavior gets defined, version-controlled, and agreed before the build, so non-determinism becomes a managed property instead of a standing risk.

Regulated-industry AI
HIPAA-compliant builds, auditable pipelines, PHI-aware architectures. We've shipped into behavioral health and education environments where the rules are real.
Evals-as-engineering
Golden trajectories built from your hardest real cases, regression across model versions, LLM-as-judge calibrated against human review. No eval suite, no build.
AI agent development
Task-shaped agents, tool-use systems, and multi-step workflows, with constitutional guardrails, escalation paths, and review loops that keep humans in the decision. Built to be inspected, not just invoked.
Phase-gated delivery
Validation through Transfer, with a written exit condition between every phase. Working code in real hands in weeks, and a system your team runs without us at the end.
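
As a minimal sketch of what "the eval suite is the spec" can look like in code: the cases, the two stand-in model functions, and the 0.95 threshold below are all hypothetical illustrations, not taken from a real engagement. The point is only that passing behavior is written down as checks, and that a new model version is compared against the old one case by case.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One golden case: an input plus a check that decides pass or fail."""
    name: str
    prompt: str
    passes: Callable[[str], bool]


def pass_rate(model: Callable[[str], str], suite: list[EvalCase]) -> float:
    """Fraction of cases whose model output satisfies the case's check."""
    results = [case.passes(model(case.prompt)) for case in suite]
    return sum(results) / len(results)


def regressions(old: Callable[[str], str],
                new: Callable[[str], str],
                suite: list[EvalCase]) -> list[str]:
    """Names of cases the old version passed and the new version fails."""
    return [
        case.name
        for case in suite
        if case.passes(old(case.prompt)) and not case.passes(new(case.prompt))
    ]


# Hypothetical stand-ins for two model versions (a real suite calls a real model).
def model_v1(prompt: str) -> str:
    if "dose" in prompt:
        return "I can't advise on dosage. Please contact your clinician."
    return "Here is a summary."


def model_v2(prompt: str) -> str:
    return "Here is a summary."


# Hypothetical golden cases; real ones come from the hardest real cases.
SUITE = [
    EvalCase("escalates_dosage", "What dose should I take?",
             lambda out: "clinician" in out),
    EvalCase("summarizes_note", "Summarize this note.",
             lambda out: "summary" in out.lower()),
]

THRESHOLD = 0.95  # assumed value; the real threshold is agreed before the build

for label, model in (("v1", model_v1), ("v2", model_v2)):
    rate = pass_rate(model, SUITE)
    print(label, rate, "PASS" if rate >= THRESHOLD else "BLOCKED")

print("regressions in v2:", regressions(model_v1, model_v2, SUITE))
```

Because the suite is plain data plus checks, it can live in version control next to the system it specifies and be rerun on every model change.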

Committed spend, shipped value.

AI Value Realization

Most enterprise AI budgets are already committed to model contracts, platform agreements, and copilot seats. Most of that spend hasn't converted into shipped, measured value. We close the gap between what you bought and what your leadership was promised.

Committed-spend audit
A week-one inventory of the spend and the stalled initiatives against your actual data and systems. The output is a decision: the one use case with the shortest credible path to measurable production value.
Value-realization sprint
One use case through all five gates: eval-defined, built, hardened, shipped into the real workflow. Proof, not another pilot.
Production AI on your platform
Agents, RAG systems, and decision surfaces built on the data platform you already run. Tooled into your data, governed by your controls, deployed where your team already works.
The value memo
Instrumented workflows, lift measured against the eval baseline, cost per outcome. Two pages your leadership can take to the board, and the artifact that defends next year's line item.
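
The arithmetic behind a memo like this is small. The sketch below assumes one possible definition of each figure (relative lift in success rate, and spend divided by successful outcomes); every number in it is made up for illustration and is not a client result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """Instrumented counts for one measurement window (illustrative fields)."""
    attempts: int
    successes: int
    cost_usd: float

    @property
    def success_rate(self) -> float:
        return self.successes / self.attempts

    @property
    def cost_per_outcome(self) -> float:
        return self.cost_usd / self.successes


def relative_lift(baseline: Period, current: Period) -> float:
    """Change in success rate, expressed relative to the baseline rate."""
    return (current.success_rate - baseline.success_rate) / baseline.success_rate


# Made-up numbers, for illustration only.
baseline = Period(attempts=1000, successes=600, cost_usd=3000.0)
shipped = Period(attempts=1000, successes=750, cost_usd=3000.0)

print(f"lift vs baseline: {relative_lift(baseline, shipped):.0%}")
print(f"cost per outcome: ${baseline.cost_per_outcome:.2f} -> "
      f"${shipped.cost_per_outcome:.2f}")
```

Which definition of "outcome" and "lift" applies is itself something to settle when the eval baseline is defined, so the memo measures what was agreed rather than what is convenient.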

Selected work

Things we've actually shipped.

Four projects that prove the pitch. Open-source, regulated, responsible, and architectural. One of each.

01 · Shipping product

NextDialog

A calm interface for AI coding agents.

Multi-session desktop terminal manager for Claude Code, Cursor Agent, Gemini CLI, and custom agents. Built, shipped, and live at nextdialog.io.

Version: v0.3.2
Releases: 17
License: MIT
Platforms: macOS · Linux · Windows

02 · Regulated-industry fluency

Behavioral Health Platform

HIPAA-compliant behavioral health SaaS.

A multi-tenant platform for behavioral health providers: clinical workflows, risk flagging, and audit trails that meet the rules without getting in the way of care.

Compliance: HIPAA
Domain: Behavioral Health
Scope: Clinical + Admin

03 · Responsible AI at real stakes

Constitutional AI Tutor

Constitutional AI with human oversight for an education partner.

An AI tutoring and guidance system built for an education partner, with rules, escalation paths, and human review loops that keep the humans in charge.

Model: Constitutional AI
Oversight: Human-in-loop
Sector: Education

04 · Graph knowledge architectures

Substrate

A personal cognitive data substrate.

A graph-based knowledge management system with spreading activation, designed to hold and connect personal and project context across time.

Architecture: Graph-native
Algorithm: Spreading activation
Scope: Cognitive infrastructure
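
For readers unfamiliar with spreading activation, here is a generic textbook-style sketch. It is not Substrate's implementation: the graph, edge weights, decay, and threshold are all assumptions chosen to keep the example small. Activation starts at seed nodes and flows along weighted edges, shrinking at each hop until it falls below a cutoff.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, threshold=0.1, max_steps=3):
    """Propagate activation outward from seed nodes.

    graph: node -> {neighbor: edge weight}
    seeds: node -> initial activation
    Energy passed along an edge is energy * weight * decay; amounts below
    `threshold` are dropped, which is what eventually stops the spread.
    """
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                passed = energy * weight * decay
                if passed >= threshold:
                    next_frontier[neighbor] += passed
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Hypothetical context graph: a project linked to its notes and follow-ups.
GRAPH = {
    "project": {"meeting": 1.0, "design doc": 0.5},
    "meeting": {"action item": 1.0},
    "design doc": {},
    "action item": {},
}

result = spread_activation(GRAPH, {"project": 1.0})
for node, score in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score}")
```

Seeding "project" surfaces the meeting most strongly, then the design doc and the action item, which is the sense in which related context gets pulled in by association rather than by exact query.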

How we work

Five phases. Hard gates.

Every engagement runs the same phase-gated discipline: evals before build, transfer at the end, and a written exit condition between every phase. No gate, no next phase. Fixed fees where it matters.

00 · Validation · 1–2 weeks
Before anything gets built, the stated need gets reconciled against reality: the actual data, the actual systems, the actual people who'll live with the answer. Most AI failures are decided here, before a line of code exists.
Gate: the brief survives contact with your data and systems

01 · Eval Definition · 1–2 weeks
Passing and failing behavior gets defined as a version-controlled eval suite before the build begins. The suite is the spec. It's how a non-deterministic system becomes something you can accept, regress, and ship.
Gate: no eval suite, no build

02 · Build · 3–6 weeks
The system gets built against the suite: shipped into real hands early, instrumented from day one, iterated until the evals pass at threshold. Guardrails are operational, not theoretical.
Gate: the eval suite passes at threshold

03 · Production Hardening · 2–3 weeks
Monitoring live, failure modes rehearsed, integration load-tested, ownership assigned by name. The difference between a pilot and a system is everything in this phase.
Gate: monitoring live, ownership assigned

04 · Transfer · 1–2 weeks
You keep the eval suite, the architecture, the runbooks, and the working knowledge. The engagement is architected from day one to end with your team running it without us.
Gate: your team operates and extends it without us
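
One way to picture "no gate, no next phase" is as an ordered list of checks over the engagement's recorded state. The sketch below is illustrative only: the phase names and gate wording come from the list above, while the state keys (`brief_validated`, `pass_rate`, and so on) and the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Phase:
    name: str
    exit_condition: str           # the written gate, as stated above
    gate: Callable[[dict], bool]  # hypothetical machine-checkable form


PHASES = [
    Phase("Validation",
          "the brief survives contact with your data and systems",
          lambda s: s.get("brief_validated", False)),
    Phase("Eval Definition",
          "no eval suite, no build",
          lambda s: s.get("eval_suite_agreed", False)),
    Phase("Build",
          "the eval suite passes at threshold",
          lambda s: s.get("pass_rate", 0.0) >= s.get("threshold", 1.0)),
    Phase("Production Hardening",
          "monitoring live, ownership assigned",
          lambda s: s.get("monitoring_live", False) and bool(s.get("owner"))),
    Phase("Transfer",
          "your team operates and extends it without us",
          lambda s: s.get("team_operates_alone", False)),
]


def current_phase(state: dict) -> Optional[Phase]:
    """Return the first phase whose gate is not yet met, or None if all are met."""
    for phase in PHASES:
        if not phase.gate(state):
            return phase
    return None


# Hypothetical engagement state: evals agreed, build not yet at threshold.
state = {
    "brief_validated": True,
    "eval_suite_agreed": True,
    "pass_rate": 0.91,
    "threshold": 0.95,
}

blocked = current_phase(state)
print(f"In phase: {blocked.name} (gate: {blocked.exit_condition})")
```

Because phases are checked in order, later work cannot count as progress while an earlier gate is still open.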

Fit

Who this is (and isn't) for.

Good fit

Teams where AI has to be rigorous, auditable, and deployable the first time.
Organizations with committed AI spend that hasn't converted into shipped, measured value.
Healthcare, behavioral health, education, and other regulated or high-stakes domains.
Founders who need a working prototype in weeks, not a decision framework in months.

Not a fit

Staff augmentation or body-shop engagements.
AI readiness assessments, maturity models, or 20-page procurement roadmaps.
Projects that need a sales team, a marketing team, or a subcontracting layer.
Teams that haven't decided whether AI is worth doing. That's a different conversation.

82W takes on a small number of engagements at a time. Fit matters more than pipeline. If it's not a match, you'll hear that on the first call.

Start a conversation

Got a problem worth shipping an answer to?

One email. Describe the problem, the stakes, and the timeline. We reply to everything that isn't a sales pitch.

Start a conversation: brian@82wdigital.com
