AI systems / retrieval / evals / architecture
AI systems that hold up in production.
I build agent workflows, retrieval layers, and internal AI platforms that stay legible under real constraints, through grounding, evals, observability, and disciplined integration work.
Best fit: teams where the model is no longer the interesting part.
What I audit first
The first pass is usually enough to tell whether the system is trustworthy, merely busy, or headed for a political meeting.
01. Retrieval quality and fallback behavior
02. Tool boundaries and approval paths
03. Evaluation coverage and failure criteria
04. Telemetry, rollback, and operator visibility
05. Ownership once the system is live
Writing
Notes on systems, delivery, and the failure modes in between.
Notes on grounded AI systems, evals, observability, and integration-heavy delivery once the demo is no longer the hard part.
Best Practices for Building Agentic AI Systems in 2026
A practical March 2026 playbook for agent systems: tool contracts, approvals, retrieval discipline, evals, and telemetry.
Preventing Hallucinations in LLM Systems
A March 2026 playbook for groundedness: retrieval discipline, abstention, claim checks, evals, and guardrails.
Spec-Driven Development for Agent Workflows
A practical 2026 view of SDD: project brief, research, architecture, ticket sync, implementation, janitor.
Assistant
A bounded assistant for public questions.
Ask about grounding, architecture, delivery style, or role fit. It uses the public profile, published writing, and site metadata.
Architecture, delivery, retrieval, fit
Ask about system design, delivery, or fit.
A few useful starting points are built into the chat.
Contact
Available for selected architecture roles, focused audits, and embedded build shaping.
Best for teams with a live system, a trust problem, or an integration mess that has started to become political.