i am jonathan haas.
i run EvalOps, a lab focused on making AI systems less brittle. right now that means stress-testing language models, measuring how they drift, and wiring guardrails back into production.
before this i shipped code at Snap, Carta, and DoorDash, then built ThreatKey. most days you can find me pairing with teams that want fewer surprises when they ship.
recent writing
Empirical comparison of OpenAI, Cohere, BGE, E5, and Instructor embeddings on real developer documentation queries, with analysis of cost, latency, and accuracy.
A comprehensive synthesis of 21 posts on developer experience (DX): patterns, principles, and practices for building exceptional developer tools.
It started with a Jupyter notebook. 'Look, I built a chatbot in 10 minutes!' Nine months later, three engineers had quit and the company almost folded.
projects i'm proud of
applied research shop pressure-testing evaluation guardrails with real teams.
field notes on hardening production systems before they fall apart.
multi-agent probes that flag conflicting model behavior before users see it.
hands-on playbook for shipping self-improving LLM apps without guesswork.