A short account of the person on the other end of the email.

Engineer, manager, dad, builder. Spent the last stretch as a Staff Engineer and Uber Tech Lead at Google. Chapter author of The Site Reliability Workbook and Building Secure and Reliable Systems. Now independent and exploring how to make autonomous agentic engineering reliable.

Based in: Los Angeles
Status: Independent · 2026
From: Google - 2017
Focus: AuthZ · AI · SRE

David Challoner smiling with his child in a black-and-white illustrated portrait

Family Portrait

Fig. 1 — The author, off the clock · © DC

I spent nearly a decade at Google, including the last five years as an SRE “Uber” Tech Lead. My team ran the authorization and data-governance systems behind GCP’s Regulated Cloud, as well as the controls that governed privileged access inside Google. Running systems at that scale forces you to think carefully about reliability, even far out on the tail. It was a great opportunity, but after that long it was time to try something different, and I recently stepped out to build independently.

Along the way I co-wrote the Anthos Service Mesh adoption story for Google Corp Eng, was the primary author of “Eliminating Toil” in The Site Reliability Workbook (O’Reilly, 2018), and contributed to Building Secure and Reliable Systems (Google, 2020). I’ve also filed a handful of patent disclosures, including one on path-aware access control.

Now that I’ve left, I’m building a few things in the open: JustAuth, RepoGauge, and ClawFS. Each one pokes at a different edge of the question: what does infrastructure look like when the primary user is an agent, not a human?

Off the keyboard I’m a dad, archer, hiker, long-time tinkerer, and the proud maintainer of several half-finished side projects.

Specialties

Authorization · Data Governance · Zero-Trust · Service Mesh · SRE · Agentic Systems.

Also

Rust · Python · Go · TypeScript. Writer, occasional speaker, patient debugger.

§II

Experience

A record of where the time went.

2026 —
Independent · Building & advising
Building JustAuth, RepoGauge, and ClawFS in the open. Taking on a small number of consulting engagements per quarter on authorization, reliability, service mesh, and agentic AI infrastructure.
AuthZ · AI Eng. · SRE
— March 2026
Uber Tech Lead · Google — Regulated Cloud SRE
Ran the authorization and data-governance systems that mediate how Googlers reach internal applications. Led the adoption of Anthos Service Mesh across Corp Eng to enforce consistent security policies over services deployed across cloud, corp data centers, and edge locations.
AuthZ · Zero Trust · Anthos · Istio
Earlier at Google
Senior Site Reliability Engineer · Google
A long tour across SRE teams. Primary author of “Eliminating Toil” in The Site Reliability Workbook (O’Reilly, 2018) and contributing author to Building Secure and Reliable Systems (Google, 2020).
SRE · Reliability · Platform
Before Google
Software Engineer · Startups & infrastructure
Early-career engineering in distributed systems and infrastructure. Learned what “good” operations look like, largely by not having them yet.
Distributed · Infra

§II.ii

Publications

Written, co-written, and quoted elsewhere.

2018
The Site Reliability Workbook — Ch. 6 · O’Reilly · Primary author
“Eliminating Toil” — the chapter on identifying and reducing the repetitive, predictable work that erodes SRE team capacity. Co-authored with John Looney, Vivek Rau, and others.
Book · SRE
2020
Building Secure and Reliable Systems · Google · Contributing author
Google’s follow-up to the SRE Book, bringing the disciplines of reliability and security into one volume.
Book · Security
2022
Securing apps for Googlers using Anthos Service Mesh · Google Cloud Blog · Co-author
With Anthony Bushong. How Corp Eng and Access SRE adopted Anthos Service Mesh to mediate Googler access across trust boundaries with minimal operational overhead.
Service Mesh · Zero Trust
2024
Access Control Based on Entire Request Path · Invention disclosure · TDCommons 6837
A path-aware access-control mechanism that validates every device a request traverses, not just the endpoints, to prevent regulated data from leaving permissible regions.
Patent · AuthZ
2026
Climbing the Agentic Coding Ladder · LinkedIn · Essay
A working field guide to autonomous coding agents: which rungs are solid, which are aspirational, and how to tell in practice.
AI Eng.

§II.iii

Education

Somewhat formal training.

2004 — 2008
Computer Science + Political Science · University of California, Santa Cruz
Data structures and algorithms, storage systems (Ceph!), distributed systems. Also studied international trade, how democracies work, and which policies seem to promote the most prosperity.

§III

Principles

How I think about the work.

01
Boring is a feature.
The best infrastructure is the kind you can forget about. I reach for proven tools and earn the right to novelty.
02
Reliability is a lens.
SLOs before launches. Error budgets as real constraints. If you can’t measure it, you can’t protect it.
03
Small surface area.
Interfaces should be narrow, durable, and easy to reason about. Remove weight from the airframe.
04
Write things down.
Design docs, post-mortems, READMEs. The org’s memory outlives its people; the artifacts should be worth inheriting.
05
Technology agnostic.
I’ve seen pretty bad code in every language. Velocity and shipping confidently without regressions matter more than frameworks and languages.
06
Respect the on-call.
Be deliberate about what surfaces as a ticket vs. a page. Waking people up kills them slowly.
