§ Chapter III

Consulting

Hard problems,
patiently solved.

Selected advisory work on infrastructure problems that matter — reliability, security, platform design, and the engineering behind serious AI systems.

Focus: Infrastructure · AI
Scope: Advisory · Delivery
Teams: Startup · Enterprise
Method: Hands-on

§I — Practice

Five pillars, one point of view.

Authorization & Data Governance

Who can do what, where, and why — answered well.

My home turf. I help teams design authorization that captures the whole story and holds up in regulated environments. Technical enforcement of personnel- or device-attribute-based access control, resource-aware policies, business justification validation, tamper-proof audit, and human-and-agent approval flows. Drawn from a decade of running access systems at Google scale.

Typical outcomes

—AuthZ architecture that minimizes friction and survives audits
—Policy-as-code with strict SLOs and reliable continuous iteration of policy
—Immutable audit trails that actually answer questions
—A safe path for agents, not just humans, to gain access
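
As an illustration of the justification and audit ideas above, here is a minimal Python sketch. Everything in it is hypothetical: the `Request` fields, the rule inside `decide`, and the `AuditLog` class are assumptions made for the example, not a description of any production system. It shows an attribute-based decision that requires a business justification, plus an append-only log in which each entry carries the hash of the previous one, so a later edit is detectable.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Request:
    # All fields are illustrative assumptions.
    principal: str        # human or agent identity
    role: str             # personnel attribute
    device_managed: bool  # device attribute
    resource_tier: str    # "public" or "restricted"
    justification: str    # e.g. a ticket reference


def decide(req: Request) -> bool:
    """Hypothetical rule: restricted resources need an on-call role,
    a managed device, and a non-empty justification."""
    if req.resource_tier != "restricted":
        return True
    return (
        req.role == "oncall"
        and req.device_managed
        and bool(req.justification.strip())
    )


class AuditLog:
    """Append-only log; each entry's digest covers the previous digest."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries: list[dict] = []

    @staticmethod
    def _digest(prev: str, record: dict) -> str:
        payload = prev + json.dumps(record, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def append(self, req: Request, allowed: bool) -> None:
        prev = self.entries[-1]["digest"] if self.entries else self.GENESIS
        record = {"request": asdict(req), "allowed": allowed}
        self.entries.append(
            {"record": record, "prev": prev, "digest": self._digest(prev, record)}
        )

    def verify(self) -> bool:
        """Recompute the chain; any edited record breaks it."""
        prev = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            if entry["digest"] != self._digest(prev, entry["record"]):
                return False
            prev = entry["digest"]
        return True


def authorize(req: Request, log: AuditLog) -> bool:
    # The decision and its justification are logged together, allow or deny.
    allowed = decide(req)
    log.append(req, allowed)
    return allowed
```

A hash chain like this makes tampering evident rather than impossible; in practice the latest digest would also need to be anchored somewhere the log's writer cannot modify.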

Site Reliability Engineering

SLOs, error budgets, and grown-up on-call.

I wrote the chapter on this. Literally — “Eliminating Toil” in The Site Reliability Workbook. I help teams make reliability a first-class concern: SLOs that map to user journeys, error-budget culture that keeps launches honest, and on-call you can actually sustain.

Typical outcomes

—Service-level objectives tied to business value and real user journeys
—Observability that answers the 2am question
—On-call rotations engineers don’t dread
—Post-mortems that ask five whys, capture root cause, and change behaviour
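
For readers new to error budgets, the arithmetic is simple enough to sketch. The two helper functions below are hypothetical names chosen for this example; they assume an availability-style SLO expressed as a fraction and either a time window or a count of good versus total requests.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability SLO over a window."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    return (1.0 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, good: int, total: int) -> float:
    """Fraction of a request-based error budget still unspent.

    A negative value means the budget is overspent.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    if total <= 0:
        raise ValueError("total must be positive")
    allowed_bad = (1.0 - slo) * total
    actual_bad = total - good
    return 1.0 - actual_bad / allowed_bad
```

A 99.9% SLO over 30 days leaves roughly 43.2 minutes of budget. With one million requests and 500 failures against the same SLO, about half of the budget remains, which is the kind of number that makes a launch conversation concrete.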

Service Mesh & Zero Trust

Connectivity without the chaos.

Istio, Envoy, Anthos: I co-wrote the reference account of how Google’s Corp Eng rolled out Anthos Service Mesh (ASM) internally. I help teams pick the mesh shape that fits their traffic, their org, and their on-call load, and the migration path that doesn’t break them in the process.

Typical outcomes

—A mesh architecture matched to your actual needs
—Zero-trust mTLS done without breaking everything
—Progressive delivery engineers trust
—A migration plan with graceful rollbacks

Distributed Systems & Platform

Consensus, coordination, and correctness.

The problems that don’t appear in a single-node test suite. Consistency models, partition behaviour, leader election, back-pressure, idempotency. Plus the platform engineering around them: golden paths, cluster topology, and blast-radius control.

Typical outcomes

—Architecture reviews that find the real risks
—Correctness properties stated plainly and tested
—Replay, reconciliation, and recovery paths
—A platform team charter worth keeping
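
Of the properties listed above, idempotency is the easiest to show in a few lines. The sketch below is a toy under stated assumptions: `LedgerService`, its in-memory dictionary, and the `idempotency_key` parameter are all hypothetical, and a real service would persist keys durably and handle concurrent retries.

```python
from dataclasses import dataclass, field


@dataclass
class LedgerService:
    """Toy in-memory ledger that deduplicates retried writes by key."""

    balance: int = 0
    _results: dict[str, int] = field(default_factory=dict)

    def apply_credit(self, idempotency_key: str, amount: int) -> int:
        # A retry with a key we have already seen returns the stored result
        # instead of applying the credit a second time.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.balance += amount
        self._results[idempotency_key] = self.balance
        return self.balance
```

A client that times out and resends the same request with the same key sees the same answer, and the balance changes only once. This is exactly the behaviour a single-node test suite rarely exercises, because the retry only happens when the network misbehaves.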

AI & Agentic Engineering

From demo to durable — when the user is an agent.

A working prototype is not a product — especially when the product talks back. I help teams build serious agentic systems: evals that correlate with user value, observability for non-deterministic systems, and the infrastructure question we’re all still answering: what do filesystems, identity, and access look like when the caller is an agent?

Typical outcomes

—Evals that correlate with user value
—Observability for non-deterministic systems
—Authorization and audit for agent callers
—Cost & latency budgets that hold under growth

§II

Enterprise roadmap

Selling into Big Enterprise?
Here’s the roadmap.

Most startups find the regulated-enterprise and federal markets opaque — a wall of acronyms sitting between them and ten-figure deal sizes. It isn’t opaque. It’s a staged climb with well-known rungs, and each rung is tractable if you plan for it. Here’s the ladder I help teams map against — and the insider-risk controls that live across every stage.

00
Foundations
Stage 00
Unlocks — Mid-market & large commercial — most procurement reviews clear here.
SOC 2 Type II · ISO 27001 · Evidence automation · VPAT / accessibility
Table stakes. The goal is not to pass — it’s to pass cheaply and never again.
01
Commercial federal
Stage 01
Unlocks — Civilian agencies, state & local, heavily regulated commercial.
FedRAMP Moderate · StateRAMP · CJIS where relevant · Customer-held encryption (BYOK/HYOK) · US-only residency
The first step that actually changes the product. Expect 12–18 months and a dedicated program lead.
02
DoD-adjacent
Stage 02
Unlocks — DoD mission owners, intelligence community primes, defence industrial base.
FedRAMP High · DoD IL4 / IL5 · CMMC 2.0 L2 / L3 · GovCloud isolation · Hardware root of trust
Architecture starts to fork. Worth it if your deal sizes justify a second SKU.
03
Export-controlled
Stage 03
Unlocks — Primes, defence R&D, regulated research, the top of the pyramid.
ITAR boundary · EAR / dual-use controls · Nationality-aware access · Regional data residency (EU, UK, AUKUS) · Air-gapped / sovereign options
Path-aware access control — not just endpoint checks — is the quiet load-bearing requirement here.

Cross-cutting

Insider risk &
privileged access

The same set of controls keeps reappearing at every stage — and it’s the set procurement and internal audit teams scrutinise most. Get these right once and the climb is dramatically cheaper.

01
Privileged access, justified in real time
Every privileged action — human or agent — gates on a business justification evaluated in under 200ms. Justification and decision are sealed into the audit record.
02
Path-aware authorization
Validate every device a request traverses, not just source and destination. The class of control regulators are increasingly asking for, and what my 2024 patent disclosure addresses.
03
Agent identity, audited like human identity
Treat AI agents as first-class principals: their own identity, their own scoped permissions, their own attributable audit trail. Not a shared service account.
04
Break-glass without the glass shattering
Emergency access that is fast, heavily audited, time-boxed, and reviewed. A pattern that survives a P0 *and* a post-incident GRC sit-down.
05
Access reviews that aren’t theatre
Quarterly attestations that actually remove privileges — structured so reviewers can say yes or no without reading a thousand rows.
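
To make the path-aware idea in control 02 concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the method described in the patent disclosure: the `Hop` attributes, the attestation flag, and the region rule are all hypothetical. The point it shows is that a request can have a trusted source and destination and still be denied because of a hop in between.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    # Illustrative attributes for one device or proxy on the request path.
    name: str
    attested: bool  # passed a hypothetical device attestation check
    region: str     # where the hop is located


def authorize_path(path: list[Hop], allowed_regions: set[str]) -> tuple[bool, str]:
    """Allow only if every hop is attested and inside an allowed region."""
    if not path:
        return False, "empty path"
    for hop in path:
        if not hop.attested:
            return False, f"hop {hop.name} is not attested"
        if hop.region not in allowed_regions:
            return False, f"hop {hop.name} is in disallowed region {hop.region}"
    return True, "all hops passed"
```

An endpoint-only check would look at the first and last element of `path` and stop there. Returning the reason alongside the decision keeps the audit record useful when a reviewer asks why a request was denied.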

Reference: justauth.tech

§III

Process

How an engagement unfolds.

Every engagement starts with a conversation and ends with something you keep. What happens in between is shaped to the problem.

I · 30 min
Listen
A free call. You describe the problem; I ask a lot of questions. We both leave with a clearer picture of whether I’m useful.
II · 1 week
Scope
A short written proposal: the problem as I understand it, the proposed engagement shape, deliverables, and a fixed price.
III · 2–6 weeks
Dive
I work alongside your team — reading code, instrumenting systems, joining standups, writing design docs, shipping fixes.
IV · 1 week
Hand-off
A durable artifact: a memo, a working system, a hiring brief, a playbook. The goal is that you don’t need me after I leave.

§IV

Case notes

The shape of the work.

A few representative engagements drawn from years on Access SRE. Companies anonymised, details adjusted, shape preserved.

Case 01

Regulated-industry SaaS

6 weeks

Problem

“Zero-trust authorization stalled at scale — policy latency made every request a liability.”

Approach

Re-shaped evaluation as path-aware, with attribute inputs resolved at the edge and a tiered cache for device and location signals.

Result

Policy evaluation under 50ms at p99. Audit coverage held. Compliance team stopped blocking launches.

AuthZ · Zero Trust

Case 02

Legal + finance org

8 weeks

Problem

“High-value research was slow, manual, and too expensive to repeat reliably.”

Approach

Built grounded AI research workflows with clear review checkpoints, source capture, and failure boundaries that made the output auditable.

Result

Turnaround dropped from days to minutes, replacing expensive manual research loops with agentic systems people would actually use.

AI Eng · Reliability

Case 03

AI-native startup

4 weeks

Problem

“Agentic system worked in demo, drifted in prod; no way to measure the drift.”

Approach

Built a repo-grounded eval harness (see RepoGauge), wired evals into CI, and defined the authorization boundaries the agents had to respect.

Result

Regressions caught in PR rather than by customers. Agent cost per task down 35%. Trust, slowly, restored.

AI Eng · Evals

§V

Engagement

Three shapes of working together.

Office hours

Weekly · ongoing · From $2.5k / mo

Two hours of my week, on tap. A standing call plus async Slack/email. The right shape for teams that need a thoughtful outside eye, not a deliverable.

✓Architecture review on demand
✓Hiring & rubric guidance
✓Code & design-doc reads

Most chosen

Focused engagement

2–8 weeks · Fixed-price

Most common. A defined problem with a defined outcome — a reliability rollout, an AI pipeline hardening, a platform blueprint. Scoped together before we start.

✓Named deliverable(s)
✓Code contributions where useful
✓Written artifact you keep

Advisory

Quarterly · Cash or equity

For founders in the early innings. A few hours a month at your side as you make the architectural decisions that are hard to unmake later.

✓Strategic technical direction
✓Interview loops & senior hires
✓Early-stage partnership

To begin

Tell me what’s on fire.

A single, honest paragraph is all it takes to start. I read every message, and reply to most within two business days.

[email protected] · See past work
