Selected Work

CarbonPool · Production AI

Two production AI agents, gated on evaluation, inside an insurance startup

Director of Climate Risk Products and Applied AI · CarbonPool

Days → under 10 minutes
~80% of team on AI-augmented workflows
2 agents in production

Technical project assessment and counterparty due diligence took several days of manual analyst work, and in an insurance context the output carries liability, so speed could not come at the cost of correctness.

What I did

Architected, built, and deployed two production AI agents with Claude Code: an applied research agent for technical project assessment, and a due-diligence / KYC agent for project developers, investors, and documentation. Each was gated on the model-evaluation pipeline before any client use. I also rolled out Claude Code to the modelling team and Cowork to the C-suite.

Outcome

Manual turnaround fell from several days to under 10 minutes. Around 80% of the team was onboarded onto AI-augmented workflows, with time saved measured per workflow.

Skills

Production agent development
Evaluation-first governance
AI adoption across a regulated org
Tech-to-non-tech translation

View Demo · View Repo

Open source · Live demo

A UK fair house price estimator that reports its own confidence

Personal open-source project, built with Claude Code on UK open data

3-method ensemble
10 pre-registered gates
~10% median error

Most automated valuation tools give a single confident number with no error band. This one runs a three-method ensemble: matched comparable sales, £/sq ft from EPC-verified floor area, and a public AVM cross-check. It then reports a range and a confidence rating, and abstains when the methods disagree.

What I did

Built the whole thing with Claude Code on UK open data alone (HM Land Registry sold prices and the EPC register), with no Rightmove or Zoopla scraping. The estimate runs three methods: matched comparable sales, a £/sq ft figure from EPC floor area, and an optional public-AVM cross-check. It reconciles them into a range and a confidence rating, and when they disagree by more than 20% it abstains instead of guessing. On top of the estimate sits an accountability layer of 10 pre-registered gates that an area has to clear before the tool will publish a verdict for it. I fix every threshold before I see the results, and I don't move them afterwards.
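The reconcile-or-abstain step can be sketched roughly as below. This is a minimal illustration, not the project's actual code. The function name `reconcile`, the method keys, the definition of disagreement (spread between the highest and lowest estimate, relative to the median), and the confidence bands are all assumptions. Only the 20% abstention limit comes from the description above.

```python
from statistics import median

# From the description: abstain when methods disagree by more than 20%.
DISAGREEMENT_LIMIT = 0.20


def reconcile(estimates, limit=DISAGREEMENT_LIMIT):
    """Reconcile per-method price estimates into a range, or abstain.

    `estimates` maps a (hypothetical) method name to a price in GBP,
    or to None when that method could not produce a figure.
    """
    available = [v for v in estimates.values() if v is not None]
    if len(available) < 2:
        # Assumption: one method alone cannot be cross-checked.
        return {"verdict": "abstain", "reason": "fewer than two methods available"}

    low, high = min(available), max(available)
    mid = median(available)
    # Assumption: disagreement = (highest - lowest) / median estimate.
    spread = (high - low) / mid

    if spread > limit:
        return {"verdict": "abstain", "reason": "methods disagree", "spread": spread}

    # Assumption: illustrative confidence bands, not the tool's real ones.
    confidence = "high" if spread <= limit / 2 else "medium"
    return {
        "verdict": "estimate",
        "range": (low, high),
        "midpoint": mid,
        "confidence": confidence,
        "spread": spread,
    }
```

Calling `reconcile({"comparables": 400_000, "price_per_sqft": 500_000, "public_avm": 450_000})` would abstain under these assumptions, because the spread between the methods is about 22% of the median.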

Outcome

All three demo areas (Maidenhead, West Ealing, Hayes) quarantine right now, and that's the design paying off. Both open-data methods land near a 10% median-error floor while the coverage gate wants about 5%. I ran a feasibility audit to be sure, and there's no defensible change that turns a quarantine into a pass without moving a threshold I'd committed to leaving alone. So the tool holds the line and abstains. It won't ship a number it can't stand behind, and that refusal is the part I'm proudest of.
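One way to picture the quarantine logic is a set of immutable gates checked against observed metrics. This is a hedged sketch only. The `Gate` class, the `area_verdict` function, and the treatment of the coverage gate as a maximum allowed error are assumptions for illustration. The roughly 5% requirement and roughly 10% observed error are the figures quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a threshold cannot be reassigned after creation
class Gate:
    name: str
    threshold: float  # maximum allowed value, fixed before results are seen

    def passes(self, observed: float) -> bool:
        return observed <= self.threshold


def area_verdict(gates, observed):
    """Return ("publish", []) if every gate passes, else ("quarantine", failed_names).

    A gate with no observed metric counts as failed, so missing evidence
    can never produce a published verdict.
    """
    failed = [
        g.name
        for g in gates
        if g.name not in observed or not g.passes(observed[g.name])
    ]
    return ("publish", []) if not failed else ("quarantine", failed)


# Hypothetical single-gate example using the figures from the text.
gates = (Gate("coverage", 0.05),)
status, failed = area_verdict(gates, {"coverage": 0.10})
print(status, failed)
```

Because the dataclass is frozen, trying to loosen a threshold after the fact (`gates[0].threshold = 0.10`) raises `dataclasses.FrozenInstanceError`, which mirrors the commitment not to move a gate once results are in.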

Skills

Eval-first development
Calibrated confidence and abstention
Pre-registered decision gates and refuse-to-ship governance
Uncertainty quantification for high-stakes decisions
Reproducible pipelines with end-to-end provenance

View Demo · View Repo

CarbonPool · Risk modelling

Zero to $XX M: a risk model behind a first-of-kind insurance product

Technical Team Founding Member (Employee #5) · CarbonPool

$XX M pre-underwritten premiums
50% faster pre-underwriting
100+ projects evaluated

A new carbon-credit insurance product needed a risk model that did not yet exist, built to standards an insurer and its regulator could stand behind.

What I did

Built the 0-to-1 risk and model-evaluation pipeline and architected the end-to-end risk-modelling stack and AWS structure to insurance-grade reproducibility. Established model governance, documentation, and data-quality protocols across 4 teams, and evaluated 100+ carbon-removal projects across nature-based and engineered pathways on Verra, Gold Standard, ACR, and CAR registries. The uncertainty methodology was reviewed by FINMA. Rebuilt the analytical codebase with Claude Code as advisor, writer, and reviewer.

Outcome

The model generated $XX M in pre-underwritten premiums. Pre-underwriting time fell by 50% and the assessable project portfolio expanded. Analytical turnaround fell by ~97%, recovering around $1,000/day in analyst hours.

Skills

0-to-1 risk modelling
Regulatory accountability
Model governance
Technical architecture
Portfolio and pricing input

McKinsey & Co · Climate analytics

TCFD physical-risk disclosure for enterprise clients at McKinsey

Research Science Specialist (Climate Data Scientist) · McKinsey & Company

~$XX M revenue-at-risk per client
3 enterprise clients
30+ AI/ML technologies assessed

Three top enterprise clients needed TCFD physical-risk disclosures that quantified financial exposure credibly enough for C-suite, risk, and technical stakeholders.

What I did

Delivered asset-level climate-hazard exposure analysis using Python, statistical modelling, and geospatial visualisation, and presented findings across stakeholder levels. Separately, built and led a competitive-intelligence programme covering 30+ AI/ML climate and decarbonisation technologies.

Outcome

Quantified approximately $XX M in revenue-at-risk per client. The competitive-intelligence investment thesis was adopted by the firm's green-business-building practice and published in a firm report.

Skills

Financial-impact quantification
Executive communication
Geospatial and statistical modelling
Competitive intelligence

Want the detail behind any of these?