Transferithm

Calibrated transfer windows for the people who actually read the rumour mill.

Status: Active
Started: 2025-10
Stack: python · duckdb · scikit-learn · sportmonks · langchain · langgraph · pgvector · astro

What it is

A six-layer prediction system that fuses three independent streams of evidence — structural data (contracts, wages, age curves), rumour signals (NLP over a corpus of newsroom articles), and historical priors from Sportmonks — into a single calibrated probability per player per window.
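As a rough illustration of what fusing the three streams could look like, here is a minimal sketch that combines one probability-like score per stream by weighted log-odds pooling. The stream names, the weights, and the `fuse` function are hypothetical and not taken from the project. The pooled value is also not calibrated by itself; it would still need a post-hoc calibration step fitted on past windows.

```python
import math

# Hypothetical, hand-set weights per evidence stream (assumption, not the
# project's real values). Keeping them in one dict keeps them auditable.
WEIGHTS = {"structural": 0.5, "rumour": 0.3, "prior": 0.2}


def logit(p: float) -> float:
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fuse(signals: dict, weights: dict = WEIGHTS, eps: float = 1e-6) -> float:
    """Pool per-stream probabilities into one score via weighted log-odds.

    `signals` maps each stream name to a value in [0, 1]. Values are
    clipped to [eps, 1 - eps] so the logit stays finite.
    """
    z = 0.0
    for name, weight in weights.items():
        p = min(max(signals[name], eps), 1.0 - eps)
        z += weight * logit(p)
    return sigmoid(z)


# Example: one player, one window (made-up inputs).
score = fuse({"structural": 0.6, "rumour": 0.8, "prior": 0.3})
```

Because the weights sum to 1, three streams that agree on the same probability pool to that probability, which makes the behaviour easy to sanity-check by hand.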

The system is being built ahead of the 2026 World Cup, with the goal of publishing a public weekly probability sheet during the summer transfer window.

Why this and not a deep learning model

Soccer transfer prediction is a low-volume, high-stakes domain: a few hundred predictions per window, every one of them retroactively scoreable against a public outcome. In that setting the moat is not modelling capacity but calibration. A model whose calibration is checked with the Brier score, and whose signal weights are auditable, is something a journalist or an analyst can defend in print. A black-box deep model is not.
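Scoring a window retroactively is straightforward once outcomes are public. The sketch below computes the Brier score, the mean squared difference between each predicted probability and the 0/1 outcome, where lower is better. The `brier_score` helper and the sample numbers are illustrative assumptions, not project data.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    0.0 is a perfect score; always predicting 0.5 scores 0.25.
    """
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Made-up example: three predictions for one window and what happened
# (1 = the transfer went through, 0 = it did not).
window_probs = [0.8, 0.3, 0.1]
window_outcomes = [1, 0, 0]
window_brier = brier_score(window_probs, window_outcomes)
```

With only a few hundred predictions per window, a single Brier score is noisy, so comparing it against a simple baseline such as a constant base rate is usually more informative than reading the number in isolation.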

This is also the project where I get to work through the architecture patterns most relevant to Applied AI SA roles: two-stage retrieval, multi-source signal fusion, post-hoc calibration, agent-style orchestration via LangGraph, eval harnesses, and observability via Langfuse.

Status

Layer 0 (data ingestion) and Layer 1 (structural feature store) are in active build. Public dashboard targeted for the 2026 summer window. RITHM, the canonical scoring system, replaces an earlier intelligence-analysis prototype.

Linked writing

The architecture writeup will live under /writing/ once it’s published. Until then this page is the canonical pointer.