See our work

Anyone can show a good number.

What is hard — and what we sell — is the rigor that makes a number trustworthy: tested out-of-sample, checked against chance, pre-registered, forward-tested, and reported honestly even when it kills the idea. Below: how we do that, then a worked example or two for each service.

We have no public client case studies yet, so nothing here is a dressed-up client result. Each example is tagged either Reproducible (a real, public result you can re-run, linked to its source) or Illustrative (a representative walk-through of the method, no invented figures).

How we make a result trustworthy

The same checks, on every result we hand you

Out-of-sample, always

We never grade a method on the data it was built on. We hold data back — different periods, different cases — and only believe a result if it survives there.
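A minimal sketch of an out-of-time holdout, under stated assumptions: the record layout, the `out_of_time_split` name and the toy dates are all hypothetical, chosen only to show the shape of the check.

```python
from datetime import datetime

def out_of_time_split(records, cutoff):
    """Split time-stamped records into (train, holdout) at a cutoff.

    Rows strictly before `cutoff` may be used to build the method;
    rows at or after it are held back and used for scoring only.
    """
    train = [r for r in records if r["timestamp"] < cutoff]
    holdout = [r for r in records if r["timestamp"] >= cutoff]
    return train, holdout

# Hypothetical toy records, for illustration only.
records = [
    {"timestamp": datetime(2024, 1, day), "value": day * 10}
    for day in range(1, 11)
]
train, holdout = out_of_time_split(records, cutoff=datetime(2024, 1, 8))
```

Splitting on time rather than at random keeps later rows from leaking into the build step, which is why a random split tends to flatter a forecast.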

Tested against chance

Could a coin-flip have done this? We shuffle the labels and re-run (a permutation test). If a result does not clearly beat the shuffle, it does not ship.
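The shuffle can be sketched in a few lines. This is an illustration under assumptions, not a fixed recipe: the statistic (a difference in means), the function names and the "+1" p-value convention are choices made for the example.

```python
import random

def mean_difference(values, labels):
    """Mean outcome of label-1 cases minus mean outcome of label-0 cases."""
    ones = [v for v, lab in zip(values, labels) if lab == 1]
    zeros = [v for v, lab in zip(values, labels) if lab == 0]
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)

def permutation_p_value(values, labels, n_permutations=1000, seed=0):
    """Share of label shuffles that score at least as well as the real labels."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    observed = mean_difference(values, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)            # break any real link between labels and outcomes
        if mean_difference(values, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 counts the observed labelling itself, so p is never exactly zero.
    return (at_least_as_good + 1) / (n_permutations + 1)
```

Under this convention, 1,000 shuffles with none matching the real result give 1/1001, roughly 0.001, which is the smallest p-value that many shuffles can report.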

Deflated for how many we tried

Try enough ideas and one looks great by luck. We deflate every result for the number of attempts behind it, so we are not rewarding a lucky search.
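One simple way to deflate for the number of attempts is the Šidák correction, sketched below. It is named here as an illustrative stand-in; the text above does not say which adjustment is used, and this one assumes the attempts are independent. The function name is hypothetical.

```python
def sidak_adjusted_p(p_value, n_attempts):
    """Chance of at least one p-value this small across n_attempts
    independent tries when there is no real effect (Sidak correction)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if n_attempts < 1:
        raise ValueError("n_attempts must be at least 1")
    return 1.0 - (1.0 - p_value) ** n_attempts

# A p-value of 0.01 found after 20 attempts is far less impressive:
adjusted = sidak_adjusted_p(0.01, 20)   # about 0.18
```

A result that looked like a 1-in-100 fluke on its own becomes roughly a 1-in-5 event once the 20 tries behind it are counted.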

Pre-registered and hash-frozen

Before we see an outcome, we write down the decision, the hypotheses, and the exact test — then freeze it with a dated SHA-256 hash. The answer cannot be quietly moved after the fact.
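A minimal sketch of a hash-frozen plan, using only Python's standard `hashlib` and `json`. The field names, the `freeze_plan` and `verify_plan` helpers, and the example plan are hypothetical; the canonical-JSON step is one reasonable way to make the hash independent of key order.

```python
import hashlib
import json

def freeze_plan(plan, frozen_on):
    """Return the plan together with a SHA-256 digest of its canonical form."""
    canonical = json.dumps(
        {"frozen_on": frozen_on, "plan": plan},
        sort_keys=True,               # same digest regardless of key order
        separators=(",", ":"),        # no incidental whitespace
        ensure_ascii=True,
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {"frozen_on": frozen_on, "plan": plan, "sha256": digest}

def verify_plan(record):
    """True only if the plan and date still match the stored digest."""
    expected = freeze_plan(record["plan"], record["frozen_on"])["sha256"]
    return expected == record["sha256"]

# Hypothetical plan, for illustration only.
record = freeze_plan(
    {
        "decision": "adopt the new forecast if it beats the baseline",
        "hypothesis": "out-of-time error is lower than the baseline's",
        "test": "permutation test, 1000 shuffles, alpha 0.05",
    },
    frozen_on="2024-01-01",
)
```

Any later change to the decision, the hypotheses or the test changes the digest, so `verify_plan` fails. The hash by itself does not prove when it was made; sharing the digest with a third party at freeze time is one way to anchor the date.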

Forward-tested

Some claims only the calendar can settle. We pre-register the prediction and let it read out on a future date — not in hindsight.

Reported even when it fails

Our own files are full of “we tried this, it did not work.” That is the point — a firm that only ever shows wins is hiding its losses.

Why this is the moat

“Isn't a data moat weak?” Often, yes — so we don't sell you data.

The investors who popularised the “data moat” are also its sharpest critics, and where they land is exactly where we build. We sell defensible data design plus the embedded process that turns it into a decision, with a named expert accountable for the result. The proof is the offer: point us at a real problem and we'll show the working on a worked example you keep.

“Instead of getting stronger, the defensible moat erodes as the data corpus grows and the competition races to catch up.”
— Martin Casado & Peter Lauten, “The Empty Promise of Data Moats,” a16z

“Surviving vendor scrutiny to get access to sensitive data can itself be a moat against competitors.”
— a16z, same analysis. That bar — pre-registered, SHA-256 hash-frozen design, an expert in the loop — is what we build to.

The same direction of travel elsewhere: McKinsey's QuantumBlack argues the agentic-AI edge comes from rebuilding workflows around agents, not the raw data (Seizing the agentic AI advantage), and Bessemer's vertical-AI principles stress data quality over quantity. Which is the work below.

Data-Bottleneck Diagnosis & Fix

Point us at the data-heavy step that is slowing you down; we rebuild it and prove the gain.

Reproducible

Reproducing an expensive benchmark from public data

We took a publicly visible third-party screening benchmark and matched 95.8% of its top-100 from public data alone, then published the full reproducibility package, every factor's correlation included. The point is not the number; it is that you can re-run it yourself.

See the method + reproducibility →

Reproducible

Forecasting demand — and proving the number is real

On a public dataset of 17,379 hourly demand records, we forecast a later stretch of hours the model never saw. Scored the easy, flattering way (a random split), it looks like R² 0.67, where 1.0 is perfect. Scored the honest way, out-of-time, it is 0.62 — and it cuts a naive hour-of-day baseline's error by about 6%. The proof it is signal, not luck: across 1,000 shuffled-label runs, not one did better (p ≈ 0.001). Same public data, one script, the same numbers every time.
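The baseline comparison can be sketched as follows. The error metric (mean absolute error), the function names and the toy numbers are assumptions made for illustration; they are not the dataset's values or the published script.

```python
from collections import defaultdict

def fit_hour_of_day_baseline(train):
    """Average demand per hour of day, learned from training rows only.

    `train` is a list of (hour, demand) pairs.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for hour, demand in train:
        totals[hour] += demand
        counts[hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in totals}

def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def error_reduction(actual, model_pred, baseline_pred):
    """Fraction by which the model's error undercuts the baseline's."""
    baseline_error = mean_absolute_error(actual, baseline_pred)
    model_error = mean_absolute_error(actual, model_pred)
    return 1.0 - model_error / baseline_error

# Toy numbers only: train on earlier hours, score on later ones.
train = [(8, 100), (8, 120), (17, 200), (17, 220)]
later = [(8, 130), (17, 190)]
baseline = fit_hour_of_day_baseline(train)
actual = [demand for _, demand in later]
baseline_pred = [baseline[hour] for hour, _ in later]
model_pred = [120, 200]   # stand-in for a real model's forecasts
gain = error_reduction(actual, model_pred, baseline_pred)
```

The baseline is fitted on the earlier rows and scored on the later ones, so the model and the baseline face the same unseen hours.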

The public dataset (UCI) →

More worked examples are being added for this service.

The Forward-Data Process

Design, build and validate the dataset you should be building now for the decision you are about to make.

Reproducible

Pre-registering a decision so the answer cannot move

Before any data is touched, we freeze the decision, the hypotheses, the experiment and causal design, and the success test as a signed, dated, SHA-256 record. When the result lands there is no room to quietly re-cut it to taste — the freeze is the accountability.

See the Forward-Data Process →

Illustrative

Designing what to measure, before measuring it

Representative: a team about to run an experiment. We map the decision to its hypotheses, the causal design, the power needed to detect a real effect, and the pre-registration — so the data they collect is decision-grade, not vanity. Illustrative of the design process.
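The power step can be sketched with the standard normal approximation for comparing two group means. The function name and defaults are hypothetical, and the approximation slightly understates the sample size an exact t-test calculation would give.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Cases needed per group to detect a standardized mean difference
    (Cohen's d) with a two-sided test, by normal approximation."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)   # two-sided significance
    z_power = standard_normal.inv_cdf(power)           # desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

medium = sample_size_per_group(0.5)   # 63 per group
small = sample_size_per_group(0.2)    # 393 per group
```

Halving the effect you hope to detect roughly quadruples the data you need, which is why this number belongs in the design before collection starts.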

More worked examples are being added for this service.

Want this rigor on your problem?

Bring us the data-heavy step that is slowing you down, or the decision you need better data to make. We will scope it — and you will get the result with its working, not just the headline.

Start a project →

Get a free Data Snapshot →
