Cross-validation: does an independent dataset agree?

In one line: Before trusting "rainfall is declining here," ask a different rainfall dataset, built from different inputs, the same question. If both agree, your confidence goes up. If they don't, you've caught a problem.

Why one dataset isn't enough

Every dataset has a personality. ERA5 is a reanalysis — a weather model run over history, nudged by observations. It's superb for temperature but has known biases for precipitation (models struggle with rain). So an ERA5-only rainfall trend could be partly an artefact of the model, not the climate.

The fix is triangulation. CHIRPS is built completely differently — it blends rain-gauge records with satellite cloud-temperature estimates. If ERA5 and CHIRPS both say "drying, about this fast," the answer survived an independent check. That's what the "Cross-check" line on a /verify precipitation trend reports.

What "agree" actually means

Two estimates agree when they point the same direction and their confidence intervals overlap — i.e. the difference between them is within the noise. It's a weak-but-real test: cheap to run, and it catches gross errors and product-specific artefacts.
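
The two-part rule above can be written down as a small function. This is a minimal sketch, not the code behind the "Cross-check" line: the name `estimates_agree` and the `(slope, ci_low, ci_high)` tuple layout are assumptions made for illustration, and the numbers are invented.

```python
def estimates_agree(a, b):
    """Apply the two-part agreement test to two trend estimates.

    Each argument is a (slope, ci_low, ci_high) tuple. This layout is an
    assumption made for this sketch, not a documented format.
    """
    slope_a, lo_a, hi_a = a
    slope_b, lo_b, hi_b = b
    # Same direction: both slopes positive, or both not positive.
    same_direction = (slope_a > 0) == (slope_b > 0)
    # Two closed intervals overlap when each starts before the other ends.
    cis_overlap = lo_a <= hi_b and lo_b <= hi_a
    return same_direction and cis_overlap


# Both drying, and the intervals intersect on [-3.5, -2.9].
print(estimates_agree((-3.1, -4.0, -2.9), (-2.8, -3.5, -1.5)))  # True
# Both drying, but the intervals are disjoint.
print(estimates_agree((-3.0, -3.5, -2.5), (-1.0, -1.4, -0.6)))  # False
# Opposite directions, even though the intervals overlap.
print(estimates_agree((-0.2, -1.0, 0.6), (0.3, -0.5, 1.1)))     # False
```

The third case shows why both conditions are needed: two near-zero slopes with wide intervals can overlap comfortably while still telling opposite stories.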

The honest crack: "Agree" means "survived one independent check," not "true." Two gridded products can lean on some of the same satellite inputs and share a bias, so they can be wrong together. Cross-validation raises confidence; it doesn't prove correctness. For that you need to leave the models entirely and compare to ground instruments.
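
A toy simulation makes "wrong together" concrete. Everything here is synthetic and hypothetical: the flat true trend, the size of the shared drift, and the noise level are invented for the sketch and say nothing about how ERA5 or CHIRPS actually behave. The slope is a hand-rolled Theil-Sen estimate (the median of all pairwise slopes), so the block needs only the standard library.

```python
import random
from statistics import median


def theil_sen_slope(x, y):
    """Median of all pairwise slopes (the Theil-Sen estimator)."""
    n = len(x)
    return median(
        (y[j] - y[i]) / (x[j] - x[i])
        for i in range(n)
        for j in range(i + 1, n)
    )


rng = random.Random(0)
years = list(range(1990, 2021))

TRUE_TREND = 0.0     # synthetic truth: rainfall is not changing at all
SHARED_DRIFT = -2.0  # hypothetical drift in an input that both products use

# Each product = truth + the shared drift + its own independent noise.
product_a = [1300 + (TRUE_TREND + SHARED_DRIFT) * (y - 1990) + rng.gauss(0, 10)
             for y in years]
product_b = [1300 + (TRUE_TREND + SHARED_DRIFT) * (y - 1990) + rng.gauss(0, 10)
             for y in years]

slope_a = theil_sen_slope(years, product_a)
slope_b = theil_sen_slope(years, product_b)
print("product A:", round(slope_a, 2), "/yr")
print("product B:", round(slope_b, 2), "/yr")
print("truth    :", TRUE_TREND, "/yr")
```

Both products report a decline of roughly the same size, so a cross-check between them would pass, yet the underlying series is flat. The comparison cannot see a bias that both sides inherit.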

Play with it

Two trend estimates with their uncertainty. Slide them apart, or widen the uncertainty, and watch the verdict flip between agree and disagree.
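
The same experiment can be run as a short loop. This is a sketch under stated assumptions: both estimates share one symmetric ± half-width, and the function name `verdict` and all the numbers are made up for illustration.

```python
def verdict(slope_a, slope_b, half_width):
    """Return 'agree' or 'disagree' for two slopes sharing a +/- half_width CI."""
    same_direction = (slope_a > 0) == (slope_b > 0)
    # Equal-width intervals overlap exactly when the gap between the
    # centres is no larger than the two half-widths combined.
    cis_overlap = abs(slope_a - slope_b) <= 2 * half_width
    return "agree" if same_direction and cis_overlap else "disagree"


# Slide the second slope away from the first, uncertainty fixed at +/- 1.0.
for slope_b in (-3.0, -2.0, -0.5, 0.5):
    print(f"a=-3.0  b={slope_b:+.1f}  ci=1.0 ->", verdict(-3.0, slope_b, 1.0))

# Keep the slopes fixed and widen the uncertainty instead.
for half_width in (0.5, 1.0, 1.5):
    print(f"a=-3.0  b=-0.5  ci={half_width} ->", verdict(-3.0, -0.5, half_width))
```

In the first loop the verdict flips to disagree once the gap exceeds twice the half-width, and stays there when the second slope changes sign. In the second loop, widening the interval to ±1.5 is enough to turn the same pair of slopes back into an agreement, which is a reminder that very wide intervals make this test easy to pass.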

Do it yourself

import numpy as np
from scipy.stats import theilslopes

rng = np.random.default_rng(1)
years = np.arange(1990, 2021)
# A real decline of 3 units per year. Flip the sign on one product -> DISAGREE.
truth = -3.0 * (years - 1990)
# Two synthetic "products": same underlying trend, independent noise.
era5 = 1300 + truth + rng.normal(0, 70, years.size)
chirps = 1300 + truth + rng.normal(0, 70, years.size)

def trend(v):
    # Theil-Sen slope plus its confidence bounds (scipy's default alpha is 0.95).
    s, _, lo, hi = theilslopes(v, years)
    return s, lo, hi

es, elo, ehi = trend(era5)
cs, clo, chi = trend(chirps)
same_sign = (es > 0) == (cs > 0)        # same direction?
overlap = elo <= chi and clo <= ehi     # do the two intervals intersect?
agree = same_sign and overlap
print("ERA5  :", round(es, 2), "/yr    CHIRPS:", round(cs, 2), "/yr")
print(">>>", "AGREE - both independent products tell the same story" if agree else "DISAGREE - do not trust either yet")

This is internal consistency — the system checking itself. It's necessary but not sufficient. The next guide, ground-truth anchoring, is the step that actually leaves the models and checks against the physical world.