Colour Index v2 to v3: migrating a global standard

The Colour Index, the global reference for dye and pigment data maintained by the Society of Dyers and Colourists and relied on by manufacturers, formulators and researchers worldwide, has moved to a new platform. We rebuilt it from the ground up as v3, edge-first and API-first, and migrated it off its ageing monolith without losing a single record.

The headline isn't the new stack, satisfying as it is. It's that a global standard dataset changed platforms underneath the people who depend on it every day, and nobody had to stop, re-register, or lose their place. The hard part of any platform rebuild isn't the code you write — it's the data you can't afford to drop. Here's how the v2 → v3 move went.

Every product and colour fingerprint carried across from v2 — nothing left behind

Why v2 had to go

The Colour Index v2 was a Laravel and MySQL monolith. It worked — it had served the catalogue for years — but it carried a decade of legacy schema debt, was slow to change, and had become a recurring source of support tickets. Years of open requests had piled up in the backlog with nowhere to go, because the data model fought against the way the catalogue actually works: a single shade dropdown where records needed many tags, classification taxonomies that had quietly drifted, verified products that had lost their tags, fingerprints stuck unpublished.

And the data itself was substantial and irreplaceable: tens of thousands of products, more than eleven thousand colour fingerprints with their chemical identifiers, structures and spectral data, over a thousand organisations, and nearly twelve thousand user accounts. None of it could be lost or reset in a rebuild.

The catalogue is a global standard. A rebuild could not afford to drop a record, break a login, or lose the provenance of who changed what and when.

The migration that lost nothing

The numbers tell the story. Moving off the old monolith, we carried across 45,720 products, 11,771 colour fingerprints, 1,090 organisations and around 11,800 users. Crucially, existing password hashes were migrated rather than reset — so on day one, every user logs in to v3 with the credentials they already had. No reset emails, no lockouts, no “please re-register” banner. The platform changed; their experience of signing in didn't.
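To make the "lost nothing" claim concrete, here is a minimal sketch of the kind of check that can back it up: compare per-table row counts between source and target, and carry each stored password hash across as an opaque string. This is an illustration, not the project's actual migration code. The table names, field names and both functions are hypothetical; only the three counts come from the figures above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountMismatch:
    table: str
    source: int
    target: int


def reconcile(source: dict[str, int], target: dict[str, int]) -> list[CountMismatch]:
    """Return one entry per table whose row count differs after migration."""
    mismatches = []
    for table in sorted(source.keys() | target.keys()):
        src = source.get(table, 0)
        dst = target.get(table, 0)
        if src != dst:
            mismatches.append(CountMismatch(table, src, dst))
    return mismatches


def migrate_user(v2_row: dict) -> dict:
    """Map a v2 user row to a v3 row (hypothetical field names).

    The stored hash is copied verbatim and never re-hashed, so the
    user's existing password keeps verifying against it.
    """
    return {
        "email": v2_row["email"],
        "password_hash": v2_row["password"],
    }


# Counts quoted in the article; the table names are assumptions.
V2_COUNTS = {"products": 45_720, "fingerprints": 11_771, "organisations": 1_090}
```

An empty list from `reconcile` means every table arrived with the same number of rows it left with. Copying the hash only works if the new sign-in path can verify whatever hash format the old system stored, which is an assumption this sketch does not check.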

Repaired in flight

A migration that only copies data forward inherits all of its old problems. This one didn't just move the catalogue — it cleaned it on the way through. Records that had drifted or quietly broken in v2 arrived in v3 consolidated and corrected.

Data corrected as it crossed, not after — v3 arrived cleaner than v2 ever was

3,064 Verified product tags restored. 8,358 fingerprints flipped to published. 10,175 application-area links rebuilt. 208 drifted attribute slugs canonicalised. Each of those was a small, quiet wrong in v2 that the rebuild put right — so the catalogue users see in v3 is not just the same data on a faster platform, it's a better-formed version of the data than they had before.
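Slug drift is the easiest of those repairs to picture in code. The sketch below shows one plausible way to canonicalise slugs and to find the groups of variants that have drifted apart. The normalisation rules (lower-case, strip accents, hyphen-separate) are an assumption for illustration, and the function names are hypothetical; the article does not say which rules the migration applied.

```python
import re
import unicodedata


def canonical_slug(raw: str) -> str:
    """Lower-case, strip accents, and collapse separators into single hyphens."""
    text = unicodedata.normalize("NFKD", raw)
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")


def group_drifted(slugs: list[str]) -> dict[str, list[str]]:
    """Group slug variants by canonical form, keeping only groups that drifted."""
    groups: dict[str, list[str]] = {}
    for slug in slugs:
        groups.setdefault(canonical_slug(slug), []).append(slug)
    return {
        canon: variants
        for canon, variants in groups.items()
        if any(variant != canon for variant in variants)
    }
```

Running `group_drifted` over the old attribute table would list each canonical slug alongside the variants that need rewriting, which gives a reviewable change set before anything is updated.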

Edge-first, and API-first this time

v3 runs on a modern, edge-first serverless stack — Cloudflare Workers and R2, with Supabase Postgres — designed around three principles: edge-first, API-first, and AI augments while humans verify. The most consequential of those is API-first. The catalogue is now exposed as a documented, rate-limited public API, with an MCP server and webhooks alongside it. The website is just one consumer of that API; there are no UI-only endpoints. For the first time, authenticated third parties and AI agents can build directly on the world's reference dataset for colour.

One documented API — queried by the site, third parties and AI agents over MCP

api.colourindex.org · MCP

GET /v3/colourants/CI-Pigment-Red-122

200 OK

{
  "ci_number": "73915",
  "name": "C.I. Pigment Red 122",
  "class": "Quinacridone",
  "status": "verified"
}
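For a third party consuming that response, a small typed wrapper might look like the sketch below. It builds the lookup URL and parses the four fields shown in the illustration above. The `https` scheme, the `Colourant` class and both helper names are assumptions; authentication, rate-limit handling and any fields beyond those four are not shown in the article and are left out here.

```python
import json
from dataclasses import dataclass
from urllib.parse import quote

# Host as shown in the illustration; the https scheme is an assumption.
BASE_URL = "https://api.colourindex.org"


@dataclass(frozen=True)
class Colourant:
    ci_number: str
    name: str
    colour_class: str
    status: str


def colourant_url(identifier: str) -> str:
    """Build the lookup URL, percent-encoding the identifier path segment."""
    return f"{BASE_URL}/v3/colourants/{quote(identifier, safe='')}"


def parse_colourant(body: str) -> Colourant:
    """Parse a response body containing the four fields in the example."""
    data = json.loads(body)
    return Colourant(
        ci_number=data["ci_number"],
        name=data["name"],
        colour_class=data["class"],  # "class" is a Python keyword, so renamed
        status=data["status"],
    )
```

Keeping URL construction and parsing separate from the HTTP call means the same two helpers work with whichever HTTP client a consumer already uses.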

The chemistry tooling leans on AI without ever handing it the final say. A SMILES change auto-regenerates the 2D structure image via RDKit; a missing structure can be proposed by looking it up on PubChem; and a structure image can be turned back into SMILES by an extraction model. Every AI proposal carries a confidence score and is human-approved before it enters the canonical record. The machine drafts; a person signs off.

AI proposes a structure with a confidence score — a person approves before it's canonical

SMILES: O=C1c2cc3cc2…c4ccccc4

↓ RDKit regenerates the 2D structure

AI proposed · 98% confidence

✓ Human approved · entered the canonical record
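The "machine drafts, person signs off" rule can be expressed as a simple gate in the data model. The following is a minimal sketch under stated assumptions, not the platform's code: the class names, the `source` labels and the 0 to 1 confidence scale are all hypothetical. The point it illustrates is that a proposal sits in a pending list and only touches the canonical SMILES once a named reviewer approves it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StructureProposal:
    smiles: str
    confidence: float                 # model-reported score in [0, 1] (assumed scale)
    source: str                       # hypothetical label, e.g. "image-extraction"
    approved_by: Optional[str] = None


@dataclass
class CanonicalRecord:
    name: str
    smiles: Optional[str] = None
    pending: list[StructureProposal] = field(default_factory=list)

    def propose(self, proposal: StructureProposal) -> None:
        """Queue an AI proposal; the canonical SMILES is not changed."""
        if not 0.0 <= proposal.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        self.pending.append(proposal)

    def approve(self, proposal: StructureProposal, reviewer: str) -> None:
        """Promote a pending proposal to canonical, recording who approved it."""
        if proposal not in self.pending:
            raise ValueError("unknown proposal")
        if not reviewer:
            raise ValueError("a named reviewer is required")
        proposal.approved_by = reviewer
        self.smiles = proposal.smiles
        self.pending.remove(proposal)
```

Note that confidence is stored but never used to auto-approve: even a very high score still waits for a person, which matches the rule described above.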

And something that never existed before: Heritage

Decades of historical Colour Index publications existed only in print — invisible to anyone searching the modern catalogue. v3 brings them online as a searchable, AI-linked corpus, connected back to the modern entities they describe. The provenance of a pigment, once buried in a printed volume on a shelf, is now a click away from its live record.

Where it stands

The entire v2 catalogue now lives on v3: cleaner than it was before the move, with integrity guarantees the old monolith never had — every record can answer who changed what, when and why. The platform is in launch hardening now, but the hard part — moving a global reference dataset between platforms without losing a record — is done and proven.
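One common way to let every record answer "who changed what, when and why" is an append-only audit log. The sketch below is an assumption about the general shape of such a log, not a description of how v3 stores it; every class and field name is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEntry:
    record_id: str
    actor: str        # who
    field_name: str   # what
    old_value: str
    new_value: str
    reason: str       # why
    at: datetime      # when (UTC)


class AuditLog:
    """Append-only log: entries can be added and read, never edited."""

    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []

    def record(self, record_id: str, actor: str, field_name: str,
               old_value: str, new_value: str, reason: str,
               at: Optional[datetime] = None) -> AuditEntry:
        if not reason.strip():
            raise ValueError("every change needs a reason")
        entry = AuditEntry(record_id, actor, field_name, old_value,
                           new_value, reason, at or datetime.now(timezone.utc))
        self._entries.append(entry)
        return entry

    def history(self, record_id: str) -> tuple[AuditEntry, ...]:
        """Return the changes for one record, oldest first."""
        return tuple(e for e in self._entries if e.record_id == record_id)
```

Making the reason mandatory at write time is the design choice that matters here: the "why" cannot be reconstructed later if it was never captured.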

If you want the full engineering story — the data repair, the edge architecture, the AI tooling and the Heritage build — we wrote it up as a case study. And if you've got an ageing platform of your own carrying a decade of schema debt, that's exactly the kind of move we like to be handed: tell us about it.
