Rebuilding the Colour Index, the global dye and pigment reference, on an edge-first stack
The brief: the Colour Index is the global reference for dye and pigment data, maintained by the Society of Dyers and Colourists and relied on by manufacturers, formulators and researchers worldwide. The platform behind it had become an ageing Laravel and MySQL monolith carrying years of schema debt. The task was to rebuild it from the ground up without losing a single record.
The problem
The Colour Index v2 was a Laravel and MySQL monolith. It worked, but it carried a decade of legacy schema debt, was slow to change, and was a recurring source of support tickets. Years of open requests had piled up in the backlog with nowhere to go. The data model fought against the way the catalogue actually worked: a single shade dropdown where records needed many tags, classification taxonomies that had drifted, verified products that had quietly lost their tags, and fingerprints stuck in an unpublished state.
And the data itself was substantial and irreplaceable: tens of thousands of products, more than eleven thousand colour fingerprints with their chemical identifiers, structures and spectral data, over a thousand organisations, and nearly twelve thousand user accounts. None of it could be lost or reset in a rebuild. On top of that, decades of historical Colour Index publications existed only in print, invisible to anyone searching the modern catalogue.
The catalogue is a global standard. A rebuild could not afford to drop a record, break a login, or lose the provenance of who changed what and when.
What we built
A ground-up v3 on an edge-first, serverless stack (Cloudflare Workers and R2, with Supabase Postgres), designed around three principles: edge-first, API-first, and AI augments while humans verify.
- A full v2 to v3 migration that lost nothing. 45,720 products, 11,771 colour fingerprints, 1,090 organisations and around 11,800 users moved off the old monolith. Existing password hashes were migrated rather than reset, so users log in to v3 on day one with no disruption.
- Data repaired in flight. The migration did not just copy the old data; it cleaned it. 3,064 Verified product tags were restored, 8,358 fingerprints flipped to published, 10,175 application-area links rebuilt, and 208 drifted attribute slugs canonicalised.
- A public, rate-limited API plus an MCP server and webhooks. The catalogue is now queryable by authenticated third parties and AI agents. The UI is just one consumer of the API; there are no UI-only endpoints.
- AI-assisted chemistry tooling. A SMILES change auto-regenerates the 2D structure image via RDKit; missing structures can be proposed from a PubChem lookup; and a structure image can be turned back into SMILES by an extraction model. Every AI proposal carries a confidence score and is human-approved before it enters the canonical record.
- The Heritage Edition. Decades of historical Colour Index publications brought online as a searchable, AI-linked corpus, connected back to the modern entities they describe.
- A complete admin and subscriber platform. Eleven live dashboard panels, reporting with CSV and print-to-PDF export, a built-in CMS, comments and contributor workflows, and a self-service account suite spanning collections, organisations and subscriptions.
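The "AI augments while humans verify" rule in the list above can be sketched as a small approval gate. This is a minimal illustration in Python, not the platform's actual code: the names Proposal, Catalogue, propose and approve, the field names, and the source label are all hypothetical, and the source does not state how proposals are stored. The only behaviour taken from the text is that each proposal carries a confidence score and that nothing reaches the canonical record without a human approval.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Proposal:
    """A machine-generated suggestion awaiting human review (hypothetical shape)."""
    record_id: str
    field_name: str
    proposed_value: str
    confidence: float            # 0.0 to 1.0, reported by the proposing tool
    source: str                  # free-text label for the tool, an assumption
    status: Status = Status.PENDING
    reviewed_by: Optional[str] = None


class Catalogue:
    """Holds canonical records plus a queue of unreviewed proposals."""

    def __init__(self):
        self.records = {}        # record_id -> {field_name: value}
        self.queue = []          # proposals, never applied automatically

    def propose(self, proposal):
        # A proposal is only queued; the canonical record is not touched here.
        if not 0.0 <= proposal.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
        self.queue.append(proposal)
        return proposal

    def approve(self, proposal, reviewer):
        # Only a named human reviewer can move a pending proposal into the record.
        if proposal.status is not Status.PENDING:
            raise ValueError("proposal has already been reviewed")
        if not reviewer:
            raise ValueError("a reviewer is required")
        proposal.status = Status.APPROVED
        proposal.reviewed_by = reviewer
        record = self.records.setdefault(proposal.record_id, {})
        record[proposal.field_name] = proposal.proposed_value

    def reject(self, proposal, reviewer):
        # A rejected proposal stays in the queue for reference but is never applied.
        if proposal.status is not Status.PENDING:
            raise ValueError("proposal has already been reviewed")
        if not reviewer:
            raise ValueError("a reviewer is required")
        proposal.status = Status.REJECTED
        proposal.reviewed_by = reviewer
```

In this sketch the confidence score informs the reviewer but never bypasses them: even a proposal with confidence 1.0 stays pending until approve is called with a reviewer.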
The result
The entire v2 catalogue now lives on a modern edge-first platform, cleaner than it was before the move, with the integrity guarantees the old monolith never had: every record can answer who changed what, when and why. Users keep their existing logins. The data that was scattered and partly broken in v2 came across consolidated and repaired.
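One way to picture the "who changed what, when and why" guarantee is an append-only change log written alongside every update. The sketch below is an assumption-laden illustration in Python: AuditEntry, AuditedStore and their fields are hypothetical names, and the source does not describe how v3 actually records provenance.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEntry:
    """One immutable change record (hypothetical shape)."""
    record_id: str
    field_name: str               # what changed
    old_value: Optional[str]
    new_value: str
    changed_by: str               # who
    changed_at: datetime          # when (UTC)
    reason: str                   # why


class AuditedStore:
    """In-memory store that refuses any update lacking an actor and a reason."""

    def __init__(self):
        self._records = {}        # record_id -> {field_name: value}
        self._log = []            # append-only list of AuditEntry

    def update(self, record_id, field_name, new_value, *, changed_by, reason):
        if not changed_by or not reason:
            raise ValueError("changed_by and reason are both required")
        record = self._records.setdefault(record_id, {})
        entry = AuditEntry(
            record_id=record_id,
            field_name=field_name,
            old_value=record.get(field_name),
            new_value=new_value,
            changed_by=changed_by,
            changed_at=datetime.now(timezone.utc),
            reason=reason,
        )
        # Apply the change and log it together so the two cannot drift apart.
        record[field_name] = new_value
        self._log.append(entry)
        return entry

    def get(self, record_id, field_name):
        return self._records.get(record_id, {}).get(field_name)

    def history(self, record_id):
        # Entries are returned in the order the changes were made.
        return [e for e in self._log if e.record_id == record_id]
```

Because each entry keeps both the old and the new value, the history of a record can be replayed or inspected without consulting any other table. A production system would more likely enforce this in the database itself, but that choice is outside what the source describes.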
Two capabilities exist now that did not before. The catalogue is exposed as a documented public API with an MCP server, so third parties and AI agents can build on it directly. And the Heritage corpus, previously locked in print, is searchable and linked to the live catalogue for the first time. The platform is now in launch hardening, but the hard part, moving a global reference dataset without losing a record, is done and proven.