It's more complicated than you might think.
We don't just compare slider values. Both colors are converted into a color space designed to match human vision, then the distance between them is measured using CIEDE2000 — a formula from color science that quantifies how different two colors look, not how far apart their numbers are. It corrects for known issues where older formulas scored greens, blues, and purples unfairly. That distance is shaped into a game score that rewards getting the right color family — because remembering "it was a warm orange" is the hardest part, and that should count. Five rounds, 0-10 per round, max 50.
The game's picker uses three sliders: Hue, Saturation, and Brightness (HSB). It's tempting to just measure how far off each slider is and call it a day. But that would produce scores that feel unfair, because equal steps on a slider don't look like equal steps to the human eye.
The same numerical difference on a slider can look dramatic or invisible depending on context. A scoring system based on raw slider math would reward and punish the wrong things. So we use color science instead.
Four stages: convert to a perceptual color space, measure distance, shape it into a score, then adjust for hue accuracy.
Both colors are converted from HSB to CIELAB, a color model designed so that equal distances correspond approximately to equal perceived differences. It was created by the International Commission on Illumination (CIE) and is the standard model in color science for measuring how colors look to people.
CIELAB has three axes: L* (lightness, from black to white), a* (green to red), and b* (blue to yellow).
With both colors in CIELAB, we measure the perceptual distance between them using CIEDE2000 — the most accurate Delta E formula in color science. Unlike the simpler CIE76 (which just measures straight-line distance in Lab), CIEDE2000 applies corrections for lightness, chroma, and hue that match how human vision actually works. It's the industry standard in manufacturing, printing, and display calibration.
The formula accounts for the fact that a given Lab distance does not look equally large everywhere: in some color regions it corresponds to a smaller visible difference than in others. CIE76 treats every region alike, which is why the original version of the game scored greens and purples unfairly harshly.
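For readers who want to see the corrections spelled out, the following is a compact Python sketch of the published CIEDE2000 formula with the default weights (kL = kC = kH = 1). It is a reference-style transcription, not the game's own code; the function and helper names are chosen for this example.

```python
import math


def _cosd(deg: float) -> float:
    return math.cos(math.radians(deg))


def _sind(deg: float) -> float:
    return math.sin(math.radians(deg))


def ciede2000(lab1, lab2) -> float:
    """CIEDE2000 colour difference between two (L*, a*, b*) triples."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2

    # Step 1: stretch a* for low-chroma colors, then get chroma and hue angle.
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2
    g = 0.5 * (1 - math.sqrt(c_bar**7 / (c_bar**7 + 25**7)))
    a1p, a2p = (1 + g) * a1, (1 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360
    h2p = math.degrees(math.atan2(b2, a2p)) % 360

    # Step 2: signed differences in lightness, chroma and hue.
    d_l = L2 - L1
    d_c = c2p - c1p
    achromatic = c1p * c2p == 0
    d_h_angle = 0.0 if achromatic else h2p - h1p
    if d_h_angle > 180:
        d_h_angle -= 360
    elif d_h_angle < -180:
        d_h_angle += 360
    d_h = 2 * math.sqrt(c1p * c2p) * _sind(d_h_angle / 2)

    # Step 3: means, with the hue mean taken the short way around the circle.
    l_bar = (L1 + L2) / 2
    c_bar_p = (c1p + c2p) / 2
    if achromatic:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180:
        h_bar = (h1p + h2p) / 2
    elif h1p + h2p < 360:
        h_bar = (h1p + h2p + 360) / 2
    else:
        h_bar = (h1p + h2p - 360) / 2

    # Step 4: weighting functions and the blue-region rotation term.
    t = (1 - 0.17 * _cosd(h_bar - 30) + 0.24 * _cosd(2 * h_bar)
         + 0.32 * _cosd(3 * h_bar + 6) - 0.20 * _cosd(4 * h_bar - 63))
    s_l = 1 + 0.015 * (l_bar - 50) ** 2 / math.sqrt(20 + (l_bar - 50) ** 2)
    s_c = 1 + 0.045 * c_bar_p
    s_h = 1 + 0.015 * c_bar_p * t
    d_theta = 30 * math.exp(-(((h_bar - 275) / 25) ** 2))
    r_c = 2 * math.sqrt(c_bar_p**7 / (c_bar_p**7 + 25**7))
    r_t = -_sind(2 * d_theta) * r_c

    tl, tc, th = d_l / s_l, d_c / s_c, d_h / s_h
    return math.sqrt(tl**2 + tc**2 + th**2 + r_t * tc * th)
```

The `s_l`, `s_c` and `s_h` terms are the lightness, chroma and hue corrections mentioned above: they shrink a raw Lab difference where the eye is less sensitive to it.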
What does a CIEDE2000 distance actually mean?
A raw CIEDE2000 value is a distance, not a game score. We need to map it to 0-10 in a way that feels fair. A straight line wouldn't work — it would be too generous for mediocre guesses and too harsh near the top. Instead, the scoring uses an S-shaped curve that's generous for close matches, punishing for misses, and steepest in the middle where differentiation matters most:
The two constants control the shape:
This means precision matters most above 7/10 — you need to be very close to earn a high score, but bad guesses all compress toward 1-2.
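The exact curve is not reproduced in this text, so the following Python sketch only illustrates the general idea with one common S-shaped form. The 25.25 midpoint is the value this page reports for the CIEDE2000 scale; the formula itself, the reading of "midpoint" as the distance that scores 5/10, and the `STEEPNESS` value are all assumptions made for the example.

```python
MIDPOINT = 25.25   # reported on this page for the CIEDE2000 scale
STEEPNESS = 1.5    # hypothetical value; the game's actual constant is not given here


def base_score(delta_e: float, midpoint: float = MIDPOINT,
               steepness: float = STEEPNESS) -> float:
    """Map a perceptual distance to a 0-10 score with an S-shaped falloff.

    Assumed form: 10 / (1 + (d / midpoint) ** steepness).
    A distance of 0 scores 10, a distance equal to the midpoint scores 5,
    and large distances flatten out toward 0.
    """
    if delta_e <= 0:
        return 10.0
    return 10.0 / (1.0 + (delta_e / midpoint) ** steepness)


if __name__ == "__main__":
    for d in (0, 5, 15, 25.25, 40, 80):
        print(f"distance {d:>5}: score {base_score(d):.2f}")
```

In this form the midpoint slides the whole curve left or right, and the steepness controls how sharply scores drop around it.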
The base score treats all perceptual errors equally. But in a memory game, remembering the right color family is the hardest part and the most satisfying to get right. If you remembered "it was a warm orange" and nailed the hue but were off on brightness, that should count for something. Two adjustments tilt the score toward hue accuracy.
If you got the hue right (within about 25°), you earn back some of the points you lost from saturation or brightness errors:
Recovery is lighter than it was under the old CIE76 system (0.25 vs 0.50) because CIEDE2000 already handles hue-region differences more accurately. The bonus is still meaningful: nailing the hue when brightness or saturation is off can recover 1-2 points. On near-grays (saturation under 30%), recovery fades to zero, because hue is visually meaningless on desaturated colors.
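One way such a recovery term could be put together is sketched below in Python. The 0.25 strength, the 25° window and the 30% saturation threshold come from the text; the linear fades, the use of the target's saturation, and the function names are assumptions, so treat this as an illustration and not the game's actual formula.

```python
RECOVERY_STRENGTH = 0.25   # from the text
HUE_WINDOW_DEG = 25.0      # "within about 25 degrees", from the text
GRAY_SATURATION = 30.0     # saturation (%) below which recovery fades out


def hue_difference(h1: float, h2: float) -> float:
    """Shortest angular distance between two hues, in degrees (0-180)."""
    d = abs(h1 - h2) % 360
    return min(d, 360 - d)


def hue_recovery(base: float, target_hue: float, pick_hue: float,
                 target_sat: float) -> float:
    """Points earned back for a close hue (assumed linear fades).

    Recovers a fraction of the points lost (10 - base), scaled by how
    close the hue is and by how vivid the target color is.
    """
    closeness = max(0.0, 1 - hue_difference(target_hue, pick_hue) / HUE_WINDOW_DEG)
    vividness = min(1.0, max(0.0, target_sat / GRAY_SATURATION))
    return RECOVERY_STRENGTH * (10 - base) * closeness * vividness
```

With these assumptions, a base score of 6 with a perfect hue on a vivid target recovers 0.25 × 4 = 1 point, which is in line with the 1-2 point range described above.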
If your hue is off by more than 30°, you take a penalty — but only on vivid colors where that difference is actually visible:
The penalty is much lighter than the old CIE76 system (0.15 vs 0.4) because CIEDE2000 already produces high distances for wrong-hue guesses — no need to double-count. The 30° dead zone means small hue errors are never penalized. And guessing the wrong hue on a gray costs nothing.
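A matching sketch for the penalty, again in Python and again under stated assumptions: the 0.15 strength and the 30° dead zone come from the text, while scaling the penalty by the base score, growing it linearly up to a 180° error, and reusing the 30% saturation ramp for "vivid" are guesses made only to produce a runnable example.

```python
PENALTY_STRENGTH = 0.15   # from the text
DEAD_ZONE_DEG = 30.0      # hue errors up to this size are never penalized
VIVID_SATURATION = 30.0   # assumption: same saturation ramp as hue recovery


def hue_difference(h1: float, h2: float) -> float:
    """Shortest angular distance between two hues, in degrees (0-180)."""
    d = abs(h1 - h2) % 360
    return min(d, 360 - d)


def hue_penalty(base: float, target_hue: float, pick_hue: float,
                target_sat: float) -> float:
    """Points deducted for a wrong hue on a vivid color (assumed linear growth)."""
    diff = hue_difference(target_hue, pick_hue)
    # 0 inside the dead zone, rising to 1 for a hue on the opposite side.
    excess = max(0.0, (diff - DEAD_ZONE_DEG) / (180.0 - DEAD_ZONE_DEG))
    # 0 for pure gray, 1 once the target is saturated enough for hue to show.
    vividness = min(1.0, max(0.0, target_sat / VIVID_SATURATION))
    return PENALTY_STRENGTH * base * excess * vividness
```

Under these assumptions a 20° error costs nothing, and so does any hue error on a target with zero saturation.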
Adjust the sliders and watch the score update in real time. The white dot on the S-curve above tracks your position.
The game originally used CIE76 (from 1976), the simplest Delta E formula — a straight-line distance in Lab space. It worked, but it didn't treat all colors equally. A 20° hue shift on green produced a CIE76 distance 3x larger than the same shift on blue. That meant greens, purples, and cyans were scored unfairly harshly for the same quality of guess. We switched to CIEDE2000, which corrects for this. The comparison tool below still works — you can see the difference for yourself.
Adjust the pick color and see how both systems score the same pair. Try the presets for the most interesting cases where they disagree.
We ran both systems across every hue at multiple error sizes. The results were clear.
When the pick is close to the target (the most common gameplay scenario), both systems produce very similar scores. Across 12 hues with a typical small error, 8 of the 12 pairs scored within 0.3 points of each other. For the range where most games are decided, it makes little difference which system you use.
A 20° hue shift on green produces a CIE76 distance of 19.0 but a CIEDE2000 distance of just 6.2 — a 3:1 ratio. The same shift on red is 22.9 vs 15.1 — only 1.5:1. The old system was significantly harsher on greens and purples than on reds and oranges for the same size error. There was no gameplay reason for that — it was an artifact of CIE76's known non-uniformity. CIEDE2000 corrects for this.
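For reference, CIE76 is simply the Euclidean distance between two Lab triples, and the ratios quoted above follow directly from the listed distances. The short Python sketch below shows both; the helper name `overstatement` is made up for this example.

```python
import math


def cie76(lab1, lab2) -> float:
    """CIE76 Delta E: straight-line distance between two (L*, a*, b*) triples."""
    return math.dist(lab1, lab2)


def overstatement(cie76_distance: float, ciede2000_distance: float) -> float:
    """How many times larger the CIE76 distance is than the CIEDE2000 one."""
    return cie76_distance / ciede2000_distance


if __name__ == "__main__":
    # Distances quoted in the text for a 20-degree hue shift.
    print(f"green: {overstatement(19.0, 6.2):.1f} : 1")
    print(f"red:   {overstatement(22.9, 15.1):.1f} : 1")
```

Because CIE76 has no region-dependent weighting, the only way it can differ from CIEDE2000 is through this kind of ratio, which is what varies from hue to hue.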
Hue recovery — the game-design layer that rewards remembering the right color family — works with either foundation. When you nail the hue but miss the brightness, recovery adds 1-2 points. When you get the hue wrong, a light penalty applies. This carried over to CIEDE2000 with retuned constants.
CIEDE2000 is now live. The game-design layer — hue recovery, hue penalty, the S-curve — carried over with recalibrated constants. The S-curve midpoint moved from 38 (CIE76 scale) to 25.25 (CIEDE2000 scale), and the hue adjustments were reduced (recovery 0.50 → 0.25, penalty 0.4 → 0.15) because CIEDE2000 already handles the cross-region fairness that the old adjustments were partially compensating for.
The overall difficulty is unchanged — average scores across 50,000 random color pairs are within 0.001 of the old system. But greens, blues, and purples are no longer penalized for being in the wrong part of the color wheel.
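A check like this can be run with a small Monte Carlo harness. The Python sketch below shows the general shape only: the uniform sampling of HSB values, the fixed seed, and the `score_fn` signature are assumptions, and it does not reproduce the figures above, since the game's actual scoring functions are not included here.

```python
import random
from typing import Callable

HSB = tuple[float, float, float]  # (hue 0-360, saturation 0-100, brightness 0-100)


def random_hsb(rng: random.Random) -> HSB:
    """Draw one color uniformly from the slider ranges (an assumed distribution)."""
    return (rng.uniform(0, 360), rng.uniform(0, 100), rng.uniform(0, 100))


def average_score(score_fn: Callable[[HSB, HSB], float],
                  pairs: int = 50_000, seed: int = 0) -> float:
    """Average score_fn(target, pick) over random color pairs.

    Using the same seed for two scoring systems feeds them identical
    pairs, so their averages can be compared directly.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(pairs):
        target, pick = random_hsb(rng), random_hsb(rng)
        total += score_fn(target, pick)
    return total / pairs
```

To compare two systems, call `average_score` once per scoring function with the same `seed` and look at the difference between the two results.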