Tuesday, July 25, 2017

Is 1.0 delta E a "just noticeable difference"?

My favorite scene from Fiddler on the Roof has a group of men talking politics in the town square. One of the men says that the czar is a really great guy who is bringing prosperity to the little town of Anatevka. To this, Tevye (the main character in the play) replies "Yes... You're right." I wish I could do accents in this typed blog. Imagine a rich, deep, heavily accented Russian-Jewish voice.

Another man disagrees, saying that the czar is destroying tradition in the village. Tevye again nods his head and says "Ahhh... You're right."

Yet another man looks at Tevye and says, "Tevye, how can they both be right??!?" Tevye slowly shakes his head in agreement, "You know, you are also right."

This is the third blog post in this series about the measurement of color difference. To recap, here are the two previous, contradictory explanations about the unit of measure of color difference.

    1. The size of a DE color difference is based on the Munsell Color System, which is all about uniform spacing of colors. 1.0 DE00 is one of 76 perceptually equal steps between pure black and pure white. Color differences throughout color space are scaled to this.

    2. The size of a DE color difference is all about tolerances in the industry. For print work, 2.0 DE00 is considered pretty darn good, and 6.0 DE00 is merely "pleasing".

Naturally, I will tie this up by providing a third contradictory explanation.

What is a JND?

The year 1931 was a banner year for color science. This year saw the publication of a set of tables that directly related to the color response of the human eye. Color could now be measured. Because of this work, you can measure the spectrum of a sample with a spectrophotometer, and then use the tables to convert to a real color.

The four little bumps that all color scientists know and love

I need to explain what I meant by real color. The measurement and subsequent computation would give you a set of three numbers, X, Y, and Z, which are called the tristimulus values. The numbers came with a guarantee: If two color samples measured exactly the same tristimulus values, then they would be perceived as the same color.
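The computation from spectrum to tristimulus values is, at heart, three weighted sums. Here is a minimal Python sketch of that idea. The function name and the three-band numbers are made up for illustration; they are not the real 1931 tables, which are sampled at many more wavelengths. The scaling that sets Y to 100 for a perfect white is the usual convention for reflective samples.

```python
def tristimulus(reflectance, illuminant, xbar, ybar, zbar):
    """Weighted sums of a sampled spectrum against three weighting tables.

    All five arguments are equal-length sequences sampled at the same
    wavelengths. The result is scaled so a perfect white has Y = 100.
    """
    k = 100.0 / sum(s * y for s, y in zip(illuminant, ybar))
    X = k * sum(r * s * x for r, s, x in zip(reflectance, illuminant, xbar))
    Y = k * sum(r * s * y for r, s, y in zip(reflectance, illuminant, ybar))
    Z = k * sum(r * s * z for r, s, z in zip(reflectance, illuminant, zbar))
    return X, Y, Z


# Toy three-band data (hypothetical, NOT the CIE tables).
illuminant = [1.0, 1.0, 1.0]
xbar = [0.3, 0.4, 1.0]
ybar = [0.1, 1.0, 0.6]
zbar = [1.7, 0.1, 0.0]

perfect_white = [1.0, 1.0, 1.0]
mid_gray = [0.5, 0.5, 0.5]

white_xyz = tristimulus(perfect_white, illuminant, xbar, ybar, zbar)
gray_xyz = tristimulus(mid_gray, illuminant, xbar, ybar, zbar)
print(white_xyz)
print(gray_xyz)
```

With real data, the only change is that the lists get longer: one entry per wavelength band of the spectrophotometer.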

But, tristimulus values have two drawbacks. First, the values are non-intuitive. It was not a simple mental task to convert back and forth between tristimulus values and our common concept of color. More importantly, there was not an answer to a very basic and important question in the color industry: How close do two XYZ values need to be in order for them to be a good match?

David MacAdam sought to answer this question in his 1942 paper, entitled "Visual Sensitivities to Color Differences in Daylight". He performed a series of experiments where the test subject adjusted knobs to make one color match another color. Naturally, even if the same test subject repeatedly performs this task, the numbers won't always come out the same. He coined the phrase just noticeable difference to describe this variability.

MacAdam's gizmo for testing color discrimination
not to be confused with gizmos for discrimination against color

MacAdam created what have become known as the MacAdam ellipses, as shown in the image below. The image is called a chromaticity diagram, and is based on the XYZ values. The ellipses in the plot represent regions of ambiguity, magnified by a factor of ten. According to his tests, all colors within the various regions are indistinguishable.

These ellipses are ten standard deviation units across. (He chose ten in order to make the ellipses visible.) By my calculation, roughly 39% of all observations would be within ellipses of one standard deviation unit, and about 86% should be within ellipses that are twice the size.
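Those percentages can be checked in a couple of lines, assuming the matching errors follow a two-dimensional normal distribution. Under that assumption, the fraction of observations inside the k-standard-deviation ellipse is 1 - exp(-k^2 / 2). The function name below is my own invention.

```python
import math


def fraction_inside(k):
    """Fraction of a 2D normal distribution inside its k-sigma ellipse."""
    return 1.0 - math.exp(-k * k / 2.0)


for k in (1, 2, 3):
    print(k, round(100.0 * fraction_inside(k), 1))
```

Note that this is smaller than the familiar 68% and 95% of the one-dimensional bell curve, because in two dimensions there are more ways to land outside the ellipse.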

This was a landmark paper. There have been something over 1,100 citations to it. The basic concept in MacAdam's paper was an enormously important realization for everyone who needed to put conformance ranges around color values. If you were to use XYZ as a target value for a color, then you have to allow different acceptance windows for every color and for every direction of color change. Yuck! 
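To make the "yuck" concrete, here is a small Python sketch of what an acceptance test looks like when the window is an ellipse. The center, axis lengths, and tilt below are made-up numbers for illustration, not values from MacAdam's paper, and the function name is hypothetical.

```python
import math


def inside_ellipse(point, center, semi_major, semi_minor, angle_deg):
    """True if point lies within a tilted ellipse on the chromaticity diagram."""
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    t = math.radians(angle_deg)
    # Rotate the offset into the ellipse's own axes.
    u = dx * math.cos(t) + dy * math.sin(t)    # along the major axis
    v = -dx * math.sin(t) + dy * math.cos(t)   # along the minor axis
    return (u / semi_major) ** 2 + (v / semi_minor) ** 2 <= 1.0


# Hypothetical tolerance ellipse around a target chromaticity.
center = (0.300, 0.300)
semi_major, semi_minor, tilt = 0.004, 0.001, 60.0

# Two samples that miss the target by the same distance (0.003),
# one along the long axis and one along the short axis.
step = 0.003
along_major = (center[0] + step * math.cos(math.radians(tilt)),
               center[1] + step * math.sin(math.radians(tilt)))
along_minor = (center[0] + step * math.cos(math.radians(tilt + 90.0)),
               center[1] + step * math.sin(math.radians(tilt + 90.0)))

print(inside_ellipse(along_major, center, semi_major, semi_minor, tilt))
print(inside_ellipse(along_minor, center, semi_major, semi_minor, tilt))
```

The two samples are equally far from the target, yet one passes and the other fails. Now imagine maintaining a different set of ellipse parameters for every target color.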

Here is an interesting factoid: the MacAdam ellipses are a counterexample to Stigler's Law of Eponymy. Unlike virtually every other scientific discovery, history has correctly named these ellipses after the person who first described them. Then again, if Stigler's law is infallible, then Science has lost the name of the person who originally proposed the MacAdam ellipses.

A moment of candor with John the Math Guy

I Googled some names, and could not find anyone by the name of Avard Håkansson. Since the name of the original inventor of the MacAdam ellipses is lost, and the name Avard Håkansson is lost, it logically follows that the ellipses should be named the Håkansson ellipses. I am circulating a petition to update the 1,100 or so papers, and the 40K+ websites that refer to the MacAdam ellipses.

MacAdam's paper spurred interest in finding some transform to apply to tristimulus values that would lead to a color space that is uniform. Many attempts have been made at this. I have previously blogged on that subject (boy, that's a surprise), and in my normal obsessive compulsive way, I identified 14 different attempts between 1989 and 2010 alone.

Here's a bigger surprise. I am aware of only a handful of color spaces that were directly based on the MacAdam ellipses. Two were developed by a trio consisting of Friele, MacAdam, and Chickering, and are eponymously called FMC-1 and FMC-2.

The set of equations for FMC-2 is described in the 2002 release of the standard ASTM D2244:02. This document describes the equations in an annex with the title "Color Spaces and Color Difference Metrics No Longer Recommended But Still in Use". The 2004 version of this standard omitted this annex. It is my understanding that the last person who was using FMC-2 retired in 2003.

Here is an interesting factoid: Evidently, just like Fleetwood Mac, Simon and Garfunkel, and the Beatles, there was a falling out between these three gents, and Friele went on to develop the FCM color space by himself in 1978. The new acronym stands for Fine Color Metric. Just imagine the outrage when MacAdam and Chickering found that the acronym did not include their own names!

So, long story short, a just noticeable difference is based on the work of MacAdam on the smallest differences in color that a person is able to discern. To the best of my knowledge, there are no color spaces developed directly on this work that are currently commercially available.

Are they the same? 

I have heard it said that 1.0 DE is 1.0 JND. Is this true?

It is worth noting that CIE 142-2001, which defines DE00, does not include the name "MacAdam" as a reference, or include the phrase "just noticeable difference". So clearly the answer to that rhetorical question is no. Further, a JND and a DE are based on different data sets, so they will differ numerically.

There is a somewhat more philosophical answer, however. One of these measures of color difference is based on perceptibility, and the other on uniformity. Are these ultimately related? I propose a gedanken experiment.

Suppose that I create the gray-to-burnt-orange ramps (described in a previous blog post) in a slightly different way. Instead of basing the ramps on finer and finer subdivision of the range from gray to burnt orange, what if we arranged the spacing of the colors by taking very tiny steps from gray to burnt orange? Each tiny step would be a just noticeable difference.

The development of a JND scale (top) versus
the development of a perceptually linear scale (bottom)

Would the spacing on the two ramps be kinda the same? Another way of asking that question: When my brain makes a judgment call about midway-ness between two colors, does it count JND steps to reach that conclusion?

I don't know. I suspect that there is not a "color midway-ness determination area of the brain". I don't think it is a fundamental concept, and as such, there is a lot of variability in what people might call midway between two colors that are a modest distance apart. I have a set of special brain probes on order through Amazon. When they arrive, I will get right on that question about what's going on in my brain. My wife has been wondering about that for years.

A crinkly wrinkle

I have spent the better part of two blog posts trying to make it clear that the DE color difference and the just noticeable difference do not have the same lineage, so they must be different. Imagine how cruel you will think I am when this whole world comes crashing down. If truth be told, the two are very intimately related.

The formula for DE00 is universally regarded as being the second ugliest set of equations in the known universe. You may reckon differently, but I count a total of 26 free parameters that were available to tweak the equations, including the handful of parameters that were inherited from the formula for L*a*b*. The equations mix Fourier series, Padé formulas, square roots, cube roots and seventh powers. There was a lot of knob-twiddling of the free parameters in order to get the equation just right.

Egad. John the Math Guys really don't like this: When regression goes bad. Finding the right model. Mathematical models.

What did the authors use to assess their tweaking? A bunch of large data sets that came from just noticeable difference experiments. While the lineage of DE00 is based on Munsell's perceptually linear color space, there was significant cross-breeding from the JND folks.
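For anyone who wants to see the ugliness up close, here is a minimal Python sketch of the DE00 computation as it is commonly published. The function and variable names are my own, and the weights kL, kC, and kH are assumed to default to 1. Treat it as an illustration under those assumptions, not a reference implementation.

```python
import math


def delta_e_2000(lab1, lab2, kL=1.0, kC=1.0, kH=1.0):
    """DE00 between two (L*, a*, b*) triples."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2

    # Stretch a* near the neutral axis (here come the seventh powers).
    C_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2.0
    G = 0.5 * (1.0 - math.sqrt(C_bar**7 / (C_bar**7 + 25.0**7)))
    a1p, a2p = (1.0 + G) * a1, (1.0 + G) * a2
    C1p, C2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360.0

    # Differences in lightness, chroma, and hue.
    dLp = L2 - L1
    dCp = C2p - C1p
    if C1p * C2p == 0.0:
        dhp = 0.0
    else:
        dhp = h2p - h1p
        if dhp > 180.0:
            dhp -= 360.0
        elif dhp < -180.0:
            dhp += 360.0
    dHp = 2.0 * math.sqrt(C1p * C2p) * math.sin(math.radians(dhp / 2.0))

    # Means, with care taken where hue wraps around 360 degrees.
    Lp_bar = (L1 + L2) / 2.0
    Cp_bar = (C1p + C2p) / 2.0
    if C1p * C2p == 0.0:
        hp_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180.0:
        hp_bar = (h1p + h2p) / 2.0
    elif h1p + h2p < 360.0:
        hp_bar = (h1p + h2p + 360.0) / 2.0
    else:
        hp_bar = (h1p + h2p - 360.0) / 2.0

    # The Fourier-series-like hue weighting.
    T = (1.0
         - 0.17 * math.cos(math.radians(hp_bar - 30.0))
         + 0.24 * math.cos(math.radians(2.0 * hp_bar))
         + 0.32 * math.cos(math.radians(3.0 * hp_bar + 6.0))
         - 0.20 * math.cos(math.radians(4.0 * hp_bar - 63.0)))

    # Weighting functions and the rotation term for the blue region.
    S_L = 1.0 + 0.015 * (Lp_bar - 50.0) ** 2 / math.sqrt(20.0 + (Lp_bar - 50.0) ** 2)
    S_C = 1.0 + 0.045 * Cp_bar
    S_H = 1.0 + 0.015 * Cp_bar * T
    d_theta = 30.0 * math.exp(-(((hp_bar - 275.0) / 25.0) ** 2))
    R_C = 2.0 * math.sqrt(Cp_bar**7 / (Cp_bar**7 + 25.0**7))
    R_T = -math.sin(math.radians(2.0 * d_theta)) * R_C

    tL = dLp / (kL * S_L)
    tC = dCp / (kC * S_C)
    tH = dHp / (kH * S_H)
    return math.sqrt(tL**2 + tC**2 + tH**2 + R_T * tC * tH)


print(delta_e_2000((50.0, 2.6772, -79.7751), (50.0, 0.0, -82.7485)))
```

The pair of saturated blues in the last line is a widely circulated check case, and it should come out near 2.04. Every numeric constant in the function is one of those free parameters that somebody had to twiddle.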

Final answer, what is a DE00?

DE00 is a unit of color difference, which has proven itself in practice as a way to assess conformance of manufactured color. It is loosely based on the equal gradations of color in the Munsell space. The magnitude (scaling) of DE00 is based on the size of a color step at the middle of CIELAB space, which was in turn based on 100 levels of gray. The size of a DE00 in other parts of color space was scaled so that the color difference is that same number of just noticeable difference units throughout color space.


  1. I got an email from the venerable Danny Rich. It seems that Google blogger doesn't like him, since it doesn't let him post a comment. If anyone else is having problems posting comments please post a comment here so that I can look into it.

    Danny has provided us with a bit more detail about JND:

    "I find a couple of premises in this blog that are not supported in the literature and lead to a questionable conclusion. The first is that visual color space (not some mathematical approximation to it) is visually uniform across all hues, lightness and chromas (using your Munsell notations). Thus trying to equate any form of DeltaE to a visual jnd is a bit like trying to catch the wind. The second is that someone, somehow invented MacAdam ellipses and then he just adopted them for his research. Reading his papers and his presentations to the OSA one observes that, despite the complicated schematic, his experiment was relatively simple. There was a reference color field and a test color field. They began as being equal and the test field was varied along a single direction in chromaticity (no differences in the brightness). When his observer (Mr. Nutting) could detect a visual difference the experiment stopped, the setting of the test color was recorded and the experiment was repeated, moving in a different direction in chromaticity. This process was repeated a number of times and the distance to jnd determined from the standard deviations of the trials. So the final experimental data was the 20 or so color centers with a lot of little "sticks" pointing outward. Today we might cast a convex hull around those points but in the 1940s such arithmetic was not easy to do. Since the lengths of the sticks were not equal in all directions, a circle would not be a good fit and the next simple geometric figure was an ellipse. So ellipses were cast around his data points. A later paper by Brown, Jackson (also of Kodak) and Howe (an intern at Kodak from NC State) demonstrated that the ellipse was simply the 2D standard deviations of the bivariate Normal statistical distribution. Brown went on to derive the same mathematical analysis for combined lightness and chromaticity differences as the 3D ellipsoid, again simply represented by the trivariate Normal distribution.
    The reason that the shape should be elliptical or ellipsoidal lies in the relationship between the coordinates. (x,y) or (Y,x,y) are not independent but are correlated so that the covariance is not zero. A good discussion of this can be found in the section on Color Metrics in the textbook Color Science by Wyszecki and Stiles. The arguments apply to the work on differences in CIELAB space. L*, a*, b* all contain the Y tristimulus value and as a result are correlated. Since L*,a*,b* are not independent they will not form a Euclidean space and the standard deviations of visual judgments will always form an ellipsoid. So if we have to assign a cause for MacAdam's jnd's being elliptical perhaps it should be R. A. Fisher."

  2. The 2000 formula is only the second ugliest?

    There is at least one study (sorry, I do not have the citation handy) that shows that the 2000 formula did not perform better than the 1994 color difference formula. An oblique application of Occam's razor would favor the 1994 formula, a relatively straightforward extension of the 1976 a*, b* formula.

    Personally, I believe if one must go to lengths as Herculean as those employed in the 2000 formula, there is a fundamental flaw in the foundation, and your time might be more productively spent elsewhere.

    IPT and related spaces are only slightly more complicated than CIELAB, and are far more uniform. They avoid the oversimplification made in CIELAB that the x-bar, y-bar, and z-bar color matching functions are cone fundamentals, and use a more realistic set of cone fundamentals. IPT-like spaces should be investigated further as a replacement for CIELAB.

    Apostasy, heresy, iconoclasm, or just plain practicality? I would like to think the latter, but many disagree with me.

    1. Was it the TAGA paper from Tony Johnson, perhaps? I don't have a reference handy, either.

      I agree with you, as usual, Steve. CIELAB is icky all by itself, because 1) it is based on nonlinear functions applied to the XYZ functions which do not exist in nature, and 2) it applies a piecewise function, which again, does not exist in nature.

  3. Hi John,
    Loved your article. In my experience from psychophysical tests for acceptability and perceptibility, dE00(1,1,1) works best despite its ugliness. dE00 between 0.6 and 0.7 works for most industries. In delivery chains, tolerance halving is very popular, leading to ridiculously small tolerances. Related to JND, I have the experience that asking people for perceptible differences (aka JND) did not yield very different answers as compared to asking for acceptable differences.

  4. You are very optimistic with "The numbers came with a guarantee: If two color samples measured exactly the same tristimulus values, then they would be perceived as the same color." A standard observer is really an average of (IIRC) around 20 people, so we should expect that people see colours slightly differently: we know about IR and UV ranges, but I expect (but never read about it) that there is a variance (small? large?).

    1. I agree, cate! The guarantee is a bit shaky.

      There is a well understood variation between individuals who are "normal" trichromats: the lenses in our eyes yellow as we age.

      I first noticed this when I pointed a 405 nm laser at a TiO2 white, and described it as a very dim purple. He disagreed, saying it was very bright. I was older than him.

      The difference is tabulated in CIE 170-1: http://www.cie.co.at/publications/fundamental-chromaticity-diagram-physiological-axes-part-1