Thursday, September 12, 2013

How many colors are there - the definitive answer

Here is the quick summary: There are 346,005 discernible colors.

I published a rather popular blog post in October 2012 that posed the slightly whimsical question "How many colors are in your rainbow?" I looked at the question in a number of ways and came up with answers anywhere from 3 to 16,777,216. The intent of the post was to collect some interesting facts into one coherent and entertaining post.

But, I sort of sidestepped the implied question: How many discernible colors are there?  I intend in this post to answer the question a bit more scientifically. Well... quite a bit more scientifically. For this blog post, I used a Monte Carlo technique to determine the volume of CIELAB space, and then modified this volume according to DE00 to account for the nonlinearity of CIELAB. And the number 346,005 plopped out.

General idea

I started by generating zillions of random spectra [1]. Now, I'm not gonna say that I generated every spectrum possible. I'm sure I missed a few that were hiding down there in the shadows. But I did look at a whole bunch of them. Half a billion, to be exact.

I converted each spectrum into L*a*b* values using D50 illumination and the 2° observer. The resulting L*a*b* values were tabulated into boxes in a three-dimensional array, with each box indicating whether the corresponding region in CIELAB space contained a viable L*a*b* value.
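The last step of that conversion, from XYZ tristimulus values to L*a*b*, can be sketched as follows. This is a minimal Python illustration of the standard CIELAB formula, not the Mathematica code used for this post. The step from a reflectance spectrum to XYZ needs the CIE illuminant and observer tables, which are not reproduced here. The function names and the rounded D50 white point values are my own choices for the sketch.

```python
# D50 reference white for the 2-degree observer, with Y normalized to 100.
XN, YN, ZN = 96.422, 100.0, 82.521

def _f(t):
    """CIELAB compression: cube root, with a linear toe near black."""
    delta = 6 / 29
    if t > delta ** 3:
        return t ** (1 / 3)
    return t / (3 * delta ** 2) + 4 / 29

def xyz_to_lab(x, y, z):
    """Convert XYZ (same scale as the white point) to (L*, a*, b*)."""
    fx, fy, fz = _f(x / XN), _f(y / YN), _f(z / ZN)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

# The reference white itself should land at L* = 100, a* = b* = 0.
white = xyz_to_lab(XN, YN, ZN)
```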

Next, I counted up the number of boxes checked to establish the volume of CIELAB space. According to my experiment, the volume is just short of 2.2 million. This number fits in reasonably well with two papers cited by Gary Field in my addendum blog post:

Research on the number of colors issue usually starts with reference to the Dorothy Nickerson and Sidney Newhall paper of 1943 (JOSA, pp. 419-422). They conclude that there are about 7,500,000 surface colors at "supraliminal" viewing conditions, and 1,875,000 colors when viewing conditions approximate those used for color matching work.

Mike Pointer and Geoff Attridge concluded that there were about 2,280,000 discernible colors in their 1998 CR&A article (pp. 52-54). 

Thus, my number (2.176 million) corroborates the previous results from Nickerson and Newhall (1.875 million), and from Pointer and Attridge (2.280 million).

But we're not done yet. As we know, CIELAB is just not all that uniform. In particular, two saturated yellow colors might be 5 units apart (according to DEab) but might still be perceived by a human as just barely different. Thus, this figure is an overestimate of the number of colors that are actually discernible. Since I have all (or nearly all) the physically realizable colors in boxes, I can compute the volume of each box using DE00. Adding the DE00 volumes of each of the boxes will provide an estimate of the true number of colors, corrected for the nonuniformity of CIELAB.

Based on this correction, the number of discernible colors is 346,005. I won't attempt to name them in this blog post. That will come in a future post.
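For a rough sense of scale, the two totals can be compared with a little arithmetic. This sketch assumes the rounded figure of 2.176 million for the CIELAB volume and a volume of 125 per box, so the derived numbers are approximate.

```python
total_discernible = 346_005   # DE00-corrected count from this post
cielab_volume = 2_176_000     # "2.176 million", a rounded figure
box_volume = 5 ** 3           # each box is 5 x 5 x 5 in CIELAB units

n_boxes = cielab_volume / box_volume        # occupied boxes, approximately
avg_per_box = total_discernible / n_boxes   # mean DE00 volume of one box
shrink = total_discernible / cielab_volume  # DE00 volume per unit DEab volume

print(f"{n_boxes:.0f} boxes, about {avg_per_box:.1f} colors per box, ratio {shrink:.3f}")
```

In other words, an average box holds roughly 20 discernible colors rather than 125.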

Now for some details on how the calculation was done...

Generating spectra

All the spectra were "physically realizable reflectance spectra", which is to say, the reflectance values were all between 0 and 100%. I created spectra from 380 nm to 730 nm, in 10 nm increments. All the spectra I generated were somewhat "smooth", in that they were piece-wise linear functions. I show one example below.


The spectrum above comprises nine segments. I generated 125 million of these nine-segment spectra, along with the same number of spectra with eight segments, the same number with seven segments, and the same number with six. Thus, there were 500 million spectra in toto.

Initially, I used reflectance values that were uniformly distributed between 0 and 100%. This proved a bit slow to converge (slow to fill the area), since a lot of spectra were generated at the light end where our sensitivity to color difference is rather weak. For this final work, I used random numbers distributed according to the cube root distribution. 
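One way such a generator could look is sketched below in Python. It is not the original Mathematica code, and two details are assumptions made for illustration: the breakpoints are placed at randomly chosen wavelengths, and "cube root distribution" is read as cubing a uniform random number, so that the cube root of the reflectance is uniform and dark values are favored.

```python
import random

WAVELENGTHS = list(range(380, 731, 10))  # 380 nm to 730 nm in 10 nm steps

def random_spectrum(n_segments, rng):
    """Return one piecewise-linear reflectance spectrum (values in 0..1)."""
    n = len(WAVELENGTHS)
    # Assumption: interior breakpoints fall on randomly chosen wavelengths.
    interior = sorted(rng.sample(range(1, n - 1), n_segments - 1))
    knots = [0] + interior + [n - 1]
    # Assumption: cubing a uniform number weights the draw toward dark values.
    values = [rng.random() ** 3 for _ in knots]
    spectrum = []
    for (i0, v0), (i1, v1) in zip(zip(knots, values), zip(knots[1:], values[1:])):
        for i in range(i0, i1):
            t = (i - i0) / (i1 - i0)
            spectrum.append(v0 + t * (v1 - v0))  # linear interpolation
    spectrum.append(values[-1])
    return spectrum

rng = random.Random(2013)
samples = [random_spectrum(k, rng) for k in (6, 7, 8, 9)]
```

Because every value is interpolated between knot values in the range 0 to 1, each spectrum stays within the physically realizable range.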

Caveat - This Monte Carlo analysis necessarily will produce only a subset of all possible spectra. First, discontinuous spectra were left out. Second, the fact that "only" half a billion spectra were analyzed leaves open the possibility that some are missed. This would tend to cause my estimate to be a bit low.

I also tried generating purely random spectra, with no correlation between wavelengths. Initially this was slow to converge - perhaps that might have worked out in the long run if I had just had the patience.

Tallying the number of unique colors

A three-dimensional array was created, representing L* values from 0 to 100, a* values from -150 to +150, and b* values from -150 to +150. All three dimensions were quantized in steps of 5, resulting in 21 X 61 X 61 boxes. Thus, there was a single cube, for example, in CIELAB space representing all colors in the range 20 < L* < 25, -40 < a* < -35, and 80 < b* < 85.

Each of the spectra was converted to CIELAB values using the D50 light source and the 2° observer. The CIELAB values were then converted to a position in the three-dimensional array, and the location was marked to indicate that there was at least one viable CIELAB value within the box.
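The bookkeeping described above can be sketched in a few lines of Python. This is an illustration rather than the original code: a set of index triples stands in for the three-dimensional array, and the function names are hypothetical.

```python
import math

STEP = 5          # box size in CIELAB units
AB_OFFSET = 150   # a* and b* run from -150 to +150

def box_index(L, a, b):
    """Map an L*a*b* value to the integer index of the box that contains it."""
    return (math.floor(L / STEP),
            math.floor((a + AB_OFFSET) / STEP),
            math.floor((b + AB_OFFSET) / STEP))

def tally(occupied, lab):
    """Mark the box holding this color as containing a viable value."""
    occupied.add(box_index(*lab))

occupied = set()
tally(occupied, (22.0, -37.0, 82.0))  # inside 20 < L* < 25, -40 < a* < -35, 80 < b* < 85
tally(occupied, (24.0, -36.0, 84.0))  # same box, so the count stays at one

# Volume estimate in cubic DEab units: occupied boxes times 125 each.
volume = len(occupied) * STEP ** 3
```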

If anyone is interested, I can send you a list of the centers of all the boxes, representing all valid CIELAB colors. Send me an email at john@johnthemathguy.com. If anyone is really interested, I can provide a set of very colorful charts like the one below, that summarize all this data. If enough people are really interested, I will post those to my website for all to marvel at.

Viable a*b* values in the range 55 < L* < 60
(each square represents a 5 X 5 box in a*b*)

Caveat - This discretization causes a bit of an error. It will cause an over-estimation of the number of colors. Why? Let's say that a certain box is at the edge of color space, straddling the line between viable CIELAB values and silly-lab values. If zillions of spectra are tested, then this box will eventually get a tally, despite the fact that only half of its volume should have been counted.
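The size of this edge effect can be illustrated with a toy two-dimensional case in Python. A disk of radius 50 is a stand-in for the gamut here (the real boundary is not a disk), and a box is counted as soon as any part of it overlaps the disk, which is what happens in the limit of zillions of samples.

```python
import math

def clamp(value, low, high):
    return max(low, min(value, high))

def covered_area(radius, step):
    """Total area of the grid boxes that overlap a disk centered at the origin."""
    n = math.ceil(radius / step)
    count = 0
    for i in range(-n, n):
        for j in range(-n, n):
            # Point of the box [i*step, (i+1)*step] x [j*step, (j+1)*step]
            # that lies closest to the center of the disk.
            nx = clamp(0.0, i * step, (i + 1) * step)
            ny = clamp(0.0, j * step, (j + 1) * step)
            if nx * nx + ny * ny < radius * radius:
                count += 1
    return count * step * step

true_area = math.pi * 50 ** 2
coarse = covered_area(50, 5)  # boxes the size used in this post
fine = covered_area(50, 1)    # finer boxes shrink the overshoot
print(f"true {true_area:.0f}, step 5 gives {coarse}, step 1 gives {fine}")
```

The coarse count overshoots the true area, and the overshoot shrinks as the boxes get smaller.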

Converting to a count of discernible colors

Now for the novel part, the conversion to DE00. In the previous analysis, the tacit assumption was made that each of those 5 X 5 X 5 boxes had a volume of 125. To be completely correct, the volume of each cube is 125 ΔEab³, cubic delta E units. I guess maybe that's not an assumption, that is pretty much just geometry. The assumption comes in when this is interpreted as meaning that each box contains 125 discernible colors. Those who have subscribed to the Color Science Times Newspaper for the last 30 years know that this might not be exactly the case.

So, I computed the volume using a color difference formula that is closer to human visual perception, ΔE00. Theoretically, we could just compute the volume of a box by determining the color difference from top to bottom, from side to side, and from other side to other side. These three numbers would be multiplied together to get the volume of discernible colors in that box. This is reasonable, but it falls just outside of the spec for this color difference. Due to nonlinearity, the warranty on ΔE00 expires at 4. Beyond that point, it may not give reasonable results.

Just to make sure this didn't introduce an error, I divided the cube into eight cubes, each with sides of 2.5 ΔEab, and added these up. Now we are within the warranty.
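A sketch of this box-volume calculation in Python follows. The ΔE00 function implements the standard CIEDE2000 formula with kL = kC = kH = 1. The volume function is an assumption about the details: it measures each edge between the centers of opposite faces of a sub-cube, which the text above does not spell out, and the names are hypothetical.

```python
import math

def delta_e00(lab1, lab2):
    """CIEDE2000 color difference with kL = kC = kH = 1."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2
    g = 0.5 * (1 - math.sqrt(c_bar ** 7 / (c_bar ** 7 + 25 ** 7)))
    a1p, a2p = (1 + g) * a1, (1 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360
    h2p = math.degrees(math.atan2(b2, a2p)) % 360
    d_l, d_c = L2 - L1, c2p - c1p
    if c1p * c2p == 0:
        d_h_angle, h_bar = 0.0, h1p + h2p
    else:
        d_h_angle = h2p - h1p
        if d_h_angle > 180:
            d_h_angle -= 360
        elif d_h_angle < -180:
            d_h_angle += 360
        if abs(h1p - h2p) <= 180:
            h_bar = (h1p + h2p) / 2
        elif h1p + h2p < 360:
            h_bar = (h1p + h2p + 360) / 2
        else:
            h_bar = (h1p + h2p - 360) / 2
    d_h = 2 * math.sqrt(c1p * c2p) * math.sin(math.radians(d_h_angle / 2))
    l_bar, c_bar_p = (L1 + L2) / 2, (c1p + c2p) / 2
    t = (1 - 0.17 * math.cos(math.radians(h_bar - 30))
         + 0.24 * math.cos(math.radians(2 * h_bar))
         + 0.32 * math.cos(math.radians(3 * h_bar + 6))
         - 0.20 * math.cos(math.radians(4 * h_bar - 63)))
    s_l = 1 + 0.015 * (l_bar - 50) ** 2 / math.sqrt(20 + (l_bar - 50) ** 2)
    s_c = 1 + 0.045 * c_bar_p
    s_h = 1 + 0.015 * c_bar_p * t
    d_theta = 30 * math.exp(-((h_bar - 275) / 25) ** 2)
    r_c = 2 * math.sqrt(c_bar_p ** 7 / (c_bar_p ** 7 + 25 ** 7))
    r_t = -math.sin(math.radians(2 * d_theta)) * r_c
    tl, tc, th = d_l / s_l, d_c / s_c, d_h / s_h
    return math.sqrt(tl ** 2 + tc ** 2 + th ** 2 + r_t * tc * th)

def box_volume_de00(corner, side=5.0, splits=2):
    """Sum the DE00 volumes of splits**3 sub-cubes of one CIELAB box."""
    L0, a0, b0 = corner
    s = side / splits
    total = 0.0
    for i in range(splits):
        for j in range(splits):
            for k in range(splits):
                L, a, b = L0 + i * s, a0 + j * s, b0 + k * s
                cl, ca, cb = L + s / 2, a + s / 2, b + s / 2
                # Assumption: edges are measured through the sub-cube center.
                edge_l = delta_e00((L, ca, cb), (L + s, ca, cb))
                edge_a = delta_e00((cl, a, cb), (cl, a + s, cb))
                edge_b = delta_e00((cl, ca, b), (cl, ca, b + s))
                total += edge_l * edge_a * edge_b
    return total

# A box of saturated yellows: far fewer than 125 cubic DE00 units.
yellow_box = box_volume_de00((80.0, 0.0, 85.0))
```

With `splits=2` every edge passed to `delta_e00` is 2.5 ΔEab long, which keeps the formula within the range mentioned above.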

346,005. I'm going to use this for all my computer passwords. Just to make sure I remember it.

---------------------

[1] I am talking here in the first person, like I actually generated all the spectra myself. I didn't really. I have better things to do. Like drink beer. I had my assistant Dell Studio generate the spectra. He didn't seem to mind, although he did seem to take his time about it. 



3 comments:

  1. I do not follow your reasoning for using cube root sampled, piecewise linear reflectance curves for the optimization. Since you are using Monte Carlo and we know that the worst case estimate for the number of points is L*=100, a*=160, b*=180, which should mean that there are 2,880,000 points in CIELAB separated by 1 step along any one axis. If we halve the step size to 0.5 CIELAB units then the maximum number of points is 23,040,000. That is still a tractable number.

  2. Yeah, it would be good to sample at a finer resolution. It would be tractable to go down to (say) 1 DE sampling in terms of memory, but the big issue is computer time, not storage. If I were to increase the resolution by a factor of five in all dimensions, then it would take 125 times as long to get counts in the boxes. As it is, this was about ten hours on my computer (running non-compiled Mathematica code) to generate the half a billion spectra.

    Even with that many spectra, there is a little bit of fringe near the edge. Look to the left of the top in the diagram above. There is a green pixel next to a hole just to the right. Is this hole reality, or is it the fact that I didn't generate quite the right spectra to fill that box?

    You mentioned the cube root thing... this was another tactic to speed up the convergence - that is, to speed up the rate that boxes get checked off.

    As for the piecewise linear thing... that was another attempt to speed things up. This one I can justify for my particular use, since I live in a world of measuring the color of light reflected from solid objects. Generally speaking these spectra are "smooth".

    The fact that there were two previous papers (doing this in different ways) that came up with very similar numbers leads me to believe that my assumptions are not that far off.

  3. The graduated cotton bud test is far more accurate. Lighter hues are more easily discernable than bright saturated colours.

    The test involves placing cotton buds in the correct order of an ascending chromatic scale. The average unimpaired subject can usually differentiate several million shades and hues of colour with this very practical test. The lighter the hue the easier it gets!

    Ask any automobile spray painter, and they will tell you that 'White' is the most difficult colour to match. If it's a door panel with a minute scratch, the entire door has to be repainted the closest match, and the spray painter just hopes the customer does not complain! (I serve as Company Secretary with an Industrial Coatings Company, we were consultants for The Brooklyn Bridge, and held the Concorde Supersonic Coatings Contract)
