Can we establish the boundaries of "red"?  If a color designer (or my wife - often the same thing) tells me that a certain shade of puce or burnt sienna is in the red family, can I really believe her? And (especially if it is my wife) do I dare to disagree with her?
I want to make sure I have articulated the question I want to ask. There is the question of what we would call pure red, and there is the question of which colors are in the red family. Rufous, pink and burgundy are all in the red family (maybe a color designer would agree with me?), but only fire-engine-candy-apple-lipstick red is really pure red.
The question of whether we all identify pure red as the same color is fascinating. Someday when I go from being a blogger pretending to be a color scientist to a fancy-schmancy researcher with 100 grad students at my beck and call, I will pursue the pure red problem. For now, I will attack the problem of defining the boundary of the red family of colors.
Color family assignments
This being a project where money is no option, I went to no expense to come up with a source of data. Wikipedia cheerfully provided me with a list of 836 color names. As if that weren't cool enough, the bottom of the article lists eleven color families: white, pink, red, orange, brown, yellow, gray, green, cyan, blue, and violet [1].
There were some interesting things I noted. Certain colors were in more than one family. For example, "electric indigo" was in both the blue and the purple families. I suspect that this is not just a careless mistake on an obscure color, since teal is listed in both the green and blue families, and brown is not only in the brown family (where it belongs) but also in the orange family. Now get this: not only is the color fuchsia the single most frequently misspelled color name in the greater Tampa metropolitan area, but it is tri-familial. It happily resides in the pink, red, and purple families.
I'm certain that if I were doing this work under government funding, I could find more examples, but for now, I will content myself with the statement that the borders of color family groups are fuzzy. There is much overlap - colors belonging to more than one family.
Unfortunately, the color families in this Wikipedia entry cover only about a third of the 836 color names. To make up for this, my wife and I went through the list, each displaying the colors on our own computer monitor and deciding which color family or families each one belonged in.
In going through this exercise, I found a few areas of color space where the color family was difficult to pin down. For example, I didn't quite know where to put the colors tan, coral, and the dark reddish brownish purple of a kalamata olive or a plum. My conclusion is that there are at least three color families that are missing. I am working with Stephen Colbert to start a super PAC to get tan, coral, and plum officially recognized as color families. Contributions will be gratefully accepted.
The woebegone kalamatas
Determining the color of each color
If I were a real color scientist, and not just playing one on the internet, I would probably have paid an immense amount of money in order to poll an immense number of color designers about the color family affiliations of an immense number of physical color swatches. Then, I would have my immensely underpaid grad students use my immensely expensive spectrophotometer to measure the color of each of those color swatches. In this way, I would be able to attach real color measurements (CIELAB values) to each of the families.
The cadre of grad student slaves that I dream about having
Unfortunately - I don't know if I mentioned this before - I am not in the habit of filing for government grants, so as I said, I took the Wikipedia shortcut. That left me without CIELAB values for the 836 color names that I now had found families for.
The Wikipedia entry does, however, list RGB values for each of the 836 colors that it names. It is a "simple" matter to convert from RGB to CIELAB [2]. I called on my guy Adam to whip up a little code for me to convert. Wasn't much work for me at all. Everyone should have a guy Adam.
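For the curious, here is a minimal Python sketch of what such a conversion can look like. It is not Adam's code. It assumes the listed RGB values are 8-bit sRGB and that the D65 white point is the right reference, and the function name srgb_to_lab is made up for illustration.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIELAB (assumption: sRGB primaries, D65 white)."""

    def linearize(c):
        # Undo the sRGB gamma curve to get linear light in the range 0..1
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ, using the standard sRGB matrix
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl

    # D65 reference white
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # CIELAB compression: cube root, with a linear piece near black
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


# Pure sRGB red lands near L* = 53, a* = 80, b* = 67
print(srgb_to_lab(255, 0, 0))
```

The soft spot from footnote [2] still applies: this treats every RGB triple as if it had been displayed on an ideal sRGB monitor.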
Ellipsoidification - the heavy math part
As a result of the immense effort described in the previous two sections, I had a list of CIELAB values for each of the eleven color families. The brown family, for example, had 144 colors. Now what?
I chose to use an extremely ingenious technique which has become commonly known as ellipsoidification [3]. 
Let's just pretend for the moment that color were one-dimensional. If we had a list of color values that belong to each family, we could compute the average value for each family, as well as the standard deviation. Later, if we wanted to check if a given color fit in a given family, we would determine how close the color was to the average of that family, measured in terms of standard deviation units. If the color was within one standard deviation unit, then it would be pretty clear that it would show up, uninvited, to Sunday dinner with that color family. If the color were to be more than three standard deviation units away, then it would be pretty clear that a formal invitation would be necessary.
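The one-dimensional version fits in a few lines of Python. This is only a sketch: the list of family values is made up for illustration, and sigma_distance is a hypothetical helper name.

```python
from statistics import mean, stdev


def sigma_distance(value, family_values):
    """How many standard deviation units `value` lies from the family average."""
    mu = mean(family_values)
    sigma = stdev(family_values)  # sample standard deviation
    return abs(value - mu) / sigma


# Hypothetical one-dimensional "color values" for one family (average is 57)
family = [48.0, 52.0, 55.0, 57.0, 60.0, 63.0, 64.0]

print(sigma_distance(60.0, family))  # under 1: shows up uninvited to Sunday dinner
print(sigma_distance(80.0, family))  # over 3: needs a formal invitation
```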
Meet Henry, the Ellipsoid [4]
These statistics are perfectly fine for dealing with data that lives along a line. In our case, however, we are dealing with data that lives in three-dimensional color space. What is the analogy in three dimensions of this fuzzy interval? The technique I have used determines the ellipsoid (with arbitrary axes and arbitrary tilt) that is analogous to the interval on the line... a three dimensional version of average and standard deviation [5].
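One common way to get a three-dimensional version of "average and standard deviation" is a mean vector plus a 3x3 covariance matrix, with distance from the centroid measured by the Mahalanobis distance. Whether that is exactly the computation behind ellipsoidification is an assumption on my part; the sketch below shows the idea in plain Python, with made-up CIELAB points and hypothetical function names.

```python
import math


def mean_vector(points):
    """Centroid of a list of (L*, a*, b*) triples."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(3)]


def covariance_matrix(points, mu):
    """3x3 sample covariance, the 3-D stand-in for the squared standard deviation."""
    n = len(points)
    return [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in points) / (n - 1)
             for j in range(3)] for i in range(3)]


def solve3(m, v):
    """Solve m @ x = v for a 3x3 matrix by Gaussian elimination with pivoting."""
    a = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (a[r][3] - tail) / a[r][r]
    return x


def sigma_units(lab, mu, cov):
    """Mahalanobis distance: standard deviation units from the centroid."""
    d = [lab[i] - mu[i] for i in range(3)]
    w = solve3(cov, d)  # w = cov^-1 @ d, without forming the inverse
    return math.sqrt(max(sum(d[i] * w[i] for i in range(3)), 0.0))


# Hypothetical CIELAB values for one family (made up for illustration)
points = [(50, 40, 5), (55, 55, 10), (57, 60, 15), (60, 62, 18),
          (63, 68, 22), (52, 50, 12), (62, 64, 16)]
mu = mean_vector(points)
cov = covariance_matrix(points, mu)

print(sigma_units((57, 57, 14), mu, cov))   # the centroid itself
print(sigma_units((80, 0, -40), mu, cov))   # a color far from the family
```

The surface where sigma_units equals 1 is the one standard deviation ellipsoid; the surfaces at 2 and 3 are the larger shells. The off-diagonal entries of the covariance matrix are what allow the ellipsoid to tilt.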
Where does the Red family live?
Now we come to the culmination of this fascinating research. I performed ellipsoidification to ascertain one ellipsoid that characterized each of the color families. Each ellipsoid was defined by a centroid, the size of each of the three axes, and two angles to define how it is tilted.
Here is the easy part. The centroid of the Red family ellipsoid is an L*a*b* value of {57, 57, 14}. To get an idea of how the family is arranged in color space around that centroid, I will give a series of slices through L*a*b* space, each at a different L* value.
Red does not show up at L* = 45, but it can be seen in the slice at L* = 50. You can see it living comfortably alongside the Black family. The smallest and darkest ellipse represents the one standard deviation unit region. The next two ellipses are the two and three standard deviation unit regions.
Black and red families shown at the L* = 50 plane
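For readers who want to see how a slice relates to an ellipsoid, here is a rough Python sketch. It assumes the ellipsoid is described by a centroid and a covariance matrix, with the k standard deviation shell being the set of colors at Mahalanobis distance k. The covariance numbers are hypothetical and are not fitted to the data behind these plots; only the centroid comes from the text.

```python
def slice_ellipse(l_star, mu, cov, k):
    """Cross-section of the k-sigma ellipsoid with the plane L* = l_star.

    Returns None if the plane misses the ellipsoid. Otherwise returns
    (center, shape, r2): the ellipse is the set of (a*, b*) points x with
    (x - center)^T shape^-1 (x - center) <= r2.
    """
    d_l = l_star - mu[0]
    var_l = cov[0][0]
    # Part of the squared sigma budget already spent on the L* offset
    r2 = k * k - d_l * d_l / var_l
    if r2 < 0:
        return None
    # The ellipse center drifts with L* when L* is correlated with a* or b*
    center = (mu[1] + cov[1][0] / var_l * d_l,
              mu[2] + cov[2][0] / var_l * d_l)
    # Covariance of (a*, b*) once L* is pinned down (Schur complement)
    shape = [[cov[1][1] - cov[1][0] * cov[0][1] / var_l,
              cov[1][2] - cov[1][0] * cov[0][2] / var_l],
             [cov[2][1] - cov[2][0] * cov[0][1] / var_l,
              cov[2][2] - cov[2][0] * cov[0][2] / var_l]]
    return center, shape, r2


mu = [57.0, 57.0, 14.0]            # centroid quoted in the text
cov = [[9.0, 6.0, 3.0],            # hypothetical covariance, illustration only
       [6.0, 40.0, 10.0],
       [3.0, 10.0, 25.0]]

print(slice_ellipse(45, mu, cov, 3))  # None: the plane misses the 3-sigma shell
print(slice_ellipse(50, mu, cov, 3))  # an ellipse, centered off the centroid
```

In this picture, a slice gets bigger as L* approaches the centroid's L* and shrinks to nothing as L* moves away, and the tilt of the ellipsoid shows up as a slice center that slides across the a*b* plane.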
As we move up in lightness to L* of 55, the neighborhood becomes a bit more interesting. The black and red families are still around, but we have started seeing the tip of the blue and purple families. The overlap between red and purple is interesting. All the colors in the purple family (at least in this range of L*) are fringe members of the red family.
Note that the ellipses for red are getting bigger. As we move up in L*, we move to the fatter part of the red family.
Black, red, blue and purple at L* = 55
Next, we see L* of 60. Brown has joined the neighborhood. As with the purple family, the Browns are living on the red fringe.
Various color families at L* = 60
Finally, at L* = 65, the red family is starting to peter out. Anything brighter, and it won't be called "red" anymore. We see that the green family is showing up, and has some fringe overlap with blue. 
Color families at L* = 65
Summary
Let me make this clear, just in case I haven't already. There are some limitations to the data that went into these plots. My assigning of color families to the color names is a bit iffy, and my assigning of CIELAB values to the color names is also iffy.
On the other hand, I think there is some promise to the technique. Given the right data, this technique could be used, for example, to look at a CIELAB value and determine which color families it belongs in and how strongly.
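As a sketch of what that lookup might look like: given one centroid and one covariance matrix per family, compute the distance in standard deviation units to every family and report the ones within three units, closest first. Everything here is illustrative. The family list, the purple centroid, and both covariance matrices are made up (and kept diagonal, meaning untilted, to keep the numbers readable); only the red centroid comes from the text.

```python
import math


def inverse3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def sigma_units(lab, mu, cov):
    """Mahalanobis distance from the family centroid, in sigma units."""
    inv = inverse3(cov)
    d = [lab[i] - mu[i] for i in range(3)]
    q = sum(d[i] * inv[i][j] * d[j] for i in range(3) for j in range(3))
    return math.sqrt(max(q, 0.0))


def family_memberships(lab, families, max_sigma=3.0):
    """Families whose max_sigma shell contains `lab`, strongest membership first."""
    scores = [(name, sigma_units(lab, mu, cov))
              for name, (mu, cov) in families.items()]
    return sorted((s for s in scores if s[1] <= max_sigma), key=lambda s: s[1])


# Hypothetical family ellipsoids: (centroid, covariance)
families = {
    "red": ([57, 57, 14], [[16, 0, 0], [0, 225, 0], [0, 0, 400]]),
    "purple": ([50, 40, -35], [[100, 0, 0], [0, 225, 0], [0, 0, 225]]),
}

# A color between the two centroids belongs to both, red more strongly
print(family_memberships((55, 50, -10), families))
```

A smaller number means a stronger claim to membership, so the output doubles as the "how strongly" part of the answer.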
Of course, there is an assumption here that color families can be approximated by ellipsoids. If I had more reliable data, I could test that hypothesis. Ahhh... all the fun research that I could be doing!
---------------
[1] It is interesting to note that eleven was one of my answers to the question of how many colors there are. At first, one would think that their list of eleven color families would be the same as my list of eleven color names that everyone knows. But this was not to be. Mr. Wiki did not check my blog post before writing his article. He must have heard of it though, because he had the correct number. But the Wikipedia list omits black, and includes cyan.
[2] I say this, of course, laughing under my breath. The color I see on my computer monitor, my laptop to which it's connected, the Kindle Fire I use to stay connected while I am in the "library", and my smart phone all have a different interpretation of any given RGB value. So, I readily acknowledge that this is a soft spot in this otherwise flawless research paper/blogpost.
[3] This technique was pioneered by the venerable John the Math Guy. Maybe saying that ellipsoidification is "commonly known" is a tiny bit premature, since the idea was first introduced in this blog post.
[4] Henry the Ellipsoid is kind of an egg-head.
[5] The math is not easy. Sorry!
I would like to direct your attention to http://blog.xkcd.com/2010/05/03/color-survey-results/ who did a color survey and has some interesting results
You may also have a look at Nathan Moroney's color naming experiment here or here if you don't know it already.
Greg, the blog you referenced is definitely good reading! Thanks for commenting.
JM, I am familiar with Nathan's work on the color naming experiment. Good stuff. One might note a certain similarity between the xkcd blog (which does not have an author's name) and Nathan's work. Hmmmm...
These are slightly different topics than my blog post - individual color naming versus color family assignment. Still very interesting stuff.
I'm sure you've explored Munsell and NCS color ordering systems. They assign colors to hue families. Munsell ten families, NCS five (I think). Here's one of my favorite color links: http://www.vcsconsulting.co.uk/VirtualAtlas.htm
No matter what it is you want to do, the reason for ordering color into families is to craft harmonious color relationships. If colors aren't reduced down to and ordered into a limited number of categories, or hue families, then you can't purposefully make color combinations/schemes. You're just guessing. And the quality of color relationships will always reveal those who guessed.
In the end, the most basic goal of any method to order color should be to make sense as a guide to color relationships and harmony.
That website is way cool! Thanks for sharing.
Color harmonies... very interesting topic. One that I hope to apply some math to. Someday.
Fascinating stuff John! Especially for one such as myself who spent two days straight recently with a French client (my language skills admittedly abysmal) experimenting on a packing project which required a "light Red but not pink in any way".
Thanks