Wednesday, November 28, 2012

A Tidey Question

First question - Why are there tides?

The connection of the Moon and Sun to the tides has been known since at least 150 BC. Seleucus was the ancient Greek go-to guy on tides. He knew that the Moon was in control of tides, but he thought the interaction was through "pneuma". Pneuma is something that doesn't exist but is all over the place. Pneuma was invented to allow the ancient Greeks to pretend that they understood. In this way, pneuma is akin to aether, phlogiston, electromagnetic waves, dark matter, and my non-existent buddy Horace. I pretend to go out for a beer with him when I don't want to tell my wife where I really am.

Where am I when I don't want my wife to know where I am? Probably singing an old Righteous Brothers song at a karaoke bar. She is trying to cure me of this addiction. [1]

One of my "go to" karaoke songs when I am in a tidal mood

Posidonius (who lived around 100 BC) was another ancient Greek who was interested in tides. He said that the Moon's effect was because the Moon heated the water enough to make it expand, but not enough to make it boil. The effect of the Moon on tides was, in his mind, proof of astrology. If the effect of the Moon on the entire ocean is so large, then why can't a star that is a zillion miles away have enough of a selective force over my life so as to predestine who I would fall in love with and that I should be a karaoke star? Damn the stars for sentencing me to this unresolvable conflict!

It wasn't until Newton's Philosophiæ Naturalis Principia Mathematica that gravitational forces were singled out as the way the Moon and Sun controlled the tides. That's not such a big surprise, since Newton invented gravity. By the way, gravity didn't exist before July 5, 1687, and neither did tides.

Tides are caused by the gravitational pull of the Moon and the Sun on the water. Gravity is pulling on the solid parts of Earth as well, but the water is free to slosh about. With respect to a point on Earth, the Sun rotates around every 24 hours, and the Moon every 24 hours and 50 minutes. This gives lots of opportunity for them to combine efforts and for them to cancel each other out as the two forces get into and out of phase with each other.
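
For the curious, here is a rough Python sketch of how often those two pulls drift back into step. It treats the 24 hours and the 24 hours 50 minutes quoted above as exact, which is an assumption, and the variable names are made up for illustration.

```python
# How long until the Sun and Moon are overhead together again?
# Assumes the round figures from the text are exact.
SOLAR_DAY_H = 24.0                 # Sun comes back overhead every 24 h
LUNAR_DAY_H = 24.0 + 50.0 / 60.0   # Moon comes back every 24 h 50 min

# Two cycles with periods T1 < T2 fall back into phase every
# 1 / (1/T1 - 1/T2) hours (the "beat" period).
beat_hours = 1.0 / (1.0 / SOLAR_DAY_H - 1.0 / LUNAR_DAY_H)
realign_days = beat_hours / 24.0

# The pulls also reinforce each other when Sun and Moon sit on opposite
# sides of the Earth, so the strongest tides come around twice per cycle.
reinforce_days = realign_days / 2.0

print(f"Sun and Moon line up again after about {realign_days:.1f} days")
print(f"The pulls reinforce about every {reinforce_days:.1f} days")
```

With these round numbers the realignment comes out at a bit under a month, and the reinforcement at roughly a fortnight.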

Fun fact #273 - Tidal waves are caused by earthquakes, not tides

Second question - Why are there two tides a day?

Now we have the more difficult question! The simple answer is that the oceans bow out on both sides of the Earth. The side facing the Moon or Sun has high tide, and the side facing away from the Moon or Sun also has high tide. Note that if the Moon is making high tides for me in Milwaukee and for my good friends in India, the Moon is doing nothing for my buddies in England.

Actual unretouched photo of the Earth with some way big tides

I have used a technique here called "answering a hard question with a vague and incomplete answer that really doesn't address the real question." This never worked on my mother, either.

Third question - Why is there a tidal bulge on the opposite side of the Earth?

This third question is the tough question. It almost looks like the Moon's gravitational force is pushing the water on the other side of the planet away. That don't make no sense. I went googling for the answer to this enigma. I found lots of answers. I have tried to arrange them according to similar explanations:

Crazy quirky

Additionally, by a crazy quirk of physics, it also causes the water to dome on the opposite side of the earth.

I will add "crazy quirk" to my list including pneuma, aether, and my non-existent buddy Horace.

Effect of the Sun

BUT, the sun is also pulling at the same time in the opposite direction halfway around the world. This produces two bulges, one near the sun, and one near the moon.

Ummm... so the Sun is always directly opposite the Moon?

Centrifugal force

I found a lot of explanations that call on centrifugal force to explain the bulge on the opposite side.

The centrifugal force produced by the Earth's rotation cause water to pile up on the opposite side as well.

On the side of the earth directly opposite the moon, the net tide-producing force is in the direction of the greater centrifugal force, or away from the moon.

At the centre of the earth the two forces acting: gravity towards the moon and a rotational force away from the moon are perfectly in balance. … On the opposite side of the earth, gravity is less as it is further from the moon, so the rotational force is dominant.

…on the near side the direct pull dominates and causes the oceans to bulge in the direction of the moon; on the far side the centrifugal effect dominates and causes the oceans to bulge in the direction away from the moon.

These explanations are cool, and obvious, right? But, they kinda miss the point. Isn't the centrifugal force pretty much the same all the way around the globe? Why does the water bulge just at the two ends?

It's obvious

This is one of my favorite explanations. Wrap a hard problem in fancy words and then slip in an "it's obvious".

If every particle of the earth and ocean were being urged by equal and parallel forces, there would be no cause for relative motion between the ocean and the earth. Hence it is the departure of the force acting on any particle from the average which constitutes the tide-generating force. Now it is obvious that on the side of the earth towards the moon the departure from the average is a small force directed towards the moon; and on the side of the earth away from the moon the departure is a small force directed away from the moon.
http://www.1902encyclopedia.com/T/TID/tides-03.html

I was with ya until the part about "every particle"

Gravity differential, correct but confusing

Now we come to the real reason for two tides. The pull of gravity is slightly greater on the side of the Earth that faces the Moon as compared with the pull on the side that is opposite the Moon. I'll give my explanation of why this should cause tides in a bit, but first I want to acknowledge some explanations that are correct, but still confusing.

The bulge on the side of the Earth opposite the moon is caused by the moon "pulling the Earth away" from the water on that side.

What???

The not so obvious part is that the water on the far side is getting left behind because the earth is getting pulled away from it.

This sounds like that last explanation that I couldn't understand!

Owing to the differences of distance of the moon from various portions of the earth, the amount of attractive force will be different in different places and tend to produce a deformation.
Van Nostrand's Scientific Encyclopedia, Fourth Edition, 1968, entry on Tides

Ok, so why will it tend to produce a deformation???

And the water which is closer to the moon is pulled more strongly and so it’s pulled up into a tide. The [water] on the opposite side is pulled slightly less strongly and so it’s pulled down less strongly towards the surface of the Earth and so you get a second bulge on the far side of the Earth.

Ummm... it is pulled less strongly... why does that cause a bulge on the opposite side? [2]

The Moon exerts a force on the Earth, and Earth responds by accelerating toward the Moon; however, the waters on the side facing the Moon, being closer to the Moon, accelerate more and fall ahead of Earth. Similarly, Earth itself accelerates more than the waters on the far side and falls ahead of these waters. Thus two aqueous bulges are produced, one on the side of Earth facing the Moon, and one on the side facing away from the Moon.

The Moon is falling! The Moon is falling! We must run and tell the king!

Trophy for best explanation I could find

I won this trophy for my karaoke rendition of Fly Me to the Moon

Here is the explanation that came closest to explaining it for me. I still think it needs work, but this got me thinking along the right track.

The pull is greater on the side facing the Moon, pulling the water there closer to the Moon, while the pull is weaker on the side away from the Moon, making the water there lag behind. This stretches out the Earth and the water on it, creating two bulges.

First, gravitational pull decreases with distance. If I am standing directly beneath the Moon, its tug on me will be larger than the tug I would get if I were on the opposite side of the Earth. This has nothing to do with the Earth getting in the way. It is all about distance.

Now, if I pull really hard on one side of a ball, and pull not quite so hard on the middle, and still a little less on the opposite side, the ball will deform a bit.

That's the layman's explanation. The answer for a graduate physics exam would probably be a bit more involved. There would be some stuff about the inverse square law [3], and how the Moon and Earth are star-crossed lovers, destined to never meet [4]. Momentum and some crazy stuff about adding vectors of motion together would probably jump out.

But in the end, it's all about stretching.
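
To put rough numbers on the stretching, here is a small Python sketch of the inverse square law at three spots: the side of the Earth facing the Moon, the center, and the far side. The constants are rounded textbook values and the names are invented for illustration, so treat the output as ballpark only.

```python
# Rounded textbook values (assumptions, good to a few digits at best).
GM_MOON = 4.903e12     # m^3/s^2, gravitational constant times the Moon's mass
MOON_DISTANCE = 3.844e8  # m, average Earth-Moon center-to-center distance
EARTH_RADIUS = 6.371e6   # m, average radius of the Earth

def moon_pull(distance_m):
    """Acceleration toward the Moon at a given distance (inverse square law)."""
    return GM_MOON / distance_m ** 2

pull_near = moon_pull(MOON_DISTANCE - EARTH_RADIUS)   # side facing the Moon
pull_center = moon_pull(MOON_DISTANCE)                # middle of the Earth
pull_far = moon_pull(MOON_DISTANCE + EARTH_RADIUS)    # side facing away

# What matters for the bulges is the pull relative to the center.
stretch_near = pull_near - pull_center   # positive: tugged ahead of the center
stretch_far = pull_far - pull_center     # negative: left behind by the center

print(f"near side: {stretch_near:+.2e} m/s^2")
print(f"far side:  {stretch_far:+.2e} m/s^2")
```

The two differences have opposite signs and nearly the same size, about a millionth of a meter per second squared, which is the stretch on both sides of the ball.
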
--------------
[1]  I know that, as a blogger, I have a solemn obligation to always tell the truth, but... I lied when I said that my wife is trying to cure me of my karaoke addiction. She has been known to grab the mic herself.

[2] This answer was from the Naked Scientists podcast. Very entertaining. I recommend it. Except they have this thing about trying to make science entertaining. Come on. Science is serious business. Stop making it fun.

[3] This law is on the statutes for Rhode Island. It says that you are not allowed to invert a square on any holidays that involve eating.

[4] The term "star-crossed lovers" comes from Shakespeare in reference to Romeo and Juliet. This is an astrological reference. It means that the stars have thwarted the romance. I think it was very clever of me to slip in another reference to astrology.

Wednesday, November 21, 2012

The Full Monty

My good buddy Steve from the UK gave me the idea for this week's blog. He suggested that I write about  the "Monty Hall" problem. My first reaction was that this was yet another dumb British idea, like Twiggy, Monty Python, and double-decker buses. What could I possibly have to say that hasn't already been said by a whole bunch of people who are either really smart, or who claim to be really smart?

We have a lot to thank the Brits for

One person who fits into one of those categories is Marilyn vos Savant [1]. She treated the Monty Hall problem in her Parade magazine columns many years ago. As I recall, she caught a lot of flak from a lot of really smart people for giving what they were sure was the wrong answer. So, I know that whatever I say, I will get flak from someone who claims to be smart.

Monroe or vos Savant? Ginger or Mary Ann?

Then I stopped to think. I was a teenage nerd once, so I kinda like Monty Python. Maybe the Brits do have some good ideas once in a while. Maybe it wasn't such a bad idea for us Americans to send settlers to colonize England a few centuries ago. And maybe it's time for a blog post on the Monty Hall problem.

The Monty Hall problem

Years ago (oh no, not another "back when I was a kid" lecture) there was a show called Let's Make a Deal, hosted by Monty Hall. For reasons that I was never able to understand, people showed up in the audience dressed like they were going to a showing of Rocky Horror. Only there weren't quite so many drag queens. Some of the folks with the most outlandish outfits became contestants. And at the end of every show, there was one contestant who got to choose from three doors "behind where Carol Merrill is standing". [2]

Vanna White's role model

There was always a fabulous prize behind one door, like a boat the size of Lake Michigan or a new kitchen filled with a lifetime supply of Spam and some baked beans. The other doors? Goats. Maybe it wasn't always a goat, but that's what I remember. I remember thinking it might be kinda cool to have a goat. Although the lifetime supply of Spam sounds pretty alluring.

What? You don't want me?!?!?

Now we're coming to the fun part. No matter which door the contestant picked, Monty would open a different door, showing a goat, and ask if they wanted to change their choice. What's the best strategy for the contestant? Change or hang on? One line of reasoning says that there are two doors, one with something good, the other with something bad. Fifty-fifty. Why bother changing? Monty is just trying to trick you into changing to the goat.

The other line of reasoning is long and arduous and you have to think and it's hard and ... well, it says that your chances are better if you change doors. Like... for some reason your probability of picking the fabulous prize has changed from 1 in 3? Or something. Really?

Wikipedia, the world's largest source of misinformation, has an entry on this famous problem. They say that it makes sense to change. They explain this non-intuitive answer in several non-intuitive ways. I am going to add yet another explanation, but first I am going to appear to change the subject.

Monte Carlo methods

Monte Carlo. It brings to mind "shaken, not stirred" and casinos and baccarat. And numerical methods, of course. Well, maybe not for most people, but certainly for me.

Did you say Monte Carlo methods?

In college, I took a class in Monte Carlo methods. Basically, this is a way of solving numerical problems using random numbers. The first use of this method was back in the days of the Manhattan Project. The guys [3] were working on shielding from neutrons. They could easily explain the path of any particular neutron, but they had trouble finding the equation that would explain what happens to the neutrons in the aggregate as a function of the material and its thickness. They eventually solved the problem by using random numbers to simulate the path of thousands or millions of neutrons - start each simulated neutron out from a random position, moving in a random direction and see where it goes.

The professor for my class, John Halton [4], started us out with a bit of a refresher on probability. He asked us to compute the odds of all the basic poker hands: three of a kind, flush, straight, etc. This is a bit of a tricky problem, not impossible, but it's easy for a kitten like me to get balled up in the combinatorics.

I knew that I was likely to make some sort of mistake, so I decided to double check my answers by doing a little computer simulation. I wrote the code to deal hands at random, and then decide what the hand was. I let my computer play poker overnight, and then checked in the morning to get the results. Sure enough, my simulation revealed that one of my calculations was off by a factor of two. I found my error, and I turned in both the combinatoric solution and the program for my simulation.
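
Here is a rough Python sketch of that kind of double check, for one hand only: three of a kind. It is not the original homework program; the function name, the seed, and the number of deals are all made up for illustration.

```python
import random
from collections import Counter
from math import comb

def is_three_of_a_kind(hand):
    """True for exactly three cards of one rank plus two unmatched cards."""
    rank_counts = sorted(Counter(rank for rank, _suit in hand).values())
    return rank_counts == [1, 1, 3]

# Combinatoric answer: choose the tripled rank, 3 of its 4 suits,
# 2 other distinct ranks, and a suit for each of those two cards.
exact = comb(13, 1) * comb(4, 3) * comb(12, 2) * 4 ** 2 / comb(52, 5)

# Monte Carlo answer: deal lots of random hands and count the hits.
deck = [(rank, suit) for rank in range(13) for suit in range(4)]
rng = random.Random(2012)    # fixed seed so the run is repeatable
deals = 200_000
hits = sum(is_three_of_a_kind(rng.sample(deck, 5)) for _ in range(deals))
estimate = hits / deals

print(f"combinatorics: {exact:.5f}   simulation: {estimate:.5f}")
```

If the two numbers disagree by a factor of two, one of them has a bug, and it is usually the combinatorics.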

This was to be (perhaps) the shining moment of my otherwise drab and wretched college years, since I was never elected class president or homecoming queen. Dr. Halton was tickled that I used a Monte Carlo method for an assignment in a Monte Carlo methods class, and he took part of a lecture to explain how I had double checked my homework.

Actual photo of me basking in the glory of my well-deserved accolades
I was so handsome in those days

I have used this same approach numerous times. Sure, I can derive equations and write them out and solve them. But I have also learned that, despite meticulous double checking, I can still have bone-headed mistakes interspersed with my usual absolutely brilliant analysis. When a problem lends itself to Monte Carlo simulation, I will often use the simulation to double check my answer.

Finally getting to the point

For those readers who may have forgotten what this Full Monty blog post is about, I have been making Monty Python references to lead up to a discussion about using a Monte Carlo method to solve the Monty Hall problem.

This problem is ideal for Monte Carlo double checking. It is complicated enough that it is easy to talk yourself into some wrong assumptions, but at the heart, it is quite simple to simulate.

1. Randomly decide which of the three doors will conceal the fabulous prize.
2. Randomly select a door for the contestant.
3. Decide which door Monty will show.
4. Switch the contestant to the remaining door - not the one chosen, and not the one opened.
5. Record results.
6. Repeat a zillion times.
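
The six steps above can be sketched in a few lines of Python. This is an illustration only, not the Mathematica program listed at the end of this post; the function name, the seed, and the number of games are assumptions.

```python
import random

def play_switching_game(rng):
    """Play one game where the contestant always switches; True means a win."""
    doors = [1, 2, 3]
    prize = rng.choice(doors)        # step 1: hide the fabulous prize
    pick = rng.choice(doors)         # step 2: contestant's first pick
    # Step 3: Monty opens a door that is neither the pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    # Step 4: switch to the one door that is neither picked nor opened.
    new_pick = next(d for d in doors if d != pick and d != opened)
    return new_pick == prize         # step 5: record the result

rng = random.Random(1687)            # fixed seed so the run is repeatable
games = 100_000                      # step 6: a zillion, scaled down a bit
wins = sum(play_switching_game(rng) for _ in range(games))
win_rate = wins / games

print(f"Switching won {win_rate:.3f} of the time")
```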

It's hard to get balled up in this simulation. Or then again... Some clever mathematical chap might try to get all clever on the algorithm that I have described. For example, someone could "clever this algorithm up" when it comes to step #4. It's a bit hard to put that step into code. But here is something clever. Let's say that the door picked is 2, and the door opened is 3. Subtract those from 6, and what do you get? 6 - 2 - 3 = 1. This is the remaining door. This always works. Clever, eh? I'm rather proud of it.
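
A quick Python check of the subtract-from-6 trick, run over every possible pairing of a picked door and a different opened door. The function name is invented for illustration.

```python
from itertools import permutations

def remaining_door(picked, opened):
    """The door that is neither picked nor opened, since 1 + 2 + 3 = 6."""
    return 6 - picked - opened

# Try every ordered pair of two different doors.
for picked, opened in permutations([1, 2, 3], 2):
    leftover = ({1, 2, 3} - {picked, opened}).pop()
    assert remaining_door(picked, opened) == leftover

print("The trick holds for all six pairings")
```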

This cute image has little to do with the rest of this blog

A clever mathematical chap might be tempted to use this trick on #3 as well. Unfortunately this would be a bug. It works when the contestant picks a door with a goat behind it, but not when he picks the door with the yacht.
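
To see the bug in the flesh, here is a short Python sketch that applies the same subtraction to step #3 in the unlucky case where the contestant has picked the prize door. The function name is made up for illustration.

```python
def monty_opens_buggy(picked, prize):
    """Tempting but wrong: reuse the subtract-from-6 trick for Monty's door."""
    return 6 - picked - prize

# When the contestant has picked the prize door, the two inputs are equal.
bad_cases = {door: monty_opens_buggy(door, door) for door in (1, 2, 3)}

for door, opened in bad_cases.items():
    print(f"picked door {door}, Monty 'opens' door {opened}")
```

None of the three answers is a legal door for Monty: one is a door that doesn't exist on the high side, one is the contestant's own door, and one is a door that doesn't exist on the low side.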

If you feel tempted to simplify the problem in this way, then please slap your face. The idea of trying to apply clever mathematical analysis completely defeats the purpose of using Monte Carlo to check your work. The whole point to using this technique is to avoid falling into the trap of outclevering yourself.

Results

Here is the test. I have assumed (just for the sake of simplifying this discussion) that the probability of winning a yacht is one in three if the contestant does not switch. I wrote some code to simulate what would happen if the contestant switched every time.

The program that I wrote is listed at the end of this blog post. I make no claims that this is the most efficient way to write the code. In fact, I intentionally focused on making the code as straightforward as possible. Oh... also... I do most of my programming in Mathematica. I'm not going to apologize for that. I realize there are probably some Matlab users reading this. Well... I have been using Mathematica since 1985. So there.

I ran the program once, with 100 iterations, and got 66 yachts and 34 goats. This sounds convincing. The number of yachts is much closer to 2/3 than 1/3. But, this is statistics, so I could easily get thrown off. I could solve that by upping the number of iterations to one thousand or one million. This should give me a better estimate of the odds, but I really don't know much about the dependability of my estimate.

I have a cleverer trick. How about I run this program with 100 iterations ten times. The experiment will have been run with a total of 1,000 contestants, but I would have them partitioned off in groups of 100. This gives me an idea of the range of the estimates.

Here are the numbers of yachts that were won in each batch of 100: 67, 71, 60, 66, 65, 61, 68, 61, 61, and 66. The mean is 64.70 and the standard deviation is 3.59. I am going to take a big leap now, and propose that this distribution is roughly normal [5]. If that is the case, then the number of yachts won per hundred has a 95% chance of being between 57.52 and 71.88. (I have gone two standard deviation units to either side.) I have no idea where I am going to park all those yachts.

Now I need to be careful in how I say this next bit. The range (57.52 to 71.88) is the range for the number of yachts per hundred. It is not the range for my overall estimate of the mean. The mean was based on a total of 1,000 trials, so clearly the range must have gotten smaller.

The rule is that when you average n batches, the standard deviation of that average goes down by a factor of the square root of n. I ran ten batches, so the range has been reduced by a factor of just over 3. I know then that the average number of yachts won per hundred trials is between 62.43 and 66.97, with 95% confidence.

Based on that, I can safely exclude the possibility that there is no advantage to switching. Also, I can't exclude the possibility that the probability of winning when you switch is 2 in 3.
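
The arithmetic of the last few paragraphs can be sketched in Python as below, using the ten batch counts exactly as listed. One caution: the counts as typed work out to a mean of 64.6 and a standard deviation of about 3.69, a hair off the 64.70 and 3.59 quoted, so the ranges printed here differ slightly in the decimals. The conclusion does not change.

```python
from math import sqrt
from statistics import mean, stdev

# Yachts won in each batch of 100, as listed in the text.
batches = [67, 71, 60, 66, 65, 61, 68, 61, 61, 66]

batch_mean = mean(batches)
batch_sd = stdev(batches)                  # sample standard deviation (n - 1)

# Range for a single batch of 100: two standard deviations either side.
batch_range = (batch_mean - 2 * batch_sd, batch_mean + 2 * batch_sd)

# Range for the average of all ten batches: the standard deviation of
# the average shrinks by the square root of the number of batches.
std_error = batch_sd / sqrt(len(batches))
mean_range = (batch_mean - 2 * std_error, batch_mean + 2 * std_error)

print(f"mean {batch_mean:.2f}, standard deviation {batch_sd:.2f}")
print(f"one batch:   {batch_range[0]:.2f} to {batch_range[1]:.2f}")
print(f"the average: {mean_range[0]:.2f} to {mean_range[1]:.2f}")
```

Fifty yachts per hundred (no advantage to switching) sits far below the range for the average, while two in three sits inside it.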

Another application

This is a technique that I use frequently to double check my algebra. I plan on using it shortly when I do some further investigation on the distribution of ΔE values. I am wondering about the theoretical distribution. Some have called it a chi-squared function. I think that the distribution of  ΔE squared might be chi-squared, but I can test this. And I should, because I know I often get myself balled up.

--------------------------
[1] What kind of a name is that anyway? Who calls themselves "savant"? I mean really, I would never call myself John the Savant Guy!

[2] Carol Merrill was born in Wisconsin. John the Math Guy was born in Wisconsin. Draw your own conclusions.

[3] You know, the same old guys. Stanislaw Ulam, Nicholas Metropolis, Enrico Fermi, and my idol, John von Neumann. I have all their baseball cards.

[4] John Halton was from England, by the way. Oxford. Cambridge. All that rot. I do not know whether he knew John Cleese or Eric Idle. He did come into class one day, telling us how just a sprinkle of water mixed in with the beaten eggs made the fluffiest of omelets, though.

[5] This is another important discussion - whether you can assume that a distribution is normal - but I will leave that for another blog post.

The program (in Mathematica)
yaughts = 0;
goats = 0;
Do[
  (* pick door for yaught and contestant pick, randomly and independently *)
  yaughtdoor = Random[Integer, {1, 3}];
  pickdoor = Random[Integer, {1, 3}];

  (* assemble list of doors that Monty can open *)
  freetoopen = {};
  Do[
    If[
      (door ≠ pickdoor) && (door ≠ yaughtdoor),
      AppendTo[freetoopen, door]
    ],
    {door, 1, 3}];

  (* let Monty pick from the door(s) available *)
  If[Length[freetoopen] == 1,
    opendoor = freetoopen[[1]],
    opendoor = freetoopen[[Random[Integer, {1, 2}]]]
  ];

  (* determine the remaining door that contestant can pick *)
  Do[
    If[
      (door ≠ pickdoor) && (door ≠ opendoor),
      newpick = door
    ],
    {door, 1, 3}];

  (* evaluate results *)
  If[
    newpick == yaughtdoor,
    yaughts++,
    goats++
  ],
  {100}]
Print["Yaughts and goats: ", {yaughts, goats}]

Wednesday, November 14, 2012

Assessing color difference data

The punch line

For those who are too impatient to wade through the convoluted perambulations of a slightly senile math guy, and for those who already understand the setting of this problem, I will cut to the punch line. When looking at a collection of color difference data (ΔE values), it makes no difference whether you look at the median, the 68th, 90th, 95th, or 99th percentile. You can do a pretty darn good job of describing the statistical distribution with just one number. The maximum color difference, on the other hand, is in a class by itself.

Cool looking function that has something to do with this blog post, copied,
without so much as even dropping an email, from my friend Steve Viggiano

In the next section of this techno-blog, I explain what that all means to those not familiar with statistical analysis of color data. I give permission to those who are print and color savvy to skip to the final section. In this aforementioned final section, I describe an experiment that provides rather compelling evidence for the punch line that I started out with.

Hypothetical situation

Suppose for the moment that you are in charge of QC for a printing plant, or that you are a print buyer who is interested in making sure the proper color is delivered. Given my readership, I would expect that this might not be all that hard for some of you to imagine.

If you are in either of those positions, you are probably familiar with the phrase "ΔE", pronounced "delta E". You probably understand that this is a measurement of the difference between two colors, and that 1 ΔE is pretty small, and that 10 ΔE is kinda big. If you happen to be a color scientist, you probably understand that ΔE is a measurement of the difference between two colors, and that 1 ΔE is (usually) pretty small, and that 10 ΔE is (usually) kinda big [1].

Color difference example copied,
without so much as even dropping him an email, from my friend Dimitri Pluomidis

When a printer tries valiantly to prove his or her printing prowess to the print buyer, they will often print a special test form called a "test target". This test target will have some big number of color patches that span the gamut of colors that can be printed. There might be 1,617 patches, or maybe 928... it depends on the test target. Each of these patches in the test target has a target color value [2], so each of these printed patches has a color error that can be ascribed to it, each color error (ΔE) describing just how close the printed color is to reaching the target color.

An IT8 target

This test target serves to demonstrate that the printer is capable of producing the required colors, at least once. For day-to-day work, the printer may use a much smaller collection of patches (somewhere between 8 and 30) to demonstrate continued compliance to the target colors. These can be measured through the run. For an 8 hour shift, there might be on the order of 100,000 measurements. Each of these measurements could have a ΔE associated with it.

If the printer and the print buyer have a huge amount of time on their hands because they don't have Twitter accounts [3], they might well fancy having a look at all the thousands of numbers, just to make sure that everything is copacetic. But I would guess that  if the printers and print buyers have that kind of time on their hands, they might prefer watching reruns of Andy Griffith on YouTube, doing shots of tequila whenever Opie calls his father "paw".

But I think that both the printer and the print buyer would prefer to agree on a way to distill that big set of color error data down to a very small set of numbers (ideally a single number) that could be used as a tolerance. Below that number is acceptable, above that number is unacceptable.

It's all about distillation of data

But what number to settle on? When there is a lot at stake (as in bank notes, lottery tickets and pharmaceutical labels) the statistic of choice might be the maximum. For these, getting the correct print is vitally important. For cereal boxes and high class lingerie catalogs (you know which ones I am talking about), the print buyer might ask for the 95th percentile - 95% of the colors must be within a specified color difference ΔE. The printer might push for the average ΔE, since this number sounds less demanding. A stats person might go for the 68th percentile, purely for sentimental reasons.

How to decide? I had a hunch that it really didn't matter which statistic was chosen, so I devised a little experiment with big data to prove it.

The distribution of color difference data

Some people collect dishwasher parts, and others collect ex-wives. Me? I collect data sets [4]. For this blog post I drew together measurements from 176 test targets. Some of these were printed on a lot of different newspaper presses, some were from a lot of ink jet proofers, some were printed with flexography. For each, I found a reasonable set of aim color values [5], and I computed a few metric tons of color values in ΔE00 [6].

Let's look at one set of color difference data. The graph represents the color errors from one test target with 1,617 different colors. The 1,617 color differences were then collected in a spreadsheet to make this CPDF (cumulative probability density function). CPDFs are not that hard to compute in a spreadsheet. Plunk the data into the first column, and then sort this from small to large. If you like, you can get the percentages on the graph by adding a second column to the spreadsheet that goes from 0 to 1. If you have this second column to the right, then the plot will come out correctly oriented.
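
For those who would rather not open a spreadsheet, the same two-column recipe can be sketched in Python. The color differences below are made-up stand-in numbers, not measurements from any test target, and the function name is invented for illustration.

```python
import random

# Made-up stand-in for one target's 1,617 color differences (NOT real data).
rng = random.Random(1617)
delta_e = [abs(rng.gauss(1.5, 0.8)) for _ in range(1617)]

# First spreadsheet column: the data, sorted from small to large.
sorted_de = sorted(delta_e)

# Second column: evenly spaced values running from 0 to 1.
n = len(sorted_de)
cumulative = [i / (n - 1) for i in range(n)]

def read_percentile(fraction):
    """First color difference whose cumulative fraction reaches `fraction` (0 to 1)."""
    for de, cum in zip(sorted_de, cumulative):
        if cum >= fraction:
            return de
    return sorted_de[-1]

median_de = read_percentile(0.50)
p95_de = read_percentile(0.95)
print(f"median = {median_de:.2f}, 95th percentile = {p95_de:.2f}")
```

Plotting `sorted_de` across and `cumulative` up gives the chart; `read_percentile` is just the act of running a finger across from the vertical axis.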

Example of the CPDF of color difference data

This chart makes it rather easy to read off any percentile. In red, I have shown the 50th percentile - something over 1.4 ΔE00. If you are snooty, you might want to call this the median. In green, I have shown the 95th percentile - 3.0 ΔE00. If you are snooty, you might want to call this the 95th percentile.

Now that we understand how a CPDF plot works, let's have a look at some of the 176 CPDF plots that I have at my beck and call. I have 9 of them below.

Sampling of real CPDFs

One thing that I hope is apparent is that, aside from the rightmost two of them, they all have more or less the same shape. This is a good thing. It suggests that maybe our quest might not be for naught. If they are all alike, then I could just compute (for example) the median of my particular data set, and then just select the CPDF from the curves above which has the same median. This would then give me a decent estimate of any percentile that I wanted.

How good would that estimate be? Here is another look at some CPDFs from that same data set. I chose all the ones that had a median somewhere close to 2.4 ΔE00.

Sampling of real CPDFs with median near 2.4

How good is this for an estimate? This says that if my median were 2.4 ΔE00, then the 90th percentile (at the extreme) might be anywhere from 3.4 to 4.6 ΔE00, but would likely be about 4.0 ΔE00.

I have another way of showing that data. The graph below shows the relationship between the median and 90th percentile values for all 176 data sets. The straight line on the graph is a regression line that goes through zero. It says that 90th percentile = 1.64 * median. I may be an overly optimistic geek, but I think this is pretty darn cool. Whenever I see an r-squared value of 0.9468, I get pretty excited.
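
A regression line forced through zero has a one-line formula for its slope: the sum of x times y divided by the sum of x squared. Here is a Python sketch using a handful of invented (median, 90th percentile) pairs. These are not the 176 real data sets. The r-squared shown is one common definition, and spreadsheet programs do not all agree on it for a line forced through zero.

```python
# Hypothetical (median, 90th percentile) pairs, one per data set.
# These are illustrative numbers only, NOT the 176 real data sets.
medians = [1.0, 1.5, 2.0, 2.4, 3.0]
p90s = [1.7, 2.4, 3.3, 4.0, 4.8]

# Least-squares slope for a line through the origin: sum(xy) / sum(xx).
sum_xy = sum(x * y for x, y in zip(medians, p90s))
sum_xx = sum(x * x for x in medians)
slope = sum_xy / sum_xx

# One common r-squared: 1 - (residual scatter) / (scatter about the mean).
predicted = [slope * x for x in medians]
mean_y = sum(p90s) / len(p90s)
ss_res = sum((y - p) ** 2 for y, p in zip(p90s, predicted))
ss_tot = sum((y - mean_y) ** 2 for y in p90s)
r_squared = 1 - ss_res / ss_tot

print(f"90th percentile = {slope:.2f} * median, r-squared = {r_squared:.3f}")
```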

Ignore this caption and look at the title on the graph

Ok... I anticipate a question here. "What about the 95th percentile? Surely that can't be all that good!" Just in case someone asks, I have provided the graph below. The scatter of points is broader, but the r-squared value (0.9029) is still not so bad. Note that the formula for this is 95th percentile = 1.84 * median.

Ignore this one, too

Naturally, someone will ask if we can take this to the extreme. If I know the median, how well can I predict the maximum color difference? The graph below should answer that question. One would estimate the maximum as being 2.8 times the median, but look at the r-squared value: 0.378. This is not the sort of r-squared value that gets me all hot and bothered.

Max does not play well with others

I am not surprised by this. The maximum of a data set is a very unstable metric. Unless there is a strong reason for using this as a descriptive statistic, this is not a good way to assess the "quality" of a production run. This sounds like the sort of thing I may elaborate on in a future blog.

The table below tells how to estimate each of the deciles (and a few other delectable values) from the median of a set of color difference data. This table was generated strictly empirically, based on 176 data sets at my disposal. For example, the 10th percentile can be estimated by multiplying the median by 0.467.  This table, as I have said, is based on color differences between measured and aim values on a test target [7].

P-tile   Multiplier   r-squared
10       0.467        0.939
20       0.631        0.974
30       0.762        0.988
40       0.883        0.997
50       1.000        1.000
60       1.121        0.997
68       1.224        0.993
70       1.251        0.991
80       1.410        0.979
90       1.643        0.947
95       1.840        0.903
99       2.226        0.752
Max      2.816        0.378
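
Using the table is a single multiplication. Here is a small Python sketch with the multipliers copied from the table above; the function name is invented for illustration, and the Max row is left out on purpose because of its poor r-squared.

```python
# Multipliers copied from the table (percentile -> multiplier on the median).
MULTIPLIER = {
    10: 0.467, 20: 0.631, 30: 0.762, 40: 0.883, 50: 1.000, 60: 1.121,
    68: 1.224, 70: 1.251, 80: 1.410, 90: 1.643, 95: 1.840, 99: 2.226,
}

def estimate_percentile(median_de, ptile):
    """Estimate a percentile of a color difference data set from its median."""
    return median_de * MULTIPLIER[ptile]

# A data set with a median of 2.4, as in the earlier example.
for ptile in (68, 90, 95):
    print(f"{ptile}th percentile is roughly {estimate_percentile(2.4, ptile):.2f}")
```

For a median of 2.4, the 90th percentile estimate lands just under 4.0, in line with the cluster of curves shown earlier.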

Caveats and acknowledgements

There has not been a great deal of work on this, but I have run into three papers.

Fred Dolezalek [8] posited in a 1994 TAGA paper that the CRF of ΔE variations of printed samples can be characterized by a single number. His reasoning was based on the statement that the distribution “should” be chi-squared with three degrees of freedom. He had test data from 19 press runs with an average of 20 to 30 sheet pulls. It’s not clear how many CMYK combinations he looked at, but it sounds like a few thousand data points, which is pretty impressive for the time for someone with an SPM 100 handheld spectrophotometer!

Steve Viggiano [9] considered the issue in an unpublished 1999 paper. He pointed out that the chi-squared distribution with three degrees of freedom can be derived from the assumptions that ΔL*, Δa*, and Δb* values are normally distributed, have zero mean, have the same standard deviation, and are uncorrelated. He pointed out that these assumptions are not likely to be met with real data. I'm inclined to agree with Steve, since I hardly understand anything of what he tells me.

David McDowell [10] looked at statistical distributions of color errors of a large number of Kodak QC-60 Color Input Targets and came to the conclusion that this set of color errors could be modeled as a chi-squared function.

Clearly, the distribution of color errors could be anything it wants to be. It all depends on where the data came from. This point was not lost on Dolezalek. In his analysis, he found that the distribution only looked like a chi-squared distribution when the press was running stable.

Future research

What research paper is complete without a section devoted to "clearly further research is warranted"? This is research lingo for "this is why my project deserves to continue being funded".

I have not investigated whether the chi-squared function is the ideal function to fit all these distributions. Certainly it would be a good guess. I am glad to have a database that I can use to test this. While the chi-squared function makes sense, it is certainly not the only game in town. There are the logistic function, the Weibull function, all those silly beta functions... Need I go on? The names are as familiar to me as to everyone. Clearly further research is warranted.

Although I have access to lots of run time data, I have not investigated the statistical distributions of this data. Clearly further research is warranted.

Perhaps the chi-squared-ness of the statistical distribution of color errors is a measure of color stability? If there were a quick way to rate the degree to which any particular data set fits the chi-squared function, maybe this could be used as an early warning sign that something is amiss. Clearly further research is warranted.
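One possible way to put a number on "chi-squared-ness" is sketched below. This is an assumption on my part, not something I have tested on real press data. It uses the fact that if ΔE squared (scaled by the variance) is chi-squared with three degrees of freedom, then ΔE itself follows the chi distribution with three degrees of freedom. The sketch scales that model so its median matches the data, then reports the largest gap between the model CDF and the empirical CDF (a Kolmogorov-Smirnov style distance). The names chi3_cdf, chi3_median, and fit_score are hypothetical.

```python
import math
import random
import statistics


def chi3_cdf(x, scale=1.0):
    """CDF of the chi distribution with 3 degrees of freedom (Maxwell form)."""
    if x <= 0:
        return 0.0
    z = x / scale
    return math.erf(z / math.sqrt(2)) - math.sqrt(2 / math.pi) * z * math.exp(-z * z / 2)


def chi3_median():
    """Median of the unit-scale chi-3 distribution, found by bisection on the CDF."""
    lo, hi = 0.0, 10.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if chi3_cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def fit_score(delta_es):
    """Largest gap between the empirical CDF and a median-matched chi-3 CDF.

    Smaller is better; 0 would be a perfect fit.
    """
    data = sorted(delta_es)
    n = len(data)
    scale = statistics.median(data) / chi3_median()
    worst = 0.0
    for i, x in enumerate(data):
        model = chi3_cdf(x, scale)
        # Compare against the empirical CDF just below and just above x.
        worst = max(worst, abs(model - i / n), abs(model - (i + 1) / n))
    return worst


# Synthetic demonstration only: well-behaved data versus uniform data.
rng = random.Random(7)
well_behaved = [math.sqrt(sum(rng.gauss(0, 0.8) ** 2 for _ in range(3))) for _ in range(2000)]
misbehaved = [rng.random() for _ in range(2000)]
print(fit_score(well_behaved), fit_score(misbehaved))
```

The well-behaved set should score much lower than the uniform one. Where to put the "something is amiss" threshold is exactly the kind of thing that would need real data to settle.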

I have not attempted to perform Monte Carlo analysis on this, even though I know how to use random numbers to simulate physical phenomena, and even though I plan on writing a blog on Monte Carlo methods some time soon. Clearly further research is warranted.
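A first pass at such a simulation might look like the sketch below. It takes the idealized assumptions Viggiano listed (ΔL*, Δa*, and Δb* normally distributed, zero mean, equal standard deviation, uncorrelated) at face value, and it computes a plain Euclidean color difference rather than ΔE00, so it is a sanity check on the model and not a replacement for real data. The function names simulate_delta_e and percentile are made up for this example.

```python
import math
import random


def simulate_delta_e(n, sigma=1.0, seed=0):
    """Draw n Euclidean color differences from three independent normal errors."""
    rng = random.Random(seed)
    return [
        math.sqrt(sum(rng.gauss(0.0, sigma) ** 2 for _ in range(3)))
        for _ in range(n)
    ]


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list."""
    k = max(0, math.ceil(p / 100 * len(sorted_values)) - 1)
    return sorted_values[k]


samples = sorted(simulate_delta_e(100_000))
median = percentile(samples, 50)

# Print each percentile as a multiple of the median, for comparison
# with the empirical multiplier table.
for p in (10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(p, round(percentile(samples, p) / median, 3))
```

Under these idealized assumptions the 90th percentile comes out at roughly 1.6 times the median, which is in the neighborhood of the 1.643 in my table. The ratios do not depend on sigma, since scaling every sample scales the median by the same amount.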

I welcome additional data sets that anyone would care to send. Send me an email without attachment first, and wait for my response so that your precious data does not go into my spam folder: john@JohnTheMathGuy.com. With your help, further research will indeed be warranted.

Conclusion

My conclusion from this experiment is that the statistical distribution of color difference data, at least that from printing of test targets, can be summarized fairly well with a single data point. I have provided a table to facilitate conversion from the median to any of the more popular quantiles.

----------------------
[1] And if you are a color scientist, you are probably wondering when I am going to break into explaining the differences between deltaE ab, CMC, 94, and 2000 difference formulas, along with the DIN 99, and Labmg color spaces. Well, I'm not. At least not in this blog.

[2] For the uninitiated, color values are a set of three numbers (called CIELAB values) that uniquely defines a color by identifying the lightness of the color, the hue angle, and the degree of saturation.

[3] I have a Twitter account, so I have very little free time. Just in case you are taking a break from Twitter to read my blog, you can find me at @John_TheMathGuy when you get back to your real life of tweeting.

[4] Sometimes I just bring data up on the computer to look at. It's more entertaining than Drop Dead Diva, although my wife might disagree.

[5] How to decide the "correct" CIELAB value for a given CMYK value? If you happen to have a big collection of data that should all be similar (such as test targets that were all printed on a newspaper press) you can just average to get the target. I appreciate the comment from Dave McDowell that the statistical distribution of CIELAB values around the average CIELAB value will be different from the distribution around any other target value. I have not incorporated his comment into my analysis yet.

[6] Here is a pet peeve of mine. Someone might be tempted to say that they computed a bunch of ΔE00 values. This is not correct grammar, since "ΔE" is a unit of measurement, just like metric ton and inch. You wouldn't say "measured the pieces of wood and computed the inch values," would you?

[7] No warranties are implied. Use this chart at your own risk. Data from this chart has not been evaluated for color difference data from sources other than those described. The chart represents color difference data in  ΔE00, which may have a different CPDF than other color difference formulas.

[8] Dolezalek, Friedrich, Appraisal of Production Run Fluctuations from Color Measurements in the Image, TAGA 1995

[9] Viggiano, J A Stephen, Statistical Distribution of CIELAB Color Difference, Unpublished, 1999

[10] McDowell, David (presumed author), KODAK Q-60 Color Input Targets, KODAK technical paper, June 2003

Wednesday, November 7, 2012

Evolution versus revolution

I have heard it said that the process of improving our processes of doing things is evolutionary, and not revolutionary. The point is made that it is desirable to follow the course of evolution, and make small gradual changes to optimize our way of doing things. I want to consider this analogy through a number of other analogies.

Just how smart is it to be an ape?

In college, I took a primate psychology course in which I was required to write a paper. I had always been puzzled by the fact that the apes are in general not very successful species, so I tried to answer my puzzlement in my paper. It seemed to me that higher intelligence would imply greater adaptability, and hence greater success.

I started my paper by comparing the apes with other mammals which occupy similar niches. The pairings I made were between the gibbon and the spider monkey, the chimpanzee and the baboon, the orangutan and the tree sloth, the gorilla and the bear. In each case the ape in the pairing was less successful. Three of the apes are also on the endangered species list. The point is clear that there must be something detrimental acquired along with intelligence.

I noted two costs associated with intelligence. First, the additional size of the brain requires that the young be birthed earlier, and hence babies are more helpless and less viable. Furthermore, birth is considerably more stressful for the mother because of the larger head size. Second, the young are dependent on the parents for a much longer period, because of the shift from a "ROM based system" to a "RAM based system".

My conclusion was that, in the range of intelligence inhabited by the apes, the detriments of intelligence outweigh the benefits. I got an A+ on the paper. That was awesome.

The search for the missing link

Years later, I read a book entitled, The Panda's Thumb[1]. The author talks at great length about the fruitless search for the missing links in the fossil record between species. He argues that, while this could be explained by our sparse sampling of the geologic record, the lack of evidence points to our misconception about the evolutionary process. He claims that evolution is not always a gradual process, but that new species are created only when evolution occurs in spurts.

His argument is that, since creatures are generally fairly well fine-tuned to their environment, small changes are likely to be less viable, and will not survive. Only by making large and abrupt changes can a new species find a more successful niche. I think he was probably inspired by my paper.

A-kneeling at the optimal point

This idea is related to problems encountered with the numerical optimization of functions. The goal is to write a computer program which can automatically find the maximum of an arbitrary function. If we could be guaranteed that a function has only a single local maximum, this would be relatively easy. Methods for this have existed since Newton. They all basically steer themselves uphill until they find a flat spot. Unfortunately, in the real world, functions rarely constrain themselves to a single local maximum. Depending upon where you start, you may find yourself on a peak that is not the peak.

There is a numerical optimization method called simulated annealing [2] which can avoid getting caught in a local maximum. It is a procedure which is modeled after the annealing process.

Annealing is the process by which knife blades and the turbine blades in a jet engine are hardened. The piece is first raised to a very high temperature, enabling the molecules to wander around freely, rather than immediately seeking the lowest energy state they can find. Slowly the temperature is lowered, and molecules gradually settle into lower energy states. At any time, they are free to wander around, but as the temperature goes down, this becomes less likely. The key feature of annealing's ability to find the lowest energy state (a single crystal) is that molecules are allowed to pass through higher energy states.

The numerical annealing process does not force the method to always head uphill. At each stage, the choice between uphill and downhill is random. As the method proceeds, the uphill choice becomes more likely, just as lowering the temperature makes the transitions out of lower energy states less likely.
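Here is a minimal sketch of the idea, written as a maximizer to match the uphill and downhill language above. It uses the common Metropolis acceptance rule (always take an uphill step, take a downhill step with a probability that shrinks as the temperature drops) and a geometric cooling schedule. Both are assumptions of this sketch rather than the only way to do it, and the test function bumpy and all the parameter values are made up for illustration.

```python
import math
import random


def bumpy(x):
    """Hypothetical test function with several local maxima; the tallest is at x = 0."""
    return math.cos(3 * x) - 0.1 * x * x


def simulated_annealing(f, x0, steps=20_000, t_start=2.0, t_end=0.01,
                        step_size=0.5, seed=1):
    """Search for a maximum of f, starting from x0. Returns (best_x, best_f)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for i in range(steps):
        # Geometric cooling: temperature falls smoothly from t_start to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        candidate = x + rng.uniform(-step_size, step_size)
        fc = f(candidate)
        delta = fc - fx
        # Uphill moves are always accepted. Downhill moves are accepted with
        # probability exp(delta / t), which shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            x, fx = candidate, fc
            if fx > best_fx:
                best_x, best_fx = x, fx
    return best_x, best_fx


# Start near a lesser peak (around x = 4) and see whether we find the big one.
best_x, best_fx = simulated_annealing(bumpy, x0=4.0)
print(best_x, best_fx)
```

A pure hill climber started at x = 4 would park itself on the small nearby peak and declare victory. The willingness to go downhill early on is what lets the annealer wander over to the peak at x = 0.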

Application

I have seen this process play out again and again.

I first started pondering the relationship between evolution and revolution when I was trying to understand what made engineering groups work and what made them fail. For a very small group of engineers, it is acceptable, and in fact, most efficient, to keep everything informal - specs, documentation, project planning.

As an engineering group gets bigger, it goes through a phase where making incremental increases in the amount of structure just plain does not work. Wholesale changes must occur in order for the larger group to be efficient. I lived through a few of these. They were painful, but revolutionary evolution was necessary.

In a previous post about printing standards and a panel discussion I led at GraphExpo, I discussed the issues with optical brightening agents in printing. Several steps must be taken to solve the problem. I lamented that taking only one of these steps will make things worse. Evolutionary revolution is required.

While I am on the subject of standards, let me make an observation. As a member of a few standards committees, I have come to realize that you can never change a standard. Even though a standard makes sense, there will always be those who cannot adapt to whatever new stuff is put into place. This requires evolutionary revolution to make it work.

Here is a quote from an article that appeared on my LinkedIn page today. Same topic. Evolutionary revolution is necessary.
"As organizations continue to evolve within an ever-changing external environment, it has become quite evident that things are shifting talent-wise. This will not likely manifest as a small iterative adjustment in how we view, attract and develop talent. Rather, it seems destined to become a long overdue metamorphosis concerning our most important asset - people."

The patent fight between Apple and Motorola is another example. There are industries where there are very few patent attorneys. And there are industries where every company needs to have a bushel basket full of them. The transition of an industry from one to the other is an evolutionary revolution. A company cannot survive if the competitors have more patent attorneys.

In all these cases, we are not faced with a choice between evolution and revolution; evolution is revolution.

[1] Stephen Jay Gould, The Panda’s Thumb, W. W. Norton and Co., 1980
[2] Press, William, Brian Flannery, Saul Teukolsky, William Vetterling, Numerical Recipes, the Art of Scientific Computing