Wednesday, November 2, 2016

Statistical process control of color difference data, part 3

By now, I suspect that pretty much everyone has heard about the first two blog posts in this series. Everyone has been talking about them. E! News, Jimmy Kimmel...  There is eager anticipation for the next installment.

The first blog post alluded to some strange goings-on when it comes to statistical process control (SPC) of color difference data. But it was mainly an overview of traditional process control.

The second blog post had some actual data, showing that color difference data does not follow the rules of the game for traditional process control. Color difference data is not normally distributed. And if you don't follow the rules... Well. You kinda shouldn't oughta play the game. Or at least be prepared to be yelled at by an angry lemur.

Lemurs are a bit sensitive about mathematical statistics

In this blog post, I give a somewhat intuitive answer to the big question on everyone's mind. In the next post, I will go way off the deep end and explain it with some <egad> math.

Simple explanation

First, I offer a simple observation. While this doesn't qualify as being a full-fledged theoretical dissertation, it does make one pause and say "hmmmm... there is something strange in this neighborhood." When the time comes around, I will cue you to say that.

For a true normal distribution, there is always a chance - perhaps an incredibly tiny chance - that the values could be negative. But color difference data never goes negative. I think we know that, but the implication is rather broad. Color difference data simply cannot be truly normally distributed. (Cue the Ghostbusters theme: "If there's something strange, in your ΔE data, who you gonna call? John the Math Guy!")

Groucho demonstrates the absurdity of negative ΔE

One could certainly argue that in some cases, color difference data might be indistinguishable from a normal distribution. Take for example the case where you are determining the color difference between a set of white parts and a set of black parts. The color difference might always be more than 50 or 80 ΔE, so the fact that the color differences can't go negative is inconsequential.

But when we are doing process control of color, one would hope that the color measurements of the products are all kinda clustered around the target color. So the lack of negative color difference values is significant.

Another simple explanation

There is a common spec for spectrophotometers called "repeatability". A more accurate name for the spec might be "the degree to which an instrument agrees with itself in the short term on measurements of the same spot." I have been lobbying the standards committees to change the name, but so far, I haven't gotten much traction. I dunno, there was some wimpy objection about making tech writers work too hard. I'll keep pushing the issue.

Repeatability for color measurements is determined by taking ten or maybe twenty measurements of the exact same spot. If someone else is doing the measuring, I prefer twenty. These are averaged to provide a reference color. Then the color difference is computed between each of the ten or twenty measurements and that reference color. The reported repeatability is the average of those ΔE values. This is known as the Mean Color Difference from the Mean (MCDM), and it is described in the venerable color science book by the venerables Billmeyer and Saltzman.
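In case a concrete sketch helps, here is the MCDM arithmetic in a few lines of Python. The ten readings are made-up numbers (not real instrument data), and I am using the simple 1976 ΔE formula:

```python
import numpy as np

def delta_e_76(lab1, lab2):
    """1976 color difference: Euclidean distance in L*a*b* space."""
    return np.linalg.norm(np.asarray(lab1) - np.asarray(lab2), axis=-1)

# Ten made-up measurements of the exact same spot (L*, a*, b*)
readings = np.array([
    [50.1,  0.2, -0.1], [49.9, -0.1,  0.2], [50.0,  0.1,  0.0],
    [50.2,  0.0, -0.2], [49.8,  0.1,  0.1], [50.0, -0.2,  0.0],
    [50.1,  0.0,  0.1], [49.9,  0.2, -0.1], [50.0, -0.1,  0.2],
    [50.0,  0.0, -0.1],
])

reference = readings.mean(axis=0)        # average of the readings
dE = delta_e_76(readings, reference)     # ΔE of each reading from that average
mcdm = dE.mean()                         # Mean Color Difference from the Mean
print(f"MCDM (repeatability): {mcdm:.3f} ΔE")
```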

I have a bit of a reservation that has to do with the utility of the spec. Years ago, back in the dark ages, photons were scarce and light bulbs varied a lot from shot to shot. As a result, repeatability was a useful measure of the performance of the instrument. But with today's instruments, this number is normally very tiny. The much more meaningful specs are intra-family agreement (how well two instruments in the same family agree with each other) and inter-instrument agreement (how well any two instruments agree). I have written about this before.

In the Dark Ages, the Knights of the Round Table
were reduced mainly to scotopic vision

But the big bugaboo that's bugging me is the boogers caused by one of the spectro manufacturers who has chosen to define their own version of repeatability. I mean, first off, there is a standard out there. How about just following it? But the big bugaboo is that they decided to use the standard deviation of the color difference values, rather than the average.

At first blush, this seems like a reasonable way to look at the variation in a bunch of numbers. I mean, the standard deviation is the tool that I was given in Stats 301 to quantify variation.

But, lemme provide an example where this approach fails miserably. I admit that my example is extremely contrived, and is extremely unlikely to happen in the real world, but still, I think it should make you stop and scratch your head and say to yourself "gosh, is this really the kind of behavior that I want out of a spec for repeatability?"

Here is the long-awaited and extremely contrived example. Let's say we were determining repeatability and received the following four measurements. Yes, I know... you should take ten or twenty, but this is an extremely contrived example, ok? Here are the four measurements:

  {50, -10, 0},      {50, 10, 0},      {50, 0, -10},      {50, 0, 10}

The average of the four is {50, 0, 0}. If we use the 1976 formula for ΔE, we see that each one is exactly 10 ΔE from this average. Remembering that the final step in this manufacturer's repeatability computation is taking the standard deviation... ummm, let's see... 10, 10, 10, 10... the standard deviation is exactly zero! We have a perfect instrument, with no variation whatsoever!!  Woo-hoo! Of course, any pair of measurements is 14 to 20 ΔE apart, but the spec says this spectro is giving perfectly repeatable readings!!
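If you want to check my arithmetic, here is the contrived example in a few lines of Python (1976 ΔE formula, same four made-up measurements):

```python
import numpy as np
from itertools import combinations

def delta_e_76(lab1, lab2):
    """1976 color difference: Euclidean distance in L*a*b* space."""
    return np.linalg.norm(np.asarray(lab1) - np.asarray(lab2), axis=-1)

measurements = np.array([
    [50, -10, 0], [50, 10, 0], [50, 0, -10], [50, 0, 10],
], dtype=float)

reference = measurements.mean(axis=0)       # {50, 0, 0}
dE = delta_e_76(measurements, reference)    # 10, 10, 10, 10

print("ΔE from the average: ", dE)
print("Std dev of ΔE:       ", dE.std())    # exactly 0 -- a 'perfect' instrument!
print("MCDM (mean of ΔE):   ", dE.mean())   # 10 -- a far more honest number

# The pairwise differences show the four readings are anything but identical
pairwise = [float(delta_e_76(a, b)) for a, b in combinations(measurements, 2)]
print("Pairwise ΔE range:   ", min(pairwise), "to", max(pairwise))   # about 14.1 to 20
```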

The standard deviation of ΔE data can be horribly misleading!  And I mean "midnight in a graveyard on Halloween" kinda horrible.

I don't know if I mentioned this before, but this is an extremely contrived example. On the other hand, it will hopefully give one pause to consider -- ΔE measurements, and especially the standard deviation thereof, don't follow the normal rules for SPC that we all know and love.

What went wrong?

The problem is illustrated (in general) in the diagram below. If the color values all lie on a circle (or sphere) centered on the reference color value, then the standard deviation of the ΔE values is absolutely worthless as a measure of the dispersion of the color points.


Since the standard deviation method for determining repeatability computes the reference color value from the color values themselves, this contrived situation will only occur if the color values are scattered uniformly on (or near) the surface of that sphere. But if we venture out to situations where the reference color does not come from the data itself...

A less contrived example, in story form

I may have alluded to this before, but that last example was a bit contrived. It would never happen in real life. How about an example that is more realistic? Cuddle up with your favorite blankie and listen to a little story.

Once upon a time, there was a QC person at a print shop who was really into classical SPC, and a press operator at that same print shop who was particularly fond of run-time charts. Below is an example of one of the charts that they ran. The chart shows the adherence to a target color for the yellow solid of thirty samples from a production run. When I say "adherence to target color", I mean the ΔE between the measured color of the solid and the target color. The yellow lines are the upper and lower control limits, and the red line is the customer tolerance. Note that, for the sake of simplicity, I am using the 1976 ΔE formula, so a customer tolerance of 4.0 ΔE is typical.

(Full disclosure... this is fabricated data. A very good facsimile of real data, but still fabricated.)

The view of the production run for the press operator and QC person

Everything is hunky-dory. All the samples are well within the control limits, so these partners in print conclude that the process is under control. The CpK is 2.44, which is a clear indication that the process is fully capable of providing product that meets the customer's requirements. Everyone is happy. The crew manager is bringing in champagne to celebrate, and the boss is talking about big fat bonus checks for everyone. (Did I mention that this is a fairy tale?)

Now, the ink company regularly receives this SPC data from the printer, since they are genuinely interested in seeing the customer succeed. (You can believe that, can't you?) The inkie at the ink company has decided to have a little different look at that same data. (An inkie is a technician at an ink company. Generally affable. Good at Sudoku, but not necessarily inclined to play often.) You will note that there is a similarity between this and the picture in the section "What went wrong?"

(Same disclaimer... this data is fabricated. Good facsimile, blah blah blah. But note that the chart above and the chart below are plots of the same fabricated data set.)

View of the production run of the ink QC person

Unbeknownst to the printer, the QC person at the ink company (who only has the best interest of the printer in mind) notices a very clear issue on this press run. Can you see it? All the little bees are lining up to the left of the target color. The pattern (a scattering which is mostly up and down) is typical. It shows that the variation on press is largely caused by variation in the amount of ink on the substrate. But there is also an offset - the whole swarm sits to the left of the target - caused by a hue shift in the ink, and that offset is about as large as the scatter itself.

Before making any changes, the inkie checks whether this is a fluke or a trend. Previous runs all had the same basic problem. The course of action? The inkie decides that, henceforth, the formulation of the ink will include two drops of beet juice per kilogram of yellow ink so as to move the color of the ink closer to the target.

Everyone is happy now, right?

Happy? Not quite. The printing plant is in Wisconsin, isn't it?!?!? Happiness is just not part of the culture in Wisconsin.

Every good tale (fairy or otherwise) must have some tension. There must be a plot point where the protagonists engage in conflict. This blog post, by the way, is a good tale, so here comes the conflict. The press operator was never told about the big change to the ink. (Can't you just imagine Jennifer Aniston doing that?) I realize that this never happens in real life, but go along with me here for the moment. The press is fired up and by gosh and by golly, the beet juice hits the fan. Here is the run chart for the first press run with the new ink.

Press operator's view of the press run after the fateful plot point

Put yourself in the place of the press operator. He had already hinted to his wife about the possibility of an all-expense paid trip to Hawaii if things continued going so well. And there is this press run with virtually all of the samples outside of the two yellow lines! The customer's spec is still being met, but jeepers creepers! Someone's gonna get ticked off.

The QC person sees a different problem. The CpK for this press run is 0.57. That's not good, by the way. One would like that number to be larger than maybe 1.3.

Naturally, they decided to blame the ink manufacturer. This is, as we all know, item #2 on the standard operating procedure in almost every printing plant I have ever been in. They called up to complain. The inkie agreed that, yes, a change was made to the formulation. After being thoroughly beaten about the ears for not telling anyone, he pulled the data from the new product run into his spreadsheet.

The plot below came up. The line of bees had gone from sitting 2 or 2.5 a* units to the left to being only about 0.5 a* units to the left. Perhaps a bit of overshoot, or perhaps that's just normal variation. Certainly a welcome improvement. The press operator and QC person thanked the inkie, and agreed to continue with the new formula.

Inkie's view of the production run of the new ink

The press operator and QC person were both a bit flummoxed, but had to admit that the inkie was right. The change seemed to improve things.

Finally, the two badgered enough people that they found someone who knew a bit about color science and could counsel them on what went "wrong". The color science person said, 

"What?? You should complain if your color error gets better than it used to be??!?! My advice... Ignore the lower control limit for ΔE. ignore the lower limit in the run time chart, or better yet, tell your stinking software not to plot it when you are dealing with color differences. And when you do the CpK thing... same thing. The formula uses both the upper and lower limits, and takes the minimum. Just do the computation based on the upper limit - the one-sided CpK."

That was what the first color scientist (the one with the beard) said. A second color scientist (who naturally also has a beard) pointed out a use for the lower control limit. Brian said that while the lower control limit may not be a trigger point for deciding when to get all up in arms about the process being broken, it is an indication that something changed for the better, so it might not be a bad idea to poke around and figger out what went right. You might want to continue doing whatever that was! 

So everyone went home that evening feeling that the crisis had been averted. And we have two rules that will keep us from making this analysis mistake in the future. (Take note of this. I guarantee these will be on the final exam.) Do not use the lower control limit when doing SPC on ΔE color difference data. Modify the CpK computation so that it only considers the upper control limit. 
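If you happen to keep your SPC in Python instead of a spreadsheet, here is a minimal sketch of that one-sided CpK. The ΔE run data is made up, and the 4.0 ΔE customer tolerance is just the number from our fairy tale - this is my own illustration, not the printer's actual software:

```python
import numpy as np

def one_sided_cpk(dE_values, upper_spec):
    """One-sided CpK for ΔE data: only the upper (customer) limit matters,
    since ΔE can never go below zero. The usual two-sided CpK would take the
    minimum of the upper-limit and lower-limit terms."""
    dE = np.asarray(dE_values, dtype=float)
    return (upper_spec - dE.mean()) / (3 * dE.std(ddof=1))

# Made-up ΔE values from a press run, against a 4.0 ΔE customer tolerance
run = [1.1, 0.9, 1.3, 1.0, 1.2, 0.8, 1.1, 1.0, 1.4, 0.9]
print(f"One-sided CpK: {one_sided_cpk(run, 4.0):.2f}")
```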

But...

Even with the satisfying conflict resolution in the plot, the QC person didn't sleep well that night. Finally, at 3:00 AM, he/she got out of bed and sat down at his/her laptop to look at the data. After a lot of futzing around in Excel, he/she came to the realization that the variation in ΔE was much lower before the change was made to the ink. The original data had a standard deviation of 0.24 ΔE. The standard deviation of the color difference of the new data was 0.64 ΔE. This is a problem! Why did the variation in the process jump up like that?

Lemme see... what was item #2 in the SOP? Obviously, the new ink formulation was to blame. So, the inkie got a call at 8:01 AM. "I don't think you stirred the ink enough after you added those two drops of beet juice!" Imagine this said by a person who got like 17 minutes of sleep the night before.

To make a long story just a tad longer, the color scientist kinda person - the first one, you know, the one with the beard - got consulted again. He stroked his beard and then made the following drawing on his chalkboard. (Note that all color scientists have beards, if they're male, anyway. If not, they like Thai food. All color scientists, male or female, have a chalkboard instead of a whiteboard.)

The bearded color scientist demonstrates an uncanny ability to
connect color science with football plays

So, to paraphrase what the bearded color scientist said: standard deviation of color difference values does not work like we expect. 
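Here's a quick simulation of what the beard was getting at. I drop the exact same up-and-down press scatter around two different offsets from the target - roughly mimicking the old and new ink formulations, with invented numbers rather than the story's fabricated-but-pretty data - and the standard deviation of ΔE comes out very different even though the actual scatter is identical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Identical press scatter for both cases: small wiggle in a*, bigger wiggle in b*
n = 1000
scatter = np.column_stack([rng.normal(0, 0.2, n),    # a* variation
                           rng.normal(0, 1.0, n)])   # b* variation (mostly up and down)

offsets = {"old ink (2.2 a* off target)": np.array([-2.2, 0.0]),
           "new ink (0.5 a* off target)": np.array([-0.5, 0.0])}

for name, offset in offsets.items():
    points = scatter + offset              # same scatter, different offset from target
    dE = np.linalg.norm(points, axis=1)    # 1976-style distance from the target (ignoring L*)
    print(f"{name}: mean ΔE = {dE.mean():.2f}, std dev of ΔE = {dE.std(ddof=1):.2f}")
```

When the cloud sits far from the target, every point is roughly the same distance away, so the standard deviation of ΔE looks tiny. Move the cloud closer and the same scatter suddenly shows up in the ΔE numbers.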

Personally, I can think of exactly one application of the standard deviation of ΔE, and that's as an example of why no sane person would ever use the standard deviation of ΔE.

I may as well mention...

I hate to go on about my disdain for the use of standard deviation on ΔE values, but I have one more, slightly related thing to get off my chest. Even if this value were reliable, its use can be misleading. SPC is all about using "three sigma" to set control limits. That works for normally distributed data, but as we saw in the previous blog post, ΔE does not follow a normal disturb-ution.

Three rules for SPC of color difference data

1. Ignore the lower control limit when doing SPC on ΔE color difference data.
2. Modify the CpK computation so that it only considers the upper control limit.
3. Friends don't let friends compute the standard deviation of ΔE.

Foreshadowing the next blog post

The saga continues... 

The last blog post was all about color difference data not following a normal distribution. I (almost) completely ignored that discussion in this blog post. I could have made all kinds of comments about how, since color difference data is not normally distributed, the traditional computation of the upper control limit is no longer applicable, and similarly, the whole CpK thing must be rethought.

I ignored the discussion of normality, since the issues I point out today are fundamental, and not directly related to the statistical distribution. 

In the next blog post, I want to address the question, "if the data is not normally distributed, then what is the distribution?" At least that's what I want to write about. Who knows what I actually will write about. I am certainly not smart enough to predict my actions!


This blog post was updated on Nov 7, 2016, thanks to Dave Wyble's comments about two bonehead mistakes I made in the original version. First (I can't believe I actually did this) I used the term "reproducibility" instead of "repeatability". I should have known better. I apologize to all who were hurt, psychologically or otherwise, by my blatant misuse of the English language.

Second, I misquoted the definition of repeatability that is given in the standards. And I shoulda known better. I just got it confused with the silly definition used by one of the spectro manufacturers.


Tuesday, October 25, 2016

Statistical process control of color difference data, part 2

Last week, some stark raving mad heretic grabbed my blogging pen, spouting out some blasphemy about how the classical approach to process control is doomed to fail for color difference data. Asteroids laying waste to heavily populated areas, cats sleeping with dogs, my local Starbucks being out of chai... all that doomsday stuff.

Well, perhaps the guy who was using my blogging pen wasn't stark raving mad. Maybe he was just stark raving "mildly annoyed"? And maybe the heretic wasn't just some other guy? I don't want to point the finger, but it might have been me who wrote the blog post. So, perhaps I need to take his contentious assertion seriously?

Here are the sacrilegious assertions from last week's blog post:

Part 1 - Color difference data does not fit a Normal Distribution.
Part 2 - Classical SPC is largely based on the assumption of normality, so much of it does not work well for color difference data.

I submit the chart below as evidence for the first assertion.

This is not normal data!

I need to give some provenance for this data.

In 2006, the SNAP committee (Specifications for Newspaper Advertising Production) took on a large project to come to some consensus about what color you get when you mix specific quantities of CMYK ink on newsprint. A total of 102 newspapers printed a test form on their presses. The test form had 928 color patches. All of the test forms were measured by one very busy spectrophotometer. The data was averaged by patch type, and it became known as CGATS TR 002.

For this blog post, I had a close look at the original data. For each of the 928 patches and for each of the 102 printers, I compared the measured L*a*b* value against the average L*a*b* value for that patch. As a result, I had just short of 100K color difference values (in ΔE00).

Of the 94,656 color differences, there were 1,392 that were between 0.0 ΔE00 and 0.5 ΔE00. There were 7,095 between 0.5 ΔE00 and 1.0 ΔE00. And so on. The blue bars in the above chart are a histogram of this color difference data.

I computed the mean and standard deviation of the color difference data: 2.93 and 1.78, respectively. The orange line in the above chart is a normal distribution with those values. Now, we all like to think our data is normal. We all like to think that our data doesn't skew one way or the other. The bad news for this election season is that our color difference data is not normal. It is decidedly skewed: the bulk of the data is piled up on the left, with a long tail stretching off to the right. (I provide no comment on whether other data in this election season is skewed either to the right or to the left.)

The coefficient of skewness of this distribution is about 1.0. A truly normal distribution has zero skewness, and a sample of this size (roughly 95,000 points) should wander from zero by only about √(6/n) ≈ 0.008. So this is about 125 times the skewness that one might expect from normal data. "The data is skewed, Jim!"

The data is skewed, Jim!
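For the skeptics who want to check the skewness claim themselves: once the ΔE values are in hand, it's a couple of lines of Python. (The file name below is a stand-in - I'm not republishing the raw SNAP data here.)

```python
import numpy as np
from scipy.stats import skew

dE00 = np.loadtxt("snap_deltaE00.txt")   # hypothetical file holding the ~94,656 ΔE00 values

n = len(dE00)
g1 = skew(dE00)                          # sample coefficient of skewness (0 for normal data)
se = np.sqrt(6.0 / n)                    # rough sampling wobble in skewness for normal data
print(f"skewness = {g1:.2f}, about {g1 / se:.0f} times what normal data of this size would show")
```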

Ok. So Bones tells us the data is skewed?  Someone may argue that I have committed the statistical equivalent of a venial sin. True. I combined apples and oranges. When I computed the color differences, I was comparing apples to apples, but then I piled all the apple differences and all the orange differences into one big pile. Is there some reason to put the variation of solid cyan patches in the same box as the variation of 50% magenta patches?

Just to check that, I pulled out the patches individually, and did the skewness test on each of the 928 sets of data. Sorry, nit pickers. Same results. "The data is still skewed, Jim!"

The data is still skewed, Jim!

Yeah, but who cares?  The whole classical process control thing will still work out, right? Well.... maybe. Kinda maybe. Or, kinda maybe probably not.

I looked once again at the data set. For each of the 928 patches, I computed the 3 sigma upper limit for color difference data. Then I counted outliers. Before I go on, I will come up with a prediction of how many outliers we expect to see.

One would think that the folks doing these 102 press runs were reasonably diligent in the operation of the press for these press runs. The companies all volunteered their time, press time, and materials to this endeavor, so presumably they cared about getting good results. I think it is reasonable to assume that on the whole, they upped their game, if only a little bit just to humor the boss.

Further, back in 2006, several people (myself included) blessed the data. No one could come up with any strong reason to remove any of the individual data points.

So, I am going to state that the variation in the data set should be almost entirely "common cause" variation. This is the inevitable variation that we will see out of any process. Now, let's review the blog post of an extremely gifted and bashful applied mathematician and color scientist. Last week, I wrote the following:

If the process produces normal data, and if nothing changes in our process, then 99.74% of the time, the part will be within those control limits. And once every 400 parts, we will find a part that is nothing more than an unavoidable statistical anomaly.

There were 94,656 data points, and we expect 0.26% outliers... that would put the expectation at about 249 outliers in the whole bunch. Drum roll, please... I found 938! For this data set, I found four times as many outliers as expected.
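Here is the general shape of that outlier count, sketched in Python. Assume dE00_by_patch maps each of the 928 patch names to its 102 ΔE00 values; it's a stand-in for the real working data, not the data itself:

```python
import numpy as np

def count_three_sigma_outliers(dE00_by_patch):
    """Count ΔE00 values above each patch's own mean + 3 sigma upper limit."""
    total = 0
    outliers = 0
    for patch, values in dE00_by_patch.items():
        dE = np.asarray(values, dtype=float)
        upper = dE.mean() + 3 * dE.std(ddof=1)   # classical 3-sigma upper control limit
        outliers += int((dE > upper).sum())
        total += len(dE)
    expected = round(total * 0.0026)             # ~0.26%, the figure quoted above
    return outliers, expected

# found, expected = count_three_sigma_outliers(dE00_by_patch)
# print(found, "outliers found, versus about", expected, "expected for normal data")
```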

To put this in practical terms, if a plant were to have followed traditional statistical process control methods on this set of color difference data, they would be shutting down the presses to check their operation four times as often as they really should. This is a waste of time and money, and as Deming would tell us, stopping the presses and futzing with them just causes additional variation.

Traditional statistical process control of color difference data is dead, Jim!

I should remark that this factor of four is based on one data set. I think it is a good data set, since it is very much real world. But perhaps it includes additional variation because there were 102 printing plants involved? Perhaps there is some idiosyncrasy in newspaper presses? Perhaps there is an idiosyncrasy involved in using the average of all 102 to determine the target color?

I would caution against trying to read too much into the magic factor of four that I arrived at for this data set. But, I will hold my ground and say that the basic principle is sound. Color difference data is not normally distributed, so the basic assumptions about statistical process control are suspect.

In next week's installment of this exciting series, I will investigate the theoretical basis for non-normality of color difference data.

Move on to Part 3

Tuesday, October 18, 2016

Statistical process control of color difference data, part 1

Statistical process control (SPC) of color data -- specifically of color difference (ΔE) data -- can be done, but there is a bit of a twist. Color difference data doesn't behave like your garden variety process control data. Since ΔE doesn't follow the rules, the classical method for computing control limits will no longer work.

In this blog post, I review classical process control to provide a footing for next week's blog, where I pull the rug out from under the footings of the classical approach, explaining why it won't work for color difference measurements. Hopefully, by the time I get around to the third blog post in this trilogy, I will have thought of some new footings on which to erect a new SPC specifically designed for ΔE.

Process control - Do we have an outlier?

Review of process control

The premise of statistical process control is "more or less simple". I say that in the sense that it's not really that simple at all. And I say that because I want to make sure that you understand that what I do is really pretty freaking awesome. But really, the basic idea behind SPC is not all that tough to comprehend: You only investigate your widget-making machine when it starts to produce weird stuff, and you shouldn't sweat it when the product isn't weird.

The complicated part lies in your algorithm for deciding where to draw the line between "normal" and "weird". The red dress on the far left?  Elegant, chic, and attractive, and pretty much in line with what all the women at my widget factory are wearing. The next one over? Yeah... I see her in the cafeteria once in a while. But I'm just not getting into the outfit on the far right. Sorry. I'm just not a fan of horizontal stripes. But in between... how do you decide where to draw the line?

Where to draw the line????

Statistical process control has an answer. You start by characterizing your process. As you manufacture widgets, you pull out samples and measure something about them. Hopefully you measure something that is relevant, like the distance between the threads of a bolt, or the weight of the cereal in the box. Since you are (apparently) reading this blog post, it would seem that the widget's color might be the attribute that interests you.

Next, you sadistically characterize this big pile of data. Open up a spreadsheet, and open up a bottle of Black and Tan, a Killian's Red, a Pale or Brown Ale, a Blue Moon, or an Amber Lager. And unleash the sarcastical analysis.

The goal for your spreadsheet is to come out with two numbers, which we call the upper control limit and the lower control limit. Then when you saunter into work the following day, after recovering from a colorful hangover, you can start using these two numbers on brand new production data. Measure the next widget off the production line. If it falls between the lower control limit and the upper control limit, then relax and pull another Black and Tan out of your toolbox. You can relax cuz you know your process is under control.

The yellow crayon is just a few nanometers short of a full deck

When a part falls outside the control limits, the camera doesn't automatically cut to Tom Hanks saying "Houston, we have a problem". We're not sure just yet whether this is a real problem or a shell-fish-stick anomaly. The important thing is, we start looking for Jim the SOP Guy, since he is the only one in the plant who knows where to find the standard operating procedure for troubleshooting the widget making machine.

Note that I was careful not to start the previous paragraph with "when a part is bad..." Being outside of control limits does not necessarily mean that the part is unacceptable for the person writing out a check for the widgets. Hopefully, the control limits are well within the tolerances that are written into the contract. And hopefully, the control limits that are used on the manufacturing floor were based entirely off data from the process, and the SPC code of ethics has not been sullied by allowing the customer tolerances to be used in place of control limits. That would be icky.

Identifying control limits

But how do we decide what the appropriate control limits are? If we set the control limits too tight, then Jim the SOP Guy never gets time to finish the Blue Moon he opened up for breakfast. And we all know that Jim gets really ornery if his beer gets warm.

You don't want to get Jim the SOP Guy angry!

If on the other hand, we humor Jim the SOP Guy and widen the tolerances to the point where Tom Hanks can fly a lunar lander through them, then we will potentially fail to react when the poor little widget making machine is desperately in need of a little TLC.

So, every time we encounter another measurement of a widget, we are faced with a judgement call. Setting control limits is inherently a balancing act between the risk of wasting time troubleshooting a healthy process and the risk of missing a machine that's out of whack.

Deming

Why is it so bad to spend a little extra time troubleshooting?  It is, of course, a business expense, but there is an insidious hidden cost to excessive knob gerfiddling. It makes for more variation in the product. If we try to control a process to tighter than it wants to go, we just wind up chasing our tail.

Well, lemme tell ya about when I worked with Deming. This was back in the late 1940's, just after the Great War to End All Wars. Oh wait. That was WW I. Deming did his stuff just after WW II - the Great War After the Great War to End All Wars. I was about negative thirteen years old at the time. A very precocious young lad of negative thirteen, I was. Deming learned me about the difference between normal variation and special cause. Normal variation is the stuff you can expect with your current process. You can't get rid of this without changing your process. Special cause means that something is broke and needs attending to.

Try this joke at home with Riesling and with Kipling!

Deming traveled to Japan after the war to help rebuild their manufacturing system. He did that very well. I mean, very well. Deming became a super-hero for the Japanese in much the same way that I have become a super-hero for my dogs. Except, of course, that the Japanese came to revere Deming.

In a nutshell, Deming preached that all manufacturing processes have a natural random variation. We should seek, over the long run, to minimize this by improving our process. This is important, but it is not the topic of this blog series. I want to concentrate on the day-to-day. In the short run, we need to understand the magnitude of our variation. This is done by collecting data and applying statistics to it to establish the expected range. That range is then used to identify subsequent parts that fall outside it. When that happens, there is a call for identifying the special cause and correcting the issue.

A part is identified as being potentially bad if it is so far from the norm that it is unlikely to have come from the same process. This is important enough to repeat. A part is identified as being potentially bad if the probability of it falling within the established statistical distribution of the process is very small. So, it's all about probabilities.

Enter normality

If we assume that the underlying distribution is "normal" (AKA a Gaussian or bell curve), then we can readily characterize the likelihood of a part being bad based on the mean and standard deviation of the process. In a normal distribution, 68% of all samples fall within 1 standard deviation of the mean, 95.5% fall within 2 standard deviations of the mean, and 99.74% fall within 3 standard deviations of the mean.

Folks who have taken credit for DeMoivre's invention
So...

The characterizing of our process is pretty simple. You know, when you opened up the spreadsheet and took a long drink of the Amber Lager?  You don't have to tell your boss how simple it is, but here it is for you: Compute the average of the data. That goes in one cell of a spreadsheet. Compute the standard deviation. That goes in a second cell. Then, multiply the standard deviation by the magic number 3. Subtract this product from the mean (third cell in the spreadsheet), and add this product to the mean (fourth cell). The third and fourth cells are the lower and upper control limits, respectively.
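If your spreadsheet happens to be Python, those four cells look like this (with made-up widget measurements, of course):

```python
import numpy as np

# Made-up measurements from the characterization run (one number per widget)
widgets = np.array([10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.1, 10.0, 9.9])

mean = widgets.mean()               # cell one
sigma = widgets.std(ddof=1)         # cell two
lcl = mean - 3 * sigma              # cell three: lower control limit
ucl = mean + 3 * sigma              # cell four: upper control limit

print(f"LCL = {lcl:.2f}, UCL = {ucl:.2f}")
# A new widget outside [LCL, UCL] is your cue to go find Jim the SOP Guy.
```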

If the process produces normal data, and if nothing changes in our process, then 99.74% of the time, the part will be within those control limits. And once every 400 parts, we will find a part that is nothing more than an unavoidable statistical anomaly.

The big IF

Note the condition that prefaced those numbers for the likelihood of false alarms: If the underlying distribution is normal...

Spoiler alert for next week's blog post. Color difference data is not normal. And by that I mean, it doesn't fit the normal distribution. This messes up the whole probability thing.

Sadly, differences of color don't live in this city!

Here is a scenario that suggests there may be a difficulty. Let's just say for example, that the average of our color difference data is 5 ΔE, and that the standard deviation is 1 ΔE. That puts our lower control limit at 2 ΔE.

Let's say that we happen to pull out a part and the difference between its color and the target color is 1 ΔE. What should we do? Classical control theory says that we need to start an investigation into why this part is outside of the control limits. Something must be wrong with our process! The sky is falling!

But stop and think about it. If the part is within 1 ΔE of the target color, then it's pretty darn good. Everyone should be happy. Classical control theory would lead us to the conclusion that something must be wrong with our process because the part was closer to the target color than is typical!

The obvious solution to this is that we simply ignore the lower control limit. That will avoid our embarrassment when we realize that we fired that incompetent operator for doing too good a job. But, this simple example is a clue that something larger might be amiss. Stay tuned for next week's exciting blog post, where I explain how it is that color difference values are really far from being normally distributed!

Move on to Part 2

Tuesday, October 11, 2016

A backwards optical illusion

I think this is just downright weird. An optical illusion with a twist.

An illusion

I expect many of you may have seen the clever illusion that I show below. The gray rectangle on the left (the one surrounded by white) looks darker than the one on the right.

This illusion is the topic of the day!

Well, it looks that way, but it's not, really. All we need to do is join the two with a bar of the same color and the illusion goes away. Or, if you suspect that I am doing a little creative image editing, then cut two holes in a piece of paper, one for each square.

The illusion de-mystified

A (failed) attempt to explain the illusion

Why does this happen? To shed some light on the illusion (pun intended), I wanted to see if my camera saw the same thing. So, being the clever and resourceful guy that I am, I displayed this image on my computer screen and took a picture of it with my Canon G10. The picture can be seen below.

What my camera sees when it looks at this illusion

The square on the left (white surround) has average gray values of (42.8, 45.6, 60.5). The one on the right (black surround) has average gray values of (34.4, 37.1, 10.9). The little square on the left is brighter, with an average difference of 8.4 in the red channel, 8.4 in the green, and 10.9 in the blue. Aha!! Just as I thought!! The camera is seeing the same thing I am seeing!

(For those of you who noticed a mistake in my reasoning in that last sentence, just sit tight on your hands for a little while. Please don't spoil the surprise for the rest when I reveal the intriguing error.)

Why does the camera not see the two squares the same? In any optical system, be it a camera or an eyeball, there is scattered light - light that doesn't focus just the way we want it to. As a result, light from bright areas in the scene will scatter into dark areas in the scene. We call this veiling glare.

The picture below is an extreme case of veiling glare. The photos above and below were taken of the same mastiff figurine and with the same camera and lens. In the one below, I fogged up the lens by breathing on it. Never do that, by the way. It will make your mastiff foggy. (I had a mastiff once... Bubba. I miss him.)

When a good dog gets veiling glared

Note that the black background didn't just turn gray. It took on some of the color of the dog. The fawn colored light coming from the fawn colored coat of Bubba should all have been focused on the image of Bubba at the sensor of the camera. But some of it wound up going somewhere else because of the temporary foggy imperfection in the lens.

This is extreme, as I said, but all lenses do this to an extent. A small portion of the average intensity of the image is added to all the pixels. That's veiling glare. But there is also a more localized effect. Going back to the image that my camera took of my computer screen, the gray square that is surrounded by white is made just a tad bit brighter because it is standing near all those other bright pixels. Just like when I stand next to Albert Einstein, Isaac Newton, and John Von Neumann, their brilliance scatters over me and I look so much brighter.

(Just in case you didn't know, John Von Neumann was one of the fathers of the computer, being credited with the idea of a stored program computer. And, an applied mathematician.)

My high school chess club
(from left to right) Newton, Einstein, me, and Von Neumann

So (prepare for the oops), I have just demonstrated that this effect - the effect of the illusion I started with - can be readily seen in images taken with a camera. And the effect is from the scatter that happens in the lens, be it the one in the eye or the one in the camera.

Say what?


Wait a sec. I think that's exactly backwards of what we perceive when we look at the illusion. The gray patch that is surrounded by white actually shines just a bit brighter on the retina, but it looks darker! How can this be?????!!?!

I gave a bit of a clue, when I mentioned my good friends Albert, Isaac, and John. If you happened to overhear me having a conversation with these gents, would you really think of me as being of their caliber? Or rather, would I  appear more dumber, since I would be compared to them?

Take a look again at my chess club picture. Have a close look at me, in the Gold's Gym shirt. I really don't think I have all that bad of a physique. In isolation, one might actually think I am rather buff. But standing next to those other mesomorphs, I am afraid I have to admit that I look like the anti-hero from a Woody Allen movie - a real nebbish.

The thing is, somewhere between the retina and the cognitive part of the brain, it all becomes about comparisons. The gray square on the left is compared to the white that it is next to. Because of that proximity, it is perceived as being darker than it really is. Similarly, the square on the right is perceived as being lighter due to its proximity to the black area.

Judging by the fact that we really can't "turn this effect off" just by thinking about it, the comparison must have been done prior to the signal reaching the cognitive brain. Maybe it's in the rods and cones? Maybe in the neurons? Or maybe in the lower limbic system?  This sounds like a topic for another blog. Maybe I will reference another blogpost of mine about the famous "what color is the dress" fiasco.

But it is interesting to note that the cognitive part of the brain does this same trick. The guy in the chess club pic looks like a wimp. I look dumb when you hear me converse with really bright people. The kid at the prep school who grew up in a middle class family feels like he/she had a life of squalor.

Summarizing...

I have described two effects here. First, there is light scattered in the optical system, be it the one in the camera or the one in the eye. This changes the amount of actual light that is registered at the retina. Areas adjacent to bright stuff are affected the most.

Someplace in the early parts of the human visual system, there is a larger counter-effect caused by a constant comparison of objects against what is nearby.

And so, we have a simple illusion that has more to it than one would expect.

Tuesday, October 4, 2016

Blues in the night

Fall is upon us. Everyone in Wisconsin is either eagerly anticipating the lovely fall colors or dreading the onset of Seasonal Affective Disorder (SAD). I have already blogged about the color science of the autumn leaves, so it's probably time for me to blog about the eye, color, and its tawdry relationship with SAD. Along the way, I will talk about zeitgebers and circadian rhythm entrainment, the spectral response of retinal ganglia, and of course, the supra-chiasmatic nucleus! Hot Damn!

Public service announcement: If this is a medical emergency, please hang up and dial 911. If you are looking for some practical advice on dealing with something that is a little more than the winter blues, I would suggest having a look at the blog post from BetterHelp called Seasonal Depression Is More Than Just The Winter Blues. But if you want to understand what happens in your body that makes SAD happen, then please read on!

Are you getting sleepy yet?

Today's sleep-inducing lecture begins with melatonin. I think most everyone has heard of it, and everyone has used it, whether they are aware of it or not. Melatonin is the hormone that sends a wake up call to all the team leaders in our body that it's time to start wrapping up the day's activity. Melatonin makes us sleepy.


You can buy melatonin from your favorite drug store, supermarket, or neighborhood dope peddler. It is inexpensive, you can get it without a prescription, and it helps you get to sleep.

Alternately, you can get it from your local pineal gland. If you are a vertebrate, then it is quite likely that you own one of these hot little jobbies. The pineal gland is located kinda in the middle of your brain. It's just a tiny thing, but it'll put you to sleep faster than reading a blog post about seasonal affective disorder. The pineal gland is where the melatonin comes from. But only when the pineal gland is good and ready to give you that mellowing stuff.


But, how does the pineal gland decide when it's time to play Mr. Sandman?


I don't know these women, but I wish I would have done this video!

Getting in the rhythm 

What we need is a clock for Mr. Pineal. No. Check that. Mr. Pineal needs an alarm clock. No. Check that again. Mr. Pineal would do real well to have what my electrical engineering friends would call a phase locked loop. Essentially, this is an oscillator with the added feature of having an active mechanism to keep the oscillator in sync. In this case, the phase locked loop would keep in sync with the rotation of the Earth. By the way, the name of the phase locked loop for the pineal gland is the "supra-chiasmatic nucleus". (Not to be confused with the super-charismatic nucleus, AKA John the Math Guy.)

The job of the supra-chiasmatic nucleus

How does the SCN get synchronized? What is the signal that serves as a zeitgeber?

Aside: I throw in words like "zeitgeber" to make it sound like I know what I am talking about. I have learned through the years that every profession has a relatively small collection of words that act as passwords to get you in the door. Use them correctly in a sentence, and you get to join the club.

Zeitgeber is, of course, a German word. "Zeit" means "time", and "geber" means "giver". This is the official name of "that thing that keeps you in sync with the diurnal cycle".

There have been a lot of suggestions about what the zeitgeber (or zeitgebers) might be. Some obvious guesses are coffee, activity, loud noises, your spouse snoring, a cold or hot shower, a good breakfast, and social interaction. All of these will help us to wake up.

But, sadly, these are not the most effective way to actually reset our alarm clock. In other words, let's say I were to get a real job, one where I had to actually get up before noon every day. If I stop at Starbucks tomorrow at 6:00 AM and have a mocha with three shots of espresso, it will not make it easier for me to get up the following morning. That suprachiasmatic nucleus just keeps chugging along, thinking that 2:00 AM is a decent time to crawl between the sheets.

It took a lot of research, but eventually Science came to the conclusion that light is the primary zeitgeber.

Do you think it unromantic of me to see this sunrise and say "What a glorious zeitgeber"?

SAD and the clandestine pathway

Scientists found many animals that could easily be entrained to the day with light. Originally, it was thought that humans just didn't work this way. It wasn't until the late 1980s that it was found that 2,500 to 3,500 lux was required to activate the SCN. (A typical indoor setting is only a few hundred lux.)

SAD (so the theory goes) is merely what happens when the body is saying "it's time to sleep" when annoyingly happy people are awake being annoyingly happy. SAD is a failure to entrain the SCN during the winter when there is a dearth of sunlight.

Light therapy with full spectrum lights has been used to treat SAD. Historically, treatment of SAD has taken gobs and gobs of light, basically four fluorescent tubes at arm's length. Researchers found that it took a lot of light to entrain a human, and that the light from an incandescent bulb was not particularly effective. So-called "full spectrum" lights became the recommended therapy. Note that the difference between incandescent and full spectrum light is at the blue end. Full-spectrum light has a lot more blue.

It was only fairly recently (1991) that researchers found that the rods and cones in the eye were not the light receptors that kept the SCN in sync with the day. The actual pathway is through a set of nerve cells in the eye called retinal ganglion cells. And guess what? The peak response of these puppies is in the blue region, somewhere between 460 nm and 480 nm. This is blue light, by the way.

Today, you can find light therapy boxes that use the much more efficient blue light to treat SAD. It is perhaps coincidental that there are blue LEDs with a dominant wavelength very close to the peak response of the ganglion cells.

See how annoyingly happy and productive she is?

Avoiding disentrainment

The Foundation for the Research and Investigation of Early Nighttime Diversions (FRIEND) recently published a study that showed that an alarming number of couples have recently forgone previous snuggling activity in preference to checking email and watching stupid cat videos on their cell phones and tablets. [This study involved a random sampling of couples living in my house. The study was published in a blog post from John the Math Guy, entitled "Blues in the night".]



This is a problem. First off, cuz I like snuggling. But perhaps more importantly, cuz many cellphones emit light in that critical region between 460 nm and 480 nm. In other words, the devices stimulate this pathway that tells the SCN that it is still daytime. My closest friend, Jeff Yurek, published a blog post that says that certain devices are less prone to this problem. [Jeff became one of my closest friends when he posted a link to one of my blog posts. I really am that shallow.]

Jeff says that the blue light from quantum dot displays is a bit further into the violet end of the spectrum than the critical region where the ganglion cells are sensitive. His article calls out the Vizio RS65-B2 as one TV that has a quantum dot display. Who wouldn't want one of these 65" TVs on the wall in their bedroom?!?

But if you are thinking of something a bit more portable, I just tested my KindleFire HD, and found that it is probably less prone to messing with circadian rhythms.

By the way, I would be more than happy to test anyone's tablet for how susceptible it is to upsetting your biological clock. Of course, I won't guarantee that I will return the device. If I like it, I might just keep it.

Tuesday, September 20, 2016

Lemme ask a couple thousand of my friends

I think that crowdsourcing is a really cool idea. Crowdsourcing is where you ask a couple thousand of your closest friends to help you with some project - maybe to answer an important question. Who will win the Superbowl? Should I wear a red tie to the interview? Will FitBit stock go up? Will my doggy iPad be a successful product? And, most important, should I order another strawberry margarita, or move right on to my main course of Maker's Mark on the rocks?

Jimi doing some crowdsourcing on whether his vest was the coolest part of the 60's

Naturally, I have blogged about crowdsourcing before. After all, all great men repeat themselves. I had a blog on recommendation engines, and talked a bit about how Netflix cogitates on all your ratings in order to make sure you have a superlative movie viewing experience. In this Valentine's Day blogpost, I did a bit of crowdsourcing in the name of romance.

Opinion polls are an early form of crowdsourcing. The Nielsen company once solicited my opinion on TV shows and radio stations. I showed them! I didn't watch any TV, and only listened to classical music on the radio through the entire period. That'll show 'em.

As we are in a political season (and when aren't we?), we are inundated with the latest pontifications from pollsters. But can we trust the pollsters? Are they biased? Who rates the raters? I have an answer for that! Nate Silver is a prominent statistician who applies his science to meta-analysis - analyzing the analysis. Have a look at his webpage that rates the survey companies on how well they follow an unbiased protocol and on their accuracy. And check out this page for the latest compilation of presidential polls.

I don't know how I feel about these poll results

Opinion polls have one deficiency. In order to get a statistically significant sample size, you need to ask the opinions of a lot of people, and many of these people don't know or don't care. This is not the most efficient or reliable way to make predictions.

Let's say you had some reason to want to predict the outcome of the NBA playoffs. I dunno... maybe you had some sort of financial stake? (I assume you are like an owner or something... cuz betting on games is naughty.) One way to get a good prediction is to talk to some people who follow the basketball teams closely. Like me, for example. I can tell you, right off the top of my head, how many RBIs Peyton Manning had when he went up against Tiger Woods in the 2015 Stanley Cup. You definitely would want to get my opinions on Michael Phelps before you put money on Yankees to win the Superbowl!

(By the way, I advise against putting money on the Yankees for the Superbowl.)

The problem is... opinion polls don't take into account the expertise of the people being polled. I would argue that the opinion of ten experts is more reliable than a good random sampling of 1,000 random people who are randomly clueless on the random topic.

You don't want the opinion of this random actor!

But, coming up with a panel of real experts on a random topic is a lot of work. Might there be another way that is almost as good?

Here's an interesting take... how about letting people tell you whether they are an expert? Oh. That's a bad idea. Ok, how about this... ask people to put their money where their mouth is?

Racetracks do this every day. And here is the interesting part: The odds on a horse are not based on the expert opinion of some expert. The odds at the racetrack are based entirely on crowdsourcing. When a lot of money has been bet on a given horse, the odds change. Curiously, the odds change in such a way as to make sure that the track makes money. What are the odds of the track making money!?!
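In case you have never worked through the racetrack arithmetic, here is a sketch of parimutuel odds. The horses, the dollar amounts, and the 15% track takeout are all invented for illustration:

```python
# Parimutuel betting: the pool (minus the track's cut) is divided among the winning
# tickets, so a horse's odds drop as more money piles onto it -- pure crowdsourcing.
bets = {"Sea Biscuit": 6000.0, "Old Paint": 3000.0, "Glue Factory": 1000.0}  # invented
takeout = 0.15                            # typical track cut (assumption)

pool = sum(bets.values())
payout_pool = pool * (1 - takeout)        # the track gets paid no matter which horse wins

for horse, wagered in bets.items():
    per_dollar = payout_pool / wagered    # what a winning $1 ticket pays back
    print(f"{horse}: pays {per_dollar:.2f} per dollar, roughly {per_dollar - 1:.1f} to 1")
```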


You don't like racetracks and all the shady undesirables lurking around? I have an example of crowdsourcing where people declare their expertise with their checkbook. The stock market. If a lot of people bet on a given stock, the price goes up. If no one likes the company, the stock goes down. Each individual decides how much they are willing to pay to buy a stock, or how much they are willing to sell a stock for. Just like the racetrack, only with a different sort of shady undesirables hanging around.


Now I have set the stage for a clever idea: Let's say that your company is considering whether a given idea for a new product will pay off. You could give one person that job and hope he/she gets it right. You could get a committee on it, and watch the committee form sub-committees, do focus groups, pay for market research, etc. And two years later, one person will finally have to fire the committee and make the decision. Committees are always the best way to get decisions made fast.

Or (get ready for the cool idea!) you could set up a virtual stock market for your employees to invest fake money in a bunch of potential product ideas. By introducing money - even though it's fake - you get people to invest where they feel they have some expertise. And those who actually have that expertise will tend to invest "correctly" and then have more money with which to sway future ideas.

Of course, the details get a bit involved. There is some fancy math under the hood that is needed to simulate how the price of a stock goes up when you put money into it and goes down when you sell or short a stock. This math is called a "Market Maker". It has nothing to do with Maker's Mark, unfortunately.

The domain of mathcanics
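I won't pretend to know which market-maker formula any particular product has under the hood, but one common choice for this sort of idea market is Hanson's logarithmic market scoring rule (LMSR). Here is a minimal sketch of how buying shares in an idea pushes its price up:

```python
import numpy as np

class LMSRMarketMaker:
    """Logarithmic market scoring rule: a standard automated market maker for
    prediction markets (offered as one plausible example, not as the formula
    any particular product actually uses)."""

    def __init__(self, n_ideas, b=100.0):
        self.q = np.zeros(n_ideas)   # outstanding shares in each idea
        self.b = b                   # liquidity: bigger b means prices move more slowly

    def _cost(self, q):
        return self.b * np.log(np.sum(np.exp(q / self.b)))

    def prices(self):
        e = np.exp(self.q / self.b)
        return e / e.sum()           # prices behave like probabilities; they sum to 1

    def buy(self, idea, shares):
        new_q = self.q.copy()
        new_q[idea] += shares
        price_paid = self._cost(new_q) - self._cost(self.q)
        self.q = new_q
        return price_paid

mm = LMSRMarketMaker(n_ideas=3)
print("starting prices:", np.round(mm.prices(), 3))   # all ideas start equal
mm.buy(idea=0, shares=50)                              # someone backs idea #0 with fake money
print("after a big buy:", np.round(mm.prices(), 3))    # idea #0's price has gone up
```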

There is a company in Milwaukee called IdeaWake that has developed some software to do all this. I'm happy to say that I helped them out, just a little bit.