Wednesday, November 14, 2012

Assessing color difference data

The punch line

For those who are too impatient to wade through the convoluted perambulations of a slightly senile math guy, and for those who already understand the setting of this problem, I will cut to the punch line. When looking at a collection of color difference data (ΔE values), it makes no difference whether you look at the median, the 68th, 90th, 95th, or 99th percentile. You can do a pretty darn good job of describing the statistical distribution with just one number. The maximum color difference, on the other hand, is in a class by itself.

Cool looking function that has something to do with this blog post, copied,
without so much as even dropping an email, from my friend Steve Viggiano 

In the next section of this techno-blog, I explain what that all means to those not familiar with statistical analysis of color data. I give permission to those who are print and color savvy to skip to the final section. In this aforementioned final section, I describe an experiment that provides rather compelling evidence for the punch line that I started out with.

Hypothetical situation

Suppose for the moment that you are in charge of QC for a printing plant, or that you are a print buyer who is interested in making sure the proper color is delivered. Given my readership, I would expect that this might not be all that hard for some of you to imagine.

If you are in either of those positions, you are probably familiar with the phrase "ΔE", pronounced "delta E". You probably understand that this is a measurement of the difference between two colors, and that 1 ΔE is pretty small, and that 10 ΔE is kinda big. If you happen to be a color scientist, you probably understand that ΔE is a measurement of the difference between two colors, and that 1 ΔE is (usually) pretty small, and that 10 ΔE is (usually) kinda big [1].
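
For those who like to see the arithmetic, here is a minimal sketch of the simplest flavor of color difference, the 1976 ΔEab formula, which is just the straight-line distance between two colors in CIELAB. The ΔE00 numbers later in this post come from a considerably more elaborate formula that corrects for the eye's non-uniformity; the colors below are invented purely for illustration.

    import math

    def delta_E_ab(lab1, lab2):
        """CIE76 color difference: Euclidean distance between two CIELAB colors."""
        dL = lab1[0] - lab2[0]
        da = lab1[1] - lab2[1]
        db = lab1[2] - lab2[2]
        return math.sqrt(dL**2 + da**2 + db**2)

    # Hypothetical target and measured patch, as (L*, a*, b*)
    target   = (52.0, -28.0, 14.0)
    measured = (53.1, -26.5, 15.2)
    print(round(delta_E_ab(target, measured), 2))   # about 2.2 -- "pretty small"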

Color difference example copied,
without so much as even dropping him an email, from my friend Dimitri Pluomidis

When a printer tries valiantly to prove his or her printing prowess to the print buyer, they will often print a special test form called a "test target". This test target will have some big number of color patches that span the gamut of colors that can be printed. There might be 1,617 patches, or maybe 928... it depends on the test target. Each of these patches in the test target has a target color value [2], so each of these printed patches has a color error that can be ascribed to it, each color error (ΔE) describing just how close the printed color is to reaching the target color.

An IT8 target

This test target serves to demonstrate that the printer is capable of producing the required colors, at least once. For day-to-day work, the printer may use a much smaller collection of patches (somewhere between 8 and 30) to demonstrate continued compliance to the target colors. These can be measured through the run. For an 8 hour shift, there might be on the order of 100,000 measurements. Each of these measurements could have a ΔE associated with it.

If the printer and the print buyer have a huge amount of time on their hands because they don't have Twitter accounts [3], they might well fancy having a look at all the thousands of numbers, just to make sure that everything is copacetic. But I would guess that  if the printers and print buyers have that kind of time on their hands, they might prefer watching reruns of Andy Griffith on YouTube, doing shots of tequila whenever Opie calls his father "paw".

But I think that both the printer and the print buyer would prefer to agree on a way to distill that big set of color error data down to a very small set of numbers (ideally a single number) that could be used as a tolerance. Below that number is acceptable, above that number is unacceptable.

It's all about distillation of data

But what number to settle on? When there is a lot at stake (as in bank notes, lottery tickets and pharmaceutical labels) the statistic of choice might be the maximum. For these, getting the correct print is vitally important. For cereal boxes and high class lingerie catalogs (you know which ones I am talking about), the print buyer might ask for the 95th percentile - 95% of the colors must be within a specified color difference ΔE. The printer might push for the average ΔE, since this number sounds less demanding. A stats person might go for the 68th percentile, purely for sentimental reasons.

How to decide? I had a hunch that it really didn't matter which statistic was chosen, so I devised a little experiment with big data to prove it.

The distribution of color difference data

Some people collect dishwasher parts, and others collect ex-wives. Me? I collect data sets [4]. For this blog post I drew together measurements from 176 test targets. Some of these were printed on a lot of different newspaper presses, some were from a lot of ink jet proofers, and some were printed in flexography. For each, I found a reasonable set of aim color values [5], and I computed a few metric tons of color differences in ΔE00 [6].

Let's look at one set of color difference data. The graph represents the color errors from one test target with 1,617 different colors. The 1,617 color differences were collected in a spreadsheet to make this CPDF (cumulative probability density function). CPDFs are not that hard to compute in a spreadsheet. Plunk the data into the first column, and then sort it from smallest to largest. If you like, you can get the percentages on the graph by adding a second column that runs from 0 to 1. If this second column is on the right, the plot will come out correctly oriented.

Example of the CPDF of color difference data

This plot makes it rather easy to read off any percentile. In red, I have shown the 50th percentile - something over 1.4 ΔE00. If you are snooty, you might want to call this the median. In green, I have shown the 95th percentile - 3.0 ΔE00. If you are snooty, you might want to call this the 95th percentile.
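
If you would rather let a few lines of code do the sorting and the percentile-reading, here is a minimal sketch of the same calculation (the ΔE values below are invented stand-ins; plunk in your own 1,617):

    import numpy as np

    # Invented ΔE00 values standing in for the measured patches of a test target
    delta_e = np.array([0.8, 1.2, 1.5, 0.4, 2.9, 1.1, 3.4, 0.9, 1.8, 2.2])

    # Build the CPDF: sort the color differences, then pair each one with its
    # cumulative fraction (0 to 1). Plotting fraction against ΔE gives the curve.
    sorted_de = np.sort(delta_e)
    fraction = np.arange(1, len(sorted_de) + 1) / len(sorted_de)

    # Reading off percentiles directly
    print(np.percentile(delta_e, 50))   # the median (the red line)
    print(np.percentile(delta_e, 95))   # the 95th percentile (the green line)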

Now that we understand how a CPDF plot works, let's have a look at some of the 176 CPDF plots that I have at my beck and call. I have 9 of them below.

Sampling of real CPDFs

One thing that I hope is apparent is that, aside from the rightmost two of them, they all have more or less the same shape. This is a good thing. It suggests that maybe our quest might not be for naught. If they are all alike, then I could just compute (for example) the median of my particular data set, and then just select the CPDF from the curves above which has the same median. This would then give me a decent estimate of any percentile that I wanted.

How good would that estimate be? Here is another look at some CPDFs from that same data set. I chose all the ones that had a median somewhere close to 2.4 ΔE00.

Sampling of real CPDFs with median near 2.4

How good is this for an estimate? This says that if my median were 2.4 ΔE00, then the 90th percentile (at the extreme) might be anywhere from 3.4 to 4.6 ΔE00, but would likely be about 4.0 ΔE00.

I have another way of showing that data. The graph below shows the relationship between the median and 90th percentile values for all 176 data sets. The straight line on the graph is a regression line that goes through zero. It says that the 90th percentile = 1.64 * median. I may be an overly optimistic geek, but I think this is pretty darn cool. Whenever I see an r-squared value of 0.9468, I get pretty excited.

Ignore this caption and look at the title on the graph
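
For anyone who wants to reproduce this sort of fit, a least-squares line forced through zero has a simple closed form: the slope is the sum of the products divided by the sum of the squared medians. Here is a sketch with invented numbers standing in for the 176 real data sets (and note that there is more than one convention for r-squared when the intercept is pinned at zero):

    import numpy as np

    # Invented (median, 90th percentile) pairs standing in for the 176 test targets
    median = np.array([1.4, 2.0, 2.4, 3.1, 1.8, 2.7])
    p90    = np.array([2.3, 3.2, 4.0, 5.2, 2.9, 4.5])

    # Least-squares slope for a line forced through the origin
    slope = np.sum(median * p90) / np.sum(median**2)

    # One common way to report the goodness of fit
    predicted = slope * median
    ss_res = np.sum((p90 - predicted)**2)
    ss_tot = np.sum((p90 - np.mean(p90))**2)
    r_squared = 1 - ss_res / ss_tot

    print(slope, r_squared)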

Ok... I anticipate a question here. "What about the 95th percentile? Surely that can't be all that good!" Just in case someone asks, I have provided the graph below. The scatter of points is broader, but the r-squared value (0.9029) is still not so bad. Note that the formula for this is 95th percentile = 1.84 * median.

Ignore this one, too

Naturally, someone will ask if we can take this to the extreme. If I know the median, how well can I predict the maximum color difference? The graph below should answer that question. One would estimate the maximum as being 2.8 times the median, but look at the r-squared value: 0.378. This is not the sort of r-squared value that gets me all hot and bothered.

Max does not play well with others

I am not surprised by this. The maximum of a data set is a very unstable metric. Unless there is a strong reason for using this as a descriptive statistic, this is not a good way to assess the "quality" of a production run. This sounds like the sort of thing I may elaborate on in a future blog.

The table below tells how to estimate each of the deciles (and a few other delectable values) from the median of a set of color difference data. This table was generated strictly empirically, based on 176 data sets at my disposal. For example, the 10th percentile can be estimated by multiplying the median by 0.467.  This table, as I have said, is based on color differences between measured and aim values on a test target [7].

P-tile   Multiplier   r-squared
 10        0.467        0.939
 20        0.631        0.974
 30        0.762        0.988
 40        0.883        0.997
 50        1.000        1.000
 60        1.121        0.997
 68        1.224        0.993
 70        1.251        0.991
 80        1.410        0.979
 90        1.643        0.947
 95        1.840        0.903
 99        2.226        0.752
Max        2.816        0.378
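
For those who would rather have that table in a form a computer can chew on, here is a sketch of the lookup (the multipliers are copied from the table above, and footnote [7] still applies):

    # Estimated percentile of a ΔE00 distribution = multiplier * median
    MULTIPLIER = {
        10: 0.467, 20: 0.631, 30: 0.762, 40: 0.883, 50: 1.000,
        60: 1.121, 68: 1.224, 70: 1.251, 80: 1.410, 90: 1.643,
        95: 1.840, 99: 2.226, "max": 2.816,
    }

    def estimate_percentile(median_de, ptile):
        """Estimate a percentile of a set of color differences from its median."""
        return MULTIPLIER[ptile] * median_de

    print(estimate_percentile(2.4, 90))   # roughly 3.9, in line with the plots above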

Caveats and acknowledgements 

There has not been a great deal of work on this, but I have run into three papers.

Fred Dolezalek [8] posited in a 1994 TAGA paper that the CRF of ΔE variations of printed samples can be characterized by a single number. His reasoning was based on the statement that the distribution “should” be chi-squared with three degrees of freedom. He had test data from 19 press runs with an average of 20 to 30 sheet pulls. It’s not clear how many CMYK combinations he looked at, but it sounds like a few thousand data points, which is pretty impressive for the time for someone with an SPM 100 handheld spectrophotometer!

Steve Viggiano [9] considered the issue in an unpublished 1999 paper. He pointed out that the chi-squared distribution with three degrees of freedom can be derived from the assumptions that the ΔL*, Δa*, and Δb* values are normally distributed, have zero mean, have the same standard deviation, and are uncorrelated. He also pointed out that these assumptions are not likely to be met with real data. I'm inclined to agree with Steve, since I hardly understand anything of what he tells me.

David McDowell [10] looked at the statistical distributions of color errors of a large number of Kodak Q-60 Color Input Targets and came to the conclusion that this set of color errors could be modeled as a chi-squared function.

Clearly, the distribution of color errors could be anything it wants to be. It all depends on where the data came from. This point was not lost on Dolezalek. In his analysis, he found that the distribution only looked like a chi-squared distribution when the press was running in a stable condition.

Future research

What research paper is complete without a section devoted to "clearly further research is warranted"? This is research lingo for "this is why my project deserves to continue being funded."

I have not investigated whether the chi-squared function is the ideal function to fit all these distributions. Certainly it would be a good guess. I am glad to have a database that I can use to test this. While the chi-squared function makes sense, it is certainly not the only game in town. There are the logistic function, the Weibull function, all those silly beta functions... Need I go on? The names are no doubt as familiar to you as they are to me. Clearly further research is warranted.

Although I have access to lots of run time data, I have not investigated the statistical distributions of this data. Clearly further research is warranted.

Perhaps the chi-squared-ness of the statistical distribution of color errors is a measure of color stability? If there were a quick way to rate the degree to which any particular data set fit the chi-squared function, maybe this could be used as an early warning sign that something is amiss. Clearly further research is warranted.

I have not attempted to perform Monte Carlo analysis on this, even though I know how to use random numbers to simulate physical phenomena, and even though I plan on writing a blog on Monte Carlo methods some time soon. Clearly further research is warranted.

I welcome additional data sets that anyone would care to send. Send me an email without attachment first, and wait for my response so that your precious data does not go into my spam folder: john@JohnTheMathGuy.com. With your help, further research will indeed be warranted.

Conclusion

My conclusion from this experiment is that the statistical distribution of color difference data, at least that from printing of test targets, can be summarized fairly well with a single number. I have provided a table to facilitate conversion from the median to any of the more popular quantiles.

----------------------
[1] And if you are a color scientist, you are probably wondering when I am going to break into explaining the differences between the deltaE ab, CMC, 94, and 2000 difference formulas, along with the DIN 99 and Labmg color spaces. Well, I'm not. At least not in this blog.

[2] For the uninitiated, color values are a set of three numbers (called CIELAB values) that uniquely defines a color by identifying the lightness of the color, the hue angle, and the degree of saturation.

[3] I have a Twitter account, so I have very little free time. Just in case you are taking a break from Twitter to read my blog, you can find me at @John_TheMathGuy when you get back to your real life of tweeting.

[4] Sometimes I just bring data up on the computer to look at. It's more entertaining than Drop Dead Diva, although my wife might disagree.

[5] How to decide the "correct" CIELAB value for a given CMYK value? If you happen to have a big collection of data that should all be similar (such as test targets that were all printed on a newspaper press) you can just average to get the target. I appreciate the comment from Dave McDowell that the statistical distribution of CIELAB values around the average CIELAB value will be different from the distribution around any other target value. I have not incorporated his comment into my analysis yet.

[6] Here is a pet peeve of mine. Someone might be tempted to say that they computed a bunch of ΔE00 values. This is not correct grammar, since "ΔE" is a unit of measurement, just like metric ton and inch. You wouldn't say "measured the pieces of wood and computed the inch values," would you?

[7] No warranties are implied. Use this chart at your own risk. Data from this chart has not been evaluated for color difference data from sources other than those described. The chart represents color difference data in  ΔE00, which may have a different CPDF than other color difference formulas.

[8] Dolezalek, Friedrich, Appraisal of Production Run Fluctuations from Color Measurements in the Image, TAGA 1995

[9] Viggiano, J A Stephen, Statistical Distribution of CIELAB Color Difference, Unpublished, 1999

[10] McDowell, David (presumed author), KODAK Q-60 Color Input Targets, KODAK technical paper, June 2003

Wednesday, November 7, 2012

Evolution versus revolution


I have heard it said that the process of improving our processes is evolutionary, not revolutionary. The point is made that it is desirable to follow the course of evolution, and make small gradual changes to optimize our way of doing things. I want to consider this analogy through a number of other analogies.

Just how smart is it to be an ape?

In college, I took a primate psychology course in which I was required to write a paper. I had always been puzzled by the fact that apes are, in general, not very successful species, so I tried to resolve that puzzlement in my paper. It seemed to me that higher intelligence would imply greater adaptability, and hence greater success.


I started my paper by comparing the apes with other mammals which occupy similar niches. The pairings I made were between the gibbon and the spider monkey, the chimpanzee and the baboon, the orangutan and the tree sloth, the gorilla and the bear. In each case the ape in the pairing was less successful. Three of the apes are also on the endangered species list. The point is clear: there must be something detrimental that is acquired along with intelligence.

I noted two costs associated with intelligence. First, the additional size of the brain requires that the young be birthed earlier, and hence babies are more helpless and less viable. Furthermore, birth is considerably more stressful for the mother because of the larger head size. Second, the young are dependent on the parents for a much longer period, because of the shift from a "ROM based system" to a "RAM based system".


My conclusion was that, in the range of intelligence inhabited by the apes, the detriments of intelligence outweigh the benefits. I got an A+ on the paper. That was awesome.

The search for the missing link

Years later, I read a book entitled The Panda's Thumb [1]. The author talks at great length about the fruitless search for the missing links in the fossil record between species. He argues that, while this could be explained by our sparse sampling of the geologic record, the lack of evidence points to our misconception about the evolutionary process. He claims that evolution is not always a gradual process, but that new species are created only when evolution occurs in spurts.

His argument is that, since creatures are generally fairly well fine-tuned to their environment, small changes are likely to be less viable, and will not survive. Only by making large and abrupt changes can a new species find a more successful niche. I think he was probably inspired by my paper.

A-kneeling at the optimal point

This idea is related to problems encountered with the numerical optimization of functions. The goal is to write a computer program which can automatically find the maximum of an arbitrary function. If we could be guaranteed functions with only a single local maximum, this would be relatively easy. Methods for this have existed since Newton. They all basically steer themselves uphill until they find a flat spot. Unfortunately, in the real world, functions rarely constrain themselves to a single local maximum. Depending upon where you start, you may find yourself on a peak that is not the peak.


There is a numerical optimization method called simulated annealing [2] which can avoid getting caught in a local maximum. It is a procedure which is modeled after the annealing process.

Annealing is the process by which knife blades and the turbine blades in a jet engine are heat treated. The piece is first raised to a very high temperature, enabling the molecules to wander around freely, rather than immediately settling into the first low energy state they can find. Slowly the temperature is lowered, and the molecules gradually settle into lower energy states. At any time, they are free to wander around, but as the temperature goes down, this becomes less likely. The key to annealing's ability to find the lowest energy state (a single crystal) is that molecules are allowed to pass through higher energy states.

The numerical annealing process does not force the method to always head uphill. At each stage, the choice between uphill and downhill is random. As the method proceeds, the uphill choice becomes more and more likely, just as lowering the temperature makes transitions out of lower energy states less likely.
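
Here is a minimal sketch of the idea on a one-dimensional toy function of my own invention (not the recipe from Numerical Recipes): uphill moves are always accepted, and downhill moves are accepted with a probability that shrinks as the temperature drops.

    import math
    import random

    def f(x):
        # A bumpy function: the global maximum is at x = 0, with local maxima nearby
        return 10 * math.cos(2 * x) - x * x

    def anneal(steps=20000, start=8.0):
        x = start                      # deliberately start far from the global peak
        best_x, best_f = x, f(x)
        temperature = 5.0
        for _ in range(steps):
            temperature *= 0.9995      # slowly "cool" the system
            candidate = x + random.uniform(-0.5, 0.5)
            delta = f(candidate) - f(x)
            # Always accept an uphill move; accept a downhill move with a
            # probability that shrinks as the temperature drops.
            if delta > 0 or random.random() < math.exp(delta / temperature):
                x = candidate
            if f(x) > best_f:
                best_x, best_f = x, f(x)
        return best_x, best_f

    print(anneal())   # should land near (0.0, 10.0)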

Application 

I have seen this process play out again and again.

I first started pondering the relationship between evolution and revolution when I was trying to understand what made engineering groups work and what made them fail. For a very small group of engineers, it is acceptable, and in fact, most efficient, to keep everything informal - specs, documentation, project planning.

As an engineering group gets bigger, it goes through a phase where making incremental increases in the amount of structure just plain does not work. Wholesale changes must occur in order for the larger group to be efficient. I lived through a few of these. They were painful, but revolutionary evolution was necessary.


In a previous post about printing standards and a panel discussion I led at GraphExpo, I discussed the issues with optical brightening agents in printing. Several steps must be taken to solve the problem. I lamented that taking only one of these steps will make things worse. Evolutionary revolution is required.

While I am on the subject of standards, let me make an observation. As a member of a few standards committees, I have come to realize that you can never change a standard. Even when a change to a standard makes sense, there will always be those who cannot adapt to whatever new stuff is put into place. This requires evolutionary revolution to make it work.

Here is a quote from an article that appeared on my LinkedIn page today. Same topic. Evolutionary revolution is necessary.
"As organizations continue to evolve within an ever-changing external environment, it has become quite evident that things are shifting talent- wise. This will not likely manifest as a small iterative adjustment in how we view, attract and develop talent. Rather, it seems destined to become a long overdue metamorphosis concerning our most important asset - people."

The patent fight between Apple and Motorola is another example. There are industries where there are very few patent attorneys. And there are industries where every company needs to have a bushel basket full of them. The transition of an industry from one to the other is an evolutionary revolution. A company cannot survive if the competitors have more patent attorneys.

In all these cases, we are not faced with a choice between evolution and revolution; evolution is revolution.



[1] Stephen Jay Gould, The Panda’s Thumb, W. W. Norton and Co., 1980
[2] Press, William, Brian Flannery, Saul Teukolsky, and William Vetterling, Numerical Recipes: The Art of Scientific Computing

Wednesday, October 31, 2012

Just some random thoughts

Today's blog was inspired by a good friend and co-worker, Pat. He sent me a link to a brilliant piece of erudition / social commentary.

Random math paper generation


A fellow by the name of Nate Eldredge wrote a program called MathGen that builds on an earlier program called SCIgen. This clever piece of work will randomly generate math papers. Papers in pure math, ready for submission to a journal. Gibberish papers.

It seems that someone actually took the challenge and submitted this randomly generated paper to the journal Advances in Pure Mathematics. I am sure that most of my readers have a copy of this journal on their nightstand. I know I keep mine handy.

The paper was accepted. Well, to be honest, it was provisionally accepted. The editor had the gall to ask for a few things to be cleaned up. This oppressive editor complained that he couldn't quite catch the main thought of the article from the abstract.

Naturally, I had to try generating my own paper. And since I am more or less a vain fellow, hell-bent on self-aggrandizement, I put my name on the paper. Here is the abstract of my paper.


I will admit to being quite proud of my efforts. I haven't a clue what it means, although I recognize several of the words. But still, I am proud to have my name at the top.

Commentary

That was the erudite part. Taken in small pieces, the text is hard to distinguish from virtually any paper on pure math. Clever little program.

What about the social commentary part of this? Here are some of the conclusions that I could draw from this story.

Conclusion 1 - The journal that provisionally accepted this manuscript? They certainly have egg on their face. What else can I say?

Conclusion 2 - The papers that MathGen writes look a lot like every other pure math paper that I have gone out of my way to not read, so I can hardly fault the editor for not taking the time to understand this random paper. Let's face it. The truth is, pure math papers are all gibberish anyway. I have seen several pure math papers that supply proofs of this.

I should explain here that "John the Math Guy" is not my full name. My full name is "John the Applied Math Guy". I should further explain that applied and pure math guys are not always the best of friends. This is not well known, but applied and pure mathematicians have been at odds ever since Aristotle got in a fistfight with Archimedes over an urn that one of them broke during some bacchanalia. The remnants of those pottery shards and ancient arguments can still be seen in the applied versus pure math jokes that have been going around. I am sure you have heard all of them [1]. 

Conclusion 3 - The MathGen program has demonstrated some level of artificial intelligence, perhaps not quite up to the Turing test [2], but it was up to the task in a rather limited scope. This is perhaps not all that surprising, since the Turing test is not all that hard, given the circumstances are narrow enough. The 1966 program ELIZA did a decent job of convincing people that it was a psychotherapist. 

Many lines of code have passed through the compiler since ELIZA. What is the state of the art now? Are other programs capable of producing creative work that rivals humans?  

Random song generation

Years ago, when I got my first computer (it was a Radio Shack Color Computer with 48K of RAM), I played with writing a program that would generate songs at random. The first approach was, well, lousy. It sounded a lot like music by John Cage. This is not all that strange, since Cage wrote one of the first totally aleatoric music pieces, meaning "music composed by throwing dice".


Many times I have pondered ways to make random music sound, well, less random. Rules could be added like "the piece should end on the tonic", and "small steps in pitch are more likely than large ones", and "there is an underlying chord pattern in the song that the notes might want to follow", and, an important one, "breaking the rules once in a while makes the song interesting." I have never taken this idea any further. I guess I am just too lazy to sit around and have my computer write music for me. 

I would assume that I am not the only one who has thought along these lines. I did a very tiny amount of searching and found that Wolfram [3] has put at least some effort into this.  This is cool, cuz these folks are really sharp and they know lots of stuff and everything.

Here is their music generator website. I played with this a while, trying to get a little blues riff going. Blues is one of the choices they have for genre. I figgered this would be a good choice, since the eight-bar blues chord progression is a piece of cake to master [4]. I selected a few instruments that I might like to hear doing blues.

I can't express my level of disappointment. If John Lee Hooker ever met John Cage... Clearly since this percolated up to the top of the Google search, it must represent the absolute bestest available in random music generation technology. Sad.

Random artwork generation

Maybe random generation of art is more state-of-the art? I poked around the internet a bit and found a website where I could generate my own random artwork. I used the title "Random art?" and got the fabulous image below. (This program evidently uses the words you type in to reseed their random number generator, since the same name always gives you the same artwork.)

Random Art?

I don't have to tell you that this piece is absolutely gorgeous. I had it printed on a 2 ft by 2 ft canvas and framed it for my wall. But, is this up to par with what a human artist can do? Have a look at the piece below, and you be the judge [5].


Yes, I agree. Monet couldn't hold a candle to either of these incredible pieces of art.

Random poetry generation

Poetry is another field where the old dice can give a practitioner a run for his money. 

The algorithm for one automatic poetry generator is pretty transparent. You give the poetry generator lists of nouns and of adverbs and all that. You give it a template for the poem, telling it what part of speech to use where. You push the button and it will fill in the blanks with randomly selected words from your lists. Tedious, simple, and boring.

On the other hand, it is my understanding that the algorithm used by this program is not all that different from the one used at the heart of MathGen.
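
Here is roughly what that fill-in-the-blank approach boils down to, with toy word lists and a toy template of my own (not the ones the website actually uses):

    import random

    # Toy word lists and a toy template -- mine, not the website's
    WORDS = {
        "adjective": ["smooth", "flat", "slick", "random", "aleatoric"],
        "noun":      ["asphalt", "stormcloud", "timebomb", "petal", "deer"],
        "verb":      ["bubbles", "mixes", "radiates", "remembers"],
    }
    TEMPLATE = ["adjective", "noun", "verb", "adjective", "noun"]

    def generate_line():
        # Fill each slot in the template with a randomly chosen word of that type
        return " ".join(random.choice(WORDS[part]) for part in TEMPLATE)

    print(generate_line())   # e.g. "slick timebomb radiates flat deer"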

You know what would be nice? It would just be nice if the poetry generator would save me the effort of typing in words and just go out and search the web for seed words. I dunno, maybe I could give it a list of my patents, and it could turn them into some love sonnets. I am sure that would get me the babes.

This next one was quite a bit less work for me, and it actually sounds like poetry. At least to me, someone who really doesn't dig poetry. I guess a poetry aficionado would read this and make sense of it. Maybe even think it's deep?

asphalt lines, slicked with tar in the sun 
radiation in the stormclouds 
bubble cloud air and water mixed 
smooth flat memories of past 
blood becomes timebomb 
flower petals 
forest deer 
rainstorms 
leaves 

Here is yet another random poetry generator, one that has been around for quite some time.

The Rando-Dylan poetry generator

Here is some poetry generated by this random poetry generator:

You never turned around to see the frowns on the jugglers and the clowns
When they all come down and did tricks for you
You never understood that it ain’t no good
You shouldn’t let other people get your kicks for you
You used to ride on the chrome horse with your diplomat
Who carried on his shoulder a Siamese cat
Ain’t it hard when you discover that
He really wasn’t where it’s at
After he took from you everything he could steal

I understand that some people think that Rando-Dylan produces some deep poetry, as well.

Random prose generation

The first random poetry generator picked words at random, but it had the advantage of being fed information about whether a given word was a noun or preposition. This, combined with a template, was enough to create some semblance of proper grammar.

Claude Shannon [6] suggested a different algorithm. Imagine starting by creating a word probability array. The array contains a list of a lot of words along with their probability of occurrence. In this first order approach, the computer merely selects the next word according to the likelihood in the array.

Simple, but it sounds like total gibberish. There is no knowledge of grammar.

Shannon's second order approach is stochastic. The next word in the sequence depends upon the current word. The first step is to create a two dimensional histogram of words. Position i,j in this array is the probability that word i will be followed by word j. Such an array could quite easily be created by downloading public domain novels from the internet.
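
Here is a minimal sketch of that second order scheme, a chain of word pairs trained on whatever text you hand it. The sample text below is just a fragment for illustration; in practice you would feed it those public domain novels.

    import random
    from collections import defaultdict

    def build_followers(text):
        """Record, for each word, the words that were observed to follow it."""
        words = text.split()
        followers = defaultdict(list)
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)   # duplicates preserve the probabilities
        return followers

    def generate(followers, start, length=20):
        word, output = start, [start]
        for _ in range(length - 1):
            if word not in followers:        # dead end: no observed follower
                break
            word = random.choice(followers[word])
            output.append(word)
        return " ".join(output)

    # A tiny training text; a real run would use a shelf's worth of novels
    sample = "the head and the hand and the head of the attack on the writer"
    print(generate(build_followers(sample), "the"))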

I don't think Shannon did any downloading from the internet, but he did generate the following second-order stochastic prose:

THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE
CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE
LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN
UNEXPECTED.

Gibberish? Well, yeah. But consider doing third order, where each word is generated based on the two previous words. Thus, three word sequences would only appear in the output if they occurred somewhere in the text that built the histogram.

Or why not fourth order, or... Clearly at some point this goes from being a cool way to write novels to plagiarism. I don't care how many books you feed into the program to create the multidimensional histogram: if the computer has a sequence of 50 words, the next word will probably come from the novel that the first 49 words came from.

Political speech generation

Since I know that both Obama and Romney rush to read my blog every Wednesday morning when it posts, I will end with something that will be of great use to both of them. Yes, there is a random generator of political speeches! I rolled the dice a few times and here is what I got:

My opponent is conspiring with Halliburton board members, street gangs and military-industrial warmongers. I will work for an America where pedophiles and socialists cannot sabotage our love for the Bible. Unlike my opponent, I will protect our right to kill foreigners, our sense of trust and our right to free speech.

I certainly think this passes the Turing test.

-----------------   Notes  --------------------------

[1] A fellow becomes a pure mathematician when he realizes he doesn't have enough personality to be an applied mathematician. Or an accountant. How can you tell the difference between a pure and an applied mathematician? An applied mathematician will look at your shoes when he talks to you, rather than his own. A pure and an applied mathematician walk into a bar. The rest of the joke is obvious.

[2] The early computer scientist Alan Turing proposed a simple test of whether a computer can "think". If the computer can be mistaken for a human being in conversation, then the machine can be said to think, or to have artificial intelligence, or at least, can be said to have passed the Turing test.

[3] Wolfram makes the program Mathematica, which I have been using since around 1985.

[4] I haven't quite mastered the eight-bar blues. I am usually stumbling by the third or fourth bar. And I have never made it all the way through the circle of fifths.

[5] I have been accused of stretching the truth sometimes, but this is the truth. The image was generated by the online program that I mentioned. You can try it yourself. The painting was indeed painted by Mark Rothko. You can click the link to see. The uncanny resemblance of these two is clearly proof of the fundamental connectedness of all things.

[6] Claude Shannon (1948). A Mathematical Theory of Communication. Bell System Technical Journal 27, pp. 379–423, 623–656
  




Wednesday, October 24, 2012

It's all in how you shuffle the deck

The Astounding card trick

Ladies and gentlemen, I will now perform for you the celebrated John the Math Guy card trick. Without the use of PhotoShop or other devious means of deception, I will shuffle an ordinary deck of cards not once (as in tricks performed by the far less talented Rupert the Math Guy), but twice. 

Yes, my friends, I said that two times I will invoke the Goddess Tyche, the Greek (or Roman or whatever) goddess of luck. Twice I will knock on her gates and implore her to make my deck of cards be like unto a random ordering in her sight.

Please turn your attention to the first photograph below - an actual unretouched photograph of the order in which I have placed the deck before my heretofore unprecedented act of prestidigitation. Note the orderliness of the cards, as befits an applied mathematician of my lofty stature. I have proof of the veracity and authenticity of this picture on file in the form of a document to which has been affixed the paw prints of my dear and most honorable dog, Scrabble, attesting to the undeniable fact that all I am relating is the absolute truth.
The actual cards used in this actual experiment before the actual experiment takes place

Now, please watch carefully as I perform the act of shuffling this ordinary deck of cards. The procedure is tripartite in that there are three parts. First the deck is cut roughly in half. (Preferably, it is cut lengthwise and parallel to the face of the cards.) The right and the left hand take the first and second half of the deck, respectively.

Second, the two half-decks are "riffle shuffled", as shown below. The cards are flexed and the thumbs are slowly moved to allow the cards to be flapped downward under the inducement of the flexing. As is well known to one skilled in the art, the order that the cards fall, left or right, is governed solely by the whims of Lady Tyche.

The riffle step of the riffle shuffle

In the third and final step of this tripartite procedure, one applies a reverse flexing of the cards so as to propel them out over the table to the embarrassment of the dealer. The random nature of collecting the cards back into a deck is how randomness is attained.

Friends, with Dog as my witness [1], I am here to tell you that I performed this very procedure exactly twice. Surely if Tyche has provided us with a completely random order by virtue of the first riffle shuffle, then the second process of riffle shuffling should leave us with an even more random order. But to the amazement and perplexification of all in my reading audience, the following unretouched photo shows but a sampling of the bewildering arrangement of the deck. I show the top ten cards of the deck, which include the unbelievable quantity of no less than four kings, and the even more unbelievable quantity of four queens... all in the very topmost ten cards. 

First ten cards in the deck after two riffle shuffles

Explanation

There is no trick here. It took me a few tries to get these results, but these are actual results. I am not a skilled magician, and definitely not a skilled card player. I have been to Vegas only once, and that was for a convention. Let me tell you, the gaming tables are pretty lonely when the annual Math Guy convention hits town! But at least the bars are busy. And the show girls.

If one has executed the riffle shuffle perfectly, the last card down is (let's say) the king of spades. The next to last down will be the king of clubs from the other hand. Prior to that, the queen of spades falls, and prior to that, the queen of clubs, and so on. When the deck is split in half for the second riffle, the deck is cut between the two black aces and the two red kings. (Ok... truth be told, I cheated and glanced at the cards as I cut the deck so I could get the correct split. Scrabble didn't catch that.)

It does not take a whole lot of practice to be able to amaze your own friends with this trick. A reasonably well controlled riffle will readily produce a pattern after two shuffles that is clearly not random. This is not rocket science, but anyone who has settled for two shuffles of the deck clearly has not thought this through.

How many times should you shuffle?

Which brings me to the practical question... how many times should you shuffle? I am sure there are many statistically valid ways to look at this question, but I offer one approach that is relatively simple. I start by considering the very top card, which in the illustration of the virgin deck was the king of spades.

I make the simplifying assumption that corresponding cards (that is, the fifth card of the half-deck on the right side, and the fifth card of the half-deck on the left side) will square off and the two will fall in random order. Thus, after the first "perfect" riffle, the king of spades might still be on top, or he might have dropped to the position of the second card.

"No... you go first, I insist"

If the king fell to the second position after the first riffle shuffle, then he will square off with the second card on the other side and wind up in either the third or fourth position after the second riffle. If he makes it down to the fourth position he has a chance to make it to the eighth position after the next riffle. With each additional riffle shuffle, the possible extent that the king may have traveled is doubled.

Thus, under these simplifying assumptions, one needs to shuffle the deck at least six times in order to make it possible for the top card to move to any other position. Wow. I can't imagine any group of card players having the patience to allow someone to shuffle that much!
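
Under that same simplifying assumption (the corresponding cards of the two half-decks square off and fall in either order), the set of positions the top card can possibly reach after each shuffle can be computed exhaustively in a few lines. Here is a sketch:

    DECK = 52

    def next_positions(p):
        """Positions (1-indexed) a card at position p can land in after one riffle,
        assuming corresponding cards of the two half-decks fall in either order."""
        q = p if p <= DECK // 2 else p - DECK // 2
        return {2 * q - 1, 2 * q}

    reachable = {1}                     # the top card starts at position 1
    for shuffle in range(1, 8):
        reachable = set().union(*(next_positions(p) for p in reachable))
        print(shuffle, len(reachable), max(reachable))

    # After five shuffles the top card still cannot get past position 32;
    # it takes six shuffles before all 52 positions are reachable.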

Scrabble playing poker with some of his buddies

Schafkopf and the riffle shuffle

If you happen to be from Wisconsin, you will know that "Schafkopf" (also known as sheepshead) is a card game. If you are lucky enough to have had someone try to teach you the game, you will know that the queen of clubs takes all, but is only worth two points (or was it three?), whereas the tens are all worth ten points, but don't take many tricks at all. Kings, of course, are four points. The seven of diamonds (which is worth no points at all) will take the ace of spades, which is a good thing, since the partner (one of four players aside from the fellow who picked) has the called ace and has to play it when someone leads that suit. And that's eleven points. I mean the ace is eleven points. If this happens (I mean the seven of diamonds thing) and you are the picker, you better hope you buried some points in the blind so you can at least get schneider, cuz people are likely to schmear!

Sheepshead is often played with five people, and most frequently with a sixth player alternately sitting out so they can deal and then refill everyone else's beer. If you happen to be from Wisconsin, you will be familiar with this amber liquid. If you happen to be from Wisconsin and play sheepshead (but I repeat myself), you will understand that it is impossible to remember all the darn rules for sheepshead without large quantities of this amber liquid. Some have gone so far as to say that beer was invented just so that people would be able to play sheepshead.

I mention sheepshead first off because it is the only thing in Wisconsin that is sillier than a cheesehead hat [2]. But more importantly, I mention it because I have a secret to tell - one that up until now, no one else has known.

Sheepshead uses a deck of 32 cards. (That's not the secret, by the way.) If you were to do five perfect riffle shuffles (where the cards fall first from the right side, then the left, then the right, and so on, always starting from the right side) on a sheepshead deck, you will be back exactly where you started. The deck will appear to have not been shuffled at all.
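
You can check the claim in a few lines of code. I use the "out-shuffle" convention here, where the original top card stays on top; the "in-shuffle" convention takes a different number of passes, so treat this as a sketch of the idea rather than a transcript of my hands.

    def out_shuffle(deck):
        """Perfect riffle shuffle in which the original top card stays on top."""
        half = len(deck) // 2
        top, bottom = deck[:half], deck[half:]
        shuffled = []
        for a, b in zip(top, bottom):
            shuffled.extend([a, b])
        return shuffled

    deck = list(range(32))      # a sheepshead deck
    original = list(deck)
    count = 0
    while True:
        deck = out_shuffle(deck)
        count += 1
        if deck == original:
            break
    print(count)                # 5 for 32 cards (it takes 8 for a 52-card deck)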

How do I happen to know this? I am glad you asked, because it allows me to brag. Way back in 1984, when most computers were built from Tinkertoys, I built an image processing computer that could perform two dimensional fast Fourier transforms on images. Exciting stuff. Good resume fodder. I mean, what real applied mathematician hasn't taken a youthful foray into Fourier space? I spent at least four years being spaced out in the foyer.

I envisioned at the time a faster fast Fourier transform computer, one which utilized 16 or 32 or 1,024 individual processors, each working on their own part of the data. The difficulty in this design is that eventually the processors would need to share data with other processors - not just any processors, but very specific ones. The riffle shuffle provided exactly the right connections for the proper data sharing. In the first pass, the deck (the data array) is split in half, and adjacent pairs are combined. In the second pass, the array is once again split in half... and so on.

Looking at the FFT flowchart below, it looks as though the second pass does not split the deck in half, but rather into quarters. The difference is that in the traditional algorithm, the results of the combination of the first and the n/2th entry go into the first location and the n/2th location. There is no shuffling in the traditional algorithm, as there is in my riffle shuffle FFT algorithm. As a result, the data points to combine are always the same in the RS FFT - data points in the first half of the array are always combined with the corresponding points in the second half. This is the magic that makes this the perfect algorithm for a group of 2^n processors. The processors that need to pass data amongst themselves are always the same.

FFT butterflies follow the perfect shuffle

When implemented in a standard processor, there is a small benefit to the RS FFT algorithm. The standard FFT requires a final step of de-scrambling the array. The RS FFT algorithm does not require this. The final results are in the correct order.
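
For the curious, the split, combine, and interleave idea can be sketched recursively in a few lines of Python. This is a toy illustration of the data flow, not the multiprocessor design I had in mind: each pass combines the first half of the array with the corresponding entries of the second half, and the two half-length results are riffled back together, so the output arrives in natural order with no descrambling step.

    import cmath

    def riffle_fft(x):
        """Recursive FFT (length must be a power of two) that combines the first
        half of the array with the second half, then riffles the results together."""
        n = len(x)
        if n == 1:
            return list(x)
        half = n // 2
        sums  = [x[i] + x[i + half] for i in range(half)]
        diffs = [(x[i] - x[i + half]) * cmath.exp(-2j * cmath.pi * i / n)
                 for i in range(half)]
        evens = riffle_fft(sums)    # the even-numbered frequency bins
        odds  = riffle_fft(diffs)   # the odd-numbered frequency bins
        out = [0] * n
        out[0::2] = evens           # the "riffle": interleave the two half-results
        out[1::2] = odds
        return out

    # Quick check against a direct DFT
    data = [1, 2, 3, 4, 0, -1, -2, -3]
    direct = [sum(v * cmath.exp(-2j * cmath.pi * k * i / len(data))
                  for i, v in enumerate(data)) for k in range(len(data))]
    print(all(abs(a - b) < 1e-9 for a, b in zip(riffle_fft(data), direct)))  # True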

And that's how I figgered out that five perfect riffle shuffles will restore a sheepshead deck to its original order.


I never built this processor or even filed for a patent, but I still think about it whenever I am playing sheepshead and my deal comes up.

------------------------------------------

[1] The Dog in question can be plainly seen in the painting visible between my wrists in the riffle shuffle photo. This is a painting of Scrabble the Shih Tzu. For more examples of my sister's excellent portraiture, please have a look at Jean Kelly's website. Scrabble was also the model for my blog post on mathematical misnomers.

[2] Disclaimer - While I mention the cheesehead hat, show a picture, and provide a link for where to buy one, I neither promulgate nor disparage wearing the C.H. hat. Those who choose to wear one on a first date, job interview, or wedding ceremony do so at their own risk.