Thursday, September 27, 2012

Why does my cyan have the blues? (addendum)

I was asked a question about my previous blog about why the hue of ink sometimes changes when the ink film is increased.

I had a lovely plot (see below) that showed that Beer's law doesn't do all that bad of a job at predicting the ink trajectory (and the hook) of a magenta ink. The plot shows how close the match is in a*b*. Erik pointed out that I didn't show what is going on with L*. It could be that Beer's law works well in a*b*, but really messes up when it comes to L*.

The magenta hook, real and estimated

So, I had a look at this same data from a few other perspectives. Here is what the data looks like in the L*a* plane.

And here it is on the L*b* plane.

My conclusion is that it doesn't do so bad. Thanks Erik, for keeping me honest. Naturally, if it hadn't worked out I would have suppressed the results.

Wednesday, September 26, 2012

Why does my cyan have the blues?

When I started in the print industry as an apprentice to Gutenberg, I noticed that the folks in the press room called the inks red, yellow, and blue. This confused me. Everything I had read in color theory books said that cyan, magenta, and yellow were the subtractive primaries. These were the primaries that you use to make a wide range of colors with pigments and filters. Pigments and filters work by subtracting certain wavelengths of light. On the other hand, red, green, and blue were the additive primaries, and these were used to make all the colors when you are mixing light, as in a TV or computer monitor.

Polaroid snapshot of me working at my first job
Why were those silly printers using some of the additive and some of the subtractive primaries? Didn’t they realize that this reduced their gamut? That was the theory, anyway[1].
Just a naming issue?
Anyone who knows me, or who loves me[2] can attest to the fact that I am a firm believer that ignorance is the main explanation for every cultural and scientific phenomenon. In this case, my previous blog about counting colors provides a clue as to the sort of ignorance that might explain why magenta is so curiously called red.
The eleven people who read my previous blog learned that there are only eleven basic one-word color names in our active vocabulary. Neither cyan nor magenta made that list[3]. Clearly the folks on press were calling the inks “red” and “blue” because they have no other words to describe the colors.
Cyan ink is blue, and magenta is red
In my normal incisive way, it took me a few years to realize that the pressmen were not quite as ignorant as I thought they were. I guess I spent too much time running for buckets of halftone dots to actually put my head in a bucket of ink. When I finally did put my head in a bucket of ink (as part of a hazing[4] experiment) I could see that cyan ink is blue, and that magenta ink is red when you look at them in a bucket.
Cyan and magenta inks are blue and red in the can
Cyan and magenta inks are cyan and magenta on paper
So, this confusion is obviously beyond my original explanation. Just like when a fellow accidentally calls his wife by the name of a former girlfriend, you can bet there is something deeper going on.
Beer’s law revisited
In yet another very popular[5] blog of mine, I provided a charming explanation of Beer’s law. This blog post is a prerequisite for the following exciting discussion.
Let’s just say that we have a perfect magenta ink. A perfect magenta ink will reflect all the red light and all the blue light that hits it. As for the green light, a light shade of magenta might reflect about 10% of the green. A rich shade of magenta will reflect about 1%.
Now we bring in Beer’s law. Let’s say we start with that light magenta and add another layer of the same ink. Beer’s law would predict that the reflectance would multiply. Since perfect magenta reflects 100% of red and blue light, Beer’s law predicts that the double layer of magenta will reflect 100% of the red and blue light. Beer’s law would further predict that the green light would reflect at only 10% X 10%, which is 1%. A double layer of light magenta becomes a rich magenta.
Key point here: for this perfect magenta ink, the hue is still that of magenta. It still reflects most of the red and blue light, and absorbs most of the green light.
Let’s just say that we now switch over to a magenta that is less pure. Let’s just say that for some inexplicable reason, the publishers of Schlock magazine are unwilling to spend $100,000 per gallon for their ink. The bargain ink they decide to use does not reflect quite as much blue light as we would hope; maybe it only reflects 40% of the blue light when we put a thin film down, and maybe 10% of the green light. Let’s say that the red light is still reflected at 100%.[6]
What happens when we double the amount of ink on the paper?  Beer’s law takes over, and we see that blue light is reflected at 40% X 40% = 16%.  Green light? The reflectance goes from 10% down to 1%. Red light stays at 100%. The table below summarizes the Beer’s law estimation.

               Red     Green   Blue
Thin layer     100%    10%     40%
Thick layer    100%    1%      16%
From this table, it would seem that the thick layer of magenta is a lot closer to red. The plot below shows the actual spectra of two magenta patches, one at a larger ink film thickness than the other. The plot leads one to the same impression – that a thick layer of magenta is closer to red in hue than a thin layer.
Spectrum of a magenta ink, normal thickness and thick
The tentative conclusion is that magenta turns red when it is thick because it is impure, or more accurately, because there are several different reflectance levels in the spectrum. When Beer’s law kicks in, the areas of the spectrum where the reflectance is “mid-level” (e.g. 40% reflectance) are greatly affected by the ink film thickness.
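The arithmetic behind that conclusion can be sketched in a few lines of Python. This is only an illustration of the multiplication rule, using the made-up three-band reflectances from the Schlock magazine example; the function name `beer_stack` is just a label chosen for this sketch.

```python
def beer_stack(reflectance, layers):
    """Beer's law estimate: reflectance of `layers` identical ink films."""
    return {band: r ** layers for band, r in reflectance.items()}

# Hypothetical bargain magenta from the example above (thin layer).
thin = {"red": 1.00, "green": 0.10, "blue": 0.40}
thick = beer_stack(thin, 2)

for band in thin:
    print(f"{band:5s}  thin {thin[band]:5.0%}  thick {thick[band]:5.0%}")
```

The mid-level band (blue, at 40%) gives up 24 percentage points, while the band that was already dark (green) gives up only 9, and the fully reflecting band does not move at all.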
The plots below are the spectra of cyan and yellow inks. If the previous rule applies, then we would expect that cyan ink will have an appreciable change in hue as it gets thicker. From the plot of cyan ink, we see that the reflectance values between 500 nm and 600 nm are “intermediate”, somewhere between the highest value and the darkest value. This is the green range. As cyan ink gets thicker, we would expect the amount of green light reflected to drop.
Thus, based on Seymour’s rule of ink hue shift, a quick look at the plot below would suggest that thick cyan ink will be blue, just like thick magenta ink will be red. Yellow ink has very little in the way of intermediate values. It basically has either 75% reflectance or 3%. From that, you would guess that yellow ink will not change in hue. Note that a bucket of yellow ink does indeed look yellow.
Plots of cyan and yellow ink
But spectra can be a bit misleading when trying to discern color. I don’t know many people who can look at a spectrum and tell what the color is. So, I offer a little computational experiment to further validate Seymour’s rule of ink hue shift.
First, I will show the results. Then I will explain how I got them. The chart below shows the a*b* values of a set of ten magenta patches with increasing ink film thickness. These values are the ten blue diamonds in the plot. There is clearly a strong hook. The first five are pretty much along a line without much hue shift. The sixth one goes around the bend, and the last four are changing a lot more in hue than they are in chroma.
The magenta hook, real and estimated

For the other views of this data, I have published an addendum to this blog post.
The magenta colored line in the plot is a prediction of what I call the “ink trajectory”. This is the set of all L*a*b* values that an ink will go through as you change the ink film thickness. To compute this estimated trajectory, I started with the spectrum of the sixth patch and that of the paper. (You will note that the magenta line goes right through that point.) I loaded these spectra into a spreadsheet, and used Beer’s law to estimate the spectrum over a range of ink film thickness. You will note that the estimated trajectory comes reasonably close to predicting actual measured values, and definitely predicts the hook.
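One plausible reading of that calculation, offered as a sketch rather than the spreadsheet's actual formulas: treat the ink as a filter sitting on the paper, so that the reflectance at each wavelength is paper × (patch / paper)^t, where t is the relative ink film thickness and t = 1 reproduces the measured patch. The function name and the three-band spectra below are hypothetical, not the measured data.

```python
def estimate_spectrum(paper, patch, thickness):
    """Beer's law estimate of a patch spectrum at a relative ink film thickness.

    thickness = 0 gives bare paper; thickness = 1 gives the measured patch.
    """
    return [p * (r / p) ** thickness for p, r in zip(paper, patch)]

# Hypothetical reflectances in three bands (blue, green, red); not measured data.
paper = [0.85, 0.88, 0.90]
patch = [0.35, 0.08, 0.85]

for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    spectrum = estimate_spectrum(paper, patch, t)
    print(t, [round(v, 3) for v in spectrum])
```

Each estimated spectrum would then be converted to L*a*b* to plot a trajectory; that conversion is left out of this sketch.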
For those who want more detail, I have a little more description below. This is excerpted from a paper I presented at TAGA in 2008.
This pretty well settles it in my mind. Magenta ink on paper is magenta. Magenta ink in a bucket is red. I have explained this with some simple ciphering with Beer’s law. This led me to define Seymour’s rule of ink hue shift, which allows you to tell (just by looking at a spectrum), whether an ink will have an appreciable hook.
I then showed some really, really impressive results that show that, armed with just the spectrum of your paper and that of your ink on that paper, you can determine the magenta hook. This is clearly a triumph of modern science.
I have come a long way since I was ransacking the printing plant to find those elusive halftone dots!
This is where I admit to some of the lies in the previous section.
First off, Beer’s law is only an approximation. It makes the simplistic assumption that a photon will either pass right through the ink, or get absorbed. It does not make allowances for photons that reflect directly from the surface, or for photons that bounce around a bit in the ink and maybe come out of the ink without ever having visited the paper.
Despite those simplifications, it does fairly well, at least for the standard process inks. I do not have data to see whether it works for Pantone inks. If anyone has a cup of data to spare...
One limitation that I glossed over is that it does not do well at predicting the reflectance of a double layer of ink. Us folks in the know like to say that ink is “sub-additive”. Beer’s law will, however, give you a spectrum that is attainable, just not at that particular ink film thickness.
Well, that was kind of a lie as well. There are limitations, especially when you get up to the very high densities. You will note that my hook graph fits the data pretty decently, but it would not be nearly so good if I tried to predict the lightest density from the darkest, or the other way around.
There is one more lie, or one more pair of lies actually, but they are subtle. I demonstrated two ways of deciding whether the spectra of magenta showed a hue change. The first way was kind of hand-wavy. “Look at the spectra and see that it looks a lot like red. Ignore the little bump behind the curtain at 450 nm.”
Well, this argument may fly for someone who has not spent thousands of hours looking at spectra. But, if you have devoted a lifetime to deciphering spectra, you would know that sometimes the stuff happening down at the dark end is important. That little bump at 450 nm might just have a big effect on the color.
In this case, it didn’t. Converting to CIELAB demonstrated that the magenta is definitely turning red.
Or did it turn red? This is where the lie gets very subtle. We are trained from childhood to believe that colors with the same CIELAB hue angle are actually the same hue. But I have stubbornly disagreed with this all along. My first grade teacher almost flunked me over this point. I was glad to come upon a paper by Nathan Moroney where he made an off-hand comment that agreed with me.
The issue has to do with the fact that the CIELAB formula performs a nonlinear function on the XYZ values, which are a linear combination of the actual sensors in the eye, but which probably don’t actually exist in the eye or the brain. But that is grist for another blog.

[1] Yogi Berra said “In theory there is no difference between theory and practice. In practice there is.”
[2] I am still baffled as to why there are so few people who both know me and love me. Why is there no intersection between these two sets?
[3] Both words came into our language relatively late. Magenta became a word shortly after 1859, and cyan became a word in 1879. You wouldn’t expect them to become common words that quickly, would you? After all, look how long it took “internet”, “email”, and “perifarbe” to become common words.
[4] “Hazing” of course is some sort of print defect for gravure printing. Nothing to do at all with the old guys picking on the newbie.
[5] Popular? So far, seven people have read the Beer’s law blog post. Well, I should clarify. Seven people stumbled upon the blog post. It is perhaps optimistic of me to expect that all seven of them took the time to actually read the blog rather than just look at the really cool pictures.
[6] Standard process magenta ink is not all that perfect, and there is a “magenta” ink that is a bit closer to perfect. I am exaggerating just a tiny bit about the price of the alternative. I have not checked the price of Pantone Rhodamine just lately, but I think I can hook you up with a guy who can get you a gallon for something less than $80K. Unless of course, you are looking for ink jet ink.

Wednesday, September 19, 2012

How many colors are in your rainbow?

“Ok, mister smarty-pants Math Guy who thinks he’s a color scientist, answer me this!  Just how many colors are there? Huh?”  I get that question all the time. Boy have I got an answer for you. Or maybe a whole bunch of answers…
This blog is dedicated to Jerry Nelson, the voice of The Count from Sesame Street. Jerry died August 23, 2012.
The Count
The simplest answer is that there are three colors: red, green, and blue.
On the off chance that you don’t believe me (maybe you were gonna say there are more?) pull out a magnifying glass, or a microscope, and look at your computer monitor. Unless you happened to grab a scanning electron microscope[1], you probably see something like the image below. Your computer screen is a combination of red, green, and blue dots. Every color that you can see on your screen is a combination of those three colors. 
Picture of red, green, and blue pixels on a computer screen
For the sake of honesty, I have to say that this last part was something of a lie. Not really a lie, but perhaps misleading – like telling your wife that you are out with “the guys”. The thing is, you can’t get every possible color on your computer monitor. You can get a whole bunch of them, but have a look at the chromaticity diagram[2] below.
The chromaticity diagram, showing the gamut of a hypothetical computer monitor
The black triangle shows a hypothetical gamut for a computer monitor. The three vertices of the triangle represent the colors of the red, green, and blue pixels. By mixing the colors, you can reach any color within the triangle.
You will note that there are colors that are outside the gamut of this hypothetical monitor. In fact, it is impossible to build a computer monitor with three fixed lights that will display all possible colors. Chromaticity space is bowed out, so you can’t make a triangle with physically realizable colors (colors within the horseshoe shape) that covers all possible colors[3].
Here’s another fun experiment. Go into Photoshop or Paint or whatever program allows you to select colors. Try to make orange. Not only can’t you find a word to rhyme with orange, but you can’t make a good orange on a computer monitor[4]. The best I could do looks a little brownish.
The best orange I can make on my monitor (RGB = 255, 192, 0)
So, my first answer to the question is that you can make a good percentage of all possible colors out of just three colors. Four colors might be a bit better.
How about the number of colors in the rainbow? The colors from the rainbow can be combined to make every possible color. Isaac Newton did some research with prisms and sunlight, and decided there were seven colors in the rainbow: red, orange, yellow, green, blue, indigo, and violet (ROYGBIV). The other colors I get, but what about indigo? I’m not sure I even know what indigo is!
Here is my explanation of how indigo got into the rainbow. Newton saw a big section in the blue that seemed just too wide to be a single color. Now, Newton was something of a mystic, so he saw a fundamental connection between the seven elements, the seven planets, the seven notes of the tone scale on the piano, and the seven colors of the rainbow.
Or maybe he looked at the color just below 500 nanometers, and rather than call it light blue or cyan, he decided that was blue. Having no other name, he was forced to use indigo to name the color that I might call blue or possibly dark blue. This is at least plausible, since the word “cyan” didn’t come into use in the English language until 1879.
My second answer to the question of how many colors there are is seven, or maybe six.
Sixteen million
Now I’m going to go to the other extreme, and claim that there are 16,777,216 colors, and that my computer monitor can prove it. Those of you who recognize this number as 256 X 256 X 256, will immediately understand that this is the number of RGB combinations you get when you have 256 levels of red, and 256 levels of green, and 256 levels of blue. If you do not recognize this number I’m sorry, but you are a poor excuse for a computer geek.
Just because I can go into Photoshop and make 16 million colors, does that mean there are really that many colors? The image below might persuade you that there are (possibly) not that many. One of the rectangles has the RGB values 255, 255, 255. The other has the RGB values 254, 254, 254. I don’t know what you see on your screen, but I can’t really tell the difference.
In other words, while there are 16 million possible combinations of RGB, not all of them are distinct “colors” according to the eye.
Which rectangle is brighter?
My third answer is that there are 16 million colors on my computer display, but that might be a little bit of computer hype.
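Both halves of that answer can be checked with a short Python sketch. The count is plain arithmetic. The lightness comparison assumes the monitor follows the sRGB transfer curve, which is an assumption of this sketch and not something measured for the screen in question; the function name is likewise made up here.

```python
def srgb_gray_to_lstar(level):
    """CIELAB L* of a neutral gray with 8-bit value `level`, assuming sRGB."""
    c = level / 255
    # Undo the sRGB transfer curve to get relative luminance Y (0..1).
    y = c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    # CIE lightness formula, with its linear segment for very dark values.
    return 116 * y ** (1 / 3) - 16 if y > (6 / 29) ** 3 else (29 / 3) ** 3 * y

total_colors = 256 ** 3
delta_l = srgb_gray_to_lstar(255) - srgb_gray_to_lstar(254)

print(total_colors)        # 16777216
print(round(delta_l, 2))   # lightness gap between the two rectangles
```

Under the sRGB assumption the two rectangles differ by roughly a third of an L* unit, which fits with not being able to tell them apart.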
I taught an algebra class at UW Milwaukee. One day I gave a pop quiz to my two classes. I asked them to take out a sheet of paper and a writing implement. I gave them two minutes and asked them to write down all the single-word color names that they could think of.
The eleven colors that everyone can think of
I had 50 students that took the quiz, half male and half female. Almost everyone – 48 of the students – came up with the names of the eleven colors in the picture above: white, black, gray, red, orange, yellow, green, blue, pink, brown, and purple. If I remember correctly, there were two students who said that their art teacher told them that black is not a color; it is the absence of light. So that explains that. What do art teachers know?
The next two colors down the list were silver and gold, each with about half of the students recalling those colors. I am going to argue with the art teacher that silver and gold are not true colors. They are gonio-apparent effects. And I am right, since this is my blog.
So, my fourth answer is that there are eleven basic colors that everyone can recall.
An interesting thing showed up in the data. There was a statistically significant difference between the number of colors that the men could recall versus the number that the women could recall. Men averaged 15, and women averaged 18. Not a big difference, but my sample was large enough that it was significant. Now, it could be that my sample was not representative of the population since there is a bit of a gender difference in math performance and this was an introductory algebra class taught in a college. OR it could be that women are inherently more color conscious. I have no real explanation, other than stating what I observed.
The student with the most colors was a female art student, who recalled 29 colors in the two minutes allotted. I have given this test a number of other times, in particular, to my wife. In two minutes, she was able to recall over 50 single word color names. After the two minutes were up, she kept writing down single word color names for a few hours after, eventually compiling a list of nearly 250. Can there be any question about why I fell in love with this woman?
Answer number five: there are somewhere around 250 colors that my wife can name.
Something like fifteen
I performed another experiment, this time on myself. Unlike many of my other experiments, this one did not involve mind-altering chemicals. It involved colors. I started with a big bucket of possible colors, and tried to assign each of these to a color family. I started with the eleven basic colors as my names for color families.
There were a lot of colors that fit in more than one family. For example, yellowish orange belongs in both the yellow and the orange families. And reddish green fits into both red and green[5].
As I went through my bucket of colors, I came up with a few sets of colors where I wasn’t comfortable with putting the color in any of the families. One group could be called beige, taupe, off-white, or possibly eggshell. These were colors that were really not white, but they were not saturated or dark enough to be called brown or yellow. Another misfit color was tan. Is it brown, or yellow? Not really either, I think. Or maybe it’s really brown.
Another group that didn’t seem to have a proper color family included the colors coral and fuchsia. Were these red, or orange, or maybe pink? I couldn’t decide. Maybe it is in the purple family? Nothing seemed to fit.
And then there was plum or burgundy, kind of a brownish dark purple. Maybe sea foam was another group, and that group might include the color that my wife argues with me about. Cyan, turquoise, aqua, powder blue, sky blue, teal, periwinkle, azure, cerulean… (Are you kidding?!?!??  Periwinkle doesn’t belong with those colors!)
I count three or four or maybe five additional color families, so my sixth answer is that the total number of color families is around fifteen. But, I have no idea what a different observer might say. I hesitate to run this experiment on my wife.
One hundred fifty or two thousand
It wasn’t that long ago that the biggest box of Crayola crayons available (with 96 crayons) was proudly displayed on my desk. Today, I see that you can buy a set of 150 crayons for the low price of $14.97. A must have for any serious color scientist. My birthday is coming up, by the way.
Crayola 150-Count Telescoping Crayon Tower
Crayola is not the only company that is compelled to add more colors. Pantone just recently announced the addition of 336 colors to their color formula guides, bringing them up to “1,677 chromatically arranged color choices to unleash their passion and let their creativity soar.” I heard many passions unleashed by printers at the necessity to drop another $100 on their color matching books.
Not to be outdone, the Valspar American Tradition paint swatch book has 1,764 colors, including “Homestead Resort Parlour Raspberry”, “Misty Morning Blue”, and the very popular “Swampwater”. Swampwaters were very popular when I was in college. No one knew what went into them.
So, the number of commercially distinguishable colors is somewhere between 150 and 2,000. My seventh answer to the question.
Two million
Now we get to some more scientific answers. Let me make the question a bit more precise. How many distinct colors can be reliably distinguished by people with normal color vision?
The quick and dirty answer comes from looking at the bounds of CIELAB space[6]. CIELAB space was designed so that one step in any direction is approximately “just noticeable”. The lightness value goes from 0 to 100. In the red to green direction there are let’s see, how many steps? Lemme check my copy of Wyszecki and Stiles. Hang on a sec… Still looking…
Ok, so they don’t say how big color space is in that direction. Or in the blue to yellow direction. Let’s just assume that there are something like 200 steps in each direction. That means that the rectangle that fits all of color space is 100 X 200 X 200, or about four million.
Of course, this assumes that color space is a box-shape (rectangular prism). But color space is not rectangular. Maybe it’s more like an ellipsoid? My buddy Adam has referred to the shape as a “space potato”. Let’s just say that it’s an ellipsoid, in which case, the volume is about half of the volume of the box, or two million. Answer number eight.
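The box-versus-potato estimate is easy to reproduce. In this sketch the 200-step widths are the guess made above, and the potato is idealized as an ellipsoid that just touches all six faces of the box, which is an assumption of the sketch.

```python
import math

l_steps, a_steps, b_steps = 100, 200, 200   # the 200s are the guess made above

box = l_steps * a_steps * b_steps
# An ellipsoid inscribed in a box fills pi/6 of it (about 52%).
ellipsoid = math.pi / 6 * box

print(box)               # 4000000
print(round(ellipsoid))
```

The inscribed ellipsoid comes to about 2.09 million steps, which is where "about half of the box" comes from.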
Some other number?
How many lies have I told so far? Well, I need to admit to another one, or at least another misleading statement. (Honest, I was out with the guys that night.) I said that “CIELAB space was designed so that one step in any direction is approximately just noticeable.” I did put the word “designed” in italics to tip off that this might be a white lie.
I have to admit that, yes, this was the design goal, but it is not really all that exact. For example, if you take five steps in the direction from saturated yellow to really saturated yellow, you will just barely be able to tell the difference in color. On the other hand, if you have a color near gray, you can get away with only about half a step before you can see a difference.
This means that the reconnoitering in that last section is a bit flawed. What’s the real number? I honestly don’t know. I have some ideas on how to compute it, though. I do know that there are only about 70 steps in the lightness direction, whereas CIELAB says there are 100.
My ninth and final answer: Someday I will actually write some code to figger it out, but I suspect the number of discernible colors is around 689,262. Then again, maybe it’s the Count’s favorite number: 34,969.
So, how many colors are there? I dunno. It depends on how you ask the question and who you ask. Pick a number between 3 and 16,777,216.

[1] SEM images are still back in the days of black and white. If you see an SEM image with any other colors in it, it has been colorized. Someday, someone will figger out how to make electrons with different colors, and then the images will be really, really cool. I think I’ll go file for a patent on that idea.
[2] The chromaticity diagram was an early attempt at trying to turn the spectral response of the eye into something that explained our perception of color. It was replaced by other mathematical models (in particular, CIELAB), but is still the easiest way to understand the gamut of a set of light sources.
[3] Sharp introduced the Aquos Quattron display in September of 2010 which added a fourth color of pixel, yellow. For this monitor, the gamut is expanded into a quadrilateral that gives you more yellow and orange colors.
[4] Speaking of orange… I think that carrots are a richer orange than oranges are. I think we should swap the names of these two foods.
[5] Ok, I was just kidding about that one. Reddish green isn’t a color; it’s the name of the band I am going to form after I retire from all this color stuff and learn to play the sax.
[6] CIELAB was the next big step forward in coming up with a measurable number that corresponds intuitively to our perception of color. For dinner, a movie, and plane tickets, I will come to your living room and give the CIELAB lecture to you and ten of your most intimate friends.

Wednesday, September 12, 2012

Finding the right model

Yogi Berra once said that “predictions are hard, especially when they are about the future”. Or maybe it was Niels Bohr who said it? Or Casey Stengel, or Mark Twain, or Sam Goldwyn, or Dan Quayle? Nobody is sure who first said it, because the past is also hard to predict.
I offer here one explanation of what makes prediction hard. It has to do with finding the right underlying model.
Which is the right model?
The population growth problem from 6th grade
The year was 1970. I was in sixth grade. The teacher gave us an exercise in looking for patterns which would (as a side effect) increase our awareness of the global population problem. We were given the data in the first two columns of the table at the right. Note that the years listed are for successive doublings of the world population. Our assignment was to compute the doubling time, shown in column three, and then predict the world population in the year 2020.
Year        Population     Doubling time
4331 BC     50 million
1131 BC     100 million    3,200 years
470 AD      200 million    1,600 years
1270 AD     400 million    800 years
1670 AD     800 million    400 years
1870 AD     1.6 billion    200 years
1970 AD     3.2 billion    100 years
Sixth grade assignment
We were supposed to notice that each successive doubling time is half the previous doubling time. The world population doubled between 1870 and 1970 (in 100 years), so the next doubling to 6.4 billion would require 50 years. According to the rule we determined in class, the world population would reach 6.4 billion in 2020.
Well, the estimate was a bit off. We reached 6.4 billion in 2005. Clearly the prediction was not drastic enough. But wait…
Taking this a step further, we would expect that the world population would hit 12.8 billion in 2045, and 25.6 billion in July of 2057. In sixth grade, there was something unsettling about this. Of course, the idea of unrestrained population growth was alarming, but somehow I couldn’t help but think that there was something wrong there.
The ludicrousness of this prediction only occurred to me years later. According to the model, the doubling period would eventually reach the very short time span of nine months. Now, in order for the population to double in this amount of time, every woman on the planet, aged 1 to 101, would need to be pregnant, and must give birth to twins[2]! The next doubling would occur in only four and a half months, so all the twins would need to be pregnant when they are born... What a curious world our descendants will live in!
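The classroom rule is simple enough to run forward mechanically. The sketch below starts from the 1970 figures and halves the doubling time at every step; the function name and the choice of eight steps are just for illustration.

```python
def doubling_schedule(year=1970.0, population=3.2e9, doubling_time=100.0, steps=8):
    """Apply the classroom rule: each doubling takes half as long as the last."""
    schedule = []
    for _ in range(steps):
        doubling_time /= 2
        year += doubling_time
        population *= 2
        schedule.append((year, population, doubling_time))
    return schedule

for year, population, doubling_time in doubling_schedule():
    print(f"{year:8.2f}  {population / 1e9:7.1f} billion  "
          f"(took {doubling_time * 12:6.1f} months)")
```

The seventh doubling takes a little over nine months and the eighth under five. Because 50 + 25 + 12.5 + ... sums to 100 years, the doubling dates pile up just short of 2070 and the model predicts an infinite population at that point.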
In the year 2081

In his book 2081: A Hopeful View of the Human Future, Gerard K. O'Neill made an interesting historical observation. He looked at the typical speed at which people travelled in each century. Here is his chart.

of travel
Top speed
60 MPH
600 MPH
6,000 MPH
He noted that every century, there has been a ten-fold increase in speed, so it would follow logically that in another century we will hop into our mass transit vehicles (whatever they may be) and speed off at eight times the speed of sound. Maybe that’s not unreasonable. If we are taking pleasure trips to the moon, then that might actually be rather slow.
Now if we take his formula further forward, we can see that in the year 2681, people will be regularly making trips at about ten times the speed of light. This stretches my credulity a little bit. I prefer to obey speed limits, especially when it comes to the speed of light. Clearly science fiction writers disagree with me on that.
If we go the other direction, we can see that O’Neill’s formula falls into ridiculousville almost immediately. His formula would predict that a typical speed of travel would be 0.6 MPH, about one-fifth of a person’s normal walking speed.
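The ten-fold-per-century rule can be extrapolated in both directions with a few lines of Python. The anchor point of 6,000 MPH in 2081 is an assumption of this sketch, inferred from the book's title and the "another century" remark; the function name is also made up for the illustration.

```python
MPH_SPEED_OF_LIGHT = 670_616_629  # speed of light in miles per hour

def typical_speed(year, anchor_year=2081, anchor_mph=6000.0):
    """Ten-fold increase per century, pinned to an assumed anchor point."""
    return anchor_mph * 10 ** ((year - anchor_year) / 100)

for year in (1681, 1781, 1881, 1981, 2081, 2581, 2681):
    mph = typical_speed(year)
    print(year, f"{mph:,.1f} MPH", f"{mph / MPH_SPEED_OF_LIGHT:.2e} c")
```

Under that assumption the same straight line on a log plot that looks sensible for a few centuries drops below walking pace going backward and passes the speed of light going forward.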
What we have here is another example of an inappropriate mathematical model being used to make predictions. We might as well play off the fact that the word “train” has half as many letters as “stagecoach”, and that “jet” has half as many letters as “train”. (Well… kinda.) The next big leap forward in travel will have 1.25 letters.
The exponential growth of energy usage
I had a sense of déjà vu when I took Environmental Geology in college. There was a homework question that was aimed at impressing on us the disastrous implications of unbridled exponential growth.
The problem stated that the annual world-wide energy usage had been increasing throughout the century at an annual rate of 5%. The problem went on to state that there is a theoretical upper bound on the total energy available if all the matter in the entire Earth is converted into energy. This upper bound is stated in Einstein’s formula E = mc². Figures were given for the mass of the Earth and current energy usage, and the question was asked: In what year would our annual energy requirements equal the total available energy?
The answer we arrived at was only a few millennia away, perhaps the year 3500, or maybe 10,000, I honestly don’t remember. I do remember the next question, and the answer that I gave. The question was “What do you conclude?” I knew that the correct answer was “It is high time for us to get off our lazy butts and do something about the energy crisis, because in fewer than 100 generations there will be a million billion trillion people standing on the last crumb that is left of the Earth.”
I knew that, but I was stubborn. My answer was “Nothing”. For some reason, I did not receive full credit for that answer. The university environment does not favor the creative mind. Or the lazy one, either.
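For the curious, the arithmetic behind that homework problem can be sketched in a few lines of Python. The figures from the assignment are not reproduced in this post, so the mass of the Earth, the speed of light, and especially the current annual energy usage (`CURRENT_USAGE_J_PER_YEAR`) are stand-in values assumed for illustration. The result should not be read as the answer the class arrived at.

```python
import math

# Assumed inputs (not the figures from the original homework).
EARTH_MASS_KG = 5.972e24             # approximate mass of the Earth
SPEED_OF_LIGHT_M_S = 2.998e8         # approximate speed of light
CURRENT_USAGE_J_PER_YEAR = 5e20      # rough guess at world energy use per year
GROWTH_RATE = 0.05                   # 5% per year, as stated in the problem


def years_until_usage_equals_total(usage, total, rate):
    """Solve usage * (1 + rate)**n = total for n."""
    return math.log(total / usage) / math.log(1.0 + rate)


total_energy_j = EARTH_MASS_KG * SPEED_OF_LIGHT_M_S ** 2   # E = mc^2
years = years_until_usage_equals_total(
    CURRENT_USAGE_J_PER_YEAR, total_energy_j, GROWTH_RATE
)
print(f"Annual usage matches the Earth's mass-energy in about {years:.0f} years")
```

Because the answer depends only on the logarithm of the ratio, even a factor of ten error in the assumed usage shifts the result by only about 47 years. The real sensitivity is to the assumption that the growth stays exponential.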
My comeback
I had lost one point on a homework assignment that was worth 1% of my grade in the class. I should have just stopped right there, but I had a point to make. It was all about the principle of the thing. I spent hours and hours defending my short answer.
First, I amended my answer a little, from “nothing” to “nothing, because the answer depends a great deal on the underlying mathematical model that is assumed for the growth of energy usage”. Then, I got out my slide rule and did some curve fitting. I don’t have the original work. I am sure that the paper I wrote it out on has long since crumbled to dust. I will reproduce the salient aspects of it.
In this first graph, I show some data that I cooked up. The data is an exponential curve with some noise added to it. In the graph, I also show the least squares fit of an exponential curve. This shows that the data can be approximated with an exponential curve that is increasing at 5.33% per year for 50 years.
Hypothetical energy usage, with exponential curve
I will take this to be a reasonable approximation of what energy usage data might look like. And, based on this data, a scientist or policymaker might come to the conclusion that energy usage is going up at a rate of about 5% per year. (Actually, the data I have dug up looks more chaotic than this. Wars and other catastrophes do a good job of being unpredictable.)
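Here is a minimal sketch of how such cooked-up data and an exponential fit might be produced. Everything in it is an assumption made for illustration: the seed, the starting value of 10, the 5% underlying rate, and the noise level are invented, and fitting a straight line to the logarithm of the data is just one common way to do a least squares exponential fit, not necessarily the one used for the graph above.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the run is repeatable
t = np.arange(51)                         # years 0 through 50
true_rate = 0.05                          # assumed underlying growth rate (log scale)
y = 10 * np.exp(true_rate * t) + rng.normal(0, 1.0, t.size)   # made-up noisy data

# A straight-line least squares fit to log(y) corresponds to an exponential in y.
slope, intercept = np.polyfit(t, np.log(y), 1)
annual_growth = np.exp(slope) - 1         # convert the log slope to growth per year

print(f"fitted growth: {100 * annual_growth:.2f}% per year")
print(f"fitted starting value: {np.exp(intercept):.2f}")
```

One design note: fitting in log space minimizes relative errors rather than absolute ones, so it leans a little harder on the early, small values than a direct fit to the raw data would.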
In this next graph, I show that same data, but this time it is approximated by a parabola. Looking at this curve fit, I don’t think I could testify in court that the data must be an exponential. A parabola doesn’t do too bad at fitting the data.
Hypothetical energy usage, with parabolic fit curve
Although, maybe the fit is not so good at the far left side? The parabola dips downward just a tiny bit in the first few years, and the data seems to be going upward. I can fix this fairly easily by using a cubic parabola, a third order polynomial. Just looking at the graphs, I can see no reason why someone would reject this fit over the fit of the exponential.
Hypothetical energy usage, with third order polynomial fit
Why stop at third order? Just for grins, I had a look at fitting the data with a fifth order polynomial. Once again, the fit looks pretty good.
Hypothetical energy usage, with fifth order polynomial fit
Why not try seventh order? Well, I did try it, and I think I will reject this one, maybe just for aesthetic reasons. The curve is a little bumpy. I am not sure the data has enough evidence to support those bumps.
Hypothetical energy usage, with seventh order polynomial fit
Taking a brief excursion back to ridiculousville, I tried using a 20th order polynomial to fit the data. Clearly the wiggles in this polynomial are not a true feature of the real data, but a feature of the noise. (Those who read my post on When regression goes bad will understand why this failed.)
Hypothetical energy usage, with twentieth order polynomial fit
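The parade of polynomial fits above can be imitated with a short script. This is a sketch on invented data (the seed, starting value, growth rate, and noise level are all assumptions), and it uses NumPy's `Polynomial.fit`, which rescales the time axis internally so that the high order fits stay numerically stable.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)            # fixed seed so the run is repeatable
t = np.arange(51)                         # years 0 through 50
y = 10 * np.exp(0.05 * t) + rng.normal(0, 1.0, t.size)   # made-up noisy data


def rms_residual(degree):
    """Fit a polynomial of the given degree and return its RMS misfit."""
    p = Polynomial.fit(t, y, degree)      # least squares fit on a rescaled axis
    return float(np.sqrt(np.mean((p(t) - y) ** 2)))


rms = {d: rms_residual(d) for d in (2, 3, 5, 7, 20)}
for d, r in rms.items():
    print(f"order {d:2d}: RMS residual = {r:.3f}")
```

The residual can only shrink as the order goes up, because each higher order polynomial contains the lower ones as special cases. A smaller residual therefore says nothing about whether the extra wiggles are real; at 20th order the curve is mostly chasing the noise.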
I did one last curve fit. This one starts with some reasonable assumptions about growth. An exponential curve is a reasonable approximation for growth of most physical things at the onset, but eventually in any real system, there has to be some saturation. Bunnies multiply, but eventually they run out of food.
For anything real, there has to be constrained growth. One commonly used model for this is the logistic curve. This curve shows initial exponential growth, but the growth gradually slows down as it approaches an asymptote. Once again, the fit looks fairly reasonable.
Hypothetical energy usage, with logistic curve fit
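To make the shape of the logistic curve concrete, here is a small sketch of the function itself. The parameter names (`ceiling`, `rate`, `midpoint`) and the values plugged in are hypothetical, chosen only to show the two behaviors described above; they are not the fitted values behind the graph.

```python
import math


def logistic(t, ceiling, rate, midpoint):
    """Logistic curve: near-exponential growth early on, levelling off at `ceiling`."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


# Hypothetical parameters: saturate at 1000, about 5% early growth, midpoint at year 100.
ceiling, rate, midpoint = 1000.0, 0.05, 100.0

early_ratio = logistic(1, ceiling, rate, midpoint) / logistic(0, ceiling, rate, midpoint)
late_value = logistic(500, ceiling, rate, midpoint)

print(f"year-over-year growth factor near t = 0: {early_ratio:.4f}")
print(f"pure exponential growth factor:          {math.exp(rate):.4f}")
print(f"value at t = 500: {late_value:.3f} (ceiling is {ceiling:.0f})")
```

Far below the ceiling, the growth factor is nearly the same as for a pure exponential, which is why the two are hard to tell apart from early data alone.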
The punch line
So far, all I have done is demonstrate that a number of curves can be bent around to look like a noisy exponential growth curve. At arm’s length, they all do a modest job at approximating the data that I provided. While some curves are somewhat better than others, there is no slam dunk best curve. Hang on to that thought, because the punch line is coming.
Each of the curves that I have fit to the data can be used to predict what the energy usage will be at some later date. I have gone through that exercise with each of these curves to yield a prediction about the energy usage in year 100 (50 years beyond the end of the data), and at year 500.
Model: Year 100, Year 500
1.919 X 10^11
Fifth order: -1.608 X 10^6
Sixth order: -3.662 X 10^8
Seventh order: -1.668 X 10^10
Eighth order: -3.798 X 10^11
Fifteenth order: 4.355 X 10^10, 5.756 X 10^22
Twentieth order: -5.857 X 10^13, -4.471 X 10^29
I hope you are saying wow. These equations all looked kind of similar from t = 0 to t = 50. Even at t = 100, we have mega-ginormous disagreements on what the energy usage will be: anywhere from the silly value of -6 X 10^13 up to the tremendous value of 4 X 10^10. I think I can safely say that I have made my point. The underlying choice of model might not matter much for interpolation, but for extrapolation, the choice of model can change an estimate by many orders of magnitude.
Actually, the only model that did not give huge answers for the 500 year estimate is the logistic model, the mathematical equation that was designed to model constrained growth. Hmmm…
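The extrapolation exercise is easy to repeat at home. The sketch below fits an exponential and a few polynomials to invented data covering years 0 to 50, then asks each model for its opinion about years 100 and 500. The seed, starting value, growth rate, and noise level are assumptions, so the printed numbers will not match the ones quoted above; the point is only how far apart the models land.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)            # fixed seed so the run is repeatable
t = np.arange(51)                         # years 0 through 50
y = 10 * np.exp(0.05 * t) + rng.normal(0, 1.0, t.size)   # made-up noisy data

# Exponential model, fitted as a straight line through log(y).
slope, intercept = np.polyfit(t, np.log(y), 1)


def exp_model(x):
    """Evaluate the fitted exponential at time x."""
    return np.exp(intercept + slope * x)


models = {"exponential": exp_model}
for degree in (2, 3, 5, 7):
    models[f"order {degree}"] = Polynomial.fit(t, y, degree)

# Ask every model for a prediction well outside the fitted range.
predictions = {name: (float(m(100)), float(m(500))) for name, m in models.items()}
for name, (p100, p500) in predictions.items():
    print(f"{name:12s} year 100: {p100: .3e}   year 500: {p500: .3e}")
```

All five curves hug the same 51 points, yet their year 500 answers differ by many orders of magnitude.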
Considerable effort usually goes into finding an equation that fits the existing data. Often, a variety of equations are tried and the one that best fits the existing data is chosen.
This typical process omits a crucial step. That step is getting to know the data, understanding the natural constraints, and looking at the forces that drive the values up or down. This knowledge should drive the choice of mathematical model.

[1] Little bit of trivia here... the year before 1 AD was 1 BC, rather than 0. The correct difference between these two years is thus 1600, rather than 1601.
[2] I assume that there are an equal number of men and women. If there is only a single, very busy man, the women can be spared having to have twins.