Wednesday, October 31, 2012

Just some random thoughts

Today's blog was inspired by a good friend and co-worker, Pat. He sent me a link to a brilliant piece of erudition / social commentary.

Random math paper generation

A fellow by the name of Nate Eldredge wrote a program called MathGen that built on the previous program called SCIgen. This clever piece of work will randomly generate math papers. Papers in pure math, ready for submission to a journal. Gibberish papers. 

It seems that someone actually took the challenge and submitted this randomly generated paper to the journal Advances in Pure Mathematics. I am sure that most of my readers have a copy of this journal on their nightstand. I know I keep mine handy.

The paper was accepted. Well, to be honest, it was provisionally accepted. The editor had the gall to ask for a few things to be cleaned up. This oppressive editor complained that he couldn't quite catch the main thought of the article from the abstract.

Naturally, I had to try generating my own paper. And since I am more or less a vain fellow, hell-bent on self-aggrandizement, I put my name on the paper. Here is the abstract of my paper.

I will admit to being quite proud of my efforts. I haven't a clue what it means, although I recognize several of the words. But still, I am proud to have my name at the top.


That was the erudite part. Taken in small pieces, the text is hard to distinguish from virtually any paper on pure math. Clever little program.

What about the social commentary part of this? Here are some of the conclusions that I could draw from this story.

Conclusion 1 - The journal that provisionally accepted this manuscript? They certainly have egg on their face. What else can I say?

Conclusion 2 - The papers that MathGen writes look a lot like every other pure math paper that I have gone out of my way to not read, so I can hardly fault the editor for not taking the time to understand this random paper. Let's face it. The truth is, pure math papers are all gibberish anyway. I have seen several pure math papers that supply proofs of this.

I should explain here that "John the Math Guy" is not my full name. My full name is "John the Applied Math Guy". I should further explain that applied and pure math guys are not always the best of friends. This is not well known, but applied and pure mathematicians have been at odds ever since Aristotle got in a fistfight with Archimedes over an urn that one of them broke during some bacchanalia. The remnants of those pottery shards and ancient arguments can still be seen in the applied versus pure math jokes that have been going around. I am sure you have heard all of them [1]. 

Conclusion 3 - The MathGen program has demonstrated some level of artificial intelligence, perhaps not quite up to the Turing test [2], but it was up to the task in a rather limited scope. This is perhaps not all that surprising, since the Turing test is not all that hard, provided the circumstances are narrow enough. The 1966 program ELIZA did a decent job of convincing people that it was a psychotherapist.

Many lines of code have passed through the compiler since ELIZA. What is the state of the art now? Are other programs capable of producing creative work that rivals humans?  

Random song generation

Years ago, when I got my first computer (it was a Radio Shack Color Computer with 48K of RAM), I played with writing a program that would generate songs at random. The first approach was, well, lousy. It sounded a lot like music by John Cage. This is not all that strange, since Cage wrote one of the first totally aleatoric pieces of music, "aleatoric" meaning "composed by throwing dice".

Many times I have pondered ways to make random music sound, well, less random. Rules could be added like "the piece should end on the tonic", and "small steps in pitch are more likely than large ones", and "there is an underlying chord pattern in the song that the notes might want to follow", and, an important one, "breaking the rules once in a while makes the song interesting." I have never taken this idea any further. I guess I am just too lazy to sit around and have my computer write music for me. 
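Here is a minimal sketch of how a few of those rules might look in code. It is only an illustration, not the program I wrote back then: the scale, the step weights, and the names SCALE, random_melody, and leap_chance are all made up for the example, and the "underlying chord pattern" rule is left out entirely.

```python
import random

# C major scale as MIDI note numbers; index 0 is the tonic (an assumption for this sketch).
SCALE = [60, 62, 64, 65, 67, 69, 71, 72]

def random_melody(length=16, leap_chance=0.1, rng=None):
    """Random walk over SCALE: mostly small steps, an occasional leap, ends on the tonic."""
    rng = rng or random.Random()
    idx = 0  # start on the tonic
    melody = [SCALE[idx]]
    for _ in range(length - 2):
        if rng.random() < leap_chance:
            # "Breaking the rules once in a while": jump anywhere in the scale.
            idx = rng.randrange(len(SCALE))
        else:
            # "Small steps in pitch are more likely than large ones."
            step = rng.choice([-2, -1, -1, 0, 1, 1, 2])
            idx = min(max(idx + step, 0), len(SCALE) - 1)
        melody.append(SCALE[idx])
    melody.append(SCALE[0])  # "the piece should end on the tonic"
    return melody

print(random_melody(rng=random.Random(2012)))
```

Raising leap_chance toward 1.0 should push the result back toward the John Cage end of the spectrum.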

I would assume that I am not the only one who has thought along these lines. I did a very tiny amount of searching and found that Wolfram [3] has put at least some effort into this.  This is cool, cuz these folks are really sharp and they know lots of stuff and everything.

Here is their music generator website. I played with this a while, trying to get a little blues riff going. Blues is one of the choices they have for genre. I figgered this would be a good choice, since the eight-bar blues chord progression is a piece of cake to master [4]. I selected a few instruments that I might like to hear doing blues.

I can't express my level of disappointment. If John Lee Hooker ever met John Cage... Clearly since this percolated up to the top of the Google search, it must represent the absolute bestest available in random music generation technology. Sad.

Random artwork generation

Maybe random generation of art is more state-of-the-art? I poked around the internet a bit and found a website where I could generate my own random artwork. I used the title "Random art?" and got the fabulous image below. (This program evidently uses the words you type in to reseed its random number generator, since the same name always gives you the same artwork.)

Random Art?

I don't have to tell you that this piece is absolutely gorgeous. I had it printed on a 2 ft by 2 ft canvas and framed it for my wall. But, is this up to par with what a human artist can do? Have a look at the piece below, and you be the judge [5].

Yes, I agree. Monet couldn't hold a candle to either of these incredible pieces of art.

Random poetry generation

Poetry is another field where the old dice can give a practitioner a run for his money. 

The algorithm for one automatic poetry generator is pretty transparent. You give the poetry generator lists of nouns and of adverbs and all that. You give it a template for the poem, telling it what part of speech to use where. You push the button and it will fill in the blanks with randomly selected words from your lists. Tedious, simple, and boring.
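The fill-in-the-blanks idea can be sketched in a few lines. This is a guess at the general approach, not the code behind any particular generator; the word lists, the template, and the name write_poem are invented for the example.

```python
import random
import re

# Hypothetical word lists, keyed by part of speech.
WORDS = {
    "adjective": ["random", "erudite", "oppressive", "gorgeous"],
    "noun": ["journal", "histogram", "urn", "canvas"],
    "verb": ["generates", "submits", "ponders", "frames"],
    "adverb": ["provisionally", "vainly", "clearly", "randomly"],
}

# Hypothetical template: each {slot} names the part of speech to use there.
TEMPLATE = [
    "the {adjective} {noun} {adverb} {verb}",
    "a {adjective} {noun}",
    "{adverb}, the {noun} {verb} the {noun}",
]

def write_poem(template, words, rng=None):
    """Replace every {slot} with a randomly chosen word of that part of speech."""
    rng = rng or random.Random()

    def pick(match):
        return rng.choice(words[match.group(1)])

    return "\n".join(re.sub(r"\{(\w+)\}", pick, line) for line in template)

print(write_poem(TEMPLATE, WORDS))
```

Every slot gets its own independent draw, so the two {noun} slots in the last line will usually (but not always) come out different.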

On the other hand, it is my understanding that the algorithm used by this program is not all that different from the one used at the heart of MathGen.

You know what would be nice? It would just be nice if the poetry generator would save me the effort of typing in words and just go out and search the web for seed words. I dunno, maybe I could give it a list of my patents, and it could turn them into some love sonnets. I am sure that would get me the babes.

This next one was quite a bit less work for me, and it actually sounds like poetry. At least to me, someone who really doesn't dig poetry. I guess a poetry aficionado would read this and make sense of it. Maybe even think it's deep?

asphalt lines, slicked with tar in the sun 
radiation in the stormclouds 
bubble cloud air and water mixed 
smooth flat memories of past 
blood becomes timebomb 
flower petals 
forest deer 

Here is yet another random poetry generator, one that has been around for quite some time.

The Rando-Dylan poetry generator

Here is some poetry generated by this random poetry generator:

You never turned around to see the frowns on the jugglers and the clowns
When they all come down and did tricks for you
You never understood that it ain’t no good
You shouldn’t let other people get your kicks for you
You used to ride on the chrome horse with your diplomat
Who carried on his shoulder a Siamese cat
Ain’t it hard when you discover that
He really wasn’t where it’s at
After he took from you everything he could steal

I understand that some people think that Rando-Dylan produces some deep poetry, as well.

Random prose generation

The first random poetry generator picked words at random, but it had the advantage of being fed information about whether a given word was a noun or preposition. This, combined with a template, was enough to create some semblance of proper grammar.

Claude Shannon [6] suggested a different algorithm. Imagine starting by creating a word probability array. The array contains a list of a lot of words along with their probability of occurrence. In this first order approach, the computer merely selects the next word according to the likelihood in the array.

Simple, but it sounds like total gibberish. There is no knowledge of grammar.
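A minimal sketch of the first-order approach, assuming the "word probability array" is built by simply counting words in some source text. The tiny SAMPLE string and the name first_order_text are stand-ins invented for the example; a real attempt would want far more text.

```python
import random
from collections import Counter

# Hypothetical stand-in for a large body of text.
SAMPLE = "the cat sat on the mat and the dog sat on the log and the cat saw the dog"

def first_order_text(text, n_words=12, rng=None):
    """Draw each word independently, weighted by how often it occurs in text."""
    rng = rng or random.Random()
    counts = Counter(text.split())
    vocabulary = list(counts)
    weights = [counts[word] for word in vocabulary]
    # random.choices normalizes the weights, so raw counts work as probabilities.
    return rng.choices(vocabulary, weights=weights, k=n_words)

print(" ".join(first_order_text(SAMPLE)))
```

Common words show up often and rare words rarely, but nothing ties one word to the next.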

Shannon's second-order approach is stochastic. The next word in the sequence depends upon the current word. The first step is to create a two-dimensional histogram of words. Position i,j in this array is the probability that word i will be followed by word j. Such an array could be quite easily created by downloading public domain novels from the internet.
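The two-dimensional histogram can be sketched as a table of counts keyed by the current word. This is an illustration under my own assumptions rather than Shannon's procedure: the SAMPLE text and the names build_bigram_table and second_order_text are invented, and counts are stored instead of probabilities since the sampler normalizes them anyway.

```python
import random
from collections import Counter, defaultdict

# Hypothetical stand-in for a pile of public domain novels.
SAMPLE = "the cat sat on the mat and the dog sat on the log and the cat saw the dog"

def build_bigram_table(words):
    """table[i][j] counts how many times word i is followed by word j."""
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table

def second_order_text(words, n_words=15, rng=None):
    """Pick each next word according to what followed the current word in the source."""
    rng = rng or random.Random()
    table = build_bigram_table(words)
    current = rng.choice(words)
    output = [current]
    while len(output) < n_words:
        followers = table.get(current)
        if not followers:
            # Dead end (a word that never has a successor): restart anywhere.
            current = rng.choice(words)
        else:
            choices = list(followers)
            weights = [followers[word] for word in choices]
            current = rng.choices(choices, weights=weights)[0]
        output.append(current)
    return output

print(" ".join(second_order_text(SAMPLE.split())))
```

Apart from the dead-end restarts, every pair of adjacent words in the output is a pair that appeared somewhere in the source.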

I don't think Shannon did any downloading from the internet, but he did generate the following second-order stochastic prose:


Gibberish? Well, yeah. But consider doing third order, where each word is generated based on the two previous words. Thus, three word sequences would only appear in the output if they occurred somewhere in the text that built the histogram.
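Here is a hedged sketch of the higher-order idea, where the lookup key is the run of previous words rather than a single word. The SAMPLE text and the names build_table and generate are invented for the example, and the order parameter is assumed to be 2 or more.

```python
import random
from collections import defaultdict

# Hypothetical stand-in for the source text.
SAMPLE = "the cat sat on the mat and the dog sat on the log and the cat saw the dog"

def build_table(words, order):
    """Map each run of (order - 1) words to the list of words seen right after it."""
    context_len = order - 1
    table = defaultdict(list)
    for i in range(len(words) - context_len):
        context = tuple(words[i:i + context_len])
        table[context].append(words[i + context_len])
    return table

def generate(words, order=3, n_words=20, rng=None):
    """Generate text where each word depends on the previous (order - 1) words."""
    rng = rng or random.Random()
    context_len = order - 1
    table = build_table(words, order)
    # Start from a run of words that is known to have at least one follower.
    start = rng.randrange(len(words) - context_len)
    output = list(words[start:start + context_len])
    while len(output) < n_words:
        followers = table.get(tuple(output[-context_len:]))
        if not followers:
            break  # this context only occurs at the very end of the source
        # Repeats in the list make frequent followers proportionally more likely.
        output.append(rng.choice(followers))
    return output

print(" ".join(generate(SAMPLE.split(), order=3)))
```

With order=3, every three-word sequence in the output is one that occurred in the source. The output can come up short of n_words if it wanders into the tail end of the text.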

Or why not fourth order, or... Clearly at some point this goes from being a cool way to write novels to plagiarism. I don't care how many books you fed into the program to create the multidimensional histogram. If the computer has a sequence of 50 words, the next word will probably come from the novel that the first 49 words came from.

Political speech generation

Since I know that both Obama and Romney rush to read my blog every Wednesday morning when it posts, I will end with something that will be of great use to both of them. Yes, there is a random generator of political speeches! I rolled the dice a few times and here is what I got:

My opponent is conspiring with Halliburton board members, street gangs and military-industrial warmongers. I will work for an America where pedophiles and socialists cannot sabotage our love for the Bible. Unlike my opponent, I will protect our right to kill foreigners, our sense of trust and our right to free speech.

I certainly think this passes the Turing test.

-----------------   Notes  --------------------------

[1] A fellow becomes a pure mathematician when he realizes he doesn't have enough personality to be an applied mathematician. Or an accountant. How can you tell the difference between a pure and an applied mathematician? An applied mathematician will look at your shoes when he talks to you, rather than his own. A pure and an applied mathematician walk into a bar. The rest of the joke is obvious.

[2] The early computer scientist Alan Turing proposed a simple test of whether a computer can "think". If the computer can be mistaken for a human being in a conversation, then the machine can be said to think, or to have artificial intelligence, or at least, can be said to have passed the Turing test.

[3] Wolfram makes the program Mathematica, which I have been using since around 1985.

[4] I haven't quite mastered the eight-bar blues. I am usually stumbling by the third or fourth bar. And I have never made it all the way through the circle of fifths.

[5] I have been accused of stretching the truth sometimes, but this is truth. The image was generated by the online program that I mentioned. You can try it yourself. The painting was indeed painted by Mark Rothko. You can click the link to see. The uncanny resemblance of these two is clearly proof of the fundamental connectedness of all things.

[6] Claude Shannon (1948). A Mathematical Theory of Communication. Bell System Technical Journal 27, pp. 379–423, 623–656.
