Tuesday, 7 March 2017

On Writing Bad Poetry On Unfortunate Topics For My Children

My children and I used to write poetry for each other occasionally. Here are two pieces of doggerel I wrote for them.

The first one came from being limited by my son Nico to write about one of the first five topics that came up randomly on Wikipedia. My very first topic was "Glutamate dehydrogenase 2, mitochondrial, also known as GDH 2, [...] an enzyme that in humans is encoded by the GLUD2 gene". Although this did not seem a promising topic for a poem, it was better than the next four (Lindmania stenophylla, The Manitoba Day Award [now sadly deleted from Wikipedia], The Sunshine Millions Dash [also now sadly deleted from Wikipedia], and OMB Circular A-123, a "Government circular that defines the management responsibilities for internal controls in [American] Federal agencies") so I wrote an ode to GLUD2.

My daughter Zoe chose not to limit me at all and asked for a poem on any topic in any style, so I wrote a poem for her about not being limited by rules.

Ode to GLUD2 [For Nico]
An enzyme is one wondrous way
That miracles occur each day.
Each enzyme serves to catalyze
The slow reactions that arise
Inside our bodies; and without
Their helpful work there is no doubt
That life would not exist at all!
Life calls for speed; they heed the call.
And who was it that made that call?
Some call it ‘Chance’ some call it ‘All’;
Some call it ‘God’ but all we know
Is something called to make life so.
And why should we not worship it,
That what’s-it-called that made things fit?
If there’s no God, are mysteries solved?
Are enzymes crap if they evolved?
If you need proof that life’s divine
Then chemistry should suit you fine:
I say that no one ever knew
A thing as lovely as GLUD 2.
On Playing the Game [For Zoe]
You said that I could write in any style:
So I thought: Free verse! But after a while
I thought that things work better with some rules.
I don’t say that all anarchists are fools,
But the world’s big! To focus your view
It helps if there are rules guiding you.
If soccer was played just any old way
I don’t think it would be as fun to play
As it really is: Who would shoot to score
If the goal was moving around or
If some players could use a hockey stick?
I don’t play soccer but I think the trick
(maybe not just there, but in poems too)
Is that masters of the game are those who
Learn to love the rules. So I wrote in rhyme:
Maybe I’ll do free verse another time.

Wednesday, 4 January 2017

On How Many Words We Know

How many words does the average person know? This sounds like it should be an easy question, but it is actually very difficult to answer with any confidence. There are a lot of complications.

One complication is that it is not easy to say what it means 'to know' a word. Language users can often recognize many real words whose meaning they cannot explain. Does merely recognizing a word count as 'knowing' it? If not, if we must know what a word means before we can count it, how do we decide what it means to 'know the meaning of a word'? As any university professor who has marked term papers will attest, many of us occasionally use words in ways that are not quite consistent with their actual meanings. In such cases, we think we know a word, but we don't really know what it means.

A second complication is that it is not totally obvious what we should count when we reckon vocabulary size. Although the word cats is a different word from the word cat, it might seem unreasonable to count both words when we are counting vocabulary, since it essentially means that all nouns will be counted twice, once in their singular and once in their plural form. The same problem arises for many other words. What about verbs? Should we count meow, meows, meowing, and meowed as four different words? What about catlike, catfight, catwoman, and cat-lover? Since English is so productive (allows us to so easily make up words by glomming together words and morphemes, subparts of words like the pluralizing '-s'), it gets even more confusing when we start considering words that might not yet really exist but easily could if we wanted them to: catfest, catboy, catless, cattishness, catliest, catfree, and so on. A native English speaker will have no trouble understanding the following (totally untrue) sentence: I used to be so cattish that I held my own catfest, but now I am catfree.

The third complication is a little more subtle, and hangs on the meaning of the term 'the average person'. In an insightful paper published a couple of years ago (Ramscar, Hendrix, Shaoul, Milin, & Baayen, 2014), researchers from the University of Tübingen in Germany argued (among other things) that it was very difficult to measure an adult's vocabulary with any reliability. Assume, reasonably, that there are a number of words that more or less everyone of the same educational level and age knows. If we test people only on those words, those people will (by the assumption) all show the same vocabulary size. The problem arises when we go beyond that common vocabulary to see who has the largest vocabulary outside of that core set of words. Ramscar et al. argued (and demonstrated, with a computational simulation) that the additional (by definition, infrequent) words people would know on top of the common vocabulary are likely to be idiosyncratic, varying with the particular interests and experiences of each individual. A non-physician musician might know many words that a non-musician physician does not, and vice versa. They wrote: "Because the way words are distributed in language means that most of an adult's vocabulary comprises low-frequency types ([...] highly familiar to some people; rare to unknown to others), [...] the assumption that one can infer an accurate estimate of the size of the vocabulary of an adult native speaker of a language from a small sample of the words that the person knows is mistaken". Essentially, the only fair way to assess the true vocabulary size of adults (i.e. of those who have mastered the common core vocabulary) would be to give a test that covered all of the possible idiosyncratic vocabularies, which is impossible, since it would require vocabulary tests composed of tens of thousands of words, most of which would be unknown to any particular person.

So, is it just impossible to say how many words the average person knows? No. It is possible, as long as you define your terms and gather a lot of data. A recent paper (Brysbaert, Stevens, Mandera, and Keuleers, 2016) made a very careful assessment of vocabulary size. To address the first complication (What does it mean to know a word?), they used the ability to recognize a word as their criterion, by asking many people (221,268 people, to be exact) to make decisions about whether strings were words or nonwords. To address the second issue (What counts as a word?), they focused on lemmas, which are words in their 'citation form', essentially those that appear in a dictionary as headwords. A dictionary will list cat, but not cats; run, but not running; and so on. If this seems problematic to you, you are right. Brysbaert et al. mention (among other attempts to identify all English lemmas) Goulden et al.'s (1990) analysis of the 208,000 entries in the (1961) Webster's Third New International Dictionary. That analysis was able to identify 54,000 lemmas as base words, 64,000 as derived words (variants of a base word that had their own entry), and 67,000 as compound words, but also found that 22,000 of the dictionary headwords were unclassifiable. Nevertheless, Brysbaert et al. settled on a list of 61,800 lemmas. To address the third issue (What is an average person?), they presented results by age and education, which they were able to do because they had a huge sample.

And so they were able to come up with what is almost certainly the best estimate to date of vocabulary size (drumroll please): "The median score of 20-year-olds is [...] 42,000 lemmas; that of 60-year-olds [...] 48,200 lemmas." They also note that this age discrepancy suggests that we learn on average one new lemma every 2 days between the ages of 20 and 60 years.
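The learning-rate claim is easy to verify with a quick back-of-the-envelope calculation:

```python
# Quick check of the learning-rate arithmetic: the gain from 42,000 lemmas
# at age 20 to 48,200 lemmas at age 60 works out to roughly one new lemma
# every 2 days.
lemmas_gained = 48200 - 42000
days = 40 * 365.25
days_per_lemma = days / lemmas_gained
print(f"one new lemma every {days_per_lemma:.1f} days")
```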

As I hope the discussion above makes clear, 48,200 lemmas is not the same as 48,200 words, as the term is normally understood... Because they focused on lemmas specifically to address the problem of saying what a word is, Brysbaert et al. didn't speculate on how many words a person knows [1], where we define words as something like 'strings in attested use that are spelled differently'. I have guesstimated myself, informally and very roughly, that about 40% of words are lemmas, so I would guesstimate that we could multiply these lemma counts by about 2.5, and say that an average 20-year-old English speaker knows about 105,000 words and an average 60-year-old English speaker knows about 120,500 words...but now I am just muddying much clearer and more careful work.

[1] Update: After this was published to the blog, Marc Brysbaert properly chastised me for failing to note that their paper includes the sentence "Multiplying the number of lemmas by 1.7 gives a rough estimate of the total number of word types people understand in American English when inflections are included", with a reference to the Goulden, Nation, and Read (1990) paper. He also noted that this does not include proper nouns. Without boring you with the details of how I came to my estimate of the multiplier, I will note that my estimate was made on a corpus-based dictionary that included many proper nouns, so our estimates of how to go from lemmas to words are perhaps fairly close. My apologies to the authors for misrepresenting them on this point.

Brysbaert, M., Stevens, M., Mandera, P., & Keuleers, E. (2016). How Many Words Do We Know? Practical Estimates of Vocabulary Size Dependent on Word Definition, the Degree of Language Input and the Participant’s Age. Frontiers in Psychology, 7.

Goulden R., Nation I. S. P., & Read J. (1990). How large can a receptive vocabulary be? Applied Linguistics, 11, 341–363.

Ramscar, M., Hendrix, P., Shaoul, C., Milin, P., & Baayen, H. (2014). The myth of cognitive decline: Non‐linear dynamics of lifelong learning. Topics in Cognitive Science, 6(1), 5-42.

Sunday, 11 September 2016

On the Narcissism of Small Differences

In his 1929 book Civilization and Its Discontents, Sigmund Freud commented on "the narcissism of small differences", the idea that people who are very similar to each other become extremely attentive to the very small distinctions that differentiate them. When I came across a reference to this idea recently, it reminded me of a discussion in Milan Kundera's novel Immortality, in which Kundera discusses two ways of differentiating ourselves from others: by addition (highlighting a positive characteristic that makes us unique) or by subtraction (highlighting how we are unique in lacking a negative characteristic). In his usual wordy way, Kundera writes:
"The method of addition is quite charming if it involves adding to the self such things as a cat, a dog, roast pork, love of the sea or of cold showers. But the matter becomes less idyllic if a person decides to add love for communism, for the homeland, for Mussolini, for Catholicism or atheism, for fascism or antifascism. [...] Here is that strange paradox to which all people cultivating the self by way of the addition method are subject: they use addition in order to create a unique, inimitable self, yet because they automatically become propagandists for the added attributes, they are actually doing everything in their power to make as many others as possible similar to themselves; as a result, their uniqueness (so painfully gained) quickly begins to disappear. We may ask ourselves why a person who loves a cat (or Mussolini) is not satisfied to keep his love to himself and wants to force it on others. Let us seek the answer by recalling the young woman [...] who belligerently asserted that she loved cold showers. She thereby managed to differentiate herself at once from one-half of the human race, namely the half that prefers hot showers. Unfortunately, that other half now resembled her all the more. Alas, how sad! Many people, few ideas, so how are we to differentiate ourselves from one another? The young woman knew only one way of overcoming the disadvantage of her similarity to that enormous throng devoted to cold showers: she had to proclaim her credo 'I adore cold showers!' as soon as she appeared in the door of the sauna and to proclaim it with such fervor as to make the millions of other women who also enjoy cold showers seem like pale imitations of herself. Let me put it another way: a mere (simple and innocent) love for showers can become an attribute of the self only on condition that we let the world know we are ready to fight for it."
The narcissism of small differences (and the unfortunate human drive for it) explains much of the insanity in the world in general, and in the current US election cycle in particular.

Tuesday, 30 August 2016

On Why Croatia and Jamaica Are the Best Olympians

[Note: A small error in the original posted tables has been corrected.]

The Olympic medal rankings are silly because they are not adjusted for population. Obviously countries with a larger pool of talent to draw from will do better than countries with a smaller pool. I have calculated the population-adjusted rankings for the top 20 countries by raw medal count. I stopped at 20 because that is where my own country, Canada, fell.

G = Gold; S = Silver; B = Bronze; SUM = All Medals. 
The 'ACTUAL RANKING' is the ranking by raw medal count, the usual way.
'ADJUSTED' is the population adjusted ranking.
'DIFFERENCE' is the ranking by the difference between 'ACTUAL RANKING' and 'ADJUSTED' ranking.
Ties are given the same value.

The table is sorted by the average of all the difference rankings, i.e. by a measure of how much better a country did in overall ranking compared to what would be expected given its population.

As you can see, by this measure (which to me is much more rational than raw medal count) the countries that did the best are Jamaica, Croatia, and New Zealand. Canada is 8th. The United States and China are big fat losers, 19th and 20th respectively once we adjust for the huge pool of talent they had to draw from.

Saturday, 6 August 2016

On Emotion as Math

My colleagues and I published a paper a few months ago on a strange topic, which is: Why are some nonwords funny? (I blogged about this previously when I wrote about Schopenhauer's expectation violation theory of humor.) The study got a lot of press, in part because the paper included the graph above, which shows that Dr. Seuss's silly nonwords (like wumbus, skritz and yuzz-a-ma-tuzz) were predicted to be funny by our mathematical analysis. You can read about the study in many places (for example, in The Guardian and The Walrus) or, if you have access, get the original paper here.

A lot of people got confused about the way we measured how funny a nonword was, in part because we were loose about how we used the term 'entropy' in the paper (though very clear about what we had measured). Journalists understood that we had shown that words were funnier to the extent that they were improbable, and that we had used this measure 'entropy', but most journalists did not report the measure we used correctly. Most thought that we had said strings with higher entropy are funnier, which is incorrect. Here I explain what we actually measured and how it relates to entropy.

Shannon entropy was defined (by Shannon in this famous paper, by analogy to the meaning in physics of the term 'entropy') over a given signal or message. It is presented in that paper as a function of the probabilities of all symbols across the entire signal, i.e. across a set of symbols whose probabilities sum to 1. I emphasize this because it makes clear that entropy is properly defined over both rare and common symbols, by definition, because it is defined over all symbols in the signal.

Under Claude Shannon’s definition, a signal like ‘AAAAAA’ (or, identically, ‘XXXXXX’) has the lowest possible entropy, while a signal like ‘ABCDEF’ (or, identically, ‘AABBCCDDEEFF’, which has identical symbol probabilities) has the highest possible entropy. The idea, essentially, was to quantify information (a synonym for entropy in Shannon's terminology) in terms of unpredictability. A perfectly predictable message like ‘AAAAAA’ has the lowest information, for the same reason you would hit me if I walked into your office and said “Hi! Hi! Hi! Hi! Hi! Hi!”. After I have said it once, you have my point (I am saying hello), and repeating it adds nothing more; it is uninformative.

So, Shannon entropy is defined across the signal that is the English language as a function of the probabilities of the 26 possible symbols, the letters A-Z (we can ignore punctuation and case; we could include them easily enough but they don’t change the general idea and played no role in our nonwords). 

If we do the math (by summing -p(x)·log2 p(x) over every letter in the alphabet, which is how Shannon entropy is defined), the entropy of English is about 4.2 bits. What this means is that I could send any message in English using a binary string of length 5 for each letter (rounding the 4.2 bits up to a whole number of bits). This makes perfect sense if you know binary code: 2^5 = 32, which gives us more codes than we need for just 26 symbols. So, concretely, A = 00000, B = 00001, C = 00010, and so on, until we get to Z = 11001.
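The calculation is short enough to do by hand, or in a few lines of code. Here is a minimal sketch using an illustrative table of approximate English letter frequencies (not the corpus counts used in the paper):

```python
import math

# Approximate English letter frequencies, in percent. These are the familiar
# textbook values, included here only for illustration.
freq = {
    'e': 12.7, 't': 9.1, 'a': 8.2, 'o': 7.5, 'i': 7.0, 'n': 6.7,
    's': 6.3, 'h': 6.1, 'r': 6.0, 'd': 4.3, 'l': 4.0, 'c': 2.8,
    'u': 2.8, 'm': 2.4, 'w': 2.4, 'f': 2.2, 'g': 2.0, 'y': 2.0,
    'p': 1.9, 'b': 1.5, 'v': 1.0, 'k': 0.8, 'j': 0.15, 'x': 0.15,
    'q': 0.10, 'z': 0.07,
}
total = sum(freq.values())
probs = {c: f / total for c, f in freq.items()}  # normalize so they sum to 1

# Shannon entropy: H = -sum over all symbols of p(x) * log2 p(x)
entropy = -sum(p * math.log2(p) for p in probs.values())
print(f"Entropy of English letters: {entropy:.2f} bits")
```

With these frequencies the sum comes out to roughly 4.2 bits, matching the figure above; a uniform 26-letter alphabet would instead give log2(26), about 4.7 bits.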

What we computed in our paper can be conceived of as the contribution of each nonword to this total entropy of the English language, that string's own -p(x)·log2 p(x). In essence, we treated each nonword as one part of a very long signal that is the English language. This is indeed a measure of how unlikely a particular string is, but that is not entropy, because entropy is a measure of summed global unpredictability, not of local probability.

Think of it this way: If I am speaking and I say 'I love the cat, I love the dog, and I love snunkoople’, you will be struck by snunkoople because it is surprising, which is a synonym for unexpected. We quantified how unexpected each nonword was (the local probability of that part of the signal), in the context of a signal that is English as she is spoken (or written). 

Our main finding was that the less likely the nonword is to be a word of English (basically, the lower the total probability of the letters the nonword contains), the funnier it is. This is not just showing that 'weird strings are funny', but something more interesting than that: strings are funny to the extent that they are weird.
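To make the measure concrete, here is a minimal sketch of scoring strings by the improbability of their letters. The letter probabilities are estimated from a toy sample sentence, not from the corpus we actually used, and the scoring function is a simplification of the paper's measure:

```python
import math
from collections import Counter

# Estimate letter probabilities from a small sample of English text.
# (A real analysis would use a large corpus; this is purely illustrative.)
sample = ("the quick brown fox jumps over the lazy dog and then the dog "
          "sleeps while the fox runs back into the forest for more food")
letters = [c for c in sample if c.isalpha()]
counts = Counter(letters)
total = sum(counts.values())
probs = {c: n / total for c, n in counts.items()}

def improbability(word):
    """Mean negative log-probability of the word's letters:
    higher scores mean rarer letters, i.e. a less word-like string."""
    return -sum(math.log2(probs[c]) for c in word) / len(word)

# A Seussian nonword should score as less probable (weirder) than a
# string built from common English letters.
print(improbability("snunkoople"), improbability("tatters"))
```

On this toy estimate, "snunkoople" scores higher (more improbable) than "tatters", which is the direction the funniness finding predicts.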

There is an interesting implicit corollary (not discussed in the paper), which is that we are the kind of creatures that use emotion to do probability judgments. Our feelings about how funny a nonword string is are correlated with the probability of that string. If you think about that, it may seem deeply weird, but I think it is not so weird. One of the main functions of emotion is to alert us embodied creatures to unusual, dangerous, or unpredictable aspects of the world that might harm us. Unusualness and unpredictability are statistical concepts, since they are defined by exceptions to the norm. So it makes good sense that emotion and probability estimation would be linked for embodied creatures.

Sunday, 19 June 2016

On the Odds that an American Muslim is a Terrorist

Hi Mr. Trump,

I wrote on Twitter that someone needed to teach you about Bayes' Rule so you could understand why your idea of profiling Muslims is stupid. It is very obvious from your public statements that thinking rationally is not your strong suit, but brew yourself a cup of coffee (I know you don't drink it, but I think you should take help anywhere you can get it) and see if you might be able to follow along here, buddy.

Bayes' Rule is a pretty simple rule in probability. It's not an opinion, Mr. Trump: it's a fact. I know the difference between opinions and facts is not obvious to you. Let me explain. Occasionally there are ideas in the world that are definitely true, no matter what anyone's feelings about them may be. We call these true ideas facts. I could show you a mathematical proof of Bayes' Rule, but I don't want to strain you too hard, as I know this is probably your first foray into rational thinking. 

So please just take it as given, that this equation (Bayes' Rule) is a true fact (you can get 'the deets' here if you like):

P(A|B) = P(B|A)P(A)/P(B)

What does that mean? Well, P(A|B) is the probability of some event A being true, given that some other event B is true. Bayes' Rule says that we can figure out the (actual, true, mathematically-guaranteed) probability of P(A|B) if you can get the values for some other probabilities (that for various reasons are often easier to get): namely, the probability of event B given event A [= P(B|A)], and the individual probabilities of event A [= P(A)] and event B [= P(B)].

It all seems very abstract, I am sure. I know that abstract thinking is difficult for you. (Indeed, it often seems like even concrete thinking is difficult for you, as you seem to flip-flop about a bewildering number of highly concrete issues.) Let's make this concrete using an example close to your heart: Let's figure out the probability that an American is a terrorist, given that they are Muslim. If that probability is high, then your idea of profiling Muslims is a good idea. If that probability is low, then your idea is not a good idea.

Bayes' rule tells us: 

P(An American is a terrorist | An American is Muslim)  
= P(An American is Muslim | An American is a terrorist) * P(An American is a terrorist) / P(An American is a Muslim)
Maybe the easiest one to start with is P(An American is a Muslim). Wikipedia says that "According to a new estimate in 2016, there are 3.3 million Muslims living in the United States, about 1% of the total U.S. population." So we have our first probability: P(An American is a Muslim) = 1% or (same thing represented a different way so you won't get confused later when we do some math) 0.01.

What about P(An American is a terrorist)? This one is a little more difficult, because it seems that you think that there are millions and millions of terrorists in the USA right now: every Mexican, every Muslim, every Canadian, the President of the United States, and so on. TechDirt had an article on this a few years ago. They estimated that there are a maximum of 184,000 terrorists in the entire world (which they also called "a ridiculously inflated level"). It's a little harder to know how many of them live in the USA because they are all hiding, biding their time. But we can estimate it roughly by looking at the proportion of terrorist deaths that occur in the USA. It has been estimated that in 2014 (a bad year for terrorism, as you may recall) there were 17,891 deaths worldwide from terrorist attacks, of whom 19 were American. Well, as you know, we had 50 deaths just a few weeks ago in Orlando, so maybe terrorism is getting worse rapidly, as you like to suggest. Let's assume that the 2014 estimate is fully ten times too low and use 190 American deaths due to terrorism per year. If deaths due to terrorism are distributed roughly proportionally to terrorists, then we can estimate that 190/17891 or about 1% of all terrorists are American. 1% of 184,000 is 1840. So now we can get what is surely a very high upper estimate on P(An American is a terrorist): the number of terrorists in the USA divided by the US population, or 1840/318.9 million, which works out to about 0.0000058.

Now we only have one number left: P(An American is Muslim | An American is a terrorist). You probably disagree with most people on the planet about this number, because I know you labor under the delusion that all terrorists (including Mr. Obama) are Muslims. Researchers from Princeton University used FBI data to actually estimate this number a few years ago, and they estimated that only 10% of terrorists active in the USA are Muslim (though the estimate for the longer time period of 1970 to 2012 is much lower, just 2.5%). We will go with the larger number: P(An American is Muslim | An American is a terrorist) = 10% or 0.10.

Now we are almost done! All we have to do is plug in our numbers: 

P(An American is a terrorist | An American is Muslim) 
= P(An American is Muslim | An American is a terrorist) * P(An American is a terrorist) / P(An American is a Muslim) 
= 0.10 * 0.0000058 / 0.01 
= 0.000058
This is about 6 per 100,000, or (said another way) there is at most a 6/100,000 chance that a random American Muslim is a terrorist. If your profilers spent just one hour profiling each random American Muslim, they would have to spend on average about 100000/6 = 16,666 hours before they profiled just one terrorist. Assuming a 40-hour work week and 50 weeks of work per year, that is 416.6 person-weeks, or 8.33 person-years per terrorist profiled. This does not strike me as a good use of resources, especially given that such total concentration on identifying only Muslim American terrorists would cause you to miss the 90% of American terrorists who are not Muslim.
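The whole calculation above fits in a few lines, using exactly the numbers already given:

```python
# A sketch of the Bayes' Rule calculation, plugging in the post's estimates.
p_muslim = 0.01                # P(An American is Muslim), ~1% of population
p_terrorist = 1840 / 318.9e6   # P(An American is a terrorist), a deliberately
                               # high upper estimate derived above
p_muslim_given_terrorist = 0.10  # P(Muslim | terrorist), FBI-data estimate

# Bayes' Rule: P(A|B) = P(B|A) * P(A) / P(B)
p_terrorist_given_muslim = (p_muslim_given_terrorist * p_terrorist) / p_muslim
print(f"P(terrorist | Muslim) = {p_terrorist_given_muslim:.6f}")

hours_per_hit = 1 / p_terrorist_given_muslim  # one hour of profiling per person
print(f"hours of profiling per terrorist found: {hours_per_hit:,.0f}")
```

Changing any input, even generously in the direction of more terrorists, leaves the conclusion intact: the conditional probability stays tiny because P(terrorist) is tiny.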

Let me know if you have any questions.

Sunday, 13 March 2016

On Painting Duchamp Like The Mona Lisa

One of Duchamp's most important contributions to art was his subversion of the meaning of the word art. He achieved his subversion by attacking the meaning simultaneously from two directions, both by treating non-art as art (his ReadyMades) and by treating art as non-art. Perhaps his most famous treatment of art as non-art was his (1919) L.H.O.O.Q. (reproduced above), a cheap postcard of Leonardo da Vinci's Mona Lisa on which Duchamp drew a mustache. The title added to the insolence because it makes a pun in French, since it sounds like elle a chaud au cul [she has a hot ass, or, as Duchamp once loosely translated it, she has fire down below], intended to imply that the beautiful lady is horny. Duchamp was not the first artist to make fun of an old master, but he was the first to raise the anti-art gesture to an art form in itself.

I have recently become fascinated with DeepArt.io, a computer program that will attempt to re-do any image in the style of any other. For example, here is its attempt at the Mona Lisa as it might have been painted by Joan Miro:

Duchamp would have approved of this kind of mechanized manipulation of art, I think. It is retinal art, which Duchamp derided, but it is easy to make works that are more about the concept than the product. I thought it would be amusing to strike a blow back at Duchamp on da Vinci's behalf, by having the machine re-do Man Ray's solarized photographic portrait of Duchamp in the style of the Mona Lisa. So here is the mechanized conceptual subversion of Duchamp's L.H.O.O.Q., Duchamp painted in the style of the Mona Lisa:

Saturday, 5 March 2016

On Matrix Heaven

Heaven is The Matrix. Everyone is invited to walk through the beautiful gates (the blue pill). Those who will not fall prey to the illusion that God plays favorites get to wake up (the red pill).

Sunday, 7 February 2016

On Actualizing an Ivory Statue of the Mind

My novel The Bride Stripped Bare By Her Bachelors, Even begins with my narrator Isaac becoming obsessed with an ivory statue of Abraham, his son Isaac, and an angel, which he sees in a small museum devoted to Germanic art at Harvard University, the Busch-Reisinger Museum:
 At the bottom, stands a thick, almost squat Abraham. He raises himself up on two massive, short muscular legs topped with immense powerful hips, visible through the covering of his tunic, which is carved so fine that you can almost see the weave. Following the cone of the ivory, his proportions get smaller towards his head, and he seems to stretch out vertically, as if to portray his yearning to reach up to heaven. His outstretched arms are very slightly too thin, too long for the rest of his body. In his hands, held high above him, he clutches an improbably tiny child, his unfortunate son Isaac, portrayed in the work as a doll-sized baby.
    In order to keep the carving contained within the natural limits of the available medium, the artist had to have Abraham twist his magnificently muscled body unnaturally as he rises up towards God. The fabric of his robe had to billow fantastically down from him, tight to his body because of the natural limits of the tusk, the fluttering of the rough wool cloth incredibly captured in the smooth hard cold cream.
    Flat on the head of the baby doll son held up so high in Abraham’s mighty hands, almost too tiny to be believably part of the work, is another hand. It is the hand of a tiny buxom angel, carved from the very tip end of the tusk. Her tiny wings flutter out so lightly that the ivory there lets the light shine through. Tiny and nearly transparent herself, she seems hardly there at all.
As far as I could recall, I had never seen an ivory statue of Abraham and Isaac at the museum (though I had been there). I had told many people, including a few friends in Boston who wanted to go and look for it, that it was a figment of my imagination.

However, last week a friend of my mother's was organizing a book club at which they were going to discuss my novel. The friend sent my mother a note asking if the picture above was the ivory statue I had intended. Although the statue is clearly not identical to what I described, it is extremely close and it is part of the collection of the Busch-Reisinger Museum. Moreover, it was very familiar to me, so clearly familiar that I have no doubt I had seen and studied it before I described my fictional carving. I had somehow simply forgotten that it was real, and then re-invented it.

When I saw the picture, I had the uncanny experience that my purely imaginary statue had magically materialized on earth, a strange but very wonderful feeling.

Sunday, 13 December 2015

On Some Untrue Facts About Animals

[Computer-generated lies, thanks to JanusNode.]

Buzzards are the only animal whose buttocks are still evolving. 
Canaries can be trained to destroy.
Armadillos are the only animal whose muscles are parasitic.
Tamarins are going extinct because their boobs are prized for making organ donation containers. 
Voles are the only animal whose perianal areas are angelic. 
Porcupines are the only animal whose bodies can fuse. 
Lice are the only animal whose goos are composed of caffeine. 
The polecat's hairs are shaped like planks.
The largest partridge ever recorded weighed more than a cabdriver's garbage pail.
If a dog eats a kiwi it will climax.
Some jackrabbits found in Cambodia defend themselves by shooting tilapia from their vaginas.

Wednesday, 25 November 2015

On Some New Holiday Gift Ideas

My wife showed me one of those special Christmas catalogs that we receive at this time of year, listing crazy gift ideas that seem highly unlikely ever to sell. She suggested that JanusNode might be able to come up with some equally good products. I whipped up a quick 'Holiday Gift Idea' generator (to be included with the next release of JanusNode), and append herewith a few examples of the output.

I gift these ideas to the world at large; feel free to develop them for your own profit.
Husband's Laser Pointer Boxer shorts 
Gourmet Garden Statue, made with real dried shark dung
Programmable Coffin with genuine antique finish
Touch-screen Playable Panties
Two-person Cummerbund-Bra & Knicker Set
Bird-watcher's Flashlight with Secret Soap Dispenser 
Glow-in-the-dark Ball Gown T-shirt with engraved initials 
Wind-up Scarf with battery backup 
Fiber-optic Remote-controlled Jeans/Underwear
Automatic Flying Kipper with silly voice 
Dishwasher-safe Slippers 
Unbreakable Baby's First Laser Pointer 
WiFi-enabled Mustard Dispenser 
Edible Lunch Container 
Japanese Toilet Paper Holder with secret Ferris wheel 
Speaking Bikini/Tracksuit
Crocodile skin USB-powered Socks
Remote-controlled Fish Dispenser
World's Smallest South American Biscuit Dish with engraved initials

Saturday, 24 October 2015

On The Mathematics of Meaning

I have worked with co-occurrence models of semantics for a long time. These computational models try to bootstrap word meaning from analysis of patterns of word co-occurrence in large corpora of text. Recently, Google released a set of tools (word2vec) and associated materials for a new, and very good, kind of co-occurrence model that they have built. There is a nice explanation of the model here.

One of the things you can do with co-occurrence model word representations is subtract or add them, to see what the resultant word representation 'means' (I skip over the mathematical details since we are just here for fun). For example, in word2vec space:
king - man + woman = queen
The equality sign here has to be taken with a grain of salt; it really means 'is similar to'.

My colleague Geoff Hollis and I have been working with the word2vec model (using a smaller dictionary and a slightly different representation and similarity measure than Google). I added the ability to add fractions of representations instead of just adding or subtracting each word representation as a whole, and have spent some time looking for interesting semantic math results. I have defined '=' here as 'being in the top ten closest results' (and also restricted myself by requiring that the final result on the right of the '=' sign cannot be among the top ten closest neighbors of any of the input words on the left of that sign). This human flexibility (and the fact that I have deliberately searched for interesting results) means that this math is really a human-computer collaboration rather than a purely computational result.
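For the curious, here is a minimal sketch of the mechanics described above. The embeddings below are random toy vectors standing in for the real word2vec representations (so the neighbors it returns are meaningless); only the machinery — weighted sums of word vectors, cosine similarity, top-ten neighbors, and the exclusion of each input's own top-ten neighbors — follows what I actually did.

```python
import numpy as np

# Toy stand-in vocabulary and random embeddings; the real work used
# actual word2vec vectors over a much larger dictionary.
rng = np.random.default_rng(42)
VOCAB = ["love", "sex", "friendship", "infidelity", "monogamy", "murder",
         "fun", "gunplay", "apple", "pig", "potato", "cat", "dog", "poodle",
         "despair", "hope", "frustration", "wealth", "dream", "selfish",
         "elitist", "courage", "stupidity", "incompetence", "audacity",
         "time", "opportunity", "logic", "principle", "man", "education",
         "snake", "tiger", "rhino", "drunken", "debauchery", "passion"]
EMB = {w: rng.normal(size=50) for w in VOCAB}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_ten(vec, exclude=()):
    """The ten words closest to vec by cosine similarity, minus exclusions."""
    scored = [(w, cosine(vec, v)) for w, v in EMB.items() if w not in exclude]
    return [w for w, _ in sorted(scored, key=lambda p: -p[1])[:10]]

def semantic_sum(terms):
    """terms: list of (weight, word) pairs, e.g. [(1.0, 'love'), (0.4, 'sex')].
    Returns the top-ten neighbors of the weighted sum of the word vectors,
    excluding the inputs and anything already in an input's own top ten."""
    vec = sum(weight * EMB[word] for weight, word in terms)
    banned = {word for _, word in terms}
    for _, word in terms:
        banned.update(top_ten(EMB[word]))
    return top_ten(vec, exclude=banned)

print(semantic_sum([(1.0, "love"), (0.4, "sex")]))
```

With real embeddings, a result like 'friendship' landing in that returned top-ten list is what licenses writing love + 0.4 * sex = friendship.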

Here are some of my most interesting results. Enjoy.
love + 0.4 * sex = friendship
love + sex = infidelity
love + 3 * sex = monogamy

murder + fun = gunplay

apple + pig = potato

cat + 0.7 * dog = poodle

despair + 0.5 * hope = frustration

wealth + 0.2 * dream + 3 * selfish = elitist

courage + 2 * stupidity - incompetence = audacity

hope + time = opportunity

logic + hope = principle

man - 2 * education = snake

tiger - cat = rhino

sex + drunken = debauchery

love + dream = passion
[Image from: Alfred Bray Kempe (1886) A Memoir of the Theory of Mathematical Form.]

Saturday, 17 October 2015

On Attaining Our Goals

 Approach the goal. 
         It is difficult to attain 
what is not there.

The words above are one of my favorite JanusNode productions. I like the idea that life is all about striving to attain goals that are really just figments of our imagination. We make up goals, and then our goals make us up.

In my history of psychology course at the University of Alberta we discuss Carl Jung, whose work addresses the weird question that has to be asked: Who made up the process by which we make up goals? Whoever or whatever controls our goal-making algorithm controls us. Jung had a labyrinthine answer to the question of where that algorithm comes from.

Jordan Peterson's (1999) book 'Maps Of Meaning: The Architecture of Belief' and Elizabeth and Paul Barber's (2004) book 'When They Severed Earth From Sky: How the Human Mind Shapes Myth' both discuss Jung's answer, more or less, from different perspectives. The discussions they each offer are also complex, but include noting that:

  • Humans are not very good storage devices so information gets distorted when it passes into our heads. What is incidental fades away. What is important is magnified.
  • One way to safeguard what is important when it has to be stored in a leaky human mind is to store it more than once.
  • The unknown is frightening and has to be made comprehensible, predictable, and  approachable by speaking of it using analogies to what we understand for sure, notably human needs and desires.
  • Analogies using human needs and desires require that the story be 'fleshed out a little', with the analogy-maker adding elements to make the story coherent.
  • Similar stories from different sources can be merged into a new meta-representation that encodes the gist of the similarity, giving us recognizable, stable, and versatile mythic elements that are useful for coherently representing and thinking about what is important.
When we invent the goals that make us up, we use the cognitive tools that we have. Those tools include some that have been passed down to us by natural selection and others that have been passed down encoded as mythic elements in tales, poems, songs, and legends. 

Who programmed the algorithm by which we make up the figments of imagination that make us up? All of history did.

Saturday, 26 September 2015

On Prickles & Goo

I enjoy this video cartoon of Alan Watts talking about 'prickles and goo', which was made by the people who brought us South Park. I use this cartoon in my History of Psychology class to introduce the fundamental split in history and in contemporary science between the 'Neats' and the 'Fuzzies' (as philosopher Dan Dennett called them): people who best appreciate nice neat clean 'Lego-like' theories (like the Pythagoreans) and those who best appreciate non-linear, messy, 'Plasticine-like' theories (like William James).


Saturday, 5 September 2015

On Schopenhauer's Expectation-Violation Theory of Humor

 "From a scientific point of view, optimism and pessimism are alike objectionable: optimism assumes, or attempts to prove, that the universe exists to please us, and pessimism that it exists to displease us. Scientifically, there is no evidence that it is concerned with us either one way or the other."
                Bertrand Russell
                A History of Western Philosophy

Arthur Schopenhauer (pictured above looking happier than ever) is a German philosopher mainly remembered for his (1818) book, The World As Will and Idea. It's a rather dreary document, a precursor to 20th century Existentialism, written as a slow quasi-mystical, sentimental rant about how much better it is not to exist than it is to exist. This is an ancient idea of course: Sophocles could already allude to its long history, writing in his [c. 400 BCE] play Oedipus at Colonus "Never to have lived is best, ancient writers say". Schopie (as a German colleague of mine likes to call him) was allegedly not as miserable as his philosophy makes him sound: he enjoyed eating out at restaurants, having love affairs, and arguing.

My own interest in Schopie stems entirely from an unexpected interlude in The World As Will and Idea on the nature of humor. In that interlude Schopie suggests that “The cause of laughter in every case is simply the sudden perception of the incongruity between a concept and the real objects which have been thought through it in some relation”. The most common theory of humor when Schopie was writing was that incongruity itself was humorous, an idea alluded to by Aristotle (as what wasn't?) but generally attributed to Francis Hutcheson’s (1725) essays Reflections Upon Laughter. Many critics pointed out the obvious problem with this idea: some incongruities are not funny at all. The psychologist Alexander Bain wrote a great paragraph about this in his (1865) book The Emotions and the Will:

“There are many incongruities that may produce anything but a laugh. A decrepit man under a heavy burden, five loaves and two fishes among a multitude, and all unfitness and gross disproportion; an instrument out of tune, a fly in ointment, snow in May, Archimedes studying geometry in a siege, and all discordant things; a wolf in sheep's clothing, a breach of bargain, and falsehood in general; the multitude taking the law into their own hands, and everything of the nature of disorder; a corpse at a feast, parental cruelty, filial ingratitude, and whatever is unnatural; the entire catalogue of vanities given by Solomon,— are all incongruous, but they cause feelings of pain, anger, sadness, loathing, rather than mirth.”
Schopie's proposal is that it is not incongruity itself that is funny, but only incongruity between an event in the world and a pre-existing idea or expectation about that event. In particular, he proposed two forms of humor that (only partially following his own terminology) we can call conceptual bifurcation and conceptual subsumption. Humorous bifurcation occurs when an idea that was believed to be from one category turns out to belong to two categories. A pun is a good example. Humorous subsumption is pretty much the opposite: it occurs when two ideas believed to belong to distinct categories actually belong to the same category, as in the many jokes that ask how one thing is like or unlike another.

Schopie went on to suggest that what he called "the ludicrous effect" (degree of funniness) was related to the size of the relevant incongruity. Though he made no attempt to quantify this idea by controlling the size of the incongruity, I have recently done some work with some colleagues that tries to do exactly that; as it is currently under review, I will leave a discussion of it for a later post. We can get a vague feel for Schopie's claim by comparing frequent and infrequent word pairs that violate our expectations. For example, I find the phrase ghost llama (which has 753,000 occurrences on Google) funnier than the phrase ghost cat (19,300,000 occurrences), even though I do find ghost cat a little funny. It is hard to extend the idea systematically to real jokes, since they are often incongruous in complex ways that do not admit of easy quantification.

[Image adapted from: Zimmern, H. (1876) Arthur Schopenhauer: His Life and His Philosophy. London: Longmans, Green, and Co.]