Running head: HUMOR AND RESPONSE TIME

Unexpectedly Funny: Relations between Lexical Decision Response and Perceived Joke Quality

Jaime A Quinn

Oberlin College

 

Abstract

Several theories of humor make simple incongruity a general ingredient of humor. Using response time on a lexical decision task as a measure for word expectation, the author shows slower response to the humor-bearing closing word of a joke than to other meaningful completions. Moreover, the non-joke endings were rated as significantly funnier when response time (corrected for certain syntactic factors) showed them to be less expected. This result did not hold for the true jokes, however. These results are discussed in light of theories of humor which consider the humor of incongruity to be merely one manifestation of more general humor processes.

 

 

 

Q: Why did the humor theorist cross the road?

A: "Humor is a serious subject," so don't joke about it.

 

The punchline of this joke would not be so trite if it weren't true in some sense. Many things about laughter (and amusement in general) seem to make it ideal for serious study, in particular by cognitive psychologists. It is a basic human process; people do not have to learn to laugh. It is clearly related to other cognitive processes; thus the phrase "getting it" can refer both to being amused by a certain piece of humor and to appreciating the significant aspects of a situation. Moreover, like related cognitive processes about which somewhat more is known, it is quick; the time it takes people to laugh at a joke is clearly at least grossly comparable to the time it takes them to understand it. These characteristics all make humor not only attractive as an area of study, but also suitable for such standard cognitive science tools as reaction time analysis.

Nevertheless, humor is not considered part of mainstream cognitive psychology. A recent edition of an introductory textbook in the discipline (Anderson, 1985) has no references to "humor" or "laughter" in the index. While there has been some study of this area, there is still much basic work that has not been done. This is even more surprising when one considers that most modern theories of humor focus on its cognitive aspects, with social factors or considerations also strongly present. For instance, Minsky (1981) suggests that humor is (cognitively) a mechanism to focus attention on potential cognitive errors, and (socially) a means of communicating warnings about these possibilities of slip-up. Although such theories of the reasons for or sources of humor are joined by theories which deal more with the mechanisms of humor apprehension, studies still tend to focus on what is funny when, leaving questions of how entirely theoretical.

I am of course not the first person to see the potential for more study of humor within cognitive psychology. Long and Graesser (1988) call for more research in their review paper on the topic. They begin by summarizing the prevailing groups of psychological theories of humor. In terms of humor's sources and functions, there are two broad classes, arousal theories and disparagement theories. Arousal theory suggests that humor consists of the relief felt as an increase in arousal or tension is dissipated. This theory was first suggested by Freud in his 1905 Jokes and Their Relation to the Unconscious, but it continues to motivate such studies as that of Prerost (1995). Disparagement theory suggests that humor is the pleasurable feeling of superiority one feels as compared to the butt of a joke. While both of these theories may reveal much about humor, they do little to elucidate how humor is cognitively distinct from similar processes (other arousal dissipators or feelings of superiority), and so are of limited value to a cognitive study specifically of humor.

The final class of humor theory does better in this regard. According to incongruity-resolution theory, humor stems from the successful cognitive resolution of an apparent incongruity. Long and Graesser later give a couple of examples of incongruity-resolution theory. The first, classic one, formulated by Suls (1972), is a step-by-step information processing approach, of which the key elements are continuous prediction, recognition of an incongruous ending, and finding a rule that makes the ending follow from the text. While this specific theory fails to deal with humor, such as wit, that is not bracketed within the confines of a joke, it remains an important contribution.

Long and Graesser then propose an incongruity-based humor model of their own, based on work by Gildea and Glucksberg (1983) on metaphor comprehension. Gildea and Glucksberg found that a sensible figurative meaning for a sentence interfered with subjects' judgement of that sentence as false, especially if the context favored the figurative meaning. This led them to suggest that literal and figurative meanings are processed in parallel. By analogy, Long and Graesser suggest that the literal and alternate ("incongruous") meanings of jokes are processed in parallel. If both the alternate and literal meanings turn out to be appropriate, humor results. Their model leaves it unclear what happens when the literal meaning is inappropriate and the alternate one is appropriate; this can result in either no humor (a simple metaphor) or humor (a joke or witty statement). This model leaves much open in terms of contextual factors, the influence of relative speeds of processing, etc.

At the end of their article, which also includes a taxonomy of jokes and wit, Long and Graesser do not neglect to provide a punchline. Under the heading, "A Puzzling Paradox: What Makes Humor Funny?" they admit the fact that despite having written an edifying study of humor, they haven't the slightest idea why it is comic. In response to their final straight-faced sentence, "Hopefully, we may yet answer the question 'What's so funny?'," I can only suggest that functional theories such as Minsky's may point us toward such an answer.

In the meantime, further research on metaphor comprehension (Pynte, Besson, Robichon, and Poli, 1996) has suggested further modifications to the parallel joke comprehension model that Long and Graesser propose. By studying ERPs, these researchers hoped to further clarify the internal mechanics of Gildea and Glucksberg's 1983 results on metaphor comprehension. In a series of related experiments, they presented subjects with literal sentences ("Those animals are lions"), familiar metaphors ("Those fighters are lions") and unfamiliar metaphors ("Those apprentices are lions"). Each sort of metaphor was presented either alone or accompanied by appropriate context ("They are not cowardly...") or inappropriate context ("They are not curious...").

As early as 400 ms after the final word, an appropriate context caused a change in brain waves; the N400 component of the ERP, generally agreed to represent some sort of response to semantic unexpectedness, was reduced by an appropriate context. The late-positive component (aka P600) was likewise increased. This main finding in support of parallel or context-dependent metaphor processing was not contradicted by other analyses of the data. For instance, ERPs were either perturbed at the N400 or not at all; no pair of conditions had similar N400 with significantly different P600, a result which would have suggested a two-stage process. Finally, the N400 for an inappropriate context was if anything smaller than that for no context, suggesting that, since a misleading context does not demonstrably increase the difficulty of comprehension, perhaps metaphoric comprehension is only occurring when it is contextually appropriate. This supports the context-dependent model, as opposed to the parallel model.

This last result, while very preliminary, has implications for theories of humor apprehension. It shows that context can perhaps constrain the meaning of a sentence to its literal one. Jokes could thus result from a later reassessment which lifts that constraint, suggesting a model like that of Suls (1972), or they could benefit from parallel contextual processing of their incongruous, funny meaning, following a model like that of Long and Graesser (1988).

Finally, a study of humor would do well to take into consideration some methodological results. For instance, humor responses are made greater by positive affect and by sensitization. Studies have failed to find habituation by other jokes on similar subject matter (Deckers, Buttram, and Winsted, 1989). A study by Deckers and Ruch (1992) has many methodological points to raise. The 3-WD test pioneered by Ruch (1983) suggests that jokes primarily vary along the mutually exclusive Nonsense (unresolvable incongruity) and Incongruity-resolution scales, as well as along a scale of sexual content. Responses to jokes vary along both funniness and aversion scales. Finally, social tests of generalized sense of humor are not good predictors of level of joke-appreciation in an isolated context.

 

The Experiment

One basic result in the study of humor would be to find a measurable correlation between the incongruity of a stimulus and its humor. To do this, I selected jokes in which the last word carried the humor (e.g., "To keep milk from turning sour, keep it in the cow."). After allowing the subjects to read all of the joke except for the last word, a lexical decision task was given for a proposed completion of the sentence. I use response time on this task as a measure of incongruity.

Substantial evidence exists to support the idea that this response time correlates negatively with expectation. According to the Multiple Read-Out model of Grainger and Jacobs (1996), this task represents the read-out of both single-unit and summed activity over a network of word units that can be primed lexically, syntactically, and pragmatically. Although it would appear that lower-level morphological and semantic processes are the strongest influences in this task, it is at least one easily-assessed measure of incongruity.

Method

Participants

39 undergraduate students (63% women) at Oberlin College participated in return for partial course credit. Recruitment materials included the title of the study, "Humor and Response Time". One data set showed strong evidence of either bad-faith participation or materials problems and had to be discarded.

Materials

Platform. The subjects were tested in groups of 8-12 using Power Macintosh 7100/80AV computers and Hypercard 2.3.5 software. This afforded a theoretical temporal measurement accuracy of 10 ms and an effective accuracy of about half that.

Stimulus Events. 46 jokes were culled from many sources, including children's joke books (13), internet humor compilations (14), the aphorisms of Ashleigh Brilliant (12), and personal communication (7). There were notable stylistic differences between jokes from these different sources. The main criterion for selection was that jokes not be funny until the last word was added. Jokes were edited somewhat to fit this criterion and to be appropriate to the audience. Jokes that were strictly puns in the classic sense (many of which depend on atypical pronunciation or stressing of their final word) were not considered.

Procedure

Participants were tested in groups of 8-12, both for convenience and to add a moderate social aspect to the study. After an initial mood assay in which subjects rated their feelings from 1-8 in relation to 3 positive and 3 negative adjectives (contented, depressed, happy, useless, brisk, dispirited), they were given the following instructions, in both a verbal and a visual form:

 

In this experiment, you will be presented with a series of jokes. When you first see a joke, the last word will be missing. As soon as you have read the beginning, press the F1 key. A "word" will then appear in the underlined space below. Your task will be to determine, as accurately and quickly as possible, if this is an English word. This response will be timed.

 

If it IS an English word, you will press the F8 key.

If it is NOT an English word, you will press the F1 key.

 

In order to increase experimental precision, please have your hands ready on these keys when the experiment starts.

 

After you have decided whether the word is English, you will be asked how funny you find the joke, including the last word given, relative to the other jokes in this study. Remember, give a high rating to jokes that are funnier than the others, even if you do not find them especially funny. You will also be asked if you have heard a joke before, ignoring the last word. After responding, press F1 for the next joke.

 

Please write your name in the space provided, without using the arrow keys, now.

 

Please look under your keyboard, now.

 

When they did look under their keyboards, they found one stick of bubble gum, of a brand that contains a small comic strip in the package. This positive mood manipulation was designed to increase their responsiveness to the jokes. After being given a chance to eat the gum, they completed the mood assay again, before proceeding to the main part of the study.

Each presentation began by showing the truncated joke. When subjects had read the joke, they pressed a key and read the test word in a predetermined location on screen. They made a word/non-word decision on this and recorded their response by hitting one of two keys. Then, they rated the "funniness" of the joke, relative to the other jokes in the study, from 1 to 8. Finally, they indicated whether or not they had heard the joke before (with any word in the final position), and then pressed a key to be given the next joke.

Design and Dependent Measures

This experiment used a within-subject repeated measures design, although there was a 3 x 2 array of groups for counterbalancing purposes. Subjects were randomly assigned to one of 3 conditions for ending type (so that each joke was given all 3 ending types) and 2 conditions for joke order. The 2 order conditions were randomized, except that the 5-joke primacy buffer was the same for both conditions, in order to give more consistency to the relative joke ratings across subjects.
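The assignment of ending types across the three counterbalancing groups can be sketched as follows. This is a minimal illustration that assumes a simple rotation of ending types; the rotation rule, the function name `ending_for`, and the zero-based indices are hypothetical and are not details reported in this study.

```python
ENDING_TYPES = ("humor", "non-humor", "non-word")

def ending_for(joke_index: int, ending_group: int) -> str:
    """Return the ending type shown for a joke in a given ending group.

    Assumed rotation: each group shifts the ending type by one, so across
    the three groups every joke appears once with each ending type.
    """
    return ENDING_TYPES[(joke_index + ending_group) % len(ENDING_TYPES)]

# Check the counterbalancing property for all 46 jokes:
# each joke receives all three ending types across the three groups.
for joke in range(46):
    assert {ending_for(joke, g) for g in range(3)} == set(ENDING_TYPES)
```

Any scheme with this property would serve; the rotation is only the simplest one to state.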

The dependent measures were response time and joke quality. Incorrect responses (7%) and familiar jokes (3%) were excluded from the analysis. Also, certain hardware problems arose during the study, as a network problem led computers to freeze for several seconds. This was usually visible in the raw data, as repeated striking of one key would lead to anomalously low (<100 ms) times for reading or response. These data points (3%) were also eliminated from analysis. Finally, the 5-joke primacy buffer was not analyzed.
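The exclusion rules above can be sketched as a single trial filter. The record layout and field names below are hypothetical, and treating the primacy buffer as the first five presentation positions is an assumption drawn from the design description.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    position: int        # 1-based order of presentation for this subject
    correct: bool        # lexical decision was correct
    familiar: bool       # subject reported having heard the joke before
    reading_ms: float    # time spent reading the truncated joke
    response_ms: float   # lexical decision response time

PRIMACY_BUFFER = 5       # assumed: the first 5 jokes are the unanalyzed buffer
FREEZE_CUTOFF_MS = 100   # times below this are treated as freeze artifacts

def keep(trial: Trial) -> bool:
    """Return True if the trial survives all exclusion rules."""
    if trial.position <= PRIMACY_BUFFER:
        return False
    if not trial.correct or trial.familiar:
        return False
    if trial.reading_ms < FREEZE_CUTOFF_MS or trial.response_ms < FREEZE_CUTOFF_MS:
        return False
    return True
```

For example, `keep(Trial(6, True, False, 2400.0, 950.0))` is true, while the same trial with `response_ms=40.0` is dropped as a freeze artifact.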

Also recorded were several attributes of the jokes and words used. These included total number of words of joke (including final word); number of words in the final clause of the joke; number of letters of the probe word; and number of syllables of the probe word. Also, the probe words were given a rating for their frequency in the English language. This rating was approximately normally distributed, and was obtained by taking the logarithm of the sum of the word's frequency in general written English (from Francis, 1982; based on the 1963 "Brown Corpus"), half the frequency of the stem word for variants formed by suffixation (e.g. "fight" for "fighting"), and twice the word's frequency in the present study's stimuli (to account for differing expectations in a joke context; this was also necessary to prevent any zeros, which would make logarithms impossible).
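The frequency rating just described can be written out as a short function. This is a sketch only: the function and parameter names are hypothetical, and the natural logarithm is an assumption, since the base is not stated.

```python
import math

def frequency_rating(corpus_freq: float, stem_freq: float, study_freq: float) -> float:
    """Log of the weighted frequency sum described in the text.

    corpus_freq: frequency of the word in general written English
    stem_freq:   frequency of the stem word (0 if the word is not a suffixed variant)
    study_freq:  frequency of the word in the present study's stimuli
    """
    total = corpus_freq + 0.5 * stem_freq + 2.0 * study_freq
    return math.log(total)  # natural log is an assumption; the base is not stated
```

Because every probe word appears at least once in the stimuli, the sum is at least 2 and the logarithm is always defined, which is the zero-avoidance role the study-frequency term plays.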

 

Results

 

Overall

There was no significant effect on mood of the mood intervention or of passing through the study itself.

Overall, the response times were somewhat higher than expected. Grainger and Jacobs (1996), in their survey of lexical decision studies, find average responses between 400 and 800 ms, whereas my medians are 1000-1200 ms and my means even higher. The number of upper outliers (extending to 10 s) is also a serious concern in this data set. These high response times could be due to several factors. My study is by necessity somewhat less monotonous than most RT studies, as each trial includes a joke-rating phase as well as the reading and probe phases. This may prevent subjects from getting "in the groove". Laughter from other subjects or the activity of gum-chewing may provide distractions. Some aspect of study design (such as the specifics of stimulus presentation) may have contributed to my slower scores. And finally, the very humor that is being studied may have unexpected effects on response time; certainly, few of the existing experiments on response time show many signs that humor could have been a factor in their results.

Between-subject variation

There was a slight negative correlation between average mood and response time (r=-.08, 95% confidence). Average mood correlated positively, however, with time spent reading the jokes (r=.09, 95% confidence). There was a positive correlation between the times spent reading and responding (r=.11, 99% confidence).

Different subjects, of course, responded differently to different jokes. However, there was no significant correlation between individual perceptions of joke quality corrected for the averaged quality of that joke (i.e. the idiosyncratic portion of subjects' individual joke ratings), and response time similarly corrected for average response to that joke. This would be a hard correlation to find, as the meaningful variations in perceived quality of individual jokes are much smaller, and thus more hidden by noise, than the overall variation in quality; and also because to correct for both subject variation and joke variation would leave an underdetermined model.

Across-subject effects on response time

A forward multiple regression of response time against number of joke words, number of joke syllables, number of probe word letters, joke order of presentation, and probe word frequency (see above), excluding non-word responses, showed several things. As expected, response time improved with practice (beta=11 ms per joke, 99% confidence), and worsened with probe length (beta=17 ms per letter, 95% confidence).

Syntactic and morphological effects were also significant. Longer jokes had slower response times (beta=30 ms per word, 99% confidence). However, the number of words in the final clause showed a negative effect on response time (beta=21 ms per word, 95% confidence). This suggests that earlier words in the joke serve merely as interference, whereas words in the final clause of the joke serve as useful syntactic constraints on the expectations for the word. Since these syntactic dimensions were significantly correlated with joke source, which may have its own impact on response time, too much should not be made of these data.

Unexpectedly, semantic effects did not enter into the final regression equation. While other studies, which control many factors with a rigor unattainable in the present investigation, have found English word appearance frequency to be the single best predictor of response time (Grainger and Jacobs, 1996), my measure of frequency shows a weaker correlation than the effects already mentioned. While it has a significant correlation with the uncorrected response time, that correlation vanishes when response time is corrected for the above effects.

Whatever these effects reflect, they point to a source of variation which was not experimentally controlled. In order to statistically control for them in further analysis, several assumptions must hold. First, it must be shown that they do not interact with probe type. Comparisons of the size of these effects across probe types show no significant interactions. Also, since joke quality is a major part of any analysis, these across-subject variables cannot be eliminated if they correlate with it. Correlations are non-significant in all cases, but joke order is seen to have a borderline significance (p=.09).

 

Between-condition effects

In within-subject ANOVAs of the dependent variables by subject and probe word type, correcting for the covariates mentioned above, both response time and joke quality showed significant variation across probe word types (p=.02 and p<.001, respectively; see Charts 1 and 2). This probably accounts for the strong correlation of joke quality and response time when all probe word conditions are mixed. It suggests that humor is correlated with incongruity, but since even the non-humor probes follow joke introductions which are atypical of ordinary speech, it is of limited significance.

In order to fully justify separate analyses by probe word type, we must show that probe word type is not just a linear effect but also shows interactions with other variables. An ANOVA of response time by subject and probe word type (considering only humor and non-humor probes, as these are the most similar) shows significant (95%) interaction.

Within subject, within condition effects

Using a MANOVA of joke quality and response time by subject, the within-cells correlation of the two dependent variables was .32 for non-joke probes, and nonsignificant for joke probes. This suggests that in unintentional humor, incongruity plays a key part, but that in more polished humor its effect is lost or hidden.

This same basic statistical result was duplicated using different tools. A new variable, corrected response time, was computed, subtracting out the contributions of whole-joke word number, within-clause word number, and probe letter number. Since the effects of these factors did not vary significantly across probe type, it was felt that these should be subtracted out evenly across probe type. Then, in order to control for between-subject variation, the correlation between joke quality and response time was obtained in the context of an ANOVA for corrected response time by subject with joke order and joke quality as covariates. This was computed using unique sums of squares, so as not to include the effects of intersubject correlations when computing the covariates. In this analysis, there was no significant correlation between joke quality and response time for either humor or non-word probes. However, there was such a correlation for non-humor probes (beta=4.572; p=.03).
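The corrected response time can be sketched as a simple linear adjustment. The function and parameter names are hypothetical. The coefficients in the usage note are the regression estimates reported earlier, with the final-clause coefficient entered as negative on the assumption that the prose ("a negative effect") gives its sign.

```python
def corrected_rt(rt_ms: float,
                 joke_words: int, clause_words: int, probe_letters: int,
                 b_joke_words: float, b_clause_words: float,
                 b_probe_letters: float) -> float:
    """Subtract the linear contributions of the three covariates from a raw RT.

    The same coefficients are applied to every probe type, mirroring the
    decision to subtract these effects out evenly across probe types.
    """
    return (rt_ms
            - b_joke_words * joke_words
            - b_clause_words * clause_words
            - b_probe_letters * probe_letters)
```

With coefficients of 30, -21, and 17 ms, a raw response of 1200 ms to a 4-letter probe ending a 12-word joke whose final clause has 5 words would be corrected to 1200 - 360 + 105 - 68 = 877 ms. The later ANOVA step with covariates is not reproduced here.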

In another analysis, joke quality was dichotomized by comparison with the mean joke quality within probe type. An ANOVA was then run on response time by subject, probe type, and dichotomized joke quality, correcting for joke order. All three effects were significant at the 95% level. Moreover, there were significant interactions between subject and joke quality, and subject and ending. The large number of subjects precludes further study of the nature of these interactions, but it does point to individual differences in the humor perceived from incongruity.
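The dichotomization step can be sketched as below. The data layout and the function name are hypothetical, and counting a rating exactly equal to the mean as "low" is an assumption, since the handling of ties is not stated.

```python
from collections import defaultdict
from statistics import mean

def dichotomize(ratings):
    """Split ratings into high/low relative to the mean of their probe type.

    ratings: list of (probe_type, rating) tuples.
    Returns a list of booleans, True where the rating exceeds the mean
    rating of its own probe type (ties count as low, by assumption).
    """
    by_type = defaultdict(list)
    for probe_type, rating in ratings:
        by_type[probe_type].append(rating)
    type_mean = {t: mean(values) for t, values in by_type.items()}
    return [rating > type_mean[probe_type] for probe_type, rating in ratings]
```

Computing the mean separately for each probe type keeps the overall funniness gap between joke and non-joke endings from determining which trials count as high quality.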

 

 

 

Chart 1

Chart 2

 

 

 

Discussion

 

The most significant result from this data is that, while response time and funniness rating are not correlated for jokes, they are correlated for non-jokes. This argues against the most simpleminded of incongruity-resolution models of humor, according to which all humor should obey the same laws and interactions. However, a theory which recognizes different levels of incongruity would be able to account for this data. Although several of the studies discussed in the introduction show clear effects of such high-level processes as metaphoric comprehension on lexical response time, it remains true that lower-level morphological, semantic, and syntactic priming and inhibition are stronger effects. Thus higher-level incongruities in the jokes of the current study may not be showing up in the response time data.

Another interpretation would be that entirely different humor processes, or differing mixtures of unrelated humor processes, are going on in the two conditions. This would probably be the interpretation of Ruch, who would place whatever humor is found in the non-jokes on the nonsense dimension of his 3-WD scale, while putting the humor of jokes on the other two dimensions as well.

Minsky's theory of humor would offer a variation of the different-source interpretation. While this theory suggests that the underlying cause of humor is always the same, the identification of a potential cognitive error, the low-level processes that can go into it can be - by definition - any cognitive process. Thus, the non-jokes were funny when (and partially because) they revealed the subjects' failure to correctly predict the final word of the joke. The humor of this would be facilitated by the experimental situation, which by separating joke from punchline subtly encourages the process of prediction. The jokes, being funny for other reasons, would not show a correlation between humor and response time. According to this theory, riddles (which also get some of their humor from the process of foiled prediction) would show a correlation like that of non-jokes in this study.

Most importantly, however, this study shows that the mechanisms of humor can be studied with a traditional cognitive science toolkit. It is true that finding a stock of jokes useable for a given research paradigm is hard. It is also true that, since jokes are by their nature exceptions to the rules, it is difficult to control a list of jokes for variation in the cognitive processes they call for. However, the rewards would be great, and the current study shows that the project is feasible in at least a limited sense.

That said, I have one final caution. The results of this study may be difficult to reproduce. Although the positive mood manipulation did not show significant results, my assessment of the subjects' mood is that it was notably positive. This positive mood may be necessary to get meaningful funniness ratings for non-joke stimuli. Oberlin is a small college, and friendships between the subjects within one session and between subjects and the experimenter were relatively common, contributing to the jolly mood. Moreover, I would guess that many students here value a childlike sense of play more than the average undergraduate.

 

References

 

Deckers, L. & Ruch, W. (1992). Sensation seeking and the Situational Humor Response Questionnaire (SHRQ). Personality and Individual Differences, 13, 1051-1054.

Francis, W. N. (1982). Frequency Analysis of English Usage: Lexicon and Grammar. Boston: Houghton Mifflin.

Gildea, P. & Glucksberg, S. (1983). On understanding metaphor: The role of context. Journal of Verbal Learning and Verbal Behavior, 22, 577-590.

Grainger, J. & Jacobs, A. (1996). Orthographic processing in visual word recognition: A multiple read-out model. Psychological Review, 103, 518-565.

Long, D. & Graesser, A. (1988). Wit and humor in discourse processing. Discourse Processes, 11, 35-60.

Minsky, M. (1981). Jokes and their relation to the cognitive unconscious. In Vaina and Hintikka (eds.), Cognitive Constraints on Communication. Reidel.

Pynte, J., Besson, M., Robichon, F., & Poli, J. (1996). The time-course of metaphor comprehension: An event-related potential study. Brain and Language, 55, 293-316.

 

Appendix: Stimuli

Each joke stem below is followed by its three probe words, in this order: Humor ending, Non-Humor ending, Non-Word.

A donkey saw a zebra and thought, "Imagine that! A donkey from

jail Africa flutback

Hunter 1: I just saw a big bear. Hunter 2: Did you let him have both barrels? Hunter 1: I let him have the

gun footprint ninsprew

Al: What's the score? Bea: Eight to five. Al: Who's winning? Bea:

eight us larting

Q: What does it take to hit a game-winning homer in the world series? A: A

bat heart hent

Ann: I've been skiing since I was five. Bo: You must be really

tired good turry

Al: What do you get if you cut a steak in half, then in thirds? Bea: Sixths. Al: What if you cut it in half, then thirds, then quarters? Bea:

hamburger twentieths obigion

Woman: Why are you crying, little boy? Boy: My teacher yelled at me for something I didn't even do. Woman: What was it? Boy:

homework fighting mastel

Man: How do you like school? Girl:

closed fine sonducer

Ann: Bo, could you help me with this take-home exam? Bo: It wouldn't be right. Ann: But you could

try explain bain

Q: When don't one and one make two? A: When they make

eleven three teemy

Q: How much dirt can you get from a hole three feet by two feet by two feet? A: The dirt's

gone heavy fick

Ann: Do you have any plans to see a doctor about your impotence? Bo: Nothing

firm definite unpoiler

Doctor: We can probably save the toes, but I don't know about the

foot toenails dersed

I never worry about how I'll feel in the morning. I just

oversleep drink rumb

Anyone caught hanging from the rim will be

suspended punished nack

Wonderful! You have some of my favorite

problems flavors jargle

I'd love to assist you out of your difficulties, into

mine resolution regune

Sometimes I need what only you can provide - your

absence smile spude

I don't guarantee anything, which is lucky for you, since my guarantees are usually

worthless expensive monesis

Sooner or later, I'll be

punctual done grofinity

If you never try anything new, you'll miss many of the world's great

disappointments experiences handicer

Why has it taken me so long to tell you that I find it hard to

communicate type grome

History records no more gallant struggle than that of humanity against the

truth unknown protovalin

Q: What's brown and sticky? A: a

stick cake beal

Q: What do you get when you cross a lawyer and a spineless politician? A:

Chelsea babies nillet

I come from a small town whose population never changed. Each time a woman got pregnant, someone

left died mowl

I always cry at weddings, especially

mine vows tarpa

There's nothing like

uniqueness success rog

Wagner's music is better than it

sounds seems patish

Some things have to be believed to be

seen understood rudio

Take my advice, I usually

don't win horren

Spectator: What a great parade! And look at our boy Al. He's the only boy walking in

step front gortination

Be different, act

normal strangely asseam

On the other hand, you have different

fingers problems enjemple

Smoking cures weight problems,

eventually usually roon

I went to a general store. They wouldn't let me buy anything

specifically alcoholic lestute

There was a power outage at a department store yesterday. Twenty people were trapped on the

escalators elevators fregs

Can you be a closeted

claustrophobic nudist mesk

I never agree with my boss until she's

finished right waise

Puritanism is the haunting fear that someone, somewhere may be

happy sinning lossive

Those who live by the sword get

shot hurt tish

I feel like I'm diagonally parked, in a parallel

universe space tumple

I wonder how much deeper the ocean would be without

sponges islands funda

Despite the cost of living, have you noticed how it remains so

popular affordable plabber

A day without sunshine is like

night loneliness bith

To keep milk from turning sour, keep it in the

cow fridge clant