Throughout this class so far, the method we've been using
to analyze pragmatics has mostly been introspection: we think about
what various utterances might mean and we think about what we might say in
various situations, and we use those ideas to support or criticize theories
of pragmatics. (This approach is sometimes derisively called "armchair
linguistics", because you can do it while relaxing in an armchair.) Another
method we've used is observation: noticing interesting examples in
the real world and thinking about how they support or challenge pragmatic
theories.
In recent decades, though, there has been an increasing
interest in doing controlled empirical experiments to study
pragmatics. This has given rise to a new field of study, experimental pragmatics,
in which researchers use the methods and techniques of experimental
psychology to figure out how pragmatics works and to try to answer
questions that have not yet been resolved through the traditional
methods of introspection and observation. "Experimental pragmatics" started
to be recognized as a field in the early 2000s, and really exploded into
a major area of research in the 2010s and onward; for example, in 2021 the
XPrag research network organized a symposium
commemorating "20 years of experimental pragmatics".
(Note that here I'm using the term "experiment" to mean
collecting data from lots of people in a controlled survey or test. It's
different from "observation" because it involves the researcher setting
up different conditions—e.g. seeing how people interpret a certain
sentence in different contexts—whereas observation depends on
just finding things as-is in the real world. And I'm also treating it
as different from "introspection" because it involves the collection of
observable data from other people, rather than the collection of the
researcher's own intuitions. Technically, from a philosophy-of-science
perspective, introspection might also be an "experiment", but that's
not the sense in which I'm using the term here. Practically speaking,
and at risk of oversimplifying a bit, "experiments" usually involve
computers or at least paper-pencil surveys.)
Experimental pragmatics is a vast topic which we could
easily spend a whole semester (or more) on. It's also a topic that's near
and dear to my own heart, because it's the kind of research I've done; my own
doctoral dissertation
is a set of pragmatics experiments (even though I have since
come to believe that much of my initial research was misguided). For this
module, though, let's
limit ourselves to two closely related case studies. (For more information
on experimental pragmatics you should read Noveck, which is a great
overview of the field; chapters 6 and 7 in particular are about the
same sort of cases as what this module will examine.)
To learn about experimental pragmatics, we'll look at
two experiments that are nice examples of this field. Both experiments
are about scalar implicatures. So before we look at the experiments,
let's take a quick look at what scalar implicatures are and at
some of the challenging questions about them that needed
to be addressed through experimentation.
Scalar implicatures
Throughout many of the previous modules we've revisited the
same made-up example, in which Rebecca says "Josh is smart" and
thereby implicates that she thinks Josh is not brilliant.
This is an example of one of the most widely discussed
phenomena in pragmatics, so-called "scalar implicatures". As we've seen,
scalar implicatures are those that supposedly come from thinking about a stronger
alternative that the speaker could have said but chose not to say (i.e.,
Rebecca could have said "Josh is brilliant" but for some reason she
didn't). In particular, "scalar" implicatures are ones where this
stronger alternative comes from replacing one word in the utterance
with a stronger word from an ordered "scale" of related words that exists
in a person's vocabulary. For instance, in this example, maybe we assume
that Rebecca and the listener both know that words like brilliant,
smart, and average form a scale
<brilliant, smart, average>, from strongest
to weakest. When Rebecca uses a word that's not the strongest one on
the scale, the listener wonders why she didn't use a stronger word and
infers that she doesn't think the stronger word applies. Many other words
also exist in scales which may trigger scalar implicatures; for instance,
an utterance like "X or Y" can implicate "X or Y but not both"
because or is weaker than its scalemate and (<and, or> scale);
maybe or probably can implicate that something is not
certain (<certainly, probably, maybe> scale);
and an <always, sometimes> scale can make
sometimes implicate "not all the time", as illustrated near the
beginning of this
funny scene from the movie Best in Show.
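The scale-based reasoning described above can be sketched in a few lines of code. This is only a toy illustration, not a model anyone has proposed: the names `SCALES`, `stronger_alternatives`, and `scalar_implicatures` are made up for this module, and the sketch assumes that scales are simple ordered lists and that the listener negates every stronger alternative.

```python
# Scales from the text above, each ordered from strongest to weakest.
SCALES = [
    ["brilliant", "smart", "average"],
    ["and", "or"],
    ["certainly", "probably", "maybe"],
    ["always", "sometimes"],
]


def stronger_alternatives(word):
    """Return the scalemates that are stronger than `word` (empty if none)."""
    for scale in SCALES:
        if word in scale:
            return scale[: scale.index(word)]
    return []


def scalar_implicatures(utterance, word):
    """Swap `word` for each stronger scalemate and mark the result as negated.

    Assumption: the swap is a plain substring replacement, which is good
    enough for toy sentences but not for real text.
    """
    return [
        "not: " + utterance.replace(word, alternative, 1)
        for alternative in stronger_alternatives(word)
    ]


print(scalar_implicatures("Josh is smart", "smart"))
```

For "Josh is smart", the only stronger scalemate is brilliant, so the sketch yields the single inference that Josh is not brilliant. A word at the top of its scale, like brilliant itself, yields no implicature at all.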
Scalar implicatures are probably the biggest topic studied
in experimental pragmatics, at least in terms of the number of experiments
that have focused on them. Just as a good friend of mine once derisively told me
that Cui Jian's "Nothing
to My Name" (一无所有) is the most over-analyzed
song in the history of Chinese rock and roll, scalar implicatures are the most
over-analyzed thing in pragmatics. After having spent a number of years doing
research on them, I no longer believe they're anything special (see, e.g.,
Geurts, for discussion on how scalar implicatures just follow from the
same principles as other implicatures). Nevertheless, they are often a
useful test case for designing experiments to test various
theories of pragmatics. Below we will examine two experiments that were
designed to test two different issues about scalar implicatures. For more
detailed information about these and other issues, excellent summaries of
scalar implicature research are available in
Noveck & Sperber (2007),
Katsos & Cummins (2010),
Sauerland (2012),
Chemla & Singh (2014),
van Tiel et al. (2016,
2019),
and Geurts.
Case study 1: "embedded implicatures"
So far our discussions
of scalar implicatures have assumed that they are conversational implicatures
which arise when a speaker flouts the maxim of quantity. Specifically, our
treatment has generally assumed that scalar implicatures are a sort of
generalized conversational implicature which follows from some automatic
inferential steps that happen in any context; but a very similar analysis
could still work even if we treat scalar implicatures as particularized
conversational implicatures. Either way, this is the most traditional
view of scalar implicatures and it's based on Gricean ideas (although people
often call the generalized-conversational-implicature view of them "neo-Gricean",
because it's essentially based on Gricean pragmatics but it's a view that
was further developed by other researchers after Grice).
This view is controversial, though, and there are other
theories that propose very different explanations of how scalar implicatures
work. The one most relevant to this experiment is something called the
"grammatical theory", which claims that "scalar implicatures" are not
implicatures at all, but rather they are interpretations based on
syntax and semantics. According to this theory, scalar words have their literal meaning (e.g.
sometimes means "at least some of the time", smart means
"at least smart", etc.), but their meaning gets enriched by a grammatical
operator when they're used in a syntactic context. You can imagine this
operator as being like a silent "only"; the idea is that when
someone says "Josh is smart", there is actually a silent operator
in the sentence that makes it be interpreted as "Josh is ONLY smart"
(i.e., only smart but not brilliant). Likewise for any other "scalar" terms;
for instance, an utterance like "I sometimes remember to floss before
brushing" gets interpreted as "I ONLY sometimes remember to floss
before brushing" because of the insertion of this silent operator.
Crucially, according to the grammatical theory, this process happens in
syntax and/or semantics, not in pragmatics, and thus "scalar implicatures"
are not actually implicatures; they are grammatical (syntactic and/or
semantic) enrichments.
We don't have enough time or space here to get into all the
details of the debate between these approaches. (But see Geurts, chapters 7-8, and
Noveck, chapters 6-7, for much more discussion of these.) Let's just focus on
one issue. One major point of contention between these theories has
been the question of whether or not "scalar implicatures" occur in embedded
contexts. All these theories agree that "scalar implicatures" can
happen in matrix sentences; i.e., "I regret some of the things I
said" may be taken to mean "I regret some, but not all, of the
things I said". Where the theories disagree is on whether that "some"
will still get interpreted as "not all" if it's in a syntactically embedded clause
(e.g., "I think the president regrets some of the things
he said"—according to the grammatical theory, this should mean
that the speaker thinks the president does
not regret all the things he said), or in the scope of another semantic
operator (e.g., "All of the people involved in the scandal regret some
of the things they said."—according to the grammatical theory this
should mean that no person involved regrets everything they said, i.e., maybe
person A regrets 80% of what he said and person B regrets 50% of what she said,
but there is no person who regrets 100% of what he said.) Some of the pragmatic
theories, on the other hand, might not
predict "some" to get interpreted as "not all" in these situations,
because implicatures are supposed to happen at the level of the full utterance
rather than the level of a particular clause within the utterance (see, however,
Geurts, chapters 7-8, for an explanation of how these theories can accommodate
so-called "embedded implicatures" if those sorts of implicatures do indeed
occur).
This is not really a question that can be answered by introspection,
because people have very different intuitions about what these utterances mean.
Researchers tend to be willing to interpret these weird sentences in whatever
way ends up consistent with their theory. I don't trust my own intuitions with
very complicated sentences like this; I feel like after a few years of doing
pragmatics I've seen so many of these that my intuitions are messed up and
are not the same as what "normal" people's intuitions are.
Geurts
and Pouscoulous (2009) tested this by designing a clever survey. They showed
volunteers scenarios and sentences like the one depicted in the figure below. In
this example (copied from their paper), there is a picture showing three squares
and three circles. Let's call the square at the top "square A", and the two squares
at the bottom "square B" and "square C". Square B is connected to two of the circles,
and Square C is also connected to two of the circles. Crucially, though, Square A
is connected to all three circles. Along with this picture, the volunteers
saw the sentence "All of the squares are connected with some of the
circles"—i.e., a typical "embedded implicature" sort of example sentence like
the ones we've seen above. The volunteers were asked to decide whether that sentence
is true or false, with respect to the picture.
Recall that the "grammatical theory" predicts that the scalar
implicature should be realized in the embedded context; in other words, people
should interpret this sentence as meaning "all of the squares are connected with
not-all of the circles", i.e., no square can be connected to all of the circles.
Since Square A is connected to all of the circles, people should consider this
sentence "false", according to the grammatical theory. On the other hand, the
Gricean theory supposedly predicts that people won't get scalar implicatures
in this context; in other words, they won't think that "some" has to
mean "not all" here. Therefore, according to that theory, volunteers should
consider this sentence "true", even though Square A is connected to all of
the circles.
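To make the two predictions concrete, here is a small sketch that evaluates the test sentence against the picture under each reading. It is an illustration I've put together for this module, not anything from the paper: the circle labels, the choice of which two circles squares B and C touch, and the function names are all assumptions.

```python
# The example picture as described above: square A is connected to all
# three circles, squares B and C to two circles each.
# Assumption: the circle labels, and exactly which circles B and C touch.
CIRCLES = {"c1", "c2", "c3"}
CONNECTIONS = {
    "A": {"c1", "c2", "c3"},
    "B": {"c1", "c2"},
    "C": {"c2", "c3"},
}


def literal_reading(connections):
    """'some' = at least one: every square touches at least one circle."""
    return all(len(linked) >= 1 for linked in connections.values())


def embedded_reading(connections, circles):
    """'some' = some but not all, interpreted inside the scope of 'all'."""
    return all(
        1 <= len(linked) < len(circles) for linked in connections.values()
    )


print("literal reading: ", literal_reading(CONNECTIONS))
print("embedded reading:", embedded_reading(CONNECTIONS, CIRCLES))
```

The literal reading comes out true and the embedded reading comes out false, and the only reason they differ is square A. That is what makes this picture a useful test item: a volunteer's "true" or "false" response tells us which reading they arrived at.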
In Geurts and Pouscoulous's experiment, every volunteer marked
this sentence as "true". This seems like strong evidence in favor of the
Gricean theory and against the grammatical theory. (Or, at the very least,
strong evidence that people don't interpret "some" as "not all" in
this context; whether the theories actually predict and explain that is
a separate question that has been hotly debated in the years since, and
proponents of the grammatical theory may claim that either (a) their theory
also predicts this outcome, so Geurts and Pouscoulous actually mis-represented
the predictions of the grammatical theory; or (b) there was some problem with
the way the experiment was designed so its results are wrong.)
What makes these results so compelling and important is
that Geurts and Pouscoulous didn't just sit down and say "I think this
sentence doesn't mean 'not all', so my theory is right." They went to
the trouble of designing an experiment and actually collecting data, to
show that the interpretation predicted by their theory is really the way
that real people interpret these sentences.
One limitation of this sort of research is that it
often relies on very complex and crazy sentences which seem quite
unusual for normal life. These sorts of experiments always remind me
of Bilbo Baggins's birthday
speech with the ridiculously complicated sets of embedded quantifiers
("I don't know half of you half as well as I should like, and I like
less than half of you half as well as you deserve!"). We can't be
sure that the way people understand these crazy sentences is a good
reflection of the way they understand normal language. In our next
case study below, we'll briefly see how we can use psycholinguistic
techniques to measure how people process more normal, natural
utterances.
Case study 2: is there a processing cost for implicatures?
Above, we learned about the debate over whether "scalar implicatures"
are really implicatures. Another, quite separate, debate focuses on whether or
not there is an extra cognitive processing cost to understand implicatures.
According to one view, understanding an implicature should take more time and
more cognitive effort. Understanding the literal meaning of a sentence just
takes semantics, but understanding an implicature takes extra steps of
logical inference (recall that, as we saw in the module on weak
and strong implicatures, recovering a strong scalar implicature is supposedly
a four-step inferential process) and each of these steps must take some time
and effort. Therefore, some researchers argue that if scalar implicatures are
processed according to Gricean pragmatic reasoning, it should take more time
and effort to interpret "some" as meaning "not all" (i.e., to interpret
it pragmatically) than it does to interpret "some" as meaning "at
least one and possibly all" (i.e., to interpret it literally/semantically).
To test that, we need some way of measuring how long it takes
someone to interpret some utterance, and/or how much effort they are using to
understand the utterance. Fortunately, psycholinguistics
provides us with many tools for doing that. We can use special sorts of equipment
to measure people's brain activity and/or eye movements while they're reading or
hearing a sentence, or we can even use simple computer programs to time how long
it takes them to read a sentence.
One
of my own experiments did just that. We had people read very short stories
like the ones below, and we used a computer to time how long it took them to
read each phrase. (This experiment was inspired by a
very similar 2006 experiment by Breheny, Katsos, & Williams. Their 2006 experiment
was groundbreaking, in that it figured out this clever way to test how people
read sentences with scalar implicatures, and it inspired a whole generation of
follow-up research such as the one I'm describing below. But I'm using my own
experiment as the example for this module because I think the sentences are
a bit more straightforward; the Breheny et al. 2006 experiment is very similar
but there is an extra difference between the key sentences which complicates the
design.)
Upper-bounded: Mary asked John whether he intended to host all
of his relatives in his tiny apartment. John replied that he intended to host
some of his relatives. The rest would stay in a nearby hotel.
Lower-bounded: Mary asked John whether he intended to host any
of his relatives in his tiny apartment. John replied that he intended to host
some of his relatives. The rest would stay in a nearby hotel.
These two stories are almost completely identical, except for one
word. In the "upper-bounded" context, Mary asks a question about whether John
will host all of his relatives. In the "lower-bounded" context, on the other
hand, Mary asks a question about whether John will host any of his
relatives. This tiny difference should cause a difference in how the word "some"
gets interpreted later. In the upper-bounded context, it is likely that readers
would interpret John's response "some" as meaning that he will host
"not all" of his relatives (since Mary directly asked a question about whether
he'd host all of his relatives, and he didn't just say "yes"). On the other hand,
in the lower-bounded context, it is likely that readers would not interpret
"some" in that way, and they might think John is making no commitment
about whether or not he would host all of them.
In other words, we expected that in the upper-bounded context
readers would interpret "some" with an implicature, and in the lower-bounded
context they would not (they would interpret it literally).
Having set up a way to get people to read the same sentence with
or without an implicature (by reading the same sentence in different contexts),
we could then measure how long they took to read each phrase in the sentence. The
key phrase, underlined in the above examples, is "some of". As I mentioned
above, some theories predict that it should take people extra time and effort
to understand an implicature; if so, then people should need more time to read "some of"
in the upper-bounded context than in the lower-bounded context.
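The logic of this comparison can be sketched as a simple calculation. This is a bare-bones illustration, not the analysis used in the actual experiment (which would involve proper statistics over many readers and items); the function name `implicature_cost` and the dictionary layout are assumptions, and the numbers are invented placeholders rather than data from any study.

```python
from statistics import mean


def implicature_cost(reading_times):
    """Mean reading time for "some of" in upper-bounded contexts minus
    the mean in lower-bounded contexts, in milliseconds.

    A positive value is what the processing-cost view predicts.
    """
    return mean(reading_times["upper"]) - mean(reading_times["lower"])


# Invented placeholder numbers, NOT results from any experiment.
# They are chosen only to show what a "cost" pattern would look like.
toy_times = {
    "upper": [410, 430, 450],
    "lower": [400, 420, 440],
}

print(implicature_cost(toy_times), "ms")
```

With these made-up numbers the difference is positive, which is the pattern the processing-cost view expects. A difference near zero, or a negative one, would count against that view.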
You can see the key results from the experiment in the figure below.
The graph shows the reading time for each phrase in the sentence; the reading times
for the key phrase, "some of", are highlighted by a green circle. We can
see that, contrary to the prediction, people did not take more time to read this
in the upper-bounded context than the lower-bounded context; if anything, they
read slightly faster in the upper-bounded context.
I actually found results similar to these in later studies measuring
people's brain
waves and eye
movements as they heard or read sentences like these. Across all these studies,
I never found evidence that people need extra time and effort to understand
scalar implicatures. (But I may be in the minority here; most researchers writing
about this topic seem to assume that it's now widely accepted that implicatures
take time and effort. See Noveck, chapter 6, for a good and nuanced discussion
that considers much of this research together.)
One of the benefits of this type of research is that we don't
have to explicitly ask people what they think a sentence means, and thus we
don't have to worry about subjective responses or the possibility that people
might not be able to accurately report what they think something means (or,
even worse, that our very act of asking the question will influence how they
interpret the sentence). We also don't need to present volunteers with crazy
and complicated sentences. We can just give people relatively normal utterances,
and have them listen to or read them in a relatively natural way. As long as
we have a good experiment design, we can then use their objective behavioural
or neural responses to make inferences about how pragmatics works. The drawback
to all this is that experiments themselves involve a lot of assumptions and
complicated methodology, and the link between the experiment and the
phenomena we want to test may break down at any of these assumptions (i.e.,
using the experiment results to make conclusions about how scalar implicatures
work requires assuming that our experiment is accurately measuring how people
process scalar implicatures, so if there's any reason to doubt that assumption
then that will also cast doubt on anything we learn about pragmatics through
that experiment).
In-class activities
Take any pragmatics question from earlier in this class (it could be something
students have raised before, something from one of the modules, or something
you've thought of on your own). Have students try to think of a way they
could design an experiment to test it.