Postlexical Integration Processes in

Language Comprehension: Evidence

from Brain-Imaging Research

COLIN M. BROWN, PETER HAGOORT, AND MARTA KUTAS

ABSTRACT Language comprehension requires the activation,

coordination, and integration of different kinds of linguistic

knowledge. This chapter focuses on the processing of syntactic

and semantic information during sentence comprehension,

and reviews research using event-related brain potentials

(ERPs), positron emission tomography (PET), and functional

magnetic resonance imaging (fMRI). The ERP data provide

evidence for a number of qualitatively distinct components

that can be linked to distinct aspects of language understanding.

In particular, the separation of meaning and structure in

language is associated with different ERP profiles, providing a

basic neurobiological constraint for models of comprehension.

PET and fMRI research on sentence-level processing is at

present quite limited. The data clearly implicate the left perisylvian

area as critical for syntactic processing, as well as for aspects

of higher-order semantic processing. The emerging

picture indicates that sets of areas need to be distinguished,

each with its own relative specialization.

In this chapter we discuss evidence from cognitive neuroscience

research on sentence comprehension, focusing

on syntactic and semantic integration processes. The

integration of information is a central feature of such

higher cognitive functions as language, where we are

obliged to deal with a steady stream of many different types of

information. Understanding a written or spoken

sentence requires bringing together different kinds of

linguistic and nonlinguistic knowledge, each of which

provides an essential ingredient for comprehension.

One of the core tasks that faces us, then, is to construct

an integrated representation. For example, if a listener is

to understand an utterance, then at least the following

processes need to be successfully completed: (a) recognition

of the signal as speech (as opposed to some other

kind of noise), (b) segmentation of the signal into constituent

parts, (c) access to the mental lexicon based on

the products of the segmentation process, (d) selection

of the appropriate word from within a lexicon containing

some 30,000 or more entries, (e) construction of the

appropriate grammatical structure for the utterance up

to and including the word last processed, and (f) ascertaining

the semantic relations among the words in the

sentence. Each of these processes requires the activation

of different kinds of knowledge. For example, segmentation

involves phonological knowledge, which is largely

separate from, for instance, the knowledge involved in

grammatical analysis. But knowledge bases like phonology,

word meaning, and grammar do not, on their own,

yield a meaningful message. While there is no question

that integration of these (and other) sources of information

is a prerequisite for understanding, considerable

controversy surrounds the details.

Which sources of knowledge actually need to be distinguished?

Is the system organized into modules, each

operating within a representational subdomain and

dealing with a specific subprocess of comprehension?

Or are the representational distinctions less marked or

even absent? What is the temporal processing nature of

comprehension? Does understanding proceed via a

fixed temporal sequence, with limited crosstalk between

processing stages and representations? Or is comprehension

the result of more or less continuous interaction

among many sources of linguistic and nonlinguistic

knowledge? These questions, which are among the most

persistent in language research, are now gaining the attention

of cognitive neuroscientists. This is an emerging

field, with a short history. Nevertheless, progress has

been made, and we present a few specific examples in

this chapter.

A cognitive neuroscience approach to language might

contribute to language research in several ways. Neurobiological

data can, in principle, provide evidence on

the representational levels that are postulated by different

language models—semantic, syntactic, and so on (see

the section on PET/fMRI).

(Author note: COLIN M. BROWN and PETER HAGOORT, Neurocognition of Language Processing Research Group, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands; MARTA KUTAS, Department of Cognitive Science, University of California, San Diego, Calif.)

Neurobiological data can reveal the temporal dynamics of comprehension, crucial

for investigating the different claims of sequential and

interactive processing models (see the sections on the

N400 and the P600/SPS). And, by comparing brain activity

within and between cognitive domains, neurobiological

data can also speak to the domain-specificity of

language. It is, for example, a matter of debate whether

language utilizes a dedicated working-memory system

or a more general system that subserves other cognitive

functions as well (see the section on slow brain-potential

shifts).

Postlexical syntactic and semantic

integration processes

In this chapter we focus specifically on what we refer to

as postlexical syntactic and semantic processes. We do

not discuss the processes that precede lexical selection

(see Norris and Wise, chapter 60, for this subject), but

rather concern ourselves with processes that follow

word recognition. Once a word has been selected within

the mental lexicon, the information associated with this

word needs to be integrated into the message-level representation

that is the end product of comprehension. If

this integration is to be successful, both syntactic and semantic

analyses need to be performed.

At the level of syntax, the sentence needs to be parsed

into its constituents, and the syntactic dependencies

among constituents need to be specified (e.g., What is

the subject of the sentence? Which verbs are linked with

which nouns?). At the level of semantics, the meaning of

an individual word needs to be merged with the representation

that is being built up of the overall meaning of

the sentence, such that thematic roles like agent, theme,

and patient can be ascertained (e.g., Who is doing what

to whom?). These syntactic and semantic processes lie at

the core of language comprehension. Although words

are indispensable bridges to understanding, it is only in

the realm of sentences (and beyond in discourses) that

they achieve their full potential to convey rich and varied

messages.

The field of language research lacks an articulated

model of how we achieve (mutual) understanding. This

lack is not too surprising when we consider the problems

that confront us in devising a theory of meaning for

natural languages, let alone the difficulties attendant on

combining such a representational theory with a processing

model that delineates the comprehension process

at the millisecond level. However understandable,

the lack of an overall model has meant that the processes

involved in meaning integration at the sentential

level have received scant experimental attention. The

one area in which quite specific models of the relationship

between semantic representations and on-line language

processing have been proposed is the area of

parsing research. Here, a major concern has been to assess

the influence of semantic representations on the

syntactic analysis of sentences, with a particular focus on

the moments at which integration between meaning and

structure occurs (cf. Frazier, 1987; Tanenhaus and

Trueswell, 1995). Research in this area has concentrated

on the on-line resolution of sentential-syntactic ambiguity

(e.g., “The woman sees the man with the binoculars.”

Who is holding the binoculars?). The resolution of this

kind of ambiguity speaks to the separability of syntax

and semantics, as well as to the issue of sequential or interactive

processing. The prevailing models in the literature

can be broadly separated into autonomist and

interactive accounts.

In autonomous approaches, a separate syntactic knowledge

base is used to build up a representation of the syntactic

structure of a sentence. The prototypical example

of this approach is embodied in the Garden-Path model

(Frazier, 1987), which postulates that an intermediate

level of syntactic representation is a necessary and obligatory

step during sentence processing. This model stipulates

that nonsyntactic sources of information (e.g.,

message-level semantics) cannot affect the parser’s initial

syntactic analysis (see also Frazier and Clifton, 1996;

Friederici and Mecklinger, 1996). Such sources come

into play only after a first parse has been delivered.

When confronted with a sentential-syntactic ambiguity,

the Garden-Path model posits principles of economy, on

the basis of which the syntactically least complex analysis

of the alternative structures is chosen at the moment

the ambiguity arises. If the chosen analysis subsequently

leads to interpretive problems, this triggers a syntactic

reanalysis.
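The two-stage control structure just described — choose the syntactically least complex analysis first, and reanalyze only if interpretation fails — can be caricatured in a short sketch. This is an illustration only, not an algorithm given by the chapter; the parse labels and the complexity numbers (standing in for syntactic node counts under minimal attachment) are invented for the example.

```python
def garden_path_parse(candidate_parses, is_interpretable):
    """Caricature of the Garden-Path model's two-stage logic.

    candidate_parses: list of (parse, complexity) pairs, where the
    complexity number stands in for syntactic node count.
    is_interpretable: a semantic check applied only AFTER the first
    parse is delivered -- semantics cannot steer the initial choice.
    """
    # Stage 1: purely syntactic choice -- the least complex analysis wins.
    ordered = sorted(candidate_parses, key=lambda pc: pc[1])
    first_parse = ordered[0][0]
    if is_interpretable(first_parse):
        return first_parse
    # Stage 2: interpretive failure triggers syntactic reanalysis.
    for parse, _ in ordered[1:]:
        if is_interpretable(parse):
            return parse
    return None

# "The woman sees the man with the binoculars." -- two attachments.
parses = [("PP attaches to VP (woman uses binoculars)", 2),
          ("PP attaches to NP (man has binoculars)", 3)]
choice = garden_path_parse(parses, lambda p: True)
# The simpler VP-attachment analysis is chosen on the first pass.
```

Note that semantic information enters only through the `is_interpretable` check after a parse has been delivered, which is the model's defining architectural commitment.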

In the most radical interactionist approach, there are no

intermediate syntactic representations. Instead, undifferentiated

representational networks are posited, in which

syntactic and semantic information emerge as combined

constraints on a single, unified representation (e.g., Bates

et al., 1982; Elman, 1990; McClelland, St. John, and

Taraban, 1989). In terms of the processing nature of the

system, comprehension is described as a fully interactive

process, in which all sources of information influence

the ongoing analysis as they become available.

A third class of models sits somewhere in between the

autonomous and radical interactionist approaches. In

these so-called constraint-satisfaction models, both lexically represented

information (such as the animacy of a noun or

the transitivity of a verb) and statistical information

about the frequency of occurrence of a word or of syntactic

constructions play a central role (cf. MacDonald,

Pearlmutter, and Seidenberg, 1994; Spivey-Knowlton

and Sedivy, 1995). The approach emphasizes the interactive

nature of comprehension, but does not exclude

the existence of separate representational levels as a

matter of principle. Comprehension is seen as a competition

among alternatives (e.g., multiple parses), based

on both syntactic and nonsyntactic information. In this

approach, as in the more radical interactive approach,

sentential-syntactic ambiguities are resolved by the

immediate interaction of lexical-syntactic and lexical-semantic

information, in combination with statistical

information about the relative frequency of occurrence

of particular syntactic structures, and any available discourse

information, without appealing to an initial syntax-

based parsing stage or a separate revision stage (cf.

Tanenhaus and Trueswell, 1995).
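The competition idea can be sketched as weighted evidence summed over all constraints at once, with no privileged first-pass syntactic stage. The constraint names and every weight and support value below are invented for illustration and carry no empirical claim.

```python
def competition_scores(alternatives, constraints):
    """Score each parse alternative by summing weighted support from
    all constraints simultaneously (lexical-syntactic, lexical-semantic,
    structural frequency, discourse), as in constraint-satisfaction
    accounts -- no initial syntax-only stage, no separate revision stage.

    alternatives: list of parse labels.
    constraints: list of (weight, support) pairs, where support maps
    each parse label to a value in [0, 1].
    """
    return {alt: sum(w * support[alt] for w, support in constraints)
            for alt in alternatives}

# "The woman sees the man with the binoculars." -- two attachments compete.
alts = ["VP-attachment", "NP-attachment"]
constraints = [
    (0.4, {"VP-attachment": 0.7, "NP-attachment": 0.3}),  # structural frequency
    (0.3, {"VP-attachment": 0.8, "NP-attachment": 0.2}),  # verb bias
    (0.3, {"VP-attachment": 0.2, "NP-attachment": 0.8}),  # discourse context
]
scores = competition_scores(alts, constraints)
winner = max(scores, key=scores.get)
```

In a fuller model the competition would unfold over time as each constraint becomes available; this static sum only illustrates how syntactic and nonsyntactic evidence combine on an equal footing.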

Although we have discussed these different models in

the light of sentential-syntactic ambiguity resolution,

their architectural and processing assumptions hold for

the full domain of sentence and discourse processing.

Clearly, the representational and processing assumptions

underlying autonomous and (fully) interactive

models have very different implications for an account

of language comprehension. We will return to these issues

after giving an overview of results from the brain-imaging

literature on syntactic and semantic processes

during sentence processing.

Before discussing the imaging data, a few brief comments

on the sensitivity and relevance for language research

of different brain-imaging methods are called for.

The common goal in cognitive neuroscience is to develop

a model in which the cognitive and neural approaches

are combined, providing a detailed answer to

the very general question of where and when in the

brain what happens. Methods like event-related brain

potentials (ERPs), positron emission tomography (PET),

and functional magnetic resonance imaging (fMRI) are

not equally revealing or relevant in this respect. In terms

of the temporal dynamics of comprehension, only ERPs

(and their magnetic counterparts from magnetoencephalography,

MEG) can provide the required millisecond

resolution (although recent developments in noninvasive

optical imaging indicate that near-infrared measurements

might approach millisecond resolution; cf.

Gratton, Fabiani, and Corballis, 1997). In contrast, the

main power of PET and fMRI lies in the localization of

brain areas involved in language processing (although

recent advances in neuronal source-localization procedures

with ERP measurements are making this technique

more relevant for localizational issues; cf. Kutas,

Federmeier, and Sereno, 1999). Recent analytic developments

in PET and fMRI research further indicate that

information on effective connectivity in the brain (i.e.,

the influence that one neuronal system exerts over another)

might begin to constrain our models of the language

system (cf. Büchel, Frith, and Friston, 1999;

Friston, Frith, and Frackowiak, 1993). However, localization

as such does not reveal the nature of the activated

representations: The hemodynamic response is a

quantitative measure that does not of itself deliver information

on the nature of the representations involved.

The measure is maximally informative when separate

brain loci can be linked, via appropriately constraining

experimental conditions, with separate representations

and processes. A similar situation holds for the ERP

method: The polarity and scalp topography of ERP

waveforms can, in principle, yield qualitatively different

effects for qualitatively different representations and/or

processes, but only appropriately operationalized manipulations

will make such effects interpretable (cf.

Brown and Hagoort, 1999; Osterhout and Holcomb,

1995). In short, whatever the brain-imaging technique

being used, the value of the data critically depends on its

relation to an articulated cognitive-functional model.

Cognitive neuroscience investigations

of postlexical integration

EVENT-RELATED BRAIN POTENTIAL MANIFESTATIONS

OF SENTENCE PROCESSING Space limitations rule out

an introduction to the neurophysiology and signal-analysis

techniques of event-related brain potentials (see

Picton, Lins, and Scherg, 1995, for a recent review). It is,

however, important to bear in mind that, owing to the

signal-to-noise ratio of the EEG signal, one cannot obtain

a reliable ERP waveform in a standard language experiment

without averaging over at least 20–30 different

tokens within an experimental condition. Thus, when

we speak of the ERP elicited by a particular word in a

particular condition, we mean the electrophysiological

activity averaged over different tokens of the same type.
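The trial-averaging logic behind that convention can be sketched in a few lines: averaging over N time-locked epochs shrinks the noise by roughly the square root of N, letting a component emerge that is invisible in any single trial. The sampling rate, noise level, and component amplitude below are invented purely for illustration.

```python
import numpy as np

def average_erp(eeg_epochs):
    """Average single-trial EEG epochs into an ERP waveform.

    eeg_epochs: array of shape (n_trials, n_samples) -- one row per
    token of the same experimental type, time-locked to word onset.
    """
    return eeg_epochs.mean(axis=0)

# Simulated illustration: a small "component" buried in noise.
rng = np.random.default_rng(0)
n_trials, n_samples = 30, 600          # 30 tokens, 600 ms at 1 kHz (assumed)
signal = np.zeros(n_samples)
signal[300:500] = -2.0                 # a negativity 300-500 ms post-onset
epochs = signal + rng.normal(0.0, 10.0, size=(n_trials, n_samples))

erp = average_erp(epochs)
# Averaging over 30 trials reduces the noise by about sqrt(30),
# so the 2-microvolt component emerges from 10-microvolt trial noise.
```

This is why the chapter's estimate of at least 20-30 tokens per condition matters: with too few trials, residual noise remains on the order of the components of interest.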

Within the realm of sentence processing, four different

ERP profiles have been related to aspects of syntactic

and semantic processing: (1) A transient negativity

over left-anterior electrode sites (labeled the left-anterior

negativity, LAN) that develops in the period roughly

200–500 ms after word onset. The LAN has been related

not only to the activation and processing of syntactic

word-category information, but also to more general

processes of working memory. (2) A transient bilateral

negativity, labeled the N400, that develops between 200

and 600 ms after word onset; the N400 has been related

to semantic processing. (3) A transient bilateral positivity

that develops in the period between 500 and 700 ms.

Variously labeled the syntactic positive shift (SPS) or the

P600, this positivity has been related to syntactic processing.

(4) A slow positive shift over the front of the

head, accumulating across the span of a sentence, that

has been related to the construction of a representation

of the overall meaning of a sentence. Let us discuss each

of these ERP effects in turn.
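Before turning to each effect, the four profiles can be collected into a small lookup table; this is a summary sketch of the descriptions above, with approximate time windows, not an independent taxonomy.

```python
# Approximate summary of the four sentence-level ERP profiles described
# above (time windows in ms relative to word onset; the slow shift has
# no fixed window, accumulating across the span of the sentence).
erp_components = {
    "LAN": {"polarity": "negative", "window_ms": (200, 500),
            "topography": "left anterior",
            "linked_to": "syntactic word-category / working memory"},
    "N400": {"polarity": "negative", "window_ms": (200, 600),
             "topography": "bilateral",
             "linked_to": "semantic processing"},
    "P600/SPS": {"polarity": "positive", "window_ms": (500, 700),
                 "topography": "bilateral",
                 "linked_to": "syntactic processing"},
    "slow positive shift": {"polarity": "positive", "window_ms": None,
                            "topography": "frontal",
                            "linked_to": "message-level meaning construction"},
}
```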

Left-anterior negativities The LAN is a relative newcomer

to the set of language-related ERP effects. Both

its exact electrophysiological signature and its functional

nature are still under scrutiny. Some researchers

have suggested that the LAN is related to early parsing

processes, reflecting the assignment of an initial phrase

structure based on syntactic word-category information

(Friederici, 1995; Friederici, Hahne, and Mecklinger,

1996). Other researchers propose that a LAN is a reflection

of working-memory processes during language

comprehension, related to the activity of holding a

word in memory until it can be assigned its grammatical

role in a sentence (Kluender and Kutas, 1993a,b;

...
