DECISION MAKING
REVIEW
Game theory and neural basis of social decision making
Daeyeol Lee
Decision making in a social group has two distinguishing features. First, humans and other animals routinely alter their behavior in
response to changes in their physical and social environment. As a result, the outcomes of decisions that depend on the behavior of
multiple decision makers are difficult to predict and require highly adaptive decision-making strategies. Second, decision makers
may have preferences regarding consequences to other individuals and therefore choose their actions to improve or reduce the
well-being of others. Many neurobiological studies have exploited game theory to probe the neural basis of decision making and
suggested that these features of social decision making might be reflected in the functions of brain areas involved in reward
evaluation and reinforcement learning. Molecular genetic studies have also begun to identify genetic mechanisms for personal traits
related to reinforcement learning and complex social decision making, further illuminating the biological basis of social behavior.
Decision making is challenging because the outcomes from a particular
action are seldom fully predictable. Therefore, decision makers must
always take uncertainty into consideration when they make choices 1. In
addition, such action-outcome relationships can change frequently,
requiring adaptive decision-making strategies that depend on the
observed outcomes of previous choices 2 . Accordingly, neurobiological
studies on decision making have focused on the brain mechanisms
involved in mediating the effect of uncertainty and improving decision-
making strategies by trial and error. Signals related to reward magni-
tude and probability are widespread in the brain and are often
modulated by decision making 3–9. Some of these areas might also be involved in updating the preferences and strategies of decision makers 10–14.
Compared to solitary animals, animals living in a large social group
face many distinctive challenges and opportunities, as reflected in
various cognitive abilities in the social domain, such as communication
and other prosocial behaviors 15 . This review focuses on the neural basis
of socially interactive decision making in humans and other primates.
The basic building blocks of decision making that underlie the processes of learning and valuation are also important for decision
making in social contexts. However, interactions among multiple
decision makers in a social group show some additional features.
First, behaviors of humans and animals can change frequently, as
they seek to maximize their self-interest according to the information
available from their environment. This makes it difficult to predict the
outcomes of a decision maker’s actions and to choose optimal actions
accordingly. As a result, more sophisticated learning algorithms might
be required for social decision making 16,17 . Second, social interactions
open the possibilities of competition and cooperation. Humans and
animals indeed act not only to maximize their own self-interest, but
sometimes also to increase or decrease the well-being of others around
them. These aspects of social decision making are reflected in the
activity of brain areas involved in learning and valuation.
Game theory and social preference
A good starting point for studies of social decision making is game
theory 18 . In its original formulation, game theory seeks to find the
strategies that a group of decision makers will converge on, as they try
to maximize their own payoffs. Nash equilibrium refers to a set of such
strategies from which no individual players can increase their payoffs by
changing their strategies unilaterally 19 . In a two-player competitive
game known as matching pennies (Fig. 1a), for example, each player
can choose between two alternative options, such as the head and tail of
a coin. One of the players (the matcher) wins if both players choose the same option and loses otherwise, whereas the other player (the nonmatcher) wins when the choices differ. For the matching-pennies game with a symme-
trical payoff matrix (as in Fig. 1a), the Nash equilibrium is to choose
both options with the same probabilities. Any other strategy can be
exploited by the opponent and therefore reduces the expected payoff.
In both humans and nonhuman primates, however, the predictions
based on the Nash equilibrium are often systematically violated for
such competitive games 17,20,21 . As discussed below, this might be due to
various learning algorithms used by the decision makers to improve the
outcomes of their choices iteratively.
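To make the equilibrium argument concrete, the following sketch (illustrative Python code, not part of the original article) computes the matcher's expected payoff in the matching-pennies game of Fig. 1a when the nonmatcher best-responds to a known choice probability; only the unbiased strategy avoids being exploited.

def matcher_expected_payoff(p, q):
    """Expected payoff to the matcher when the matcher chooses 'head' with
    probability p and the nonmatcher chooses 'head' with probability q."""
    p_match = p * q + (1.0 - p) * (1.0 - q)        # probability that the choices match
    return 1.0 * p_match - 1.0 * (1.0 - p_match)   # +1 if matched, -1 otherwise

def exploited_payoff(p):
    """Matcher's payoff once the nonmatcher best-responds to a known bias p."""
    # The nonmatcher picks whichever pure strategy (q = 0 or q = 1) is worst for the matcher.
    return min(matcher_expected_payoff(p, 0.0), matcher_expected_payoff(p, 1.0))

for p in (0.5, 0.6, 0.9):
    print(p, exploited_payoff(p))
# 0.5 -> 0.0 (the Nash equilibrium value); 0.6 -> -0.2; 0.9 -> -0.8.
# Any deviation from equal probabilities lowers the matcher's expected payoff.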
How game theory can be used to investigate cooperation and
altruism is illustrated by a well known game, the prisoner’s dilemma.
The two players in this game can each choose between cooperation and
defection. Each player receives a higher payoff by defecting, whether the
other player chooses to cooperate or defect, but the payoff to each
player is higher for mutual cooperation than for mutual defection,
hence creating a dilemma (Fig. 1b). If this game is played only once and
the players care only about their own payoffs, both players should
defect, which corresponds to the Nash equilibrium for this game. In
reality and in laboratory experiments, however, both these assumptions
are frequently violated.
Yale University School of Medicine, Department of Neurobiology, 333 Cedar Street,
SHM B404, New Haven, Connecticut 06510, USA. Correspondence should be
addressed to D.L. (daeyeol.lee@yale.edu).
Published online 26 March 2008; doi:10.1038/nn2065
An ultimatum game is similar to the dictator
game in that one of the players (proposer)
offers a proportion of the money to the
recipient, who now has the opportunity to
reject the offer. If the offer is rejected, neither
player receives any money. The average offer
in ultimatum games is about 40%, signifi-
cantly higher than in the dictator game,
implying that proposers are motivated to
avoid the potential rejection 17 . Indeed, in
the ultimatum game, recipients reject offers
below 20% about half the time. Another
important element in social interaction is
captured by a trust game, in which one of
the players (investor) invests a proportion of
his or her money. This money then is multiplied, often tripled, and
transferred to the other player (trustee). The trustee then decides how
much of this transferred money is returned to the investor. The amount
of money invested by the investor measures the trust of the investor in
the trustee, and the amount of repayment reflects the trustee’s
trustworthiness. Thus, trust games quantify the moral obligations
that a trustee might feel toward the investor. Empirically, investors
tend to invest roughly half their money, and trustees tend to repay an
amount comparable to the original investment 17 .
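The monetary logic of the trust game can be illustrated with a small sketch (hypothetical Python code and endowment, not taken from the article), assuming the tripling rule mentioned above.

ENDOWMENT = 10.0    # investor's initial money (hypothetical amount)
MULTIPLIER = 3.0    # invested money is tripled before reaching the trustee

def trust_game_payoffs(invested, returned_fraction):
    """Final payoffs (investor, trustee), given the amount invested and the
    fraction of the multiplied transfer that the trustee repays."""
    transferred = MULTIPLIER * invested
    repayment = returned_fraction * transferred
    return ENDOWMENT - invested + repayment, transferred - repayment

# Empirical pattern described in the text: invest about half and repay roughly
# the original investment (here, one third of the tripled transfer).
print(trust_game_payoffs(5.0, 1.0 / 3.0))   # (10.0, 10.0)
# A purely self-interested trustee would repay nothing, so a purely
# self-interested investor should invest nothing:
print(trust_game_payoffs(0.0, 0.0))         # (10.0, 0.0)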
Studies on experimental games in nonhuman primates can provide
important insights into the evolutionary origins of social preference
shown by human decision makers. For example, when chimpanzees are
tested in a reduced form of the ultimatum game in which proposers
choose between two different preset offers, they tend to choose the
options that maximize their self-interest, both as proposers and
recipients 34 . Therefore, even though chimpanzees and other nonhu-
man primates show altruistic behaviors, fairness is much more impor-
tant in social decision making for humans.
Figure 1 Payoff matrix for the games of matching pennies and prisoner's dilemma. (a) For the matching-pennies game, a pair of numbers within each pair of parentheses indicates the payoffs to the matcher and nonmatcher, respectively. Blue or yellow rectangles indicate the outcomes favorable to the matcher or nonmatcher, respectively. (b,c) The prisoner's dilemma game. (b) A pair of numbers within the parentheses indicates the payoffs to players I and II, respectively. The yellow and green rectangles correspond to mutual cooperation and mutual defection, respectively, whereas the gray rectangles indicate unreciprocated cooperation. (c) Player I's utility function adjusted according to the model of inequality aversion. The values of α and β indicate the sensitivity to disadvantageous and advantageous inequality. For β > 0.4, mutual cooperation becomes a Nash equilibrium.

(a) Matching pennies (payoffs to matcher, nonmatcher):
                        Nonmatcher: Head    Nonmatcher: Tail
  Matcher: Head         (1, –1)             (–1, 1)
  Matcher: Tail         (–1, 1)             (1, –1)

(b) Prisoner's dilemma (payoffs to Player I, Player II):
                        Player II: Cooperate    Player II: Defect
  Player I: Cooperate   (3, 3)                  (0, 5)
  Player I: Defect      (5, 0)                  (1, 1)

(c) Player I's utility under inequality aversion:
                        Player II: Cooperate    Player II: Defect
  Player I: Cooperate   3                       –5α
  Player I: Defect      5 – 5β                  1
Games can be played repeatedly, often among the same set of players.
This makes it possible for some players to train others to deviate from
the equilibrium predictions for one-shot games. In addition, humans
often cooperate in prisoner’s dilemma games, whether the game is one-
shot or repeated 22 . Therefore, for humans, decision making in social
contexts may not be entirely driven by self-interest, but at least partially
by preferences regarding the well-being of other individuals. Indeed,
cooperation and altruistic behaviors abound in human societies 23 and
may also occur in nonhuman primates 24–26 . In theory, multiple
mechanisms—including kin selection, direct and indirect reciprocity
and group selection—can increase the fitness of cooperators and thus
sustain cooperation 23,27 . Punishment of defectors or free-riders at a
cost to the rule enforcer, often referred to as altruistic punishment, also
effectively deters defection 23,28–30 .
In economics, the subjective desirability of a particular choice is
quantified by its utility function. Although the classical notion of utility
only concerns the state of the decision maker’s individual wealth, the
utility function can be expanded, when people take into consideration
the well-being of other individuals, to incorporate social preference.
For example, the utility function can be modified by the decision
maker's aversion to inequality 31. For two-player games, the first player's utility, U_1(x), for the payoffs to the two players x = [x_1, x_2], can be defined as follows:

U_1(\mathbf{x}) = x_1 - \alpha I_D - \beta I_A

where I_D = max{x_2 - x_1, 0} and I_A = max{x_1 - x_2, 0} refer to inequalities that are disadvantageous and advantageous to the first player, respectively. The coefficients α and β indicate sensitivities to disadvantageous and advantageous inequalities, respectively, and it is assumed that β ≤ α and 0 ≤ β < 1. Therefore, for a given payoff to the first decision maker, x_1, U_1(x) is maximal when x_1 = x_2, giving rise to the preference for equality. When the monetary payoff in the prisoner's dilemma is replaced by this utility function with the value of β sufficiently large, mutual cooperation and mutual defection both become Nash equilibria 32 (Fig. 1c). When this occurs, a player cooperates as long as he or she believes that the other player will cooperate as well.
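The consequence of this transformation for the prisoner's dilemma can be verified with a short sketch (illustrative Python code, not from the article; both players are assumed to share the same α and β): with the raw payoffs of Fig. 1b only mutual defection is a Nash equilibrium, whereas with β > 0.4 mutual cooperation becomes an equilibrium as well, as in Fig. 1c.

# Raw payoffs from Fig. 1b, indexed by (own action, other's action); C = cooperate, D = defect.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def utility(own, other, alpha, beta):
    """Inequality-averse utility: U = x1 - alpha*max(x2 - x1, 0) - beta*max(x1 - x2, 0)."""
    x1, x2 = PAYOFF[(own, other)]
    return x1 - alpha * max(x2 - x1, 0) - beta * max(x1 - x2, 0)

def is_nash(a1, a2, alpha, beta):
    """True if neither player can increase his or her utility by deviating alone."""
    best1 = max(utility(a, a2, alpha, beta) for a in 'CD')
    best2 = max(utility(a, a1, alpha, beta) for a in 'CD')
    return (utility(a1, a2, alpha, beta) >= best1 and
            utility(a2, a1, alpha, beta) >= best2)

for alpha, beta in ((0.0, 0.0), (1.0, 0.5)):
    equilibria = [(a1, a2) for a1 in 'CD' for a2 in 'CD' if is_nash(a1, a2, alpha, beta)]
    print(alpha, beta, equilibria)
# alpha = beta = 0 (pure self-interest): only ('D', 'D') is an equilibrium.
# alpha = 1, beta = 0.5 (beta > 0.4): ('C', 'C') and ('D', 'D') are both equilibria.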
Learning in social decision making
When a group plays the same game repeatedly, some players may try to
train other players. For example, recipients in an ultimatum game may
reject some offers, not as a result of aversion to inequality, but to
increase their long-term payoff by penalizing greedy proposers. To
better isolate the effect of social preference, therefore, many experi-
menters do not allow their subjects to interact with the same partners
repeatedly. In real situations, however, learning is important, as people
and animals do tend to interact with the same individuals repeatedly.
Reinforcement learning theory 2 formalizes the problem faced by a
decision maker trying to discover optimal strategies in an unfamiliar
environment (Fig. 2). This theory has been successfully applied to an
environment that includes multiple decision makers 17,20,21,35,36. The
sum of future rewards expected from a particular action in a particular
state of the environment is referred to as the value function. Future
rewards are often exponentially discounted, so that immediate rewards
contribute more to the value function. Similar to utility functions in
economics, value functions determine the actions chosen by decision
makers. In addition, the difference between the reward predicted from
the value function and the actual reward is termed reward prediction
error. In simple or direct reinforcement learning algorithms, value
functions are updated only for chosen actions and only when there is a
reward prediction error 2 .
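As an illustration, the update described here can be written in a few lines (a minimal Python sketch with assumed learning-rate and temperature parameters, treating each trial's payoff as the reward; not from the article): only the value of the chosen action changes, in proportion to the reward prediction error.

import math
import random

LEARNING_RATE = 0.2          # assumed value
INVERSE_TEMPERATURE = 3.0    # assumed value for the softmax choice rule

def choose(values):
    """Softmax choice: actions with larger value functions are selected more often."""
    actions = list(values)
    weights = [math.exp(INVERSE_TEMPERATURE * values[a]) for a in actions]
    return random.choices(actions, weights=weights)[0]

def update(values, action, reward):
    """Simple (direct) reinforcement learning: update only the chosen action."""
    prediction_error = reward - values[action]
    values[action] += LEARNING_RATE * prediction_error

values = {'head': 0.0, 'tail': 0.0}
action = choose(values)
update(values, action, reward=1.0)   # e.g., the chosen action happened to be rewarded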
Although reward has a powerful effect on choice behavior, decision makers receive many other signals from their environment. For example, they may discover, after their choices, how much reward they could have received had they chosen a different action. When such hypothetical payoffs or fictive rewards differ from the rewards expected from the current value functions, the resultant error signals, called fictive reward prediction error 37 or regret 38, can be used to update the value functions of corresponding actions. Such errors can indeed influence the decision maker's subsequent behaviors during financial decision making 37. In model-based reinforcement learning algorithms, fictive reward signals can be generated from various types of simulations or inferences based on the decision maker's model or knowledge of the environment. These fictive reward signals might be crucial in social decision making when the simulated environment includes other decision makers (Fig. 2).
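In a game such as matching pennies, the opponent's observed choice reveals what the forgone action would have earned, so a fictive error can update the unchosen action as well; the following sketch (illustrative Python code with assumed learning rates, not from the article) extends the direct-learning update above in that way.

REAL_RATE = 0.2       # learning rate for the chosen action (assumed)
FICTIVE_RATE = 0.1    # learning rate for the unchosen action (assumed)

def matcher_payoff(my_choice, opponent_choice):
    """Matcher's payoff in matching pennies (Fig. 1a): +1 on a match, -1 otherwise."""
    return 1.0 if my_choice == opponent_choice else -1.0

def update_with_fictive_error(values, my_choice, opponent_choice):
    other = 'tail' if my_choice == 'head' else 'head'
    real_error = matcher_payoff(my_choice, opponent_choice) - values[my_choice]
    fictive_error = matcher_payoff(other, opponent_choice) - values[other]
    values[my_choice] += REAL_RATE * real_error       # standard update of the chosen action
    values[other] += FICTIVE_RATE * fictive_error     # update driven by the forgone payoff

values = {'head': 0.0, 'tail': 0.0}
update_with_fictive_error(values, 'head', 'tail')
print(values)   # {'head': -0.2, 'tail': 0.1}: the losing choice is devalued, the forgone one upgraded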
Evidence for altruistic social preference and aversion to inequality is
also seen in other experimental games, such as the dictator game, the
ultimatum game and the trust game 17,32 , and their possible neural
substrates have been examined 33 . In the dictator game, a dictator
receives a fixed amount of money and donates a part of it to the
recipient. This ends the game, so there is no opportunity for the
recipient to retaliate. Any amount of donation reduces the payoff to the
dictator, so the amount provides a measure of altruism. During dictator
games, people tend to donate on average about 25% of their money 17 .
Figure 2 A model-based reinforcement learning model applied to social decision making. The decision maker receives reward according to his or her own action and those of other decision makers (DM) in the environment and updates the value functions according to the reward prediction error. In addition, the decision maker updates his or her model (virtual environment), including the predicted actions of other decision makers. The fictive reward prediction errors resulting from such model simulations also influence the value functions. The blue background indicates computations internalized in the decision maker's brain.
In game theory, estimating the payoffs from alternative strategies
based on the expected actions of other players is referred to as belief
learning 16,17 . For example, imagine you observe that a particular
decision maker tends to apply the strategy of tit-for-tat during a
repeated prisoner’s dilemma game. By simulating hypothetical inter-
actions, you can update the value functions for cooperation and
defection and might discover that cooperation with this player would
produce a higher average payoff than defection. Belief learning and
other model-based reinforcement learning algorithms can also update
value functions for multiple actions simultaneously. So far, studies on
competitive games in humans and other primates have failed to
provide strong evidence for such model-based reinforcement learning
or belief learning 20,36,39,40 . In contrast, both theoretical and empirical
studies show that the reputation and moral characters of individual
players influence the likelihood and degree of cooperation 41–43 .For
example, a player who has donated frequently in the past is more likely
to receive donations when such information is publicly available 42 .
Similarly, people tend to invest more money as investors in trust games
when they face individuals with positive moral qualities 43 .Therefore,
belief learning models might account for how images of individual
players are propagated.
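The tit-for-tat example above can be made concrete with a short simulation (illustrative Python code, not from the article): once the opponent is modeled as a tit-for-tat player, the long-run values of always cooperating and always defecting can be estimated by simulation alone, before any further interaction.

# Payoffs to the simulating player from Fig. 1b (own payoff only).
PD_PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def simulated_average_payoff(my_strategy, rounds=100):
    """Average payoff of a fixed strategy ('C' or 'D') against a modeled
    tit-for-tat opponent, who cooperates first and then copies our last move."""
    opponent_move = 'C'
    total = 0.0
    for _ in range(rounds):
        total += PD_PAYOFF[(my_strategy, opponent_move)]
        opponent_move = my_strategy   # tit-for-tat: copy the simulating player's last move
    return total / rounds

print(simulated_average_payoff('C'))   # 3.00: sustained mutual cooperation
print(simulated_average_payoff('D'))   # 1.04: one exploitation, then mutual defection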
Neural correlates of social decision making
Socially interactive decision making tends to be dynamic, and the
process of discovering an optimal strategy can be further complicated
because decision makers often act according to their preferences
concerning the consequences to other individuals, often referred to
as ‘other-regarding preferences’. Nevertheless, the basic neural processes
involved in outcome evaluation and reinforcement learning might be
generally applicable, whether or not the outcome of choice is deter-
mined socially. For example, neurons in the dorsolateral prefrontal
cortex of rhesus monkeys often encode signals related to the animal’s
previous choice and its outcome conjunctively, not only during a
memory-saccade task 68 but also in a computer-simulated matching-
pennies task 56 . Neurons in the posterior parietal cortex also modulate
their activity according to expected reward or its utility during both a
foraging task 52 and a computer-simulated competitive game 53 .Simi-
larly, in imaging studies, many brain areas involved in reward evalua-
tion and reinforcement learning, such as the striatum, insula and
orbitofrontal cortex, are also recruited during social decision making
(Fig. 3). However, as described below, activity in these brain areas
during social decision making is also influenced by factors that are
particular to social interactions.
One of the areas that is critical in socially interactive decision making
is the striatum. During decision making without social interaction,
activity in the striatum is influenced by both real and fictive reward
prediction errors 11,12,37 . Reward prediction errors during social deci-
sion making also lead to activity changes in the striatum. For example,
during the prisoner’s dilemma game, cooperation results in a positive
BOLD response in the ventral striatum when cooperation is recipro-
cated by the partner, but produces a negative BOLD response in the
same areas when the cooperation is not reciprocated 69,70. In addition, the caudate nucleus of the trustee in a repeated trust game shows activity correlated with the reputation of the investor 71. When investors in trust games receive detailed descriptions of the trustees' positive moral characters, the investors tend to invest money more frequently. Moreover, activity in the caudate nucleus of the investor related to the decision of the trustee is attenuated or abolished when the investor relies on information about the trustee's moral character 43.
Neural basis of reinforcement learning
During the last decade, reinforcement learning theory has become a
dominant paradigm for studying the neural basis of decision making
(see other articles in this issue and ref. 44). In nonhuman primates,
midbrain dopamine neurons encode reward prediction errors 10,45 .
Dopamine neurons also decrease their activity when the expected
reward is delayed 46
or omitted 10,47. In addition, neurons in many areas of the primate brain, including the amygdala 48, the basal ganglia 49–51, the posterior parietal cortex 3,52,53, the lateral prefrontal cortex 54–56, the medial frontal cortex 57–59 and the orbitofrontal cortex 60–62, modulate their activity according to rewards and value functions. Nevertheless, how these signals related to value functions in many areas are updated by real and fictive reward error signals and influence action selection is still largely unknown 63.
In human neuroimaging, signals related to expected reward are found in several brain areas, such as the amygdala, the striatum, the insula and the orbitofrontal cortex 64–66. The noninvasive nature of neuroimaging makes it possible to investigate the neural mechanisms of complex financial and social decision making in humans. On the other hand, the signals measured in neuroimaging studies, such as blood oxygen level–dependent (BOLD) signals, reflect the activity of individual neurons only indirectly. In particular, BOLD signals in functional magnetic resonance imaging (fMRI) experiments may reflect inputs to a given brain area more closely than outputs from it 67. Comparisons of results obtained from single-neuron recording and fMRI studies must take into consideration such methodological differences.
This result raises the possibility that the
striatal response to the reward received by
others might change depending on whether
a particular social interaction is perceived as
competition or cooperation. Indeed, during a
board game in which the subjects are required
to interact competitively or cooperatively, sev-
eral brain areas are activated differentially
depending on the nature of the interaction 81 .
For example, compared to competition, coop-
eration results in stronger activation in the
anterior frontal cortex and medial orbitofron-
tal cortex. However, whether and how these
cortical areas influence the striatal activity
related to social preference is not known.
Social decision making frequently requires
theory of mind—the ability to predict the
actions of other players based on their knowledge and intentions 82,83 .
Many neuroimaging studies find that social interactions with human
players produce stronger activations than similar interactions with
computer players in several brain areas 84–86 , typically including the
anterior paracingulate cortex (Fig. 3c). Assuming that more sophisti-
cated inferences are used to deal with human players than with computer
players, such findings might provide some clues concerning the cortical
areas specialized for theory of mind. Accordingly, the anterior para-
cingulate cortex might be important in representing mental states of
others 82,84–88 . In the trust game, the cingulate cortex seems to represent
information about the agent responsible for a particular outcome 89 .The
cortical network involved in theory of mind and perception of agency,
however, is still not well characterized and is likely to involve additional
areas. For example, the posterior superior temporal cortex is implicated
in perception of agency 83,86,88 , and its activity correlates with the subject’s
tendency toward altruistic behavior 90 .
Figure 3 Brain areas involved in social decision making. (a,b) Coronal sections of the human brain showing the caudate nucleus (CD), the insula (Ins) and the orbitofrontal cortex (OFC). (c) Sagittal section showing the anterior paracingulate cortex (APC). Arrows indicate approximate locations of the sections shown in a and b.
As described above, theoretical and behavioral studies show that
altruistic punishment of unfair behaviors promotes cooperation.
Neuroimaging studies provide important insight into the neural
mechanisms for producing such costly punishing acts. For example,
during the ultimatum game, unfair offers produce stronger activation
in the recipient’s anterior insula when they are rejected than when they
are accepted 72 . Because the insula is involved in evaluation of various
negative emotional states, such as disgust 73 , its activation during the
ultimatum game might reflect negative emotions associated with unfair
offers. In addition, the investors who have the option of punishing
unfair trustees during the trust game at a cost to themselves often apply
such punishment 74 . This punishment may have some hedonic value to
the investors, as activity in the caudate nucleus of the investor is
correlated with the magnitude of punishment and increases only when
punishment is effective. Comparing the proposer’s brain activity during
the ultimatum game and the dictator game shows that the dorsolateral
prefrontal cortex, lateral orbitofrontal cortex and caudate nucleus are
important in evaluating the threat of potential punishment 75 .
Inequality aversion can give rise not only to altruistic punishment of
norm violators but also to charitable donation. The mesolimbic
dopamine system, including the ventral tegmental area and the
striatum, is activated by both personal monetary reward and the
decision to donate money to charity 76 . In contrast, activity in the
lateral orbitofrontal cortex increases when decision makers oppose a
charitable organization by refusing to donate. Activity in the caudate
nucleus and ventral striatum increases with the amount of money
donated to a charity, even when the donation is mandatory 77 , but the
activity in both of these areas is higher when the donation is voluntary.
Although fairness norms strongly influence social decision making,
what is considered fair is likely to depend on various contextual factors,
such as the sense of entitlement 78 and the need for competitive
interactions with other players 79 . Similarly, when two participants
play the same game and receive a monetary reward for correct answers,
activity in the ventral striatum increases with the amount of money
paid to the subject but decreases with the amount of money earned by
the partner 80 . In other words, when the subjects are evaluated and
rewarded by the same criterion, activity in the ventral striatum is more
closely related to the subject’s relative payment compared to the
partner’s payment than to the absolute payment of the subject.
Genetic and hormonal factors in social decision making
The fitness value of many social behaviors, such as cooperation with
genetically unrelated individuals, often depends on various environ-
mental conditions, including the prevalence of individuals with the
same behavioral traits. Thus, individual traits related to social decision
making could remain heterogeneous in the population because the
selective forces favoring different traits could be balanced 91 . Indeed,
studies on experimental games commonly show substantial individual
variability in the behavior of decision makers, and neuroimaging
studies on social behavior find that activity in several brain areas,
such as the striatum and insula, correlates with the decision maker’s
tendency to show altruistic behaviors 69,72,74,75 . Some of this variability
might be due to genetic factors. For example, the minimum acceptable
offer during an ultimatum game is more similar between monozygotic
twins than between dizygotic twins 92 .
The genetic mechanisms regulating dopaminergic and serotonergic
synaptic transmission might underlie individual differences in beha-
viors and neural circuits implicated in reinforcement learning and
therefore contribute to individual variability in social decision making.
Among the genes related to dopamine functions, the dopamine
receptor D2 (DRD2) gene has received much attention. For example,
DRD2 polymorphisms, such as Taq1A and C957T, influence how
efficiently decision makers can modify their choice behavior according to the negative consequences of their previous actions 93,94. The
Taq1A polymorphism also influences the magnitude of fMRI signals
related to negative feedback 93 . In contrast, polymorphism in the dopa-
mine- and cyclic AMP–regulated phosphoprotein of molecular weight
32 kDa (DARPP-32) influences the rate of learning based on positive
outcomes, and a valine/methionine polymorphism in catechol-O-
methyltransferase (COMT) might influence the ability to adjust choices
rapidly on a trial-by-trial basis by modulating dopamine in the prefrontal
cortex 56,94 . Variations in the proteins involved in serotonin metabolism,
such as the serotonin transporter–linked polymorphism (5-HTTLPR),
might also influence social decision making 95,96 . For example, rhesus
monkeys carrying only the short variant of 5-HTTLPR are less able to switch responses in object-discrimination reversal learning and show more aggression than monkeys carrying the long variant 97. Little is known, however,
about the neurophysiological changes associated with genetic variability
that might underlie behavioral changes in social decision making. In
addition, any effects of genetic variability on such complex behaviors as
social decision making are likely to involve interactions among many
genes and between genes and the environment 96,98 .
Hormones are also known to influence social behavior. For example,
high testosterone increases the likelihood that a recipient will reject
relatively low offers during the ultimatum game 99 , and oxytocin
increases the amount of money transferred by the investor during
the trust game 100 .
12. McClure, S.M., Berns, G.S. & Montague, P.R. Temporal prediction errors in a passive
learning task activate human striatum. Neuron 38, 339–346 (2003).
13. O’Doherty, J. et al. Dissociable roles of ventral and dorsal striatum in instrumental
conditioning. Science 304, 452–454 (2004).
14. Tricomi, E.M., Delgado, M.R. & Fiez, J.A. Modulation of caudate activity by action
contingency. Neuron 41, 281–292 (2004).
15. Adolphs, R. Social cognition and the human brain. Trends Cogn. Sci. 3, 469–479 (1999).
16. Fudenberg, D. & Levine, D.K. The Theory of Learning in Games (MIT Press, Cambridge,
Massachusetts, USA, 1998).
17. Camerer, C.F. Behavioral Game Theory: Experiments in Strategic Interaction (Princeton
Univ. Press, Princeton, New Jersey, USA, 2003).
18. von Neumann, J. & Morgenstern, O. Theory of Games and Economic Behavior
(Princeton Univ. Press, Princeton, New Jersey, USA, 1944).
19. Nash, J.F. Equilibrium points in n-person games. Proc. Natl. Acad. Sci. USA 36,
48–49 (1950).
20. Erev, I. & Roth, A.E. Predicting how people play games: reinforcement learning
in experimental games with unique, mixed strategy equilibria. Am. Econ. Rev. 88,
848–881 (1998).
21. Lee, D., Conroy, M.L., McGreevy, B.P. & Barraclough, D.J. Reinforcement learning and
decision making in monkeys during a competitive game. Brain Res. Cogn. Brain Res.
22, 45–58 (2004).
22. Sally, D. Conversation and cooperation in social dilemmas. Ration. Soc. 7, 58–92 (1995).
23. Fehr, E. & Fischbacher, U. The nature of human altruism. Nature 425, 785–791
(2003).
24. Hauser, M.D., Chen, M.K., Chen, F. & Chuang, E. Give unto others: genetically
unrelated cotton-top tamarin monkeys preferentially give food to those who altruisti-
cally give food back. Proc. R. Soc. Lond. B 270, 2363–2370 (2003).
25. Warneken, F. & Tomasello, M. Altruistic helping in human infants and young chim-
panzees. Science 311, 1301–1303 (2006).
26. de Waal, F.B.M. Putting the altruism back into altruism: the evolution of empathy.
Annu. Rev. Psychol. 59, 279–300 (2008).
27. Nowak, M.A. Five rules for the evolution of cooperation. Science 314, 1560–1563
(2006).
28. Clutton-Brock, T.H. & Parker, G.A. Punishment in animal societies. Nature 373,
209–216 (1995).
29. Fehr, E. & Gächter, S. Altruistic punishment in humans. Nature 415, 137–140 (2002).
30. Boyd, R., Gintis, H., Bowles, S. & Richerson, P.J. The evolution of altruistic punish-
ment. Proc. Natl. Acad. Sci. USA 100, 3531–3535 (2003).
31. Fehr, E. & Schmidt, K.M. A theory of fairness, competition, and cooperation. Q. J. Econ.
114, 817–868 (1999).
32. Fehr, E. & Camerer, C.F. Social neuroeconomics: the neural circuitry of social
preferences. Trends Cogn. Sci. 11, 419–427 (2007).
33. Sanfey, A.G. Social decision-making: insights from game theory and neuroscience.
Science 318, 598–602 (2007).
34. Jensen, K., Call, J. & Tomasello, M. Chimpanzees are rational maximizers in an
ultimatum game. Science 318, 107–109 (2007).
35. Sandholm, T.W. & Crites, R.H. Multiagent reinforcement learning in the iterated
prisoner’s dilemma. Biosystems 37, 147–166 (1996).
36. Lee, D., McGreevy, B.P. & Barraclough, D.J. Learning and decision making in monkeys
during a rock-paper-scissors game. Brain Res. Cogn. Brain Res. 25, 416–430 (2005).
37. Lohrenz, T., McCabe, K., Camerer, C.F. & Montague, P.R. Neural signature of fictive
learning signals in a sequential investment task. Proc. Natl. Acad. Sci. USA 104,
9493–9498 (2007).
38. Coricelli, G., Dolan, R.J. & Sirigu, A. Brain, emotion and decision making: the
paradigmatic example of regret. Trends Cogn. Sci. 11, 258–265 (2007).
39. Mookherjee, D. & Sopher, B. Learning and decision costs in experimental constant sum
games. Games Econ. Behav. 19, 97–132 (1997).
40. Feltovich, N. Reinforcement learning vs. belief-based learning models in experimental
asymmetric-information games. Econometrica 68, 605–641 (2000).
41. Nowak, M.A. & Sigmund, K. Evolution of indirect reciprocity by image scoring. Nature
393, 573–577 (1998).
42. Wedekind, C. & Milinski, M. Cooperation through image scoring in humans. Science
288, 850–852 (2000).
43. Delgado, M.R., Frank, R.H. & Phelps, E.A. Perceptions of moral character modulate
the neural systems of reward during the trust game. Nat. Neurosci. 8, 1611–1618
(2005).
44. Kawato, M. & Samejima, K. Efficient reinforcement learning: computational theories,
neuroscience and robotics. Curr. Opin. Neurobiol. 17, 205–212 (2007).
45. Schultz, W. Behavioral theories and the neurophysiology of reward. Annu. Rev. Psychol.
57, 87–115 (2006).
46. Roesch, M.R., Calu, D.J. & Schoenbaum, G. Dopamine neurons encode the better
option in rats deciding between differently delayed or sized rewards. Nat. Neurosci. 10,
1615–1624 (2007).
47. Bayer, H.M. & Glimcher, P.W. Midbrain dopamine neurons encode a quantitative reward
prediction error signal. Neuron 47, 129–141 (2005).
48. Paton, J.J., Belova, M.A., Morrison, S.E. & Salzman, C.D. The primate amygdala
represents the positive and negative value of visual stimuli during learning. Nature
439, 865–870 (2006).
49. Kawagoe, R., Takikawa, Y. & Hikosaka, O. Expectation of reward modulates cognitive
signals in the basal ganglia. Nat. Neurosci. 1, 411–416 (1998).
Conclusion
Social decision making is one of the most complex animal behaviors. It
often requires animals to recognize the intentions of other animals
correctly and to adjust behavioral strategies rapidly. In addition, humans
can cooperate or compete with one another, and various contextual
factors influence the extent to which humans are willing to sacrifice their
personal gains to increase or decrease the well-being of others. The
neural basis of such complex social decision making can be investigated
quantitatively by applying game theory. These studies find that the key
brain areas involved in reinforcement learning, such as the striatum and
orbitofrontal cortex, also underlie choices made in social settings.
Nevertheless, our current knowledge of neural mechanisms for social
decision making is still limited. This situation will improve as we come
to understand the genetic and neurophysiological basis of information
processing in the brain’s reward system.
ACKNOWLEDGMENTS
I am grateful to Michael Frank for discussions. This research was supported by
the US National Institutes of Health (MH073246 and DA024855).
Reprints and permissions information is available online at http://npg.nature.com/
1. Kahneman, D. & Tversky, A. Prospect theory: an analysis of decision under risk.
Econometrica 47, 263–292 (1979).
2. Sutton, R.S. & Barto, A.G. Reinforcement Learning: An Introduction (MIT Press, Cambridge, Massachusetts, USA, 1998).
3. Platt, M.L. & Glimcher, P.W. Neural correlates of decision variables in parietal cortex.
Nature 400, 233–238 (1999).
4. Breiter, H.C., Aharon, I., Kahneman, D., Dale, A. & Shizgal, P. Functional imaging of
neural responses to expectancy and experience of monetary gains and losses. Neuron
30, 619–639 (2001).
5. Fiorillo, C.D., Tobler, P.N. & Schultz, W. Discrete coding of reward probability and
uncertainty by dopamine neurons. Science 299, 1898–1902 (2003).
6. Glimcher, P.W. & Rustichini, A. Neuroeconomics: the consilience of brain and decision.
Science 306, 447–452 (2004).
7. Knutson, B., Taylor, J., Kaufman, M., Peterson, R. & Glover, G. Distributed neural
representation of expected value. J. Neurosci. 25, 4806–4812 (2005).
8. Hsu, M., Bhatt, M., Adolphs, R., Tranel, D. & Camerer, C.F. Neural systems responding
to degrees of uncertainty in human decision-making. Science 310, 1680–1683 (2005).
9. Preuschoff, K., Bossaerts, P. & Quartz, S.R. Neural differentiation of expected reward
and risk in human subcortical structures. Neuron 51, 381–390 (2006).
10. Schultz, W., Dayan, P. & Montague, P.R. A neural substrate of prediction and reward.
Science 275, 1593–1599 (1997).
11. O’Doherty, J.P., Dayan, P., Friston, K., Critchley, H. & Dolan, R.J. Temporal difference
models and reward-related learning in the human brain. Neuron 38, 329–337 (2003).