DECISION MAKING
REVIEW
Game theory and neural basis of social decision making
Daeyeol Lee
Decision making in a social group has two distinguishing features. First, humans and other animals routinely alter their behavior in
response to changes in their physical and social environment. As a result, the outcomes of decisions that depend on the behavior of
multiple decision makers are difficult to predict and require highly adaptive decision-making strategies. Second, decision makers
may have preferences regarding consequences to other individuals and therefore choose their actions to improve or reduce the
well-being of others. Many neurobiological studies have exploited game theory to probe the neural basis of decision making and
suggested that these features of social decision making might be reflected in the functions of brain areas involved in reward
evaluation and reinforcement learning. Molecular genetic studies have also begun to identify genetic mechanisms for personal traits
related to reinforcement learning and complex social decision making, further illuminating the biological basis of social behavior.
Decision making is challenging because the outcomes from a particular
action are seldom fully predictable. Therefore, decision makers must
always take uncertainty into consideration when they make choices 1. In
addition, such action-outcome relationships can change frequently,
requiring adaptive decision-making strategies that depend on the
observed outcomes of previous choices 2 . Accordingly, neurobiological
studies on decision making have focused on the brain mechanisms
involved in mediating the effect of uncertainty and improving decision-
making strategies by trial and error. Signals related to reward magni-
tude and probability are widespread in the brain and are often
modulated by decision making 3–9. Some of these areas might also be involved in updating the preferences and strategies of decision makers 10–14.
Compared to solitary animals, animals living in a large social group
face many distinctive challenges and opportunities, as reflected in
various cognitive abilities in the social domain, such as communication
and other prosocial behaviors 15 . This review focuses on the neural basis
of socially interactive decision making in humans and other primates.
The basic building blocks of decision making that underlie the processes of learning and valuation are also important for decision
making in social contexts. However, interactions among multiple
decision makers in a social group show some additional features.
First, behaviors of humans and animals can change frequently, as
they seek to maximize their self-interest according to the information
available from their environment. This makes it difficult to predict the
outcomes of a decision maker’s actions and to choose optimal actions
accordingly. As a result, more sophisticated learning algorithms might
be required for social decision making 16,17 . Second, social interactions
open the possibilities of competition and cooperation. Humans and
animals indeed act not only to maximize their own self-interest, but
sometimes also to increase or decrease the well-being of others around
them. These aspects of social decision making are reflected in the
activity of brain areas involved in learning and valuation.
Game theory and social preference
A good starting point for studies of social decision making is game
theory 18 . In its original formulation, game theory seeks to find the
strategies that a group of decision makers will converge on, as they try
to maximize their own payoffs. Nash equilibrium refers to a set of such
strategies from which no individual players can increase their payoffs by
changing their strategies unilaterally 19 . In a two-player competitive
game known as matching pennies (Fig. 1a), for example, each player
can choose between two alternative options, such as the head and tail of
a coin. One of the players (the matcher) wins if both players choose the same option and loses otherwise, whereas the other player (the nonmatcher) wins when the choices differ. For the matching-pennies game with a symme-
trical payoff matrix (as in Fig. 1a), the Nash equilibrium is to choose
both options with the same probabilities. Any other strategy can be
exploited by the opponent and therefore reduces the expected payoff.
In both humans and nonhuman primates, however, the predictions
based on the Nash equilibrium are often systematically violated for
such competitive games 17,20,21 . As discussed below, this might be due to
various learning algorithms used by the decision makers to improve the
outcomes of their choices iteratively.
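To make the equilibrium argument concrete, the following sketch (illustrative Python code, not part of the original article) computes the matcher's expected payoff in the matching-pennies game of Fig. 1a when the nonmatcher best-responds to a known choice probability; only the unbiased strategy avoids being exploited.

def matcher_expected_payoff(p, q):
    """Expected payoff to the matcher when the matcher chooses 'head' with
    probability p and the nonmatcher chooses 'head' with probability q."""
    p_match = p * q + (1.0 - p) * (1.0 - q)        # probability that the choices match
    return 1.0 * p_match - 1.0 * (1.0 - p_match)   # +1 if matched, -1 otherwise

def exploited_payoff(p):
    """Matcher's payoff once the nonmatcher best-responds to a known bias p."""
    # The nonmatcher picks whichever pure strategy (q = 0 or q = 1) is worst for the matcher.
    return min(matcher_expected_payoff(p, 0.0), matcher_expected_payoff(p, 1.0))

for p in (0.5, 0.6, 0.9):
    print(p, exploited_payoff(p))
# 0.5 -> 0.0 (the Nash equilibrium value); 0.6 -> -0.2; 0.9 -> -0.8.
# Any deviation from equal probabilities lowers the matcher's expected payoff.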
How game theory can be used to investigate cooperation and
altruism is illustrated by a well known game, the prisoner’s dilemma.
The two players in this game can each choose between cooperation and
defection. Each player receives a higher payoff by defecting, whether the
other player chooses to cooperate or defect, but the payoff to each
player is higher for mutual cooperation than for mutual defection,
hence creating a dilemma (Fig. 1b). If this game is played only once and
the players care only about their own payoffs, both players should
defect, which corresponds to the Nash equilibrium for this game. In
reality and in laboratory experiments, however, both these assumptions
are frequently violated.
Yale University School of Medicine, Department of Neurobiology, 333 Cedar Street,
SHM B404, New Haven, Connecticut 06510, USA. Correspondence should be
addressed to D.L. (daeyeol.lee@yale.edu).
Published online 26 March 2008; doi:10.1038/nn2065
An ultimatum game is similar to the dictator
game in that one of the players (proposer)
offers a proportion of the money to the
recipient, who now has the opportunity to
reject the offer. If the offer is rejected, neither
player receives any money. The average offer
in ultimatum games is about 40%, signifi-
cantly higher than in the dictator game,
implying that proposers are motivated to
avoid the potential rejection 17 . Indeed, in
the ultimatum game, recipients reject offers
below 20% about half the time. Another
important element in social interaction is
captured by a trust game, in which one of
the players (investor) invests a proportion of
his or her money. This money then is multiplied, often tripled, and
transferred to the other player (trustee). The trustee then decides how
much of this transferred money is returned to the investor. The amount
of money invested by the investor measures the trust of the investor in
the trustee, and the amount of repayment reflects the trustee’s
trustworthiness. Thus, trust games quantify the moral obligations
that a trustee might feel toward the investor. Empirically, investors
tend to invest roughly half their money, and trustees tend to repay an
amount comparable to the original investment 17 .
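The monetary logic of the trust game can be illustrated with a small sketch (hypothetical Python code and endowment, not taken from the article), assuming the tripling rule mentioned above.

ENDOWMENT = 10.0    # investor's initial money (hypothetical amount)
MULTIPLIER = 3.0    # invested money is tripled before reaching the trustee

def trust_game_payoffs(invested, returned_fraction):
    """Final payoffs (investor, trustee), given the amount invested and the
    fraction of the multiplied transfer that the trustee repays."""
    transferred = MULTIPLIER * invested
    repayment = returned_fraction * transferred
    return ENDOWMENT - invested + repayment, transferred - repayment

# Empirical pattern described in the text: invest about half and repay roughly
# the original investment (here, one third of the tripled transfer).
print(trust_game_payoffs(5.0, 1.0 / 3.0))   # (10.0, 10.0)
# A purely self-interested trustee would repay nothing, so a purely
# self-interested investor should invest nothing:
print(trust_game_payoffs(0.0, 0.0))         # (10.0, 0.0)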
Studies on experimental games in nonhuman primates can provide
important insights into the evolutionary origins of social preference
shown by human decision makers. For example, when chimpanzees are
tested in a reduced form of the ultimatum game in which proposers
choose between two different preset offers, they tend to choose the
options that maximize their self-interest, both as proposers and
recipients 34 . Therefore, even though chimpanzees and other nonhu-
man primates show altruistic behaviors, fairness is much more impor-
tant in social decision making for humans.
Figure 1 Payoff matrix for the games of matching pennies and prisoner's dilemma. (a) For the matching-pennies game, a pair of numbers within each pair of parentheses indicates the payoffs to the matcher and nonmatcher, respectively. Blue or yellow rectangles indicate the outcomes favorable to the matcher or nonmatcher, respectively. (b,c) The prisoner's dilemma game. (b) A pair of numbers within the parentheses indicates the payoffs to players I and II, respectively. The yellow and green rectangles correspond to mutual cooperation and mutual defection, respectively, whereas the gray rectangles indicate unreciprocated cooperation. (c) Player I's utility function adjusted according to the model of inequality aversion. The values of α and β indicate the sensitivity to disadvantageous and advantageous inequality. For β > 0.4, mutual cooperation becomes a Nash equilibrium.

(a) Matching pennies (payoffs to matcher, nonmatcher):
                        Nonmatcher: Head    Nonmatcher: Tail
  Matcher: Head         (1, –1)             (–1, 1)
  Matcher: Tail         (–1, 1)             (1, –1)

(b) Prisoner's dilemma (payoffs to Player I, Player II):
                        Player II: Cooperate    Player II: Defect
  Player I: Cooperate   (3, 3)                  (0, 5)
  Player I: Defect      (5, 0)                  (1, 1)

(c) Player I's utility under inequality aversion:
                        Player II: Cooperate    Player II: Defect
  Player I: Cooperate   3                       –5α
  Player I: Defect      5 – 5β                  1
Games can be played repeatedly, often among the same set of players.
This makes it possible for some players to train others to deviate from
the equilibrium predictions for one-shot games. In addition, humans
often cooperate in prisoner’s dilemma games, whether the game is one-
shot or repeated 22 . Therefore, for humans, decision making in social
contexts may not be entirely driven by self-interest, but at least partially
by preferences regarding the well-being of other individuals. Indeed,
cooperation and altruistic behaviors abound in human societies 23 and
may also occur in nonhuman primates 24–26 . In theory, multiple
mechanisms—including kin selection, direct and indirect reciprocity
and group selection—can increase the fitness of cooperators and thus
sustain cooperation 23,27 . Punishment of defectors or free-riders at a
cost to the rule enforcer, often referred to as altruistic punishment, also
effectively deters defection 23,28–30 .
In economics, the subjective desirability of a particular choice is
quantified by its utility function. Although the classical notion of utility
only concerns the state of the decision maker’s individual wealth, the
utility function can be expanded, when people take into consideration
the well-being of other individuals, to incorporate social preference.
For example, the utility function can be modified by the decision
maker's aversion to inequality 31. For two-player games, the first player's utility, U_1(x), for the payoffs to the two players x = [x_1, x_2], can be defined as follows:

U_1(\mathbf{x}) = x_1 - \alpha I_D - \beta I_A

where I_D = max{x_2 - x_1, 0} and I_A = max{x_1 - x_2, 0} refer to inequalities that are disadvantageous and advantageous to the first player, respectively. The coefficients α and β indicate sensitivities to disadvantageous and advantageous inequalities, respectively, and it is assumed that β ≤ α and 0 ≤ β < 1. Therefore, for a given payoff to the first decision maker, x_1, U_1(x) is maximal when x_1 = x_2, giving rise to the preference for equality. When the monetary payoff in the prisoner's dilemma is replaced by this utility function with the value of β sufficiently large, mutual cooperation and mutual defection both become Nash equilibria 32 (Fig. 1c). When this occurs, a player cooperates as long as he or she believes that the other player will cooperate as well.
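The consequence of this transformation for the prisoner's dilemma can be verified with a short sketch (illustrative Python code, not from the article; both players are assumed to share the same α and β): with the raw payoffs of Fig. 1b only mutual defection is a Nash equilibrium, whereas with β > 0.4 mutual cooperation becomes an equilibrium as well, as in Fig. 1c.

# Raw payoffs from Fig. 1b, indexed by (own action, other's action); C = cooperate, D = defect.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def utility(own, other, alpha, beta):
    """Inequality-averse utility: U = x1 - alpha*max(x2 - x1, 0) - beta*max(x1 - x2, 0)."""
    x1, x2 = PAYOFF[(own, other)]
    return x1 - alpha * max(x2 - x1, 0) - beta * max(x1 - x2, 0)

def is_nash(a1, a2, alpha, beta):
    """True if neither player can increase his or her utility by deviating alone."""
    best1 = max(utility(a, a2, alpha, beta) for a in 'CD')
    best2 = max(utility(a, a1, alpha, beta) for a in 'CD')
    return (utility(a1, a2, alpha, beta) >= best1 and
            utility(a2, a1, alpha, beta) >= best2)

for alpha, beta in ((0.0, 0.0), (1.0, 0.5)):
    equilibria = [(a1, a2) for a1 in 'CD' for a2 in 'CD' if is_nash(a1, a2, alpha, beta)]
    print(alpha, beta, equilibria)
# alpha = beta = 0 (pure self-interest): only ('D', 'D') is an equilibrium.
# alpha = 1, beta = 0.5 (beta > 0.4): ('C', 'C') and ('D', 'D') are both equilibria.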
Learning in social decision making
When a group plays the same game repeatedly, some players may try to
train other players. For example, recipients in an ultimatum game may
reject some offers, not as a result of aversion to inequality, but to
increase their long-term payoff by penalizing greedy proposers. To
better isolate the effect of social preference, therefore, many experi-
menters do not allow their subjects to interact with the same partners
repeatedly. In real situations, however, learning is important, as people
and animals do tend to interact with the same individuals repeatedly.
Reinforcement learning theory 2 formalizes the problem faced by a
decision maker trying to discover optimal strategies in an unfamiliar
environment (Fig. 2). This theory has been successfully applied to an
environment that includes multiple decision makers 17,20,21,35,36. The
sum of future rewards expected from a particular action in a particular
state of the environment is referred to as the value function. Future
rewards are often exponentially discounted, so that immediate rewards
contribute more to the value function. Similar to utility functions in
economics, value functions determine the actions chosen by decision
makers. In addition, the difference between the reward predicted from
the value function and the actual reward is termed reward prediction
error. In simple or direct reinforcement learning algorithms, value
functions are updated only for chosen actions and only when there is a
reward prediction error 2 .
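As an illustration, the update described here can be written in a few lines (a minimal Python sketch with assumed learning-rate and temperature parameters, treating each trial's payoff as the reward; not from the article): only the value of the chosen action changes, in proportion to the reward prediction error.

import math
import random

LEARNING_RATE = 0.2          # assumed value
INVERSE_TEMPERATURE = 3.0    # assumed value for the softmax choice rule

def choose(values):
    """Softmax choice: actions with larger value functions are selected more often."""
    actions = list(values)
    weights = [math.exp(INVERSE_TEMPERATURE * values[a]) for a in actions]
    return random.choices(actions, weights=weights)[0]

def update(values, action, reward):
    """Simple (direct) reinforcement learning: update only the chosen action."""
    prediction_error = reward - values[action]
    values[action] += LEARNING_RATE * prediction_error

values = {'head': 0.0, 'tail': 0.0}
action = choose(values)
update(values, action, reward=1.0)   # e.g., the chosen action happened to be rewarded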
Although reward has a powerful effect on choice behavior, decision makers receive many other signals from their environment. For example, they may discover, after their choices, how much reward they could have received had they chosen a different action. When such hypothetical payoffs or fictive rewards differ from the rewards expected from the current value functions, the resultant error signals, called fictive reward prediction error 37 or regret 38, can be used to update the value functions of corresponding actions. Such errors can indeed influence the decision maker's subsequent behaviors during financial decision making 37. In model-based reinforcement learning algorithms, fictive reward signals can be generated from various types of simulations or inferences based on the decision maker's model or knowledge of the environment. These fictive reward signals might be crucial in social decision making when the simulated environment includes other decision makers (Fig. 2).
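In a game such as matching pennies, the opponent's observed choice reveals what the forgone action would have earned, so a fictive error can update the unchosen action as well; the following sketch (illustrative Python code with assumed learning rates, not from the article) extends the direct-learning update above in that way.

REAL_RATE = 0.2       # learning rate for the chosen action (assumed)
FICTIVE_RATE = 0.1    # learning rate for the unchosen action (assumed)

def matcher_payoff(my_choice, opponent_choice):
    """Matcher's payoff in matching pennies (Fig. 1a): +1 on a match, -1 otherwise."""
    return 1.0 if my_choice == opponent_choice else -1.0

def update_with_fictive_error(values, my_choice, opponent_choice):
    other = 'tail' if my_choice == 'head' else 'head'
    real_error = matcher_payoff(my_choice, opponent_choice) - values[my_choice]
    fictive_error = matcher_payoff(other, opponent_choice) - values[other]
    values[my_choice] += REAL_RATE * real_error       # standard update of the chosen action
    values[other] += FICTIVE_RATE * fictive_error     # update driven by the forgone payoff

values = {'head': 0.0, 'tail': 0.0}
update_with_fictive_error(values, 'head', 'tail')
print(values)   # {'head': -0.2, 'tail': 0.1}: the losing choice is devalued, the forgone one upgraded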
Evidence for altruistic social preference and aversion to inequality is
also seen in other experimental games, such as the dictator game, the
ultimatum game and the trust game 17,32 , and their possible neural
substrates have been examined 33 . In the dictator game, a dictator
receives a fixed amount of money and donates a part of it to the
recipient. This ends the game, so there is no opportunity for the
recipient to retaliate. Any amount of donation reduces the payoff to the
dictator, so the amount provides a measure of altruism. During dictator
games, people tend to donate on average about 25% of their money 17 .
Figure 2 A model-based reinforcement learning model applied to social decision making. The decision maker receives reward according to his or her own action and those of other decision makers (DM) in the environment and updates the value functions according to the reward prediction error. In addition, the decision maker updates his or her model (virtual environment), including the predicted actions of other decision makers. The fictive reward prediction errors resulting from such model simulations also influence the value functions. The blue background indicates computations internalized in the decision maker's brain.
In game theory, estimating the payoffs from alternative strategies
based on the expected actions of other players is referred to as belief
learning 16,17 . For example, imagine you observe that a particular
decision maker tends to apply the strategy of tit-for-tat during a
repeated prisoner’s dilemma game. By simulating hypothetical inter-
actions, you can update the value functions for cooperation and
defection and might discover that cooperation with this player would
produce a higher average payoff than defection. Belief learning and
other model-based reinforcement learning algorithms can also update
value functions for multiple actions simultaneously. So far, studies on
competitive games in humans and other primates have failed to
provide strong evidence for such model-based reinforcement learning
or belief learning 20,36,39,40 . In contrast, both theoretical and empirical
studies show that the reputation and moral characters of individual
players influence the likelihood and degree of cooperation 41–43 .For
example, a player who has donated frequently in the past is more likely
to receive donations when such information is publicly available 42 .
Similarly, people tend to invest more money as investors in trust games
when they face individuals with positive moral qualities 43 .Therefore,
belief learning models might account for how images of individual
players are propagated.
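The tit-for-tat example above can be made concrete with a short simulation (illustrative Python code, not from the article): once the opponent is modeled as a tit-for-tat player, the long-run values of always cooperating and always defecting can be estimated by simulation alone, before any further interaction.

# Payoffs to the simulating player from Fig. 1b (own payoff only).
PD_PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def simulated_average_payoff(my_strategy, rounds=100):
    """Average payoff of a fixed strategy ('C' or 'D') against a modeled
    tit-for-tat opponent, who cooperates first and then copies our last move."""
    opponent_move = 'C'
    total = 0.0
    for _ in range(rounds):
        total += PD_PAYOFF[(my_strategy, opponent_move)]
        opponent_move = my_strategy   # tit-for-tat: copy the simulating player's last move
    return total / rounds

print(simulated_average_payoff('C'))   # 3.00: sustained mutual cooperation
print(simulated_average_payoff('D'))   # 1.04: one exploitation, then mutual defection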
Neural correlates of social decision making
Socially interactive decision making tends to be dynamic, and the
process of discovering an optimal strategy can be further complicated
because decision makers often act according to their preferences
concerning the consequences to other individuals, often referred to
as ‘other-regarding preferences’. Nevertheless, the basic neural processes
involved in outcome evaluation and reinforcement learning might be
generally applicable, whether or not the outcome of choice is deter-
mined socially. For example, neurons in the dorsolateral prefrontal
cortex of rhesus monkeys often encode signals related to the animal’s
previous choice and its outcome conjunctively, not only during a
memory-saccade task 68 but also in a computer-simulated matching-
pennies task 56 . Neurons in the posterior parietal cortex also modulate
their activity according to expected reward or its utility during both a
foraging task 52 and a computer-simulated competitive game 53 .Simi-
larly, in imaging studies, many brain areas involved in reward evalua-
tion and reinforcement learning, such as the striatum, insula and
orbitofrontal cortex, are also recruited during social decision making
(Fig. 3). However, as described below, activity in these brain areas
during social decision making is also influenced by factors that are
particular to social interactions.
One of the areas that is critical in socially interactive decision making
is the striatum. During decision making without social interaction,
activity in the striatum is influenced by both real and fictive reward
prediction errors 11,12,37 . Reward prediction errors during social deci-
sion making also lead to activity changes in the striatum. For example,
during the prisoner’s dilemma game, cooperation results in a positive
BOLD response in the ventral striatum when cooperation is recipro-
cated by the partner, but produces a negative BOLD response in the
same areas when the cooperation is not reciprocated 69,70. In addition, the caudate nucleus of the trustee in a repeated trust game shows activity correlated with the reputation of the investor 71. When investors in trust games receive detailed descriptions of the trustees' positive moral characters, the investors tend to invest money more frequently. Moreover, activity in the caudate nucleus of the investor related to the decision of the trustee is attenuated or abolished when the investor relies on information about the trustee's moral character 43.
Neural basis of reinforcement learning
During the last decade, reinforcement learning theory has become a
dominant paradigm for studying the neural basis of decision making
(see other articles in this issue and ref. 44). In nonhuman primates,
midbrain dopamine neurons encode reward prediction errors 10,45 .
Dopamine neurons also decrease their activity when the expected
reward is delayed 46
or omitted 10,47. In addition, neurons in many areas of the primate brain, including the amygdala 48, the basal ganglia 49–51, the posterior parietal cortex 3,52,53, the lateral prefrontal cortex 54–56, the medial frontal cortex 57–59 and the orbitofrontal cortex 60–62, modulate their activity according to rewards and value functions. Nevertheless, how these signals related to value functions in many areas are updated by real and fictive reward error signals and influence action selection is still largely unknown 63.
In human neuroimaging, signals related to expected reward are found in several brain areas, such as the amygdala, the striatum, the insula and the orbitofrontal cortex 64–66. The noninvasive nature of neuroimaging makes it possible to investigate the neural mechanisms of complex financial and social decision making in humans. On the other hand, the signals measured in neuroimaging studies, such as blood oxygen level–dependent (BOLD) signals, reflect the activity of individual neurons only indirectly. In particular, BOLD signals in functional magnetic resonance imaging (fMRI) experiments may reflect inputs to a given brain area more closely than outputs from it 67. Comparisons of results obtained from single-neuron recording and fMRI studies must take into consideration such methodological differences.
This result raises the possibility that the
striatal response to the reward received by
others might change depending on whether
a particular social interaction is perceived as
competition or cooperation. Indeed, during a
board game in which the subjects are required
to interact competitively or cooperatively, sev-
eral brain areas are activated differentially
depending on the nature of the interaction 81 .
For example, compared to competition, coop-
eration results in stronger activation in the
anterior frontal cortex and medial orbitofron-
tal cortex. However, whether and how these
cortical areas influence the striatal activity
related to social preference is not known.
Social decision making frequently requires
theory of mind—the ability to predict the
actions of other players based on their knowledge and intentions 82,83 .
Many neuroimaging studies find that social interactions with human
players produce stronger activations than similar interactions with
computer players in several brain areas 84–86 , typically including the
anterior paracingulate cortex (Fig. 3c). Assuming that more sophisti-
cated inferences are used to deal with human players than with computer
players, such findings might provide some clues concerning the cortical
areas specialized for theory of mind. Accordingly, the anterior para-
cingulate cortex might be important in representing mental states of
others 82,84–88 . In the trust game, the cingulate cortex seems to represent
information about the agent responsible for a particular outcome 89 .The
cortical network involved in theory of mind and perception of agency,
however, is still not well characterized and is likely to involve additional
areas. For example, the posterior superior temporal cortex is implicated
in perception of agency 83,86,88 , and its activity correlates with the subject’s
tendency toward altruistic behavior 90 .
Figure 3 Brain areas involved in social decision making. (a,b) Coronal sections of the human brain showing the caudate nucleus (CD), the insula (Ins) and the orbitofrontal cortex (OFC). (c) Sagittal section showing the anterior paracingulate cortex (APC). Arrows indicate approximate locations of the sections shown in a and b.
As described above, theoretical and behavioral studies show that
altruistic punishment of unfair behaviors promotes cooperation.
Neuroimaging studies provide important insight into the neural
mechanisms for producing such costly punishing acts. For example,
during the ultimatum game, unfair offers produce stronger activation
in the recipient’s anterior insula when they are rejected than when they
are accepted 72 . Because the insula is involved in evaluation of various
negative emotional states, such as disgust 73 , its activation during the
ultimatum game might reflect negative emotions associated with unfair
offers. In addition, the investors who have the option of punishing
unfair trustees during the trust game at a cost to themselves often apply
such punishment 74 . This punishment may have some hedonic value to
the investors, as activity in the caudate nucleus of the investor is
correlated with the magnitude of punishment and increases only when
punishment is effective. Comparing the proposer’s brain activity during
the ultimatum game and the dictator game shows that the dorsolateral
prefrontal cortex, lateral orbitofrontal cortex and caudate nucleus are
important in evaluating the threat of potential punishment 75 .
Inequality aversion can give rise not only to altruistic punishment of
norm violators but also to charitable donation. The mesolimbic
dopamine system, including the ventral tegmental area and the
striatum, is activated by both personal monetary reward and the
decision to donate money to charity 76 . In contrast, activity in the
lateral orbitofrontal cortex increases when decision makers oppose a
charitable organization by refusing to donate. Activity in the caudate
nucleus and ventral striatum increases with the amount of money
donated to a charity, even when the donation is mandatory 77 , but the
activity in both of these areas is higher when the donation is voluntary.
Although fairness norms strongly influence social decision making,
what is considered fair is likely to depend on various contextual factors,
such as the sense of entitlement 78 and the need for competitive
interactions with other players 79 . Similarly, when two participants
play the same game and receive a monetary reward for correct answers,
activity in the ventral striatum increases with the amount of money
paid to the subject but decreases with the amount of money earned by
the partner 80 . In other words, when the subjects are evaluated and
rewarded by the same criterion, activity in the ventral striatum is more
closely related to the subject’s relative payment compared to the
partner’s payment than to the absolute payment of the subject.
Genetic and hormonal factors in social decision making
The fitness value of many social behaviors, such as cooperation with
genetically unrelated individuals, often depends on various environ-
mental conditions, including the prevalence of individuals with the
same behavioral traits. Thus, individual traits related to social decision
making could remain heterogeneous in the population because the
selective forces favoring different traits could be balanced 91 . Indeed,
studies on experimental games commonly show substantial individual
variability in the behavior of decision makers, and neuroimaging
studies on social behavior find that activity in several brain areas,
such as the striatum and insula, correlates with the decision maker’s
tendency to show altruistic behaviors 69,72,74,75 . Some of this variability
might be due to genetic factors. For example, the minimum acceptable
offer during an ultimatum game is more similar between monozygotic
twins than between dizygotic twins 92 .
The genetic mechanisms regulating dopaminergic and serotonergic
synaptic transmission might underlie individual differences in beha-
viors and neural circuits implicated in reinforcement learning and
therefore contribute to individual variability in social decision making.
Among the genes related to dopamine functions, the dopamine
receptor D2 (DRD2) gene has received much attention. For example,
DRD2 polymorphisms, such as Taq1A and C957T, influence how
efficiently decision makers can modify their choice behavior according to the negative consequences of their previous actions 93,94. The
Taq1A polymorphism also influences the magnitude of fMRI signals
related to negative feedback 93 . In contrast, polymorphism in the dopa-
mine- and cyclic AMP–regulated phosphoprotein of molecular weight
32 kDa (DARPP-32) influences the rate of learning based on positive
outcomes, and a valine/methionine polymorphism in catechol-O-
methyltransferase (COMT) might influence the ability to adjust choices
rapidly on a trial-by-trial basis by modulating dopamine in the prefrontal
cortex 56,94 . Variations in the proteins involved in serotonin metabolism,
such as the serotonin transporter–linked polymorphism (5-HTTLPR),
might also influence social decision making 95,96 . For example, rhesus
monkeys carrying only the short variant of 5-HTTLPR are less able to switch responses in object-discrimination reversal learning and show more aggression than monkeys carrying the long variant 97. Little is known, however,
about the neurophysiological changes associated with genetic variability
that might underlie behavioral changes in social decision making. In
addition, any effects of genetic variability on such complex behaviors as
social decision making are likely to involve interactions among many
genes and between genes and the environment 96,98 .
Hormones are also known to influence social behavior. For example,
high testosterone increases the likelihood that a recipient will reject
relatively low offers during the ultimatum game 99 , and oxytocin
increases the amount of money transferred by the investor during
the trust game 100 .
12. McClure, S.M., Berns, G.S. & Montague, P.R. Temporal prediction errors in a passive
learning task activate human striatum. Neuron 38, 339–346 (2003).
13. O’Doherty, J. et al. Dissociable roles of ventral and dorsal striatum in instrumental
conditioning. Science 304, 452–454 (2004).
14. Tricomi, E.M., Delgado, M.R. & Fiez, J.A. Modulation of caudate activity by action
contingency. Neuron 41, 281–292 (2004).
15. Adolphs, R. Social cognition and the human brain. Trends Cogn. Sci. 3, 469–479 (1999).
16. Fudenberg, D. & Levine, D.K. The Theory of Learning in Games (MIT Press, Cambridge,
Massachusetts, USA, 1998).
17. Camerer, C.F. Behavioral Game Theory: Experiments in Strategic Interaction (Princeton
Univ. Press, Princeton, New Jersey, USA, 2003).
18. von Neumann, J. & Morgenstern, O. Theory of Games and Economic Behavior
(Princeton Univ. Press, Princeton, New Jersey, USA, 1944).
19. Nash, J.F. Equilibrium points in n-person games. Proc. Natl. Acad. Sci. USA 36,
48–49 (1950).
20. Erev, I. & Roth, A.E. Predicting how people play games: reinforcement learning
in experimental games with unique, mixed strategy equilibria. Am. Econ. Rev. 88,
848–881 (1998).
21. Lee, D., Conroy, M.L., McGreevy, B.P. & Barraclough, D.J. Reinforcement learning and
decision making in monkeys during a competitive game. Brain Res. Cogn. Brain Res.
22, 45–58 (2004).
22. Sally, D. Conversation and cooperation in social dilemmas. Ration. Soc. 7, 58–92 (1995).
23. Fehr, E. & Fischbacher, U. The nature of human altruism. Nature 425, 785–791
(2003).
24. Hauser, M.D., Chen, M.K., Chen, F. & Chuang, E. Give unto others: genetically
unrelated cotton-top tamarin monkeys preferentially give food to those who altruisti-
cally give food back. Proc. R. Soc. Lond. B 270, 2363–2370 (2003).
25. Warneken, F. & Tomasello, M. Altruistic helping in human infants and young chim-
panzees. Science 311, 1301–1303 (2006).
26. de Waal, F.B.M. Putting the altruism back into altruism: the evolution of empathy.
Annu. Rev. Psychol. 59, 279–300 (2008).
27. Nowak, M.A. Five rules for the evolution of cooperation. Science 314, 1560–1563
(2006).
28. Clutton-Brock, T.H. & Parker, G.A. Punishment in animal societies. Nature 373,
209–216 (1995).
29. Fehr, E. & Gächter, S. Altruistic punishment in humans. Nature 415, 137–140 (2002).
30. Boyd, R., Gintis, H., Bowles, S. & Richerson, P.J. The evolution of altruistic punish-
ment. Proc. Natl. Acad. Sci. USA 100, 3531–3535 (2003).
31. Fehr, E. & Schmidt, K.M. A theory of fairness, competition, and cooperation. Q. J. Econ.
114, 817–868 (1999).
32. Fehr, E. & Camerer, C.F. Social neuroeconomics: the neural circuitry of social
preferences. Trends Cogn. Sci. 11, 419–427 (2007).
33. Sanfey, A.G. Social decision-making: insights from game theory and neuroscience.
Science 318, 598–602 (2007).
34. Jensen, K., Call, J. & Tomasello, M. Chimpanzees are rational maximizers in an
ultimatum game. Science 318, 107–109 (2007).
35. Sandholm, T.W. & Crites, R.H. Multiagent reinforcement learning in the iterated
prisoner’s dilemma. Biosystems 37, 147–166 (1996).
36. Lee, D., McGreevy, B.P. & Barraclough, D.J. Learning and decision making in monkeys
during a rock-paper-scissors game. Brain Res. Cogn. Brain Res. 25, 416–430 (2005).
37. Lohrenz, T., McCabe, K., Camerer, C.F. & Montague, P.R. Neural signature of fictive
learning signals in a sequential investment task. Proc. Natl. Acad. Sci. USA 104,
9493–9498 (2007).
38. Coricelli, G., Dolan, R.J. & Sirigu, A. Brain, emotion and decision making: the
paradigmatic example of regret. Trends Cogn. Sci. 11, 258–265 (2007).
39. Mookherjee, D. & Sopher, B. Learning and decision costs in experimental constant sum
games. Games Econ. Behav. 19, 97–132 (1997).
40. Feltovich, N. Reinforcement learning vs. belief-based learning models in experimental
asymmetric-information games. Econometrica 68, 605–641 (2000).
41. Nowak, M.A. & Sigmund, K. Evolution of indirect reciprocity by image scoring. Nature
393, 573–577 (1998).
42. Wedekind, C. & Milinski, M. Cooperation through image scoring in humans. Science
288, 850–852 (2000).
43. Delgado, M.R., Frank, R.H. & Phelps, E.A. Perceptions of moral character modulate
the neural systems of reward during the trust game. Nat. Neurosci. 8, 1611–1618
(2005).
44. Kawato, M. & Samejima, K. Efficient reinforcement learning: computational theories,
neuroscience and robotics. Curr. Opin. Neurobiol. 17, 205–212 (2007).
45. Schultz, W. Behavioral theories and the neurophysiology of reward. Annu. Rev. Psychol.
57, 87–115 (2006).
46. Roesch, M.R., Calu, D.J. & Schoenbaum, G. Dopamine neurons encode the better
option in rats deciding between differently delayed or sized rewards. Nat. Neurosci. 10,
1615–1624 (2007).
47. Bayer, H.M. & Glimcher, P.W. Midbrain dopamine neurons encode a quantitative reward
prediction error signal. Neuron 47, 129–141 (2005).
48. Paton, J.J., Belova, M.A., Morrison, S.E. & Salzman, C.D. The primate amygdala
represents the positive and negative value of visual stimuli during learning. Nature
439, 865–870 (2006).
49. Kawagoe, R., Takikawa, Y. & Hikosaka, O. Expectation of reward modulates cognitive
signals in the basal ganglia. Nat. Neurosci. 1, 411–416 (1998).
Conclusion
Social decision making is one of the most complex animal behaviors. It
often requires animals to recognize the intentions of other animals
correctly and to adjust behavioral strategies rapidly. In addition, humans
can cooperate or compete with one another, and various contextual
factors influence the extent to which humans are willing to sacrifice their
personal gains to increase or decrease the well-being of others. The
neural basis of such complex social decision making can be investigated
quantitatively by applying game theory. These studies find that the key
brain areas involved in reinforcement learning, such as the striatum and
orbitofrontal cortex, also underlie choices made in social settings.
Nevertheless, our current knowledge of neural mechanisms for social
decision making is still limited. This situation will improve as we come
to understand the genetic and neurophysiological basis of information
processing in the brain’s reward system.
ACKNOWLEDGMENTS
I am grateful to Michael Frank for discussions. This research was supported by
the US National Institutes of Health (MH073246 and DA024855).
Reprints and permissions information is available online at http://npg.nature.com/
1. Kahneman, D. & Tversky, A. Prospect theory: an analysis of decision under risk.
Econometrica 47, 263–292 (1979).
2. Sutton, R.S. & Barto, A.G. Reinforcement Learning: An Introduction (MIT Press, Cambridge, Massachusetts, USA, 1998).
3. Platt, M.L. & Glimcher, P.W. Neural correlates of decision variables in parietal cortex.
Nature 400, 233–238 (1999).
4. Breiter, H.C., Aharon, I., Kahneman, D., Dale, A. & Shizgal, P. Functional imaging of
neural responses to expectancy and experience of monetary gains and losses. Neuron
30, 619–639 (2001).
5. Fiorillo, C.D., Tobler, P.N. & Schultz, W. Discrete coding of reward probability and
uncertainty by dopamine neurons. Science 299, 1898–1902 (2003).
6. Glimcher, P.W. & Rustichini, A. Neuroeconomics: the consilience of brain and decision.
Science 306, 447–452 (2004).
7. Knutson, B., Taylor, J., Kaufman, M., Peterson, R. & Glover, G. Distributed neural
representation of expected value. J. Neurosci. 25, 4806–4812 (2005).
8. Hsu, M., Bhatt, M., Adolphs, R., Tranel, D. & Camerer, C.F. Neural systems responding
to degrees of uncertainty in human decision-making. Science 310, 1680–1683 (2005).
9. Preuschoff, K., Bossaerts, P. & Quartz, S.R. Neural differentiation of expected reward
and risk in human subcortical structures. Neuron 51, 381–390 (2006).
10. Schultz, W., Dayan, P. & Montague, P.R. A neural substrate of prediction and reward.
Science 275, 1593–1599 (1997).
11. O’Doherty, J.P., Dayan, P., Friston, K., Critchley, H. & Dolan, R.J. Temporal difference
models and reward-related learning in the human brain. Neuron 38, 329–337 (2003).