1 a ads-b_sdr poster of joe zambrano (v1.1) – print resolution
Empirical Implications of Information Structure in Finite ... · November 10, 2004 Our thinking on...
Transcript of Empirical Implications of Information Structure in Finite ... · November 10, 2004 Our thinking on...
Empirical Implications of InformationStructure in Finite-LengthExtensive-Form Games�
Jose Penalva-ZuastiUniversitat Pompeu Fabra
Michael D. Ryally
University of Melbourne
November 10, 2004
�Our thinking on the issues analyzed in this paper have been substantially in�uenced by many�ne colleagues, including Adam Brandenburger, Colin Camerer, David Chickering, Catherine deFontenay, Eddie Dekel, Bryan Ellickson, Joshua Gans, Fabrizio Germano, Ehud Kalai, U¤e Kjærul¤,David Levine, Steven Lippman, Glenn MacDonald, Mark Machina, Leslie Marx, Joseph Ostroy,Scott Page, Judea Pearl, Ben Polak, Larry Samuelson, Joel Watson, Eduardo Zambrano, Bill Zame,participants in the MEDS, Georgetown, UCLA, and UCSD theory workshops as well as the StanfordSITE and Stonybrook theory conferences. The authors are, of course, entirely responsible for allshortcomings.
yCorrespondence should be sent to: M. D. Ryall, Melbourne Business School, 200 Leicester St.,Carlton, VIC 3053, Australia.
1
Abstract
We analyze what can be inferred about a game�s information structure solely from
the probability distributions on action pro�les generated during play. The kinds of
situations we have in mind are those in which the only data available to a social scien-
tist analyzing a game with unknown information structure is a cross-sectional listing
of actions actually taken by each of the players; i.e., data that is uninformative with
respect to who moved when and what they knew at the time of their action choice.
One game is said to empirically compatible to another when the empirical distribu-
tion induced by any behavior strategy pro�le in the latter can also be induced by an
appropriately chosen behavior strategy pro�le in the former. We introduce the in�u-
ence opportunity diagram of a game �a directed, acyclic graph in which the nodes
correspond to player moves and arcs to in�uence possibilities �and demonstrate a
simple methodology that uses such diagrams to test for empirical compatibility. The
essential idea is to connect a game�s information structure, which identi�es the indi-
vidual histories upon which players condition their behavior, to a corresponding set
of conditional independencies that must be observed in all of its empirical distribu-
tions. When two games with di¤erent information structures imply di¤erent sets of
such independencies, then knowledge of the empirical distribution provides a basis
upon which one may be distinguished from the other. In�uence opportunity diagrams
summarize these independencies in a useful way.
Keywords: empirical inference, information structure, extensive form, compatibility,
Bayesian network, causality
2
1 Introduction
Applied noncooperative game theory has spawned numerous streams of empirical
research in economics and other social sciences. A common feature of virtually all
contributions along these lines is that they begin with a full speci�cation of the game
presumed to generate the data.1 An equilibrium concept is then applied and either
tested or used to infer unknown values, depending upon the type of study at hand.
It is well understood that di¤erent assumptions about game structure (particularly
about payo¤s and information) lead to very di¤erent empirical implications. Indeed,
a common complaint invariably encountered at some point by every game theorist
is that any observed economic phenomenon can be described by some appropriately
constructed game and, hence, the theory has no real empirical content. This critique
would be neutralized were the theory able to deliver, say, a joint test of game structure
and the equilibrium concept of interest or, alternatively, a procedure in which game
structure is estimated from data, an equilibrium concept is applied and unknown
values of interest are empirically quanti�ed.
Unfortunately, little is presently known about the estimation of game structure
from data. A notable exception is the developing literature on the testable implica-
tions of game theory in applications with unobserved payo¤s (e.g., Sprumont, 2000;
Ray and Zhou, 2001; Ray and Snyder, 2004). In these papers, unobserved agent
preferences are assumed to meet certain consistency restrictions (e.g., completeness,
transitivity). Then, using classical revealed preference style reasoning (Samuelson,
1938), results are derived that allow one to determine whether observed outcomes are
consistent with certain equilibrium concepts (e.g., Nash versus subgame perfection).
Promising though they are, application of these results requires the pre-speci�cation
of the underlying information structure �an (untested) assumption with respect to
which their implications are not generally robust.
In this paper, we take one step forward in addressing these issues. Speci�cally, we
1In the case of the experimental literature, the structure of the game is actually known.
3
analyze what can be inferred about a game�s information structure solely from the
probability distributions on action pro�les generated during play (which we refer to
as the �empirical distributions�induced by the pro�le of player behavior strategies).
The kinds of situations we have in mind are those in which the only data available to
the social scientist is a cross-sectional listing of actions actually taken by each of the
players; i.e., data that is uninformative with respect to who moved when and what
they knew at the time of their action choice. We wish to analyze the extent to which
such data identi�es the information structure of the underlying game.
To begin, we introduce the notion of empirical compatibility. One game is said
to empirically compatible to another when the empirical distribution induced by any
behavior strategy pro�le in the latter can also be induced by an appropriately chosen
behavior strategy pro�le in the former. Thus, even under in�nite repetition, it is
impossible to distinguish a game from any in its empirical compatibility class solely
on the basis of the observed behavior of its players. Hence, empirical compatibil-
ity is su¢ cient to render two games observationally indistinguishable without, e.g.,
additional assumptions about equilibrium behavior.
An important result of this paper is to associate with a certain class of economi-
cally important, �nite-length extensive form games a surprisingly simple methodology
by which to test empirical compatibility. The essential idea is to connect a game�s
information structure, which identi�es the individual histories upon which players
condition their behavior, to a corresponding set of conditional independencies that
must be observed in all of its empirical distributions. When two games with di¤erent
information structures imply di¤erent sets of such independencies, then knowledge
of the empirical distribution provides a basis upon which one may be distinguished
from the other.
Our analysis is facilitated by the introduction of a new graphical device, the
in�uence opportunity diagram of a game (hereafter, IOD). The IOD is a directed,
acyclic graph whose nodes correspond to the moves made by players in the under-
4
lying game. Given an extensive form game, we illustrate how to construct its IOD.
Then, we demonstrate that the IOD is an �independence map� of every empirical
distribution that can be generated in the underlying game. That is, according to
a particular notion of conditional independence in graphs (introduced below), every
conditional independence in a game�s IOD implies a corresponding conditional inde-
pendence between player behaviors in all of the game�s empirical distributions. Our
work is closely related to, indeed contributes to, the rapidly-expanding literature on
probabilistic networks.2
This result is signi�cant because a necessary requirement for one game to be em-
pirically compatible to another is that their IODs imply a consistent set of conditional
independencies. This consistency, as it turns out, is relatively easy to determine (i.e.,
via visual inspection of the graphs). IOD consistency is not, in general, su¢ cient to
establish empirical compatibility because di¤erences in the speci�c information upon
which players condition their behavior may imply additional restrictions on empirical
distributions that are not picked up by the IOD. However, we do identify an econom-
ically important subclass of games, termed games with observability partitions, for
which this condition is also su¢ cient.
The layout of the remainder of the paper is as follows. In the next section,
we present two extended examples designed to motivate our results and illuminate
the ideas that are generalized later in the paper. In §3, we set up the formalities.
§4 presents and illustrates the de�nition of a game�s IOD. Some useful preliminary
results are contained in §5. The de�nition of empirical compatibility is presented in
§6 along with illustrative examples. Our main results are found in §7. Extensions are
discussed in §8. In §9, we provide some concluding remarks.
2Since most economists are not familiar with this literature, we have included a condensed,
self-contained discussion of some relevant results and references in Appendix A.
5
2 Examples and Intuition
In this section, we present two examples designed: a) to highlight the issues that
arise when game structure is unknown, (one in which the issue of interest is a test
of equilibrium re�nement and another in which the issue is estimation of cost para-
meters in a market game); and, b) to illustrate the notion of empirical compatibility
and foreshadow our analytical approach. Throughout, we assume that the data ob-
served by the social scientist is generated by an extensive form game in which the
participating players know all the relevant structural details. Since the interest is in
empirical work, we also assume that players take noisy actions and that the noise
is independent across player moves. To keep things simple in this section, we do
not add explicit �nature�nodes to the extensive forms to represent the noise terms
but, instead, embed them directly into player strategies in the form of independent
�trembles.�
2.1 A test of equilibrium
Suppose a social scientist observes a two player interaction in which player I plays
either L or R and player II either u or d. The scientist hypothesizes that the true
underlying game is the one shown in Figure 1 below. Although she is certain that
this model captures all the relevant agents (e.g., there are no unobserved factors
in�uencing outcomes outside the choices of the two agents shown) and their payo¤s,
she is unsure of the game�s information structure (who moves when and what they
know at the time).
The social scientist�s data set represents a tally of the outcomes observed over a
large number of repetitions. She would like to know whether the data are consistent
with one of the game�s two pure strategy Nash equilibria, (L; u) and (R; d) : Assuming
player I chooses an action with noise �I and player II with noise �II ; the data set
should be consistent with one of the following frequency tables (the one on the left
6
ΙΙ
Ι
L R
du
ΙΙ
du
(10, 10) (5, 5)(0, 0)(0, 0)
ΙΙ
Ι
L R
du
ΙΙ
du
(10, 10) (5, 5)(0, 0)(0, 0)
Figure 1: two player, simultaneous move game
corresponding to (L; u) and the one on the right to (R; d)).
Table 1
Outcome Frequency
(Lu) (1� �I) (1� �II)
(Ld) (1� �I) (�II)
(Ru) (�I) (1� �II)
(Rd) (�I) (�II)
Table 2
Outcome Frequency
(Lu) (�I) (�II)
(Ld) (�I) (1� �II)
(Ru) (1� �I) (�II)
(Rd) (1� �I) (1� �II)
Suppose the actual frequencies of outcomes are as shown in the following table.
Table 3
Outcome Frequency
(Lu) :81
(Ld) :09
(Ru) :01
(Rd) :09
We call this the empirical distribution generated by play of some behavior strategy
7
pro�le. It is not di¢ cult to demonstrate that these frequencies do not conform to
either of the equilibrium distributions hypothesized above. There are no parameters
�I and �II that deliver the observed values. Apparently, then, the Nash hypothesis
should be rejected.
Such a conclusion would be premature, however. In fact, the data is consistent
with a Nash equilibrium (in particular, a subgame perfect one) in a di¤erent game
than the one shown in Figure 1. To see this, �rst note that the structure of the
hypothesized game (simultaneous move) implies Pr (~aI ; ~aII) = Pr (~aI) Pr (~aII) for all
empirical distributions generated by play (where ~ai is the action taken by player i).3
This restriction is clearly violated by the distribution in Table 3. By the de�nition
of conditional probability, every empirical distribution from any two player game can
be factored as
Pr (~aI ; ~aII) = Pr (~aII j~aI) Pr (~aI) : (1)
The parameters that deliver the distribution in Table 3 according to the factorization
in (1) are as follows.
L R
Pr (~aI) :9 :1
~aI Pr (uj~aI) Pr (dj~aI)
L :9 :1
R :1 :9
This is signi�cant because these parameters correspond exactly to (noisy) behavior
strategies in a di¤erent game, namely the one in which player I moves R or L, is
observed by player II who then moves u or d. Indeed, the behavior strategy pro�le
implied by these parameters is a subgame perfect equilibrium in which �I = �II = :1.
Hence, the data summarized in Table 3 is consistent with the joint hypothesis that
the underlying game is the complete information game with player I moving �rst and
in which play proceeds according to the subgame perfect equilibrium of that game.
3Correlated equilibria are ruled out by the assumption that the player set is complete; i.e., there
are no hidden correlating devices in the form of nature moves or historical player moves.
8
Are there other joint hypotheses with which this data is consistent? The answer
is yes. To see this, note that, again simply by applying the de�nition of conditional
probability, we can equally well consider the factorization
Pr (~aI ; ~aII) = Pr (~aI j~aII) Pr (~aII) : (2)
The corresponding parameters are
u d
Pr (~aII) :82 :18
~aII Pr (Lj~aII) Pr (Rj~aII)
u :99 :01
d :50 :50
These parameters correspond exactly to the game of complete information in which
player II moves �rst. In this case, play is consistent with a subgame perfect equilib-
rium with �II = :18 and error parameters for player I at his respective information
sets of �I;u = :01 and �I;d = :5:
Notice that an implication of equations (1) and (2) is that any empirical distrib-
ution arising as a result of play in the complete information game in which I moves
�rst can be empirically imitated by an appropriately chosen behavior strategy in the
complete information game in which II moves �rst, and visa versa. When two games
have this property, we say they are empirically equivalent.
Which, if either, of the two data-consistent joint hypotheses seem reasonable de-
pend upon what the social scientist wishes to assume about the relative magnitude of
the error terms, their consistency across information sets, and so on. The important
point is that the data in Table 3 rules out the game in Figure 1 [(as we will later
demonstrate)] and, hence, any equilibrium implications that might be drawn from it.
Moreover, the data implies an equivalence class of games with which it is consistent
and, happily in this example, equilibrium behavior as well.
9
2.2 Inferring �rm costs in market games
Now, suppose the social scientist is interested in estimating costs for a 3-�rm industry,
say automobiles, for which market shares have been estimated to be 40%, 40%, and
20% on aggregate unit volume of 100. As before, these estimates are taken from a
large number of observations. It is assumed that the industry is governed by a stable
Cournot mechanism in which inverse demand is estimated as p � 200�P3
i=1 qi: It is
not hard to show that the implied costs are c1 = c2 = 67 and c3 = 93; that is, �rm
3�s low market share is apparently due to a substantial cost disadvantage.
Suppose, however, that upon further inspection and controlling for external cor-
relating factors, our empiricist discovers that the estimated empirical distribution on
�rm production choices can be factored as
Pr (q1; q2; q3) = Pr (q3jq1; q2) Pr (q1) Pr (q2) (3)
but not
Pr (q1; q2; q3) = Pr (q3) Pr (q1) Pr (q2) : (4)
The �nding of (3) and not (4) is highly problematic with respect to the Cournot
hypothesis for, under the simultaneous-move assumption of Cournot, �rm quan-
tity choices should be independent. The dependencies in (3) suggest a di¤erent,
Stackleberg-type mechanism, one in which �rms 1 and 2 move �rst and are observed
by 3 prior to its production decision. This latter information structure is intuitively
captured by the directed graph
1! 3 2: (5)
As we will show momentarily, there is a close connection between graphs of this
type, which we call in�uence opportunity diagrams, and the conditional independen-
cies implied by a game�s information structure. Estimating costs according to the
Stackleberg-type consistent with (5) leads to the conclusion
c1 = c2 = c3 = 80;
10
which is both quantitatively and qualitatively di¤erent than that reached under
Cournot.
Finally, foreshadowing the results to follow, we simply point out that the Stack-
leberg game described by (5) is the only game in which every behavior strategy
generates an empirical distribution that can be factored as in (3). Moreover, this
conclusion can be reached by examination the game�s IOD without any further ref-
erence to its information structure.
3 Setup
Wherever possible, capital letters (X;Z) denote sets, small letters (a; w) elements of
sets and functions, and script letters, (A;F) collections of sets. Sets with product
structure are indicated by bold capitals (A;E) with small bold (a; e) denoting typical
elements (ordered pro�les) in such sets. Graphs and probability spaces play a large
role in the following analysis. Standard notation and de�nitions are adopted wherever
possible.
3.1 Finite extensive-form games
In order to make our analysis as transparent as possible, we begin with the simple
case of a �nite extensive form game in which each player has one move. We also
assume that the game has product structure; that is, a player has access to the same
set of feasible actions at every node. Later, we extend our results to more general
cases.
Let N � f1; :::; ng ; n < 1; index the set of players such that player k is the
one who has the kth move. The game tree is denoted (X;E) with nodes X, edges
E � X � X; and terminal nodes Z � X. The set of move-k nodes is denoted Xk:
Player k has a �nite set of move-k actions labelled Ak. The set of action pro�les is
A � �k2NAk: The assumption of product structure implies that a bijection exists
11
between Z and A. The payo¤ function is ~u : Z ! Rn in which component ~uk (z)
indicates the payo¤ to agent k when the outcome is z 2 Z: Finally, the information
structure is represented by X � fX1; :::;Xng where Xk � fXk;1; :::; Xk;mkg is the
partition of the move-k game tree nodes into their corresponding information sets.
Two conditions that are maintained throughout the paper are: 1) correlated
strategies are prohibited without an explicit correlating device (i.e., all factors that
have the potential to simultaneously in�uence the behavior of multiple players are
explicitly identi�ed in the model), and 2) no player has the ability to alter the in�u-
ence relationships of his opponents (e.g., the player with the �rst move cannot take
an action that switches the players moving second and third). In order to identify
in�uence relations from empirical data, it seems reasonable to limit our attention to
situations in which such relations are stationary.4
We adopt the convention that functions on Z are denoted with tildes to both
emphasize this fact and to distinguish such functions from the speci�c values they
take: For example, we de�ne the function ~xk : Z ! X where ~xk (z) indicates the kth
node on the game tree path terminating at z; so, ~xk (z) = x for some x 2 Xk: Similarly,
~ak : Z ! Ak where ~ak (z) is the action corresponding to the kth edge on the game tree
path terminating at z: Then, ~a � (~a1; :::; ~an) so that ~a (z) 2 A identi�es the action
pro�le associated with z 2 Z: The move-k history is given by ~hk � (~a1; :::; ~ak�1) with~h1 an arbitrary constant. Finally, let ~Xk(z) = Xk;l indicate the move-k information
set that intersects the path terminating in z (i.e., Xk 3 Xk;l 3 ~xk(z)).
Taking the perspective of a social scientist viewing the game from the outside,
the strategies adopted by the players are unobserved parameters that determine the
observed outcomes. Let lk � jAkj denote the number of actions in Ak: Then, every
node in Xk has lk edges emanating from it: A behavior strategy for player k 2 N in
4This is implied by the product structure of the game.
12
this �nite setup is a matrix of conditional probabilities
�k �
26664�k1;1 � � � �k1;lk...
. . ....
�kmk;1� � � �kmk;lk
37775in which �ki;j is the probability player k places on action ak;j 2 Ak upon reaching
information set Xk;i 2 Xk: Hence, each row of �k corresponds to an information set
and constitutes a probability distribution on the elements of Ak: Let� � (�1; :::;�n)
denote a strategy pro�le and � the set of all such pro�les.
To keep track of strategy parameters, let ~�k (z) � �ki;j where ~Xk (z) = Xk;i and
~ak (z) = ak;j: So, ~�k (z) indicates the probability associated with the kth edge in the
path terminating at z given player k�s behavior strategy. As usual, behavior strategies
are restricted to assign equal probabilities to identical actions in the same information
set; that is,�8z; z0 2 Zj ~Xk (z) = ~Xk (z
0) ; ~ak (z) = ~ak (z0)�; ~�k (z) = ~�k (z
0) : (6)
In this context, the function ~� ��~�1; :::; ~�n
�identi�es the edge probabilities associ-
ated with each branch of the game tree given �:
3.2 Empirical distributions
In what follows, we assume that our intrepid social scientist knows who the players
are and, at the conclusion of play observes the actions taken by each; to him, a �data
point� is a list of actions along with the identities of the players that took them.
Every strategy pro�le � 2 � generates a probability measure �� on�Z; 2Z
�de�ned
in the following way:
8W 2 2Z ; �� (W ) �Xz2W
Yk2N
~�k (z) : (7)
We say �� is the empirical distribution induced by �. Note that, given the bijection
between Z and A; �� immediately implies a corresponding distribution on�A; 2A
�:
13
On the one hand, focusing on�A; 2A
�emphasizes our interest in what the social
scientist observes: On the other,�Z; 2Z
�is more useful in proving our results and, so,
this is the measure space used in most of what follows.
Note that, for all � 2 �, the de�nition of conditional probability permits us to
factor �� as follows
8z 2 Z; �� (z) �nYk=1
��
�~akj~hk
�(z) ��-a:e:; (8)
where ���~akj~hk
�is the ��-conditional distribution of ~ak given ~hk: To reduce nota-
tional clutter from this point on, we omit the ���-a:e:�quali�ers whenever it is clear
from the context.
For all i; k 2 N such that i < k; let ~hkni � (~a1; :::; ~ai�1; ~ai+1; :::; ~ak�1) ; i.e., ~hk ex-
cluding ~ai. Given� 2 �; from the de�nition of conditional probability and condition
(8), for all i; k 2 N such that i < k; ~ak is ��-independent of ~ai given ~hkni if
��
�~akj~hk
�= ��
�~akj~hkni
�(9)
In general, given random variables ~x; ~y; ~z on�Z; 2Z ; ��
�; the notation (~x ? ~yj~z)�
means: ~x is ��-independent of ~y given ~z.
4 In�uence opportunity diagrams
We now introduce the formal de�nition of an in�uence opportunity diagram for the
extensive form of a game (the description of the game without payo¤s). Every exten-
sive form game has an in�uence opportunity diagram with which it is associated and
is de�ned independently of player preferences. As we see from the following de�nition,
the construction is quite intuitive.
De�nition 1 The in�uence opportunity diagram (hereafter, IOD) implied by a �nite
extensive form (N; (X;E) ;A;X ) is a directed graph (N;!) such that i ! k if and
14
only if i < k, lk > 1 and there exist z 2 Z and z0 2 ~h�1kni
�~hkni (z)
�; such that
~Xk (z) 6= ~Xk (z0).
Hence, i! k if and only if there exists at least one move-k history (a1; :::; ai; :::; ak�1)
such that the player moving at i has the ability to �unilaterally �place the player
moving at k into a di¤erent information set simply by deviating to some other ac-
tion (i.e., holding other players�actions constant). When i ! k; in�uence (in the
sense that i and k�s moves are correlated in the empirical distribution) is only an
�opportunity�because player k may always choose to ignore the actions of player i.
Neither is i! k required for the appearance of in�uence in an empirical distribution
since player i may a¤ect player k�s behavior indirectly through some other player or
players.
Lemma 1 Every IOD is acyclic.
Proof. This follows immediately from the tree structure of (X;E) and the re-
quirement that i! k only if i < k:
To illustrate the construction we construct the IOD for extensive form of the
standard signalling game (Figure 2). In this game N denotes the Nature player (for
purposes of illustration we will treat N as a standard player). The signalling game
proceeds in the usual way: player I observes player N�s move before he chooses his
own and then player II, after observing player I�s move, chooses her action. The
set of possible play paths are de�ned by the product set fU;Dg � fL;Rg � fu; dg
� examples of paths of play are (U;L; u), (U;R; d), (D;L; u). Given the natural
order de�ned by the EF we start with the Nature player � this player moves �rst
and no one has the opportunity to in�uence it. Player I�s information is de�ned by
the partition �2 = fX2;1; X2;2g, where X2;1 = f(U � �)g, X2;2 = f(D � �)g (where �
represents a placeholder representing all the moves made by the player who chooses
the corresponding move). Consider N ! I, (1 ! 2), and pick a path, say (U;L; u);
~h2(U;L; u) = (U) so that ~h2n1(U;L; u) = ;. Clearly ~h�12n1(;) = Z. In this case we just
15
need to �nd a path, z0, such that ~X2(z0) 6= ~X2;1. As ~X2;1 = f(U; �; �)g any path in
f(D; �; �)g, say (D;L; u), su¢ ces to show that N ! I (1! 2).
N II
I
I
II
U
D
L R
L R
u
d
u
d
u
d
u
d
a1
a4
a3
a2
a5
a6
a7
a8
N II
I
I
II
U
D
L R
L R
u
d
u
d
u
d
u
d
a1
a4
a3
a2
a5
a6
a7
a8
Figure 2: extensive form of the Signalling game
Let us move to player II, which has two information sets X3;1 = f(�L�)g; X3;2 =
f(�; R; �)g. We want to see whether 2! 3: let us pick the same path as before, z =
(U;L; u); ~h3(U;L; u) = (U;L) so that ~h3n2(U;L; u) = (U). Now, ~h�13n2(U) = f(U; �; �)g.
Given ~X3;1 = f(�L�)g, pick z0 = (U;R; u). Clearly, z0 2 ~h�13n2�~h3n2(U;L; u)
�and
z0 62 ~X3;1, so that I ! II (2 ! 3). Finally, consider whether N ! II (1 ! 3) and
again use z = (U;L; u); ~h3n1(U;L; u) = (L) and ~h�13n1(L) = f(�; U; �)g. Now, for all
z0 2 ~h�13n1�~h3n1(U;L; u)
�, z0 2 ~X3;1. If we were to pick z in the other information set,
X3;2 = f(�R�)g the same arguments would be valid so that N 6! II (1 6! 3).
Observe that the � notation used in the description of information sets gives us
a quick though informal way to construct the IOD, namely, i ! k if i < k and one
cannot describe k�s information sets using � for all of i�s moves (alternatively, if in
the description of k�s information sets, an asterisk appears at every ith location then
i 6! k).
16
5 Preliminary Results
The following lemma establishes the rather obvious result that the conditional prob-
ability of ~ak given a history ~hk is simply the value of the probability parameter on the
edge associated with ~ak following the history ~hk. Although the proof is in Appendix
B, working through the proof will establish some useful ideas in a simple context.
Lemma 2 Given a game �, for all k 2 N; � 2 �; z 2 Z;
��
�~akj~hk
�(z) = ~�k (z) :
The next lemma demonstrates an important �nding regarding the relationship
between ~ak and ~ai under the IOD associated with a �nite extensive form. The proof
relies on an extension of the ideas used to prove Lemma 2.
Lemma 3 Given a game � and its IOD (N;!) ; for all k 2 N and all i 2 f1; :::; k � 1g ;
(i9 k), (8� 2 �)�~ak ? ~aij~hkni
��:
Proof. See Appendix B.
The following proposition is critical to all that follows. It establishes a direct
connection between a game�s IOD and certain conditional independencies that must
hold in every one of its empirical distributions. Given an IOD (N;!) ; let �k denote
the parents of k and, in this context, let ~�k represent the projection of ~hk into the
dimensions indexed by �k; the notation ~hkn�k has the obvious meaning.
Lemma 4 Given a game � and its IOD (N;!) ; for all k 2 N and all � 2 �;�~ak ? ~hkn�k j~�k
��:
Proof. See Appendix B.
The main proposition in this section, which follows, demonstrates the connection
between game�s IOD and any of the empirical distributions that might be generated as
a result of play. Speci�cally, the result is that every empirical distribution generated
17
in the play of a game can be factored in accordance with the parental relations
embodied in its IOD. This linkage will prove useful in that it allows certain empirical
properties of games to be easily compared by referring directly to their IODs rather
than the empirical distributions themselves.
Proposition 1 Given a game �; for all � 2 �; �� admits recursive factorization
according to (N;!) : That is,
� 2 �; 8�� =nYk=1
�� (~akj~�k) : (10)
Proof. Apply Lemma 4 to condition (8).
It should be noted that a game�s IOD is �minimal�in the sense that there exist
empirical distributions for which the equation in (10) fails by the removal of any
element of �k for all k 2 N: Such distributions are said to be maximally revealing5.
Of course, most empirical distributions will have more independencies than the ones
implied by (10). The canonical example of such a distribution is the one in which
all players ignore whatever information they have when making their move, in which
case �� =nYk=1
�� (~ak) :
6 Empirical Compatibility and Equivalence
A game �0 is said to be empirically compatible to � when the empirical distribution
induced by any strategy pro�le in � can also be induced by an appropriately chosen
strategy pro�le in �0. The social
scientist observing outcomes generated by repeated play of� in � will, eventually,
develop a fairly precise estimate of ��: However, if �0 is empirically compatible to �;
5Such distributions are known as distributions for which (N;!) is a perfect map (Geiger et al
(1990)) or that the distribution is faithful to (N;!) (Meek(1995)). Meek(1995) demonstrates that
for any DAG and �xed discrete state space there is a distribution to which that graph is a perfect
map.
18
then there is no collection of �-generated data capable of ruling out �0 as the true
underlying game.
An obvious necessary condition for empirical compatibility is that the games have
consistent player sets and outcome pro�les. Given a permutation f : N ! N; let
f (a) denote the permuted pro�le�af(k)
�k2N : Then, � and �
0 are said to be outcome
compatible if there exists an f such that f (A) = A0. Note that when � and �0 are
outcome compatible; there exists a bijection between A and A0 and, hence, between
Z and Z 0: Thus, every �0 2 �0 implies a distribution on�A; 2A
�which we denote by
��0 (this distribution is constructed as in (7) but using the appropriate permutation).
Outcome compatibility is an equivalence relation.
De�nition 2 �0 is empirically compatible to �; denoted � � �0, if � and �0 are
outcome compatible and there exists a function g : � ! �0 such that 8� 2 �;
�� = �g(�):
If � � �0 and �0 � �; then � and �0 are said to be empirically equivalent, de-
noted � � �0: When �0 and � are empirically equivalent, any behavior observed
under � (�observed�in the sense of knowing ��) can also be observed under �0 and
visa versa. Empirical compatibility is strong in that the condition must hold for all
� 2 �. Alternatively, for example, one might be interested in a notion of empirical
compatibility de�ned only for speci�c (e.g., equilibrium) pro�les. As we saw in §2.1,
empirical equivalence does not imply equivalent information structures (recall, we
demonstrated that the complete information game in which I was observed by II
was empirically equivalent to the one in which II was observed by I).
Proposition 2 Empirical equivalence is an equivalence relation on the space of �nite-
length extensive form games. Moreover, if � � �0; there exists an onto correspondence
g : �� �0 such that 8� 2 �;�0 2 g (�) ; �� = ��0 :
Let us turn again to the standard signalling game (treating the nature player as
a regular player). Outcome compatible games are those that have three players, one
19
player with action set fU;Dg, another with fu; dg and the third with fL;Rg. An
example of a game that is empirically compatible to the signalling is one (game 1)
in which II also observes the Nature player, i.e. N ! I, I ! II and also N !
II. Clearly, any strategy in the standard signalling game can be replicated in this
extended game as player II can always ignore what the Nature player has done in
choosing her strategy.
L R
du
Ι
ΙΙ ΙΙ
DU
Ν
DU
Ν
du
DU
Ν
DU
Ν
a1 a4a3 a2 a5 a6a7 a8
L R
du
Ι
ΙΙ ΙΙ
DU
Ν
DU
Ν
du
DU
Ν
DU
Ν
a1 a4a3 a2 a5 a6a7 a8
Figure 3: a variant on the Signalling extensive form
A subtler situation occurs with an alternative game (Figure 3) in which both
player II and player N observe player I (and nothing else). Consider an arbitrary
strategy in this game, �0. Equating moves with player labels, and replacing histories
with the appropriate actions �0 can be decomposed as follows
��0 = ��0(~aI)��0 (~aN j~aI)��0 (~aII j~aI)
= ��0(~aI)��0 (~aN ; ~aI)
��0 (~aI)
��0 (~aII ; ~aI)
��0 (~aI)
= ��0 (~aN ; ~aI)��0 (~aII ; ~aI)
��0 (~aI)
20
Let us return to the standard signalling game and consider an arbitrary strategy �.
Using the same procedure as above, � factors as follows
�� = ��(~aN)�� (~aI j~aN)�� (~aII j~aI)
= ��(~aN)�� (~aN ; ~aI)
�� (~aN)
�� (~aII ; ~aI)
�� (~aI)
= �� (~aN ; ~aI)�� (~aII ; ~aI)
�� (~aI)
It is clear by visual inspection that any � from the standard signalling can be repli-
cated in game 2 �just de�ne �0 such that it assigns strategies in such a way that:
�� (~aN ; ~aI) = ��0 (~aN ; ~aI)
That is6
�I(R) = �� (U;R) + �I(R)�N(U jR)
�N(U jR) =�� (U;R)
�I(R)
�N(U jL) =�� (U;L)
1� �I(R)
Similarly, for any strategy, �0 in game 2, we can construct an equivalent � so
that the signalling game and game 2 are empirically equivalent7.
6The function �� (~aN ; ~aI) (z) can take four values. Fix the value of �� (D;L) by the restriction
that all four probabilities have to add to one. This leaves three free parameters: �I(R), �N (U jL)
and �N (U jR), to be determined from three equations
�� (U;R) = �I(R)�N (U jR)
�� (D;R) = �I(R)(1� �N (U jR))
�� (U;L) = (1� �I(R))�N (U jL)
7This conclusion would not hold if we were to impose the additional (and usual) restriction that
the Nature player cannot condition its behavior on any other player�s actions.
21
7 Assessing Empirical Compatibility
Now, imagine our social scientist observes a large number of outcomes generated
by repetition of a game with unknown information structure. To what extent does
such data illuminate the game�s underlying information structure? Given a candidate
game, empirical compatibility can be checked by the same brute-force approach used
in the earlier examples. In simple cases, this form of analysis is relatively straightfor-
ward.
On the other hand, consider the �Gatekeeper� extensive form shown in Figure
5. Here, 5 players interact under a relatively complex information structure. The
implications of this structure for the empirical distributions on actions arising from
player strategies are not obvious. We now develop results by which these implications
are neatly analyzed.
To begin, Proposition 1 says that every empirical distribution generated during
play of a game must admit recursive factorization according to its IOD. This imme-
diately implies that if one is interested in inferring a game�s information structure
from the empirical distributions it generates, consideration can be limited to games
whose IODs provide consistent bases for factorization. The following proposition is
an immediate corollary to Proposition 1.
Proposition 3 If � � �0 then,
8� 2 �; �� =Yk2N
�� (~akj~�0k) : (11)
To see why condition (11) is not su¢ cient for � � �0, consider the outcome
compatible games in Figure 4. Both have the same IOD: I ! II: It should be easy
to see that �2 � �1 but �1 � �2: The general issue is a measurability distinction
between the feasible strategies for player 2 in each of the games. Let � (�) indicate
the sigma-algebra generated by either a collection of sets or a function. Set I1 �
ffz1; z2g ; fz3; z4g ; fz5; z6gg and I2 � ffz1; z2g ; fz3; z4; z5; z6gg. Note that � (I2) �
22
� (I1) : It can be shown that, in �2;
8� 2 �2; �� = �� (~aII j� (I2))�� (~aI) : (12)
In �1; ��~�1II�= � (I1) while in �2, � (I2) � �
�~�2II�: This implies that �
�~�2II�
or ��~�1II�can be substituted for � (I2) in (12) so that all empirical distributions
generated under �2 can be factored as (10) or (11), respectively. In �2; player II�s
strategy function, ~�2
II ; is only � (I2)-measurable. Since � (I2) � � (I1) ; there are
strategies that can be implemented by II in �1 that cannot be implemented in �2:
# 1 # 2
ΙΙ ΙΙ
ΙL R
du
ΙΙ
dudu
M
ΙΙ ΙΙ
ΙL R
du
ΙΙ
dudu
M
z1 z2 z3 z4 z5 z6 z1 z2 z3 z4 z5 z6
# 1 # 2
ΙΙ ΙΙ
ΙL R
du
ΙΙ
dudu
M
ΙΙ ΙΙ
ΙL R
du
ΙΙ
dudu
M
z1 z2 z3 z4 z5 z6 z1 z2 z3 z4 z5 z6
Figure 4: �2 � �1 but �2 6� �1:
For certain types of games, Proposition 3 can be strengthened. � is said to be a
game with observability partitions if, for all k 2 N; player k observes the moves of
those preceding her either perfectly or not at all (hence, players 1; :::; k � 1 can be
partitioned into two sets: those whose actions are seen by k and those whose actions
are not). This class contains many extensive-form games of economic interest: all
of the games in Section 2 meet this requirement as do many standard market games
such as Cournot, Stackleberg, etc.
Proposition 4 Let �0 be a game with observability partitions. Then, � � �0 if and
only if � and �0 are outcome compatible and condition (11) holds.
23
To actually refute � � �0; one need not check every empirical distribution against
condition (11). Rather, one need only construct a maximally revealing strategy pro-
�le � in �, and then check whether the equation in (11) holds. If not, the games
are not empirically compatible (non-maximally revealing strategy pro�les can only
introduce additional independencies �never additional dependencies �with respect
to the game�s IOD). If � and �0 are games observability partitions, then the ability to
factor �� according to (11) also provides a su¢ cient test of empirical compatibility.
We formalize this below.
Proposition 5 Suppose �0 is a game with observability partitions that is outcome
compatible with �. If � 2 � is maximally revealing and �� =Qk2N �� (akj~�0k) ;
then � � �0:
Corollary 1 Suppose � and �0 are outcome compatible games, both of which have
observability partitions. If � 2 � and �0 2 �0 are maximally revealing and �� =Qk2N �� (~akj~�0k) and ��0 =
Qk2N ��0
�a0f 0(k)j~�f 0(k)
�; then � � �0:
Even limiting analysis to a single, properly constructed behavior strategy, the
preceding results are cumbersome in practice. For example, suppose one wanted to
identify the set of games empirically consistent to the Gatekeeper game (Figure 5).
This game has observability partitions, so one might be tempted to use Proposition
5 is the service of this task (this can be done, but it is clearly not a very heartening
prospect).
As it turns out, there is another, rather remarkable approach that allows the
testing of empirical equivalence by direct comparison of IODs without reference to
speci�c probability parameters. First, given a directed graph (N;!) ; let E � N2 be
the set of edges without reference to direction; i.e., (i; j) 2 E if and only if (i! j)
or (j ! i) : Let S � N3 be the set of all ordered triples such that (i; j; k) 2 S if and
only if (i! j) ; (k ! j) and (i; k) =2 E .
24
A B
1
2L R
3
rl rl
3
2L R
rl
3
rl
3
yx
4 4 4 4 4 4 4 4
yx yx yx yx yx yx yx
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
A B
1
2L R
3
rl rl
3
2L R
rl
3
rl
3
yx
44 44 44 44 44 44 44 44
yx yx yx yx yx yx yx
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
ba
5
Figure 5: Gatekeeper extensive form
Proposition 6 Assume � and �0 are outcome compatible and games with observabil-
ity partitions. Then, � � �0 if only if their IODs are such that E = E 0 and S = S0.
Proof. See Theorem 3 in Appendix A.
Corollary 2 Suppose � and �0 are outcome compatible. If � � �0 then their IODs
are such that E � E 0 and S � S0. If �0 is also a game with observability partitions,
this condition is also su¢ cient.
Hence, at least in the case of games with observability partitions, empirical equiv-
alence can be ascertained immediately upon construction of a game�s IOD.8 To see
8This last result raised a question that was put to us by E. Dekel in correspondence. Given the
well-known work by Thompson (1952) and Elmes and Reny (1994) that identify transformations
on extensive form games that yield the equivalence class of games with the same strategic form, is
25
the signi�cance of this, recall that the IOD for the Gatekeeper extensive form is:
c1 c2
& .
c3
. &
c4 c5
We see where the game gets its name: player 3 is the �gatekeeper� of information
�owing from players 1 and 2 to players 4 and 5. Since this is a game of perfect
observation, Corollary 6 tells us that there are no other games with which this game
is empirically equivalent. Any empirically compatible IOD would have the same set
of edges, some with di¤erent directions. However, reversing any arrow above either
breaks a converging pair of arrows or creates a new one.
The requirement that elements of S not include convergent arrows with linked
tails (i.e., (i; j; k) 2 S) (i; k) =2 U) has an immediate implication for three-move
games with observability partitions. Namely, all outcome compatible games whose
IOD is a variation of the fully connected graph
1
. &
2 �! 3
are empirically compatible.
there a set of operations, similar to these in spirit, that yield games with equivalent IODs (in the
sense of Corollary 6)? Due to space limitations, we do not provide a formal reply. Clearly, however,
Corollary 6 does suggest a step-wise transformation that will yield observationally indistinguishable
extensive forms with di¤erent IODs. The transformation, while di¢ cult to formalize in the context
of an extensive form game, is easy to describe: it is the transformation that �ips an �allowed�arrow
(per Corollary 6) in the original IOD.
26
8 Extensions
8.1 Multiple Moves
We can easily allow for players to make multiple moves and the results of previous
sections will hold with the appropriate additional notation to di¤erentiate the kth
move from the player who makes that move. There are two main di¤erences between
a game with multiple moves and a game where each player moves only once: (1) the
information that a player has when choosing his second (third, ...) move includes the
information he had at the time of the �rst (second, ...); and (2) the player may be
able to play strategies that could not be implemented if the moves were played by
distinct players. The �rst issue is fully dealt with in the de�nition of the information
sets of the game. As we take information sets as given the inclusion of previously
known information does not present any problems for our results �actually, it makes
things a little simpler as the player observes his own previous moves perfectly so that
we are more likely to have a game with observability partitions. The second turns out
to be a non-issue, given that we assume perfect recall. It is well-known (Kuhn 1953,
Ritzberger 1999) that the set of probability distributions that can be generated using
mixed strategies is the same as those that can be generated by behavior strategies.
Hence, strategies made by the same player at di¤erent moves can continue to be
described using conditional probabilities, as was done when the number of moves was
equal to the number of players, so that the probability distributions generated by
players continue to be generated from conditional distributions (Lemma 2) and can
be factored using an IOD as shown in Lemma 1.
8.2 In�nite Action Spaces
The use of in�nite action spaces adds some technical complications but does not
a¤ect our results in any qualitatively substantial way. Technically, action spaces can
be any compact metric space and there are several ways to represent agents�strategies.
27
Given the importance for our results of Lemma 2, it is satisfying to know that such
a representation is valid in the more general setting (Milgrom and Weber 1985).
The representation of information is also more technically challenging. We restrict
ourselves to games with absolutely continuous information in the sense of Milgrom
and Weber9 (1985, assumption R2). Then, behavior strategies can be de�ned using
conditional probabilities which generate a regular probability measure (see Milgrom
and Weber 1985) and Chakrabarty 1999) as in Lemma 2 and hence generate an IOD10
that can be factored as shown in Lemma 1.
An important question now is how should the IOD de�nition be applied in games
with in�nite action spaces. The simplest way to solve this is to use measurabil-
ity requirements, i.e. i ! k if i < k and k�s information is not measurable on
�j2f1;:::;k�1gnfigAj. Given our discussion above, this de�nition implies that Lemma 4
and Proposition 1 hold. Once these basic results are established, the rest follow.
9In Milgrom and Weber(1985) imperfect information is modelled using type spaces. This can be
easily linked to our model by letting the type space be the product of the action spaces of the other
players the agent observes. These type spaces are the �nite product of compact metric spaces and
hence also compact metric spaces.10An issue that has been pointed out to us is that not all probability distributions can be repre-
sented using an IOD/DAG. We feel we have demostrated that this more general result does not have
consequences in our paper as we are restricted, by the nature of behavior strategies, to probability
measures that are built up from conditional probabilities and hence accept (at least) the factoring
that generates them. Furthermore, we know that the set of distributions that do not admit factor-
ization in the space of normal distributions (Spirtes et al (1993) and in the space of multinomial
distributions (Meek(1995)) has Lebesgue measure zero. As these spaces of distributions can be used
to approximate more general distributions we are con�dent that were our results to have to be re-
stricted to probability distributions that can be factored according to a directed acyclic graph, they
would continue to be su¢ ciently general and relevant.
28
9 Conclusions
Although our de�nition of empirical compatibility is new, the idea of observation-
ally indistinguishable strategies is introduced at least as early as Kuhn (1953). Two
strategies, behavior or mixed, are equivalent if they lead to the same probability
distribution over outcomes for all strategies of one�s opponents. Kuhn demonstrates
that, in games of perfect recall, every mixed strategy is equivalent to the unique be-
havior strategy it generates and each behavior strategy is equivalent to every mixed
strategy that generates it (see Aumann, 1964, for an extension to in�nite games). It
follows immediately that every extensive-form game of perfect recall is indistinguish-
able from its reduced normal form. Thus, any two games with the same reduced
normal form are indistinguishable.
We have interpreted our results as consistent with the inferences that would be
made by an outside observer with su¢ ciently informative empirical data. One ques-
tion that immediately comes to mind is whether these ideas can be extended to
construct an econometric test for game structure given cross sectional data on player
actions. For example, the maximum likelihood estimate of the information structure
for an industry might be useful in re�ning cost estimation in empirical work in in-
dustrial organization (as suggested by our earlier example). This is the subject of
on-going research.
Throughout the paper, the assumption was maintained that the underlying players
and action feasible outcomes were complete in the sense that the social scientist was
not concerned with hidden variables issues. There has been much work done in the
�eld of probabilistic networks on dealing with this issue and we direct interested
readers to the references cited in Appendix A for further inquiry.
An important element distinguishing our work from the literature on probabilistic
networks is that the IOD is derived from the primitives of a game and not from
the properties of a single, arbitrarily-speci�ed probability distribution. Thus, the
information encoded in an IOD holds for all empirical distributions arising from play
29
in the underlying game. Moreover, many of our results rely on the special structure
implied by distributions of this kind and, as mentioned earlier, may not hold in a non-
game-theoretic context (or, for example, if correlated strategies are allowed without
inclusion of a speci�c correlating device).
Until recently, work on probabilistic networks focused upon the decision problem
of a single individual. Thus, another aspect that separates our work from the existing
literature in this �eld is its use of these objects in the solution of game-theoretic (i.e.,
interactive) decision problems. Two other papers, one by Koller and Milch (2002)
and another by La Mura (2002) also use probabilistic networks to derive results of
interest to game theorists, though along a di¤erent line. Both of these papers develop
alternative representations for interactive decision problems (i.e., as opposed to a
game�s strategic or normal form) and argue that these representations are not only
computationally advantageous but also provide qualitative insight into the structural
interdependencies between player decisions. Our work clearly complements this line
of research.
30
A Dependency models
Here, we provide a condensed discussion of the relevant underlying theory of proba-
bilistic networks. Since most economists are unfamiliar with this literature, we wish
to: (i) give readers a sense of its theoretical content, and (ii) provide su¢ cient techni-
cal detail to support the development of our proofs. For those interested in pursuing
these ideas further, we suggest starting with the texts by Cowell et al. (1999), Pearl
(1988, 2000) and Spirtes at al. (2000).
De�nition 3 A dependency model M over a �nite set of elements T is a collection
of independence statements of the form (C ? DjE) in which C;D and E are disjoint
subsets of T and which is read �C is independent of D given E.�The negation of an
independency is called a dependency.
The notion of a general dependency model originated with Pearl and Paz (1985),
who were motivated to develop a set of axiomatic conditions on general dependency
models that would include probabilistic and graphical dependencies as special cases.
These axioms are known as the graphoid axioms.13 We are interested in graphoids,
which are de�ned as dependency models that are closed under the graphoid axioms.
For example, given a probability space (;F ; �) and an associated, �nite set of
random variables X indexed by N = f1; :::; ng with typical element ~xr; M� denotes
the list of conditional independencies that hold under �: For all W � N; let ~xW �
(~xr)r2W . Then, for all disjoint C;D;E � N; (C ? DjE) 2 M� if and only if ~xC
is �-conditionally independent of ~xD given ~xE (in the main text we do not work
with general dependency models and, hence, write (~xC ? ~xDj~xE)� for clarity). A
proof that the graphoid axioms hold for conditional independence in all probability
distributions can be found in Spohn (1980).
Alternatively, if G is a graph whose vertices are N , then for all disjoint C;D;E �
N; (C ? DjE) 2 MG if and only if E is a cutset separating C from D: Of course, in
this case, the meaning of (C ? DjE) depends upon how one de�nes �cutset.�The
31
literature on probabilistic networks contains several such de�nitions, depending upon
whether the graph is undirected, directed or some mixture of the two (e.g., a chain
graph). Since our IODs are directed, acyclic graphs (hereafter, DAGs), we proceed
with Pearl�s (1986) notion of d-separation (the d stands for �directed�).
Given a DAG G � (N;!) ; a path is an ordered set of nodes P � N such that,
for all �r; �r+1 2 P; either �r ! �r+1 or �r �r+1. A node �r 2 P is called
head-to-head with respect to P if �r�1 ! �r �r+1 in P . A node that starts or
ends a path is not head-to-head. A path P � N is active by E � N if: (i) every
head-to-head node is in or has a descendant in E; and (ii) every other node in P is
outside E: Otherwise, P is said to be blocked by E:
De�nition 4 If G = (N;!) is a DAG and C;D and E are disjoint subsets of N;
then E is said to d-separate C from D if and only if there exists no active path by E
between a node in C and a node in D:
Examples of d-separation can be found in the Pearl references cited above. Thus,
given a DAG G we de�neMG such that (C ? DjE) 2MG if and only if E d-separates
C from D in G.
We wish to characterize the relationship between probabilistic and graphical de-
pendency models. This is done through the general notion of an independence map
(or, I-map).
De�nition 5 An I-map of a dependency model M is any model M 0 such that M 0 �
M:
Given a probability space (;F ; �) and an associated, �nite set of random vari-
ables X � f~x1; :::; ~xng ; the task of constructing a DAG (N;!) ; N = f1; :::; ng, such
that M(N;!) is an I-map of M� is straightforward (see Geiger et al., 1990, p. 514).
First, for all r 2 N; let Ur � f1; :::; r � 1g index the predecessors of ~xr according to N:
Next, identify a minimal set of predecessors �r � N such that (frg ? Urn�rj�r)� :
32
This results in a set of n independence statements known as a recursive basis drawn
from M� and denoted B�: Now, construct (N;!) such that s ! r if and only
if s 2 �r. The resulting graph G; a DAG, is said to be generated by B� and
�r = fs 2 N js! rg is the set of parents of r in G.
The following theorems are from Geiger et al. (1990, Theorems 1 and 2). First,
an independence statement (C ? DjE) is a semantic consequence (with respect to a
class of dependency modelsM �e.g., those that satisfy the graphoid axioms) of a set
B of such statements if (C ? DjE) holds in every dependency model that satis�es
B; i.e., (C ? DjE) 2M for all M such that B �M 2M.
Theorem 1 (soundness) If M is a graphoid and B is any recursive basis drawn
from M; then the DAG generated by B is an I-map of M:
So, given (;F ; �), the DAG G constructed in the fashion outlined above is an
I-map of M�: That is, every independence statement implied by (N;!) under d
-separation corresponds to a valid �-conditional independency.
Theorem 2 (closure) Let G be a DAG generated by a recursive basis B: Then MG;
the dependency model generated by G; is exactly the closure of B under the graphoid
axioms.
Following Pearl (2000, p. 16-20), two DAGs (N;!) and (N;!0) are said to be
observationally equivalent if every probability distribution that can be factored in
accordance with the recursive basis B(N;!) � f(frg ? Urn�rj�r) jr 2 Ng can also
be factored in accordance with B(N;!0) � f(frg ? U 0rn�0rj�0r) jr 2 Ng : The following
theorem is from Pearl (2000, Theorem 1.2.8). It originally appears in Verma and
Pearl (1990,Theorem 1) and is generalized by Andersson et al. (1997, Theorems B.1
and 2.1).
Theorem 3 Two DAGs (N;!) and (N;!0) are observationally equivalent if and
only if E = E 0 and S = S0:
33
Finally, when (;F ; �) has product structure, the following result is useful.
Theorem 4 Given a DAG G and a probability space (;F ; �) in which � is a product
measure, the following conditions are equivalent:
(DF) � admits recursive factorization according to G;
(DG) d-separation in G implies conditional independence according to �:
(C ? DjE)G ) (C ? DjE)� ;
(DL) � obeys the local Markov property relative to G;
(DO) � obeys the ordered Markov property relative to G.
Proof. This follows from Theorem 5.14 p74 in Cowell, Dawid, Lauritzen and
Spiegelhalter (1999). In our case, �� is a product measure and (N;!) is directed
and acyclic. Hence, by Proposition 4, condition �DF�is satis�ed, thereby implying
the three conditions stated above.
B Selected proofs
B.1 Additional notation used in the proofs
HI(z) � fz0j8i 2 I~ai(z0) = ~ai(z)g
where I represents an index set. Usually we will use
I = �k �the parent set of node k
I = k �shorthand for the index set f1; : : : ; k � 1g
I = k � i �shorthand for the index set f1; : : : ; k � 1g n fig
I = k � �k �shorthand for the index set f1; : : : ; k � 1g n �k
Ak(z) � fz0j~ak(z0) = ~a(z)g
Ak;j � fz0j~ak(z0) = ak;jg
34
B.2 Lemma 2
Lemma 5 Given � 2 �, 8z, if �� (Hk(z)) > 0
��(Hk(z)) =k�1Yj=1
~�j(z);
andX
z02Hk(z)
nYj=k
~�j(z0)
!= 1
Proof. Given � 2 �, consider k = n:
8z 2 Z, ��(z) =Qni=1~�i(z), let ~Xn(z) = Xn;k.
Hn(z) �nz0jz0 =
�~hn(z); ~an(z
0)�o
��(Hn(z)) =X
z02Hn(z)
nYj=1
~�j(z0)
=X
z02Hn(z)
n�1Yj=1
~�j(z)~�n(z0)
=n�1Yj=1
~�j(z)X
z02Hn(z)
~�n(z0)
=n�1Yj=1
~�j(z)lnXj=1
�nk;j =n�1Yj=1
~�j(z)
As, by the de�nition of a strategyPln
j=1 �nk;j = 1.
Consider k = n� 1. Then,
Hn�1(z) �nz0jz0 =
�~hn�1(z); ~an�1(z
0); ~an(z0)�o
��(Hn�1(z)) =X
z02Hn�1(z)
n�2Yj=1
~�j(z)~�n�1(z0)~�n(z
0)
35
For each an�1;k, k = 1; : : : ; ln�1, let zk be any z 2 Hn�1(z) such that Hn(zk) =
(~hn�1(z); an�1;k). Then, using the same arguments as for k = n we get
��(Hn�1(z)) =
ln�1Xk=1
Xz02Hn(zk)
n�2Yj=1
~�j(z)~�n�1(zk)~�n(z0)
=
n�1Xk=1
n�2Yj=1
~�j(z)~�n�1(zk)X
z02Hn(zk)
~�n(z0)
=
ln�1Xk=1
n�2Yj=1
~�j(z)~�n�1(zk) =
n�2Yj=1
~�j(z)
n�1Xk=1
~�n�1(zk)
=
n�2Yj=1
~�j(z)
Then by recursively repeating the above arguments
��(Hk(z)) =k�1Yj=1
~�j(z)
The second equation then follows from
��(Hk(z)) =X
z02Hk(z)
nYj=1
~�j(z0)
=k�1Yj=1
~�j(z)X
z02Hk(z)
nYj=k
~�j(z0)
Proof of Lemma 2
Proof. Given � 2 �, for all z and k such that Hk(z) > 0
��(~akj~hk)(z) =��(Hk(z) \ Ak(z))
��(Hk(z))
=��(Hk+1(z))
��(Hk(z))
=
Qkj=1~�j(z)Qk�1
j=1~�j(z)
= ~�k(z)
36
B.3 Lemma 3
Proof. ()) i 6! k implies either (a) i > k or (b) i < k and 8z 2 Z and z0 2~h�1kni
�~hkni (z)
�, ~Xk (z) = ~Xk (z
0), or (c) i < k and 9z 2 Z and z0 2 ~h�1kni�~hkni (z)
�,
~Xk (z) = ~Xk (z0) and #Ak = 1.
Case (a): i > k ) fig 62 f1; : : : ; k � 1g, so that ~hk(z) = ~hkni(z) so that 8� 2 �,
��(~aij~hk)(z) = ��(~aij~hkni)(z)
Case (b): i < k and 8z 2 Z and z0 2 ~h�1kni
�~hkni (z)
�, ~Xk (z) = ~Xk (z
0) implies
8z0 2 Hk�i(z) \ Ak(z) � Hk�i, ~�k(z) = ~�k(z0). By Lemma 2,
8z0 2 Hk�i(z); ��(~akj~hk)(z0) = ~�k(z0) = ~�k(z)
) ��(Hk+1(z0)) = Hk(z
0)~�k(z)
Let H denote the set of all possible k � 1 length histories, such that
H �n(a1; : : : ; ak�1) j9z0 2 Hk�i(z); ~hk(z0) = (a1; : : : ; ak�1)
oAnd z�1(h) some arbitrary z0 2 Hk�i(z) such that ~hk(z0) = h. Then,
��(Hk�i(z)) =Xh2H
��(h)
��(Hk�i(z) \ Ak(z)) =Xh2H
��(h)~�i(z�1(h))
��(~akj~hk�i(z) =��(Hk�i(z) \ Ak(z))
Hk�i(z)
) ��(~akj~hk�i(z) =
Ph2H ��(h)
~�i(z�1(h))P
h2H ��(h)
=~�i(z)
Ph2H ��(h)P
h2H ��(h)
= ~�i(z) = ��(~akj~hk)(z)
Case (c): If #Ak = 1 then 8z 2 Z; ~�k(z) = 1, so that for all i, ��(~akj~hk�i)(z) =
��(~akj~hk)(z) = 1.
37
(() Consider the possibility that there exists z and z0 such that z0 2 ~h�1kni�~hkni (z)
�,
i < k and #Ak > 1, and ~Xk;1 � Xk(z0) 6= Xk(z) � Xk;2. Note that this implies
~aj(z0) = ~aj(z) for j 2 f1; : : : ; k � 1g n fig and ~ai(z0) 6= ~ai(z). Suppose the condition
of the Lemma holds: for all � 2 � (~a ? ~aij~hkni)�. Consider the following strategy:~�j(z) = 1 for j 2 f1; : : : ; k � 1g n fig, ~�i(z) = 1=4, ~�k(z) = 1=2, ~�i(z0) = 3=4. As
Xk;1 6= Xk;2 we can choose ~�k(z0) � �k2;r, where r 2 f1; : : : ; lkg freely. Without loss of
generality let ~ak(z) = a1, and let �k2;1 = � 2 [0; 1]. We have:
��(Hk(z)) = 1=4; ��(Hk(z0)) = 3=4
��(Hk�i(z0)) = ��(Hk�i(z)) = 1
��(a1j~hk)(z) = 1=2; ��(a1j~hk)(z0) = �
��(a1j~hk�i)(z) =(1=4) � (1=2) + (3=4) � �
1
=1
8(1 + 6�) = ��(a1j~hk)(z) =
1
2
According to the condition of the Lemma this implies that � = 1=2 but � can take
any value in [0,1] �a contradiction.
B.4 Lemma 4
Proof. Given Lemma 3 and the proof of i 6! k ) 8� 2 � (~ai ? ~akj~hkni)�. If
#Ak = 1, let Ak = fag then if is trivially true that for all i; j < k, ��(ajhk�i�j) = 1.
If #Ak > 1, what we now need to show is that 8z0 2 H�k(z), ~Xk(z0) = ~Xk(z). It
su¢ ces to show that for any pair i; j such that i; j 2 �j, 8z0 2 Hk�i�j(z), ~Xk(z0) =
~Xk(z). From Lemma 3 we have that 8z0 2 Hk�i(z) [ Hk�j(z), ~Xk(z0) = ~Xk(z),
but Hk�i(z) [Hk�j(z) is (generically) a strict subset of Hk�i�j(z). Fortunately, the
de�nition of i 6! k applies to all of Z, not just to a particular z. This means that for
any z0 2 Hk�i(z), it is true that 8z00 2 Hk�j(z0) ~Xk(z00) = ~Xk(z
0) = ~Xk(z), and for
any z� 2 Hk�j(z), it is true that 8z000 2 Hk�j(z�) ~Xk(z000) = ~Xk(z
�) = ~Xk(z), so that
our desired result holds, namely, for any pair i; j such that i; j 2 �j, 8z0 2 Hk�i�j(z),
38
~Xk(z0) = ~Xk(z), and hence, 8z0 2 Hk��k(z), ~Xk(z
0) = ~Xk(z). Applying the logic used
in the proof of Lemma 3:
8z0 2 Hk��k(z); ��(~akj~hk)(z0) = ~�k(z)
) ��(Hk+1(z0)) = Hk(z
0)~�k(z)
Let H denote the set of all possible k � 1 length histories, such that
H �n(a1; : : : ; ak�1) j9z0 2 Hk��j(z); ~hk(z0) = (a1; : : : ; ak�1)
oAnd z�1(h) some arbitrary z0 2 Hk��k(z) such that ~hk(z0) = h. Then,
��(Hk�i(z)) =Xh2H
��(h)
��(Hk�i(z) \ Ak(z)) =Xh2H
��(h)~�k(z
�1(h))
��(~akj~hk��k)(z) =��(Hk��k(z) \ Ak(z))
Hk��k(z)
) ��(~akj~hk��k(z)) =
Ph2H ��(h)
~�k(z�1(h))P
h2H ��(h)
=~�k(z)
Ph2H ��(h)P
h2H ��(h)= ~�k(z) = ��(~akj~hk)(z)
39
References
[1] Andersson, S.A., Madigan D., Perlmann, M.D., 1997. A Characterization of
Markov Equivalence Classes for Acyclic Digraphs. The Annals of Statistics,
25(2), 505-541.
[2] Armbruster, W., Boge, W., 1979. Bayesian game theory, in: Moeschlin, O.,
Pallaschke, D. (Eds.), Game Theory and Related Topics. North-Holland, Ams-
terdam, pp. 17-28.
[3] Aumann, R. J., 1964. Mixed and behavior strategies in in�nite extensive games,
in: Dresher, O., Shapley, L. S., Tucker, A. W. (Eds.), Advances in Game Theory.
Princeton University Press, Princeton, pp. 627-650.
[4] Bollobas, B., 1998. Modern Graph Theory. Springer, New York.
[5] Chakrabarti, S. K., 1999. Finite and In�nite Action Dynamic Games with Im-
perfect Information, Journal of Mathematical Economics, 32(2), 243-66.
[6] Cowell, R. G., Dawid, A. P., Lauritzen, S. L., Spiegelhalter, D. J., 1999. Proba-
bilistic Networks and Expert Systems. Springer, New York.
[7] Elmes, S., Reny, P., 1994. On the strategic equivalence of extensive form games.
Journal of Economic Theory, 62(1), 1-23.
[8] Fristedt, B., Gray, L., 1997. A Modern Approach to Probability Theory.
Birkhäuser, Boston.
[9] Geiger, D., Verma, T. S., J. Pearl, J., 1990. Identifying independence in Bayesian
networks. Networks, 20, 507-34.
[10] Harsanyi, J., 1967-68. Games with incomplete information played by �Bayesian�
players,�Management Science, 8, Parts I-III, 159-82, 320-34, 486-502.
40
[11] Koller, D. Milch, B., 2002. Multi-agent in�uence diagrams for representing and
solving games. Seventeenth International Joint Conference on Arti�cial Intelli-
gence, Seattle, WA.
[12] La Mura, P., 2002. Game Networks. Unpublished manuscript.
[13] Mackey, G. W., 1957. Borel structures in groups and their duals. Trans. Amer.
Math. Soc., 85, 283-95.
[14] Meek, C., 1995. Strong completeness and faithfulness in Bayesian networks, Pro-
ceedings of Eleventh Conference on Uncertainty in Arti�cial Intelligence, Mon-
treal, QU, 411-418.
[15] Milgrom, Paul R., Weber, Robert J., 1985. Distributional Strategies for Games
with Incomplete Information. Mathematics of Operations Research, 10 (4), 619-
33.
[16] Pearl, J., 1986. Fusion, propagation and structuring in belief networks. Arti�cial
Intelligence 29, 241-88.
[17] � � , 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible
Inference. North Holland, Amsterdam.
[18] � � , 2000. Causality: Models, Reasoning, and Inference. Cambridge University
Press.
[19] � � , Paz, A., 1985. Graphoids: A graph based logic for reasoning about
relevance relations. Tech rep 850083 (R-53-L), Cognitive Systems Laboratory,
UCLA.
[20] Ritzberger, K., 1999. Recall in Extensive Form Games. International Journal of
Game Theory, 28(1), 69-87.
41
[21] Spirtes, Peter, Glymour, Clark and Scheines, Richard, 1993. Causation, Predic-
tion, and Search. Springer-Verlag.
[22] Spohn, W., 1980. Stochastic independence, causal independence and shieldabil-
ity. J. Phil. Logic, 9, 73-99.
[23] Thompson, F. B., 1952. Equivalence of games in extensive form. The Rand Cor-
poration, Research Memorandum 759.
[24] Verma, T. S., Pearl, J., 1990. Equivalence and synthesis of causal models, in:
Proceedings of the 6th Conference on Uncertainty in Arti�cial Intelligence. Cam-
bridge, pp. 220-7. Reprinted in: Bonissone, P., Henrion, M., Kanal, L. N., Lem-
mer, J. F. (Eds.), Uncertainty in Arti�cial Intelligence, vol. 6, 255-68.
42