Empirical Implications of Information Structure in Finite-Length Extensive-Form Games*

Jose Penalva-Zuasti
Universitat Pompeu Fabra

Michael D. Ryall†
University of Melbourne

November 10, 2004

* Our thinking on the issues analyzed in this paper has been substantially influenced by many fine colleagues, including Adam Brandenburger, Colin Camerer, David Chickering, Catherine de Fontenay, Eddie Dekel, Bryan Ellickson, Joshua Gans, Fabrizio Germano, Ehud Kalai, Uffe Kjærulff, David Levine, Steven Lippman, Glenn MacDonald, Mark Machina, Leslie Marx, Joseph Ostroy, Scott Page, Judea Pearl, Ben Polak, Larry Samuelson, Joel Watson, Eduardo Zambrano, Bill Zame, participants in the MEDS, Georgetown, UCLA, and UCSD theory workshops as well as the Stanford SITE and Stonybrook theory conferences. The authors are, of course, entirely responsible for all shortcomings.

† Correspondence should be sent to: M. D. Ryall, Melbourne Business School, 200 Leicester St., Carlton, VIC 3053, Australia.

Abstract

We analyze what can be inferred about a game's information structure solely from the probability distributions on action profiles generated during play. The kinds of situations we have in mind are those in which the only data available to a social scientist analyzing a game with unknown information structure is a cross-sectional listing of actions actually taken by each of the players; i.e., data that is uninformative with respect to who moved when and what they knew at the time of their action choice. One game is said to be empirically compatible to another when the empirical distribution induced by any behavior strategy profile in the latter can also be induced by an appropriately chosen behavior strategy profile in the former. We introduce the influence opportunity diagram of a game (a directed, acyclic graph in which the nodes correspond to player moves and arcs to influence possibilities) and demonstrate a simple methodology that uses such diagrams to test for empirical compatibility. The essential idea is to connect a game's information structure, which identifies the individual histories upon which players condition their behavior, to a corresponding set of conditional independencies that must be observed in all of its empirical distributions. When two games with different information structures imply different sets of such independencies, then knowledge of the empirical distribution provides a basis upon which one may be distinguished from the other. Influence opportunity diagrams summarize these independencies in a useful way.

Keywords: empirical inference, information structure, extensive form, compatibility, Bayesian network, causality

1 Introduction

Applied noncooperative game theory has spawned numerous streams of empirical research in economics and other social sciences. A common feature of virtually all contributions along these lines is that they begin with a full specification of the game presumed to generate the data.¹ An equilibrium concept is then applied and either tested or used to infer unknown values, depending upon the type of study at hand. It is well understood that different assumptions about game structure (particularly about payoffs and information) lead to very different empirical implications. Indeed, a common complaint invariably encountered at some point by every game theorist is that any observed economic phenomenon can be described by some appropriately constructed game and, hence, the theory has no real empirical content. This critique would be neutralized were the theory able to deliver, say, a joint test of game structure and the equilibrium concept of interest or, alternatively, a procedure in which game structure is estimated from data, an equilibrium concept is applied and unknown values of interest are empirically quantified.

Unfortunately, little is presently known about the estimation of game structure from data. A notable exception is the developing literature on the testable implications of game theory in applications with unobserved payoffs (e.g., Sprumont, 2000; Ray and Zhou, 2001; Ray and Snyder, 2004). In these papers, unobserved agent preferences are assumed to meet certain consistency restrictions (e.g., completeness, transitivity). Then, using classical revealed preference style reasoning (Samuelson, 1938), results are derived that allow one to determine whether observed outcomes are consistent with certain equilibrium concepts (e.g., Nash versus subgame perfection). Promising though they are, application of these results requires the pre-specification of the underlying information structure, an (untested) assumption with respect to which their implications are not generally robust.

¹ In the case of the experimental literature, the structure of the game is actually known.

In this paper, we take one step forward in addressing these issues. Specifically, we analyze what can be inferred about a game's information structure solely from the probability distributions on action profiles generated during play (which we refer to as the "empirical distributions" induced by the profile of player behavior strategies). The kinds of situations we have in mind are those in which the only data available to the social scientist is a cross-sectional listing of actions actually taken by each of the players; i.e., data that is uninformative with respect to who moved when and what they knew at the time of their action choice. We wish to analyze the extent to which such data identifies the information structure of the underlying game.

To begin, we introduce the notion of empirical compatibility. One game is said to be empirically compatible to another when the empirical distribution induced by any behavior strategy profile in the latter can also be induced by an appropriately chosen behavior strategy profile in the former. Thus, even under infinite repetition, it is impossible to distinguish a game from any in its empirical compatibility class solely on the basis of the observed behavior of its players. Hence, empirical compatibility is sufficient to render two games observationally indistinguishable without, e.g., additional assumptions about equilibrium behavior.

An important result of this paper is to associate with a certain class of economically important, finite-length extensive form games a surprisingly simple methodology by which to test empirical compatibility. The essential idea is to connect a game's information structure, which identifies the individual histories upon which players condition their behavior, to a corresponding set of conditional independencies that must be observed in all of its empirical distributions. When two games with different information structures imply different sets of such independencies, then knowledge of the empirical distribution provides a basis upon which one may be distinguished from the other.

Our analysis is facilitated by the introduction of a new graphical device, the influence opportunity diagram of a game (hereafter, IOD). The IOD is a directed, acyclic graph whose nodes correspond to the moves made by players in the underlying game. Given an extensive form game, we illustrate how to construct its IOD. Then, we demonstrate that the IOD is an "independence map" of every empirical distribution that can be generated in the underlying game. That is, according to a particular notion of conditional independence in graphs (introduced below), every conditional independence in a game's IOD implies a corresponding conditional independence between player behaviors in all of the game's empirical distributions. Our work is closely related to, indeed contributes to, the rapidly-expanding literature on probabilistic networks.²

This result is significant because a necessary requirement for one game to be empirically compatible to another is that their IODs imply a consistent set of conditional independencies. This consistency, as it turns out, is relatively easy to determine (i.e., via visual inspection of the graphs). IOD consistency is not, in general, sufficient to establish empirical compatibility because differences in the specific information upon which players condition their behavior may imply additional restrictions on empirical distributions that are not picked up by the IOD. However, we do identify an economically important subclass of games, termed games with observability partitions, for which this condition is also sufficient.

The layout of the remainder of the paper is as follows. In the next section, we present two extended examples designed to motivate our results and illuminate the ideas that are generalized later in the paper. In §3, we set up the formalities. §4 presents and illustrates the definition of a game's IOD. Some useful preliminary results are contained in §5. The definition of empirical compatibility is presented in §6 along with illustrative examples. Our main results are found in §7. Extensions are discussed in §8. In §9, we provide some concluding remarks.

² Since most economists are not familiar with this literature, we have included a condensed, self-contained discussion of some relevant results and references in Appendix A.

2 Examples and Intuition

In this section, we present two examples designed: a) to highlight the issues that arise when game structure is unknown (one in which the issue of interest is a test of equilibrium refinement and another in which the issue is estimation of cost parameters in a market game); and, b) to illustrate the notion of empirical compatibility and foreshadow our analytical approach. Throughout, we assume that the data observed by the social scientist is generated by an extensive form game in which the participating players know all the relevant structural details. Since the interest is in empirical work, we also assume that players take noisy actions and that the noise is independent across player moves. To keep things simple in this section, we do not add explicit "nature" nodes to the extensive forms to represent the noise terms but, instead, embed them directly into player strategies in the form of independent "trembles."

2.1 A test of equilibrium

Suppose a social scientist observes a two player interaction in which player I plays either L or R and player II either u or d. The scientist hypothesizes that the true underlying game is the one shown in Figure 1 below. Although she is certain that this model captures all the relevant agents (e.g., there are no unobserved factors influencing outcomes outside the choices of the two agents shown) and their payoffs, she is unsure of the game's information structure (who moves when and what they know at the time).

[Figure 1: two player, simultaneous move game. Player I chooses L or R and player II chooses u or d; the payoff pairs shown in the figure are (10, 10), (5, 5), (0, 0) and (0, 0).]

The social scientist's data set represents a tally of the outcomes observed over a large number of repetitions. She would like to know whether the data are consistent with one of the game's two pure strategy Nash equilibria, $(L, u)$ and $(R, d)$. Assuming player I chooses an action with noise $\varepsilon_I$ and player II with noise $\varepsilon_{II}$, the data set should be consistent with one of the following frequency tables (Table 1 corresponding to $(L, u)$ and Table 2 to $(R, d)$).

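Table 1

Outcome    Frequency
$(L, u)$   $(1 - \varepsilon_I)(1 - \varepsilon_{II})$
$(L, d)$   $(1 - \varepsilon_I)\,\varepsilon_{II}$
$(R, u)$   $\varepsilon_I\,(1 - \varepsilon_{II})$
$(R, d)$   $\varepsilon_I\,\varepsilon_{II}$

Table 2

Outcome    Frequency
$(L, u)$   $\varepsilon_I\,\varepsilon_{II}$
$(L, d)$   $\varepsilon_I\,(1 - \varepsilon_{II})$
$(R, u)$   $(1 - \varepsilon_I)\,\varepsilon_{II}$
$(R, d)$   $(1 - \varepsilon_I)(1 - \varepsilon_{II})$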

Suppose the actual frequencies of outcomes are as shown in the following table.

Table 3

Outcome    Frequency
$(L, u)$   0.81
$(L, d)$   0.09
$(R, u)$   0.01
$(R, d)$   0.09

We call this the empirical distribution generated by play of some behavior strategy profile. It is not difficult to demonstrate that these frequencies do not conform to either of the equilibrium distributions hypothesized above. There are no parameters $\varepsilon_I$ and $\varepsilon_{II}$ that deliver the observed values. Apparently, then, the Nash hypothesis should be rejected.
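The claim that no pair of noise parameters reproduces Table 3 can be checked numerically. Both Table 1 and Table 2 assign to each outcome the product of a player I term and a player II term, so any distribution they can generate must equal the product of its own marginals. The following minimal sketch (the function names `marginals` and `is_product` are illustrative, not taken from the paper) tests that property for the frequencies in Table 3.

```python
# Joint frequencies from Table 3, keyed by (action of I, action of II).
table3 = {("L", "u"): 0.81, ("L", "d"): 0.09,
          ("R", "u"): 0.01, ("R", "d"): 0.09}


def marginals(joint):
    """Return the marginal distributions of player I and player II."""
    p_i, p_ii = {}, {}
    for (a_i, a_ii), p in joint.items():
        p_i[a_i] = p_i.get(a_i, 0.0) + p
        p_ii[a_ii] = p_ii.get(a_ii, 0.0) + p
    return p_i, p_ii


def is_product(joint, tol=1e-9):
    """True if every cell of the joint equals the product of its marginals."""
    p_i, p_ii = marginals(joint)
    return all(abs(p - p_i[a_i] * p_ii[a_ii]) <= tol
               for (a_i, a_ii), p in joint.items())


p_i, p_ii = marginals(table3)
# Compare the observed frequency of (L, u) with the product of the marginals.
print(table3[("L", "u")], "vs", p_i["L"] * p_ii["u"])
print(is_product(table3))
```

The marginal frequencies of L and u are 0.9 and 0.82, whose product (about 0.738) differs from the observed 0.81, so the independence restriction of the simultaneous-move game fails.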

Such a conclusion would be premature, however. In fact, the data is consistent with a Nash equilibrium (in particular, a subgame perfect one) in a different game than the one shown in Figure 1. To see this, first note that the structure of the hypothesized game (simultaneous move) implies $\Pr(\tilde a_I, \tilde a_{II}) = \Pr(\tilde a_I)\Pr(\tilde a_{II})$ for all empirical distributions generated by play (where $\tilde a_i$ is the action taken by player $i$).³ This restriction is clearly violated by the distribution in Table 3. By the definition of conditional probability, every empirical distribution from any two player game can be factored as

$$\Pr(\tilde a_I, \tilde a_{II}) = \Pr(\tilde a_{II} \mid \tilde a_I)\,\Pr(\tilde a_I). \tag{1}$$

The parameters that deliver the distribution in Table 3 according to the factorization in (1) are as follows.

                     L      R
$\Pr(\tilde a_I)$    0.9    0.1

$\tilde a_I$    $\Pr(u \mid \tilde a_I)$    $\Pr(d \mid \tilde a_I)$
L               0.9                         0.1
R               0.1                         0.9

This is significant because these parameters correspond exactly to (noisy) behavior strategies in a different game, namely the one in which player I moves R or L, is observed by player II who then moves u or d. Indeed, the behavior strategy profile implied by these parameters is a subgame perfect equilibrium in which $\varepsilon_I = \varepsilon_{II} = 0.1$. Hence, the data summarized in Table 3 is consistent with the joint hypothesis that the underlying game is the complete information game with player I moving first and in which play proceeds according to the subgame perfect equilibrium of that game.

³ Correlated equilibria are ruled out by the assumption that the player set is complete; i.e., there are no hidden correlating devices in the form of nature moves or historical player moves.

Are there other joint hypotheses with which this data is consistent? The answer is yes. To see this, note that, again simply by applying the definition of conditional probability, we can equally well consider the factorization

$$\Pr(\tilde a_I, \tilde a_{II}) = \Pr(\tilde a_I \mid \tilde a_{II})\,\Pr(\tilde a_{II}). \tag{2}$$

The corresponding parameters are

                        u       d
$\Pr(\tilde a_{II})$    0.82    0.18

$\tilde a_{II}$    $\Pr(L \mid \tilde a_{II})$    $\Pr(R \mid \tilde a_{II})$
u                  0.99                           0.01
d                  0.50                           0.50

These parameters correspond exactly to the game of complete information in which player II moves first. In this case, play is consistent with a subgame perfect equilibrium with $\varepsilon_{II} = 0.18$ and error parameters for player I at his respective information sets of $\varepsilon_{I,u} = 0.01$ and $\varepsilon_{I,d} = 0.5$.
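The parameter tables for factorizations (1) and (2) can be recovered mechanically from Table 3. The sketch below is illustrative only: the helper `factor` and its `first` argument are hypothetical names, with `first=0` treating player I as the first mover and `first=1` treating player II as the first mover.

```python
# Joint frequencies from Table 3, keyed by (action of I, action of II).
table3 = {("L", "u"): 0.81, ("L", "d"): 0.09,
          ("R", "u"): 0.01, ("R", "d"): 0.09}


def factor(joint, first):
    """Factor a two-player joint as Pr(first mover) * Pr(other | first mover).

    Returns the first mover's marginal and a dict that maps each action
    profile to the conditional probability of the other player's action.
    """
    marginal = {}
    for profile, p in joint.items():
        marginal[profile[first]] = marginal.get(profile[first], 0.0) + p
    conditional = {profile: p / marginal[profile[first]]
                   for profile, p in joint.items()}
    return marginal, conditional


pr_i, pr_ii_given_i = factor(table3, first=0)   # factorization (1)
pr_ii, pr_i_given_ii = factor(table3, first=1)  # factorization (2)

for profile in table3:
    print(profile, round(pr_ii_given_i[profile], 4), round(pr_i_given_ii[profile], 4))
```

Under factorization (2) the conditional probability of L given u is 0.81/0.82, roughly 0.988, which appears in the table above rounded to 0.99.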

Notice that an implication of equations (1) and (2) is that any empirical distribution arising as a result of play in the complete information game in which I moves first can be empirically imitated by an appropriately chosen behavior strategy in the complete information game in which II moves first, and vice versa. When two games have this property, we say they are empirically equivalent.

Which, if either, of the two data-consistent joint hypotheses seems reasonable depends upon what the social scientist wishes to assume about the relative magnitude of the error terms, their consistency across information sets, and so on. The important point is that the data in Table 3 rules out the game in Figure 1 (as we will later demonstrate) and, hence, any equilibrium implications that might be drawn from it. Moreover, the data implies an equivalence class of games with which it is consistent and, happily in this example, equilibrium behavior as well.

2.2 Inferring firm costs in market games

Now, suppose the social scientist is interested in estimating costs for a 3-firm industry, say automobiles, for which market shares have been estimated to be 40%, 40%, and 20% on aggregate unit volume of 100. As before, these estimates are taken from a large number of observations. It is assumed that the industry is governed by a stable Cournot mechanism in which inverse demand is estimated as $p \equiv 200 - \sum_{i=1}^{3} q_i$. It is not hard to show that the implied costs are $c_1 = c_2 = 67$ and $c_3 = 93$; that is, firm 3's low market share is apparently due to a substantial cost disadvantage.

Suppose, however, that upon further inspection and controlling for external correlating factors, our empiricist discovers that the estimated empirical distribution on firm production choices can be factored as

$$\Pr(q_1, q_2, q_3) = \Pr(q_3 \mid q_1, q_2)\,\Pr(q_1)\,\Pr(q_2) \tag{3}$$

but not

$$\Pr(q_1, q_2, q_3) = \Pr(q_3)\,\Pr(q_1)\,\Pr(q_2). \tag{4}$$

The finding of (3) and not (4) is highly problematic with respect to the Cournot hypothesis for, under the simultaneous-move assumption of Cournot, firm quantity choices should be independent. The dependencies in (3) suggest a different, Stackelberg-type mechanism, one in which firms 1 and 2 move first and are observed by 3 prior to its production decision. This latter information structure is intuitively captured by the directed graph

$$1 \to 3 \leftarrow 2. \tag{5}$$

As we will show momentarily, there is a close connection between graphs of this type, which we call influence opportunity diagrams, and the conditional independencies implied by a game's information structure. Estimating costs according to the Stackelberg-type mechanism consistent with (5) leads to the conclusion

$$c_1 = c_2 = c_3 = 80,$$

which is both quantitatively and qualitatively different than that reached under Cournot.
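The Stackelberg figure can be reproduced from first-order conditions under assumptions that the text does not spell out and that are adopted here purely for illustration: constant marginal costs, interior solutions, firms 1 and 2 choosing simultaneously, and firm 3 best-responding to the observed pair of quantities. The function name `stackelberg_costs` and its parameters are hypothetical.

```python
def stackelberg_costs(q1, q2, q3, intercept=200.0):
    """Constant marginal costs implied by observed quantities when firms 1
    and 2 move first and firm 3 observes both before choosing q3.

    Assumes inverse demand p = intercept - (q1 + q2 + q3).
    """
    # Follower's first-order condition: intercept - q1 - q2 - 2*q3 - c3 = 0.
    c3 = intercept - q1 - q2 - 2.0 * q3
    # Leaders anticipate q3 = (intercept - q1 - q2 - c3) / 2, so the price
    # they face is (intercept - q1 - q2 + c3) / 2. Leader i's first-order
    # condition is (intercept - 2*q_i - q_j + c3) / 2 - c_i = 0.
    c1 = (intercept - 2.0 * q1 - q2 + c3) / 2.0
    c2 = (intercept - 2.0 * q2 - q1 + c3) / 2.0
    return c1, c2, c3


# Market shares of 40%, 40% and 20% on an aggregate volume of 100 units.
costs = stackelberg_costs(40.0, 40.0, 20.0)
print(costs)
```

With the quantities from the example, all three implied costs equal 80, matching the conclusion stated above.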

Finally, foreshadowing the results to follow, we simply point out that the Stackelberg game described by (5) is the only game in which every behavior strategy generates an empirical distribution that can be factored as in (3). Moreover, this conclusion can be reached by examination of the game's IOD without any further reference to its information structure.

3 Setup

Wherever possible, capital letters $(X, Z)$ denote sets, small letters $(a, w)$ elements of sets and functions, and script letters $(\mathcal{A}, \mathcal{F})$ collections of sets. Sets with product structure are indicated by bold capitals $(\mathbf{A}, \mathbf{E})$ with small bold $(\mathbf{a}, \mathbf{e})$ denoting typical elements (ordered profiles) in such sets. Graphs and probability spaces play a large role in the following analysis. Standard notation and definitions are adopted wherever possible.

3.1 Finite extensive-form games

In order to make our analysis as transparent as possible, we begin with the simple case of a finite extensive form game in which each player has one move. We also assume that the game has product structure; that is, a player has access to the same set of feasible actions at every node. Later, we extend our results to more general cases.

Let $N \equiv \{1, \dots, n\}$, $n < \infty$, index the set of players such that player $k$ is the one who has the $k$th move. The game tree is denoted $(X, E)$ with nodes $X$, edges $E \subseteq X \times X$, and terminal nodes $Z \subset X$. The set of move-$k$ nodes is denoted $X_k$. Player $k$ has a finite set of move-$k$ actions labelled $A_k$. The set of action profiles is $\mathbf{A} \equiv \times_{k \in N} A_k$. The assumption of product structure implies that a bijection exists between $Z$ and $\mathbf{A}$. The payoff function is $\tilde u : Z \to \mathbb{R}^n$, in which component $\tilde u_k(z)$ indicates the payoff to agent $k$ when the outcome is $z \in Z$. Finally, the information structure is represented by $\mathcal{X} \equiv \{\mathcal{X}_1, \dots, \mathcal{X}_n\}$, where $\mathcal{X}_k \equiv \{X_{k,1}, \dots, X_{k,m_k}\}$ is the partition of the move-$k$ game tree nodes into their corresponding information sets.

Two conditions that are maintained throughout the paper are: 1) correlated strategies are prohibited without an explicit correlating device (i.e., all factors that have the potential to simultaneously influence the behavior of multiple players are explicitly identified in the model), and 2) no player has the ability to alter the influence relationships of his opponents (e.g., the player with the first move cannot take an action that switches the players moving second and third). In order to identify influence relations from empirical data, it seems reasonable to limit our attention to situations in which such relations are stationary.⁴

We adopt the convention that functions on $Z$ are denoted with tildes, both to emphasize this fact and to distinguish such functions from the specific values they take. For example, we define the function $\tilde x_k : Z \to X$ where $\tilde x_k(z)$ indicates the $k$th node on the game tree path terminating at $z$; so, $\tilde x_k(z) = x$ for some $x \in X_k$. Similarly, $\tilde a_k : Z \to A_k$ where $\tilde a_k(z)$ is the action corresponding to the $k$th edge on the game tree path terminating at $z$. Then, $\tilde{\mathbf{a}} \equiv (\tilde a_1, \dots, \tilde a_n)$ so that $\tilde{\mathbf{a}}(z) \in \mathbf{A}$ identifies the action profile associated with $z \in Z$. The move-$k$ history is given by $\tilde h_k \equiv (\tilde a_1, \dots, \tilde a_{k-1})$ with $\tilde h_1$ an arbitrary constant. Finally, let $\tilde X_k(z) = X_{k,l}$ indicate the move-$k$ information set that intersects the path terminating in $z$ (i.e., $\mathcal{X}_k \ni X_{k,l} \ni \tilde x_k(z)$).

⁴ This is implied by the product structure of the game.

Taking the perspective of a social scientist viewing the game from the outside, the strategies adopted by the players are unobserved parameters that determine the observed outcomes. Let $l_k \equiv |A_k|$ denote the number of actions in $A_k$. Then, every node in $X_k$ has $l_k$ edges emanating from it. A behavior strategy for player $k \in N$ in this finite setup is a matrix of conditional probabilities

$$\sigma^k \equiv \begin{bmatrix} \sigma^k_{1,1} & \cdots & \sigma^k_{1,l_k} \\ \vdots & \ddots & \vdots \\ \sigma^k_{m_k,1} & \cdots & \sigma^k_{m_k,l_k} \end{bmatrix}$$

in which $\sigma^k_{i,j}$ is the probability player $k$ places on action $a_{k,j} \in A_k$ upon reaching information set $X_{k,i} \in \mathcal{X}_k$. Hence, each row of $\sigma^k$ corresponds to an information set and constitutes a probability distribution on the elements of $A_k$. Let $\sigma \equiv (\sigma^1, \dots, \sigma^n)$ denote a strategy profile and $\Sigma$ the set of all such profiles.

To keep track of strategy parameters, let $\tilde\sigma_k(z) \equiv \sigma^k_{i,j}$ where $\tilde X_k(z) = X_{k,i}$ and $\tilde a_k(z) = a_{k,j}$. So, $\tilde\sigma_k(z)$ indicates the probability associated with the $k$th edge in the path terminating at $z$ given player $k$'s behavior strategy. As usual, behavior strategies are restricted to assign equal probabilities to identical actions in the same information set; that is,

$$\bigl(\forall z, z' \in Z \mid \tilde X_k(z) = \tilde X_k(z'),\ \tilde a_k(z) = \tilde a_k(z')\bigr),\quad \tilde\sigma_k(z) = \tilde\sigma_k(z'). \tag{6}$$

In this context, the function $\tilde\sigma \equiv (\tilde\sigma_1, \dots, \tilde\sigma_n)$ identifies the edge probabilities associated with each branch of the game tree given $\sigma$.

3.2 Empirical distributions

In what follows, we assume that our intrepid social scientist knows who the players are and, at the conclusion of play, observes the actions taken by each; to him, a "data point" is a list of actions along with the identities of the players that took them. Every strategy profile $\sigma \in \Sigma$ generates a probability measure $\mu_\sigma$ on $(Z, 2^Z)$ defined in the following way:

$$\forall W \in 2^Z,\quad \mu_\sigma(W) \equiv \sum_{z \in W} \prod_{k \in N} \tilde\sigma_k(z). \tag{7}$$

We say $\mu_\sigma$ is the empirical distribution induced by $\sigma$. Note that, given the bijection between $Z$ and $\mathbf{A}$, $\mu_\sigma$ immediately implies a corresponding distribution on $(\mathbf{A}, 2^{\mathbf{A}})$. On the one hand, focusing on $(\mathbf{A}, 2^{\mathbf{A}})$ emphasizes our interest in what the social scientist observes. On the other, $(Z, 2^Z)$ is more useful in proving our results and, so, this is the measure space used in most of what follows.
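Equation (7) translates directly into a short computation. The sketch below is a minimal illustration under assumptions of its own: moves are indexed from 0 rather than 1, information sets are identified by arbitrary hashable labels returned by a caller-supplied function `info_set`, and behavior strategies are nested dictionaries rather than matrices. None of these names come from the paper.

```python
from itertools import product


def empirical_distribution(action_sets, info_set, strategy):
    """Compute the distribution on action profiles as in equation (7).

    action_sets: list of action tuples, one per move k = 0..n-1.
    info_set(k, history): label of the move-k information set reached after
        `history`, the tuple of the first k actions.
    strategy[k][label][action]: probability of `action` at that information set.
    """
    mu = {}
    for profile in product(*action_sets):
        prob = 1.0
        for k, action in enumerate(profile):
            label = info_set(k, profile[:k])
            prob *= strategy[k][label][action]  # edge probability on the path
        mu[profile] = prob
    return mu


def observes_everything(k, history):
    """Perfect information: each information set is the full prior history."""
    return history


# The sequential game of Section 2.1 in which I moves first and II observes I.
actions = [("L", "R"), ("u", "d")]
sigma = [
    {(): {"L": 0.9, "R": 0.1}},
    {("L",): {"u": 0.9, "d": 0.1}, ("R",): {"u": 0.1, "d": 0.9}},
]
mu = empirical_distribution(actions, observes_everything, sigma)
for profile, p in mu.items():
    print(profile, round(p, 4))
```

Run on the behavior strategies of Section 2.1, the computation reproduces the frequencies of Table 3.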

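Note that, for all $\sigma \in \Sigma$, the definition of conditional probability permits us to factor $\mu_\sigma$ as follows

$$\forall z \in Z,\quad \mu_\sigma(z) \equiv \prod_{k=1}^{n} \mu_\sigma\bigl(\tilde a_k \mid \tilde h_k\bigr)(z) \quad \mu_\sigma\text{-a.e.}, \tag{8}$$

where $\mu_\sigma(\tilde a_k \mid \tilde h_k)$ is the $\mu_\sigma$-conditional distribution of $\tilde a_k$ given $\tilde h_k$. To reduce notational clutter from this point on, we omit the "$\mu_\sigma$-a.e." qualifiers whenever it is clear from the context.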

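For all $i, k \in N$ such that $i < k$, let $\tilde h_{k \setminus i} \equiv (\tilde a_1, \dots, \tilde a_{i-1}, \tilde a_{i+1}, \dots, \tilde a_{k-1})$; i.e., $\tilde h_k$ excluding $\tilde a_i$. Given $\sigma \in \Sigma$, from the definition of conditional probability and condition (8), for all $i, k \in N$ such that $i < k$, $\tilde a_k$ is $\mu_\sigma$-independent of $\tilde a_i$ given $\tilde h_{k \setminus i}$ if

$$\mu_\sigma\bigl(\tilde a_k \mid \tilde h_k\bigr) = \mu_\sigma\bigl(\tilde a_k \mid \tilde h_{k \setminus i}\bigr). \tag{9}$$

In general, given random variables $\tilde x, \tilde y, \tilde z$ on $(Z, 2^Z, \mu_\sigma)$, the notation $(\tilde x \perp \tilde y \mid \tilde z)_\sigma$ means: $\tilde x$ is $\mu_\sigma$-independent of $\tilde y$ given $\tilde z$.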

4 Influence opportunity diagrams

We now introduce the formal definition of an influence opportunity diagram for the extensive form of a game (the description of the game without payoffs). Every extensive form game has an associated influence opportunity diagram, which is defined independently of player preferences. As we see from the following definition, the construction is quite intuitive.

Definition 1 The influence opportunity diagram (hereafter, IOD) implied by a finite extensive form $(N, (X, E), \mathbf{A}, \mathcal{X})$ is a directed graph $(N, \to)$ such that $i \to k$ if and only if $i < k$, $l_k > 1$ and there exist $z \in Z$ and $z' \in \tilde h_{k \setminus i}^{-1}\bigl(\tilde h_{k \setminus i}(z)\bigr)$ such that $\tilde X_k(z) \neq \tilde X_k(z')$.

Hence, $i \to k$ if and only if there exists at least one move-$k$ history $(a_1, \dots, a_i, \dots, a_{k-1})$ such that the player moving at $i$ has the ability to "unilaterally" place the player moving at $k$ into a different information set simply by deviating to some other action (i.e., holding other players' actions constant). When $i \to k$, influence (in the sense that $i$ and $k$'s moves are correlated in the empirical distribution) is only an "opportunity" because player $k$ may always choose to ignore the actions of player $i$. Neither is $i \to k$ required for the appearance of influence in an empirical distribution since player $i$ may affect player $k$'s behavior indirectly through some other player or players.

Lemma 1 Every IOD is acyclic.

Proof. This follows immediately from the tree structure of $(X, E)$ and the requirement that $i \to k$ only if $i < k$.

To illustrate the construction, we build the IOD for the extensive form of the standard signalling game (Figure 2). In this game N denotes the Nature player (for purposes of illustration we will treat N as a standard player). The signalling game proceeds in the usual way: player I observes player N's move before he chooses his own and then player II, after observing player I's move, chooses her action. The set of possible play paths is defined by the product set $\{U, D\} \times \{L, R\} \times \{u, d\}$; examples of paths of play are $(U, L, u)$, $(U, R, d)$, $(D, L, u)$. Given the natural order defined by the extensive form, we start with the Nature player: this player moves first and no one has the opportunity to influence it. Player I's information is defined by the partition $\mathcal{X}_2 = \{X_{2,1}, X_{2,2}\}$, where $X_{2,1} = \{(U, *, *)\}$ and $X_{2,2} = \{(D, *, *)\}$ (where $*$ is a placeholder standing for all the moves available to the player who chooses the corresponding move). Consider $N \to I$ ($1 \to 2$), and pick a path, say $(U, L, u)$: $\tilde h_2(U, L, u) = (U)$ so that $\tilde h_{2 \setminus 1}(U, L, u) = \emptyset$. Clearly $\tilde h_{2 \setminus 1}^{-1}(\emptyset) = Z$. In this case we just need to find a path, $z'$, such that $\tilde X_2(z') \neq X_{2,1}$. As $X_{2,1} = \{(U, *, *)\}$, any path in $\{(D, *, *)\}$, say $(D, L, u)$, suffices to show that $N \to I$ ($1 \to 2$).

[Figure 2: extensive form of the Signalling game. N chooses U or D; player I, having observed N, chooses L or R; player II, having observed only I's move, chooses u or d. Terminal nodes are labelled a1 through a8.]

Let us move to player II, who has two information sets, $X_{3,1} = \{(*, L, *)\}$ and $X_{3,2} = \{(*, R, *)\}$. We want to see whether $2 \to 3$: let us pick the same path as before, $z = (U, L, u)$; $\tilde h_3(U, L, u) = (U, L)$ so that $\tilde h_{3 \setminus 2}(U, L, u) = (U)$. Now, $\tilde h_{3 \setminus 2}^{-1}(U) = \{(U, *, *)\}$. Given $X_{3,1} = \{(*, L, *)\}$, pick $z' = (U, R, u)$. Clearly, $z' \in \tilde h_{3 \setminus 2}^{-1}\bigl(\tilde h_{3 \setminus 2}(U, L, u)\bigr)$ and $z' \notin X_{3,1}$, so that $I \to II$ ($2 \to 3$). Finally, consider whether $N \to II$ ($1 \to 3$) and again use $z = (U, L, u)$: $\tilde h_{3 \setminus 1}(U, L, u) = (L)$ and $\tilde h_{3 \setminus 1}^{-1}(L) = \{(*, L, *)\}$. Now, for all $z' \in \tilde h_{3 \setminus 1}^{-1}\bigl(\tilde h_{3 \setminus 1}(U, L, u)\bigr)$, $z' \in X_{3,1}$. If we were to pick $z$ in the other information set, $X_{3,2} = \{(*, R, *)\}$, the same arguments would be valid, so that $N \nrightarrow II$ ($1 \nrightarrow 3$).

Observe that the $*$ notation used in the description of information sets gives us a quick though informal way to construct the IOD: namely, $i \to k$ if $i < k$ and one cannot describe $k$'s information sets using $*$ for all of $i$'s moves (alternatively, if in the description of $k$'s information sets an asterisk appears at every $i$th location, then $i \nrightarrow k$).
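The construction in Definition 1 can be carried out by brute force for small games with product structure. The following sketch makes several simplifying assumptions of its own: moves are indexed from 0, information sets are represented by labels returned by a caller-supplied function `info_set`, and the name `build_iod` is hypothetical. For each pair $i < k$ it varies only the action at move $i$, holding the rest of the move-$k$ history fixed, and records an arc when the resulting information set changes.

```python
from itertools import product


def build_iod(action_sets, info_set):
    """Return the set of arcs (i, k), 0-indexed, of the IOD per Definition 1.

    action_sets: list of action tuples, one per move.
    info_set(k, history): label of the move-k information set reached after
        `history`, the tuple of the first k actions.
    """
    arcs = set()
    for k in range(len(action_sets)):
        if len(action_sets[k]) <= 1:
            continue  # Definition 1 requires l_k > 1
        for i in range(k):
            for history in product(*action_sets[:k]):
                # Vary only move i's action, holding the other moves fixed.
                labels = {
                    info_set(k, history[:i] + (alt,) + history[i + 1:])
                    for alt in action_sets[i]
                }
                if len(labels) > 1:
                    arcs.add((i, k))
                    break
    return arcs


def signalling_info(k, history):
    """N observes nothing, I observes N, II observes only I."""
    return () if k == 0 else history[k - 1]


signalling_actions = [("U", "D"), ("L", "R"), ("u", "d")]
iod = build_iod(signalling_actions, signalling_info)
print(sorted(iod))
```

For the signalling game the sketch returns the arcs (0, 1) and (1, 2), which in the 1-indexed notation of the text are $1 \to 2$ and $2 \to 3$, with no arc from move 1 to move 3.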


5 Preliminary Results

The following lemma establishes the rather obvious result that the conditional probability of $\tilde a_k$ given a history $\tilde h_k$ is simply the value of the probability parameter on the edge associated with $\tilde a_k$ following the history $\tilde h_k$. Although the proof is in Appendix B, working through the proof will establish some useful ideas in a simple context.

Lemma 2 Given a game $\Gamma$, for all $k \in N$, $\sigma \in \Sigma$, $z \in Z$,

$$\mu_\sigma\bigl(\tilde a_k \mid \tilde h_k\bigr)(z) = \tilde\sigma_k(z).$$

The next lemma demonstrates an important finding regarding the relationship between $\tilde a_k$ and $\tilde a_i$ under the IOD associated with a finite extensive form. The proof relies on an extension of the ideas used to prove Lemma 2.

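Lemma 3 Given a game $\Gamma$ and its IOD $(N, \to)$, for all $k \in N$ and all $i \in \{1, \dots, k - 1\}$,

$$(i \nrightarrow k) \Leftrightarrow (\forall \sigma \in \Sigma)\ \bigl(\tilde a_k \perp \tilde a_i \mid \tilde h_{k \setminus i}\bigr)_\sigma.$$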

Proof. See Appendix B.

The following lemma is critical to all that follows. It establishes a direct connection between a game's IOD and certain conditional independencies that must hold in every one of its empirical distributions. Given an IOD $(N, \to)$, let $\pi_k$ denote the parents of $k$ and, in this context, let $\tilde\pi_k$ represent the projection of $\tilde h_k$ into the dimensions indexed by $\pi_k$; the notation $\tilde h_{k \setminus \pi_k}$ has the obvious meaning.

Lemma 4 Given a game $\Gamma$ and its IOD $(N, \to)$, for all $k \in N$ and all $\sigma \in \Sigma$,

$$\bigl(\tilde a_k \perp \tilde h_{k \setminus \pi_k} \mid \tilde\pi_k\bigr)_\sigma.$$

Proof. See Appendix B.

The main proposition in this section, which follows, demonstrates the connection between a game's IOD and any of the empirical distributions that might be generated as a result of play. Specifically, the result is that every empirical distribution generated in the play of a game can be factored in accordance with the parental relations embodied in its IOD. This linkage will prove useful in that it allows certain empirical properties of games to be easily compared by referring directly to their IODs rather than the empirical distributions themselves.

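Proposition 1 Given a game $\Gamma$, for all $\sigma \in \Sigma$, $\mu_\sigma$ admits recursive factorization according to $(N, \to)$. That is,

$$\forall \sigma \in \Sigma,\quad \mu_\sigma = \prod_{k=1}^{n} \mu_\sigma\bigl(\tilde a_k \mid \tilde\pi_k\bigr). \tag{10}$$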

Proof. Apply Lemma 4 to condition (8).

It should be noted that a game's IOD is "minimal" in the sense that there exist empirical distributions for which the equation in (10) fails by the removal of any element of $\pi_k$ for all $k \in N$. Such distributions are said to be maximally revealing.⁵ Of course, most empirical distributions will have more independencies than the ones implied by (10). The canonical example of such a distribution is the one in which all players ignore whatever information they have when making their move, in which case

$$\mu_\sigma = \prod_{k=1}^{n} \mu_\sigma(\tilde a_k).$$
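The factorization in (10) can be checked numerically for the signalling game, whose IOD has arcs $1 \to 2$ and $2 \to 3$ only. The sketch below uses hypothetical behavior strategies (the probabilities are arbitrary illustrative numbers, not values from the paper) and verifies that the conditional distribution of player II's action given the full history coincides with its distribution given I's action alone.

```python
from itertools import product

# Hypothetical behavior strategies for the signalling game (illustrative only).
p_n = {"U": 0.3, "D": 0.7}                                      # N observes nothing
p_i = {"U": {"L": 0.6, "R": 0.4}, "D": {"L": 0.2, "R": 0.8}}    # I observes N
p_ii = {"L": {"u": 0.9, "d": 0.1}, "R": {"u": 0.5, "d": 0.5}}   # II observes I only

# Empirical distribution on profiles (a_N, a_I, a_II), as in equation (7).
mu = {(n, i, ii): p_n[n] * p_i[n][i] * p_ii[i][ii]
      for n, i, ii in product("UD", "LR", "ud")}


def prob(event):
    """Total probability of the profiles for which event(profile) is true."""
    return sum(p for profile, p in mu.items() if event(profile))


def max_gap():
    """Largest gap between mu(a_II | a_N, a_I) and mu(a_II | a_I)."""
    gap = 0.0
    for n, i, ii in mu:
        given_history = mu[(n, i, ii)] / prob(lambda z: z[0] == n and z[1] == i)
        given_parent = (prob(lambda z: z[1] == i and z[2] == ii)
                        / prob(lambda z: z[1] == i))
        gap = max(gap, abs(given_history - given_parent))
    return gap


print(max_gap() < 1e-12)
```

The gap vanishes up to floating point error for any choice of strategies in this game, because player II's information sets never distinguish N's moves.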

6 Empirical Compatibility and Equivalence

A game $\Gamma'$ is said to be empirically compatible to $\Gamma$ when the empirical distribution induced by any strategy profile in $\Gamma$ can also be induced by an appropriately chosen strategy profile in $\Gamma'$. The social scientist observing outcomes generated by repeated play of $\sigma$ in $\Gamma$ will, eventually, develop a fairly precise estimate of $\mu_\sigma$. However, if $\Gamma'$ is empirically compatible to $\Gamma$, then there is no collection of $\Gamma$-generated data capable of ruling out $\Gamma'$ as the true underlying game.

⁵ Such distributions are known as distributions for which $(N, \to)$ is a perfect map (Geiger et al. (1990)), or as distributions that are faithful to $(N, \to)$ (Meek (1995)). Meek (1995) demonstrates that for any DAG and fixed discrete state space there is a distribution to which that graph is a perfect map.

An obvious necessary condition for empirical compatibility is that the games have consistent player sets and outcome profiles. Given a permutation $f : N \to N$, let $f(\mathbf{a})$ denote the permuted profile $(a_{f(k)})_{k \in N}$. Then, $\Gamma$ and $\Gamma'$ are said to be outcome compatible if there exists an $f$ such that $f(\mathbf{A}) = \mathbf{A}'$. Note that when $\Gamma$ and $\Gamma'$ are outcome compatible, there exists a bijection between $\mathbf{A}$ and $\mathbf{A}'$ and, hence, between $Z$ and $Z'$. Thus, every $\sigma' \in \Sigma'$ implies a distribution on $(\mathbf{A}, 2^{\mathbf{A}})$ which we denote by $\mu_{\sigma'}$ (this distribution is constructed as in (7) but using the appropriate permutation). Outcome compatibility is an equivalence relation.

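Definition 2 $\Gamma'$ is empirically compatible to $\Gamma$, denoted $\Gamma \preceq \Gamma'$, if $\Gamma$ and $\Gamma'$ are outcome compatible and there exists a function $g : \Sigma \to \Sigma'$ such that $\forall \sigma \in \Sigma$, $\mu_\sigma = \mu_{g(\sigma)}$.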

If $\Gamma \preceq \Gamma'$ and $\Gamma' \preceq \Gamma$, then $\Gamma$ and $\Gamma'$ are said to be empirically equivalent, denoted $\Gamma \sim \Gamma'$. When $\Gamma'$ and $\Gamma$ are empirically equivalent, any behavior observed under $\Gamma$ ("observed" in the sense of knowing $\mu_\sigma$) can also be observed under $\Gamma'$ and vice versa. Empirical compatibility is strong in that the condition must hold for all $\sigma \in \Sigma$. Alternatively, for example, one might be interested in a notion of empirical compatibility defined only for specific (e.g., equilibrium) profiles. As we saw in §2.1, empirical equivalence does not imply equivalent information structures (recall, we demonstrated that the complete information game in which I was observed by II was empirically equivalent to the one in which II was observed by I).

Proposition 2 Empirical equivalence is an equivalence relation on the space of finite-length extensive form games. Moreover, if $\Gamma \sim \Gamma'$, there exists an onto correspondence $g : \Sigma \rightrightarrows \Sigma'$ such that $\forall \sigma \in \Sigma$, $\sigma' \in g(\sigma)$, $\mu_\sigma = \mu_{\sigma'}$.

Let us turn again to the standard signalling game (treating the Nature player as a regular player). Outcome compatible games are those that have three players, one player with action set $\{U, D\}$, another with $\{u, d\}$ and the third with $\{L, R\}$. An example of a game that is empirically compatible to the signalling game is one (game 1) in which II also observes the Nature player, i.e., $N \to I$, $I \to II$ and also $N \to II$. Clearly, any strategy in the standard signalling game can be replicated in this extended game, as player II can always ignore what the Nature player has done in choosing her strategy.

[Figure 3 (game tree): player I chooses L or R; player II chooses u or d; player N chooses U or D; terminal nodes a1, ..., a8.]

Figure 3: a variant on the Signalling extensive form

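A subtler situation occurs with an alternative game (game 2, Figure 3) in which both player II and player N observe player I (and nothing else). Consider an arbitrary strategy in this game, $\sigma'$. Equating moves with player labels, and replacing histories with the appropriate actions, $\mu_{\sigma'}$ can be decomposed as follows:

$$\begin{aligned}
\mu_{\sigma'} &= \mu_{\sigma'}(\tilde a_I)\,\mu_{\sigma'}(\tilde a_N \mid \tilde a_I)\,\mu_{\sigma'}(\tilde a_{II} \mid \tilde a_I) \\
&= \mu_{\sigma'}(\tilde a_I)\,\frac{\mu_{\sigma'}(\tilde a_N, \tilde a_I)}{\mu_{\sigma'}(\tilde a_I)}\,\frac{\mu_{\sigma'}(\tilde a_{II}, \tilde a_I)}{\mu_{\sigma'}(\tilde a_I)} \\
&= \frac{\mu_{\sigma'}(\tilde a_N, \tilde a_I)\,\mu_{\sigma'}(\tilde a_{II}, \tilde a_I)}{\mu_{\sigma'}(\tilde a_I)}.
\end{aligned}$$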

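Let us return to the standard signalling game and consider an arbitrary strategy $\sigma$. Using the same procedure as above, $\mu_\sigma$ factors as follows:

$$\begin{aligned}
\mu_\sigma &= \mu_\sigma(\tilde a_N)\,\mu_\sigma(\tilde a_I \mid \tilde a_N)\,\mu_\sigma(\tilde a_{II} \mid \tilde a_I) \\
&= \mu_\sigma(\tilde a_N)\,\frac{\mu_\sigma(\tilde a_N, \tilde a_I)}{\mu_\sigma(\tilde a_N)}\,\frac{\mu_\sigma(\tilde a_{II}, \tilde a_I)}{\mu_\sigma(\tilde a_I)} \\
&= \frac{\mu_\sigma(\tilde a_N, \tilde a_I)\,\mu_\sigma(\tilde a_{II}, \tilde a_I)}{\mu_\sigma(\tilde a_I)}.
\end{aligned}$$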

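It is clear by visual inspection that any $\sigma$ from the standard signalling game can be replicated in game 2: just define $\sigma'$ such that it assigns strategies in such a way that

$$\mu_\sigma(\tilde a_N, \tilde a_I) = \mu_{\sigma'}(\tilde a_N, \tilde a_I),$$

with player II's strategy carried over unchanged, since in both games she conditions only on player I's move. That is,⁶

$$\sigma'_I(R) = \mu_\sigma(U, R) + \mu_\sigma(D, R), \qquad
\sigma'_N(U \mid R) = \frac{\mu_\sigma(U, R)}{\sigma'_I(R)}, \qquad
\sigma'_N(U \mid L) = \frac{\mu_\sigma(U, L)}{1 - \sigma'_I(R)}.$$

Similarly, for any strategy $\sigma'$ in game 2, we can construct an equivalent $\sigma$, so that the signalling game and game 2 are empirically equivalent.⁷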

⁶ The function $\mu_\sigma(\tilde a_N, \tilde a_I)(z)$ can take four values. Fix the value of $\mu_\sigma(D, L)$ by the restriction that all four probabilities have to add to one. This leaves three free parameters, $\sigma'_I(R)$, $\sigma'_N(U \mid L)$ and $\sigma'_N(U \mid R)$, to be determined from three equations:

$$\begin{aligned}
\mu_\sigma(U, R) &= \sigma'_I(R)\,\sigma'_N(U \mid R) \\
\mu_\sigma(D, R) &= \sigma'_I(R)\,(1 - \sigma'_N(U \mid R)) \\
\mu_\sigma(U, L) &= (1 - \sigma'_I(R))\,\sigma'_N(U \mid L)
\end{aligned}$$

⁷ This conclusion would not hold if we were to impose the additional (and usual) restriction that the Nature player cannot condition its behavior on any other player's actions.
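The replication argument for the signalling game and game 2 can be checked numerically. The sketch below is only an illustration: the probabilities in `sigma_N`, `sigma_I` and `sigma_II` are made-up values, and all function and variable names are hypothetical. It builds the empirical distribution of a signalling-game strategy, solves the three equations of footnote 6 for a game 2 strategy, and compares the two induced distributions.

```python
from itertools import product

# Hypothetical behavior strategy in the standard signalling game (N -> I -> II).
sigma_N = {"U": 0.3, "D": 0.7}                 # Nature's move
sigma_I = {"U": {"L": 0.2, "R": 0.8},          # I's move given Nature's move
           "D": {"L": 0.6, "R": 0.4}}
sigma_II = {"L": {"u": 0.5, "d": 0.5},         # II's move given I's move
            "R": {"u": 0.1, "d": 0.9}}

# Empirical distribution over profiles (a_N, a_I, a_II) in the signalling game.
mu = {(n, i, ii): sigma_N[n] * sigma_I[n][i] * sigma_II[i][ii]
      for n, i, ii in product("UD", "LR", "ud")}

def marginal_NI(dist, n, i):
    """Marginal probability of the pair (a_N, a_I) = (n, i)."""
    return sum(p for (n_, i_, _), p in dist.items() if (n_, i_) == (n, i))

# Game 2 strategy (I moves first; N and II each observe only I),
# obtained from the three equations in footnote 6.
s2_I_R = marginal_NI(mu, "U", "R") + marginal_NI(mu, "D", "R")
s2_N_U_given_R = marginal_NI(mu, "U", "R") / s2_I_R
s2_N_U_given_L = marginal_NI(mu, "U", "L") / (1 - s2_I_R)

s2_I = {"R": s2_I_R, "L": 1 - s2_I_R}
s2_N = {"R": {"U": s2_N_U_given_R, "D": 1 - s2_N_U_given_R},
        "L": {"U": s2_N_U_given_L, "D": 1 - s2_N_U_given_L}}

# Empirical distribution in game 2; player II's strategy is reused unchanged.
mu2 = {(n, i, ii): s2_I[i] * s2_N[i][n] * sigma_II[i][ii]
       for n, i, ii in product("UD", "LR", "ud")}

# Largest pointwise difference between the two distributions.
max_gap = max(abs(mu[k] - mu2[k]) for k in mu)
```

The construction divides by $\sigma'_I(R)$ and $1 - \sigma'_I(R)$, so the sketch assumes player I plays both L and R with positive probability; when one of them has probability zero, the corresponding conditional for N can be chosen arbitrarily.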


7 Assessing Empirical Compatibility

Now, imagine our social scientist observes a large number of outcomes generated by repetition of a game with unknown information structure. To what extent does such data illuminate the game's underlying information structure? Given a candidate game, empirical compatibility can be checked by the same brute-force approach used in the earlier examples. In simple cases, this form of analysis is relatively straightforward.

On the other hand, consider the "Gatekeeper" extensive form shown in Figure 5. Here, five players interact under a relatively complex information structure. The implications of this structure for the empirical distributions on actions arising from player strategies are not obvious. We now develop results by which these implications are neatly analyzed.

To begin, Proposition 1 says that every empirical distribution generated during play of a game must admit recursive factorization according to its IOD. This immediately implies that if one is interested in inferring a game's information structure from the empirical distributions it generates, consideration can be limited to games whose IODs provide consistent bases for factorization. The following proposition is an immediate corollary to Proposition 1.

Proposition 3 If $\Gamma \preceq \Gamma'$ then,

$$\forall \sigma \in \Sigma, \quad \mu_\sigma = \prod_{k \in N} \mu_\sigma(\tilde a_k \mid \tilde\pi'_k). \tag{11}$$

To see why condition (11) is not sufficient for $\Gamma \preceq \Gamma'$, consider the outcome compatible games in Figure 4. Both have the same IOD: $I \to II$. It should be easy to see that $\Gamma_2 \preceq \Gamma_1$ but $\Gamma_1 \not\preceq \Gamma_2$. The general issue is a measurability distinction between the feasible strategies for player II in each of the games. Let $\sigma(\cdot)$ indicate the sigma-algebra generated by either a collection of sets or a function. Set $I_1 \equiv \{\{z_1, z_2\}, \{z_3, z_4\}, \{z_5, z_6\}\}$ and $I_2 \equiv \{\{z_1, z_2\}, \{z_3, z_4, z_5, z_6\}\}$. Note that $\sigma(I_2) \subset \sigma(I_1)$. It can be shown that, in $\Gamma_2$,

$$\forall \sigma \in \Sigma_2, \quad \mu_\sigma = \mu_\sigma(\tilde a_{II} \mid \sigma(I_2))\,\mu_\sigma(\tilde a_I). \tag{12}$$

In $\Gamma_1$, $\sigma(\tilde\pi^1_{II}) = \sigma(I_1)$, while in $\Gamma_2$, $\sigma(I_2) \subset \sigma(\tilde\pi^2_{II})$. This implies that $\sigma(\tilde\pi^2_{II})$ or $\sigma(\tilde\pi^1_{II})$ can be substituted for $\sigma(I_2)$ in (12), so that all empirical distributions generated under $\Gamma_2$ can be factored as (10) or (11), respectively. In $\Gamma_2$, player II's strategy function, $\tilde\sigma^2_{II}$, is only $\sigma(I_2)$-measurable. Since $\sigma(I_2) \subset \sigma(I_1)$, there are strategies that can be implemented by II in $\Gamma_1$ that cannot be implemented in $\Gamma_2$.

[Figure 4 (game trees #1 and #2): in each tree, player I chooses L, M or R and player II then chooses u or d; terminal nodes z1, ..., z6.]

Figure 4: $\Gamma_2 \preceq \Gamma_1$ but $\Gamma_2 \not\sim \Gamma_1$.

For certain types of games, Proposition 3 can be strengthened. $\Gamma$ is said to be a game with observability partitions if, for all $k \in N$, player $k$ observes the moves of those preceding her either perfectly or not at all (hence, players $1, \ldots, k-1$ can be partitioned into two sets: those whose actions are seen by $k$ and those whose actions are not). This class contains many extensive-form games of economic interest: all of the games in Section 2 meet this requirement, as do many standard market games such as Cournot, Stackelberg, etc.

Proposition 4 Let $\Gamma'$ be a game with observability partitions. Then, $\Gamma \preceq \Gamma'$ if and only if $\Gamma$ and $\Gamma'$ are outcome compatible and condition (11) holds.

To actually refute $\Gamma \preceq \Gamma'$, one need not check every empirical distribution against condition (11). Rather, one need only construct a maximally revealing strategy profile $\sigma$ in $\Gamma$, and then check whether the equation in (11) holds. If not, the games are not empirically compatible (non-maximally revealing strategy profiles can only introduce additional independencies, never additional dependencies, with respect to the game's IOD). If $\Gamma$ and $\Gamma'$ are games with observability partitions, then the ability to factor $\mu_\sigma$ according to (11) also provides a sufficient test of empirical compatibility. We formalize this below.

Proposition 5 Suppose $\Gamma'$ is a game with observability partitions that is outcome compatible with $\Gamma$. If $\sigma \in \Sigma$ is maximally revealing and $\mu_\sigma = \prod_{k \in N} \mu_\sigma(\tilde a_k \mid \tilde\pi'_k)$, then $\Gamma \preceq \Gamma'$.

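Corollary 1 Suppose $\Gamma$ and $\Gamma'$ are outcome compatible games, both of which have observability partitions. If $\sigma \in \Sigma$ and $\sigma' \in \Sigma'$ are maximally revealing and $\mu_\sigma = \prod_{k \in N} \mu_\sigma(\tilde a_k \mid \tilde\pi'_k)$ and $\mu_{\sigma'} = \prod_{k \in N} \mu_{\sigma'}(\tilde a'_{f'(k)} \mid \tilde\pi_{f'(k)})$, then $\Gamma \sim \Gamma'$.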

Even limiting analysis to a single, properly constructed behavior strategy, the preceding results are cumbersome in practice. For example, suppose one wanted to identify the set of games empirically compatible to the Gatekeeper game (Figure 5). This game has observability partitions, so one might be tempted to use Proposition 5 in the service of this task (this can be done, but it is clearly not a very heartening prospect).

As it turns out, there is another, rather remarkable approach that allows the testing of empirical equivalence by direct comparison of IODs without reference to specific probability parameters. First, given a directed graph $(N, \to)$, let $E \subseteq N^2$ be the set of edges without reference to direction; i.e., $(i, j) \in E$ if and only if $(i \to j)$ or $(j \to i)$. Let $S \subseteq N^3$ be the set of all ordered triples such that $(i, j, k) \in S$ if and only if $(i \to j)$, $(k \to j)$ and $(i, k) \notin E$.
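The sets $E$ and $S$ are simple to compute from a list of arcs. The following is a minimal sketch, assuming an IOD is represented as a Python set of `(tail, head)` pairs; the function names are hypothetical, and the requirement `i != k` is an assumption made explicit here (the two converging arrows are taken to come from distinct nodes).

```python
def skeleton(arcs):
    """E: all pairs (i, j) joined by an arc in either direction."""
    return {(i, j) for i, j in arcs} | {(j, i) for i, j in arcs}

def converging_triples(arcs):
    """S: ordered triples (i, j, k) with i -> j, k -> j and i, k not adjacent."""
    E = skeleton(arcs)
    return {(i, j, k)
            for (i, j) in arcs
            for (k, j2) in arcs
            if j == j2 and i != k and (i, k) not in E}

def same_E_and_S(arcs_a, arcs_b):
    """True when two graphs on the same node set share both E and S."""
    return (skeleton(arcs_a) == skeleton(arcs_b)
            and converging_triples(arcs_a) == converging_triples(arcs_b))

# Three small graphs on the nodes {1, 2, 3}.
chain = {(1, 2), (2, 3)}           # 1 -> 2 -> 3
reversed_chain = {(2, 1), (3, 2)}  # 1 <- 2 <- 3
collider = {(1, 2), (3, 2)}        # 1 -> 2 <- 3
```

In this illustration the chain and its reversal share both sets, while the collider has the same undirected edges but a non-empty $S$, so `same_E_and_S(chain, collider)` is false.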


[Figure 5 (game tree): player 1 chooses A or B; player 2 chooses L or R; player 3 chooses l or r; player 4 chooses x or y; player 5 chooses a or b.]

Figure 5: Gatekeeper extensive form

Proposition 6 Assume $\Gamma$ and $\Gamma'$ are outcome compatible games with observability partitions. Then, $\Gamma \sim \Gamma'$ if and only if their IODs are such that $E = E'$ and $S = S'$.

Proof. See Theorem 3 in Appendix A.

Corollary 2 Suppose $\Gamma$ and $\Gamma'$ are outcome compatible. If $\Gamma \preceq \Gamma'$ then their IODs are such that $E \subseteq E'$ and $S \subseteq S'$. If $\Gamma'$ is also a game with observability partitions, this condition is also sufficient.

Hence, at least in the case of games with observability partitions, empirical equivalence can be ascertained immediately upon construction of a game's IOD.⁸ To see the significance of this, recall that the IOD for the Gatekeeper extensive form is:

    c1      c2
       ↘  ↙
        c3
       ↙  ↘
    c4      c5

We see where the game gets its name: player 3 is the "gatekeeper" of information flowing from players 1 and 2 to players 4 and 5. Since this is a game of perfect observation, Proposition 6 tells us that there are no other games with which this game is empirically equivalent. Any empirically equivalent IOD would have the same set of edges, some with different directions. However, reversing any arrow above either breaks a converging pair of arrows or creates a new one.

The requirement that elements of $S$ not include convergent arrows with linked tails (i.e., $(i, j, k) \in S \Rightarrow (i, k) \notin E$) has an immediate implication for three-move games with observability partitions. Namely, all outcome compatible games whose IOD is a variation of the fully connected graph

        1
      ↙   ↘
    2   →   3

are empirically compatible.

⁸ This last result raised a question that was put to us by E. Dekel in correspondence. Given the well-known work by Thompson (1952) and Elmes and Reny (1994) that identify transformations on extensive form games that yield the equivalence class of games with the same strategic form, is there a set of operations, similar to these in spirit, that yield games with equivalent IODs (in the sense of Proposition 6)? Due to space limitations, we do not provide a formal reply. Clearly, however, Proposition 6 does suggest a step-wise transformation that will yield observationally indistinguishable extensive forms with different IODs. The transformation, while difficult to formalize in the context of an extensive form game, is easy to describe: it is the transformation that flips an "allowed" arrow (per Proposition 6) in the original IOD.
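Both graphical claims made about the Gatekeeper IOD and the fully connected three-node graph can be verified by enumeration. The sketch below is illustrative only: it assumes IODs are encoded as sets of `(tail, head)` pairs, uses hypothetical helper names, and treats the two converging arrows in a triple as coming from distinct nodes.

```python
from itertools import product

def skeleton(arcs):
    """E: all pairs (i, j) joined by an arc in either direction."""
    return {(i, j) for i, j in arcs} | {(j, i) for i, j in arcs}

def converging_triples(arcs):
    """S: ordered triples (i, j, k) with i -> j, k -> j and i, k not adjacent."""
    E = skeleton(arcs)
    return {(i, j, k) for (i, j) in arcs for (k, j2) in arcs
            if j == j2 and i != k and (i, k) not in E}

def is_acyclic(arcs):
    """Repeatedly strip nodes with no incoming arcs; fail if none can be found."""
    arcs = set(arcs)
    nodes = {v for arc in arcs for v in arc}
    while nodes:
        sources = {v for v in nodes if not any(j == v for _, j in arcs)}
        if not sources:
            return False
        nodes -= sources
        arcs = {(i, j) for i, j in arcs if i not in sources}
    return True

# Gatekeeper IOD: 1 -> 3, 2 -> 3, 3 -> 4, 3 -> 5.
gatekeeper = {(1, 3), (2, 3), (3, 4), (3, 5)}
S0 = converging_triples(gatekeeper)

# Reverse each arc in turn; keep those reversals that leave S unchanged.
preserved = []
for (i, j) in sorted(gatekeeper):
    flipped = (gatekeeper - {(i, j)}) | {(j, i)}
    if is_acyclic(flipped) and converging_triples(flipped) == S0:
        preserved.append((i, j))

# All acyclic orientations of the complete graph on {1, 2, 3}.
pairs = [(1, 2), (1, 3), (2, 3)]
orientations = []
for flips in product([False, True], repeat=3):
    arcs = {(j, i) if f else (i, j) for (i, j), f in zip(pairs, flips)}
    if is_acyclic(arcs):
        orientations.append(arcs)
```

In this sketch `preserved` comes out empty, matching the statement that no single arrow of the Gatekeeper IOD can be reversed without changing $S$, and every acyclic orientation of the three-node complete graph has an empty $S$ because all pairs of nodes are adjacent.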


8 Extensions

8.1 Multiple Moves

We can easily allow for players to make multiple moves, and the results of previous sections will hold with the appropriate additional notation to differentiate the kth move from the player who makes that move. There are two main differences between a game with multiple moves and a game where each player moves only once: (1) the information that a player has when choosing his second (third, ...) move includes the information he had at the time of the first (second, ...); and (2) the player may be able to play strategies that could not be implemented if the moves were played by distinct players. The first issue is fully dealt with in the definition of the information sets of the game. As we take information sets as given, the inclusion of previously known information does not present any problems for our results. Actually, it makes things a little simpler, as the player observes his own previous moves perfectly, so that we are more likely to have a game with observability partitions. The second turns out to be a non-issue, given that we assume perfect recall. It is well-known (Kuhn 1953, Ritzberger 1999) that the set of probability distributions that can be generated using mixed strategies is the same as those that can be generated by behavior strategies. Hence, strategies made by the same player at different moves can continue to be described using conditional probabilities, as was done when the number of moves was equal to the number of players, so that the probability distributions generated by players continue to be generated from conditional distributions (Lemma 2) and can be factored using an IOD as shown in Lemma 1.

8.2 Infinite Action Spaces

The use of infinite action spaces adds some technical complications but does not affect our results in any qualitatively substantial way. Technically, action spaces can be any compact metric space and there are several ways to represent agents' strategies. Given the importance for our results of Lemma 2, it is satisfying to know that such a representation is valid in the more general setting (Milgrom and Weber 1985). The representation of information is also more technically challenging. We restrict ourselves to games with absolutely continuous information in the sense of Milgrom and Weber⁹ (1985, assumption R2). Then, behavior strategies can be defined using conditional probabilities which generate a regular probability measure (see Milgrom and Weber 1985 and Chakrabarti 1999) as in Lemma 2 and hence generate an IOD¹⁰ that can be factored as shown in Lemma 1.

An important question now is how the IOD definition should be applied in games with infinite action spaces. The simplest way to solve this is to use measurability requirements, i.e., $i \to k$ if $i < k$ and $k$'s information is not measurable on $\times_{j \in \{1, \ldots, k-1\} \setminus \{i\}} A_j$. Given our discussion above, this definition implies that Lemma 4 and Proposition 1 hold. Once these basic results are established, the rest follow.

⁹ In Milgrom and Weber (1985) imperfect information is modelled using type spaces. This can be easily linked to our model by letting the type space be the product of the action spaces of the other players the agent observes. These type spaces are the finite product of compact metric spaces and hence also compact metric spaces.

¹⁰ An issue that has been pointed out to us is that not all probability distributions can be represented using an IOD/DAG. We feel we have demonstrated that this more general result does not have consequences in our paper, as we are restricted, by the nature of behavior strategies, to probability measures that are built up from conditional probabilities and hence accept (at least) the factoring that generates them. Furthermore, we know that the set of distributions that do not admit factorization in the space of normal distributions (Spirtes et al. (1993)) and in the space of multinomial distributions (Meek (1995)) has Lebesgue measure zero. As these spaces of distributions can be used to approximate more general distributions, we are confident that, were our results to have to be restricted to probability distributions that can be factored according to a directed acyclic graph, they would continue to be sufficiently general and relevant.

9 Conclusions

Although our definition of empirical compatibility is new, the idea of observationally indistinguishable strategies was introduced at least as early as Kuhn (1953). Two strategies, behavior or mixed, are equivalent if they lead to the same probability distribution over outcomes for all strategies of one's opponents. Kuhn demonstrates that, in games of perfect recall, every mixed strategy is equivalent to the unique behavior strategy it generates and each behavior strategy is equivalent to every mixed strategy that generates it (see Aumann, 1964, for an extension to infinite games). It follows immediately that every extensive-form game of perfect recall is indistinguishable from its reduced normal form. Thus, any two games with the same reduced normal form are indistinguishable.

We have interpreted our results as consistent with the inferences that would be made by an outside observer with sufficiently informative empirical data. One question that immediately comes to mind is whether these ideas can be extended to construct an econometric test for game structure given cross-sectional data on player actions. For example, the maximum likelihood estimate of the information structure for an industry might be useful in refining cost estimation in empirical work in industrial organization (as suggested by our earlier example). This is the subject of on-going research.

Throughout the paper, the assumption was maintained that the underlying players and feasible action outcomes were complete in the sense that the social scientist was not concerned with hidden variables issues. There has been much work done in the field of probabilistic networks on dealing with this issue and we direct interested readers to the references cited in Appendix A for further inquiry.

An important element distinguishing our work from the literature on probabilistic networks is that the IOD is derived from the primitives of a game and not from the properties of a single, arbitrarily-specified probability distribution. Thus, the information encoded in an IOD holds for all empirical distributions arising from play in the underlying game. Moreover, many of our results rely on the special structure implied by distributions of this kind and, as mentioned earlier, may not hold in a non-game-theoretic context (or, for example, if correlated strategies are allowed without inclusion of a specific correlating device).

Until recently, work on probabilistic networks focused upon the decision problem of a single individual. Thus, another aspect that separates our work from the existing literature in this field is its use of these objects in the solution of game-theoretic (i.e., interactive) decision problems. Two other papers, one by Koller and Milch (2002) and another by La Mura (2002), also use probabilistic networks to derive results of interest to game theorists, though along a different line. Both of these papers develop alternative representations for interactive decision problems (i.e., as opposed to a game's strategic or normal form) and argue that these representations are not only computationally advantageous but also provide qualitative insight into the structural interdependencies between player decisions. Our work clearly complements this line of research.

A Dependency models

Here, we provide a condensed discussion of the relevant underlying theory of probabilistic networks. Since most economists are unfamiliar with this literature, we wish to: (i) give readers a sense of its theoretical content, and (ii) provide sufficient technical detail to support the development of our proofs. For those interested in pursuing these ideas further, we suggest starting with the texts by Cowell et al. (1999), Pearl (1988, 2000) and Spirtes et al. (2000).

Definition 3 A dependency model $M$ over a finite set of elements $T$ is a collection of independence statements of the form $(C \perp D \mid E)$ in which $C$, $D$ and $E$ are disjoint subsets of $T$ and which is read "$C$ is independent of $D$ given $E$." The negation of an independency is called a dependency.

The notion of a general dependency model originated with Pearl and Paz (1985), who were motivated to develop a set of axiomatic conditions on general dependency models that would include probabilistic and graphical dependencies as special cases. These axioms are known as the graphoid axioms.¹³ We are interested in graphoids, which are defined as dependency models that are closed under the graphoid axioms.

For example, given a probability space $(\Omega, \mathcal{F}, \mu)$ and an associated, finite set of random variables $X$ indexed by $N = \{1, \ldots, n\}$ with typical element $\tilde x_r$, $M_\mu$ denotes the list of conditional independencies that hold under $\mu$. For all $W \subseteq N$, let $\tilde x_W \equiv (\tilde x_r)_{r \in W}$. Then, for all disjoint $C, D, E \subseteq N$, $(C \perp D \mid E) \in M_\mu$ if and only if $\tilde x_C$ is $\mu$-conditionally independent of $\tilde x_D$ given $\tilde x_E$ (in the main text we do not work with general dependency models and, hence, write $(\tilde x_C \perp \tilde x_D \mid \tilde x_E)_\mu$ for clarity). A proof that the graphoid axioms hold for conditional independence in all probability distributions can be found in Spohn (1980).

Alternatively, if $G$ is a graph whose vertices are $N$, then for all disjoint $C, D, E \subseteq N$, $(C \perp D \mid E) \in M_G$ if and only if $E$ is a cutset separating $C$ from $D$. Of course, in this case, the meaning of $(C \perp D \mid E)$ depends upon how one defines "cutset." The literature on probabilistic networks contains several such definitions, depending upon whether the graph is undirected, directed or some mixture of the two (e.g., a chain graph). Since our IODs are directed, acyclic graphs (hereafter, DAGs), we proceed with Pearl's (1986) notion of d-separation (the d stands for "directed").

Given a DAG $G \equiv (N, \to)$, a path is an ordered set of nodes $P \subseteq N$ such that, for all $\alpha_r, \alpha_{r+1} \in P$, either $\alpha_r \to \alpha_{r+1}$ or $\alpha_r \leftarrow \alpha_{r+1}$. A node $\alpha_r \in P$ is called head-to-head with respect to $P$ if $\alpha_{r-1} \to \alpha_r \leftarrow \alpha_{r+1}$ in $P$. A node that starts or ends a path is not head-to-head. A path $P \subseteq N$ is active by $E \subseteq N$ if: (i) every head-to-head node is in or has a descendant in $E$; and (ii) every other node in $P$ is outside $E$. Otherwise, $P$ is said to be blocked by $E$.

Definition 4 If $G = (N, \to)$ is a DAG and $C$, $D$ and $E$ are disjoint subsets of $N$, then $E$ is said to d-separate $C$ from $D$ if and only if there exists no active path by $E$ between a node in $C$ and a node in $D$.
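For small graphs, d-separation can be checked directly from the definitions of head-to-head nodes and active paths by enumerating paths. The sketch below is a brute-force illustration, not an efficient algorithm; it assumes a DAG given as a set of `(tail, head)` arcs, Python sets for C, D and E, and hypothetical function names. Because C and D are disjoint from E, only the interior nodes of each path need to be tested.

```python
def descendants(arcs, v):
    """All nodes reachable from v along directed arcs."""
    found, stack = set(), [v]
    while stack:
        u = stack.pop()
        for (i, j) in arcs:
            if i == u and j not in found:
                found.add(j)
                stack.append(j)
    return found

def simple_paths(arcs, start, end):
    """Paths from start to end that ignore arc direction and repeat no node."""
    nbrs = {}
    for i, j in arcs:
        nbrs.setdefault(i, set()).add(j)
        nbrs.setdefault(j, set()).add(i)

    def extend(path):
        if path[-1] == end:
            yield path
            return
        for w in nbrs.get(path[-1], ()):
            if w not in path:
                yield from extend(path + [w])

    yield from extend([start])

def is_active(arcs, path, E):
    """Apply conditions (i) and (ii) to every interior node of the path."""
    for a, b, c in zip(path, path[1:], path[2:]):
        head_to_head = (a, b) in arcs and (c, b) in arcs
        if head_to_head:
            # (i) a head-to-head node must be in E or have a descendant in E
            if b not in E and not (descendants(arcs, b) & E):
                return False
        elif b in E:
            # (ii) any other node on the path must lie outside E
            return False
    return True

def d_separated(arcs, C, D, E):
    """True when no path between C and D is active by E."""
    return not any(is_active(arcs, p, E)
                   for c in C for d in D
                   for p in simple_paths(arcs, c, d))

# Gatekeeper IOD from the main text: 1 -> 3, 2 -> 3, 3 -> 4, 3 -> 5.
gatekeeper = {(1, 3), (2, 3), (3, 4), (3, 5)}
```

For example, with the Gatekeeper arcs, `d_separated(gatekeeper, {1}, {4}, {3})` holds because node 3 blocks the only path, whereas conditioning on node 4 activates the head-to-head node 3 and connects nodes 1 and 2.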

Examples of d-separation can be found in the Pearl references cited above. Thus, given a DAG $G$ we define $M_G$ such that $(C \perp D \mid E) \in M_G$ if and only if $E$ d-separates $C$ from $D$ in $G$.

We wish to characterize the relationship between probabilistic and graphical dependency models. This is done through the general notion of an independence map (or, I-map).

Definition 5 An I-map of a dependency model $M$ is any model $M'$ such that $M' \subseteq M$.

Given a probability space $(\Omega, \mathcal{F}, \mu)$ and an associated, finite set of random variables $X \equiv \{\tilde x_1, \ldots, \tilde x_n\}$, the task of constructing a DAG $(N, \to)$, $N = \{1, \ldots, n\}$, such that $M_{(N, \to)}$ is an I-map of $M_\mu$ is straightforward (see Geiger et al., 1990, p. 514). First, for all $r \in N$, let $U_r \equiv \{1, \ldots, r-1\}$ index the predecessors of $\tilde x_r$ according to $N$. Next, identify a minimal set of predecessors $\pi_r \subseteq U_r$ such that $(\{r\} \perp U_r \setminus \pi_r \mid \pi_r)_\mu$. This results in a set of $n$ independence statements known as a recursive basis drawn from $M_\mu$ and denoted $B_\mu$. Now, construct $(N, \to)$ such that $s \to r$ if and only if $s \in \pi_r$. The resulting graph $G$, a DAG, is said to be generated by $B_\mu$, and $\pi_r = \{s \in N \mid s \to r\}$ is the set of parents of $r$ in $G$.
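The construction of a recursive basis can be illustrated on a small discrete example. The sketch below is an assumption-laden illustration rather than the procedure of Geiger et al.: the joint distribution is a dictionary from outcome tuples to probabilities, variables are indexed from 0 rather than 1, independence is tested numerically up to a tolerance, and all names and probability values are hypothetical. For each variable it searches predecessor subsets in order of size, so the first subset found is a minimal parent set.

```python
from itertools import combinations, product

def marginal(joint, idx):
    """Marginal of a joint pmf (dict: outcome tuple -> prob) on positions idx."""
    out = {}
    for x, p in joint.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def cond_indep(joint, r, rest, given, tol=1e-9):
    """Test P(r, rest, given) * P(given) == P(r, given) * P(rest, given)."""
    p_all = marginal(joint, [r] + rest + given)
    p_rg = marginal(joint, [r] + given)
    p_sg = marginal(joint, rest + given)
    p_g = marginal(joint, given)
    for (rg, p1), (sg, p2) in product(p_rg.items(), p_sg.items()):
        xr, xg = rg[:1], rg[1:]
        xs, xg2 = sg[:len(rest)], sg[len(rest):]
        if xg != xg2:
            continue
        lhs = p_all.get(xr + xs + xg, 0.0) * p_g[xg]
        if abs(lhs - p1 * p2) > tol:
            return False
    return True

def recursive_basis(joint, n):
    """For each r, a smallest parent set pi_r among its predecessors 0..r-1."""
    parents = {}
    for r in range(n):
        preds = list(range(r))
        for size in range(len(preds) + 1):
            hit = next((set(pi) for pi in combinations(preds, size)
                        if cond_indep(joint, r,
                                      [u for u in preds if u not in pi],
                                      list(pi))), None)
            if hit is not None:
                parents[r] = hit
                break
    return parents

# Hypothetical Markov chain x0 -> x1 -> x2 on binary variables.
p0 = {0: 0.4, 1: 0.6}
p1 = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}   # P(x1 | x0)
p2 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # P(x2 | x1)
joint = {(a, b, c): p0[a] * p1[a][b] * p2[b][c]
         for a, b, c in product((0, 1), repeat=3)}

parents = recursive_basis(joint, 3)
arcs = {(s, r) for r, pi in parents.items() for s in pi}
```

The search is exponential in the number of predecessors, which is acceptable only for toy examples like this one; the generated arcs recover the chain structure used to build the distribution.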

The following theorems are from Geiger et al. (1990, Theorems 1 and 2). First, an independence statement $(C \perp D \mid E)$ is a semantic consequence (with respect to a class of dependency models $\mathcal{M}$, e.g., those that satisfy the graphoid axioms) of a set $B$ of such statements if $(C \perp D \mid E)$ holds in every dependency model that satisfies $B$; i.e., $(C \perp D \mid E) \in M$ for all $M$ such that $B \subseteq M \in \mathcal{M}$.

Theorem 1 (soundness) If $M$ is a graphoid and $B$ is any recursive basis drawn from $M$, then the DAG generated by $B$ is an I-map of $M$.

So, given $(\Omega, \mathcal{F}, \mu)$, the DAG $G$ constructed in the fashion outlined above is an I-map of $M_\mu$. That is, every independence statement implied by $(N, \to)$ under d-separation corresponds to a valid $\mu$-conditional independency.

Theorem 2 (closure) Let $G$ be a DAG generated by a recursive basis $B$. Then $M_G$, the dependency model generated by $G$, is exactly the closure of $B$ under the graphoid axioms.

Following Pearl (2000, p. 16-20), two DAGs $(N, \to)$ and $(N, \to')$ are said to be observationally equivalent if every probability distribution that can be factored in accordance with the recursive basis $B_{(N, \to)} \equiv \{(\{r\} \perp U_r \setminus \pi_r \mid \pi_r) \mid r \in N\}$ can also be factored in accordance with $B_{(N, \to')} \equiv \{(\{r\} \perp U'_r \setminus \pi'_r \mid \pi'_r) \mid r \in N\}$. The following theorem is from Pearl (2000, Theorem 1.2.8). It originally appears in Verma and Pearl (1990, Theorem 1) and is generalized by Andersson et al. (1997, Theorems B.1 and 2.1).

Theorem 3 Two DAGs $(N, \to)$ and $(N, \to')$ are observationally equivalent if and only if $E = E'$ and $S = S'$.

Finally, when $(\Omega, \mathcal{F}, \mu)$ has product structure, the following result is useful.

Theorem 4 Given a DAG $G$ and a probability space $(\Omega, \mathcal{F}, \mu)$ in which $\mu$ is a product measure, the following conditions are equivalent:

(DF) $\mu$ admits recursive factorization according to $G$;

(DG) d-separation in $G$ implies conditional independence according to $\mu$: $(C \perp D \mid E)_G \Rightarrow (C \perp D \mid E)_\mu$;

(DL) $\mu$ obeys the local Markov property relative to $G$;

(DO) $\mu$ obeys the ordered Markov property relative to $G$.

Proof. This follows from Theorem 5.14, p. 74, in Cowell, Dawid, Lauritzen and Spiegelhalter (1999). In our case, $\mu_\sigma$ is a product measure and $(N, \to)$ is directed and acyclic. Hence, by Proposition 1, condition "DF" is satisfied, thereby implying the other three conditions stated above.

B Selected proofs

B.1 Additional notation used in the proofs

$$H_I(z) \equiv \{ z' \mid \forall i \in I,\ \tilde a_i(z') = \tilde a_i(z) \}$$

where $I$ represents an index set. Usually we will use

$I = \pi_k$: the parent set of node $k$
$I = k$: shorthand for the index set $\{1, \ldots, k-1\}$
$I = k - i$: shorthand for the index set $\{1, \ldots, k-1\} \setminus \{i\}$
$I = k - \pi_k$: shorthand for the index set $\{1, \ldots, k-1\} \setminus \pi_k$

$$A_k(z) \equiv \{ z' \mid \tilde a_k(z') = \tilde a_k(z) \}$$

$$A_{k,j} \equiv \{ z' \mid \tilde a_k(z') = a_{k,j} \}$$

B.2 Lemma 2

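Lemma 5 Given $\sigma \in \Sigma$, $\forall z$, if $\mu_\sigma(H_k(z)) > 0$,

$$\mu_\sigma(H_k(z)) = \prod_{j=1}^{k-1} \tilde\sigma_j(z),$$

and

$$\sum_{z' \in H_k(z)} \left( \prod_{j=k}^{n} \tilde\sigma_j(z') \right) = 1.$$

Proof. Given $\sigma \in \Sigma$, consider $k = n$: $\forall z \in Z$, $\mu_\sigma(z) = \prod_{i=1}^{n} \tilde\sigma_i(z)$; let $\tilde X_n(z) = X_{n,k}$.

$$H_n(z) \equiv \{ z' \mid z' = (\tilde h_n(z), \tilde a_n(z')) \}$$

$$\begin{aligned}
\mu_\sigma(H_n(z)) &= \sum_{z' \in H_n(z)} \prod_{j=1}^{n} \tilde\sigma_j(z') \\
&= \sum_{z' \in H_n(z)} \prod_{j=1}^{n-1} \tilde\sigma_j(z)\, \tilde\sigma_n(z') \\
&= \prod_{j=1}^{n-1} \tilde\sigma_j(z) \sum_{z' \in H_n(z)} \tilde\sigma_n(z') \\
&= \prod_{j=1}^{n-1} \tilde\sigma_j(z) \sum_{j=1}^{l_n} \sigma^n_{k,j} = \prod_{j=1}^{n-1} \tilde\sigma_j(z),
\end{aligned}$$

as, by the definition of a strategy, $\sum_{j=1}^{l_n} \sigma^n_{k,j} = 1$.

Consider $k = n - 1$. Then,

$$H_{n-1}(z) \equiv \{ z' \mid z' = (\tilde h_{n-1}(z), \tilde a_{n-1}(z'), \tilde a_n(z')) \}$$

$$\mu_\sigma(H_{n-1}(z)) = \sum_{z' \in H_{n-1}(z)} \prod_{j=1}^{n-2} \tilde\sigma_j(z)\, \tilde\sigma_{n-1}(z')\, \tilde\sigma_n(z')$$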

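For each $a_{n-1,k}$, $k = 1, \ldots, l_{n-1}$, let $z_k$ be any $z' \in H_{n-1}(z)$ such that $\tilde h_n(z_k) = (\tilde h_{n-1}(z), a_{n-1,k})$. Then, using the same arguments as for $k = n$ we get

$$\begin{aligned}
\mu_\sigma(H_{n-1}(z)) &= \sum_{k=1}^{l_{n-1}} \sum_{z' \in H_n(z_k)} \prod_{j=1}^{n-2} \tilde\sigma_j(z)\, \tilde\sigma_{n-1}(z_k)\, \tilde\sigma_n(z') \\
&= \sum_{k=1}^{l_{n-1}} \prod_{j=1}^{n-2} \tilde\sigma_j(z)\, \tilde\sigma_{n-1}(z_k) \sum_{z' \in H_n(z_k)} \tilde\sigma_n(z') \\
&= \sum_{k=1}^{l_{n-1}} \prod_{j=1}^{n-2} \tilde\sigma_j(z)\, \tilde\sigma_{n-1}(z_k) = \prod_{j=1}^{n-2} \tilde\sigma_j(z) \sum_{k=1}^{l_{n-1}} \tilde\sigma_{n-1}(z_k) \\
&= \prod_{j=1}^{n-2} \tilde\sigma_j(z).
\end{aligned}$$

Then, by recursively repeating the above arguments,

$$\mu_\sigma(H_k(z)) = \prod_{j=1}^{k-1} \tilde\sigma_j(z).$$

The second equation then follows from

$$\begin{aligned}
\mu_\sigma(H_k(z)) &= \sum_{z' \in H_k(z)} \prod_{j=1}^{n} \tilde\sigma_j(z') \\
&= \prod_{j=1}^{k-1} \tilde\sigma_j(z) \sum_{z' \in H_k(z)} \prod_{j=k}^{n} \tilde\sigma_j(z').
\end{aligned}$$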

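Proof of Lemma 2

Proof. Given $\sigma \in \Sigma$, for all $z$ and $k$ such that $\mu_\sigma(H_k(z)) > 0$,

$$\begin{aligned}
\mu_\sigma(\tilde a_k \mid \tilde h_k)(z) &= \frac{\mu_\sigma(H_k(z) \cap A_k(z))}{\mu_\sigma(H_k(z))} = \frac{\mu_\sigma(H_{k+1}(z))}{\mu_\sigma(H_k(z))} \\
&= \frac{\prod_{j=1}^{k} \tilde\sigma_j(z)}{\prod_{j=1}^{k-1} \tilde\sigma_j(z)} = \tilde\sigma_k(z).
\end{aligned}$$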

B.3 Lemma 3

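Proof. ($\Rightarrow$) $i \not\to k$ implies either (a) $i > k$; or (b) $i < k$ and $\forall z \in Z$ and $z' \in \tilde h^{-1}_{k \setminus i}(\tilde h_{k \setminus i}(z))$, $\tilde X_k(z) = \tilde X_k(z')$; or (c) $i < k$ and $\exists z \in Z$ and $z' \in \tilde h^{-1}_{k \setminus i}(\tilde h_{k \setminus i}(z))$, $\tilde X_k(z) \neq \tilde X_k(z')$ and $\#A_k = 1$.

Case (a): $i > k \Rightarrow i \notin \{1, \ldots, k-1\}$, so that $\tilde h_k(z) = \tilde h_{k \setminus i}(z)$, so that $\forall \sigma \in \Sigma$,

$$\mu_\sigma(\tilde a_k \mid \tilde h_k)(z) = \mu_\sigma(\tilde a_k \mid \tilde h_{k \setminus i})(z).$$

Case (b): $i < k$ and $\forall z \in Z$ and $z' \in \tilde h^{-1}_{k \setminus i}(\tilde h_{k \setminus i}(z))$, $\tilde X_k(z) = \tilde X_k(z')$ implies $\forall z' \in H_{k-i}(z) \cap A_k(z) \subseteq H_{k-i}(z)$, $\tilde\sigma_k(z) = \tilde\sigma_k(z')$. By Lemma 2,

$$\begin{aligned}
&\forall z' \in H_{k-i}(z), \quad \mu_\sigma(\tilde a_k \mid \tilde h_k)(z') = \tilde\sigma_k(z') = \tilde\sigma_k(z) \\
&\Rightarrow \mu_\sigma(H_{k+1}(z')) = \mu_\sigma(H_k(z'))\, \tilde\sigma_k(z).
\end{aligned}$$

Let $H$ denote the set of all possible $k-1$ length histories, such that

$$H \equiv \{ (a_1, \ldots, a_{k-1}) \mid \exists z' \in H_{k-i}(z),\ \tilde h_k(z') = (a_1, \ldots, a_{k-1}) \},$$

and $z^{-1}(h)$ some arbitrary $z' \in H_{k-i}(z)$ such that $\tilde h_k(z') = h$. Then,

$$\begin{aligned}
\mu_\sigma(H_{k-i}(z)) &= \sum_{h \in H} \mu_\sigma(h) \\
\mu_\sigma(H_{k-i}(z) \cap A_k(z)) &= \sum_{h \in H} \mu_\sigma(h)\, \tilde\sigma_k(z^{-1}(h)) \\
\mu_\sigma(\tilde a_k \mid \tilde h_{k-i})(z) &= \frac{\mu_\sigma(H_{k-i}(z) \cap A_k(z))}{\mu_\sigma(H_{k-i}(z))} \\
\Rightarrow \mu_\sigma(\tilde a_k \mid \tilde h_{k-i})(z) &= \frac{\sum_{h \in H} \mu_\sigma(h)\, \tilde\sigma_k(z^{-1}(h))}{\sum_{h \in H} \mu_\sigma(h)} = \frac{\tilde\sigma_k(z) \sum_{h \in H} \mu_\sigma(h)}{\sum_{h \in H} \mu_\sigma(h)} \\
&= \tilde\sigma_k(z) = \mu_\sigma(\tilde a_k \mid \tilde h_k)(z).
\end{aligned}$$

Case (c): If $\#A_k = 1$ then $\forall z \in Z$, $\tilde\sigma_k(z) = 1$, so that for all $i$, $\mu_\sigma(\tilde a_k \mid \tilde h_{k-i})(z) = \mu_\sigma(\tilde a_k \mid \tilde h_k)(z) = 1$.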

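($\Leftarrow$) Consider the possibility that there exist $z$ and $z'$ such that $z' \in \tilde h^{-1}_{k \setminus i}(\tilde h_{k \setminus i}(z))$, $i < k$ and $\#A_k > 1$, and $X_{k,1} \equiv \tilde X_k(z) \neq \tilde X_k(z') \equiv X_{k,2}$. Note that this implies $\tilde a_j(z') = \tilde a_j(z)$ for $j \in \{1, \ldots, k-1\} \setminus \{i\}$ and $\tilde a_i(z') \neq \tilde a_i(z)$. Suppose the condition of the Lemma holds: for all $\sigma \in \Sigma$, $(\tilde a_k \perp \tilde a_i \mid \tilde h_{k \setminus i})_\sigma$. Consider the following strategy: $\tilde\sigma_j(z) = 1$ for $j \in \{1, \ldots, k-1\} \setminus \{i\}$, $\tilde\sigma_i(z) = 1/4$, $\tilde\sigma_k(z) = 1/2$, $\tilde\sigma_i(z') = 3/4$. As $X_{k,1} \neq X_{k,2}$ we can choose $\tilde\sigma_k(z') \equiv \sigma^k_{2,r}$, where $r \in \{1, \ldots, l_k\}$, freely. Without loss of generality let $\tilde a_k(z) = a_1$, and let $\sigma^k_{2,1} = \alpha \in [0, 1]$. We have:

$$\begin{aligned}
&\mu_\sigma(H_k(z)) = 1/4, \quad \mu_\sigma(H_k(z')) = 3/4 \\
&\mu_\sigma(H_{k-i}(z')) = \mu_\sigma(H_{k-i}(z)) = 1 \\
&\mu_\sigma(a_1 \mid \tilde h_k)(z) = 1/2, \quad \mu_\sigma(a_1 \mid \tilde h_k)(z') = \alpha \\
&\mu_\sigma(a_1 \mid \tilde h_{k-i})(z) = \frac{(1/4)(1/2) + (3/4)\,\alpha}{1} = \frac{1}{8}(1 + 6\alpha) = \mu_\sigma(a_1 \mid \tilde h_k)(z) = \frac{1}{2}
\end{aligned}$$

According to the condition of the Lemma this implies that $\alpha = 1/2$, but $\alpha$ can take any value in $[0, 1]$: a contradiction.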

B.4 Lemma 4

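Proof. Given Lemma 3 and the proof of $i \not\to k \Rightarrow \forall \sigma \in \Sigma$, $(\tilde a_i \perp \tilde a_k \mid \tilde h_{k \setminus i})_\sigma$. If $\#A_k = 1$, let $A_k = \{a\}$; then it is trivially true that for all $i, j < k$, $\mu_\sigma(a \mid \tilde h_{k-i-j}) = 1$. If $\#A_k > 1$, what we now need to show is that $\forall z' \in H_{\pi_k}(z)$, $\tilde X_k(z') = \tilde X_k(z)$. It suffices to show that for any pair $i, j$ such that $i, j \in \{1, \ldots, k-1\} \setminus \pi_k$, $\forall z' \in H_{k-i-j}(z)$, $\tilde X_k(z') = \tilde X_k(z)$. From Lemma 3 we have that $\forall z' \in H_{k-i}(z) \cup H_{k-j}(z)$, $\tilde X_k(z') = \tilde X_k(z)$, but $H_{k-i}(z) \cup H_{k-j}(z)$ is (generically) a strict subset of $H_{k-i-j}(z)$. Fortunately, the definition of $i \not\to k$ applies to all of $Z$, not just to a particular $z$. This means that for any $z' \in H_{k-i}(z)$, it is true that $\forall z'' \in H_{k-j}(z')$, $\tilde X_k(z'') = \tilde X_k(z') = \tilde X_k(z)$, and for any $z^* \in H_{k-j}(z)$, it is true that $\forall z''' \in H_{k-i}(z^*)$, $\tilde X_k(z''') = \tilde X_k(z^*) = \tilde X_k(z)$, so that our desired result holds, namely, for any pair $i, j$ such that $i, j \in \{1, \ldots, k-1\} \setminus \pi_k$, $\forall z' \in H_{k-i-j}(z)$, $\tilde X_k(z') = \tilde X_k(z)$, and hence, $\forall z' \in H_{k-\pi_k}(z)$, $\tilde X_k(z') = \tilde X_k(z)$. Applying the logic used in the proof of Lemma 3:

$$\begin{aligned}
&\forall z' \in H_{k-\pi_k}(z), \quad \mu_\sigma(\tilde a_k \mid \tilde h_k)(z') = \tilde\sigma_k(z) \\
&\Rightarrow \mu_\sigma(H_{k+1}(z')) = \mu_\sigma(H_k(z'))\, \tilde\sigma_k(z).
\end{aligned}$$

Let $H$ denote the set of all possible $k-1$ length histories, such that

$$H \equiv \{ (a_1, \ldots, a_{k-1}) \mid \exists z' \in H_{k-\pi_k}(z),\ \tilde h_k(z') = (a_1, \ldots, a_{k-1}) \},$$

and $z^{-1}(h)$ some arbitrary $z' \in H_{k-\pi_k}(z)$ such that $\tilde h_k(z') = h$. Then,

$$\begin{aligned}
\mu_\sigma(H_{k-\pi_k}(z)) &= \sum_{h \in H} \mu_\sigma(h) \\
\mu_\sigma(H_{k-\pi_k}(z) \cap A_k(z)) &= \sum_{h \in H} \mu_\sigma(h)\, \tilde\sigma_k(z^{-1}(h)) \\
\mu_\sigma(\tilde a_k \mid \tilde h_{k-\pi_k})(z) &= \frac{\mu_\sigma(H_{k-\pi_k}(z) \cap A_k(z))}{\mu_\sigma(H_{k-\pi_k}(z))} \\
\Rightarrow \mu_\sigma(\tilde a_k \mid \tilde h_{k-\pi_k})(z) &= \frac{\sum_{h \in H} \mu_\sigma(h)\, \tilde\sigma_k(z^{-1}(h))}{\sum_{h \in H} \mu_\sigma(h)} = \frac{\tilde\sigma_k(z) \sum_{h \in H} \mu_\sigma(h)}{\sum_{h \in H} \mu_\sigma(h)} \\
&= \tilde\sigma_k(z) = \mu_\sigma(\tilde a_k \mid \tilde h_k)(z).
\end{aligned}$$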

References

[1] Andersson, S. A., Madigan, D., Perlman, M. D., 1997. A Characterization of Markov Equivalence Classes for Acyclic Digraphs. The Annals of Statistics, 25(2), 505-541.

[2] Armbruster, W., Boge, W., 1979. Bayesian game theory, in: Moeschlin, O., Pallaschke, D. (Eds.), Game Theory and Related Topics. North-Holland, Amsterdam, pp. 17-28.

[3] Aumann, R. J., 1964. Mixed and behavior strategies in infinite extensive games, in: Dresher, M., Shapley, L. S., Tucker, A. W. (Eds.), Advances in Game Theory. Princeton University Press, Princeton, pp. 627-650.

[4] Bollobas, B., 1998. Modern Graph Theory. Springer, New York.

[5] Chakrabarti, S. K., 1999. Finite and Infinite Action Dynamic Games with Imperfect Information. Journal of Mathematical Economics, 32(2), 243-66.

[6] Cowell, R. G., Dawid, A. P., Lauritzen, S. L., Spiegelhalter, D. J., 1999. Probabilistic Networks and Expert Systems. Springer, New York.

[7] Elmes, S., Reny, P., 1994. On the strategic equivalence of extensive form games. Journal of Economic Theory, 62(1), 1-23.

[8] Fristedt, B., Gray, L., 1997. A Modern Approach to Probability Theory. Birkhäuser, Boston.

[9] Geiger, D., Verma, T. S., Pearl, J., 1990. Identifying independence in Bayesian networks. Networks, 20, 507-34.

[10] Harsanyi, J., 1967-68. Games with incomplete information played by "Bayesian" players. Management Science, 14, Parts I-III, 159-82, 320-34, 486-502.

40

[11] Koller, D. Milch, B., 2002. Multi-agent in�uence diagrams for representing and

solving games. Seventeenth International Joint Conference on Arti�cial Intelli-

gence, Seattle, WA.

[12] La Mura, P., 2002. Game Networks. Unpublished manuscript.

[13] Mackey, G. W., 1957. Borel structures in groups and their duals. Trans. Amer.

Math. Soc., 85, 283-95.

[14] Meek, C., 1995. Strong completeness and faithfulness in Bayesian networks, Pro-

ceedings of Eleventh Conference on Uncertainty in Arti�cial Intelligence, Mon-

treal, QU, 411-418.

[15] Milgrom, Paul R., Weber, Robert J., 1985. Distributional Strategies for Games

with Incomplete Information. Mathematics of Operations Research, 10 (4), 619-

33.

[16] Pearl, J., 1986. Fusion, propagation and structuring in belief networks. Arti�cial

Intelligence 29, 241-88.

[17] � � , 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible

Inference. North Holland, Amsterdam.

[18] � � , 2000. Causality: Models, Reasoning, and Inference. Cambridge University

Press.

[19] � � , Paz, A., 1985. Graphoids: A graph based logic for reasoning about

relevance relations. Tech rep 850083 (R-53-L), Cognitive Systems Laboratory,

UCLA.

[20] Ritzberger, K., 1999. Recall in Extensive Form Games. International Journal of

Game Theory, 28(1), 69-87.

41

[21] Spirtes, Peter, Glymour, Clark and Scheines, Richard, 1993. Causation, Predic-

tion, and Search. Springer-Verlag.

[22] Spohn, W., 1980. Stochastic independence, causal independence and shieldabil-

ity. J. Phil. Logic, 9, 73-99.

[23] Thompson, F. B., 1952. Equivalence of games in extensive form. The Rand Cor-

poration, Research Memorandum 759.

[24] Verma, T. S., Pearl, J., 1990. Equivalence and synthesis of causal models, in:

Proceedings of the 6th Conference on Uncertainty in Arti�cial Intelligence. Cam-

bridge, pp. 220-7. Reprinted in: Bonissone, P., Henrion, M., Kanal, L. N., Lem-

mer, J. F. (Eds.), Uncertainty in Arti�cial Intelligence, vol. 6, 255-68.

42
