Game, set and match - HEC

Copyrigh t © 2004 HEC Montréal.

Tous droits réservés pour tous pays. Toute traduction ou toute reproduction sous quelque forme que ce soit est interdite.

Les textes publiés dans la série des Cahiers de recherche HEC n'engagent que la responsabilité de leurs auteurs.

La publication de ce Cahier de recherche a été rendue possible grâce à des subventions d'aide à la publication et à la diffusion

de la recherche provenant des fonds de l'École des HEC.

Direction de la recherche, HEC Montréal, 3000, chemin de la Côte-Sainte-Catherine, Montréal (Québec) Canada H3T 2A7.

SubGame, set and match. IdentifyingIncentive Response in a Tournament

by Andrew LEACH

Cahier de recherche no IEA-04-02February 2004

ISSN : 0825-8643

SubGame, set and match

Identifying Incentive Response in a Tournament

Andrew Leach∗

HEC Montreal†

February 1, 2003

Abstract

Data from Association de Tennis Professionel (ATP) championship tennistournament finals are used to test for strategic behavior of players and theirresponses to incentives. Tennis provides a rich environment for the study of in-centive response because of the individual nature of the sport, and the clearlydefined tournament structure. The parameters of a sequential game modelare estimated and, controlling for measured ability differences, the existence ofstrategic decision making where players’ efforts vary depending on the state ofthe match is tested against the alternative that players do not alter their effortin response to incentives.

JEL classification: C7, C15, C63

Keywords: Tournaments, Sequential games, Incentive Response, Nash Equi-librium

∗Helpful comments from Christopher Ferrall, John Knowles, Pierre-Thomas Leger, C. RobertClark and Shannon Seitz are gratefully acknowledged, as are comments from seminar participantsat Queen’s University, the Second Annual Conference on Numerically Intensive Policy Analysis, 2001and the Canadian Economics Association Meetings, 2002. Data for this project were obtained fromKevin Gale. Address communications to the author at [email protected] or Institut d’economieappliquee, HEC Montreal, 3000, chemin de la Cote-Sainte-Catherine, Montreal (Quebec) H3T 2A7.

†Affiliated with the University of Montreal

1 Introduction

This paper develops and estimates a structural model of agent interaction in a tour-

nament to provide evidence of the type of alteration in effort levels in response to

incentive structures suggested in Lazear and Rosen (1981). Tournaments are used as

contracts in the labour market, and the theory predicts that the agents facing these

incentives will alter their effort accordingly. Clearly, if agents do not perceive these

incentives or react to them, we may be using inefficient labour contracts. This paper

asks the question of whether agents alter their effort levels conditional on incentives,

and is able to conclude that they do.

A tennis tournament final presents an excellent medium to study the incentive

structure present in a rank-order tournament. A tennis tournament begins with as

many as n=128 players, with each round of the tournament consisting of n/2 pairwise

matches where the loser is eliminated from the tournament, and receives a payoff

depending on the round in which the match occurred. Each match is itself a nested

tournament, with a best 2 out of 3 structure. Further to that, each set and game

within the match form best of 11 and best of 7 tournaments respectively. Given

the extensive sequential game which is described above, it is possible to identify the

actions of agents in response to strategic incentives, as suggested by the models of

Rosen (1986) and Lazear and Rosen (1981). The final match is of particular interest

since the final rewards are clearly known to both players, and additional assumptions

on tournament structure are not required.

The Lazear and Rosen (1981) model focuses on a one-shot game where players’

efforts are positively but not perfectly correlated with performance, and rewards are

2

assigned based on relative performance or output. In a simple, two-player game

performance is measured generally as:

q = x + e, (1)

where x is the action or effort taken by a player, and e is a random component. Effort

is costly, and the cost is determined by C(x) with C ′(x) > 0 and C ′′(x) > 0. Players

are rewarded for performance with one of two prizes, W1 or W2 with W1 > W2, and

the larger prize going to the player with the best (highest) performance q. Therefore,

in the one shot game, if we define P as the conditional probability of winning, the

expected payoff to player i is:

P [W1 − C(xi)] + (1 − P )[W2 − C(xi)] = P [W1 − W2] + W2 − C(xi), (2)

which leads to the first order condition for player i’s actions given by:

∂P

∂xi

[W1 − W2] − C ′(xi) = 0. (3)

This first order condition has two important interpretations. First, the effort level is

not determined by the size of the rewards, only by the difference between the rewards

for winning and losing; second, the effort level will be influenced by the relative effects

on cost and winning probability of changes in effort level, where the player sets the

marginal change in expected reward equal to the marginal effort cost.

A tennis tournament final is clearly parallel in setup to the theoretical paradigms

laid down in Lazaer and Rosen (1981), with two individuals playing a match with a

clearly defined reward structure. The propositions of Lazear and Rosen (1981) that

agents will alter effort levels to suit rewards at each node in a tournament can be tested

3

by imposing subgame perfect Nash equilibrium (SPNE) structure on the model. The

model in this paper is derived from Ferrall and Smith (1999)(FS), and we confront the

model with data from professional tennis tournaments. The results provide strong

empirical evidence for the type of incentive response proposed by Lazear and Rosen

(1981).

It is important to emphasize first why sports data and second why tennis data

may provide a useful experiment in incentive response. In a recent article, Ferrall

and Smith (1999) use professional sports championship series data to test for the

existence of incentive response in a tournament setting. FS use data from hockey,

basketball and baseball; games which may violate the key assumptions of a Lazear

and Rosen (1981) tournament model. Assumptions are violated in the sense that the

reward structure is not clearly defined, nor can they separate the strategic actions of

individuals from the actions of teams as a whole. Consider the example of a hockey

tournament. The outcome of the game is determined by the actions of the teams as

a whole, and teams arguably face a series of incentives at each stage of the series.

Beneath this, the individual players are those who make the effort decisions within

the game, and the model presented in FS does not address the specific incentives

facing players or the resulting actions. It is also less clear in hockey that the rewards

do not vary with the length of the series, particularly at the team level. The best

case scenario for any team is to win the Stanley Cup through winning 4 series that all

go to the maximum number of games, which maximizes television and gate revenue.

This is inconsistent with the basic dynamics of the tournament model, where each

additional round produces only further cost, but no change in benefits. As discussed

4

above, tennis is not subject to these fundamental differences in comparison with the

theory of Lazear and Rosen, and thus can be used to further test the implication of

the use of tournaments to stimulate effort in labour market contracts.

The difficulty with estimating the magnitude of strategic effects in the decisions

is in the identification of the reward structure. For example, in a tennis match, we

know that the winner of the match earns a certain prize, and the loser of a match

wins a certain prize, but what is the reward to winning the third game to go up 3-0

in the second set, after having won the first set? Or for that matter, in any other

game? With the exception of the final game, with the winner of that game winning

the match, all other games have reward structures that depend on the result of future

games, and thus are not clearly observable. The structure introduced in this paper

provides a method for identifying the rewards at each node, and thus the response

to incentives through an entire match. In the context of an extensive form game, the

rewards at each terminal node are known to the participants, and the rewards at each

non-terminal node are endogenous functions of the rewards at game nodes reached

by winning or losing the current game.

The remainder of the paper proceeds as follows. Section 2 presents the SPNE

model of a tournament. Section 3 presents the econometric approach to estimating

the parameters of the model, and Sections 4 and 5 present the data and the results

of the analysis. Section 6 discusses the results and concludes the paper.

5

2 The Model

This paper models the finals of a tennis tournament using a SPNE characterization.

In order to solve a subgame perfect equilibrium, the Nash solution is employed, as

outlined in Rosen (1986) and Osborne and Rubinstein (1994). In this solution, the

Nash equilibria of the static games are solved, and the recursive sequence of Nash

equilibria is a SPNE of the extensive form game. The advantages of this approach

are that the equilibrium is reasonably easy to calculate, and is robust. (Ferrall and

Smith, 1999)

2.1 The Final Match

The final match is characterized by two players playing a sequence of points. The

first player to win four points is the winner of a game. In games and sets, the analysis

abstracts from tie breakers, treating them as any other one shot game.1 Similarly, a

player must win six games in order to win the set, and two sets to win the match.2

This model looks at strategic decisions made by players at the level of individual

games. The outcome of a game is endogenous and stochastic with probabilities of

winning based on effort, as well as a stochastic luck component which is independently

and identically distributed over games in the match.3

Let the two players be indexed as a and b. Figures 4 and 5 show the game trees

1Tie-breakers are present in each level except in the 1 set each tie of a tennis match, but in thispaper, they are considered as the extensive form of the final game. For example, at 40-40, a gameis at deuce, and a player must win two points in order to win the game. In this characterization,the entire deuce ‘point’ is considered as a single game, or node in the subgame, and so on for settie-breakers.

2The ATP tour uses best of 3 set matches. For the remainder of this paper, and in the data usedto estimate the model, this match structure is assumed.

3FS refer to a luck component as ”the bounce of the ball” which is particularly applicable here.

6

for a set and a match respectively. The number of games in a set is variable, but is

bounded by 6 ≤ gi ≤ 11. Where N=11, the eleventh ‘game’ is, as discussed above, a

tie-breaker which is required if and only if the score reaches 5-5. If a player wins a

game that reaches 5-5, this is denoted as a 6-5 win regardless of the number of games

played after 5-5. Sets are nested within the sequential game nodes of a match. A

match is an N=3 tournament with a player having to win 2 sets in order to win. The

state of the match is therefore denoted by the 6x1 vector {g1a, g1b, g2a, g2b, g3a, g3b}where gji denotes the number of games won by player i in set j. The total number

of games played is not considered in the equilibrium, only the number of sets, j,

which can be backed out as j =∑3

i=1(gia + gib ≥ 1), where the brackets represent an

indicator function equal to 1 if the argument is true. The state space is defined as:

ω ∈ {g1a, g1b, g2a, g2b, g3a, g3b : 0 ≤ gji ≤ 6, min{gji} < 6, i ∈ {a, b}, j ∈ {1, 2, 3}} . (4)

The state vector ω yields all the information which is considered by players in the

model. The actual length of the match is also endogenous and stochastic, but is

bounded at 3 sets of a maximum of 11 games each, therefore, the entire match cannot

consist of more than 33 games. The transition of the state space subject to the results

of a game at state ω depends on the score of the set in play. If either player reaches 6

games, the number of sets won by that player increases by one. If that is the second

set won, the match is won by that player. Otherwise, the set score is updated, and

the game score resets to 0,0. If the game being played is not a set or match point,

the score of the set is updated to reflect the winner.

During each game in the match, players make strategic decisions xiω, which is

the amount of costly effort devoted to the play of the game at state ω. The score

7

differential d of the game is stochastic, but depends positively on own effort and

negatively on the effort of the other player, and on a random component ε.

d = (xaω − xbω) + εω (5)

ε ∼ F (0, σ2ε ).

The random term ε is independently and identically distributed across all games, is

distributed according to a cumulative distribution function (CDF) F (ε) and corre-

sponding probability density function (PDF) f(ε). The random term in (5) represents

unpredictable events which effect the outcome of the game. The score differential is

a complete picture of the outcome of the game, and it is assumed that players do

not take into account any signal from the size of the score differential in their future

strategies. We can thus denote the equilibrium(denoted by ∗) probability of victory

for players a and b, conditional on strategies, as:

Paω(x∗aω, x∗

bω) = Prob(d∗ω > 0) = Prob(x∗

aω − x∗bω) > −εω (6)

= 1 − F (−(x∗aω − x∗

bω))

= 1 − Pbω(x∗aω, x∗

bω),

where the F distribution is assumed to be normal with mean 0 and variance σ2 = 1

for the remainder of the paper. Additionally, games in which both players put out

no effort (ie. both give up) are toss ups. The symmetric marginal effect of strategy

choice is given by:

∂Paω(xaω, xbω)

∂xaω

= f(−(xaω − xbω)). (7)

8

Players’ observed ability is denoted by δiω, and is constant over the match. The

ability vector of each player is common knowledge. The ability of players influences

the cost of choosing effort level, according to a cost function.

The cost of effort function is associated with strategy or effort choices xiω, and is

denoted ciω(xiω). The function is increasing and convex in strategy choice, so that

both c′ and c′′ > 0. The cost of effort function is constant over the match. Individual

cost at each node depends on the ability of the individual players exponentially, as

well as on the aggressiveness of the strategy chosen, and assumed to be given by:

ciω(xiω) = e−δiω

r exiω

r , (8)

where r is a constant. The parameter r determines the convexity of the effort cost

function with respect to the players’ strategy choices. Its interpretation is important

in understanding how much agents will attempt to influence outcomes given different

strategic incentives. As r tends to zero, the marginal cost of effort above ability

level is infinite and the marginal cost of effort below ability level is zero, so players

will tend to play their abilities. More importantly for this paper, if the parameter r

estimated to be close to 0, it means that we are seeing no adjustment in behaviour

in response to incentives. This interpretation is clarified in Proposition 1. The cost

function is parameterized such that players with higher abilities (δ) will expend less

effort cost to play the same strategy x in a game. Effort costs are therefore decreasing

exponentially in ability and increasing exponentially in aggressiveness, conditional on

the convexity parameter r. The cost function is not able to capture explicitly the

cost of diverse strategies within the game of tennis, but rather it is meant to capture

the amount to which effort levels are changed in response to incentives. Costs are

9

independent from point to point, so it is not more costly to exert effort after having

recently played a high-effort point.

For further intuition on the role of the parameter r, consider Figures 1-3. Figure 1

shows the relative costs of increasing effort by 10% for different value of the parameter

r. Figure 2 shows, for a value of r in the estimated range, the costs of effort above

and below ability (.2). Finally, 3 shows the shape of the cost function as r limits to 0.

The marginal cost of effort below ability is 0, while playing any strategy above ability

level will have infinite cost. Using the structure of subgame perfect equilibrium to

identify reward structures at each node of the match, and assuming that players set

marginal benefit equal to marginal cost, we are able to identify the shape of this cost

function which is best supported by the data through estimation of the parameter r.

Because of the structure of the cost function, it is necessary to allow players to

’give up’, and play a strategy of xiω = −∞. In the case where the expected rewards

to winning a game are non-positive for any strategy xiω, players will not wish to

incur any costs, and thus must be able to set ciω(xiω) = 0, which is only possible at

xiω = −∞.

Players are risk neutral, and therefore act to maximize their expected net pay-

offs in the match/tournament. Payoffs are defined by the terminal values of the

match/tournament, which are common knowledge.4 The terminal values of the match

are the payoffs for winning or losing, less the expended costs over all of the games

played. Since the match length is stochastic and endogenous, the costs borne by a

4It may be true that the payoffs to winning a tournament include a change in world rankingwhich brings invitations to other tournaments and higher seeding therein. This is abstracted fromin this paper.

10

player vary depending on the path of the match, not just on the final score. Using

the notation of ωi as the state of the match reached when player i wins at state ω5,

the values to winning and losing at any state ω are given by:

{Vi[ωi] −

∑ω∈ω

caω(xaω), Vi[ω−i] −∑ω∈ω

caω(xaω)

}(9)

where ω is the set of states reached in the preceding games played in the match.

2.2 Nash Equilibrium in a Single Game

In order to clarify the SPNE concept in a tennis match, consider the Nash equilibrium

in the one shot game played in the third set of the final, with the set tied at 5-5, so the

game in question is the last game of the match, and the winner of the match wins the

tournament. The payoffs to the winner and loser of the game are very clearly defined

by the prize money available. Each player can choose a strategy xiω to influence the

outcome of the match, and can also choose the mixing probabilities between playing

that strategy and giving up entirely. A mixed-strategy Nash equilibrium is described

by the situation where neither player can achieve higher utility by deviating from

their equilibrium strategy given the equilibrium strategy of the other player. In order

determine the mixing probabilities in Nash equilibrium, it is often necessary to employ

a numerical methodology(see McKelvey, 1996 or Judd, 1998). In this case, much of

the computation can be saved using a pseudo-analytical solution. Equilibrium is

solved up to the solution to a single zero function, as given in FS.

5This notation will be employed throughout the paper, to maintain generality. Implicit in thisis a transition function were the state shifts to a new set when one player wins 6 games, and to theterminal state of the match which a player wins 2 sets

11

The player’s problem at the terminal node of the match is simply to maximize the

expected payoff to playing the game:

maxxiω

− ciω(xiω) + E [Piω(xiω, x−iω)∆Vi(ωi, ω−i) + Vi(ω−i)] i ∈ {a, b}. (10)

The expectation in (10) refers to the beliefs of player i for the actions of the other

player (-i) which determine the winning probability P, and ∆Vi(ωi, ω−i) refers to the

difference in values of winning the game (ωi) and losing it(ω−i). The values are known

to both players, they can be taken outside the expectation operator, and the player

is choosing effort to increase the probability of winning. Since each player receives

the value of losing in either case, and the difference between the winning and losing

value if he wins, the player’s problem in the one shot game reduces to:

maxxiω

− ciω(xiω) + E [Piω(xaω, xbω)] ∆Vi(ωi, ω−i) i ∈ {a, b}. (11)

In the extensive form game, the values at any non-terminal node are stochastic and

endogenous. However, since they are determined as functions of the Nash equilibria

at all possible future nodes, the expected payoffs are known to the players at each

node. Therefore the expectations operator is taken across both probabilities and the

value differential. As in FS, three indices are defined for the purposes of calculating

the Nash equilibrium.

12

incentive advantage : νω ≡ ln∆Va(ωa, ωb)

∆Vb(ωb, ωa)

ability advantage : δω ≡ δaω − δbω (12)

strategic advantage : ∆ω ≡ rνω + δω

The strategic advantage index in (12) incorporates both ability and incentive advan-

tage in a single index which is pivotal to the development of the Nash equilibrium.

In order to distinguish between the players, with respect to the sign of the score

differential, the index I is adapted such that:

It =

{1 if t = a

−1 if t = b. (13)

Adapted from the proofs of FS, the Nash equilibrium is described by the following

proposition:

Proposition 1 Under the given assumptions, the Nash equilibrium at any state ω ofthe match will be given by the solution to the following simultaneous equations:

xtω =

{r ln

(rγt′ωf(∆ω + rln

γt′ωγtω

)∆Vt(ωt, ω−t)eδtω/r

)with probability γtω

−∞ with probability (1 − γtω)

(14)for t ∈ {a, b}. Player t plays a pure strategy (γtω = 1) if:

γt′ω(−rf(∆ω + r ln γt′ω) + F (It∆ω + r ln γt′ω)) +(1 − γt′ω)

2≥ 0 (15)

Otherwise, γtω solves:

γt′ω(−rf(∆ω + r lnγt′ω

γtω

) + F (It∆ω + r lnγt′ω

γtω

)) + (1 − γt′ω)/2 = 0. (16)

13

Proof : See Appendix.

A sketch of the proof of the proposition provides the intuition for the equations

for both the equilibrium strategies and the mixing probabilities. First, the indirect

value of the pure strategy xtω is solved from the first order, necessary conditions for

the maximization of the objective function in (10). Given the value of the interior

solution, and the value of giving up, the indirect value of each of the pure strategies

available to player t are solved, conditional on the mixing probabilities employed by

t′ and the reward structure for the game. If the pure strategy dominates the mixed

strategy, player t will always play the pure strategy (15). If not, player t will choose

γtω such that the expected rewards in Nash equilibrium to each of the mixed strategies

are equal, and this solution is found in (16).

The solution to the Nash equilibrium provides an equilibrium winning probability

as a function of mixing probabilities and incentives. In Nash equilibrium, the winning

probability is described by:

Pω = prob(player a wins) = γaωγbωF (∆ω + r lnγbω

γaω

) (17)

+ (1 − γbω) +(1 − γaω)(1 − γbω)

2.

where the first term is the outcome when both players put out effort levels, and the

additional terms capture the toss-up probability when both players give up. The

algorithm for solving Nash equilibria at each node is described in detail in Section

3.1.

14

2.3 Subgame Perfect Equilibrium

In order to define the SPNE outcomes in the sets and matches, the final payoff

difference between winning and losing a match must be defined. As was shown in the

previous section, the Nash equilibrium at each node gives us the winning probabilities

at that point in the state space. Beginning from the terminal node in the third set,

backward induction allows the calculation of the Nash equilibrium payoffs at each

possible state within the match, which gives the robust SPNE as follows.

Assumption 1 Final payoffs to winning the match/tournament are known to theparticipants. In the final game of the tournament, the payoffs to winning and losingare given by the prize money differential for first and second place. Also, playerscare only about winning or losing and the total costs expended during play, and notexplicitly about the length of the match.

This assumption is analogous to that made in FS, however it may be more tenable

here. In a sports championship series, the rewards come largely from television and

gate revenues, which depend on the length of the series, although it is not clear that

the rewards accruing to the players vary with this length. In a tennis match, the

player is not likely to benefit in any way from a longer tournament/match.

Proposition 2 The SPNE of the match is completely defined by mixing probabilitiesγiω given in Proposition 1 for all elements of ω, and the values of being at each nodecan be defined recursively as the Nash equilibrium payoffs of the individual games ateach node, given the final prize money offered in the tournament. At any state ω,the value to player t of being at that node in the subgame is defined by the rewards towinning (ωt), losing (ω−t), and the winning probabilities defined in Nash equilibriumPtω.

15

Vi[ω] ≡ γtω∆Vt(ωt, ω−t)

[−rf(∆ω + r ln

γtω

γt′ω) + γt′ωF (It∆ω + r ln

γtω

γt′ω)

](18)

+1 − γtω

2[γt′ωVt[ω−t] + (1 − γt′ω)Vt[ωt]]

∆Vt(ωt, ω−t) = Vt[ωt] − Vt[ω−t] (19)

Equations in (18-19) give the value of each node in a set, given the terminalvalues in the match. Additional clarification for the values at the terminal nodes ofintermediate sets is required. For example, in the first set, the rewards to winningare defined as the value of the 0,0 node in the set which will be reached for eachresult (a wins or b wins). Formally, at state ω = (0, 0, gi, g−i)|gi = 5, ∀ g−i, ωi isdefined by Vi(ωi) where ωi = (1, 0, 0, 0). The final rewards to winning the second setclearly depend on the outcome of the first set, which is not immediately consistentwith backward induction. There are only two cases for the second set, either playerhaving won the first set, and the payoffs are then clearly defined as above. (See Figure5) In order to solve the SPNE, the 3rd set is solved, giving the value at the beginningof the third set. This value, combined with the values of winning and losing the matchprovide the complete payoff structures to solve the second sets given 1,0 and 0,1 setscores. Finally, the solution to both second set possibilities yields a value for eachplayer at the first node of the second set, conditional on winning or losing the firstset, which allows the first set of the match to be solved.

The SPNE for the match is therefore clearly defined and robust given final payoffsto winning and losing the match.

Proof: Backward Induction.

3 Econometric Specification

The notion that strategic incentives play a role in the determination of match out-

comes can be tested by constructing a test on the null hypothesis that the curvature

parameter r is equal to 0. In order to construct the empirical structural model, ad-

ditional assumptions must be made on the determination of ability. The empirical

structure of ability differences, and thus winning probabilities is given, conditional on

16

observable covariates X, as:

observed ability advantage (seed difference) : Xj ≡ Xaj − Xbj

residual ability advantage : αj ≡ αa − αb (20)

net ability advantage : δj = δaj − δbj = Xjβ |αa = αb.

The parameter α will not be separately identified in this analysis, since only the

relative difference in abilities matters to the outcome of the match in the model.

The parameter α would only be identifiable as a set of types or as individual ability

parameters for each player. Thus, the preliminaries of the model, the relative abilities

of the players and the rewards to winning and losing the match are defined by elements

in the data (X), and model parameters β. The remainder of the match can be solved

for score probabilities given the other parameters of the model, r and σ2, the variance

of the random effect distribution.

Three model specifications are considered, and the estimation results reported.

The first model specification considers only seed difference in the determination of

ability. The second model allows for the seed difference to enter into ability with

a quadratic relationship. The third model augments the second by adding WTA

ranking difference as a further indicator of ability. Tests of models which included

indicator variables for being seeded and quadratic terms for rank difference showed

no significant relationships.

Each of the models is estimated for both the limiting case probit model and the

full structural model.

17

3.1 Computational Subgame Perfect Nash Equilibrium

The Nash equilibrium is computed using the FS algorithm, and the indices defined

for Nash equilibrium as follows:

Algorithm 1 Computational Nash Equilibrium Algorithm

• Compute ∆ω, which indicates a strategic advantage to player a or b. Assign apure strategy to the player with strategic advantage.

• Check condition given in (15), and solve for other player’s mixing probabilitygiven the pure strategy of the opposition

• If there is no strategic advantage, each player will play the pure strategy Nashequilibrium

In equation (14), the parameters γ which define the mixed-strategy Nash equilib-

rium either follow immediately from (15) or can be solved as the unique solution to

the zero function implied by (16). Since we know that the mixing probabilities are

strictly positive and less than 1, a bracketing algorithm as outlined in Judd (1998)

produces a quick solution to the zero function relative to a non-linear approach.

Letting the realized state of a match be given by the scores in each of the 2 or 3

sets played in the match, ie the terminal state vector:

ω ∈ {g1a, g1b, g2a, g2b, g3a, g3b} . (21)

The probability of an observed sequence of set scores is the product of the proba-

bilities of the observed scores in each set, conditional on the state of the match when

the set is played. Within a set, the probability of a set ending in a particular score

18

is the sum of the path probabilities for each possible path ending in that score6. The

probability of i winning a set by a particular score is calculated as:

Algorithm 2 Path Probability Algorithm:

• begin with a matrix of winning probabilities at each game in a set, calculatedthrough the SPNE

• given the matrix of winning probabilities, calculate the probability of each possiblepath from 0,0 to each of the terminal nodes in the set

• The probability of a match ending 6-i is the sum of the individual probabilitiesof each path that ends in 6-i

The path probability algorithm provides probabilities of individual scores at each

node in the match, conditional on parameters r, σ, and β. Denoting ω as the terminal

state of the match, and Wj, j ∈ {1, 2, 3} as a set of indicator variables for set j being

won by player a, and Pjωg as the SPNE probability that player a wins the set at state

ω such that ω is the terminal state of a set. If the third set is not played, the indicator

s3 is set to 0. Thus, the probability of a sequence of set scores in a match is given by:

P ∗(ω, X|β, r, σ) =2∏

j=1

[Pjωg]Wj [1 − Pjωg]

1−Wj ∗([P3ωg]

W3 [1 − P3ω]1−W3

)s3

. (22)

3.2 Estimation

The estimation in this paper is undertaken by maximizing the log of the likelihood

function given in (22). Therefore, the continuity of P ∗ in the parameter space is

crucial. As in FS, as long as there is some probability that each player wins each

game, this continuity is achieved. This is realized by allowing the mixed strategy

6This is because the current data only provides information on the final score, and not on thesequence of individual game wins.

19

Nash equilibria at each node in the tournament. The structure of the subgame perfect

equilibrium model allows the identification of the parameter of interest, r, and the

construction of a confidence interval around the coefficient estimate. The results of

the estimation procedure are outlined in Section 5. The parameter r is constrained

to be in (0, +∞] in order to assure that the second order conditions will be satisfied

for all values of ∆ω, the strategic advantage. The maximum likelihood procedure was

undertaken constraining these values to be positive using the estimated parameter

r∗defined as:

r = er∗ (23)

(24)

Because of this characterization, hypothesis tests on the value of r∗ are not directly

informative. Confidence intervals around the implied parameter r are evaluated and

reported.

Denoting the vector of estimated parameters θ, such that θ = {β, r∗}, the log

likelihood for the sample of match finals is thus given by:

L(θ) ≡∑

i

ln (P ∗(ω, X|β, r∗)) . (25)

In the data, we can think of each match as a panel of observations, where we first sum

over the likelihood of individual elements in the panel, over the sets in each match in

(22). This gives us a likelihood for each element in the panel, then we sum up the log

likelihood to get the likelihood of the parameters conditional on the data as a whole.

In order to identify the standard errors of the parameter estimates, a single iteration

on Newton’s Method is used to yield the estimated Hessian matrix.

20

4 The Data

The data for this study are taken from Association de Tennis Professionel (ATP)

tournaments over the 1998 and 1999 seasons. The data set consists of set scores and

seeding information from 87 tournaments. The tournaments all feature best of 3 set

structures, and therefore do not include any of the ’Grand Slam’ tournaments, which

employ a best of 5 set structure. Each match is described by which round of the

tournament it is in (only finals in this case), the number of sets won by each player,

and the score in each set. The players’ respective seeds in the tournament and current

ATP World Ranking are also included.

Payoffs in tennis tournaments tend to be of linear structure, with payoffs doubling

at each successive round. However, since the absolute and not relative difference is

important to the outcome of each of the matches, actual prize money is included in

the data set.

Two assumptions were made in the data set to accommodate the assumptions of

the model. The first assumption is that tie-breakers are played as a one-shot game,

which is not the case in the data. In order to rectify this, games ending either as

7-5 or 7-6 were replaced as 6-5. This change affected 25% of the set scores, so it is

a major assumption. However, the outcomes of the sets are not altered, and it is

simplifies the state transitions significantly. The effort responses will still appear, it

will just be captured in the entire tiebreaker ‘game’.

The second assumption is that the seeds of all unseeded players are given by 9,

whereas seeded players are ranked 1-8. An additional parameter identifies a player

21

solely as being seeded in the second estimated model. This captures some of the

effect of this choice of 9 as a seed for all unseeded players. The ability level is allowed

to depend exponentially on seed, and tests for a dummy variable for being unseeded

showed no significant effect in any specification.

5 Results

5.1 Data Analysis

The data are summarized based on the winner of the match, so all statistics presented

referring to winner and loser are meant at the match level.

Table 1 details some of the characteristics of players in championship matches in

the data, while Table 2 summarizes the mean properties of the individual matches

which the model will attempt to replicate. The parameter of interest governs the

probability that players will give up when the match turns against them, and as the

value of r increases, we would expect players to give up earlier. In the data, we can

present some stylized facts on the outcome of matches conditional on earlier events

therein. We see that 58.14% of matches end after 2 sets, such that the winner of the

first set goes on to win the match. Furthermore, the winner of the first set wins 81.4%

(70/86) of the championship matches. The mean absolute score differences in each

of the sets are not significantly different at the 95% level, nor is the mean absolute

score difference statistically different conditional on the match ending after 2 sets or

continuing on to the third set. We are interested in the propensity of players to give

up earlier in the set or match, so Table 3 provides tabulations on the frequency of

different losing player scores per set. Simply put, as r increases, we would expect

22

to see higher probability mass at the lower scores, as players give up sooner in the

match. The stylized facts seem to suggest that there is little or no incentive response

occurring in the sense that players would give up earlier in the second set if they had

lost the first set. Since the data do not provide information on the sequence of scoring

in the sets, it is difficult to draw too many conclusions on the nature of r from the

data without the estimation of the structural model, as seen below in Section 5.2.

Boulier and Steckler (1999) look at the forecasting properties of seeding in sports.

They find that seeding is useful on it’s face as a predictor, and that a probit model

provides an improvement on the forecasting power in seeds alone for predicting win-

ning probabilities. In the tournaments in the data set, 8 of the 32 players were seeded

based on their current ATP/WTA rankings. Seed difference appears initially to be

uninformative, in that half the matches are won by the player with the lower seed.

However, the parameters β to be estimated will quantify the same sort of effect the

Boulier and Steckler find in their paper.

5.2 Estimation of the Parameters of the Structural Model

The results of the maximum likelihood procedure return parameter estimates and

standard errors over the three parameters identified in the structural model: the

marginal effect of seeding and world ranking on ability, and most importantly, the

curvature (incentive response) parameter r conditional on variance of the ”bounce

of the ball” random component σ2 = 1. Table 4 shows the results of the estimation

procedure, with standard errors in brackets.

Table 5 shows the constructed confidence intervals around the parameter of interest

23

in each of the model specifications. The parameter of interest is r, which we see

lying between .07674 and .1287 99% of the time in all of the models specified. This

indicates that there is some altering of effort levels occurring in the data.

Finally, the following Table 6 presents the results of the same model specification

under the limiting probit model which occurs where r = 0.

5.3 Simulation

While the goal of the paper is not to predict the results of tennis matches, it is

important to evaluate the ability of the model to explain the data, if we are to take

the estimates of the model on their face. Table 2 shows the ability of the model to

replicate characteristics of a season of tournament championship data. 100 seasons

of 86 matches were simulated using the estimated model parameters outlined above.

The characteristics of individual match scores are summarized in Table 2. Finally,

the tabulation of set score frequencies for simulated data is included in brackets in

Table 3.

Table 1 shows positive performance of the model. The percentage of matches

ending after 2 sets is close to that in the observed matches. The 95% confidence

interval for the observed mean of 58% is from 47.5% to 68.78%, while the simulated

percentage is 62.90%. The number of matches won by lower seeds is slightly lower

than the 50/50 split seen in the observed data, but the difference is not statistically

significant. The momentum effect in the simulated matches is also similar, with a 1%

(insignificant) error in the prediction on the winner of the match conditional on the

winner of the first set.

24

Table 2 examines the characteristics of individual match scores in the simulated

data. In the actual data, there were no significant differences between the set score

differentials. The simulated score differentials in each fall into the 95% confidence

interval for the observed data. In this table, we see that 63.14% of matches are

correctly predicted by the model. The probit model using the same variables is able

to predict 65.34% of matches correctly. Unfortunately, Boulier and Stekler (1999) do

not report a percent correctly predicted for their probit model of tennis, so the results

are not directly comparable. This slight difference in prediction of individual match

outcomes is not statistically significant at the 1% level. Even if it were, it would not

be directly important, since the key question of the paper is whether we can detect

variance in effort levels within the match, and this is determined by the ability of the

model to explain inter-match dynamics.

6 Discussion

This paper seeks to estimate a parameter which quantifies the effect of incentive re-

sponse on the results of rank-order tournaments, using data on tennis to estimate the

model. The structure of the SPNE model allows the identification and estimation

of the parameter r, which governs the marginal cost of effort above and below abil-

ity. Essentially, if r is significantly different from zero, we see players altering their

effort levels in response to incentive advantages in the data, which corresponds to

a finite cost of effort levels above or below ability. The model is in fact picking up

this marginal cost, but because of the structure of the model, we would only see a

significant parameter estimate for r if agents were adjusting their effort levels.

25

This parameter estimate differs from those found in FS, since their paper did

not find any strong evidence for strategic effects. In their estimation, the estimated

parameter r∗ was -10.44 in hockey series, and small in other sports, with standard

errors of 22778 and higher. This paper shows stronger evidence of incentive response

among participants in a rank order tournaments then has been shown in other studies.

The evidence of agents altering their behaviour in response to identified incen-

tive structures has important implications for many fields of applied microeconomics.

The tournament model has been applied theoretically to research contracts, executive

salaries and principal-agent production problems. If we can be confident in the first

assumption of the model, that is that agents perceive their effort costs and bene-

fits relative to other competitors, and set their efforts to maximize payoffs, then we

can place a great deal more confidence in these theoretical treatments. This paper

provides evidence toward that important under lying confidence in the tournament

model as a predictor of agent behaviour.

26

References

[1] Bryan Boulier and H.O. Steckler, Are sports seedings good predictors?: an eval-

uation, International Journal of Forecasting 15 (1999), 83–91.

[2] Clive Bull, Andrew Schotter and Keith Weigelt, Tournaments and piece rates:

An experimental study, Journal of Political Economy 95(1) (1987), 1–33.

[3] Jurgen A. Doornik, Ox version 3.00, June 2001.

[4] Robert Drago and John S. Heywood, Tournaments, piece rates, and the shape of

the payoff function, Journal of Political Economy 97(4) (1999), 992–998.

[5] Christopher Ferrall and Anthony J. Smith, A sequential game model of sports

championship series: Theory and estimation, Review of Economics and Statistics

81 (November, 1999), 704–19.

[6] Kevin A. Gale, The determination of set score differences in men’s professional

tennis, Master’s thesis, Queen’s University, 2000.

[7] Kenneth L. Judd, Numerical methods in economics, The MIT Press, Cambridge,

Massachusetts, 1998.

[8] Edward Lazear and Sherwin Rosen, Rank-Order Tournaments as Optimum

Labour Contracts, Journal of Political Economy 89(5) (1981), 841–866.

[9] , Testing the Theory of Tournaments: an Empirical Analysis of Broiler

Production, Journal of Labour Economics 12(2) (1994), 155–179.

27

[10] R.D. McKelvey, Handbook of computational economics, Handbooks in Eco-

nomics, ch. 13, pp. 87–142, Elsevier Science, North-Holland, Amsterdam; New

York and Oxford, 1996.

[11] Martin J. Osborne and Ariel Rubinstein, A course in game theory, The MIT

Press, Cambridge, Massachusetts, 1994.

28

A Appendix

A.1 Proof of Propositions

Proposition 1

The problem of a player in the game is to maximize his/her payoff function subject

to the maximization decision of the other player in Nash equilibrium. The objective

function for each player is expressed generally as:

−e−δiω

r exiω

r + γt′ωF (It(xaω − xbω))∆Vt(ωt, ω−t) + (1 − γt′ω)Vt(ωt) + γt′ωVt(ω−t) (26)

thus the first order, necessary conditions for an interior solution for players a and b

respectively are:

exaω

r = rγbωf(xaω − xbω)∆Va(ωa, ωb)eδaω

r (27)

exbω

r = rγaωf(xaω − xbω)∆Vb(ωa, ωb)eδbω

r (28)

dividing (27) by (28), and substituting for index of strategic advantage ∆ω defined in

(12) yields:

xaω − xbω = ∆ω + r lnγbω

γaω

(29)

and substituting (29) into the first order conditions and solving for interior effort level

x yields the interior solution in (14).

Substituting the interior effort level in (14) into the value function for the player

29

yields the value of playing a pure strategy:

γt′ω∆Vtω

(−rf(∆ω + r ln

γbω

γaω

) + F (It(∆ω + r lnγbω

γaω

))

)(30)

+ (1 − γt′ω)Vt(ωt) + γt′ωVt(ω−t)

The mixed strategy equilibrium allows for a player t to give up with probability

(1 − γtω), where giving up has the effect of losing with certainty if the other player

puts in positive effort (xt′ > −∞) and in the case where both give up, the game is a

toss up (50/50). The indirect value of giving up is therefore defined as:

γt′ωVt(ω−t) + (1 − γt′ω)Vt(ωt) + Vt(ω−t)

2(31)

so we know that the interior solution will be preferred to the mixed strategy equilib-

rium if the indirect values defined above are such that (30) > (31), which is simplified

to yield (15). If the pure strategy interior solution is not strictly preferred to giving

up, the player will set γt such that the values of each are exactly equal in equilibrium,

such that (30) = (31), as given by the zero-function in (16).

And finally, the probability of winning at each node is given simply by:

Pω = prob(player a wins) = γaωγbωF (∆ω + r lnγbω

γaω

) (32)

+ (1 − γbω) +(1 − γaω)(1 − γbω)

2

QED

30

Parameter r

The added explanatory power of the parameter r is best understood in a couple

of steps. Looking at the winning probabilities described above, if both players are

playing pure strategies, the outcome is described by:

Pω = = F (∆ω) (33)

= F (α + β(Xj) + rνω)

which, if we do not employ the subgame perfect equilibrium structure, is described

simply as a probit (or logit) model with a latent regressor νω. The SPNE model

allows us to solve for νω at each node of the match, and thus to include both the

parameter r and the ability parameters β in the estimation.

31

Table 1: Summary of ATP Tennis Tournament Championship and Simulated Matches

Data SimulationTotal 86 8600% ending after 2 sets 58.14% 62.90%

Mean Abs. Seed Difference (all players) 2.8023 2.8023Mean Abs. Seed Difference (2 seeded players) 2.7692 2.7692

Match won by lower seed 50.00% 46.04%

Match involves 2 seeded players 26 26001 seeded player 30 30000 seeded players 30 3000

Matches won by player winning first set 81.4% 80.11%in 2 sets 58.14% 62.90%player losing first set 18.6% 19.88%

Percent of Matches Correctly Predicted 63.14%

32

Table 2: Summary of Real and Simulated Data Match Scores

Variable Data Mean Simulated Mean(Data Std. Dev.) (Simulated Std. Dev.)

winner’s seed 5.360 5.619(3.276) (3.255)

loser’s seed 7.395 7.137(2.792) (2.940)

2 set match .5814 .6314(.4962) (.4827)

Score diff. - Set 1 2.581 2.962(1.324) (1.495)

Score diff. - Set 2 2.674 2.931(1.384) (1.519)

Score diff. - Set 3 2.694 2.896(1.348) (1.453)

Correctly Predicted .6314(.4827)

33

Table 3: Frequency of Scores being 6-x by set

Set Loser’s Score 1st Set 2nd Set 3rd Set

0 2.33 2.33 -(5.12) (6.28) (3.76)

1 5.81 8.14 13.89(13.14) (12.44) (9.40)

2 16.28 20.93 11.11(17.67) (15.35) (20.06)

3 25.58 16.28 30.56(22.91) (21.63) (25.39)

4 23.26 27.91 19.44(19.19) (22.67) (22.26)

5 26.74 24.42 25.00(21.98) (21.63) (19.12)

() indicates values simulated with estimated parameters

34

Table 4: Maximum Likelihood Estimation of Structural Model Parameters

Variable Model 1 Model 2 Model 3 Model 3Seed Difference(SD) −0.02345∗ 0.06233∗ 0.06168∗ −0.02358∗

(4.917e−5) (0.001345) (0.001371) (.002313)SD2 −.008361∗ −.008230∗ 0.003440∗

(1.325e−5) (1.331e−5) (3.044e−5)Rank Difference(RD) −4.725e−5∗ −3.122e−5∗

(1.189e−8) (1.286e−8)Seed Indicator(SI) −.32435∗

(.01601)r∗ −2.168∗ −2.237∗ −2.237∗ −2.329∗

(0.05911) (.08194) (.08202) (.1200)ln(likelihood) -467.64 -464.53 -464.43 -460.41

* = significant at the 1% level

Table 5: Confidence Intervals on Incentive Response Parameter

Model Specification Coeff. Std.Err. 99%CI+ Implied 99%CI+Parameter

Model 1 -2.168 0.05911 0.1287 0.1144 0.1017Model 2 -2.236 0.08194 0.1257 0.1068 0.09078Model 3 -2.237 0.08202 0.126 0.1068 0.09072Model 4 -2.329 0.1200 0.1236 0.09740 0.07674

35

Table 6: Maximum Likelihood Estimation of Probit Model Parameters

Variable Model 1 Model 2 Model 3 Model 3Seed Difference(SD) −0.03350∗ 0.08219∗ 0.08147∗ −0.02878∗

(5.085e−5) (0.001914) (0.001917) (.003296)SD2 −.01113∗ −.01098∗ 0.004146∗

(1.727e−5) (1.740e−5) (4.332e−5)Rank Difference(RD) −5.935e−5∗ −3.539e−5∗

(1.846e−8) (1.852e−8)Seed Indicator(SI) −.4072∗

(.01884)ln(likelihood) -469.48 -465.89 -465.79 -461.37

* = significant at the 1% level

36

1.15

1.05

1.2

1.1

r

10.6 0.80.40.2

Figure 1: Effort Cost of Playing 10% aboveability

1.2

1.4

0.8

1

0.6

x

0.30.250.20.10.050 0.15

Figure 2: Effort Cost, r=.28, δ=.2

2

1.5

1

0

0.5

x

0.20.1980.1960.1940.1920.19

2.5

Figure 3: Effort Cost, r=0.001, δ=.2

37

0,0

1,0 0,1

1,1

2,1 1,2

2,2

3,2 2,3

2,0

3,1

0,2

1,3

3,0 0,3

4,1

0,44,0

0,55,0 1,4

5,5

5,4 4,5

4,4

4,3 3,4

3,3

5,3

4,2

3,5

2,4

5,2 2,5

1,55,1

PLAYER 1WINS

PLAYER 2WINS

Figure 4: The Game Tree for a Set

38

1,0 0,1

0,0

1,0

0,1

1,1

2,1

1,2

2,2

3,2

2,3

2,03,1

0,21,3

3,0

0,34

,1

0,4

4,0

0,5

5,0

1,4

5,5

5,4

4,5

4,4

4,3

3,4

3,35

,3

4,2 3

,5

2,4

5,2

2,5

1,5

5,1

A WINS

0,0

1,0

0,1

1,1

2,1

1,2

2,2

3,2

2,3

2,03,1

0,21,3

3,0

0,34

,1

0,4

4,0

0,5

5,0

1,4

5,5

5,4

4,5

4,4

4,3

3,4

3,35

,3

4,2 3

,5

2,4

5,2

2,5

1,5

5,1

0,0

1,0

0,1

1,1

2,1

1,2

2,2

3,2

2,3

2,03,1

0,21,3

3,0

0,34

,1

0,4

4,0

0,5

5,0

1,4

5,5

5,4

4,5

4,4

4,3

3,4

3,35

,3

4,2 3

,5

2,4

5,2

2,5

1,5

5,1

0,0

1,0

0,1

1,1

2,1

1,2

2,2

3,2

2,3

2,03,1

0,21,3

3,0

0,34

,1

0,4

4,0

0,5

5,0

1,4

5,5

5,4

4,5

4,4

4,3

3,4

3,35

,3

4,2 3

,5

2,4

5,2

2,5

1,5

5,1

B WINS

B WINSA WINS

1,1

Figure 5: The Game Tree for a Match

39

Liste des cahiers de recherche publiés par les professeurs des H.E.C.

2003-2004 Institut d’économie appliquée IEA-03-01 ROBERT GAGNÉ; PIERRE THOMAS LÉGER. « Determinants of Physicians’ Decisions to

Specialize »,29 pages. IEA-03-02 BENOIT DOSTIE. « Controlling for Demand Side Factors and Job Matching: Maximum

Likelihood Estimates of the Returns to Seniority Using Matched Employer-Employee Data »,24 pages.

IEA-03-03 ALAIN LAPOINTE. « La performance de Montréal et l'économie du savoir: un changement

de politique s'impose », 35 pages. IEA-03-04 MICHEL NORMANDIN; LOUIS PHANEUF. « Monetary Policy Shocks: Testing Identification

Conditions Under Time-Varying Conditional Volatility », 43 pages. IEA-03-05 MARTIN BOILEAU; MICHEL NORMANDIN. « Dynamics of the Current Account and Interest

Differentials », 38 pages. IEA-03-06: MICHEL NORMANDIN; PASCAL ST-AMOUR. « Recursive Measures of Total Wealth and

Portfolio Return », 10 pages. IEA-03-07: DOSTIE, BENOIT ; LÉGER, PIERRE THOMAS. « The Living Arrangement Dynamics of Sick,

Elderly Individuals », 29 pages. IEA-03-08: MICHEL NORMANDIN. « Canadian and U.S. Financial Markets: Testing the International

Integration Hypothesis under Time-Varying Conditional Volatility », 35 pages.

- 1 -

http://www.hec.ca/iea/cahiers/2002/iea0209_bd.pdf

http://www.hec.ca/iea/cahiers/2002/iea0209_bd.pdf

- 2 -

IEA-04-01: ANDREW LEACH. « Integrated Assessment of Climate Change Using an OLG Model », 34 pages.

Game, set and match - HEC

Documents

Transcript of Game, set and match - HEC