Game, set and match - HEC
Transcript of Game, set and match - HEC
Copyrigh t © 2004 HEC Montréal.
Tous droits réservés pour tous pays. Toute traduction ou toute reproduction sous quelque forme que ce soit est interdite.
Les textes publiés dans la série des Cahiers de recherche HEC n'engagent que la responsabilité de leurs auteurs.
La publication de ce Cahier de recherche a été rendue possible grâce à des subventions d'aide à la publication et à la diffusion
de la recherche provenant des fonds de l'École des HEC.
Direction de la recherche, HEC Montréal, 3000, chemin de la Côte-Sainte-Catherine, Montréal (Québec) Canada H3T 2A7.
SubGame, set and match. IdentifyingIncentive Response in a Tournament
by Andrew LEACH
Cahier de recherche no IEA-04-02February 2004
ISSN : 0825-8643
SubGame, set and match
Identifying Incentive Response in a Tournament
Andrew Leach∗
HEC Montreal†
February 1, 2003
Abstract
Data from Association de Tennis Professionel (ATP) championship tennistournament finals are used to test for strategic behavior of players and theirresponses to incentives. Tennis provides a rich environment for the study of in-centive response because of the individual nature of the sport, and the clearlydefined tournament structure. The parameters of a sequential game modelare estimated and, controlling for measured ability differences, the existence ofstrategic decision making where players’ efforts vary depending on the state ofthe match is tested against the alternative that players do not alter their effortin response to incentives.
JEL classification: C7, C15, C63
Keywords: Tournaments, Sequential games, Incentive Response, Nash Equi-librium
∗Helpful comments from Christopher Ferrall, John Knowles, Pierre-Thomas Leger, C. RobertClark and Shannon Seitz are gratefully acknowledged, as are comments from seminar participantsat Queen’s University, the Second Annual Conference on Numerically Intensive Policy Analysis, 2001and the Canadian Economics Association Meetings, 2002. Data for this project were obtained fromKevin Gale. Address communications to the author at [email protected] or Institut d’economieappliquee, HEC Montreal, 3000, chemin de la Cote-Sainte-Catherine, Montreal (Quebec) H3T 2A7.
†Affiliated with the University of Montreal
1 Introduction
This paper develops and estimates a structural model of agent interaction in a tour-
nament to provide evidence of the type of alteration in effort levels in response to
incentive structures suggested in Lazear and Rosen (1981). Tournaments are used as
contracts in the labour market, and the theory predicts that the agents facing these
incentives will alter their effort accordingly. Clearly, if agents do not perceive these
incentives or react to them, we may be using inefficient labour contracts. This paper
asks the question of whether agents alter their effort levels conditional on incentives,
and is able to conclude that they do.
A tennis tournament final presents an excellent medium to study the incentive
structure present in a rank-order tournament. A tennis tournament begins with as
many as n=128 players, with each round of the tournament consisting of n/2 pairwise
matches where the loser is eliminated from the tournament, and receives a payoff
depending on the round in which the match occurred. Each match is itself a nested
tournament, with a best 2 out of 3 structure. Further to that, each set and game
within the match form best of 11 and best of 7 tournaments respectively. Given
the extensive sequential game which is described above, it is possible to identify the
actions of agents in response to strategic incentives, as suggested by the models of
Rosen (1986) and Lazear and Rosen (1981). The final match is of particular interest
since the final rewards are clearly known to both players, and additional assumptions
on tournament structure are not required.
The Lazear and Rosen (1981) model focuses on a one-shot game where players’
efforts are positively but not perfectly correlated with performance, and rewards are
2
assigned based on relative performance or output. In a simple, two-player game
performance is measured generally as:
q = x + e, (1)
where x is the action or effort taken by a player, and e is a random component. Effort
is costly, and the cost is determined by C(x) with C ′(x) > 0 and C ′′(x) > 0. Players
are rewarded for performance with one of two prizes, W1 or W2 with W1 > W2, and
the larger prize going to the player with the best (highest) performance q. Therefore,
in the one shot game, if we define P as the conditional probability of winning, the
expected payoff to player i is:
P [W1 − C(xi)] + (1 − P )[W2 − C(xi)] = P [W1 − W2] + W2 − C(xi), (2)
which leads to the first order condition for player i’s actions given by:
∂P
∂xi
[W1 − W2] − C ′(xi) = 0. (3)
This first order condition has two important interpretations. First, the effort level is
not determined by the size of the rewards, only by the difference between the rewards
for winning and losing; second, the effort level will be influenced by the relative effects
on cost and winning probability of changes in effort level, where the player sets the
marginal change in expected reward equal to the marginal effort cost.
A tennis tournament final is clearly parallel in setup to the theoretical paradigms
laid down in Lazaer and Rosen (1981), with two individuals playing a match with a
clearly defined reward structure. The propositions of Lazear and Rosen (1981) that
agents will alter effort levels to suit rewards at each node in a tournament can be tested
3
by imposing subgame perfect Nash equilibrium (SPNE) structure on the model. The
model in this paper is derived from Ferrall and Smith (1999)(FS), and we confront the
model with data from professional tennis tournaments. The results provide strong
empirical evidence for the type of incentive response proposed by Lazear and Rosen
(1981).
It is important to emphasize first why sports data and second why tennis data
may provide a useful experiment in incentive response. In a recent article, Ferrall
and Smith (1999) use professional sports championship series data to test for the
existence of incentive response in a tournament setting. FS use data from hockey,
basketball and baseball; games which may violate the key assumptions of a Lazear
and Rosen (1981) tournament model. Assumptions are violated in the sense that the
reward structure is not clearly defined, nor can they separate the strategic actions of
individuals from the actions of teams as a whole. Consider the example of a hockey
tournament. The outcome of the game is determined by the actions of the teams as
a whole, and teams arguably face a series of incentives at each stage of the series.
Beneath this, the individual players are those who make the effort decisions within
the game, and the model presented in FS does not address the specific incentives
facing players or the resulting actions. It is also less clear in hockey that the rewards
do not vary with the length of the series, particularly at the team level. The best
case scenario for any team is to win the Stanley Cup through winning 4 series that all
go to the maximum number of games, which maximizes television and gate revenue.
This is inconsistent with the basic dynamics of the tournament model, where each
additional round produces only further cost, but no change in benefits. As discussed
4
above, tennis is not subject to these fundamental differences in comparison with the
theory of Lazear and Rosen, and thus can be used to further test the implication of
the use of tournaments to stimulate effort in labour market contracts.
The difficulty with estimating the magnitude of strategic effects in the decisions
is in the identification of the reward structure. For example, in a tennis match, we
know that the winner of the match earns a certain prize, and the loser of a match
wins a certain prize, but what is the reward to winning the third game to go up 3-0
in the second set, after having won the first set? Or for that matter, in any other
game? With the exception of the final game, with the winner of that game winning
the match, all other games have reward structures that depend on the result of future
games, and thus are not clearly observable. The structure introduced in this paper
provides a method for identifying the rewards at each node, and thus the response
to incentives through an entire match. In the context of an extensive form game, the
rewards at each terminal node are known to the participants, and the rewards at each
non-terminal node are endogenous functions of the rewards at game nodes reached
by winning or losing the current game.
The remainder of the paper proceeds as follows. Section 2 presents the SPNE
model of a tournament. Section 3 presents the econometric approach to estimating
the parameters of the model, and Sections 4 and 5 present the data and the results
of the analysis. Section 6 discusses the results and concludes the paper.
5
2 The Model
This paper models the finals of a tennis tournament using a SPNE characterization.
In order to solve a subgame perfect equilibrium, the Nash solution is employed, as
outlined in Rosen (1986) and Osborne and Rubinstein (1994). In this solution, the
Nash equilibria of the static games are solved, and the recursive sequence of Nash
equilibria is a SPNE of the extensive form game. The advantages of this approach
are that the equilibrium is reasonably easy to calculate, and is robust. (Ferrall and
Smith, 1999)
2.1 The Final Match
The final match is characterized by two players playing a sequence of points. The
first player to win four points is the winner of a game. In games and sets, the analysis
abstracts from tie breakers, treating them as any other one shot game.1 Similarly, a
player must win six games in order to win the set, and two sets to win the match.2
This model looks at strategic decisions made by players at the level of individual
games. The outcome of a game is endogenous and stochastic with probabilities of
winning based on effort, as well as a stochastic luck component which is independently
and identically distributed over games in the match.3
Let the two players be indexed as a and b. Figures 4 and 5 show the game trees
1Tie-breakers are present in each level except in the 1 set each tie of a tennis match, but in thispaper, they are considered as the extensive form of the final game. For example, at 40-40, a gameis at deuce, and a player must win two points in order to win the game. In this characterization,the entire deuce ‘point’ is considered as a single game, or node in the subgame, and so on for settie-breakers.
2The ATP tour uses best of 3 set matches. For the remainder of this paper, and in the data usedto estimate the model, this match structure is assumed.
3FS refer to a luck component as ”the bounce of the ball” which is particularly applicable here.
6
for a set and a match respectively. The number of games in a set is variable, but is
bounded by 6 ≤ gi ≤ 11. Where N=11, the eleventh ‘game’ is, as discussed above, a
tie-breaker which is required if and only if the score reaches 5-5. If a player wins a
game that reaches 5-5, this is denoted as a 6-5 win regardless of the number of games
played after 5-5. Sets are nested within the sequential game nodes of a match. A
match is an N=3 tournament with a player having to win 2 sets in order to win. The
state of the match is therefore denoted by the 6x1 vector {g1a, g1b, g2a, g2b, g3a, g3b}where gji denotes the number of games won by player i in set j. The total number
of games played is not considered in the equilibrium, only the number of sets, j,
which can be backed out as j =∑3
i=1(gia + gib ≥ 1), where the brackets represent an
indicator function equal to 1 if the argument is true. The state space is defined as:
ω ∈ {g1a, g1b, g2a, g2b, g3a, g3b : 0 ≤ gji ≤ 6, min{gji} < 6, i ∈ {a, b}, j ∈ {1, 2, 3}} . (4)
The state vector ω yields all the information which is considered by players in the
model. The actual length of the match is also endogenous and stochastic, but is
bounded at 3 sets of a maximum of 11 games each, therefore, the entire match cannot
consist of more than 33 games. The transition of the state space subject to the results
of a game at state ω depends on the score of the set in play. If either player reaches 6
games, the number of sets won by that player increases by one. If that is the second
set won, the match is won by that player. Otherwise, the set score is updated, and
the game score resets to 0,0. If the game being played is not a set or match point,
the score of the set is updated to reflect the winner.
During each game in the match, players make strategic decisions xiω, which is
the amount of costly effort devoted to the play of the game at state ω. The score
7
differential d of the game is stochastic, but depends positively on own effort and
negatively on the effort of the other player, and on a random component ε.
d = (xaω − xbω) + εω (5)
ε ∼ F (0, σ2ε ).
The random term ε is independently and identically distributed across all games, is
distributed according to a cumulative distribution function (CDF) F (ε) and corre-
sponding probability density function (PDF) f(ε). The random term in (5) represents
unpredictable events which effect the outcome of the game. The score differential is
a complete picture of the outcome of the game, and it is assumed that players do
not take into account any signal from the size of the score differential in their future
strategies. We can thus denote the equilibrium(denoted by ∗) probability of victory
for players a and b, conditional on strategies, as:
Paω(x∗aω, x∗
bω) = Prob(d∗ω > 0) = Prob(x∗
aω − x∗bω) > −εω (6)
= 1 − F (−(x∗aω − x∗
bω))
= 1 − Pbω(x∗aω, x∗
bω),
where the F distribution is assumed to be normal with mean 0 and variance σ2 = 1
for the remainder of the paper. Additionally, games in which both players put out
no effort (ie. both give up) are toss ups. The symmetric marginal effect of strategy
choice is given by:
∂Paω(xaω, xbω)
∂xaω
= f(−(xaω − xbω)). (7)
8
Players’ observed ability is denoted by δiω, and is constant over the match. The
ability vector of each player is common knowledge. The ability of players influences
the cost of choosing effort level, according to a cost function.
The cost of effort function is associated with strategy or effort choices xiω, and is
denoted ciω(xiω). The function is increasing and convex in strategy choice, so that
both c′ and c′′ > 0. The cost of effort function is constant over the match. Individual
cost at each node depends on the ability of the individual players exponentially, as
well as on the aggressiveness of the strategy chosen, and assumed to be given by:
ciω(xiω) = e−δiω
r exiω
r , (8)
where r is a constant. The parameter r determines the convexity of the effort cost
function with respect to the players’ strategy choices. Its interpretation is important
in understanding how much agents will attempt to influence outcomes given different
strategic incentives. As r tends to zero, the marginal cost of effort above ability
level is infinite and the marginal cost of effort below ability level is zero, so players
will tend to play their abilities. More importantly for this paper, if the parameter r
estimated to be close to 0, it means that we are seeing no adjustment in behaviour
in response to incentives. This interpretation is clarified in Proposition 1. The cost
function is parameterized such that players with higher abilities (δ) will expend less
effort cost to play the same strategy x in a game. Effort costs are therefore decreasing
exponentially in ability and increasing exponentially in aggressiveness, conditional on
the convexity parameter r. The cost function is not able to capture explicitly the
cost of diverse strategies within the game of tennis, but rather it is meant to capture
the amount to which effort levels are changed in response to incentives. Costs are
9
independent from point to point, so it is not more costly to exert effort after having
recently played a high-effort point.
For further intuition on the role of the parameter r, consider Figures 1-3. Figure 1
shows the relative costs of increasing effort by 10% for different value of the parameter
r. Figure 2 shows, for a value of r in the estimated range, the costs of effort above
and below ability (.2). Finally, 3 shows the shape of the cost function as r limits to 0.
The marginal cost of effort below ability is 0, while playing any strategy above ability
level will have infinite cost. Using the structure of subgame perfect equilibrium to
identify reward structures at each node of the match, and assuming that players set
marginal benefit equal to marginal cost, we are able to identify the shape of this cost
function which is best supported by the data through estimation of the parameter r.
Because of the structure of the cost function, it is necessary to allow players to
’give up’, and play a strategy of xiω = −∞. In the case where the expected rewards
to winning a game are non-positive for any strategy xiω, players will not wish to
incur any costs, and thus must be able to set ciω(xiω) = 0, which is only possible at
xiω = −∞.
Players are risk neutral, and therefore act to maximize their expected net pay-
offs in the match/tournament. Payoffs are defined by the terminal values of the
match/tournament, which are common knowledge.4 The terminal values of the match
are the payoffs for winning or losing, less the expended costs over all of the games
played. Since the match length is stochastic and endogenous, the costs borne by a
4It may be true that the payoffs to winning a tournament include a change in world rankingwhich brings invitations to other tournaments and higher seeding therein. This is abstracted fromin this paper.
10
player vary depending on the path of the match, not just on the final score. Using
the notation of ωi as the state of the match reached when player i wins at state ω5,
the values to winning and losing at any state ω are given by:
{Vi[ωi] −
∑ω∈ω
caω(xaω), Vi[ω−i] −∑ω∈ω
caω(xaω)
}(9)
where ω is the set of states reached in the preceding games played in the match.
2.2 Nash Equilibrium in a Single Game
In order to clarify the SPNE concept in a tennis match, consider the Nash equilibrium
in the one shot game played in the third set of the final, with the set tied at 5-5, so the
game in question is the last game of the match, and the winner of the match wins the
tournament. The payoffs to the winner and loser of the game are very clearly defined
by the prize money available. Each player can choose a strategy xiω to influence the
outcome of the match, and can also choose the mixing probabilities between playing
that strategy and giving up entirely. A mixed-strategy Nash equilibrium is described
by the situation where neither player can achieve higher utility by deviating from
their equilibrium strategy given the equilibrium strategy of the other player. In order
determine the mixing probabilities in Nash equilibrium, it is often necessary to employ
a numerical methodology(see McKelvey, 1996 or Judd, 1998). In this case, much of
the computation can be saved using a pseudo-analytical solution. Equilibrium is
solved up to the solution to a single zero function, as given in FS.
5This notation will be employed throughout the paper, to maintain generality. Implicit in thisis a transition function were the state shifts to a new set when one player wins 6 games, and to theterminal state of the match which a player wins 2 sets
11
The player’s problem at the terminal node of the match is simply to maximize the
expected payoff to playing the game:
maxxiω
− ciω(xiω) + E [Piω(xiω, x−iω)∆Vi(ωi, ω−i) + Vi(ω−i)] i ∈ {a, b}. (10)
The expectation in (10) refers to the beliefs of player i for the actions of the other
player (-i) which determine the winning probability P, and ∆Vi(ωi, ω−i) refers to the
difference in values of winning the game (ωi) and losing it(ω−i). The values are known
to both players, they can be taken outside the expectation operator, and the player
is choosing effort to increase the probability of winning. Since each player receives
the value of losing in either case, and the difference between the winning and losing
value if he wins, the player’s problem in the one shot game reduces to:
maxxiω
− ciω(xiω) + E [Piω(xaω, xbω)] ∆Vi(ωi, ω−i) i ∈ {a, b}. (11)
In the extensive form game, the values at any non-terminal node are stochastic and
endogenous. However, since they are determined as functions of the Nash equilibria
at all possible future nodes, the expected payoffs are known to the players at each
node. Therefore the expectations operator is taken across both probabilities and the
value differential. As in FS, three indices are defined for the purposes of calculating
the Nash equilibrium.
12
incentive advantage : νω ≡ ln∆Va(ωa, ωb)
∆Vb(ωb, ωa)
ability advantage : δω ≡ δaω − δbω (12)
strategic advantage : ∆ω ≡ rνω + δω
The strategic advantage index in (12) incorporates both ability and incentive advan-
tage in a single index which is pivotal to the development of the Nash equilibrium.
In order to distinguish between the players, with respect to the sign of the score
differential, the index I is adapted such that:
It =
{1 if t = a
−1 if t = b. (13)
Adapted from the proofs of FS, the Nash equilibrium is described by the following
proposition:
Proposition 1 Under the given assumptions, the Nash equilibrium at any state ω ofthe match will be given by the solution to the following simultaneous equations:
xtω =
{r ln
(rγt′ωf(∆ω + rln
γt′ωγtω
)∆Vt(ωt, ω−t)eδtω/r
)with probability γtω
−∞ with probability (1 − γtω)
(14)for t ∈ {a, b}. Player t plays a pure strategy (γtω = 1) if:
γt′ω(−rf(∆ω + r ln γt′ω) + F (It∆ω + r ln γt′ω)) +(1 − γt′ω)
2≥ 0 (15)
Otherwise, γtω solves:
γt′ω(−rf(∆ω + r lnγt′ω
γtω
) + F (It∆ω + r lnγt′ω
γtω
)) + (1 − γt′ω)/2 = 0. (16)
13
Proof : See Appendix.
A sketch of the proof of the proposition provides the intuition for the equations
for both the equilibrium strategies and the mixing probabilities. First, the indirect
value of the pure strategy xtω is solved from the first order, necessary conditions for
the maximization of the objective function in (10). Given the value of the interior
solution, and the value of giving up, the indirect value of each of the pure strategies
available to player t are solved, conditional on the mixing probabilities employed by
t′ and the reward structure for the game. If the pure strategy dominates the mixed
strategy, player t will always play the pure strategy (15). If not, player t will choose
γtω such that the expected rewards in Nash equilibrium to each of the mixed strategies
are equal, and this solution is found in (16).
The solution to the Nash equilibrium provides an equilibrium winning probability
as a function of mixing probabilities and incentives. In Nash equilibrium, the winning
probability is described by:
Pω = prob(player a wins) = γaωγbωF (∆ω + r lnγbω
γaω
) (17)
+ (1 − γbω) +(1 − γaω)(1 − γbω)
2.
where the first term is the outcome when both players put out effort levels, and the
additional terms capture the toss-up probability when both players give up. The
algorithm for solving Nash equilibria at each node is described in detail in Section
3.1.
14
2.3 Subgame Perfect Equilibrium
In order to define the SPNE outcomes in the sets and matches, the final payoff
difference between winning and losing a match must be defined. As was shown in the
previous section, the Nash equilibrium at each node gives us the winning probabilities
at that point in the state space. Beginning from the terminal node in the third set,
backward induction allows the calculation of the Nash equilibrium payoffs at each
possible state within the match, which gives the robust SPNE as follows.
Assumption 1 Final payoffs to winning the match/tournament are known to theparticipants. In the final game of the tournament, the payoffs to winning and losingare given by the prize money differential for first and second place. Also, playerscare only about winning or losing and the total costs expended during play, and notexplicitly about the length of the match.
This assumption is analogous to that made in FS, however it may be more tenable
here. In a sports championship series, the rewards come largely from television and
gate revenues, which depend on the length of the series, although it is not clear that
the rewards accruing to the players vary with this length. In a tennis match, the
player is not likely to benefit in any way from a longer tournament/match.
Proposition 2 The SPNE of the match is completely defined by mixing probabilitiesγiω given in Proposition 1 for all elements of ω, and the values of being at each nodecan be defined recursively as the Nash equilibrium payoffs of the individual games ateach node, given the final prize money offered in the tournament. At any state ω,the value to player t of being at that node in the subgame is defined by the rewards towinning (ωt), losing (ω−t), and the winning probabilities defined in Nash equilibriumPtω.
15
Vi[ω] ≡ γtω∆Vt(ωt, ω−t)
[−rf(∆ω + r ln
γtω
γt′ω) + γt′ωF (It∆ω + r ln
γtω
γt′ω)
](18)
+1 − γtω
2[γt′ωVt[ω−t] + (1 − γt′ω)Vt[ωt]]
∆Vt(ωt, ω−t) = Vt[ωt] − Vt[ω−t] (19)
Equations in (18-19) give the value of each node in a set, given the terminalvalues in the match. Additional clarification for the values at the terminal nodes ofintermediate sets is required. For example, in the first set, the rewards to winningare defined as the value of the 0,0 node in the set which will be reached for eachresult (a wins or b wins). Formally, at state ω = (0, 0, gi, g−i)|gi = 5, ∀ g−i, ωi isdefined by Vi(ωi) where ωi = (1, 0, 0, 0). The final rewards to winning the second setclearly depend on the outcome of the first set, which is not immediately consistentwith backward induction. There are only two cases for the second set, either playerhaving won the first set, and the payoffs are then clearly defined as above. (See Figure5) In order to solve the SPNE, the 3rd set is solved, giving the value at the beginningof the third set. This value, combined with the values of winning and losing the matchprovide the complete payoff structures to solve the second sets given 1,0 and 0,1 setscores. Finally, the solution to both second set possibilities yields a value for eachplayer at the first node of the second set, conditional on winning or losing the firstset, which allows the first set of the match to be solved.
The SPNE for the match is therefore clearly defined and robust given final payoffsto winning and losing the match.
Proof: Backward Induction.
3 Econometric Specification
The notion that strategic incentives play a role in the determination of match out-
comes can be tested by constructing a test on the null hypothesis that the curvature
parameter r is equal to 0. In order to construct the empirical structural model, ad-
ditional assumptions must be made on the determination of ability. The empirical
structure of ability differences, and thus winning probabilities is given, conditional on
16
observable covariates X, as:
observed ability advantage (seed difference) : Xj ≡ Xaj − Xbj
residual ability advantage : αj ≡ αa − αb (20)
net ability advantage : δj = δaj − δbj = Xjβ |αa = αb.
The parameter α will not be separately identified in this analysis, since only the
relative difference in abilities matters to the outcome of the match in the model.
The parameter α would only be identifiable as a set of types or as individual ability
parameters for each player. Thus, the preliminaries of the model, the relative abilities
of the players and the rewards to winning and losing the match are defined by elements
in the data (X), and model parameters β. The remainder of the match can be solved
for score probabilities given the other parameters of the model, r and σ2, the variance
of the random effect distribution.
Three model specifications are considered, and the estimation results reported.
The first model specification considers only seed difference in the determination of
ability. The second model allows for the seed difference to enter into ability with
a quadratic relationship. The third model augments the second by adding WTA
ranking difference as a further indicator of ability. Tests of models which included
indicator variables for being seeded and quadratic terms for rank difference showed
no significant relationships.
Each of the models is estimated for both the limiting case probit model and the
full structural model.
17
3.1 Computational Subgame Perfect Nash Equilibrium
The Nash equilibrium is computed using the FS algorithm, and the indices defined
for Nash equilibrium as follows:
Algorithm 1 Computational Nash Equilibrium Algorithm
• Compute ∆ω, which indicates a strategic advantage to player a or b. Assign apure strategy to the player with strategic advantage.
• Check condition given in (15), and solve for other player’s mixing probabilitygiven the pure strategy of the opposition
• If there is no strategic advantage, each player will play the pure strategy Nashequilibrium
In equation (14), the parameters γ which define the mixed-strategy Nash equilib-
rium either follow immediately from (15) or can be solved as the unique solution to
the zero function implied by (16). Since we know that the mixing probabilities are
strictly positive and less than 1, a bracketing algorithm as outlined in Judd (1998)
produces a quick solution to the zero function relative to a non-linear approach.
Letting the realized state of a match be given by the scores in each of the 2 or 3
sets played in the match, ie the terminal state vector:
ω ∈ {g1a, g1b, g2a, g2b, g3a, g3b} . (21)
The probability of an observed sequence of set scores is the product of the proba-
bilities of the observed scores in each set, conditional on the state of the match when
the set is played. Within a set, the probability of a set ending in a particular score
18
is the sum of the path probabilities for each possible path ending in that score6. The
probability of i winning a set by a particular score is calculated as:
Algorithm 2 Path Probability Algorithm:
• begin with a matrix of winning probabilities at each game in a set, calculatedthrough the SPNE
• given the matrix of winning probabilities, calculate the probability of each possiblepath from 0,0 to each of the terminal nodes in the set
• The probability of a match ending 6-i is the sum of the individual probabilitiesof each path that ends in 6-i
The path probability algorithm provides probabilities of individual scores at each
node in the match, conditional on parameters r, σ, and β. Denoting ω as the terminal
state of the match, and Wj, j ∈ {1, 2, 3} as a set of indicator variables for set j being
won by player a, and Pjωg as the SPNE probability that player a wins the set at state
ω such that ω is the terminal state of a set. If the third set is not played, the indicator
s3 is set to 0. Thus, the probability of a sequence of set scores in a match is given by:
P ∗(ω, X|β, r, σ) =2∏
j=1
[Pjωg]Wj [1 − Pjωg]
1−Wj ∗([P3ωg]
W3 [1 − P3ω]1−W3
)s3
. (22)
3.2 Estimation
The estimation in this paper is undertaken by maximizing the log of the likelihood
function given in (22). Therefore, the continuity of P ∗ in the parameter space is
crucial. As in FS, as long as there is some probability that each player wins each
game, this continuity is achieved. This is realized by allowing the mixed strategy
6This is because the current data only provides information on the final score, and not on thesequence of individual game wins.
19
Nash equilibria at each node in the tournament. The structure of the subgame perfect
equilibrium model allows the identification of the parameter of interest, r, and the
construction of a confidence interval around the coefficient estimate. The results of
the estimation procedure are outlined in Section 5. The parameter r is constrained
to be in (0, +∞] in order to assure that the second order conditions will be satisfied
for all values of ∆ω, the strategic advantage. The maximum likelihood procedure was
undertaken constraining these values to be positive using the estimated parameter
r∗defined as:
r = er∗ (23)
(24)
Because of this characterization, hypothesis tests on the value of r∗ are not directly
informative. Confidence intervals around the implied parameter r are evaluated and
reported.
Denoting the vector of estimated parameters θ, such that θ = {β, r∗}, the log
likelihood for the sample of match finals is thus given by:
L(θ) ≡∑
i
ln (P ∗(ω, X|β, r∗)) . (25)
In the data, we can think of each match as a panel of observations, where we first sum
over the likelihood of individual elements in the panel, over the sets in each match in
(22). This gives us a likelihood for each element in the panel, then we sum up the log
likelihood to get the likelihood of the parameters conditional on the data as a whole.
In order to identify the standard errors of the parameter estimates, a single iteration
on Newton’s Method is used to yield the estimated Hessian matrix.
20
4 The Data
The data for this study are taken from Association de Tennis Professionel (ATP)
tournaments over the 1998 and 1999 seasons. The data set consists of set scores and
seeding information from 87 tournaments. The tournaments all feature best of 3 set
structures, and therefore do not include any of the ’Grand Slam’ tournaments, which
employ a best of 5 set structure. Each match is described by which round of the
tournament it is in (only finals in this case), the number of sets won by each player,
and the score in each set. The players’ respective seeds in the tournament and current
ATP World Ranking are also included.
Payoffs in tennis tournaments tend to be of linear structure, with payoffs doubling
at each successive round. However, since the absolute and not relative difference is
important to the outcome of each of the matches, actual prize money is included in
the data set.
Two assumptions were made in the data set to accommodate the assumptions of
the model. The first assumption is that tie-breakers are played as a one-shot game,
which is not the case in the data. In order to rectify this, games ending either as
7-5 or 7-6 were replaced as 6-5. This change affected 25% of the set scores, so it is
a major assumption. However, the outcomes of the sets are not altered, and it is
simplifies the state transitions significantly. The effort responses will still appear, it
will just be captured in the entire tiebreaker ‘game’.
The second assumption is that the seeds of all unseeded players are given by 9,
whereas seeded players are ranked 1-8. An additional parameter identifies a player
21
solely as being seeded in the second estimated model. This captures some of the
effect of this choice of 9 as a seed for all unseeded players. The ability level is allowed
to depend exponentially on seed, and tests for a dummy variable for being unseeded
showed no significant effect in any specification.
5 Results
5.1 Data Analysis
The data are summarized based on the winner of the match, so all statistics presented
referring to winner and loser are meant at the match level.
Table 1 details some of the characteristics of players in championship matches in
the data, while Table 2 summarizes the mean properties of the individual matches
which the model will attempt to replicate. The parameter of interest governs the
probability that players will give up when the match turns against them, and as the
value of r increases, we would expect players to give up earlier. In the data, we can
present some stylized facts on the outcome of matches conditional on earlier events
therein. We see that 58.14% of matches end after 2 sets, such that the winner of the
first set goes on to win the match. Furthermore, the winner of the first set wins 81.4%
(70/86) of the championship matches. The mean absolute score differences in each
of the sets are not significantly different at the 95% level, nor is the mean absolute
score difference statistically different conditional on the match ending after 2 sets or
continuing on to the third set. We are interested in the propensity of players to give
up earlier in the set or match, so Table 3 provides tabulations on the frequency of
different losing player scores per set. Simply put, as r increases, we would expect
22
to see higher probability mass at the lower scores, as players give up sooner in the
match. The stylized facts seem to suggest that there is little or no incentive response
occurring in the sense that players would give up earlier in the second set if they had
lost the first set. Since the data do not provide information on the sequence of scoring
in the sets, it is difficult to draw too many conclusions on the nature of r from the
data without the estimation of the structural model, as seen below in Section 5.2.
Boulier and Steckler (1999) look at the forecasting properties of seeding in sports.
They find that seeding is useful on it’s face as a predictor, and that a probit model
provides an improvement on the forecasting power in seeds alone for predicting win-
ning probabilities. In the tournaments in the data set, 8 of the 32 players were seeded
based on their current ATP/WTA rankings. Seed difference appears initially to be
uninformative, in that half the matches are won by the player with the lower seed.
However, the parameters β to be estimated will quantify the same sort of effect the
Boulier and Steckler find in their paper.
5.2 Estimation of the Parameters of the Structural Model
The results of the maximum likelihood procedure return parameter estimates and
standard errors over the three parameters identified in the structural model: the
marginal effect of seeding and world ranking on ability, and most importantly, the
curvature (incentive response) parameter r conditional on variance of the ”bounce
of the ball” random component σ2 = 1. Table 4 shows the results of the estimation
procedure, with standard errors in brackets.
Table 5 shows the constructed confidence intervals around the parameter of interest
23
in each of the model specifications. The parameter of interest is r, which we see
lying between .07674 and .1287 99% of the time in all of the models specified. This
indicates that there is some altering of effort levels occurring in the data.
Finally, the following Table 6 presents the results of the same model specification
under the limiting probit model which occurs where r = 0.
5.3 Simulation
While the goal of the paper is not to predict the results of tennis matches, it is
important to evaluate the ability of the model to explain the data, if we are to take
the estimates of the model on their face. Table 2 shows the ability of the model to
replicate characteristics of a season of tournament championship data. 100 seasons
of 86 matches were simulated using the estimated model parameters outlined above.
The characteristics of individual match scores are summarized in Table 2. Finally,
the tabulation of set score frequencies for simulated data is included in brackets in
Table 3.
Table 1 shows positive performance of the model. The percentage of matches
ending after 2 sets is close to that in the observed matches. The 95% confidence
interval for the observed mean of 58% is from 47.5% to 68.78%, while the simulated
percentage is 62.90%. The number of matches won by lower seeds is slightly lower
than the 50/50 split seen in the observed data, but the difference is not statistically
significant. The momentum effect in the simulated matches is also similar, with a 1%
(insignificant) error in the prediction on the winner of the match conditional on the
winner of the first set.
24
Table 2 examines the characteristics of individual match scores in the simulated
data. In the actual data, there were no significant differences between the set score
differentials. The simulated score differentials in each fall into the 95% confidence
interval for the observed data. In this table, we see that 63.14% of matches are
correctly predicted by the model. The probit model using the same variables is able
to predict 65.34% of matches correctly. Unfortunately, Boulier and Stekler (1999) do
not report a percent correctly predicted for their probit model of tennis, so the results
are not directly comparable. This slight difference in prediction of individual match
outcomes is not statistically significant at the 1% level. Even if it were, it would not
be directly important, since the key question of the paper is whether we can detect
variance in effort levels within the match, and this is determined by the ability of the
model to explain inter-match dynamics.
6 Discussion
This paper seeks to estimate a parameter which quantifies the effect of incentive re-
sponse on the results of rank-order tournaments, using data on tennis to estimate the
model. The structure of the SPNE model allows the identification and estimation
of the parameter r, which governs the marginal cost of effort above and below abil-
ity. Essentially, if r is significantly different from zero, we see players altering their
effort levels in response to incentive advantages in the data, which corresponds to
a finite cost of effort levels above or below ability. The model is in fact picking up
this marginal cost, but because of the structure of the model, we would only see a
significant parameter estimate for r if agents were adjusting their effort levels.
25
This parameter estimate differs from those found in FS, since their paper did
not find any strong evidence for strategic effects. In their estimation, the estimated
parameter r∗ was -10.44 in hockey series, and small in other sports, with standard
errors of 22778 and higher. This paper shows stronger evidence of incentive response
among participants in a rank order tournaments then has been shown in other studies.
The evidence of agents altering their behaviour in response to identified incen-
tive structures has important implications for many fields of applied microeconomics.
The tournament model has been applied theoretically to research contracts, executive
salaries and principal-agent production problems. If we can be confident in the first
assumption of the model, that is that agents perceive their effort costs and bene-
fits relative to other competitors, and set their efforts to maximize payoffs, then we
can place a great deal more confidence in these theoretical treatments. This paper
provides evidence toward that important under lying confidence in the tournament
model as a predictor of agent behaviour.
26
References
[1] Bryan Boulier and H.O. Steckler, Are sports seedings good predictors?: an eval-
uation, International Journal of Forecasting 15 (1999), 83–91.
[2] Clive Bull, Andrew Schotter and Keith Weigelt, Tournaments and piece rates:
An experimental study, Journal of Political Economy 95(1) (1987), 1–33.
[3] Jurgen A. Doornik, Ox version 3.00, June 2001.
[4] Robert Drago and John S. Heywood, Tournaments, piece rates, and the shape of
the payoff function, Journal of Political Economy 97(4) (1999), 992–998.
[5] Christopher Ferrall and Anthony J. Smith, A sequential game model of sports
championship series: Theory and estimation, Review of Economics and Statistics
81 (November, 1999), 704–19.
[6] Kevin A. Gale, The determination of set score differences in men’s professional
tennis, Master’s thesis, Queen’s University, 2000.
[7] Kenneth L. Judd, Numerical methods in economics, The MIT Press, Cambridge,
Massachusetts, 1998.
[8] Edward Lazear and Sherwin Rosen, Rank-Order Tournaments as Optimum
Labour Contracts, Journal of Political Economy 89(5) (1981), 841–866.
[9] , Testing the Theory of Tournaments: an Empirical Analysis of Broiler
Production, Journal of Labour Economics 12(2) (1994), 155–179.
27
[10] R.D. McKelvey, Handbook of computational economics, Handbooks in Eco-
nomics, ch. 13, pp. 87–142, Elsevier Science, North-Holland, Amsterdam; New
York and Oxford, 1996.
[11] Martin J. Osborne and Ariel Rubinstein, A course in game theory, The MIT
Press, Cambridge, Massachusetts, 1994.
28
A Appendix
A.1 Proof of Propositions
Proposition 1
The problem of a player in the game is to maximize his/her payoff function subject
to the maximization decision of the other player in Nash equilibrium. The objective
function for each player is expressed generally as:
−e−δiω
r exiω
r + γt′ωF (It(xaω − xbω))∆Vt(ωt, ω−t) + (1 − γt′ω)Vt(ωt) + γt′ωVt(ω−t) (26)
thus the first order, necessary conditions for an interior solution for players a and b
respectively are:
exaω
r = rγbωf(xaω − xbω)∆Va(ωa, ωb)eδaω
r (27)
exbω
r = rγaωf(xaω − xbω)∆Vb(ωa, ωb)eδbω
r (28)
dividing (27) by (28), and substituting for index of strategic advantage ∆ω defined in
(12) yields:
xaω − xbω = ∆ω + r lnγbω
γaω
(29)
and substituting (29) into the first order conditions and solving for interior effort level
x yields the interior solution in (14).
Substituting the interior effort level in (14) into the value function for the player
29
yields the value of playing a pure strategy:
γt′ω∆Vtω
(−rf(∆ω + r ln
γbω
γaω
) + F (It(∆ω + r lnγbω
γaω
))
)(30)
+ (1 − γt′ω)Vt(ωt) + γt′ωVt(ω−t)
The mixed strategy equilibrium allows for a player t to give up with probability
(1 − γtω), where giving up has the effect of losing with certainty if the other player
puts in positive effort (xt′ > −∞) and in the case where both give up, the game is a
toss up (50/50). The indirect value of giving up is therefore defined as:
γt′ωVt(ω−t) + (1 − γt′ω)Vt(ωt) + Vt(ω−t)
2(31)
so we know that the interior solution will be preferred to the mixed strategy equilib-
rium if the indirect values defined above are such that (30) > (31), which is simplified
to yield (15). If the pure strategy interior solution is not strictly preferred to giving
up, the player will set γt such that the values of each are exactly equal in equilibrium,
such that (30) = (31), as given by the zero-function in (16).
And finally, the probability of winning at each node is given simply by:
Pω = prob(player a wins) = γaωγbωF (∆ω + r lnγbω
γaω
) (32)
+ (1 − γbω) +(1 − γaω)(1 − γbω)
2
QED
30
Parameter r
The added explanatory power of the parameter r is best understood in a couple
of steps. Looking at the winning probabilities described above, if both players are
playing pure strategies, the outcome is described by:
Pω = = F (∆ω) (33)
= F (α + β(Xj) + rνω)
which, if we do not employ the subgame perfect equilibrium structure, is described
simply as a probit (or logit) model with a latent regressor νω. The SPNE model
allows us to solve for νω at each node of the match, and thus to include both the
parameter r and the ability parameters β in the estimation.
31
Table 1: Summary of ATP Tennis Tournament Championship and Simulated Matches
Data SimulationTotal 86 8600% ending after 2 sets 58.14% 62.90%
Mean Abs. Seed Difference (all players) 2.8023 2.8023Mean Abs. Seed Difference (2 seeded players) 2.7692 2.7692
Match won by lower seed 50.00% 46.04%
Match involves 2 seeded players 26 26001 seeded player 30 30000 seeded players 30 3000
Matches won by player winning first set 81.4% 80.11%in 2 sets 58.14% 62.90%player losing first set 18.6% 19.88%
Percent of Matches Correctly Predicted 63.14%
32
Table 2: Summary of Real and Simulated Data Match Scores
Variable Data Mean Simulated Mean(Data Std. Dev.) (Simulated Std. Dev.)
winner’s seed 5.360 5.619(3.276) (3.255)
loser’s seed 7.395 7.137(2.792) (2.940)
2 set match .5814 .6314(.4962) (.4827)
Score diff. - Set 1 2.581 2.962(1.324) (1.495)
Score diff. - Set 2 2.674 2.931(1.384) (1.519)
Score diff. - Set 3 2.694 2.896(1.348) (1.453)
Correctly Predicted .6314(.4827)
33
Table 3: Frequency of Scores being 6-x by set
Set Loser’s Score 1st Set 2nd Set 3rd Set
0 2.33 2.33 -(5.12) (6.28) (3.76)
1 5.81 8.14 13.89(13.14) (12.44) (9.40)
2 16.28 20.93 11.11(17.67) (15.35) (20.06)
3 25.58 16.28 30.56(22.91) (21.63) (25.39)
4 23.26 27.91 19.44(19.19) (22.67) (22.26)
5 26.74 24.42 25.00(21.98) (21.63) (19.12)
() indicates values simulated with estimated parameters
34
Table 4: Maximum Likelihood Estimation of Structural Model Parameters
Variable Model 1 Model 2 Model 3 Model 3Seed Difference(SD) −0.02345∗ 0.06233∗ 0.06168∗ −0.02358∗
(4.917e−5) (0.001345) (0.001371) (.002313)SD2 −.008361∗ −.008230∗ 0.003440∗
(1.325e−5) (1.331e−5) (3.044e−5)Rank Difference(RD) −4.725e−5∗ −3.122e−5∗
(1.189e−8) (1.286e−8)Seed Indicator(SI) −.32435∗
(.01601)r∗ −2.168∗ −2.237∗ −2.237∗ −2.329∗
(0.05911) (.08194) (.08202) (.1200)ln(likelihood) -467.64 -464.53 -464.43 -460.41
* = significant at the 1% level
Table 5: Confidence Intervals on Incentive Response Parameter
Model Specification Coeff. Std.Err. 99%CI+ Implied 99%CI+Parameter
Model 1 -2.168 0.05911 0.1287 0.1144 0.1017Model 2 -2.236 0.08194 0.1257 0.1068 0.09078Model 3 -2.237 0.08202 0.126 0.1068 0.09072Model 4 -2.329 0.1200 0.1236 0.09740 0.07674
35
Table 6: Maximum Likelihood Estimation of Probit Model Parameters
Variable Model 1 Model 2 Model 3 Model 3Seed Difference(SD) −0.03350∗ 0.08219∗ 0.08147∗ −0.02878∗
(5.085e−5) (0.001914) (0.001917) (.003296)SD2 −.01113∗ −.01098∗ 0.004146∗
(1.727e−5) (1.740e−5) (4.332e−5)Rank Difference(RD) −5.935e−5∗ −3.539e−5∗
(1.846e−8) (1.852e−8)Seed Indicator(SI) −.4072∗
(.01884)ln(likelihood) -469.48 -465.89 -465.79 -461.37
* = significant at the 1% level
36
1.15
1.05
1.2
1.1
r
10.6 0.80.40.2
Figure 1: Effort Cost of Playing 10% aboveability
1.2
1.4
0.8
1
0.6
x
0.30.250.20.10.050 0.15
Figure 2: Effort Cost, r=.28, δ=.2
2
1.5
1
0
0.5
x
0.20.1980.1960.1940.1920.19
2.5
Figure 3: Effort Cost, r=0.001, δ=.2
37
0,0
1,0 0,1
1,1
2,1 1,2
2,2
3,2 2,3
2,0
3,1
0,2
1,3
3,0 0,3
4,1
0,44,0
0,55,0 1,4
5,5
5,4 4,5
4,4
4,3 3,4
3,3
5,3
4,2
3,5
2,4
5,2 2,5
1,55,1
PLAYER 1WINS
PLAYER 2WINS
Figure 4: The Game Tree for a Set
38
1,0 0,1
0,0
1,0
0,1
1,1
2,1
1,2
2,2
3,2
2,3
2,03,1
0,21,3
3,0
0,34
,1
0,4
4,0
0,5
5,0
1,4
5,5
5,4
4,5
4,4
4,3
3,4
3,35
,3
4,2 3
,5
2,4
5,2
2,5
1,5
5,1
A WINS
0,0
1,0
0,1
1,1
2,1
1,2
2,2
3,2
2,3
2,03,1
0,21,3
3,0
0,34
,1
0,4
4,0
0,5
5,0
1,4
5,5
5,4
4,5
4,4
4,3
3,4
3,35
,3
4,2 3
,5
2,4
5,2
2,5
1,5
5,1
0,0
1,0
0,1
1,1
2,1
1,2
2,2
3,2
2,3
2,03,1
0,21,3
3,0
0,34
,1
0,4
4,0
0,5
5,0
1,4
5,5
5,4
4,5
4,4
4,3
3,4
3,35
,3
4,2 3
,5
2,4
5,2
2,5
1,5
5,1
0,0
1,0
0,1
1,1
2,1
1,2
2,2
3,2
2,3
2,03,1
0,21,3
3,0
0,34
,1
0,4
4,0
0,5
5,0
1,4
5,5
5,4
4,5
4,4
4,3
3,4
3,35
,3
4,2 3
,5
2,4
5,2
2,5
1,5
5,1
B WINS
B WINSA WINS
1,1
Figure 5: The Game Tree for a Match
39
Liste des cahiers de recherche publiés par les professeurs des H.E.C.
2003-2004 Institut d’économie appliquée IEA-03-01 ROBERT GAGNÉ; PIERRE THOMAS LÉGER. « Determinants of Physicians’ Decisions to
Specialize »,29 pages. IEA-03-02 BENOIT DOSTIE. « Controlling for Demand Side Factors and Job Matching: Maximum
Likelihood Estimates of the Returns to Seniority Using Matched Employer-Employee Data »,24 pages.
IEA-03-03 ALAIN LAPOINTE. « La performance de Montréal et l'économie du savoir: un changement
de politique s'impose », 35 pages. IEA-03-04 MICHEL NORMANDIN; LOUIS PHANEUF. « Monetary Policy Shocks: Testing Identification
Conditions Under Time-Varying Conditional Volatility », 43 pages. IEA-03-05 MARTIN BOILEAU; MICHEL NORMANDIN. « Dynamics of the Current Account and Interest
Differentials », 38 pages. IEA-03-06: MICHEL NORMANDIN; PASCAL ST-AMOUR. « Recursive Measures of Total Wealth and
Portfolio Return », 10 pages. IEA-03-07: DOSTIE, BENOIT ; LÉGER, PIERRE THOMAS. « The Living Arrangement Dynamics of Sick,
Elderly Individuals », 29 pages. IEA-03-08: MICHEL NORMANDIN. « Canadian and U.S. Financial Markets: Testing the International
Integration Hypothesis under Time-Varying Conditional Volatility », 35 pages.
- 1 -
- 2 -
IEA-04-01: ANDREW LEACH. « Integrated Assessment of Climate Change Using an OLG Model », 34 pages.