Loureiro et al,2010pvar effet sptial .pdf

download Loureiro et al,2010pvar effet sptial .pdf

of 41

Transcript of Loureiro et al,2010pvar effet sptial .pdf

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    1/41

    THE DYNAMICS OF LAND-USE IN BRAZILIANAMAZON

    Paulo R. A. LoureiroAdolfo Sachsida

    Mrio Jorge Cardoso de Mendona

    Universidade de Braslia

    Departamento de Economia

    Srie Textos para Discusso

    Department of Economics Working Paper 342University of Brasilia, November 2010

    Texto No 342Braslia, Novembro de 2010

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    2/41

    T HE DYNAMICS OF L AND -USE IN BRAZILIAN AMAZON

    Mrio Jorge [email protected]

    IPEA - Instituto de Pesquisa Econmica Aplicada (Brazil)

    Paulo R. A. Loureiro

    [email protected]

    University of Brasilia (UnB)

    Adolfo Sachsida

    [email protected]

    IPEA - Instituto de Pesquisa Econmica Aplicada (Brazil)

    Abstract : This paper studies the dynamics of land-use in the Brazilian Amazon using a

    structural vector autoregressive (SVAR) model. The heterogeneity in the data is controlled

    by mean of fixed effect panel specification. Meanwhile, spatial autocorrelation is also

    diagnosed by a statistical methodology that allows to us to split the model in subsamples(clusters) of more homogenous municipalities in order to re-estimate the model on separate

    clusters. The clustering analysis shows that there are three clusters whose land-use patterns

    are strongly different in an economical point of view. The first cluster identifies the pioneer

    fronts; dedicated to logging, natural resources exploitation and slash-and-burn cultures, the

    second cluster have grown a more diversified agriculture while the third cluster presents

    most developed, intensive agriculture oriented municipalities.

    Another distinctive feature of this article pertains to the assessment of the

    contemporaneous causal order that exists among distinct land-uses. This permits to evaluate

    the succession dynamics that derive from unexpected innovations in the process of soil

    occupation by means of impulse response functions (IRFs). The IRFs applied for cluster 1

    lead to the following results: (1) the new demand for cropping requires to clear new areas

    of forest. This extra cleared land will be transformed in pasture land or fallow in the long

    mailto:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]
  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    3/41

    term. This process can be considered a necessary outcome of the slash-and-burn

    agriculture, a common practice in the Amazon; (2) Contrary to other studies we do not find

    evidence that cattle ranching is the primary driver of deforestation; (3) the impact of a

    shock of pasture land on itself is virtually null at the beginning but it augments substantially

    over time not requiring to clear extra areas of forest land but rather competing with crop

    land, and (4) it seems that if not for all the Amazon Basin, at least in this cluster, cattle

    ranching and cropping could be competitive activities.

    Key words : Brazilian Amazon, Vector autoregressive, Panel data, Spatial autocorrelation,

    Impulse response functions.

    JEL : Q24; Q34; Q56; C33; C38.

    1. INTRODUCTION

    Over the past two decades the international community has become aware of the global

    and regional environmental risks associated to possible massive forest losses in the

    Brazilian Amazon. The impacts on global carbon cycle, regional climate and the loss of

    biodiversity are among the main consequences of extensive land-use change processes in

    this region. Therefore a good understanding of land-use dynamics in the Brazilian Amazon

    is a fundamental step before action whether controlling the problem or even change its

    course towards a more sustainable path.

    Several studies have already contributed to a better understanding of the underlying

    processes that drive land-use change in the Amazon. One of the main difficulties of land-

    use studies lies in the fact that relevant economic and natural processes are fundamentally

    defined (and face constraints) at a fine geographic scale. This calls for an effort to bridgemethodological gaps within a triangle of three approaches: classical economic approach

    (e.g. Angelsen, A., 1996), agent-based models (e.g. Soares-Filho & alli, 2006) and

    econometric modelling [(Reis and Blanco, 1997); (Pfaff, 1999), (Reis and Guzmn, 1994),

    inter allia]. Each of these approaches has advantages and drawbacks relative to their

    respective research objectives. Here, we focus on the econometric approach. This

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    4/41

    methodology, applied to land-use, has been criticized for its lack of economic consistency

    (in comparison for example to sectoral or general equilibrium models), but it has

    definitively the advantage to unveil the most out of existing data when the primary research

    issue is to describe land-use dynamics in the whole Amazonian perimeter, including non-

    optimal succession of uses from a strict economic standpoint.

    Econometric and statistical methods have been used for example with the objective to

    determine the main dynamical features between succeeding anthropogenic land-use

    sequence, like, e.g., cropland, pastures, and regeneration, which constitute a typical pattern

    of successive land-uses following land clearing. Indeed, in order to seek for deforestation

    drivers, it is important to know up to what extent, in past deforestation trends, cropland or

    pastures uses are more or less keen to immediately follow deforestation. Such trends mayalso differ in different frontier patterns, for example the eastern versus southwestern

    frontiers of the Amazonian Basin.

    In one of the first attempts to determine a succession profile for the whole Amazon,

    Andersen et al. (2002, 1997) estimated a reduced form vector autoregressive (VAR) model

    involving the municipalities of Brazilian Amazon over the period 1970 through 1985.

    Concerning the use of this model to formulate environmental policies some remarks need

    be stated. A first difficulty lies in the fact that spatial heterogeneity among cross-section

    units (municipalities) has to be taken into account in order to contemplate the geographic,

    environmental and economic diversity among those different units and the effect of their

    mutual interactions. A second difficulty is related to the fact that land-use census data in the

    Brazilian Amazon are only available in a few 5-years time steps.

    The environment in the Amazon is subject to very strong dynamic forces that are

    capable to change the patterns of land-use even in a moderately short period of five years,

    that is, the period in which the data are collected. Combined with the fact the Andersen et

    al. model is a reduced form VAR and not a structural model, it implies that the

    contemporaneous (i.e. at each studied time step) relationships among distinct land-uses

    occuring during a given period are not clearly treated. In this case, the model fails to

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    5/41

    adequately estimate the succession dynamics that derives from contemporaneous causal

    orders of land-use change. In other words, such an approach is not adequate to uncover

    stylized facts on the short-run impacts of the identified exogenous sources in the land-use

    process. For instance, an exogenous demand shock on agricultural product or meat prices

    generates both a current and subsequent effect on soil occupation. In the same way a

    technological shock or demographic shock can induce a new unexpected evolution of land-

    use. However, the proper understanding of this kind of diffusion processes in Amazonian

    land-occupation can be improved by using a model that captures the structural relations

    among different land-uses .

    In this paper we adopt the structural VAR (SVAR) model with panel data to assess

    impacts of the identified exogenous sources. We expand and refine previous works totackle these metodological issues. First, we take into account the heterogeneity among land

    units using the panel data 1 model [(Hsiao, 1995), (Baltagi, 1995), (Arellano, 2003), etc.],

    mixing information concerning variation among individual units with variations taking

    place over time. Second, in the structural VAR literature, the contemporaneous relationship

    is identified on the basis of prior information supported by a-theoretical considerations.

    Notwithstanding, another distinctive feature of this article is the employment of Directed

    Acyclic Graphs (DAGs) to assess the contemporaneous causal order of the SVAR. Using

    the selected orderings to identify the SVAR models we obtained their impulse response

    functions (IRFs). The IR functions allow to model the dynamics of distinct land-uses

    derived from an unexpected shock on the occupation of the Amazon Basin. Finally,

    because the object of this study is clearly the spatial process that results from the complex

    interplay of many phenomena occurring in a much extended spatial domain, it is obviously

    a spatial phenomenon and we may expect spatial autocorrelation to be present in the data.

    The methodology developed here aims to diagnose the extend of the problem so that in a

    first step, the model may be re-estimated on subsamples of more homogenous groups of

    municipalities. Moreover, this methodology allows us to propose a comprehensive

    interpretation of the meaning of the clusters in an economic point of view.

    1. Important sources of variation may be left out if the data is only pooled in a single (temporal or spatial)dimension, and more precise parameter estimates can be obtained in panel approaches that explore thevariability present in the data both across counties and within counties over time.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    6/41

    The article is organized as follows. The section 2 presents the methodology used to

    analyse the dynamics of land-use introducing the distinct subjects involved in our analysis

    such as structural and reduced form VAR, panel data method, spatial autocorrelation and

    identification by DAGs. The database is described in section 3. The econometric results are

    presented in sections 4 and 5. In section 4 we present many partial but important results

    derived from estimation, spatial analysis and the identification procedure undertaken by

    DAGs methodology. In section 5 we present the clustering and the IRFs analyses. Some

    concluding remarks are stated in section 6.

    2. M ETHODOLOGY

    2.1. Overview

    The need for a specific methodology allowing to deal with the contemporaneous

    relationship existing among distinct land-uses was soon perceived by many authors.

    Andersen et al. (1997, 2002) and Weinbold (1999) estimated a model of land-use using a

    vector autoregressive (VAR) specification on the municipalities of Brazilian Amazon. This

    model is given by the system of equations (1):

    it it it it it it

    it it it it it it

    it it it it it it

    fallow pasturecrop Dclear fallow

    fallow pasturecrop Dclear pasture

    fallow pasturecrop Dclear crop

    31413121

    21413121

    11413121

    (1) 2

    The indexes i and t are associated, respectively, to each municipality and each time

    period. Our point here is to reformulate this model in order to contemplate some interesting

    issues about land-use in the Brazilian Amazon. The first issue is how to identify the

    contemporaneous causal order underlying the land-use process. Any model that does not

    2 Here it it it it fallow pasturecropclear and 1 it it it clear clear Dclear .

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    7/41

    take this point into account is inappropriate in a policy perspective concerning land-use in

    the Amazon. The second issue is that equation system (1) does not deal explicitly with the

    heterogeneity in climate, soil quality, etc among the municipalities of the Brazilian

    Amazon. As a third issue, we question the possibility to analyze adequately deforestation in

    this dynamic framework. Finally, considering that the land-use change is a spatial

    phenomenon the treatment of spatial autocorrelation in the data is an open issue.

    In this paper, these questions are tackled in the following way. The two first problems

    are treated simultaneously using an econometric model that contemplates spatial

    heterogeneity among counties and contemporaneous causal order in the same framework.

    This is performed in a structural vector autoregressive (SVAR) with panel data

    specification. In order to model deforestation we include to system (1) an extra endogenousvariable associated explicitly to deforestation and hereafter designated by forest 3. Finally,

    considering the difficulty to incorporate directly spatial autocorrelation in a SVAR model

    with a panel structure, we chose a more progressive route where in a first step we carefully

    study the spatial correlation in the residuals after a first estimation step. Indeed, the

    literature either implements SVAR panel models (Favero, 2002) or spatial SVAR models

    (Di Giacinto, 2003). The direct estimation of the model with a spatial component would

    require further theoretical developments. Thus we rely on spatial autocorrelation diagnosis

    tests and propose to split the sample into homogeneous sub-samples for re-estimation with

    the idea to lower spatial autocorrelation.

    2.2. VAR analysis: Reduced x Structural form and Identification

    Considering that natural forest can be seen as another category of land-use, the

    endogenous variables of our model can be represented by the following vector

    ),,,(),,,( 4321 it it it it it it it it forest fallow pasturecrop y y y y . Under this notation we propose to

    use the following model to formalize the dynamics of land-use in Brazilian Amazon:

    3 This variable is associated to natural land and considers both natural forest and natural pasture. More comments about the

    variables used in this research appear in Section 4.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    8/41

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    9/41

    with ),0(~ N u where u is the reduced form of disturbance covariance matrix and it

    is also assumed that ),0(~ , with a diagonal matrix. The relationship between

    structural form and reduced form is based on the following identities, providing 0 A is

    invertible, c Ab 10 , 11

    01 A A B , t

    10t Au and:

    '', )())(( 101

    01

    0t t 1

    0 A A A E A

    (5).

    Note that this representation does not allow to identify the effects of exogenous

    independent shocks onto the variables, since reduced form residuals are contemporaneously

    correlated (the matrix is not diagonal) 6. That is, the reduced form residuals t u can be

    interpreted as the result of linear combinations of exogenous shocks that are notcontemporaneously (in the same instant of time) correlated. In evaluations of the model

    (and economic policies) it only makes sense to measure exogenous independent shocks.

    Therefore, it is necessary to present the model in another form where the residuals are not

    contemporaneously correlated.

    Thus, without additional restrictions on 0 A we cannot recover the structural form from

    the reduced form because does not have enough estimated coefficients to identify an

    unrestricted 0 A matrix. Therefore, we need to set up restrictions in order to identify and

    estimate 0 A7. This procedure is named identification. It is possible to estimate the reduced

    form parameters b , 1 B and consistently, but except for forecasting t Y given 1t Y , these

    matrices are not the parameters of interest.

    6 These shocks are primitive and exogenous forces, with no common causes, that affect the variables of the model.

    7 The matrix 0 A cannot have, together, a number of free parameters bigger than the number of free parameters in the symmetricmatrix . If n is the number of endogenous variables of the model then, to satisfy the order condition for identification of 0 A ,it is necessary that the number of free parameters to be estimated in 0 A be no bigger than n(n-1)/2. When n is smaller thann(n-1)/2 the model is over-identified. There exists no simple general condition for local identification of the parameters of A.However, as has been shown by Rothenberg (1971), a necessary and sufficient condition for local identification of any regular

    point in R n is that the determinant of the information matrix be different from zero. In practice, evaluations of the determinantof the information matrix at some points, randomly chosen in the parameter space, is enough to establish the identification of amodel.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    10/41

    Spirtes, Glymour, and Scheines (1993, 2000) [hereafter SGS] and Pearl and Verma

    (1991) claimed that it is possible to make causal inferences based on associations observed

    in non-experimental data without previous knowledge. These restrictions follow from

    directed acyclic graphs (DAGs) estimated by the TETRAD software developed by SGS

    using as input the covariance matrix of the reduced form disturbances. Moreover, if the

    causal relations can be represented by DAGs, SGS have shown that under some weak

    conditions the Markov Condition and distribution of random variables faithful to the

    causal graph there exist methods for identification of causal relations that are

    asymptotically (in sample size) correct 8 . The results of SGS are discussed in several

    articles 9 In other words, this methodology enables us to read from the data the

    contemporaneous relationships existing among distinct land-uses in the Amazon.

    2.3. Spatial Autocorrelation

    In order to assess spatial autocorrelation directly in system (3) some new elements

    must be introduced in our analysis. A standard choice is to include in the mean process a

    spatial autoregressive component that takes the spatial locations into account. This can be

    done by the use of a contiguity or neighborhood matrix )( ijwW with ijw representing the

    neighborhood between the sites i and j, such that 0ijw if the sites i and j are neighbors,

    and 0ijw otherwise. This spatial weight matrix is a square matrix representing the spatial

    context. It encodes the neighborhood relationship among the spatial units. The literature on

    this subject is vast, we shall only retain here that the neighborhood relationships chosen by

    the analyst may change the results 10. In this sense equation (3) may be rewritten in the

    following way:

    t t t t Y AY W cY A

    1110 (6)

    t t t W 2

    8 Demiralp and Hoover (2003) evaluate the PC algorithm employed by TETRAD in a Monte Carlo study and conclude that it isan effective tool for the selection the contemporaneous causal order of SVARs.

    9 Swanson and Granger (1997) were the first to apply graphical models to identify contemporaneous causal order of a SVAR,although they restrict the admissible structures to causal chains. Bessler and Lee (2002) use error correction and DAGs tostudy both lagged and contemporaneous relations in late 19 th and early 20 th century U.S. data.

    10 See Anselin at al. (1988) for a detailed discussion.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    11/41

    where )( ij and )( ij are matrices and ),0(~ I t .

    In this study we assume that0

    which means that we focus our attention in random

    spatial effect or spatial correlation in errors. The development of a spatial SVAR panel data

    model in (6) is left for further research. We implement here the spatial analysis as a non

    stationarity and we propose to split the sample to estimate separate models. In doing so we

    expect to capture the land-use dynamics in homogeneous areas. The spatial autocorrelation

    analysis is conducted on the mean residuals of the model, that is, the arithmetic mean of the

    reduced form VAR model residuals 11 , because we are mainly interested in the splitting of

    the sample over the whole period.

    The methodology is implemented in four steps. First, spatial autocorrelation in the

    mean residuals of the model is tested by the Moran scatterplot (Anselin, 1995). We check

    two distinct contiguity structures (see below) to assess the stability of the results. That is, if

    the results are coherent for those different neighborhood structures we assume enough

    confidence to proceed. Second, the residuals are spatially smoothed meaning that current

    values are replaced by the average of the neighbors conditional on the neighborhood

    structure. Third, a principal component analysis (PCA) on the smoothed residuals is

    conducted. This method is motivated in Benali and Escoffier (1990) who name it Local

    Principal Component Analysis as a way to extract the systematic spatial features present in

    the data by local smoothing and dimension reduction 12 . As a byproduct, this tends to

    produce spatially connected clusters which is a desirable property of clusters for spatial

    data. Fourth, from the LPCA results we compute by clustering homogenous areas to split

    the model for re-estimation.

    Because one may have some difficulty to grasp all the different steps of the analysis we

    present a summary of the methodology we used in this paper. It can be described in the

    following way

    11 Formal tests on the whole residuals are left to posterior work.

    12 Note that in their original paper, these authors develop the dual of this analysis which consists in substracting from the originaldata the average of the neighbors. This method is not discussed here.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    12/41

    a) In the first step we estimate the reduced form VAR for unrestricted sample;

    b) With the residuals of VAR we analyse spatial autocorrelation throughout the

    method described in section 2.3. Based on this analysis the sub-samples of

    homogeneous areas (clusters) are selected;

    c) Next we re-estimate the VAR on selected sub-samples. In this step the structural

    VAR is recovered using the DAGs method to identify the contemporaneous causal

    orders; and

    d) Finally, the impulse response functions are computed.

    3. Database

    The main original source of the data available for this study comes from Brazilian National Agriculture Census elaborated by the Brazilian Institute for Geography and

    Statistics (IBGE) which is usually conducted every five years. Others original data sources

    used are the Industrial and Commercial Census that were also elaborated by the IBGE for

    the same periods. The data were collected for the following for years 1970, 1975, 1980,

    1985 and 1995 at the municipality level. The data were cleaned, harmonized and merged

    with data of other sources by the Institute for Applied Economic Research (IPEA) managed

    by the team of IPEADATA 13 . The original database includes data on economic,

    demographic, ecological and agriculture variables.

    In the Brazilian Amazon Basin a county can be subjected to ongoing change in its size

    mainly during the expansion of the agricultural frontier in Amazon. This fact obstructs the

    comparison between periods at county level. That is why the concept of Minimum

    Comparable Area (MCA) 14 was introduced, which is the smallest stable spatial unity

    during these five censuses that accommodates the changing county boundaries over the

    panel. The aggregation of counties in the later census years, in order to match the county

    area in 1970, is greatest in the more recently populated and sub-divided regions found in

    the legal Amazon.

    13 http://www.ipeadata.gov.br

    14 For further details about the concept of MCA see Reis at al. (2007)

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    13/41

    The agricultural censuses group all land into private land and public land. Private land

    is stratified into eight categories according to agricultural use. These are (i) annual crops,

    (ii) perennial crops, (iii) planted forest, (iv) planted pasture, (v) short fallow and (vi) long

    fallow are classified in cleared land, while (vii) natural forest and natural pasture are

    considered non-cleared land. A small category of private non-usable land (rivers,

    mountains, etc.) is also considered non-cleared land. Finally, all land that is not claimed by

    anyone is considered public land and by definition non-cleared.

    Based on these definitions the dependent variables used in our land-use model fall into

    one of the following four categories: cropland ( crop ), pasture ( pasture ), fallow ( fallow ) and

    natural land ( forest ). Cropland covers annual crops, perennial crops and planted forest.Pasture is planted pasture only. Fallow land includes short fallow, long fallow and non-

    usable land like roads, dams, etc. Natural land considers natural forest and natural pasture.

    4. Econometric Results

    4. 1. Regression Analysis

    We propose in this section a consistent method to estimate the reduced form VAR

    with panel data (4). Before derivation of the estimator, we must introduce notations to

    accommodate the data sample. The observations related to each equation j can be

    represented in the following way. For J j ,...,1 , '1111 ),...,,...,,...,( jNT jN T j j j y y y y y where

    j y is a 1 NT vector of dependent variables associated to equation j . Let

    '1 ),...,( jN j j the 1 N vector of individual effects and 1 NT the vector of noise

    given by '1111 ),...,,...,,...,( jNT jN T j j j . We assume that individual effects are fixed. In

    this sense the individual effect can be included into the set of parameters. Thus the error

    covariance matrix is defined so that NT j jj j j I E 2' )( .

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    14/41

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    15/41

    The residuals of the model estimated in the preceding section are analyzed as follows.

    First, the mean residuals over the all periods are computed. Second, these mean residuals

    are further centered and standardized. In short, the residuals are transformed into:

    ie

    ii s

    i

    eee

    (8)

    with s standing for standardized. This transformation is performed on each of the four

    equations of the SVAR, that is residuals of the crop , pasture , fallow and forest equations, in

    that order. In other words, the present analysis do not use the NJ-K observations of the

    estimated model but the mean of the residual vector on the N spatial units: the 257 AMCs

    of the Amazon Basin.

    4.2.2 Spatial weights

    The spatial weight matrix is the square N by N matrix representing the spatial context.

    In this context we intend to test several neighborhood schemes to check the stability of our

    computations. Note that in the literature it is generally admitted that the analyst may test

    several neighborhood matrices and retain the most stable or meaningful results. This is why

    we use two common neighborhood structures to compute our tests.

    The most common way to introduce spatial weights is the contiguity matrix, which is

    defined by the presence of a common frontier between two spatial units. We have then w ij =

    1 if i and j share a common border, zero if not. The second form of neighborhood retained

    here is the k-nearest neighbors matrix, (with k = 4 and k = 8). One easily sees that

    contiguity is a symmetric relation, but not nearest neighbors. Consequently, the contiguity

    matrix is symmetric but not the k-nearest neighbors matrix. The row sums are the degrees

    of the incidence matrix of the graph associated to the neighborhood relation; in the

    contiguity case, the degree is variable but in the k-nearest neighbors case it is constant and

    equal to k. Symmetry is a desirable property in spatial analysis but generally it is lost via

    normalization. Effectively, it is often more practical to standardize the neighborhood

    matrices, and this is done by dividing each row by the degree, that is, the number of

    neighbors.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    16/41

    4.2.3. Moran test

    The Moran spatial autocorrelation coefficient is defined as the ratio of the local

    autocovariances to the total variance of the series:

    2 x x

    x x x xw

    S n

    I i

    jiij (9)

    In other words it is the ratio of the covariance between neighbors to the variance of the

    whole series (complete graph): it is the fraction of total variance accounted by spatial

    neighborhood relationship. For the neighbors the normalization constant is S = ijw i,j, i. e.,

    the total number of links in the neighborhood matrix. The computation and visualization of

    the statistic can be greatly enhanced by the Moran scatterplot where the Moran I statistic is

    computed on the standardized residuals. From equation (8), let us call S ii e z ; given a row

    standardized neighborhood matrix W, then each row sums to one, and as there are exactly

    N rows, S=n, thus the first term of equation (8) is equal to one, to give in matrix form:

    zz

    zWz I t

    t

    (10)

    Then Moran I statistic can be easily expressed as a regression coefficient, as one can

    verify that if we pose: Wz y , and noting that the denominator of (10) may be written as

    1 zz t , we obtain:

    zy zz I t t 1 (11)

    This offers an easy way to compute Moran I, which is simply the slope coefficient of

    the regression equation (11) on the variable Wz, that is, one simply regresses y=Wz on z

    without a constant. The dependent is the spatially smoothed variable Wz and the

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    17/41

    independent is the original (standardized) variable. The standardization allows a clear

    interpretation of the regression graphs (see below). This phenomenon is easily understood

    in this context as Anselin (2003) puts it: he quotes deforestation studies as an exemplary

    case for spatial autocorrelation left in the residuals of an econometric model because of non

    observed variables.

    4.2.4. Spatial autocorrelation analysis

    The spatial correlation coefficients are summed up in Table 1 by contiguity structure.

    The computations were performed as explained in section 2.5, equation 3 and 4. Thus,

    neighborhood matrices are row standardized so that direct estimation by regression is

    possible. Computations where performed in GEODA (Anselin & al, 2005) and GRETL for

    comparison purposes and also because of a minor technical problem 17.

    Table 1: Moran coefficients by contiguity structure

    Neighborhood

    matrixVariable

    GRETL GEODA

    Moran

    I

    Bootstrap p-value

    for t

    Moran

    I

    Bootstrap p-value

    for t

    First order

    CROP 0.066 0.023** 0.066 0.007***

    PASTURE 0.303 0.000*** 0.303 0.002***

    FALLOW 0.306 0.000*** 0.307 0.001***

    FOREST 0.266 0.000*** 0.266 0.003***

    4NN

    CROP 0.035 0.153 0.035 0.020**

    PASTURE 0.124 0.000*** 0.124 0.004***

    FALLOW 0.239 0.000*** 0.239 0.001***

    FOREST 0.123 0.000*** 0.123 0.006***

    8NN

    CROP 0.026 0.113 0.026 0.030**

    PASTURE 0.102 0.000*** 0.102 0.001***

    FALLOW 0.239 0.000*** 0.239 0.001***

    17 In the first version of this paper we could not use GEODA to compute the Moran Is for the contiguity matrix becauseGEODA could not built properly the matrix because of defects in the shapefile of the Amazon Basin. These has now beencorrected and the full results presented here.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    18/41

    FOREST 0.063 0.003*** 0.063 0.014**

    ** significant at 5 % ; *** significant at 1 % or less. All tests

    performed with 999 permutations.

    Spatial autocorrelation is present and significant for most variables. For example,considering the contiguity neighborhood, it is lower for the crop equation (~ 0.07); but

    quite higher for the three other variables: forest (~ 0.27), pasture (~ 0.30) and fallow (~

    0.31) which is the highest of all four spatial autocorrelation coefficients computed for both

    software application and neighborhood structure. The magnitudes of these autocorrelation

    coefficients may seem modest, but because they relate to residuals estimates, they are in

    fact rather high. We can expect these to decrease for both neighborhood structures (higher

    order of contiguity and number of near neighbors).

    There are slight differences between neighborhood structures, but the conclusions are

    convergent between the two software implementations except for the crop residuals which

    are not significant under the 4 and 8 nearest neighbors in GRETL. These differences are

    due to the different implementations of the permutation tests between the softwares.

    According to GEODA, all neighborhood structures show significant spatial autocorrelation

    of the mean residuals for all equations but not in the case of GRETL. However, the orders

    of magnitude of the computed coefficients are identical between software implementationsat least up to the second decimal place and often to higher places.

    There is thus a relatively good stability of the estimates among neighborhood structures

    and softwares, because the magnitude of the computed coefficients leads to a similar

    ordering of spatial autocorrelation present in the residuals. Both for contiguity, and nearest

    neighbors, the crop is the lowest followed by forest , then pasture and fallow which is

    always highest.

    The magnitude of the autocorrelation coefficients for the nearest neighbors matrices

    decreases slowly for all variables except for fallow . This may be explained by the

    increasing smoothing effect induced by the growing number of neighbors which lower the

    difference to the average at each location and thus the Moran I. However, this effect does

    not seem to affect the fallow equation mean residuals. For this variable, it appears that

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    19/41

    spatial autocorrelation may be both stronger than for the other variables (at a given distance

    or contiguity order) but also that it may be felt up farther away 18.

    To sum up, there is significant autocorrelation for all residuals, with fairly robust

    conclusions. The fallow equation residuals show the highest and strongest spatial

    autocorrelation among the four series, this may indicate that the SVAR specification may

    lack explanatory power for this variable, but seem better for the others variables.

    4.2.5. Clustering Analysis

    This step allows splitting the panel of MCAs in the Amazon Basin as a way to reduce

    spatial autocorrelation that may be present in the residuals. The same spatially smoothed

    data are used to perform the PCA on the 4 vectors of spatially smoothed residuals. The

    smoothed mean residuals are defined by:

    si

    jii

    si eW k

    e ~

    * 1 (11)

    which is technically the same as variable y=Wz above. That is, the stared (spatially

    smoothed) residuals are the average of the neighboring residuals. The computations are

    done on the non symmetric k-nearest neighbors, with k=4,8 and for the order one contiguity

    matrix. The results for k=8 do not differ from those relative to k=4, so they are not

    presented here at this stage. The reader shall note that the splitted samples are exactly the

    same for both these orders of neighborhood.

    From the 4 vectors, the first two principal components explain 85 % of the variability in

    the data. Now on we apply the k-means algorithm on the two first principal components.

    Three clusters are found to be optimal. The clusters are depicted on map 1 below. Table 3

    in Appendix 3 gives the bootstrapped Moran I computations for cluster I and II of this map.

    Here cluster I is composed of 156 municipalities and cluster II 75 municipalities. The 26

    18 We note this fact for further investigation.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    20/41

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    21/41

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    22/41

    use in Brazilian Amazon. This why we restrict the use of prior theoretical restrictions in

    order to identify the matrix of contemporaneous relationship 0 A .

    4. 3. Identification

    4.3.1. The Identification Problem

    Structural inference and policy analysis employing VAR model require

    differentiating between correlation and causation, an issue known as the ide ntification

    problem. The practice in the literature has been the use of identifying assumptions based

    on economic theory or institutional knowledge to sort out the contemporaneous links

    among the variables in order to allow correlations to be interpreted causally. Given an

    exogenous impact on a certain category of land, it dictates the subsequent constraints by

    which the mechanism of land-use takes place during the period of five years after this

    exogenous shock. In short, it detects how the other categories of land-use react given a

    primary impulse on a certain type of land-use. In this paper it can be associated to the

    drivers or prime forcers of the soil occupation in this region.

    For example, credit and fiscal subsidies to the agriculture jointly with the expansion

    of the road network pushed the agricultural frontier in the northwestern part of the Amazon

    Basin while the colonization programs aided to fix people in the interior. Strong incentives

    were created to clear land for pasture. In fact, the growth of cattle herds has consistently

    been cited as one of the primary factors behind the land clearing in Amazon. Regarding the

    problem of the identification this analysis implies that the subsequent effect of an

    enlargement of the area of pasture has a contemporaneous effect on the other land-uses,

    most likely on forest land where new areas are required to be opened. Notwithstanding we

    need to check if it really happen.

    4.3.2. The Identification using DAGs

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    23/41

    Based on the results of the spatial autocorrelation analysis we re-estimate the reduced

    form VAR using the data related only for the biggest cluster (156 MCAs) because we doubt

    that the other clusters have enough information to warrant the computations of the results.

    Applying TETRAD at the 0.5% significance level 19 on the estimated disturbance

    covariance matrix

    and assuming that the variables selected for the model are causally

    sufficient 20 , we obtain what is known as a pattern 21 . The pattern is a graphical

    representation of the set of observationally equivalent DAGs containing the

    contemporaneous causal ordering of the variables.

    The DGAs detected four valid representations of the contemporaneous causal ordering.

    In accordance with this pattern three of the contemporaneous causal ordering display

    causality in one direction while one indicates causal ordering exists but it does not allow to

    know which one is the true 22. In this case we produced the IRFs of both identifications

    separately. Fortunately we did not note any relevant difference between them 23 . The

    relations derived from DGAs enter in matrix 0 A as restrictions that will help to identify the

    contemporaneous relationships 24 . The figure 1 shows the matrix 0 A identified using the

    causal ordering obtained from DGAs 25 . In this matrix one observes seven identified

    conditions represented by zeros. Because more the VAR has four variables the matrix 0 A

    requires six conditions for identification. Hence this matrix is over-identified 26.

    19 The significance level cannot be interpreted as the probability of type I error for the pattern output, but merely as a parameterof the search. Based on simulation tests with random DAGs, SGS suggests setting the significance level at 20% for samplesize smaller than 100; at 10% for sample size between 100 and 300; and at 0.5% (or smaller) for larger samples. Here wefollowed their suggestion. However, slight changes in the significance level can produce large variations in TETRADs output.

    20 A set of variables V is said to be causally sufficient if every common cause of any two or more variables in V is in V .TETRAD has a bias towards excluding causal relations present in the data, to overcome this problem it is suggested that a20% significance level be used.

    21 A pattern is a partially oriented DAG, where the directed edges represent arrows that are common to every member in theclass, while the undirected edges are directed one way in some DAGs and another way in others. Undirected edges ( ) meanthat there is causality in one of the two directions but not on both, while double oriented edges ( ) mean causality on bothdirections.

    22 The DAGs display the following pattern : pastures crops, fallow crops, forest crops and pasture forest .23

    The results can be obtained under request.

    24 In appendix 1 we show how to apply the causal ordering obtained by DGAs to identify matrix 0 A .

    25 We assume that pasture affects forest contemporaneous but the reverse is not true.

    26 See footnote 9.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    24/41

    FIGURE 1

    10001000010

    1

    42

    141312

    0

    A forest fallow past

    A A Acrops

    forest fallow past crops

    A

    5. Analysis of Empirical Results

    5.2. An economic interpretation of the partition in two clusters

    Given a partition built by a clustering on the mean residuals of the unrestricted

    VAR model, i.e. the estimates on the whole sample of 257 Amazon municipalities 27 we

    would like to know the most salient features of the partitions. In this aim, we use the notion

    of valeurs -test proposed in the French school of Exploratory Data Analysis. A valeur -

    test is the e quivalent of a t-stat computed on the means of the clusters, provided the that

    these variables are external to the computations (Lebart & al., 1995). It is then a test of

    significance of the difference of means in the clusters to the mean on the whole sample; the

    test is done in the context of sampling without replacement. In this work we consider avariable mean in a cluster to significantly differ from the population mean if the valeur-test

    is greater than 2.0 in absolute value. This means that a variable for which a test -score is

    greater than 2.0 conveys significant information on the feature pertaining to the given

    variable in the cluster. The usual and natural interpretation of these statistics carries over: a

    negative score indicates a significantly inferior mean in the cluster and conversely for a

    positive value.

    We computed different statistics for each separate year of census and also from all

    the years. This gives an interesting perspective about the evolution of the clusters

    27 The data form a panel consisting in the 257 municipalities of Legal Amazon for the space domain and 5 dates in the timedomain. The dates are separated by about five years which where years of censuses. Let us stress that the partitions where builtfrom the mean residuals of the SVAR model estimated on all the panel, thus the partitions are generated on 257 meanresiduals.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    25/41

    characteristics through time: the partitions are fixed (fixed zoning) but test-scores may vary

    at different dates. This would allow to sort out effects that may be present in the long run

    (30 years) from effects that are felt differently in time or more from more recent times.

    The t-scores on the illustrative attributes are used to describe the most interesting

    features of the clusters in each of the two partitions. To ease the exposition, the different

    attributes have been regrouped under the different topics: Demography, Agricultural

    Economy, General Economy, Education, Infrastructure, Natural Environment and Land-

    Use. The complete table is reported in Appendix 3. The reader is warned that not all

    variables may be present for all the years, we report and comment statistics for separate

    years on the available variables; note that some attributes are present on all years. For each

    partition a t-score was computed on all the attributes and every cluster in the partition. Thenumber of municipalities of each cluster is indicated as a number in parentheses under the

    cluster number. The t-scores have been computed on a set of 39 variables in the panel

    database, on all the years. In those computations, means and standard errors were computed

    using total population as a weighting variable 28.

    It is interesting to present a brief description of the clusters by topic and then compare

    the two partitions and comment those results. In accordance with the table of t-scores in

    Appendix 3 the demographic topic t-scores show that population and consequently

    dwellings are significantly higher than the average in group 1, lower and group 2, and not

    significantly different from the average in group 3. Thus, the biggest cluster of 156

    municipalities has a higher population, while the second is significantly less populated. We

    shall see below that this is also true for the size of municipalities. For this we would have to

    compute t-scores for the shares of dwellings with sanitation and electric lighting per group.

    The agricultural economy profile of t-scores shows that, in cluster one, even as we haveseen average population is higher than cluster two, agricultural investment is the lowest,

    while it is still lower than the average in cluster two and far higher in cluster three. The

    same is true for bovine livestock, total irrigated land area and number of tractors. On the

    28 More detailed results by year are available from the authors.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    26/41

    contrary, labor force in the agricultural sector is higher than the average in cluster three,

    lower in cluster two and insignificant in cluster three. The value of wood production differs

    only in cluster two, and the average rental value of land is higher in the municipalities of

    cluster one, lower in those of cluster two, and not different from the average in cluster

    three. However, the volume of commercialized wood is the most significantly different

    from the average in cluster one, with a t-score of +27.31, lowest in cluster two (-8.75), and

    significantly under the average in cluster 3. However, the average wood production

    potential (cubic meters per hectare) is essentially the same in clusters one and two and it is

    far above the average. Thus, it is quite clear that municipalities of cluster one are largely

    devoted to logging production, having a higher population and total employment in the

    agricultural sector, while those of cluster two have a more diversified agriculture that is less

    reliant on logging: these have even a larger than average surface of irrigated land while being less populated on the average but still retaining a similar logging potential. Finally,

    the average purchase price of live beef is significantly higher than average in cluster one,

    and lower in cluster two.

    The general economy profile of the clusters shows significant differences: higher share

    of wages in group one, lower in group 2; significantly lower average wages in group one,

    average in group three and far higher in group 3; lower industrial wages in group two but

    not significantly different from the Amazon mean in groups 1 and 3; Interestingly, a very

    significantly lower share of government/Banco do Brasil financing in cluster 1 but both

    higher in clusters 2 and 3 and the lowest SUDAM financing for cluster one, lower than

    average in group 2 and of course, significantly higher in group 3. These differences in

    economic profile illustrate differences of economic structures.

    Consider now the education and infrastructure profiles. Surprisingly, the share of

    persons with five years of schooling is far above the Amazon average in group 1, and

    significantly lower in group 2; the same is observed with the average number of years of

    education in the population. On the paved roads length, cluster one shows no difference to

    the average, but significantly less unpaved roads, and both lower for cluster two. There are

    slight differences in infrastructure between clusters one and two, and cluster three has

    significantly higher level of infrastructure. The contrast is the greatest between cluster one

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    27/41

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    28/41

    Group 2: the margins; older pioneer fronts that have grown a more diversified

    agriculture;

    Group 3: the oldest, most developed, intensive agriculture oriented municipalities.

    5.2. Analysis of IRFs for the biggest cluster (156 MCAs)

    In this section we use the selected causal ordering obtained by DAGs to identify the

    structural model on the biggest cluster of 156 MCA in order to build up the impulse

    response functions. This procedure allows assessing the dynamic effect on different types

    of land-use that derived from an unexpected innovation in a specific land-use. It can be

    used to evaluate the impacts on areas of pasture, fallow and forest due to an unexpected

    event on crop land. The dynamic effect includes not only the future impacts but also the

    contemporaneous responses.

    An important point refers to the fact that economic analysis can be indirectly performed

    by means of the proper interpretation of the innovation. For instance, an unexpected

    expansion of credit and subsidies implemented by a government policy aimed at promoting

    agricultural activities can be seen as an innovation on the variable crop. Unexpected

    movements of price can be also incorporated in this analysis like, for example, a rise in the

    price of t imber may cause a growth of deforestation. In the same sense, an elevation of the

    price of meat derived from a demand shock can be linked in the current context like a shock

    on pasture land.

    The IRFs are showed in figure 2. This figure must be viewed like a matrix in which

    each graph represents the effect of a shock of the land-use of column j on the land-use ofrow i. We start the analysis of IR functions checking the consequences of a shock on crops.

    In this case there exists an immediate effect on forest land. One can still see that there is lag

    effect on fallow and pasture about six periods after. The new demand for cropping requires

    to open new areas of forest and due to the poor quality of Amazons soil, crop land will be

    transformed in the future in pasture land or left as fallow. This process is in accordance to

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    29/41

    Smith and alli (1997) and can be considered a necessary outcome for the slash-and-burn

    agriculture, a common practice in the Amazon 30

    Figure 2

    Impulse Response Functions

    C r o p s

    P a s t

    P o u s

    i o

    F o r e s t

    2 4 6 8 10 12

    -2

    -1

    0

    1

    x 104Crops

    2 4 6 8 10 12

    -2

    -1

    0

    1

    x 104Past

    2 4 6 8 10 12

    -2

    -1

    0

    1

    x 104Pousio

    2 4 6 8 10 12

    -2

    -1

    0

    1

    x 104Forest

    2 4 6 8 10 12

    0

    1

    2

    3

    4

    x 106

    2 4 6 8 10 12

    0

    1

    2

    3

    4

    x 106

    2 4 6 8 10 12

    0

    1

    2

    3

    4

    x 106

    2 4 6 8 10 12

    0

    1

    2

    3

    4

    x 106

    2 4 6 8 10 12

    0

    2

    4

    6x 10

    4

    2 4 6 8 10 12

    0

    2

    4

    6x 10

    4

    2 4 6 8 10 12

    0

    2

    4

    6x 10

    4

    2 4 6 8 10 12

    0

    2

    4

    6x 10

    4

    2 4 6 8 10 12

    0

    0.5

    1

    1.5

    2

    x 105

    2 4 6 8 10 12

    0

    0.5

    1

    1.5

    2

    x 105

    2 4 6 8 10 12

    0

    0.5

    1

    1.5

    2

    x 105

    2 4 6 8 10 12

    0

    0.5

    1

    1.5

    2

    x 105

    Contrary from the other studies, we do not find evidence that cattle ranching is the

    primary driver of deforestation. The contemporaneous response of the variable forest to animpulse on pasture land is in fact positive but small compared to the response of this same

    variable to a shock on crops land. As a matter of fact, it can be seen that the area of forest

    30 The cycle of slash-and burn agriculture can be briefly described in the following way. In the first place, it begins withdeforestation targeting the implementation of agricultural activities involving many perennials crops. In the following stage,when the soil fertility declines, the site is used as pasture land. The cycle finishes when the soil is entirely exhausted and

    pasture area is converted into fallow land.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    30/41

    grows in the future (figure 2, graph at row 4, column 2). As we expected the effect on the

    area of fallow in response to the impulse on pasture land occurs fundamentally many

    periods after the shock (graph at row 3, column 2). The figure also shows that the impact of

    a shock on pasture land on itself is almost null initially but augments substantially over

    time (graph at row 2, column 2) ; it do not require extra areas of forest land to clear but

    rather competes with crop land (compare, graphs at row 1 and 2, column 2). In this sense

    pasture affects crops not only in the current period but also in the longer term while this

    long run impact appears considerably stronger than its contemporaneous counterpart. Our

    results suggest evidence that, at least in this cluster, if not for all the Amazon Basin, that

    cattle ranching and cropping could be competitive activities.

    6. Concluding Remarks.

    This paper studies the dynamics of land-use in the Brazilian Amazon using a

    structural vector autoregressive (SVAR) model. The heterogeneity in the data is controlled

    by the mean of fixed effect panel data. This method allows taking into account the specific

    features like geographic, environmental, economic diversity present among the different

    sites. Considering that the land-use is also determined by the interaction of the counties

    among themselves, possible spatial autocorrelation in the data must be taken into account.

    In this case spatial autocorrelation is diagnosed by a statistical methodology that allows to

    us to split the model in sub-samples of more homogenous municipalities so as to re-

    estimate the model on these. In doing so, cluster analysis shows that there are three clusters

    whose land-use patterns are strongly different.

    Furthermore, the illustrative attributes are used to describe the most interesting

    features in each of the three clusters. To ease the exposition, the attributes have been

    regrouped under the distinct topics such like demography, agriculture, education,

    infrastructure, environment, etc. In this way it can be said that the cluster 1 identifies the

    pioneer fronts; dedicated to logging, natural resources exploitation and slash-and-burn

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    31/41

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    32/41

    Arellano, M. (2003) . Panel Data Econometrics. Oxford University Press.

    Arellano, M. And Bond, S. (1991) . Some Tests of Specification for Panel Data: Monte

    Carlo Evidence and Application to Employment and Equations. Review of Economic

    Studies, 58: 277-297.

    Angelsen, A. (1996). Deforestation: Population or Market Driven? Different

    approaches in modelling agricultural expansion . CMI Working Papers 1996:9, Michelsen

    Institute, Bergen, Norway.

    Awokuse, T. and Bessler, D. (2003). Vector Autoregressions, Policy Analysis, and

    Directed Acyclic Graphs: an Application to the U.S. Economy . Journal of Applied

    Economics VI(1): 1-24.

    Baltagi, B. H. (1995). Econometric Analysis of Panel Data. John Wiley and Sons.

    Bessler, D. and Lee, S. (2002). Money and Prices: U.S. Data 1869-1914 (A Study with Directed Graphs) . Empirical Economics 27: 427-446.

    Binswanger, H.P. (1991). Brazilian Policies that Encourage Deforestation in the

    Amazon. World Development, 19(7): 821-829.

    Browder, J. O. (1988). The Social Cost of Rainforest Destruction: a critique of the

    hamburger debate . Intercience, 13: 115-120.

    Case, A. C. (1991). Spatial Patterns in Household Demand . Econometrica, 59 (4): 953-

    965.

    Demiralp, S. and Hoover, K. (2003). Searching for the Causal Structure of a Vector

    Autoregression. Oxford Bulletin of Economics and Statistics 65 (Supplement): 745-767.

    Enders, W. (1995). Applied Econometric Time Series, John Wiley and Sons.

    Faminow, M. D. (1998). Cattle, Deflorestation and Development in Latin American.

    CAB International.

    Fearnside, P. M. (2006). Desmatamento na Amaznia: dinmica, impactos e controle .

    Acta Amaznica, 36(3): 395-400.

    Hamilton, J. (1993). Time Series Analysis. Princeton University Press.

    Hsiao, C. 1995. Analysis of Panel Data . Cambridge University Press.

    Jacobs, R. Leamer, E. E. and Ward, M. P. (1979). Difficulties with Testing for

    Causation. Economic Inquiry, 17: 401-413.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    33/41

    Lebart, L., Morineau, A., Tabart, N. (1995). Statistique Exploratoire

    Multidimensionnelle . Dunod, Paris.

    Mahar, D. (1989). Government Policies and Deforestation in Brazil's Amazon Region .

    Washington, DC: World Bank

    Myers, N. (1994). Tropical Deforestation Rates and Patterns . In: Brown, K and Pearce,

    D. (eds.) The causes of tropical deforestation, the economic and statistical analysis of

    factors giving rise to the loss of tropical forests, 172-91. University College London Press.

    Nepstad, D. C., Moreira, A. G. e Alencar, A. A. 1999 A Floresta em Chamas: Origens,

    Impactos e Preveno de Fogo na Amaznia. Braslia: Programa Piloto para a Proteo

    das Florestas Tropicais do Brasil .

    Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University

    Press.Pearl, J. and Verma, T. (1991). A Theory of Inferred Causation. In J. Allen, Fikes, R.

    and

    Pfaff, A.S. 1999. What drives deforestation in the Brazilian Amazon? Evidence from

    Satellite and Socioeconomic Data. Journal of Environmental Economics and Management,

    37(1): 26-43.

    Reis, E. J. and Blanco, F. A. (1997). The Causes of Brazilian Amazon Deforestation .

    Working Paper, IPEA.

    Reis, E and Guzmn, R. 1994. An Econometric Model of Amazon Deforestation . In:

    Brown,K and Pearce, D. (eds.) The causes of tropical deforestation, the economic and

    statistical analysis of factors giving rise to the loss of tropical forests, 172-91. University

    College London Press, London.

    Reis, E. J., Pimentel and Alvarenga, A. I. (2007). Areas Minimas Comparaveis entre os

    Anos de 1920 a 2000. WP NEMESIS.

    Robins, J., Sheines, R., Spirtes, P. and Wasserman, L. (2003). Uniform Consitency in

    Causal Inference . Biometrika 90(3): 491-515.

    Rothenberg, T.J. (1971) Identification in Parametric Models. Econometrica, VOL. 39,

    NO. 3-May, 1971. pp. 577-592.

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    34/41

    Smith, J., Winograd, M., Gallopin, G. and Pachico, D. (1998). Dynamics of the

    Agricutural Frontier in the Amazon: analysing the impact of policy and tecnology.

    Environmental Modelling and Assessment, 3:31-46.

    Spirtes, P., Glymour, C. and Scheines, R. (1993). Causation, Prediction, and Search .

    Lecture Notes in Statistics 81. Springer-Verlag.

    48(4): 555-568. 2000). Causation, Prediction, and Search. MIT Press.

    Soares-Filho, B. S.; Nepstad, D.; Curran, L.; Voll, E.; Cerqueira, G.; Garcia, R. A.;

    Ramos, C.; McDonald, A.; Lefebvre, P.; Schilesinger, P. Modeling Conservation in the

    Amazon Basin . Nature, 440: 520-523, 2006.

    Scheneider, R. R. (1992). An Economic Analysis of the Environmental Problem in the

    Amazon . Draft, World Bank.

    Swanson, N. and Granger, C. (1997). Impulse Response Functions Based on a Causal Approach to Residual Orthogonalization in Vector Autoregressions . Journal of the

    American Statistical Association 92(437): 357-367.

    Toiolo, A. and Uhl, C. (1995). Economic and Ecological Perspectives on Agriculture in

    Eastern Amazon . World Development, 23: 959-973.

    Walker, R. and Homma, A. K. O. (1996). Land-use and land Cover Dynamics in

    Brazilian Amazon: an overview. Ecological Economics, 18: 67-80.

    Wassenaar, T., Gerber, P. Verburg, P.H., Rosales, M., Ibrahim M. and Steinfeld, H.

    (2007). Projecting land-use changes in the Neotropics: The geography of pasture

    expansion into forest., Global Environmental Change, 17(1): 86-104.

    Weinhold, D. (1999). Estimating the Loss of Agricultural Productivity in the Amazon.

    Ecological Economics, 31: 63-76.

    Appendix 1. DAGS and the Identification of Structural Form VAR

    The SGS procedures allow us to establish the conditional independence relations that

    are equivalent to determining whose coefficients of matrix A 0 are equal to zero. The

    framework developed by SGS does not rule out the possibility of finding alternative

    sets of conditional independence relations for a given data set. In this case we arrive at a

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    35/41

    set of matrices A 0 that are observationally equivalent. It may be the case that the found

    conditional independence relations are not enough to allow for the identification of the

    matrices. In this case, additional restrictions are needed in order to identify the model.

    Next we show how DAGs can be used to impose restrictions that allows identification

    of Structural VARS (SVARs). In the example below we assume that the VAR has 4

    endogenous variables.

    The relationship between reduced form and structural form residuals is given by the

    following equation:

    t = [I -A 0 ] t + t

    where: t column vector, with dimension 4x1, with reduced form VAR residuals at

    period t;

    t column vector, with dimension 4x1, with structural form VAR residual at period t;

    A0 full rank matrix with the relationship between the two types of residuals.

    The above equations form a system of linear equations, where each variable (reduced

    form residual) is a linear function of its direct causes and an error term (structuralresidual), with error terms independent of each other. If the graph G that represents the

    model has no cycles (is a DAG) then the variables are generated by a Markovian model.

    Therefore, the model satisfy the property that guarantees the compatibility between its

    distribution function and graph G 31 . Because conditional independence implies zero

    partial correlation, Proposition 2 translates into a graphical test for identifying the

    partial correlations that must vanish in the model 32 . Therefore, equations (3) can be

    structured according to a DAG G, and the partial correlation coefficient 3V.2V1V

    vanishes whenever the vertices corresponding to the variables in V 3 d-separate vertex

    V1 from vertex V 2 in G.

    31 For a sketch of the proof see Pearl (2000).

    32 The partial correlation coefficient of X and Y, controlling for Z is given

    by 2/12/1. )1()1/()( YZ XZ YZ XZ XY Z XY .

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    36/41

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    37/41

    Appendix 2. Clustering Analysis using Bootstrapped Moran I computations

    The following table gives the bootstrapped Moran I computations for cluster 1 and 2of map 1. Here cluster 1 is composed of 156 municipalities and cluster 2 of 75

    municipalities. The 26 municipalities cluster is not included as the model could not be

    estimated because of insufficient degrees of freedom. All the computations were performed

    in GRETL because GEODA is lacking functionalities to select subsamples and compute

    associated stats.

    Here we present for comparison both final results for the truncated matrices and the

    correctly re-indexed matrix but only for the nearest neighbors structure. To be more

    precise, it is not possible to re-index the contiguity because when a neighbor is in another

    group, there are no new neighbor to replace it. This is why the border effect must be

    stronger, and subsequently the bias for the contiguity neighborhood. Table 3 presents the

    results for both computations of neighborhood, this allows to compare results and

    interestingly provide a qualitative assessment of border effects biases.

    Table 3: Moran test and bootstrap p-values for clusters I and II

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    38/41

    *significant at 10%; ** significant at 5 % ; *** significant at 1 % ; All tests performedwith 999 permutations; NA: Not Applicable.

    Zone 1: spatial autocorrelation is not significant for the 4NN, except for the fallow

    residuals. We know from initial computations on the whole model that fallow residuals

    display the strongest spatial autocorrelation, here it is significant at 5 % but largely

    reduced. For the 8NN, spatial autocorrelation is significant for crop and pasture . Note that

    it is negative for crop which is an interesting result meaning crop residuals values are

    significantly dissimilar at farther distance. Note that it is the only variable for which we

    find negative spatial autocorrelation. Then, both pasture and forest residuals do not display

    any significant spatial autocorrelation in the residuals of zone 1.

    Neighborhood

    structureVariables

    Moran I test sample

    Trucated matrices Correct matrices

    Zone 1 Zone 2 Zone 1 Zone 2

    MoranI

    Bootstrap

    pvalue

    for t

    MoranI

    Bootstrap

    pvalue

    for t

    MoranI

    Bootstrap

    pvalue

    for t

    MoranI

    Bootstrap

    pvalue

    for t

    First Order

    Contiguity

    CROP -0.091 0.072* 0.176 0.004***

    NAPASTURE 0.256 0.003*** 0.638 0.000***

    FALLOW -0.010 0.842 0.088 0.120

    FOREST 0.062 0.219 0.252 0.001***

    4NN

    CROP -0.007 0.691 0.186 0.005*** -0.004 0.802 0.219 0.120

    PASTURE 0.023 0.243 0.259 0.000*** 0.019 0.310 0.388 0.233

    FALLOW 0.032 0.089* 0.058 0.203 0.064 0.041** 0.095 0.048**

    FOREST 0.058 0.131 0.173 0.000*** 0.016 0.617 0.221 0.125

    8NN

    CROP -0.044 0.006*** 0.120 0.005*** -0.044 0.011** 0.004 0.004***

    PASTURE 0.025 0.144 0.201 0.000*** 0.026 0.118 0.000 0.000***

    FALLOW 0.020 0.221 0.067 0.055* 0.051 0.014** 0.049 0.243

    FOREST 0.031 0.281 0.132 0.001 0.025 0.388 0.000 0.002***

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    39/41

    Zone 2: At the 4NN we find the same results as for zone 1: significant spatial

    autocorrelation for fallow residuals. At the 8NN the picture is quite different, as we see

    significant autocorrelation for crop, pasture and forest residuals. The comparison of the

    results for truncated and correct neighborhood matrices show differences in values that are

    substantial for zone 2 (large differences) but smaller for zone 2. Also, the computed values

    for zone 1 at 8NN are nearly the same but significance levels differ strongly. These

    findings point to substantial biases due the boundary (border) effects and open interesting

    further research on this subject.

    Appendix 3.

    Table 4: t-scores for selected variables for partitions in two clusters

    whole panel, years 1970-1995

    Topics

    Variables Cluster

    1

    (156)

    Cluster

    2

    (75)

    Cluster

    3

    (26)

    Demograp

    hy

    Total population 11.43 -11.54 -0.92

    Urban population 10.79 -11.04 -0.81

    Total dwellings 9.70 -11.07 -0.35

    Dwellings with water 8.66 -10.13 -0.21

    Dwellings with sanitation 6.59 -7.84 -0.10

    Dwellings with electric lighting 9.54 -10.59 -0.46

    Agricultur

    al

    Economy

    Agricultural sector investments -9.13 -4.05 6.39

    Total employement in agricultural

    sector

    4.46 -6.18 0.17

    Bovine livestock -8.78 -3.47 5.97

    Irrigated land area -7.96 2.15 3.28

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    40/41

    Total number of tractors -11.31 -4.07 7.53

    Value of wood production -0.19 -3.15 1.40

    Average rental value of land (per

    hectare)

    4.10 -3.08 -1.04

    Total volume of commercialized

    wood

    27.31 -8.75 -2.79

    Average volume of wood in

    munipicality

    10.72 11.39 -7.15

    Average purchase price of live beef 6.41 -3.75 -1.39

    General

    Economy

    Share of wages in total expenses 4.09 -8.73 1.26

    Average wage per employed -8.49 -0.91 4.78

    Average wage in industry 1.97 -3.18 0.20

    Share of direct government / Banco

    do Brasil aids

    -13.84 5.56 5.32

    Surface of SUDAM financed projects -7.88 -5.67 5.94

    Value of SUDAM financing -13.59 -3.35 7.78

    Education

    Share of population w. 5 yrs. or more

    of education

    10.51 -12.25 -1.41

    Average number of years ofschooling

    4.45 -9.40 1.57

    Infrastruct

    ure

    Lenght of paved roads -0.08 -7.14 2.81

    Lenght of unpaved roads -9.40 -6.80 7.07

    Natural

    Environment

    Surface in 1997 2.94 -8.93 2.07

    Land carbon content per hectare -4.39 -7.30 5.53

    Share of water masses 11.25 -8.25 -2.87

    Land nitrogen content per hectare -2.81 -7.98 4.95

    Deforested area -8.50 -6.92 2.91

    Land-use

    Surface devoted to agriculture -10.28 -5.12 7.42

    Artificial forest area -0.79 -4.25 2.12

    Natural forests area -6.27 -6.75 5.99

  • 8/14/2019 Loureiro et al,2010pvar effet sptial .pdf

    41/41

    Permanent plowing area 4.22 -5.67 0.09

    Temporary plowing area -10.59 -4.39 7.28

    Artificial pastures area -10.93 -3.07 6.93

    Natural pastures area -13.77 -2.21 8.06

    Resting land area -7.69 -2.46 5.00