- xtabond2 can fit two closely related dynamic panel data models.
- The first is the Arellano-Bond (1991) estimator, which is also available with xtabond, though without the two-step standard error correction described below. It is sometimes called "difference GMM."
- The second is an augmented version outlined by Arellano and Bover (1995) and fully developed by Blundell and Bond (1998). It is known as "system GMM."
- Roodman (2009) provides a pedagogic introduction to linear GMM, these estimators, and xtabond2.
- The estimators are designed for dynamic "small-T, large-N" panels that may contain fixed effects and - separate from those fixed effects - idiosyncratic errors that are heteroskedastic and correlated within but not across individuals.
Model:
y_it = x_it * b_1 + w_it * b_2 + u_it,   i = 1,...,N;  t = 1,...,T
u_it = v_i + e_it,
where
- v_i are unobserved individual-level effects;
- e_it are the observation-specific errors;
- x_it is a vector of strictly exogenous covariates (ones dependent on neither current nor past e_it);
- w_it is a vector of predetermined covariates (which may include the lag of y) and endogenous covariates, all of which may be correlated with the v_i (predetermined variables are potentially correlated with past errors; endogenous ones are potentially correlated with past and present errors);
- b_1 and b_2 are vectors of parameters to be estimated;
- and E[v_i]=E[e_it]=E[v_i*e_it]=0, and E[e_it*e_js]=0 for each i, j, t, s, i<>j.
- First-differencing the equation removes the v_i, thus eliminating a potential source of omitted variable bias in estimation.
- However, differencing variables that are predetermined but not strictly exogenous makes them endogenous since the w_it in some D.w_it = w_it - w_i,t-1 is correlated with the e_i,t-1 in D.e_it.
- Following Holtz-Eakin, Newey, and Rosen (1988), Arellano and Bond (1991) develop a Generalized Method of Moments estimator that instruments the differenced variables that are not strictly exogenous with all their available lags in levels.
- A problem with the original Arellano-Bond estimator is that lagged levels are poor instruments for first differences if the variables are close to a random walk.
- Arellano and Bover (1995) describe how, if the original equation in levels is added to the system, additional instruments can be brought to bear to increase efficiency. In this equation, variables in levels are instrumented with suitable lags of their own first differences.
- The assumption needed is that these differences are uncorrelated with the unobserved individual-level effects v_i. Blundell and Bond (1998) show that this assumption in turn depends on a more precise one about initial conditions.
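To make "all their available lags in levels" concrete, here is a minimal sketch of the instrument layout for the transformed equation in difference GMM. It assumes a balanced panel with T periods, the lagged dependent variable as the only instrumented regressor, and lags 2 and deeper of y as instruments. The function name diff_gmm_instruments and its first_lag parameter are made up for illustration and are not part of xtabond2.

```python
def diff_gmm_instruments(T, first_lag=2):
    """Map each period t of the differenced equation to the periods of the
    lagged levels of y that are usable as instruments in that period.

    Assumption: balanced panel, periods numbered 1..T, and the lagged
    dependent variable is the only instrumented regressor.
    """
    layout = {}
    # D.y_it regressed on D.y_i,t-1 needs y_i,t-2, so t runs from 3 to T.
    for t in range(3, T + 1):
        # y_i1, ..., y_i,t-first_lag are available in period t.
        layout[t] = list(range(1, t - first_lag + 1))
    return layout


layout = diff_gmm_instruments(T=5)
for t, lags in layout.items():
    print(f"t={t}: instruments are y at periods {lags}")

# One instrument column per (period, lag) pair.
total = sum(len(lags) for lags in layout.values())
print("instrument count:", total)
```

Under these assumptions the count is (T-1)(T-2)/2, so it grows quadratically in T. This is one reason the over-identification tests discussed further down can be weakened by instrument proliferation.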
- The Mata version also includes the option to use the forward orthogonal deviations transform instead of first differencing. Proposed by Arellano and Bover (1995), the orthogonal deviations transform subtracts the average of all available future observations rather than the previous observation. The result is then multiplied by a scale factor chosen to yield the nice but relatively unimportant property that if the original e_it are i.i.d., then so are the transformed ones (see Arellano and Bover (1995) and Roodman (2009)).
- Like differencing, taking orthogonal deviations removes fixed effects. Because lagged observations of a variable do not enter the formula for the transformation, they remain orthogonal to the transformed errors (assuming no serial correlation), and available as instruments.
- In fact, for consistency, the software stores the orthogonal deviation of an observation one period late, so that, as with differencing, observations for period 1 are missing and, for an instrumenting variable w, w_i,t-1 enters the formula for the transformed observation stored at i,t. With this move, exactly the same lags of variables are valid as instruments under the two transformations.
- On balanced panels, GMM estimators based on the two transforms return numerically identical coefficient estimates, holding the instrument set fixed (Arellano and Bover 1995). But orthogonal deviations has the virtue of preserving sample size in panels with gaps. If some e_it is missing, for example, neither D.e_it nor D.e_i,t+1 can be computed. But the orthogonal deviation can be computed for every complete observation except the last for each individual. (First differencing can do no better since it must drop the first observation for each individual.) Note that "difference GMM" is still called that even when orthogonal deviations are used. We will refer to the equation in differences or orthogonal deviations as the transformed equation. In system GMM with orthogonal deviations, the levels or untransformed equation is still instrumented with differences as described above.
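The sample-size point can be checked with a small sketch for a single individual whose third observation is missing. The scale factor sqrt(n / (n + 1)), where n is the number of available future observations, follows Roodman (2009). The data values, the fixed effect v_i = 5.0, and the function names are invented for illustration. Results are indexed here by the original period, whereas the text says xtabond2 stores them one period later.

```python
import math


def first_difference(x):
    """First differences of one individual's series; None marks a gap."""
    out = [None]  # no difference exists for the first period
    for t in range(1, len(x)):
        if x[t] is None or x[t - 1] is None:
            out.append(None)
        else:
            out.append(x[t] - x[t - 1])
    return out


def forward_orthogonal_deviations(x):
    """Subtract the mean of all available future observations, then rescale."""
    out = []
    for t, value in enumerate(x):
        future = [v for v in x[t + 1:] if v is not None]
        if value is None or not future:
            out.append(None)  # gap, or no later observation to average
        else:
            n = len(future)
            out.append(math.sqrt(n / (n + 1)) * (value - sum(future) / n))
    return out


e = [0.3, -1.2, None, 0.8, 0.1]  # hypothetical errors, period 3 missing
v_i = 5.0                        # hypothetical fixed effect
u = [None if x is None else x + v_i for x in e]

fd = first_difference(u)
fod = forward_orthogonal_deviations(u)
print("usable first differences:     ", sum(d is not None for d in fd))
print("usable orthogonal deviations: ", sum(d is not None for d in fod))
```

Applying forward_orthogonal_deviations to e instead of u gives the same values up to floating-point error, which illustrates that the transform removes the fixed effect just as differencing does.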
- xtabond2 reports the Arellano-Bond test for autocorrelation, which is applied to the differenced residuals in order to purge the unobserved and perfectly autocorrelated v_i. AR(1) is expected in first differences, because D.e_i,t = e_i,t - e_i,t-1 should correlate with D.e_i,t-1 = e_i,t-1 - e_i,t-2 since they share the e_i,t-1 term. So to check for AR(1) in levels, look for AR(2) in differences, on the idea that this will detect the relationship between the e_i,t-1 in D.e_i,t and the e_i,t-2 in D.e_i,t-2. This reasoning does not work for orthogonal deviations, in which the residuals for an individual are all mathematically interrelated, thus contaminated from the point of view of detecting AR in the e_it. So the test is run on differenced residuals even after estimation in deviations. Autocorrelation indicates that lags of the dependent variable (and any other variables used as instruments that are not strictly exogenous) are in fact endogenous, thus bad instruments. For example, if there is AR(s), then y_i,t-s would be correlated with e_i,t-s, which would be correlated with D.e_i,t-s, which would be correlated with D.e_i,t.
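The intuition behind "expect AR(1), look for AR(2)" can be illustrated with simulated errors. This is only a sketch of the reasoning, not the Arellano-Bond test statistic itself: it assumes a single long series of i.i.d. standard normal errors and uses plain sample autocorrelations. The helper autocorr is written for this example.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation of the series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return cov / var


random.seed(0)
# Errors in levels with no serial correlation (the assumption being tested).
e = [random.gauss(0.0, 1.0) for _ in range(20000)]
# First differences D.e_t = e_t - e_{t-1}.
de = [e[t] - e[t - 1] for t in range(1, len(e))]

r1 = autocorr(de, 1)  # D.e_t and D.e_{t-1} share e_{t-1}
r2 = autocorr(de, 2)  # D.e_t and D.e_{t-2} share no term
print("lag-1 autocorrelation of D.e:", round(r1, 2))
print("lag-2 autocorrelation of D.e:", round(r2, 2))
```

With uncorrelated errors in levels, the lag-1 autocorrelation of the differences should be close to -0.5 (covariance -sigma^2 over variance 2*sigma^2) and the lag-2 autocorrelation close to zero. A clearly nonzero lag-2 value would point to first-order serial correlation in the e_it themselves.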
- xtabond2 also reports tests of over-identifying restrictions--of whether the instruments, as a group, appear exogenous.
- For one-step, non-robust estimation, it reports the Sargan statistic, which is the minimized value of the one-step GMM criterion function. The Sargan statistic is not robust to heteroskedasticity or autocorrelation.
- So for one-step, robust estimation (and for all two-step estimation), xtabond2 also reports the Hansen J statistic, which is the minimized value of the two-step GMM criterion function, and is robust.
- xtabond2 still reports the Sargan statistic in these cases because the J test has its own problem: it can be greatly weakened by instrument proliferation.
- The Mata version goes further, reporting difference-in-Sargan statistics (really, difference-in-Hansen statistics, except in one-step robust estimation), which test for whether subsets of instruments are valid.
- To be precise, it reports one test for each group of instruments defined by an ivstyle() or gmmstyle() option (explained below). So replacing gmmstyle(x y) in a command line with gmmstyle(x) gmmstyle(y) will yield the same estimate but distinct difference-in-Sargan/Hansen tests.
- In addition, including the split suboption in a gmmstyle() option in system GMM splits an instrument group in two for difference-in-Sargan/Hansen purposes, one each for the transformed and levels equations. This is especially useful for testing the instruments for the levels equation based on lagged differences of the dependent variable, which are the most suspect in system GMM and the subject of the "initial conditions" in the title of Blundell and Bond (1998).
- In the same vein, in system GMM, xtabond2 also tests all the GMM-type instruments for the levels equation as a group. All of these tests, however, are weak when the instrument count is high.
- Difference-in-Sargan/Hansen tests are computationally intensive since they involve re-estimating the model for each test; the nodiffsargan option is available to prevent them.
- As linear GMM estimators, the Arellano-Bond and Blundell-Bond estimators have one- and two-step variants. But though two-step is asymptotically more efficient, the reported two-step standard errors tend to be severely downward biased (Arellano and Bond 1991; Blundell and Bond 1998). To compensate, xtabond2 makes available a finite-sample correction to the two-step covariance matrix derived by Windmeijer (2005). This can make two-step robust estimations more efficient than one-step robust, especially for system GMM.
- The syntax of xtabond2 differs substantially from that of xtabond and xtdpdsys. xtabond2 almost completely decouples specification of regressors from specification of instruments. As a result, most variables used will appear twice in an xtabond2 command line.
- xtabond2 requires the initial varlist of the command line to include all regressors except for the optional constant term, be they strictly exogenous, predetermined, or endogenous. Variables used to form instruments then appear in gmmstyle() or ivstyle() options after the comma. The result is a loss of parsimony, but fuller control over the instrument matrix. Variables can be used as the basis for "GMM-style" instrument sets without being included as regressors, or vice versa.
Source:
- Roodman, D. 2009. How to do xtabond2: An introduction to difference and system GMM in Stata. Stata Journal 9(1): 86-136.