New Modified Estimators for the Spatial Lag Model with Randomly Missing Data in Dependent Variable: Methods and Simulation Study
Abstract
Accurately estimating the spatial lag model (SLM) when the dependent variable has randomly missing observations is a significant challenge. We introduce modifications to the two-stage least squares with imputation (I2SLS) estimator previously proposed by Izaguirre [1] and Wang and Lee [2]. Our key contributions are (1) introducing the generalized nonlinear least squares (GNLS) estimator as an alternative imputation method to the nonlinear least squares (NLS) approach used in the literature, (2) incorporating additional instrument matrices (IM), and (3) implementing both partial and total imputation for all modified estimators. Through a Monte Carlo simulation (MCS) study, we evaluate the performance of these estimators across scenarios that vary the sample size, the density of the spatial weights matrix, and the missingness rate. Results are compared in terms of coefficient bias and root mean squared error (RMSE) for both the parameters and the model fit. All estimators perform relatively well in terms of coefficient bias and RMSE, but our modified estimators achieve slightly lower overall RMSE than those previously documented in the literature. Total and partial imputation tend to produce similar results, although partial imputation performed better in certain scenarios. The estimators also proved robust, remaining reliable across varying levels of spatial connectivity, whereas higher missing-data rates led to slightly increased bias and RMSE.
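To make the evaluation criteria concrete, the following is a minimal sketch of how coefficient bias and RMSE can be computed in a Monte Carlo study of the SLM. It is not the I2SLS or GNLS-based estimators discussed above: it uses the standard complete-data 2SLS estimator with instruments [X, WX, W²X] and no missing observations. The ring-shaped weights matrix, the parameter values, and the function names `ring_weights`, `two_sls_slm`, and `monte_carlo` are all assumptions made for illustration.

```python
import numpy as np


def ring_weights(n, k=2):
    """Row-normalized weights on a ring: k neighbors on each side (hypothetical design)."""
    W = np.zeros((n, n))
    for i in range(n):
        for d in range(1, k + 1):
            W[i, (i + d) % n] = 1.0
            W[i, (i - d) % n] = 1.0
    return W / W.sum(axis=1, keepdims=True)


def two_sls_slm(y, X, W):
    """Complete-data 2SLS for y = rho * W y + X beta + eps.

    Instruments are [X, WX, W^2 X]; returns the stacked vector (rho, beta).
    """
    Z = np.column_stack([W @ y, X])       # regressors: spatial lag first, then X
    WX = W @ X
    H = np.column_stack([X, WX, W @ WX])  # instrument matrix
    # First stage: project the regressors onto the instrument space.
    Z_hat = H @ np.linalg.lstsq(H, Z, rcond=None)[0]
    # Second stage: regress y on the projected regressors.
    return np.linalg.lstsq(Z_hat, y, rcond=None)[0]


def monte_carlo(n=200, reps=300, rho=0.4, beta=(1.0, -0.5), seed=0):
    """Return (bias, rmse) of the 2SLS estimates of (rho, beta) over `reps` draws."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    theta = np.concatenate([[rho], beta])          # true parameter vector
    W = ring_weights(n)
    A_inv = np.linalg.inv(np.eye(n) - rho * W)     # reduced form: y = A^{-1}(X beta + eps)
    est = np.empty((reps, theta.size))
    for r in range(reps):
        X = rng.normal(size=(n, beta.size))        # no constant column, to avoid collinear instruments
        eps = rng.normal(size=n)
        y = A_inv @ (X @ beta + eps)
        est[r] = two_sls_slm(y, X, W)
    bias = est.mean(axis=0) - theta                    # mean estimate minus truth
    rmse = np.sqrt(((est - theta) ** 2).mean(axis=0))  # root mean squared error
    return bias, rmse
```

Calling `monte_carlo()` returns two length-3 arrays ordered as (rho, beta_1, beta_2). A study of the kind summarized above would additionally delete a random share of `y` in each replication, apply an imputation-based estimator, and repeat this over several values of `n`, weights matrix density, and missingness rate.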
References
- A. Izaguirre, Estimation of Spatial Lag Model Under Random Missing Data in the Dependent Variable. Two Stage Estimator with Imputation, Economia 44 (2021), 1–19. https://doi.org/10.18800/economia.202101.001.
- W. Wang, L. Lee, Estimation of Spatial Autoregressive Models with Randomly Missing Data in the Dependent Variable, Econom. J. 16 (2013), 73–102. https://doi.org/10.1111/j.1368-423X.2012.00388.x.
- P. Krugman, Geography and Trade, MIT Press, Cambridge, 1991.
- M. Fujita, P. Krugman, A. Venables, The Spatial Economy: Cities, Regions, and International Trade, MIT Press, Cambridge, 1999.
- F.J. Boehmke, E.U. Schilling, J.C. Hays, Missing Data in Spatial Regression, in: Midwest Political Science Association Conference, 16–19 April 2015.
- T. Yokoi, Spatial Lag Dependence in the Presence of Missing Observations, Ann. Reg. Sci. 60 (2018), 25–40. https://doi.org/10.1007/s00168-015-0737-2.
- L. Lee, GMM and 2SLS Estimation of Mixed Regressive, Spatial Autoregressive Models, J. Econom. 137 (2007), 489–514. https://doi.org/10.1016/j.jeconom.2005.10.004.
- T. Rüttenauer, Spatial Regression Models: A Systematic Comparison of Different Model Specifications Using Monte Carlo Experiments, Sociol. Methods Res. 51 (2022), 728–759. https://doi.org/10.1177/0049124119882467.
- H.H. Kelejian, I.R. Prucha, A Generalized Spatial Two-Stage Least Squares Procedure for Estimating a Spatial Autoregressive Model with Autoregressive Disturbances, J. Real Estate Finance Econ. 17 (1998), 99–121. https://doi.org/10.1023/A:1007707430416.
- H.H. Kelejian, I.R. Prucha, A Generalized Moments Estimator for the Autoregressive Parameter in a Spatial Model, Int. Econ. Rev. 40 (1999), 509–533. https://doi.org/10.1111/1468-2354.00027.
- L.-F. Lee, Asymptotic Distributions of Quasi-Maximum Likelihood Estimators for Spatial Autoregressive Models, Econometrica 72 (2004), 1899–1925. https://doi.org/10.1111/j.1468-0262.2004.00558.x.
- I. Mattsson, J. Lyhagen, Modeling Spatial Regimes With Smooth Transitions, Int. Reg. Sci. Rev. 48 (2025), 38–61. https://doi.org/10.1177/01600176241237180.
- L. Anselin, Lagrange Multiplier Test Diagnostics for Spatial Dependence and Spatial Heterogeneity, Geogr. Anal. 20 (1988), 1–17. https://doi.org/10.1111/j.1538-4632.1988.tb00159.x.
- L. Anselin, J. Le Gallo, H. Jayet, Spatial Panel Econometrics, in: L. Mátyás, P. Sevestre (Eds.), The Econometrics of Panel Data, Springer, Berlin, Heidelberg, 2008: pp. 625–660. https://doi.org/10.1007/978-3-540-75892-1_19.
- K. Ord, Estimation Methods for Models of Spatial Interaction, J. Am. Stat. Assoc. 70 (1975), 120–126. https://doi.org/10.1080/01621459.1975.10480272.
- O. Smirnov, L. Anselin, Fast Maximum Likelihood Estimation of Very Large Spatial Autoregressive Models: A Characteristic Polynomial Approach, Comput. Stat. Data Anal. 35 (2001), 301–319. https://doi.org/10.1016/S0167-9473(00)00018-9.
- L. Lee, Best Spatial Two‐Stage Least Squares Estimators for a Spatial Autoregressive Model with Autoregressive Disturbances, Econom. Rev. 22 (2003), 307–335. https://doi.org/10.1081/ETC-120025891.
- A.P. Dempster, N.M. Laird, D.B. Rubin, Maximum Likelihood from Incomplete Data Via the EM Algorithm, J. R. Stat. Soc. Ser. B Stat. Methodol. 39 (1977), 1–22. https://doi.org/10.1111/j.2517-6161.1977.tb01600.x.
- R.J. Little, D.B. Rubin, Statistical Analysis with Missing Data, Wiley, New York, 2002.
- A.S.A. Yaseen, Fractional Imputation Methods for Longitudinal Data Analysis, Thesis, Faculty of Economics and Political Science, Cairo University, Egypt, 2014.
- J.P. LeSage, R.K. Pace, Models for Spatially Dependent Missing Data, J. Real Estate Finance Econ. 29 (2004), 233–254. https://doi.org/10.1023/B:REAL.0000035312.82241.e4.
- W. Polasek, C. Llano, R. Sellner, Bayesian Methods for Completing Data in Spatial Models, Rev. Econ. Anal. 2 (2010), 194–214. https://doi.org/10.15353/rea.v2i2.1472.
- T. Suesse, A. Zammit-Mangion, Computational Aspects of the EM Algorithm for Spatial Econometric Models with Missing Data, J. Stat. Comput. Simul. 87 (2017), 1767–1786. https://doi.org/10.1080/00949655.2017.1286495.
- P. Amitha, V.S. Binu, B. Seena, Estimation of Missing Values in Aggregate Level Spatial Data, Clin. Epidemiol. Glob. Health 9 (2021), 304–309. https://doi.org/10.1016/j.cegh.2020.10.003.
- H. Seya, M. Tomari, S. Uno, Parameter Estimation in Spatial Econometric Models with Non-Random Missing Data, Appl. Econ. Lett. 28 (2021), 440–446. https://doi.org/10.1080/13504851.2020.1758618.
- J. Teng, S. Ding, X. Shi, H. Zhang, X. Hu, MCMCINLA Estimation of Missing Data and Its Application to Public Health Development in China in the Post-Epidemic Era, Entropy 24 (2022), 916. https://doi.org/10.3390/e24070916.
- H.H. Kelejian, I.R. Prucha, Y. Yuzefovich, Instrumental Variable Estimation of a Spatial Autoregressive Model with Autoregressive Disturbances: Large and Small Sample Results, in: Advances in Econometrics, Emerald, Bingley, 2004: pp. 163–198. https://doi.org/10.1016/S0731-9053(04)18005-5.
- M.M. Abdelwahab, O.A. Shalaby, H.E. Semary, M.R. Abonazel, Driving Factors of NOx Emissions in China: Insights from Spatial Regression Analysis, Atmosphere 15 (2024), 793. https://doi.org/10.3390/atmos15070793.
- Y. Song, A. Cibin, Optimizing Spatial Weight Matrices in Spatial Econometrics: A Graph-Theoretic Approach Based on Shortest Path Algorithms: A New York City Application of Crime and Economic Indicators, Int. Rev. Spat. Plan. Sustain. Dev. 12 (2024), 181–200. https://doi.org/10.14246/irspsd.12.2_181.
- J. Dubé, D. Legros, Spatial Econometrics Using Micro-Data, Wiley, 2014.
- A. Saguatti, Modeling the Spatial Dynamics of Economic Models, Thesis, Alma Mater Studiorum Università di Bologna, 2013. https://doi.org/10.6092/UNIBO/AMSDOTTORATO/5978.
- T.E. Smith, Estimation Bias in Spatial Models with Strongly Connected Weight Matrices, Geogr. Anal. 41 (2009), 307–332. https://doi.org/10.1111/j.1538-4632.2009.00758.x.
- S. Farber, A. Páez, E. Volz, Topology, Dependency Tests and Estimation Bias in Network Autoregressive Models, in: A. Páez, J. Le Gallo, R.N. Buliung, S. Dall'erba (Eds.), Progress in Spatial Analysis, Springer, Berlin, Heidelberg, 2010: pp. 29–57. https://doi.org/10.1007/978-3-642-03326-1_3.
- M.R. Abonazel, A.H. Youssef, E.G. Ahmed, On Robust M, S and MM Estimations for the Poisson Fixed Effects Panel Model with Outliers: Simulation and Applications, J. Stat. Comput. Simul. (2025). https://doi.org/10.1080/00949655.2024.2449102.
- A.A. El-Sheikh, M.C. Ali, M.R. Abonazel, Development of Two Methods for Estimating High-Dimensional Data in the Case of Multicollinearity and Outliers, Int. J. Anal. Appl. 22 (2024), 187. https://doi.org/10.28924/2291-8639-22-2024-187.
- A.R. Azazy, M.R. Abonazel, A.M. Shafik, et al., A Proposed Robust Regression Model to Study Carbon Dioxide Emissions in Egypt, Commun. Math. Biol. Neurosci. 2024 (2024), 86. https://doi.org/10.28919/cmbn/8673.
- A.H. Youssef, M.R. Abonazel, E.G. Ahmed, Robust M Estimation for Poisson Panel Data Model with Fixed Effects: Method, Algorithm, Simulation, and Application, Stat. Optim. Inf. Comput. 12 (2024), 1292–1305. https://doi.org/10.19139/soic-2310-5070-1996.
- E.G. Ahmed, M.R. Abonazel, M.N. Al-Ghamdi, et al., Proposed Robust Estimators for the Poisson Panel Regression Model: Application to COVID-19 Deaths in Europe, Commun. Math. Biol. Neurosci. 2024 (2024), 121. https://doi.org/10.28919/cmbn/8795.