Learning-based Reconstruction of GRACE Data Based on Changes in Total Water Storage and Its Accuracy Assessment
Su Yong*, Yang Yi-Fei, Yang Yi-Yu1
1. School of Civil Engineering and Geomatics, Southwest Petroleum University, Chengdu 610500, China
2. School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
3. Hubei Luojia Laboratory, Wuhan 430079, China
4. Xizang Autonomous Region Key Laboratory of Satellite Remote Sensing and Application, Lhasa 851400, China
Abstract:
Since April 2002, the Gravity Recovery and Climate Experiment (GRACE) satellite mission has provided monthly total water storage anomalies (TWSAs) on a global scale. However, these TWSA records are discontinuous because some GRACE observations are missing. This study presents a combined machine learning-based modeling approach that requires no hydrological model data. TWSA time series for 11 major regions worldwide were divided into training and test sets and modeled with an autoregressive integrated moving average (ARIMA) model, a long short-term memory (LSTM) network, and a combined ARIMA–LSTM model. The model predictions were compared with GRACE observations, and model accuracy was evaluated using five metrics: the Nash–Sutcliffe efficiency coefficient (NSE), Pearson correlation coefficient (CC), root mean square error (RMSE), normalized RMSE (NRMSE), and mean absolute percentage error (MAPE). At the basin scale, the mean CC, NSE, and NRMSE of the ARIMA–LSTM model were 0.93, 0.83, and 0.12, respectively. At the grid scale, the spatial distributions and cumulative distribution function curves of the five metrics were compared for the Amazon and Volga River basins. The ARIMA–LSTM model achieved mean CC and NSE values of 0.89 and 0.61 in the Amazon basin and 0.92 and 0.61 in the Volga basin, outperforming the ARIMA model (0.86 and 0.48; 0.88 and 0.46) and the LSTM model (0.80 and 0.41; 0.89 and 0.31). The proportions of grid cells with NSE > 0.50 in the two basins were 63.3% and 80.8% for the ARIMA–LSTM model, compared with 54.3% and 51.3% for the ARIMA model and 53.7% and 43.2% for the LSTM model. The ARIMA–LSTM model thus significantly improved the NSE of the predictions while maintaining high CC values in the reconstructed GRACE data at both the basin and grid scales, which can help fill gaps in time-variable gravity field models.
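For readers who want to reproduce this kind of accuracy assessment, the five metrics named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `evaluate` and `fraction_above` are hypothetical, the NRMSE is assumed to be normalized by the range of the observations (the abstract does not state the normalization), and the MAPE assumes that no observed value is zero.

```python
import math


def evaluate(obs, pred):
    """Return NSE, CC, RMSE, NRMSE and MAPE for paired observed/predicted series."""
    n = len(obs)
    if n != len(pred) or n < 2:
        raise ValueError("obs and pred must have the same length (at least 2)")

    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n

    # Sums of squares shared by several metrics.
    sq_err = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    ss_pred = sum((p - mean_pred) ** 2 for p in pred)
    cross = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))

    rmse = math.sqrt(sq_err / n)
    return {
        # Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean.
        "NSE": 1.0 - sq_err / ss_obs,
        # Pearson correlation coefficient.
        "CC": cross / math.sqrt(ss_obs * ss_pred),
        "RMSE": rmse,
        # Assumption: RMSE normalized by the observed range (max - min).
        "NRMSE": rmse / (max(obs) - min(obs)),
        # Assumption: no observed value equals zero.
        "MAPE": 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / n,
    }


def fraction_above(values, threshold=0.5):
    """Share of grid cells whose metric (e.g. NSE) exceeds the threshold."""
    return sum(v > threshold for v in values) / len(values)


if __name__ == "__main__":
    # Illustrative numbers only; these are not GRACE data.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [2.0, 3.0, 4.0, 5.0]
    print(evaluate(observed, predicted))
    print(fraction_above([0.6, 0.4, 0.7, 0.2]))
```

In the example the prediction is offset from the observations by a constant, so CC stays at 1 while NSE drops well below 1. This shows why the abstract reports both metrics: CC measures agreement in phase and shape, whereas NSE also penalizes bias and amplitude errors.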
Su Yong, Yang Yi-Fei, Yang Yi-Yu. Learning-based Reconstruction of GRACE Data Based on Changes in Total Water Storage and Its Accuracy Assessment[J]. APPLIED GEOPHYSICS, 2025, 22(2): 365-382.