Multi-source heterogeneous data fusion technology for natural resource information extraction: A case study of NDVI data in the Hanjiang Basin
Abstract: Natural resource indicator data with high spatio-temporal resolution are essential for observing large-scale natural resource dynamics and assessing their trends. The massive multi-source data available in the big data era make efficient data fusion and use possible. Taking the reconstruction of Normalized Difference Vegetation Index (NDVI) data for the Hanjiang Basin as an example, this paper builds an underlying architecture on PostgreSQL for processing natural resource spatio-temporal big data, integrates data-level, feature-level, and decision-level fusion methods, and uses machine learning algorithms to construct an intelligent fusion technique for multi-source heterogeneous data aimed at natural resource information extraction. The technique achieves efficient use of multi-source data and optimized selection of the feature space. In addition, an annual 1 km NDVI dataset of the Hanjiang Basin from 2000 to 2019 is reconstructed, which comprehensively reflects vegetation dynamics in the basin. The results can serve as a scientific reference for the efficient extraction and simulation analysis of spatio-temporal big data in the earth sciences, and they offer a more accurate and convenient technical means for quantitatively accounting for forest and grassland resource endowments and for exploring the spatio-temporal evolution of ecosystems.
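For readers unfamiliar with the index named in the abstract, NDVI is conventionally computed per pixel from near-infrared (NIR) and red reflectance as (NIR - Red) / (NIR + Red). The following is a minimal sketch of that standard definition only, not the fusion workflow described in the paper. The function names `ndvi` and `ndvi_grid` and the reflectance values are hypothetical, chosen purely for illustration.

```python
from typing import List, Optional


def ndvi(nir: float, red: float) -> Optional[float]:
    """Return NDVI = (NIR - Red) / (NIR + Red), or None when it is undefined."""
    denom = nir + red
    if denom == 0:
        # Both bands are zero, so the ratio is undefined for this pixel.
        return None
    return (nir - red) / denom


def ndvi_grid(
    nir_band: List[List[float]], red_band: List[List[float]]
) -> List[List[Optional[float]]]:
    """Apply ndvi() pixel by pixel to two co-registered bands of equal shape."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2 x 2 reflectance grids (illustrative values, not real data).
nir_band = [[0.50, 0.40], [0.30, 0.0]]
red_band = [[0.10, 0.20], [0.30, 0.0]]

result = ndvi_grid(nir_band, red_band)
for row in result:
    print(row)
```

Values near 1 indicate dense green vegetation, values near 0 indicate bare or sparsely vegetated surfaces, and the sketch returns `None` for a pixel with no signal in either band rather than raising a division error.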