1. Introduction
Under today's increasingly severe climate change, hydrological coverage has become a topic that more and more researchers pay attention to. Hydrological coverage refers to the area of the land or vegetation surface that is covered by water, including rivers, depressions, lakes, reservoirs, and so on. It reflects how water resources are distributed and how water bodies are used in a region, and it matters a great deal for water resource management and environmental protection.
GRACE hydrological data include variables such as surface water storage (SWS), soil water storage (SSS), and total water storage (TWS). The data are typically compiled and merged at a monthly time step and delivered as a grid covering every region.
Here we will explore GRACE data using R and its related packages. Specifically, we will use the ncdf4 package to read the GRACE .nc file, then preprocess and visualize the data.
2. Data Acquisition
2.1 Installing the package and loading the data
# install.packages("ncdf4")
library(ncdf4)

ncdata <- nc_open("data.nc")
ncdata
2.2 Interpreting the basic file information
File D:\log\data\CSR_GRACE_GRACE-FO_RL06_Mascons_all-corrections_v02.nc (NC_FORMAT_NETCDF4):

     2 variables (excluding dimension variables):
        float time_bounds[timebound,time]   (Contiguous storage)
        float lwe_thickness[lon,lat,time]   (Contiguous storage)
            grid_mapping: WGS84
            coordinates: time lat lon
            standard_name: Liquid_Water_Equivalent_Thickness
            long_name: Liquid_Water_Equivalent_Thickness
            Units: cm

     4 dimensions:
        timebound  Size:2   (no dimvar)
        time  Size:216
            bounds: time_bounds
            calendar: gregorian
            axis: T
            standard_name: Time
            long_name: Time
            Units: days since 2002-01-01T00:00:00Z
        lon  Size:1440
            bounds: lon_bounds
            valid_max: 359.875
            valid_min: 0.125
            axis: X
            standard_name: Longitude
            long_name: Longitude
            Units: degrees_east
        lat  Size:720
            bounds: lat_bounds
            valid_max: 89.875
            valid_min: -89.875
            axis: Y
            standard_name: Latitude
            long_name: Latitude
            Units: degrees_north

    58 global attributes:
        Conventions: CF-1.6, ACDD-1.3, ISO 8601
        filename: netcdf/CSR_GRACE_GRACE-FO_RL06_Mascons_all-corrections_v02.nc
        Metadata_Conventions: Unidata Dataset Discovery v1.0
        ......
The file information shows three coordinate variables — time, lon (longitude), and lat (latitude) — plus the data variable lwe_thickness (liquid water equivalent thickness).
2.3 Extracting the basic data
lon <- ncvar_get(ncdata, 'lon')    # longitude
nrow(lon)
lat <- ncvar_get(ncdata, 'lat')    # latitude
nrow(lat)
time <- ncvar_get(ncdata, 'time')  # time
time
lwe_thickness <- ncvar_get(ncdata, 'lwe_thickness')  # liquid water equivalent thickness
class(lwe_thickness)               # check the data type
Output:
# longitude
[1] 1440
# latitude
[1] 720
# time
  [1]  107.0  129.5  227.5  258.0  288.5  319.0  349.5  380.5  410.0  439.5
 [11]  470.0  495.5  561.5  592.5  623.0  653.5  684.0  714.5  736.5  777.0
 [21]  805.5  836.0  866.5  897.0  927.5  958.5  989.0 1019.5 1050.0 1080.5
 [31] 1111.5 1141.0 1170.5 1201.0 1231.5 1262.0 1292.5 1323.5 1354.0 1384.5
 [41] 1415.0 1445.5 1476.5 1506.0 1535.5 1566.0 1596.5 1627.0 1657.5 1688.5
 [51] 1719.0 1749.5 1780.0 1810.5 1841.5 1870.5 1900.5 1931.0 1961.5 1992.0
 [61] 2022.5 2053.5 2084.0 2114.5 2145.0 2175.5 2206.5 2236.5 2266.5 2297.0
 [71] 2327.5 2358.0 2388.5 2419.5 2450.0 2480.5 2511.0 2541.5 2572.5 2602.0
 [81] 2631.5 2662.0 2692.5 2723.0 2753.5 2784.5 2815.0 2845.5 2876.0 2906.5
 [91] 2937.5 2967.0 2996.5 3027.0 3057.5 3088.0 3118.5 3149.5 3180.0 3210.5
[101] 3241.0 3269.5 3335.5 3361.5 3392.0 3422.5 3485.5 3514.5 3545.0 3575.5
[111] 3590.5 3648.0 3667.5 3697.5 3727.5 3746.0 3819.0 3849.5 3880.5 3908.5
[121] 3974.5 4002.5 4033.5 4062.0 4128.0 4153.5 4184.0 4214.5 4306.5 4337.0
[131] 4367.5 4391.0 4458.5 4488.0 4518.5 4546.0 4610.5 4641.0 4671.5 4702.0
[141] 4769.5 4793.0 4822.5 4853.0 4864.0 4943.5 4975.5 5004.5 5104.0 5128.5
[151] 5157.0 5188.5 5253.0 5280.0 5309.5 5346.5 5444.5 5471.5 5499.0 5568.5
[161] 5592.5 5611.0 5639.5 6010.0 6034.0 6147.5 6163.0 6193.5 6224.5 6254.0
[171] 6283.5 6314.0 6344.5 6375.0 6405.5 6436.5 6467.0 6497.5 6528.0 6558.5
[181] 6589.5 6619.5 6649.5 6680.0 6710.5 6741.0 6771.5 6802.5 6833.0 6863.5
[191] 6894.0 6924.5 6955.5 6985.0 7014.5 7045.0 7075.5 7106.0 7136.5 7167.5
[201] 7198.0 7228.5 7259.0 7289.5 7320.5 7350.0 7379.5 7410.0 7440.5 7471.0
[211] 7501.5 7532.5 7563.0 7593.5 7624.0 7654.5
# liquid water equivalent thickness
[1] "array"
2.4 Data preprocessing
- Converting the date format
# The survey record starts on 2002-01-01; each time value is the number of days since that date
time <- as.Date("2002-01-01") + time
time
Output:
[1] "2002-04-18" "2002-05-10" "2002-08-16" "2002-09-16" "2002-10-16" [6] "2002-11-16" "2002-12-16" "2003-01-16" "2003-02-15" "2003-03-16" [11] "2003-04-16" "2003-05-11" "2003-07-16" "2003-08-16" "2003-09-16" [16] "2003-10-16" "2003-11-16" "2003-12-16" "2004-01-07" "2004-02-17" [21] "2004-03-16" "2004-04-16" "2004-05-16" "2004-06-16" "2004-07-16" [26] "2004-08-16" "2004-09-16" "2004-10-16" "2004-11-16" "2004-12-16" [31] "2005-01-16" "2005-02-15" "2005-03-16" "2005-04-16" "2005-05-16" [36] "2005-06-16" "2005-07-16" "2005-08-16" "2005-09-16" "2005-10-16" [41] "2005-11-16" "2005-12-16" "2006-01-16" "2006-02-15" "2006-03-16" [46] "2006-04-16" "2006-05-16" "2006-06-16" "2006-07-16" "2006-08-16" [51] "2006-09-16" "2006-10-16" "2006-11-16" "2006-12-16" "2007-01-16" [56] "2007-02-14" "2007-03-16" "2007-04-16" "2007-05-16" "2007-06-16" [61] "2007-07-16" "2007-08-16" "2007-09-16" "2007-10-16" "2007-11-16" [66] "2007-12-16" "2008-01-16" "2008-02-15" "2008-03-16" "2008-04-16" [71] "2008-05-16" "2008-06-16" "2008-07-16" "2008-08-16" "2008-09-16" [76] "2008-10-16" "2008-11-16" "2008-12-16" "2009-01-16" "2009-02-15" [81] "2009-03-16" "2009-04-16" "2009-05-16" "2009-06-16" "2009-07-16" [86] "2009-08-16" "2009-09-16" "2009-10-16" "2009-11-16" "2009-12-16" [91] "2010-01-16" "2010-02-15" "2010-03-16" "2010-04-16" "2010-05-16" [96] "2010-06-16" "2010-07-16" "2010-08-16" "2010-09-16" "2010-10-16" [101] "2010-11-16" "2010-12-14" "2011-02-18" "2011-03-16" "2011-04-16" [106] "2011-05-16" "2011-07-18" "2011-08-16" "2011-09-16" "2011-10-16" [111] "2011-10-31" "2011-12-28" "2012-01-16" "2012-02-15" "2012-03-16" [116] "2012-04-04" "2012-06-16" "2012-07-16" "2012-08-16" "2012-09-13" [121] "2012-11-18" "2012-12-16" "2013-01-16" "2013-02-14" "2013-04-21" [126] "2013-05-16" "2013-06-16" "2013-07-16" "2013-10-16" "2013-11-16" [131] "2013-12-16" "2014-01-09" "2014-03-17" "2014-04-16" "2014-05-16" [136] "2014-06-13" "2014-08-16" "2014-09-16" "2014-10-16" "2014-11-16" [141] "2015-01-22" "2015-02-15" "2015-03-16" "2015-04-16" "2015-04-27" [146] "2015-07-15" "2015-08-16" "2015-09-14" "2015-12-23" "2016-01-16" [151] "2016-02-14" "2016-03-16" "2016-05-20" "2016-06-16" "2016-07-15" [156] "2016-08-21" "2016-11-27" "2016-12-24" "2017-01-21" "2017-03-31" [161] "2017-04-24" "2017-05-13" "2017-06-10" "2018-06-16" "2018-07-10" [166] "2018-10-31" "2018-11-16" "2018-12-16" "2019-01-16" "2019-02-15" [171] "2019-03-16" "2019-04-16" "2019-05-16" "2019-06-16" "2019-07-16" [176] "2019-08-16" "2019-09-16" "2019-10-16" "2019-11-16" "2019-12-16" [181] "2020-01-16" "2020-02-15" "2020-03-16" "2020-04-16" "2020-05-16" [186] "2020-06-16" "2020-07-16" "2020-08-16" "2020-09-16" "2020-10-16" [191] "2020-11-16" "2020-12-16" "2021-01-16" "2021-02-15" "2021-03-16" [196] "2021-04-16" "2021-05-16" "2021-06-16" "2021-07-16" "2021-08-16" [201] "2021-09-16" "2021-10-16" "2021-11-16" "2021-12-16" "2022-01-16" [206] "2022-02-15" "2022-03-16" "2022-04-16" "2022-05-16" "2022-06-16" [211] "2022-07-16" "2022-08-16" "2022-09-16" "2022-10-16" "2022-11-16" [216] "2022-12-16"
Now the time column is in a form we can actually read. Look closely: the survey is recorded roughly once a month, but quite a few months are missing. In the next post we will look at how to fill in these missing survey epochs — you are welcome to follow along and join the discussion.
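As a rough check of those gaps — a minimal sketch, assuming the record should nominally cover every month from April 2002 to December 2022 (the first and last epochs above) — we can compare the observed year-months with a complete monthly sequence:

# build the full list of year-months the record should nominally contain
all_months <- format(seq(as.Date("2002-04-01"), as.Date("2022-12-01"), by = "month"), "%Y-%m")
# the year-months actually present in the GRACE/GRACE-FO time axis
obs_months <- format(time, "%Y-%m")
# year-months with no solution at all
setdiff(all_months, obs_months)
length(setdiff(all_months, obs_months))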
- Understanding lwe_thickness
lwe_thickness[lon,lat,time]
The values are floats: for each time step, every (longitude, latitude) grid cell carries one liquid water equivalent thickness value. In other words, each slice of the array is a lon x lat matrix describing the water coverage thickness over the whole grid at a single point in time. With that understood, the next step is to convert the gridded data into a data frame.
# check the dimensions of the array
dim(lwe_thickness)
Output:
[1] 1440 720 216
These match the lengths of lon, lat, and time exactly.
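To make the index correspondence concrete, here is a small sketch (the two indices are arbitrary, picked only for illustration) that pulls the complete time series for a single grid cell:

# the time series of lwe_thickness for one arbitrarily chosen grid cell
i <- 100   # longitude index, corresponding to lon[i]
j <- 400   # latitude index, corresponding to lat[j]
cell_series <- lwe_thickness[i, j, ]   # one value per time step
length(cell_series)                    # 216, matching length(time)
head(data.frame(time = time, lwe_thickness = cell_series))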
- Flattening the data
One way to flatten the array is to build a data frame row by row (shown below for completeness — with 1440 x 720 x 216 cells this nested loop is far too slow to use in practice):

# create an empty data frame, with time declared as Date
df <- data.frame(time = as.Date(character()), lon = numeric(),
                 lat = numeric(), lwe_thickness = numeric())
for (i in 1:dim(lwe_thickness)[1]) {
  for (j in 1:dim(lwe_thickness)[2]) {
    for (k in 1:dim(lwe_thickness)[3]) {
      # append one row per grid cell and time step
      new_row <- data.frame(time = time[k], lon = lon[i], lat = lat[j],
                            lwe_thickness = lwe_thickness[i, j, k])
      df <- rbind(df, new_row)
    }
  }
}

A much faster route is to let data.table melt the array directly:

library(data.table)
# reshape the array into a long (flattened) table: one row per (lon, lat, time) cell
lwe_thickness_flat <- as.data.table(lwe_thickness)
Output:
             V1  V2  V3     value
         1:   1   1   1 -2.468668
         2:   1   1   2 -1.569404
         3:   1   1   3 -3.044039
         4:   1   1   4 -3.201635
         5:   1   1   5 -1.785586
        ---
223948796: 1440 720 212  2.893622
223948797: 1440 720 213  5.812358
223948798: 1440 720 214  9.525175
223948799: 1440 720 215 10.497257
223948800: 1440 720 216  8.949245
The data volume is staggering: more than 220 million rows (1440 x 720 x 216 = 223,948,800)!
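Before going any further it is worth checking how much memory the flattened table actually occupies — a minimal sketch using base R's object.size(); the exact figure will depend on your session:

# approximate memory footprint of the flattened table
print(object.size(lwe_thickness_flat), units = "Gb")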
- Replacing V1, V2, and V3 with the actual lon, lat, and time values
# lon
lwe_thickness_flat$V1 <- lon[lwe_thickness_flat$V1]
# lat
lwe_thickness_flat$V2 <- lat[lwe_thickness_flat$V2]
# time
lwe_thickness_flat$V3 <- time[lwe_thickness_flat$V3]
- Renaming the columns
# make sure lwe_thickness_flat is a data.table object
lwe_thickness_flat <- as.data.table(lwe_thickness_flat)
# set the column names (setnames modifies the table by reference, without copying)
setnames(lwe_thickness_flat,
         c("V1", "V2", "V3", "value"),
         c("lon", "lat", "time", "lwe_thickness"))
With more than 200 million rows, renaming columns the usual base-R way (which copies the whole object) is not practical; data.table's setnames() renames by reference instead. At this point the data extraction is complete.
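A quick sanity check (the printed values will of course depend on the data) confirms the renamed columns and their types:

# confirm the final column names and types
str(lwe_thickness_flat)
head(lwe_thickness_flat)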
- Saving to CSV
# write to file, dropping row names
fwrite(lwe_thickness_flat, file = "D:/log/modis.csv", row.names = FALSE)
Saving it out as a CSV means we won't have to redo the whole conversion the next time we analyse the data!
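When we come back to the data later, data.table's fread() can load the CSV far faster than repeating the NetCDF round trip; a minimal sketch, assuming the same path as above:

library(data.table)
# read the CSV back in as a data.table
lwe_thickness_flat <- fread("D:/log/modis.csv")
# restore the Date type on the time column
lwe_thickness_flat[, time := as.Date(time)]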
3. Basic Data Display
- Sampling
Because the full data set is far too large to plot at once, we pull out a single raster slice (one date).
library(dplyr)
# keep only the rows for a single date
sampled_data <- lwe_thickness_flat %>%
  filter(time == as.Date("2002-04-18"))
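Before picking contour levels it helps to know the range of values in this slice — a small sketch; the exact numbers will depend on the data:

# inspect the value range of the selected slice to guide the contour levels
summary(sampled_data$lwe_thickness)
range(sampled_data$lwe_thickness, na.rm = TRUE)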
- Plotting
library(ggplot2)
library(plotly)

# set the sequence of contour levels
level_seq <- seq(from = -90, to = 300, by = 0.25)

# draw a contour map using the levels defined above
p <- ggplot(data = sampled_data, aes(x = lon, y = lat, z = lwe_thickness)) +
  geom_contour(aes(colour = ..level..), breaks = level_seq) +
  scale_colour_gradient(low = "blue", high = "red")

# turn the ggplot object into an interactive plotly object
ggplotly(p)
I have only just started working in GIS, so producing the map graphics is still a bit of a struggle. I will keep learning from published research and aim to provide cleaner, better-looking figures in the future — I would love to exchange ideas and learn together with you.
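One possible next step (a sketch only, not part of the workflow above): for gridded values like these, a filled raster map is often easier to read than contour lines, and it needs nothing beyond ggplot2:

library(ggplot2)
# draw each 0.25-degree grid cell as a coloured pixel instead of contour lines
ggplot(sampled_data, aes(x = lon, y = lat, fill = lwe_thickness)) +
  geom_raster() +
  scale_fill_gradient2(low = "red", mid = "white", high = "blue", midpoint = 0) +
  coord_equal() +
  labs(x = "Longitude", y = "Latitude", fill = "LWE thickness (cm)")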