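I have weather forecast data that records forecast rainfall for each hour. I want to compare it with observations, where rainfall is observed every 6 hours, so I need to aggregate the forecast data into 6-hourly totals.

Here is an overview of my data: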
              DateUtc              StationID  FcstDay  PrecipQuantity_hSum
    1         2014-01-01 12:00:00  54745      0        0
    2         2014-01-01 13:00:00  54745      0        0
    3         2014-01-01 14:00:00  54745      0        0
    4         2014-01-01 15:00:00  54745      0        0
    5         2014-01-01 16:00:00  54745      0        0
    6         2014-01-01 17:00:00  54745      0        0
    7         2014-01-01 18:00:00  54745      0        0
    8         2014-01-01 19:00:00  54745      0        0
    9         2014-01-01 20:00:00  54745      0        0
    10        2014-01-01 21:00:00  54745      0        0
    11        2014-01-01 22:00:00  54745      0        0
    12        2014-01-01 23:00:00  54745      0        0
    13        2014-01-02 00:00:00  54745      1        0
    14        2014-01-02 01:00:00  54745      1        0
    15        2014-01-02 02:00:00  54745      1        0
    16        2014-01-02 03:00:00  54745      1        0
    17        2014-01-02 04:00:00  54745      1        0
    18        2014-01-02 05:00:00  54745      1        0
    19        2014-01-02 06:00:00  54745      1        0
    20        2014-01-02 07:00:00  54745      1        0
    ...       <NA>                 <NA>       ...      ...
    13802582  2014-11-20 08:00:00  55005      7        0
    13802583  2014-11-20 09:00:00  55005      7        0
    13802584  2014-11-20 10:00:00  55005      7        0
    13802585  2014-11-20 11:00:00  55005      7        0
    13802586  2014-11-20 12:00:00  55005      7        0
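
To aggregate correctly, it is important to split the data by StationID (the weather station) and FcstDay (the number of days between the date the forecast was calculated and the date being forecast) before aggregating.

I have been using the xts package for the aggregation, and it works fine if I first subset the data manually, for example: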
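
    library(xts)

    # Manually subset one station and one forecast day
    z <- fcst[which(fcst$StationID == "54745" & fcst$FcstDay == 1), ]
    z.xts <- xts(z$PrecipQuantity_hSum, z$DateUtc)
    # Positions of the last observation in each 6-hour period
    ends <- endpoints(z.xts, "hours", 6)
    precip6 <- as.data.frame(period.apply(z.xts, ends, sum))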

I need to automate the subsetting, but whenever I wrap the xts functions in one of the split-apply functions I get the same error:

    Error in xts(z$PrecipQuantity_hSum, z$DateUtc) : NROW(x) must match length(order.by)

Here is the latest version of my code:

    df <- data.frame()
    d_ply(
      .data = fcst,
      .variables = c("FcstDay", "StationID"),
      .fun = function(z) {
        z.xts <- xts(z$PrecipQuantity_hSum, z$DateUtc)
        ends <- endpoints(z.xts, "hours", 6)
        precip6 <- as.data.frame(period.apply(z.xts, ends, sum))
        precip6$DateUtc <- rownames(precip6)
        rownames(precip6) <- NULL
        df <- rbind.fill(df, precip6)
      })

I have also tried nested for loops. Can anyone offer some guidance on the error? Code for a reproducible example data set is below. Thanks in advance.

    # Hourly timestamps for 2014, repeated for 3 stations x 3 forecast days
    DateUtc <- rep(seq(from = ISOdatetime(2014, 1, 1, 0, 0, 0),
                       to = ISOdatetime(2014, 12, 30, 0, 0, 0),
                       by = (60 * 60)),
                   times = 9)
    StationID <- rep(c("50060", "50061", "50062"), each = 3 * 8713)
    FcstDay <- rep(c(1, 2, 3), each = 8713, times = 3)
    PrecipQuantity_hSum <- rgamma(78417, shape = 1, rate = 20)
    fcst <- data.frame(DateUtc, StationID, FcstDay, PrecipQuantity_hSum)

I think the error David Robinson is getting is because your example code uses PrecipQuantity_6hSum rather than PrecipQuantity_hSum. Once that is changed, your ddply code works for me.

Does this work for you?
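
    library(plyr)
    library(xts)

    df <- ddply(
      .data = fcst,
      .variables = c("FcstDay", "StationID"),
      .fun = function(z) {
        # The column name must match the one in fcst
        # (PrecipQuantity_hSum in the example data above)
        z.xts <- xts(z$PrecipQuantity_hSum, z$DateUtc)
        ends <- endpoints(z.xts, "hours", 6)
        precip6 <- as.data.frame(period.apply(z.xts, ends, sum))
        precip6$DateUtc <- rownames(precip6)
        rownames(precip6) <- NULL
        # Return the per-group result; ddply row-binds the pieces
        return(precip6)
      })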
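
If it helps to cross-check the result, here is a rough base-R sketch of the same split-and-sum idea without xts or plyr. It is not the code from the question: `fcst_small` and the `Window` column are names I made up for illustration, the data are random, and I am assuming the 6-hour windows should start at 00, 06, 12 and 18 UTC. Each row here is labelled with the start of its window, whereas `period.apply` labels each window with its last timestamp as far as I know, so the time labels will differ even when the sums agree.

```r
# Small synthetic data set (hypothetical values; column names follow the question)
set.seed(1)
times <- seq(ISOdatetime(2014, 1, 1, 0, 0, 0, tz = "UTC"),
             by = "hour", length.out = 24)
fcst_small <- data.frame(
  DateUtc = rep(times, times = 2),
  StationID = rep(c("50060", "50061"), each = 24),
  FcstDay = 1,
  PrecipQuantity_hSum = rgamma(48, shape = 1, rate = 20)
)

# Label every hourly row with the start of its 6-hour window
# (assumption: windows are aligned to 00, 06, 12, 18 UTC)
window_secs <- 6 * 3600
fcst_small$Window <- as.POSIXct(
  floor(as.numeric(fcst_small$DateUtc) / window_secs) * window_secs,
  origin = "1970-01-01", tz = "UTC"
)

# Sum the hourly values within each window, separately per station and forecast day
precip6 <- aggregate(
  PrecipQuantity_hSum ~ Window + StationID + FcstDay,
  data = fcst_small, FUN = sum
)
```

With one day of data for two stations this should give one row per station per 6-hour window. The grouping is done in a single `aggregate` call, so there is no need to accumulate results into a data frame inside a loop.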