I have a tibble of diagnosis data:

    library(tibble)

    data <- tibble(
      id = 1:10,
      diagnosis_1 = c("F32", "F431", "R58", "S32", "F11", NA, NA, "Y67", "F32", "Z032"),
      diagnosis_2 = c(NA, NA, NA, NA, NA, NA, "G35", NA, NA, NA),
      diagnosis_3 = c("F40", NA, "R67", "F431", NA, "F60", "S58", "R68", "F11", NA),
      diagnosis_4 = c(NA, NA, "F65", NA, "F19", NA, NA, "F32", NA, NA)
    )
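As part of the cleaning process, I have removed all diagnoses that do not meet certain criteria (i.e. those that do not start with the letter F, G or Z), using the following code: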
    library(stringr)  # provides str_sub()

    # Set every code whose first letter is R, S or Y to NA
    data$diagnosis_1[str_sub(data$diagnosis_1, 1, 1) %in% c("R", "S", "Y")] <- NA
    data$diagnosis_2[str_sub(data$diagnosis_2, 1, 1) %in% c("R", "S", "Y")] <- NA
    data$diagnosis_3[str_sub(data$diagnosis_3, 1, 1) %in% c("R", "S", "Y")] <- NA
    data$diagnosis_4[str_sub(data$diagnosis_4, 1, 1) %in% c("R", "S", "Y")] <- NA
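The four near-identical lines can also be written as one loop over the diagnosis columns. The following is only a sketch on made-up toy data (`df` and `diag_cols` are hypothetical names, not part of the original post), and it uses an allow-list of F, G and Z instead of the deny-list of R, S and Y, which matches the stated criterion more directly:

```r
# Toy data: hypothetical values, not the tibble from the question
df <- data.frame(
  id = 1:3,
  diagnosis_1 = c("F32", "R58", NA),
  diagnosis_2 = c("S32", "G35", "Z032"),
  stringsAsFactors = FALSE
)

# Select every column whose name starts with "diagnosis_"
diag_cols <- grep("^diagnosis_", names(df), value = TRUE)

# Keep codes starting with F, G or Z; set everything else to NA
df[diag_cols] <- lapply(df[diag_cols], function(x) {
  x[!grepl("^[FGZ]", x)] <- NA
  x
})

print(df)
```

With an allow-list, a code starting with any unexpected letter is dropped as well, which may or may not be what is wanted for a given data set.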
I end up with a tibble in which the remaining diagnoses are scattered across the columns.
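I now need to shift the data to the left so that the columns are filled from left to right (i.e. diagnosis_1 must not be empty if diagnosis_2, diagnosis_3 or diagnosis_4 contains data). I have tried ifelse() because it is vectorised, but I cannot get it to work with several nested ifelse() calls.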
    ifelse(is.na(data$diagnosis_1), data$diagnosis_2, data$diagnosis_1)
Any suggestions are much appreciated.
Edit: added the expected output:
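We first replace the values starting with "R", "S" or "Y" with NA, and then shift the non-NA values to the left.

    # Inside a character class no "|" is needed: [RSY] matches R, S or Y
    data[-1] <- lapply(data[-1], function(x) replace(x, grepl("^[RSY]", x), NA))

    # For each row: drop the NAs, then pad back to the original length with NA.
    # Note: apply() coerces each row to character, so id becomes <chr> as well.
    data[] <- t(apply(data, 1, function(x) `length<-`(na.omit(x), length(x))))
    data

    # A tibble: 10 x 5
    #    id    diagnosis_1 diagnosis_2 diagnosis_3 diagnosis_4
    #    <chr> <chr>       <chr>       <chr>       <chr>
    #  1 " 1"  F32         F40         NA          NA
    #  2 " 2"  F431        NA          NA          NA
    #  3 " 3"  F65         NA          NA          NA
    #  4 " 4"  F431        NA          NA          NA
    #  5 " 5"  F11         F19         NA          NA
    #  6 " 6"  F60         NA          NA          NA
    #  7 " 7"  G35         NA          NA          NA
    #  8 " 8"  F32         NA          NA          NA
    #  9 " 9"  F32         F11         NA          NA
    # 10 10    Z032        NA          NA          NA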
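To see what the row-wise function does, it may help to run it on a single vector. This is just an illustration with made-up values (`x`, `kept` and `shifted` are names chosen here):

```r
x <- c(NA, "G35", NA, "F32")   # one row of diagnoses (made-up values)

# na.omit() drops the missing values and keeps the original order
kept <- na.omit(x)

# `length<-` resets the length; lengthening pads the end with NA
shifted <- `length<-`(kept, length(x))

print(as.character(shifted))
```

The non-missing codes end up at the front in their original order, and the vector keeps its original length.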
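The step that shifts the non-NA values to the left is taken from David's answer here. You can also try any of the other approaches from that same question to shift the values.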
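If the id column should keep its original type, one option is to shift only the diagnosis columns. The following is a rough base R sketch on toy data, not the approach from the linked answer; `shift_left`, `df` and `diag_cols` are hypothetical names introduced for illustration:

```r
# Move the non-NA values of a vector to the front, pad the rest with NA
shift_left <- function(x) {
  kept <- unname(x[!is.na(x)])
  c(kept, rep(NA_character_, length(x) - length(kept)))
}

# Toy data: hypothetical values, already cleaned
df <- data.frame(
  id = 1:3,
  diagnosis_1 = c("F32", NA, NA),
  diagnosis_2 = c(NA, "G35", NA),
  diagnosis_3 = c("F40", NA, "Z032"),
  stringsAsFactors = FALSE
)

diag_cols <- grep("^diagnosis_", names(df), value = TRUE)

# apply() over rows returns one column per row, hence the t()
shifted <- t(apply(as.matrix(df[diag_cols]), 1, shift_left))

# Write the shifted values back column by column; id is never touched
for (j in seq_along(diag_cols)) {
  df[[diag_cols[j]]] <- shifted[, j]
}

print(df)
```

Because only the diagnosis columns pass through apply(), id stays an integer column instead of being coerced to character.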