r - Merging two data tables with two id variables and NA values -
i merge following data tables
dt1 <- data.table(id = letters[1:5], day = 1, var1 = c(2,5,8,7,9), var2 = c(5,5,8,6,7), key = "id") dt2 <- data.table(id = letters[3:7], day = 2, var1 = c(1,7,6,6,3), var2 = c(2,3,3,2,1), key = "id") and results should include each id, each of days. unfortunately, ids not present on days.
id day var1 var2 1 2 5 2 na na b 1 5 5 b 2 na na c 1 8 8 c 2 1 2 d 1 7 6 d 2 7 3 i have tried setting id , day keys both dt. following lines don't manage day 2 id missing, , variables doubled (var1.x var1.y)
merge(dt1, dt2, by= c("id","day"), all=true) merge(dt1, dt2, by= c("day","id"), all=true) the allow.cartesian doesn't work either. has idea/comment on how final table need?
try
library(data.table) dcast(melt(rbind(dt1, dt2), id=c('id', 'day')), id+day~variable, value.var='value', drop=false) # 1: 1 2 5 # 2: 2 na na # 3: b 1 5 5 # 4: b 2 na na # 5: c 1 8 8 # 6: c 2 1 2 # 7: d 1 7 6 # 8: d 2 7 3 # 9: e 1 9 7 #10: e 2 6 3 #11: f 1 na na #12: f 2 6 2 #13: g 1 na na #14: g 2 3 1 or @bramvisser commented, replace rbind(dt1, dt2) rbindlist(list(dt1, dt2))
or approach without using melt/dcast
rbindlist(list(dt1, dt2))[, if(.n <2) .sd[c(.n, .n+1)] else .sd, id][, day:=replace(day, is.na(day), setdiff(1:2,na.omit(day))) , id][] or
setkey(rbindlist(list(dt1, dt2)), id, day)[cj(id=unique(id), day=unique(day))]
Comments
Post a Comment