tstev
tstev

Reputation: 616

Combining/merging a column in R

I am probably missing something quite easy with this problem. I can't find the proper answer anywhere - and I really need to move on. So I have oversimplified my data:

eventID <- c(2,4)
Time <- c("09:32","09:43")
df1 <- data.frame(eventID,Time)
eventID <- rep(c(1:4),rep(3,4))
Time <- rep(c("09:30",NA,"09:40",NA),rep(3,4))
df2 <- data.frame(eventID,Time)

What I would like is to combine the Time column. So the NAs in df2 should be filled up with Time values from df1 matching the eventID. My original data is quite large so a for-loop is not what I am looking for really. I was hoping this would work:

> (res1 <- merge(df1,df2, by = "eventID", all = T))
#   eventID Time.x Time.y
#1        1   <NA>  09:30
#2        1   <NA>  09:30
#3        1   <NA>  09:30
#4        2  09:32   <NA>
#5        2  09:32   <NA>
#6        2  09:32   <NA>
#7        3   <NA>  09:40
#8        3   <NA>  09:40
#9        3   <NA>  09:40
#10       4  09:43   <NA>
#11       4  09:43   <NA>
#12       4  09:43   <NA>

Desired output:

> eventID <- rep(c(1:4),rep(3,4))
> Time <- rep(c("09:30","09:32","09:40","09:43"), rep(3,4))
> (res2 <- data.frame(eventID,Time))
#   eventID  Time
#1        1 09:30
#2        1 09:30
#3        1 09:30
#4        2 09:32
#5        2 09:32
#6        2 09:32
#7        3 09:40
#8        3 09:40
#9        3 09:40
#10       4 09:43
#11       4 09:43
#12       4 09:43

Suggestions? If possible in base I would prefer that, or data.table package.

Upvotes: 3

Views: 83

Answers (1)

akrun
akrun

Reputation: 886968

Try

library(data.table)
setkey(setDT(df2), eventID)[df1, Time:= i.Time][]
#   eventID  Time
#1:       1 09:30
#2:       1 09:30
#3:       1 09:30
#4:       2 09:32
#5:       2 09:32
#6:       2 09:32
#7:       3 09:40
#8:       3 09:40
#9:       3 09:40
#10:      4 09:43
#11:      4 09:43
#12:      4 09:43

Or a base R option after the merge would be

 data.frame(eventID= res1[,1], 
     Time=res1[-1][cbind(1:nrow(res1),max.col(!is.na(res1[-1])))])

Update

Suppose there are some 'eventID' in the first dataset are not in the second ('df2'), then we can specify nomatch=0

 df3 <- data.frame(eventID= c(2,4,5), Time=c('09:32', '09:43', '09:45'))

 setkey(setDT(df2), eventID)[df3, Time:= i.Time,nomatch=0][]
  #    eventID  Time
  # 1:       1 09:30
  # 2:       1 09:30
  # 3:       1 09:30
  # 4:       2 09:32
  # 5:       2 09:32
  # 6:       2 09:32
  # 7:       3 09:40
  # 8:       3 09:40
  # 9:       3 09:40
  #10:       4 09:43
  #11:       4 09:43
  #12:       4 09:43

Upvotes: 4

Related Questions