user5316628
user5316628

Reputation: 357

Return value of next occurrence based on multiple columns

Bellow is a DF forwhich I would like to add a fourth column "NextTime":

  Firm   Time  Date
1    A   8:00   1/1
2    B   9:00   1/1
3    A   9:30   1/1
4    B  10:00   1/1
5    B  11:00   1/1
6    B   8:00   1/2
7    C  10:00   1/2
8    C  10:30   1/2
9    C  10:35   1/2
10   B  10:55   1/2

The NextTime column would be the "next" Time, given the same Firm and Date. For the last value for the Firm and Date, then a value (16:30) should be entered, as follows:

  Firm   Time  Date  NextTime
1    A   8:00   1/1      9:30
2    B   9:00   1/1     10:00
3    A   9:30   1/1     16:30
4    B  10:00   1/1     11:00
5    B  11:00   1/1     16:30
6    B   8:00   1/2     10:55 
7    C  10:00   1/2     10:30
8    C  10:30   1/2     10:35
9    C  10:35   1/2     16:30
10   B  10:55   1/2     16:30

Have tried variations of match, zoo, and apply, help would be appreciated.

Upvotes: 1

Views: 87

Answers (2)

G. Grothendieck
G. Grothendieck

Reputation: 269471

Here is a base R solution. For each combination of Firm and Date ave returns the next value or "16:40" :

transform(DF, NextTime = ave(Time, Firm, Date, FUN = function(x) c(x[-1], "16:30")))

giving:

   Firm  Time Date NextTime
1     A  8:00  1/1     9:30
2     B  9:00  1/1    10:00
3     A  9:30  1/1    16:30
4     B 10:00  1/1    11:00
5     B 11:00  1/1    16:30
6     B  8:00  1/2    10:55
7     C 10:00  1/2    10:30
8     C 10:30  1/2    10:35
9     C 10:35  1/2    16:30
10    B 10:55  1/2    16:30

Note To make this reproducible (please provide this yourself next time) we used this input:

Lines <- "Firm   Time  Date
1    A   8:00   1/1
2    B   9:00   1/1
3    A   9:30   1/1
4    B  10:00   1/1
5    B  11:00   1/1
6    B   8:00   1/2
7    C  10:00   1/2
8    C  10:30   1/2
9    C  10:35   1/2
10   B  10:55   1/2"

DF <- read.table(text = Lines, header = TRUE, as.is = TRUE)

Upvotes: 0

akrun
akrun

Reputation: 886978

We can use data.table. We convert the 'data.frame' to 'data.table' (setDT(df1)), grouped by 'Firm' and 'Date', we use the type as 'lead' of 'Time' with fill as '16:30' in the shift function.

library(data.table)#v1.9.6+
setDT(df1)[, NextTime:= shift(Time, type='lead', fill= '16:30'), .(Firm, Date)]
df1
#    Firm  Time Date NextTime
# 1:    A  8:00  1/1     9:30
# 2:    B  9:00  1/1    10:00
# 3:    A  9:30  1/1    16:30
# 4:    B 10:00  1/1    11:00
# 5:    B 11:00  1/1    16:30
# 6:    B  8:00  1/2    10:55
# 7:    C 10:00  1/2    10:30
# 8:    C 10:30  1/2    10:35
# 9:    C 10:35  1/2    16:30
#10:    B 10:55  1/2    16:30

Or using dplyr, we group by the columns and get the lead of 'Time' to create a new column.

library(dplyr)
df1 %>% 
  group_by(Firm, Date) %>% 
  mutate(NextTime= lead(as.character(Time), default='16:30'))

Upvotes: 2

Related Questions