Nebulloyd
Nebulloyd

Reputation: 264

Create consecutive numbering column based on 2 conditional columns in R

I am trying to create a survey day number column. The column should be full of repeating numbers that increases to the next consecutive number each time the date changes. The count of these consecutive values should also restart each time the value in the survey column changes.

I am working in R Version 1.1.463 on mac High Sierra

I have tried using the data.frame::rleid() function which comes close, providing the repeating values that change consecutively for each date from rleid(x$DATE) but I can't figure out how to restart the count when the SURVEY value changes. The outcome should look as below but SURVEY.DAY values will go beyond 2 in the real world...


DATE<-rep(c("2012-03-12", "2012-03-13", "2012-04-21", "2012-04-22", "2012-09-07", "2012-09-08"), each=4)

x<- data.frame(DATE= c(DATE), SURVEY=c(rep(1:3, each=8)), SURVEY.DAY= c(rep(1:2, each=4)))

    DATE        SURVEY    SURVEY.DAY
1  2012-03-12      1          1
2  2012-03-12      1          1
3  2012-03-12      1          1
4  2012-03-12      1          1
5  2012-03-13      1          2
6  2012-03-13      1          2
7  2012-03-13      1          2
8  2012-03-13      1          2
9  2012-04-21      2          1
10 2012-04-21      2          1
11 2012-04-21      2          1
12 2012-04-21      2          1
13 2012-04-22      2          2
14 2012-04-22      2          2
15 2012-04-22      2          2
16 2012-04-22      2          2
17 2012-09-07      3          1
18 2012-09-07      3          1
19 2012-09-07      3          1
20 2012-09-07      3          1
21 2012-09-08      3          2
22 2012-09-08      3          2
23 2012-09-08      3          2
24 2012-09-08      3          2



I have looked here and here but they are either too complicated or not exactly what I need. Ideally I would like to have a slightly modified version of my existing code rleid(x$DATE)

EDIT

Please note the example given is just an arbitrary example I created to show my desired outcome. I chose these dates hoping it would make it clearer that a new survey was beginning by changing the month for each survey but in my data there is no pattern between month and survey number.

Upvotes: 0

Views: 663

Answers (1)

Humpelstielzchen
Humpelstielzchen

Reputation: 6441

How about this?

library(tidyverse)
library(data.table)

x %>%
  group_by(SURVEY) %>%
  mutate(SURVEY.DAY = rleid(DATE))

# A tibble: 24 x 3
# Groups:   SURVEY [3]
   DATE       SURVEY SURVEY.DAY
   <fct>       <int>      <int>
 1 2012-03-12      1          1
 2 2012-03-12      1          1
 3 2012-03-12      1          1
 4 2012-03-12      1          1
 5 2012-03-13      1          2
 6 2012-03-13      1          2
 7 2012-03-13      1          2
 8 2012-03-13      1          2
 9 2012-04-21      2          1
10 2012-04-21      2          1
11 2012-04-21      2          1
12 2012-04-21      2          1
13 2012-04-22      2          2
14 2012-04-22      2          2
15 2012-04-22      2          2
16 2012-04-22      2          2
17 2012-09-07      3          1
18 2012-09-07      3          1
19 2012-09-07      3          1
20 2012-09-07      3          1
21 2012-09-08      3          2
22 2012-09-08      3          2
23 2012-09-08      3          2
24 2012-09-08      3          2

Upvotes: 2

Related Questions