Ashleigh
Ashleigh

Reputation: 33

Return number based on grouping variables

I am working with a bird dataset where each individual (ID) has the territory they are born in (TERR), year of birth (YOB) and number of days after the start of the year they are born in (DOB).

There are often multiple individuals which have the same TERR, YEAR and DOB. There may be more individuals born in the same TERR and YEAR, but these will have a different DOB (in this data set the first set of individuals will have a 'lower' DOB than the second set of individuals).

I want to insert a new column 'n' where the first set of individuals per year return a '1', the second set of individuals return a '2' and the third set of individuals return a '3'. When it is the next year the number will revert back to '1'.

E.g.

ID TERR YOB  DOB N
1  A1   1982 148 1
2  A1   1982 148 1
3  A1   1982 148 1
4  A1   1982 185 2
5  A1   1982 185 2
6  A1   1985 137 1
7  A1   1985 137 1
8  BIAN 1989 132 1
9  BIAN 1989 132 1
10 BIAN 1989 132 1
11 BIAN 1992 155 1
12 BIAN 1992 155 1
13 BIAN 1992 155 1
14 BIAN 1992 254 2
15 BIAN 1992 254 2
16 BIAN 1992 254 2
17 BIAN 1994 164 1
18 BIAN 1994 164 1
19 GATE 1998 119 1
20 GATE 1998 119 1
21 GATE 1998 172 2
22 GATE 1998 172 2
23 GATE 1998 172 2
24 GATE 1999 153 1
25 GATE 1999 153 1

I'm quite new to R so any help is much appreciated. I've been trying to use the if_else function but not getting far with it.

Upvotes: 3

Views: 36

Answers (2)

akrun
akrun

Reputation: 887881

After grouping by 'TERR', 'YOB', get the match of 'DOB' with unique elements of 'DOB'

library(dplyr)
out <- df1 %>%
         group_by(TERR, YOB) %>% 
         mutate(N1 = match(DOB, unique(DOB)))
identical(out$N, out$N1)
#[1] TRUE

out
# A tibble: 25 x 6
# Groups:   TERR, YOB [7]
#      ID TERR    YOB   DOB     N    N1
#   <int> <chr> <int> <int> <int> <int>
# 1     1 A1     1982   148     1     1
# 2     2 A1     1982   148     1     1
# 3     3 A1     1982   148     1     1
# 4     4 A1     1982   185     2     2
# 5     5 A1     1982   185     2     2
# 6     6 A1     1985   137     1     1
# 7     7 A1     1985   137     1     1
# 8     8 BIAN   1989   132     1     1
# 9     9 BIAN   1989   132     1     1
#10    10 BIAN   1989   132     1     1
# ... with 15 more rows

Or convert the 'DOB' to factor and coerce it to numeric

df1 %>%
   group_by(TERR, YOB) %>% 
   mutate(N1 = as.integer(factor(DOB, levels = unique(DOB))))

The same approach can be used in base R with ave

with(df1, ave(DOB, TERR, YOB, FUN = function(x) match(x, unique(x))))

data

df1 <- structure(list(ID = 1:25, TERR = c("A1", "A1", "A1", "A1", "A1", 
"A1", "A1", "BIAN", "BIAN", "BIAN", "BIAN", "BIAN", "BIAN", "BIAN", 
"BIAN", "BIAN", "BIAN", "BIAN", "GATE", "GATE", "GATE", "GATE", 
"GATE", "GATE", "GATE"), YOB = c(1982L, 1982L, 1982L, 1982L, 
1982L, 1985L, 1985L, 1989L, 1989L, 1989L, 1992L, 1992L, 1992L, 
1992L, 1992L, 1992L, 1994L, 1994L, 1998L, 1998L, 1998L, 1998L, 
1998L, 1999L, 1999L), DOB = c(148L, 148L, 148L, 185L, 185L, 137L, 
137L, 132L, 132L, 132L, 155L, 155L, 155L, 254L, 254L, 254L, 164L, 
164L, 119L, 119L, 172L, 172L, 172L, 153L, 153L), N = c(1L, 1L, 
1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 1L, 1L)), .Names = c("ID", "TERR", "YOB", 
"DOB", "N"), class = "data.frame", row.names = c(NA, -25L))

Upvotes: 1

jogo
jogo

Reputation: 12569

Here is a solution with data.table:

library("data.table")
DT <- fread(
"ID TERR YOB  DOB N
1  A1   1982 148 1
2  A1   1982 148 1
3  A1   1982 148 1
4  A1   1982 185 2
5  A1   1982 185 2
6  A1   1985 137 1
7  A1   1985 137 1
8  BIAN 1989 132 1
9  BIAN 1989 132 1
10 BIAN 1989 132 1
11 BIAN 1992 155 1
12 BIAN 1992 155 1
13 BIAN 1992 155 1
14 BIAN 1992 254 2
15 BIAN 1992 254 2
16 BIAN 1992 254 2
17 BIAN 1994 164 1
18 BIAN 1994 164 1
19 GATE 1998 119 1
20 GATE 1998 119 1
21 GATE 1998 172 2
22 GATE 1998 172 2
23 GATE 1998 172 2
24 GATE 1999 153 1
25 GATE 1999 153 1")
DT[, N2:=rleidv(DOB), .(TERR, YOB)][]
# > DT[, N2:=rleidv(DOB), .(TERR, YOB)][]
#    ID TERR  YOB DOB N N2
# 1:  1   A1 1982 148 1  1
# 2:  2   A1 1982 148 1  1
# 3:  3   A1 1982 148 1  1
# 4:  4   A1 1982 185 2  2
# 5:  5   A1 1982 185 2  2
# 6:  6   A1 1985 137 1  1
# 7:  7   A1 1985 137 1  1
# 8:  8 BIAN 1989 132 1  1
# 9:  9 BIAN 1989 132 1  1
# 10: 10 BIAN 1989 132 1  1
# 11: 11 BIAN 1992 155 1  1
# 12: 12 BIAN 1992 155 1  1
# 13: 13 BIAN 1992 155 1  1
# 14: 14 BIAN 1992 254 2  2
# 15: 15 BIAN 1992 254 2  2
# 16: 16 BIAN 1992 254 2  2
# 17: 17 BIAN 1994 164 1  1
# 18: 18 BIAN 1994 164 1  1
# 19: 19 GATE 1998 119 1  1
# 20: 20 GATE 1998 119 1  1
# 21: 21 GATE 1998 172 2  2
# 22: 22 GATE 1998 172 2  2
# 23: 23 GATE 1998 172 2  2
# 24: 24 GATE 1999 153 1  1
# 25: 25 GATE 1999 153 1  1
#     ID TERR  YOB DOB N N2

Upvotes: 1

Related Questions