Reputation: 33
I am working with a bird dataset where each individual (ID) has the territory they are born in (TERR), year of birth (YOB) and number of days after the start of the year they are born in (DOB).
There are often multiple individuals which have the same TERR, YEAR and DOB. There may be more individuals born in the same TERR and YEAR, but these will have a different DOB (in this data set the first set of individuals will have a 'lower' DOB than the second set of individuals).
I want to insert a new column 'n' where the first set of individuals per year return a '1', the second set of individuals return a '2' and the third set of individuals return a '3'. When it is the next year the number will revert back to '1'.
E.g.
ID TERR YOB DOB N
1 A1 1982 148 1
2 A1 1982 148 1
3 A1 1982 148 1
4 A1 1982 185 2
5 A1 1982 185 2
6 A1 1985 137 1
7 A1 1985 137 1
8 BIAN 1989 132 1
9 BIAN 1989 132 1
10 BIAN 1989 132 1
11 BIAN 1992 155 1
12 BIAN 1992 155 1
13 BIAN 1992 155 1
14 BIAN 1992 254 2
15 BIAN 1992 254 2
16 BIAN 1992 254 2
17 BIAN 1994 164 1
18 BIAN 1994 164 1
19 GATE 1998 119 1
20 GATE 1998 119 1
21 GATE 1998 172 2
22 GATE 1998 172 2
23 GATE 1998 172 2
24 GATE 1999 153 1
25 GATE 1999 153 1
I'm quite new to R so any help is much appreciated. I've been trying to use the if_else function but not getting far with it.
Upvotes: 3
Views: 36
Reputation: 887881
After grouping by 'TERR', 'YOB', get the match
of 'DOB' with unique
elements of 'DOB'
library(dplyr)
out <- df1 %>%
group_by(TERR, YOB) %>%
mutate(N1 = match(DOB, unique(DOB)))
identical(out$N, out$N1)
#[1] TRUE
out
# A tibble: 25 x 6
# Groups: TERR, YOB [7]
# ID TERR YOB DOB N N1
# <int> <chr> <int> <int> <int> <int>
# 1 1 A1 1982 148 1 1
# 2 2 A1 1982 148 1 1
# 3 3 A1 1982 148 1 1
# 4 4 A1 1982 185 2 2
# 5 5 A1 1982 185 2 2
# 6 6 A1 1985 137 1 1
# 7 7 A1 1985 137 1 1
# 8 8 BIAN 1989 132 1 1
# 9 9 BIAN 1989 132 1 1
#10 10 BIAN 1989 132 1 1
# ... with 15 more rows
Or convert the 'DOB' to factor
and coerce it to numeric
df1 %>%
group_by(TERR, YOB) %>%
mutate(N1 = as.integer(factor(DOB, levels = unique(DOB))))
The same approach can be used in base R
with ave
with(df1, ave(DOB, TERR, YOB, FUN = function(x) match(x, unique(x))))
df1 <- structure(list(ID = 1:25, TERR = c("A1", "A1", "A1", "A1", "A1",
"A1", "A1", "BIAN", "BIAN", "BIAN", "BIAN", "BIAN", "BIAN", "BIAN",
"BIAN", "BIAN", "BIAN", "BIAN", "GATE", "GATE", "GATE", "GATE",
"GATE", "GATE", "GATE"), YOB = c(1982L, 1982L, 1982L, 1982L,
1982L, 1985L, 1985L, 1989L, 1989L, 1989L, 1992L, 1992L, 1992L,
1992L, 1992L, 1992L, 1994L, 1994L, 1998L, 1998L, 1998L, 1998L,
1998L, 1999L, 1999L), DOB = c(148L, 148L, 148L, 185L, 185L, 137L,
137L, 132L, 132L, 132L, 155L, 155L, 155L, 254L, 254L, 254L, 164L,
164L, 119L, 119L, 172L, 172L, 172L, 153L, 153L), N = c(1L, 1L,
1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 1L, 1L)), .Names = c("ID", "TERR", "YOB",
"DOB", "N"), class = "data.frame", row.names = c(NA, -25L))
Upvotes: 1
Reputation: 12569
Here is a solution with data.table
:
library("data.table")
DT <- fread(
"ID TERR YOB DOB N
1 A1 1982 148 1
2 A1 1982 148 1
3 A1 1982 148 1
4 A1 1982 185 2
5 A1 1982 185 2
6 A1 1985 137 1
7 A1 1985 137 1
8 BIAN 1989 132 1
9 BIAN 1989 132 1
10 BIAN 1989 132 1
11 BIAN 1992 155 1
12 BIAN 1992 155 1
13 BIAN 1992 155 1
14 BIAN 1992 254 2
15 BIAN 1992 254 2
16 BIAN 1992 254 2
17 BIAN 1994 164 1
18 BIAN 1994 164 1
19 GATE 1998 119 1
20 GATE 1998 119 1
21 GATE 1998 172 2
22 GATE 1998 172 2
23 GATE 1998 172 2
24 GATE 1999 153 1
25 GATE 1999 153 1")
DT[, N2:=rleidv(DOB), .(TERR, YOB)][]
# > DT[, N2:=rleidv(DOB), .(TERR, YOB)][]
# ID TERR YOB DOB N N2
# 1: 1 A1 1982 148 1 1
# 2: 2 A1 1982 148 1 1
# 3: 3 A1 1982 148 1 1
# 4: 4 A1 1982 185 2 2
# 5: 5 A1 1982 185 2 2
# 6: 6 A1 1985 137 1 1
# 7: 7 A1 1985 137 1 1
# 8: 8 BIAN 1989 132 1 1
# 9: 9 BIAN 1989 132 1 1
# 10: 10 BIAN 1989 132 1 1
# 11: 11 BIAN 1992 155 1 1
# 12: 12 BIAN 1992 155 1 1
# 13: 13 BIAN 1992 155 1 1
# 14: 14 BIAN 1992 254 2 2
# 15: 15 BIAN 1992 254 2 2
# 16: 16 BIAN 1992 254 2 2
# 17: 17 BIAN 1994 164 1 1
# 18: 18 BIAN 1994 164 1 1
# 19: 19 GATE 1998 119 1 1
# 20: 20 GATE 1998 119 1 1
# 21: 21 GATE 1998 172 2 2
# 22: 22 GATE 1998 172 2 2
# 23: 23 GATE 1998 172 2 2
# 24: 24 GATE 1999 153 1 1
# 25: 25 GATE 1999 153 1 1
# ID TERR YOB DOB N N2
Upvotes: 1