tnt
tnt

Reputation: 1447

add rows to data frame for non-observations

I have a dataframe that summarizes the number of times birds were observed at their breeding site one each day and each hour during daytime (i.e., when the sun was above the horizon). example:

head(df)
    ID   site day  hr  nObs
1    19  A    202  11  60
2    19  A    202  13  18
3    19  A    202  15  27
4     8  B    188  8   6
5     8  B    188  9   6
6     8  B    188  11  7

However, this dataframe does not include hours when the bird was not observed. Eg. no line for bird 19 on day 202 at 14 with an nObs value of 0.

I'd like to find a way, preferably with dplyr (tidy verse), to add in those rows for when individuals were not observed.

Upvotes: 2

Views: 159

Answers (2)

Jozef
Jozef

Reputation: 2737

One way to do this would be to first build a "template" of all possible combinations where birds can be observed and then merge ("left join") the actual observations onto that template:

a = read.table(text = "    ID   site day  hr  nObs
1    19  A    202  11  60
2    19  A    202  13  18
3    19  A    202  15  27
4     8  B    188  8   6
5     8  B    188  9   6
6     8  B    188  11  7")


tpl <- expand.grid(c(unique(a[, 1:3]), list(hr = 1:24)))
merge(tpl, a, all.x = TRUE)

Edit based on comment by @user3220999: in case we want to do the process per ID, we can just use split to get a list of data.frames per ID, get a list of templates and mapply merge on the two lists:

a <- split(a, a$ID)
tpl <- lapply(a, function(ai) {
  expand.grid(c(unique(ai[, 1:3]), list(hr = 1:24))) 
})
res <- mapply(merge, tpl, a, SIMPLIFY = FALSE, MoreArgs = list(all.x = TRUE)) 

Upvotes: 1

Sotos
Sotos

Reputation: 51582

You can use complete from tidyr, i.e.

library(tidyverse)

df %>% 
 group_by(ID, site) %>% 
 complete(hr = seq(min(hr), max(hr)))

which gives,

# A tibble: 9 x 5
# Groups:   ID, site [2]
     ID site     hr   day  nObs
  <int> <fct> <int> <int> <int>
1     8 B         8   188     6
2     8 B         9   188     6
3     8 B        10    NA    NA
4     8 B        11   188     7
5    19 A        11   202    60
6    19 A        12    NA    NA
7    19 A        13   202    18
8    19 A        14    NA    NA
9    19 A        15   202    27

Upvotes: 3

Related Questions