Reputation: 73
I have an issue right now. the column I am trying to manipulate looks like this:
> DT <- data.table(Group= c("SM", NA, NA, NA, NA, NA, "GH", NA, NA, NA, NA, NA, NA, NA))
> DT
Group
1: SM
2: <NA>
3: <NA>
4: <NA>
5: <NA>
6: <NA>
7: GH
8: <NA>
9: <NA>
10: <NA>
11: <NA>
12: <NA>
13: <NA>
14: <NA>
I want to fill the NAs with the previous value but just for a specific amount of rows, in this case just 4, meaning the desired result is:
Group
1: SM
2: SM
3: SM
4: SM
5: SM
6: <NA>
7: GH
8: GH
9: GH
10: GH
11: GH
12: <NA>
13: <NA>
14: <NA>
how can I achieve this? I tried with na.locf() but it's not doing what I want it to do. thanks in advance
Upvotes: 3
Views: 181
Reputation: 886938
An option with data.table
would be
library(data.table)
DT[, Group := Group[1][NA^(seq_len(.N) > 5)], cumsum(!is.na(Group))]
DT
# Group
# 1: SM
# 2: SM
# 3: SM
# 4: SM
# 5: SM
# 6: <NA>
# 7: GH
# 8: GH
# 9: GH
#10: GH
#11: GH
#12: <NA>
#13: <NA>
#14: <NA>
Upvotes: 4
Reputation: 39154
Here is a solution using the dplyr
package.
library(dplyr)
library(data.table)
# Set the threshold
threshold <- 4
DT2 <- DT %>%
mutate(Group_ID = cumsum(!is.na(Group))) %>%
group_by(Group_ID) %>%
mutate(ID = row_number() - 1) %>%
mutate(Group = ifelse(ID <= threshold, first(Group), NA_character_)) %>%
ungroup() %>%
select(Group)
DT2
# # A tibble: 14 x 1
# Group
# <chr>
# 1 SM
# 2 SM
# 3 SM
# 4 SM
# 5 SM
# 6 NA
# 7 GH
# 8 GH
# 9 GH
# 10 GH
# 11 GH
# 12 NA
# 13 NA
# 14 NA
Upvotes: 3
Reputation: 17289
Here is a way to do it:
> DT[, Group := ifelse(seq_len(.N) <= 1 + 4, Group[1], Group),by = cumsum(!is.na(Group))]
> DT
Group
1: SM
2: SM
3: SM
4: SM
5: SM
6: <NA>
7: GH
8: GH
9: GH
10: GH
11: GH
12: <NA>
13: <NA>
14: <NA>
Upvotes: 2