Hao

Reputation: 23

How to efficiently replace first row's NA with 0 by group with R

Is there a better way to replace the NAs in the first row of each group with 0? Here is my example. Thanks.

library(data.table)

# Build a 4x3 numeric table with NAs, then add a grouping column "name"
x <- matrix(c(NA,NA,2,3,NA,4,NA,NA,6,NA,NA,7),nrow=4)
x <- as.data.table(x)
names(x) <- c("a","b","c")
name <- rep(c("P-1","P-2"),each=2)
x <- cbind(name,x)

x[!duplicated(x$name),] <- replace(x[!duplicated(x$name),],sapply(x[!duplicated(x$name),],is.na),0)

Upvotes: 2

Views: 635

Answers (3)

GKi

Reputation: 39727

You can store !duplicated(x$name) once and reuse it, and there is no need for sapply because is.na works on the whole subset directly. A base R solution that replaces NA with 0 in the first row of each group:

i <- !duplicated(x$name)
x[i,] <- replace(x[i,], is.na(x[i,]), 0)
x
#  name  a  b  c
#1  P-1  0  0  6
#2  P-1 NA  4 NA
#3  P-2  2  0  0
#4  P-2  3 NA  7

Upvotes: 2

chinsoon12

Reputation: 25223

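Another data.table option, which updates x by reference on the rows where name differs from the previous row (so it assumes the rows of each group are contiguous):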

x[name!=shift(name, fill=""), c("a","b","c") := {
    s <- copy(.SD) 
    s[is.na(.SD)] <- 0
    s
}, .SDcols=a:c]
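
A related variant, shown only as a minimal sketch: the same in-place update can be written with data.table's set() in a loop over columns, which avoids copying .SD. The data below rebuilds the question's example directly, and the names first, col and rows are illustrative, not from the original posts.

```r
library(data.table)

# Rebuild the example data from the question
x <- data.table(
  name = rep(c("P-1", "P-2"), each = 2),
  a = c(NA, NA, 2, 3),
  b = c(NA, 4, NA, NA),
  c = c(6, NA, NA, 7)
)

# Row indices of the first row of each group
first <- which(!duplicated(x$name))

for (col in c("a", "b", "c")) {
  # Keep only the first rows where this column is NA
  rows <- first[is.na(x[[col]][first])]
  # Assign 0 by reference; skip columns with nothing to replace
  if (length(rows)) set(x, i = rows, j = col, value = 0)
}
x
```

Unlike the shift() approach, !duplicated() does not require the rows of a group to be adjacent.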

Upvotes: 1

Ronak Shah

Reputation: 389305

We can replace NA values in the first row of each group, for all columns.

Using data.table, that can be done as follows (this returns a new data.table, so assign the result if you want to keep it):

library(data.table)
x[, lapply(.SD, function(x) replace(x, seq_along(x) == 1 & is.na(x), 0)), name]

#   name  a  b  c
#1:  P-1  0  0  6
#2:  P-1 NA  4 NA
#3:  P-2  2  0  0
#4:  P-2  3 NA  7

Or with dplyr:

library(dplyr)

x %>%
  group_by(name) %>%
  mutate_at(vars(-group_cols()), ~replace(., row_number() == 1 & is.na(.), 0))
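
mutate_at() has been superseded by across() since dplyr 1.0.0. A minimal sketch of the same idea with across(), assuming dplyr >= 1.0.0 and rebuilding the question's data as a plain data.frame (the name res is only for illustration):

```r
library(dplyr)

# Rebuild the example data from the question
x <- data.frame(
  name = rep(c("P-1", "P-2"), each = 2),
  a = c(NA, NA, 2, 3),
  b = c(NA, 4, NA, NA),
  c = c(6, NA, NA, 7)
)

res <- x %>%
  group_by(name) %>%
  # In a grouped mutate, everything() skips the grouping column
  mutate(across(everything(), ~ replace(.x, row_number() == 1 & is.na(.x), 0))) %>%
  ungroup()
res
```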

Upvotes: 6
