EGM8686

Reputation: 1572

Conditionally change NA's to zeros

I have a dataframe with a structure like this:

id x1 x2 x3 x4 x5 x6 x7  pos
1  1  2  1  5  NA NA NA  1
2  NA NA NA 4  2  2  3   3
3  NA NA 2  4  2  2  3   2
4  NA NA 7  4  2  2  3   2
5  NA NA NA 4  2  2  3   1

I want to change NAs to zeros, but only from the position given by the pos variable onward, so the resulting df is:

id x1 x2 x3 x4 x5 x6 x7  pos
1  1  2  1  5  0  0  0   1
2  NA NA 0  4  2  2  3   3
3  NA 0  2  4  2  2  3   2
4  NA 0  7  4  2  2  3   2
5  0  0  0  4  2  2  3   1

So the position marks the starting position in the list of variables from which NA should be changed to zero.

Thx!

Upvotes: 1

Views: 73

Answers (1)

Maurits Evers

Reputation: 50668

Here is a base R option using mapply and replace

# x is one row of the x1..x7 block, y is that row's pos value
df[, -c(1, ncol(df))] <- t(mapply(
    function(x, y) replace(x, is.na(x) & seq_along(x) >= y, 0),
    as.data.frame(t(df[, -c(1, ncol(df))])),
    unlist(df[ncol(df)])))
df
#  id x1 x2 x3 x4 x5 x6 x7 pos
#1  1  1  2  1  5  0  0  0   1
#2  2 NA NA  0  4  2  2  3   3
#3  3 NA  0  2  4  2  2  3   2
#4  4 NA  0  7  4  2  2  3   2
#5  5  0  0  0  4  2  2  3   1

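The various t() calls are necessary because mapply iterates over the columns of a data frame, whereas we want to process df row by row. Transposing first turns each row into a column, and the outer t() restores the original orientation.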
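
To see what the function passed to mapply does, here is a minimal sketch applied to a single row. The values are row 3 of the sample data; the names x, pos and res are only illustrative.

```r
# Row 3 of the sample data (x1..x7) and its pos value
x <- c(NA, NA, 2, 4, 2, 2, 3)
pos <- 2

# Replace only those NAs whose index is at or after pos
res <- replace(x, is.na(x) & seq_along(x) >= pos, 0)
res
# [1] NA  0  2  4  2  2  3
```

The NA in the first slot is left alone because its index (1) is below pos (2).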


Update

Here is a shorter and faster version avoiding the mapply call and using direct indexing

# df2 is the x1...x7 block of df
df2 <- df[, -c(1, ncol(df))]
df2[is.na(df2) & t(apply(df2, 1, seq_along)) >= df[, ncol(df)]] <- 0

df[, -c(1, ncol(df))] <- df2
df
#  id x1 x2 x3 x4 x5 x6 x7 pos
#1  1  1  2  1  5  0  0  0   1
#2  2 NA NA  0  4  2  2  3   3
#3  3 NA  0  2  4  2  2  3   2
#4  4 NA  0  7  4  2  2  3   2
#5  5  0  0  0  4  2  2  3   1
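
As a variation on the direct-indexing idea, the matrix of column indices can probably also be obtained with base R's col() instead of t(apply(..., 1, seq_along)). This is a minimal sketch, assuming the value columns are named x1 to x7 and the position column is called pos, as in the sample data; x_cols, na_mask and fill are illustrative names only.

```r
# Same sample data as used in this answer
df <- read.table(text =
"id x1 x2 x3 x4 x5 x6 x7  pos
1  1  2  1  5  NA NA NA  1
2  NA NA NA 4  2  2  3   3
3  NA NA 2  4  2  2  3   2
4  NA NA 7  4  2  2  3   2
5  NA NA NA 4  2  2  3   1", header = TRUE)

x_cols  <- paste0("x", 1:7)     # assumption: value columns are x1..x7
na_mask <- is.na(df[x_cols])    # logical matrix, TRUE where a value is missing

# col(na_mask)[i, j] equals j; df$pos is recycled down the columns,
# so every cell in row i is compared with pos[i]
fill <- na_mask & col(na_mask) >= df$pos

df[x_cols][fill] <- 0
df
```

This relies on the comparison recycling pos along the rows, which works here because the matrix has exactly one row per pos value.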

Sample data

df <- read.table(text =
"id x1 x2 x3 x4 x5 x6 x7  pos
1  1  2  1  5  NA NA NA  1
2  NA NA NA 4  2  2  3   3
3  NA NA 2  4  2  2  3   2
4  NA NA 7  4  2  2  3   2
5  NA NA NA 4  2  2  3   1", header = TRUE)

Upvotes: 1
