Molia

Reputation: 311

Replace NA values using if statement based on group by

I am looking to do the following in a more elegant manner in R. I believe there is a way, but I just can't wrap my head around it. The problem is as follows.

I have a data frame that contains NAs. Within each group, I want to replace the NAs with zeros if the group has at least one non-NA value; if every value in the group is NA, the NAs should stay as NA. The example below should make it clear.

A<-c("A", "A", "A", "A", 
     "B","B","B","B",
     "C","C","C","C")
B<-c(1,NA,NA,1,NA,NA,NA,NA,2,1,2,3)
data<-data.frame(A,B)

This is what the data looks like:

   A  B
1  A  1
2  A NA
3  A NA
4  A  1
5  B NA
6  B NA
7  B NA
8  B NA
9  C  2
10 C  1
11 C  2
12 C  3

And I am looking to get the following result:

   A  B
1  A  1
2  A  0
3  A  0
4  A  1
5  B NA
6  B NA
7  B NA
8  B NA
9  C  2
10 C  1
11 C  2
12 C  3

I know I can do this with a join, by first creating a summary table and then writing an if statement based on that table, but I was wondering if there is a way to do it in one or two lines of code in R.

The following is the join-based solution I was referring to:

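library(dplyr)

# Group sum that stays NA when every value in the group is NA
sum_NA <- function(x) if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)

# One row per group: x is the group sum, Y flags all-NA groups
data2 <- data %>%
  group_by(A) %>%
  summarize(x = sum_NA(B), Y = is.na(x))
data2

data2_1 <- right_join(data, data2, by = "A")

# Zero out NAs only in groups that have at least one non-NA value
data <- mutate(data2_1, B = ifelse(!Y & is.na(B), 0, B))
data <- select(data, -Y, -x)
data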

Upvotes: 1

Views: 353

Answers (3)

Stephan

Reputation: 2246

Or with dplyr:

library(dplyr)
data %>%
  mutate(B=ifelse(is.na(B) & A %in% unique(na.omit(data)$A), 0, B))

Upvotes: 2

93i7hdjb

Reputation: 1196

Or similarly, with ifelse():

data$B <- ifelse(is.na(data$B) & data$A %in% unique(na.omit(data)$A), 0, data$B)
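A grouped variant of the same idea is sketched below, in case it is useful. It uses base R's ave() to apply a function to B within each level of A. The helper name fill_group is hypothetical, and the data frame is rebuilt from the question's example so the snippet runs on its own.

```r
# Example data from the question
data <- data.frame(
  A = rep(c("A", "B", "C"), each = 4),
  B = c(1, NA, NA, 1, NA, NA, NA, NA, 2, 1, 2, 3)
)

# Hypothetical helper: leave an all-NA group untouched,
# otherwise replace the NAs in that group with 0
fill_group <- function(x) if (all(is.na(x))) x else replace(x, is.na(x), 0)

# ave() applies fill_group to B within each level of A and returns
# a vector aligned with the original row order
data$B <- ave(data$B, data$A, FUN = fill_group)
data
```

Because the rule is written per group, it does not need a second lookup into the data frame to find which groups have non-NA values.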

Upvotes: 2

pogibas

Reputation: 28379

Maybe a solution like this would work:

data[is.na(data$B) & data$A %in% unique(na.omit(data)$A), "B"] <- 0

Here you're selecting the rows where:

  • B is NA
  • A is one of the groups that have at least one non-NA value

Then you set B to 0 in those rows.
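The two conditions can also be spelled out one step at a time. The sketch below assumes the question's example data; the intermediate names (is_missing, groups_with_values, in_kept_group) are made up for illustration.

```r
# Example data from the question
data <- data.frame(
  A = rep(c("A", "B", "C"), each = 4),
  B = c(1, NA, NA, 1, NA, NA, NA, NA, 2, 1, 2, 3),
  stringsAsFactors = FALSE
)

# Condition 1: rows where B is NA
is_missing <- is.na(data$B)

# na.omit() drops every row containing an NA, so the groups that survive
# are exactly those with at least one non-NA value of B
groups_with_values <- unique(na.omit(data)$A)

# Condition 2: rows whose group is one of those
in_kept_group <- data$A %in% groups_with_values

# Set B to 0 only where both conditions hold
data$B[is_missing & in_kept_group] <- 0
data
```

Group B never appears in groups_with_values because all of its rows are dropped by na.omit(), so its NAs are left alone.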

Upvotes: 4
