Rafael

Reputation: 3196

Replace NAs with the quotient of two columns

I have a data.frame that looks like:

a   b  c   d
1   2  NA  1
NA  2  2   1 
3   2  NA  1
NA  NA 20  2

And I want to replace the NAs with c / d (and delete c and d) to look like:

a  b
1  2
2  2
3  2
10 10

Some background: d is the number of NAs in that particular row.

I don't know the names of the columns, so I tried a few variations of things like:

df2[, 1:(length(colnames(df2)) - 2)][is.na(df2[, 1:(length(colnames(df2)) - 2)])] = df2$c / df2$d

but got:

Error in `[<-.data.frame`(`*tmp*`, is.na(df2[, 1:(length(colnames(df2)) -  : 
  'value' is the wrong length

Upvotes: 0

Views: 118

Answers (2)

austensen

Reputation: 3017

Here's a way you can do this with dplyr.


library(dplyr)

df <- tibble(
  a = c(1, NA, 3, NA),
  b = c(2, 2, 2, NA),
  c = c(NA, 2, NA, 20L),
  d = c(1, 1, 1, 2)
)

df %>% 
  # funs() is deprecated since dplyr 0.8.0; a formula lambda does the same job
  mutate_at(vars(-c, -d), ~ if_else(is.na(.), c / d, .)) %>% 
  select(-c, -d)

#> # A tibble: 4 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     2
#> 2     2     2
#> 3     3     2
#> 4    10    10

You can specify the variables in the vars() call using any of the helpers documented in ?dplyr::select_helpers: a regex via matches(), a plain vector of names, or, as in this example, every column except c and d.
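In dplyr 1.0.0 and later, mutate_at() is superseded by across(). The following is a minimal sketch of the same idea under that assumption; the name `result` is only for illustration, and the tibble is the example data from above.

```r
library(dplyr)

df <- tibble(
  a = c(1, NA, 3, NA),
  b = c(2, 2, 2, NA),
  c = c(NA, 2, NA, 20),
  d = c(1, 1, 1, 2)
)

result <- df %>%
  # For every column except c and d, swap each NA for that row's c / d
  mutate(across(-c(c, d), ~ if_else(is.na(.x), c / d, .x))) %>%
  # Drop the helper columns once the NAs are filled
  select(-c, -d)

result
```

Because the selection is expressed as "everything except c and d", this should work without knowing the names of the other columns.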

Upvotes: 1

Vitalijs

Reputation: 950

library(data.table)

# Read the example data; "NA" is parsed as missing
data <- fread("a   b  c   d
1   2  NA  1
NA  2  2   1
3   2  NA  1
NA  NA 20  2")

# Every column except c and d gets its NAs filled
names_to_loop <- setdiff(names(data), c("c", "d"))

for (ntl in names_to_loop) {
  # set() updates the column by reference
  set(data, j = ntl,
      value = ifelse(is.na(data[[ntl]]), data[["c"]] / data[["d"]], data[[ntl]]))
}

# Drop the helper columns by reference
data[, c("c", "d") := NULL]
> data
    a  b
1:  1  2
2:  2  2
3:  3  2
4: 10 10
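
For comparison, here is a base R sketch that needs no packages. The error in the question most likely arises because the logical-matrix assignment expects one value per NA cell (three here), while df2$c / df2$d has one value per row (four). Filling column by column sidesteps that mismatch. This assumes, as in the question, that the two helper columns are the last two; `target`, `fill`, and `result` are illustrative names.

```r
df2 <- data.frame(
  a = c(1, NA, 3, NA),
  b = c(2, 2, 2, NA),
  c = c(NA, 2, NA, 20),
  d = c(1, 1, 1, 2)
)

n <- ncol(df2)
target <- seq_len(n - 2)          # every column except the last two
fill <- df2[[n - 1]] / df2[[n]]   # one fill value per row

# ifelse() is vectorised over rows, so each NA picks up its own row's value
df2[target] <- lapply(df2[target], function(col) ifelse(is.na(col), fill, col))

# Keep only the filled columns
result <- df2[target]
result
```

Selecting columns by position rather than by name means the other column names never have to be known.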

Upvotes: 0
