1darknight

Reputation: 73

User-written function to replace NA with 0 in a for loop not working in R

[data.table] I have written a function like this to replace NA with 0 if a column is numeric:

fn.naremove <- function(data){
  for (i in 1:length(data)){
    if (class(data[[i]]) %in% c("numeric", "integer", "integer64")) {
      print(data[, names(data[, i]) := replace(data[, i], is.na(data[, i]), 0)])
    } else {
      print(data)
    }
  }
}

I have a sample data table like the one below:

dt1<- data.table(C1= c(1, 5, 14, NA, 54), C2= c(9, NA, NA, 3, 42), C3= c(9, 7, 42, 87, NA))

If I call fn.naremove(dt1), it returns the error:

Error in `[.data.table`(data, , i) : 
j (the 2nd argument inside [...]) is a single symbol but column name 'i' is not found. 
Perhaps you intended DT[, ..i]. This difference to data.frame is deliberate and explained in FAQ 1.1.

If I run the code with the actual column index, it runs smoothly and returns the result I wanted for column number 1:

dt1[, names(dt1[, 1]) := replace(dt1[, 1], is.na(dt1[, 1]), 0)]

  C1 C2 C3
1:  1  9  9
2:  5 NA  7
3: 14 NA 42
4:  0  3 87
5: 54 42 NA

Please tell me if I missed something or did something wrong in my function. Thanks in advance!

Upvotes: 3

Views: 75

Answers (2)

jay.sf

Reputation: 72919

You may use replace.

replace(dt1, is.na(dt1), 0)
#    C1 C2 C3
# 1:  1  9  9
# 2:  5  0  7
# 3: 14  0 42
# 4:  0  3 87
# 5: 54 42  0

To stay in the data.table universe, there is also set(), which updates a table by reference and which we can extend to handle only specific classes.

library(data.table)
dt1 <- cbind(dt1, x = c("a", NA, "a", NA, "a"))  ## add a categorical variable

classes <- c("numeric", "integer", "integer64")  ## define specific classes

fun <- function(DT) {
  for (j in names(DT)) {
    set(DT, which(is.na(DT[[j]]) & class(DT[[j]]) %in% classes), j, 0)
  }
}

fun(dt1)
dt1
#    C1 C2 C3    x
# 1:  1  9  9    a
# 2:  5  0  7 <NA>
# 3: 14  0 42    a
# 4:  0  3 87 <NA>
# 5: 54 42  0    a

Only NAs in columns of the defined classes are replaced. This should be efficient, since set() modifies the table by reference and no copies are made.
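
As a different technique from the loop above, data.table also ships setnafill(), which fills NAs by reference. The following is only a minimal sketch, assuming a data.table version that includes setnafill(); the table dt and the helper variable num_cols are made up for illustration, and integer64 columns are not considered here.

```r
library(data.table)

# Illustrative table: two double columns and one character column
dt <- data.table(C1 = c(1, NA, 3), C2 = c(NA, 2, NA), x = c("a", NA, "c"))

# Select the numeric columns; setnafill() is restricted to these via cols
num_cols <- names(dt)[vapply(dt, is.numeric, logical(1))]

# Replace NA with the constant 0 in those columns, by reference
setnafill(dt, type = "const", fill = 0, cols = num_cols)

dt
```

The character column x is left untouched because it is not passed in cols.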

Upvotes: 3

Ronak Shah

Reputation: 388982

Note that names(dt1[, 1]) works but when you do -

i <- 1
names(dt1[, i])

It doesn't work and returns an error

Error in `[.data.table`(dt1, , i) : 
j (the 2nd argument inside [...]) is a single symbol but column name 'i' is not found. 
Perhaps you intended DT[, ..i]. This difference to data.frame is deliberate and explained in FAQ 1.1.

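The solution is to use ..i, i.e. names(dt1[, ..i]). The .. prefix tells data.table to look up i in the calling scope instead of treating it as a column name.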
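
A minimal sketch of these lookup rules, assuming data.table is installed; the table dt and the index i are illustrative only:

```r
library(data.table)

dt <- data.table(C1 = c(1, NA), C2 = c(NA, 4))
i <- 1

# ..i looks up i in the calling scope, so this selects the first column
names(dt[, ..i])

# with = FALSE is another way to have j treated as column positions or names
names(dt[, i, with = FALSE])

# [[i]] is plain list-style extraction and returns the column as a vector
dt[[i]]
```

Inside a loop, dt[[i]] is often the simplest choice when only the column values are needed.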


Another option is -

fn.naremove <- function(data){
  for (i in 1:length(data)){
    if (class(data[[i]]) %in% c("numeric", "integer", "integer64")) {
      print(data[, names(data)[i] := replace(data[[i]], is.na(data[[i]]), 0)])
    } else {
      print(data)
    }
  }
}
fn.naremove(dt1)

Upvotes: 2
