Dave

Reputation: 35

Using an if statement in apply in R for every value in a data frame

I have a data frame that I created with the read_excel function and then duplicated. I'll explain the goal as if I were using Excel, because it is so easy to do there. I want to check whether each cell in columns 3 to 11 is zero and, if so, put a zero in the corresponding cell in columns 12 to 20. If not, the cell in columns 12 to 20 should keep its original value.

Data2 <- Data1

Data2[,12:20] <- apply(Data1[,3:11],1:2,function(x) {if(x==0) {0}})

This is the error message I get:

Warning message:
In `[<-.data.frame`(`*tmp*`, , 12:20, value = list(0, 0, 0, 0, 0,  :
  provided 450 variables to replace 9 variables

Example:

Data1 <- matrix(data=c(0,1,1,0,3,4,5,6,2,3,0,5,6,5,6,2,6,2,3,4,5,6,5,6),nrow=6,ncol=4)
Data2 <- Data1
Data2[,3:4] <- apply(Data1[,1:2],1:2,function(x) if(x==0) {0})
Data2 <- matrix(Data2,nrow=6,ncol=4)

The result should look like this:

     [,1] [,2] [,3] [,4]
[1,]    0    5    0    3
[2,]    1    6    5    4
[3,]    1    2    6    5
[4,]    0    3    0    6
[5,]    3    0    6    0
[6,]    4    5    2    6

where any zero in columns 1 and 2 becomes a zero in the corresponding position in columns 3 and 4.

Instead, I get this:

     [,1] [,2] [,3] [,4]
[1,] 0    5    0    NULL
[2,] 1    6    NULL NULL
[3,] 1    2    NULL NULL
[4,] 0    3    0    NULL
[5,] 3    0    NULL 0   
[6,] 4    5    NULL NULL

Also, I'm still getting the same warning shown at the beginning when I run this on the original data, which has 50+ rows and 20 columns.

Upvotes: 2

Views: 64

Answers (3)

Ben

Reputation: 30474

Here is an alternative solution:

First, create a logical matrix representing which elements are 0 in the columns of interest.

mat <- Data1[,1:2] == 0
mat

      [,1]  [,2]
[1,]  TRUE FALSE
[2,] FALSE FALSE
[3,] FALSE FALSE
[4,]  TRUE FALSE
[5,] FALSE  TRUE
[6,] FALSE FALSE

Then, select the elements for the target columns where the logical matrix has a TRUE value and set those to 0:

Data2[,3:4][mat==TRUE] <- 0
Data2

     [,1] [,2] [,3] [,4]
[1,]    0    5    0    3
[2,]    1    6    5    4
[3,]    1    2    6    5
[4,]    0    3    0    6
[5,]    3    0    6    0
[6,]    4    5    2    6
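
The same idea should carry over to the 20-column data frame in the question. This is only a sketch: the real data came from read_excel and is not shown, so df below is a made-up stand-in, and src, dst, is_zero, target and result are names chosen for illustration.

```r
# Hypothetical stand-in for the asker's data: 5 rows, 20 numeric columns.
df <- as.data.frame(matrix(rep(0:3, length.out = 100), nrow = 5))

src <- 3:11   # columns to test for zeros
dst <- 12:20  # columns to overwrite

# Logical matrix marking the zeros in the source columns.
is_zero <- as.matrix(df[, src]) == 0

# Work on a matrix copy of the target columns, then write it back.
target <- as.matrix(df[, dst])
target[is_zero] <- 0

result <- df
result[, dst] <- target
```

Converting to a matrix first assumes that all of the columns involved are numeric. With mixed column types, as.matrix would coerce everything to character and the comparison with 0 would no longer mean the same thing.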

Upvotes: 1

Justin Landis

Reputation: 2071

In R you generally want to work with whole vectors, and ifelse is a good way to apply an if statement to a vector. Explicit for loops tend to be slow in R, and calling a function on every single element of a matrix is rarely needed. The apply functions are also sensitive to what the function returns: an if without an else returns NULL whenever the condition is FALSE, which is why your result came back as a list containing NULLs. Since you are trying to fill columns of a data frame, it is easier to work column by column (which is how data frames in R are usually handled) and to pair each source column with its target column, so that the original target value is kept wherever the source is not zero.

Data2[,12:20] <- mapply(function(src, dst) ifelse(src == 0, 0, dst), Data1[,3:11], Data1[,12:20])
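
To see where the NULL entries in the question come from, and how ifelse avoids them, here is a minimal sketch on the example matrix from the question. The names bad and good are only for illustration, and pairing columns 1:2 with columns 3:4 is my reading of what the question asks for.

```r
Data1 <- matrix(c(0,1,1,0,3,4,5,6,2,3,0,5,6,5,6,2,6,2,3,4,5,6,5,6),
                nrow = 6, ncol = 4)

# `if` without `else` returns NULL when the condition is FALSE, so the
# per-element results have different lengths and apply() returns a list
# with dimensions instead of a numeric matrix.
bad <- apply(Data1[, 1:2], 1:2, function(x) if (x == 0) 0)
is.list(bad)   # TRUE

# ifelse() returns one value per element: 0 where the source is zero,
# otherwise the existing value from the target columns.
good <- ifelse(Data1[, 1:2] == 0, 0, Data1[, 3:4])
is.list(good)  # FALSE

Data2 <- Data1
Data2[, 3:4] <- good
```

Because ifelse takes the target columns as its third argument, no apply call is needed at all when the data is a numeric matrix.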

Upvotes: 0

ulfelder

Reputation: 5335

There's probably a more elegant solution, but this works:

for (j in seq(nrow(Data1))) {
  for (i in seq(2)) {
    if (Data1[j,i] == 0) {
      Data1[j,i + 2] <- 0
    }
  }
}

Result:

> Data1
     [,1] [,2] [,3] [,4]
[1,]    0    5    0    3
[2,]    1    6    5    4
[3,]    1    2    6    5
[4,]    0    3    0    6
[5,]    3    0    6    0
[6,]    4    5    2    6

Obviously, you'll want to tweak the 2 in seq(2) and in Data1[j,i + 2] <- 0 to match the number of columns you're iterating over and the distance to the target columns.
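
One way to make that tweak explicit is to wrap the loop in a function that takes the source columns and the offset as arguments. This is just a sketch: zero_where_source_zero, src_cols and offset are hypothetical names, and it assumes there are no missing values in the source columns, since if() stops with an error on NA.

```r
# Hypothetical helper: set x[j, i + offset] to 0 wherever x[j, i] is 0.
zero_where_source_zero <- function(x, src_cols, offset) {
  for (j in seq_len(nrow(x))) {
    for (i in src_cols) {
      if (x[j, i] == 0) {
        x[j, i + offset] <- 0
      }
    }
  }
  x  # return the modified copy; the input is left untouched
}

Data1 <- matrix(c(0,1,1,0,3,4,5,6,2,3,0,5,6,5,6,2,6,2,3,4,5,6,5,6),
                nrow = 6, ncol = 4)
out <- zero_where_source_zero(Data1, src_cols = 1:2, offset = 2)
```

For the 20-column data in the question, the call would presumably use src_cols = 3:11 and offset = 9.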

Upvotes: 1
