Reputation: 1925
I'm trying to understand how I can operate on the rows of a data frame based on a condition. Given a data frame like this:
> d<-data.frame(x=c(0,1,2,3), y=c(1,1,1,0))
> d
x y
1 0 1
2 1 1
3 2 1
4 3 0
how can I add 1 to every value in each row that contains a zero (note that zeros can be found in any column), so that the result looks like this:
x y
1 1 2
2 1 1
3 2 1
4 4 1
The following code seems to do part of the job, but the result contains only the rows where the action was taken, each repeated as many times as the action was taken (2):
> for(i in 1:nrow(d)){
+ d[d[i,]==0,]<-d[i,]+1
+ }
> d
x y
1 1 2
2 4 1
3 1 2
4 4 1
I'm sure there is a simple solution for this, maybe an apply function, but I'm not getting there.
Thanks.
Upvotes: 4
Views: 2267
Reputation: 48191
Some possibilities:
# 1
idx <- which(d == 0, arr.ind = TRUE)[, 1]
d[idx, ] <- d[idx, ] + 1
# 2
t(apply(d, 1, function(x) x + any(x == 0)))
# 3
d + apply(d == 0, 1, max)
The usage of which for vectors, e.g. which(1:3 > 2), is quite common, whereas it is used less for matrices: by specifying arr.ind = TRUE what we get is array indices, i.e. coordinates of every 0:
which(d == 0, arr.ind = TRUE)
row col
[1,] 1 1
[2,] 4 2
Since we are interested only in rows where zeros occur, I take the first column of which(d == 0, arr.ind = TRUE) and add 1 to all the elements in these rows by d[idx, ] <- d[idx, ] + 1.
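As a small check of the first approach, the sketch below rebuilds d from the question and also tries a made-up data frame d2 (not from the question) in which one row holds two zeros. In that case which(..., arr.ind = TRUE) reports the row once per zero, so the sketch wraps the row indices in unique(); that is a defensive assumption on my part rather than something the original one-liner does.

```r
d <- data.frame(x = c(0, 1, 2, 3), y = c(1, 1, 1, 0))

# Row/column coordinates of every zero; column 1 holds the row numbers.
idx <- which(d == 0, arr.ind = TRUE)[, 1]
d[idx, ] <- d[idx, ] + 1

# Hypothetical example: row 1 contains two zeros, so it is reported twice.
d2 <- data.frame(x = c(0, 1), y = c(0, 5))
idx2 <- unique(which(d2 == 0, arr.ind = TRUE)[, 1])  # keep each row once
d2[idx2, ] <- d2[idx2, ] + 1
```

After running this, d should match the desired result from the question, and row 1 of d2 is incremented once even though it held two zeros.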
Regarding the second approach, apply(d, 1, function(x) x) would simply go row by row and return each row without any modifications. By any(x == 0) we check whether there are any zeros in a particular row and get TRUE or FALSE. However, by writing x + any(x == 0) we transform TRUE or FALSE to 1 or 0, respectively, as required. Because apply stores each row's result as a column, the outer t() transposes the result back to the original orientation; note that the result is a matrix rather than a data frame.
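To make the pieces of the second approach visible, here is a short sketch on the data frame from the question. The intermediate names has_zero, raw, res and res_df are mine, and the final as.data.frame() call is an optional extra, assuming a data frame is wanted back.

```r
d <- data.frame(x = c(0, 1, 2, 3), y = c(1, 1, 1, 0))

# One logical per row: does the row contain a zero?
has_zero <- apply(d, 1, function(x) any(x == 0))

# Adding a logical to a number coerces TRUE to 1 and FALSE to 0.
TRUE + 1   # 2
FALSE + 1  # 1

# apply() stores each row's result as a column, giving a 2 x 4 matrix here ...
raw <- apply(d, 1, function(x) x + any(x == 0))

# ... so t() restores the original 4 x 2 orientation.
res <- t(raw)

# Optional: convert the matrix back to a data frame.
res_df <- as.data.frame(res)
```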
Now the third approach. d == 0 is a logical matrix, and we use apply to go over its rows. Then when applying max to a particular row, we again transform TRUE, FALSE to 1, 0 and find a maximal element. This element is 1 if and only if there are any zeros in that row. Hence, apply(d == 0, 1, max) returns a vector of zeros and ones with one entry per row. The final point is that when we write A + b, where A is a matrix and b is a vector, the addition is column-wise: b is recycled down each column of A. In this way, by writing d + apply(d == 0, 1, max) we add apply(d == 0, 1, max) to every column of d, as needed.
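The column-wise recycling that the third approach relies on can be seen on a small matrix. The matrix A and vector b below are made-up illustrations, not part of the question; the last lines apply the same idea to d, with the helper name add being mine.

```r
# Hypothetical 2 x 3 matrix of zeros and a vector with one entry per row.
A <- matrix(0, nrow = 2, ncol = 3)
b <- c(1, 2)

# b is recycled down each column, so row 1 gains 1 and row 2 gains 2.
A + b
#      [,1] [,2] [,3]
# [1,]    1    1    1
# [2,]    2    2    2

d <- data.frame(x = c(0, 1, 2, 3), y = c(1, 1, 1, 0))

# 1 for rows containing a zero, 0 otherwise (max coerces TRUE/FALSE to 1/0).
add <- apply(d == 0, 1, max)

# The vector has one entry per row, so it is added to every column of d.
res <- d + add
```

Unlike the second approach, the result here stays a data frame, since the addition is done on d itself.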
Upvotes: 3