Vectorisation of for loop with multiple conditions

dummies  = matrix(c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0), nrow=6, ncol=6) 
colnames(dummies)  <- c("a","b", "c", "d", "e", "f")

I have a matrix with dummies

> dummies
     a b c d e f
[1,] 0 0 0 0 1 0
[2,] 0 0 1 0 0 0
[3,] 1 0 0 0 0 0
[4,] 0 0 0 0 0 1
[5,] 0 1 0 0 0 0
[6,] 0 0 0 1 0 0

I know that my dummies are related in that line 1 is grouped with 2, 3 with 4, and 5 with 6. I want to split each dummy code(1) between those in the same group on the same line as above:

> dummies
        a    b    c    d    e    f
[1,]  0.0  0.0 -0.5  0.0  0.5  0.0
[2,]  0.0  0.0  0.5  0.0 -0.5  0.0
[3,]  0.5  0.0  0.0  0.0  0.0 -0.5
[4,] -0.5  0.0  0.0  0.0  0.0  0.5
[5,]  0.0  0.5  0.0 -0.5  0.0  0.0
[6,]  0.0 -0.5  0.0  0.5  0.0  0.0

To achieve this, I do the following:

dummies <- ifelse(dummies==1, 0.5, 0)
for (i in 1:nrow(dummies)){
    column = which(dummies[i,] %in% 0.5)
    if (i %% 2 != 0) {      
      dummies[i+1, column] <- -0.5
    } else {            
      dummies[i-1, column] <- -0.5
   }
 }

My question is whether I could achieve this with vectorised code. I cannot figure out how to use ifelse in this case because I cannot combine it with the line indexing to find the 0.5 on each line.

Upvotes: 11

Answers (4)

lmo

Reputation: 38500

Here is one attempt in base R

# get locations of ones
ones <- which(dummies == 1)
# get adjacent locations
news <- ones + c(1L, -1L)[(ones %% 2 == 0L) + 1L]

# fill out matrix
dummiesDone <- dummies * 0.5
dummiesDone[news] <- -0.5

dummiesDone
        a    b    c    d    e    f
[1,]  0.0  0.0 -0.5  0.0  0.5  0.0
[2,]  0.0  0.0  0.5  0.0 -0.5  0.0
[3,]  0.5  0.0  0.0  0.0  0.0 -0.5
[4,] -0.5  0.0  0.0  0.0  0.0  0.5
[5,]  0.0  0.5  0.0 -0.5  0.0  0.0
[6,]  0.0 -0.5  0.0  0.5  0.0  0.0

This solution uses the fact that a matrix is simply a vector with a dimension attribute. which finds the location of 1s in the underlying vector.

the second term in the second line, c(1, -1)[(ones %% 2 == 0L) + 1L] allows for the selection of the "pair" element of the vector that will be used to split the ones value, based on whether or not the original position is even or odd. This works here because there are an even number of rows, which is necessary in this problem of paired elements.

The next lines fill in the matrix based on whether the element is originally a one (0.5) or if it is an adjacent, pair element (-0.5). Note that the second command exploits the underlying vector position concept.

A second method that borrows off of the concept of posts and comments from hubertl, thelatemail, and martin-morgan that subtract 0.5 from the original matrix in the correct locations first to get the indices same as above

# get locations of ones
ones <- which(dummies == 1)
# get adjacent locations
news <- ones + c(1L, -1L)[(ones %% 2 == 0L) + 1L]

and then combine [<- with subtraction

dummies[c(ones, news)] <- dummies[c(ones, news)] - .5
dummies
        a    b    c    d    e    f
[1,]  0.0  0.0 -0.5  0.0  0.5  0.0
[2,]  0.0  0.0  0.5  0.0 -0.5  0.0
[3,]  0.5  0.0  0.0  0.0  0.0 -0.5
[4,] -0.5  0.0  0.0  0.0  0.0  0.5
[5,]  0.0  0.5  0.0 -0.5  0.0  0.0
[6,]  0.0 -0.5  0.0  0.5  0.0  0.0

Upvotes: 12

Martin Morgan

Reputation: 46856

Create a vector indicating the row groups, grp, and subtract the group means rowsum(dummies, grp) / 2 from each member of the group, as

grp = rep(seq_len(nrow(dummies) / 2), each=2)
dummies - rowsum(dummies, grp)[grp,] / 2

A little more generally, allowing for different sized and un-ordered groups

dummies - (rowsum(dummies, grp) / tabulate(grp))[grp,]

Upvotes: 6

Pierre L

Reputation: 28441

Here's another approach:

dummies[] <- sapply(split(dummies, gl(length(dummies)/2,2)), function(v) if(any(!!v))v-.5 else v)
        a    b    c    d    e    f
[1,]  0.0  0.0 -0.5  0.0  0.5  0.0
[2,]  0.0  0.0  0.5  0.0 -0.5  0.0
[3,]  0.5  0.0  0.0  0.0  0.0 -0.5
[4,] -0.5  0.0  0.0  0.0  0.0  0.5
[5,]  0.0  0.5  0.0 -0.5  0.0  0.0
[6,]  0.0 -0.5  0.0  0.5  0.0  0.0

Upvotes: 5

HubertL

Reputation: 19544

Another approach :

dummies - ((dummies[c(1,3,5),]+dummies[c(2,4,6),])/2)[c(1,1,2,2,3,3),]

        a    b    c    d    e    f
[1,]  0.0  0.0 -0.5  0.0  0.5  0.0
[2,]  0.0  0.0  0.5  0.0 -0.5  0.0
[3,]  0.5  0.0  0.0  0.0  0.0 -0.5
[4,] -0.5  0.0  0.0  0.0  0.0  0.5
[5,]  0.0  0.5  0.0 -0.5  0.0  0.0
[6,]  0.0 -0.5  0.0  0.5  0.0  0.0

Upvotes: 4

Vectorisation of for loop with multiple conditions

Answers (4)

Related Questions