Mostafa90

Reputation: 1706

Condition and row and column with a double loop

I get a "subscript out of bounds" error. For each observation, I want the first and last month of a run of three consecutive 1 (or TRUE) values, stored in two new columns, "begin" and "end". In my example, for the first observation begin is avril and end is juin; for the fifth observation begin is fevrier and end is avril; for the ninth observation begin is janvier and end is mars; and so on.

I tried to do this :

nom <- letters[1:5]
pseudo <- paste("name", 21:25, sep = "")
janvier <- c(0, 1, 1, 1, 0)
fevrier <- c(1, 1, 1, 1, 1)
mars <- c(0, 0, 0, 1, 1)
avril <- c(1, 1, 1, 0, 1)
mai <- c(1, 0, 1, 1, 1)
juin <- c(1, 1, 0, 1, 0)

df <- data.frame(nom =nom, pseudo = pseudo, janvier = janvier,
                 fevrier = fevrier, mars = mars, avril = avril,
                 mai = mai, juin = juin)

dfm <- as.matrix(df[, -c(1, 2)])

my_matrix <- matrix(nrow = 10, ncol = 6)


for(i in 1:dim(dfm)[1]){
  for(j in 1:dim(dfm)[2]){
    if(dfm[i, j] + dfm[i, j+1] + dfm[i, j+2] == 3){
      my_matrix[i, j] <- "periode_ok"
      my_matrix[i, j+1] <- "periode_ok"
      my_matrix[i, j+2] <- "periode_ok"
    } 
  }
}

The output should be this:

begin <- c("avril", "no info", "no info",
           "janvier", "fevrier", "avril", "no info",
           "no info", "janvier", "fevrier")
end <- c("juin", "no info", "no info", "mars",
         "avril", "juin", "no info", "no info",
         "mars", "avril")

output <- data.frame(nom =nom, pseudo = pseudo, janvier = janvier,
                 fevrier = fevrier, mars = mars, avril = avril,
                 mai = mai, juin = juin, begin = begin,end = end)

Any help will be appreciated.

Upvotes: 3

Views: 331

Answers (2)

M--

Reputation: 29109

There is of course a vectorized solution for this, but if you want to correct your for loop, you need to limit j to the number of columns of dfm minus 2, because you are checking two columns ahead. Based on what you provided, the code below would help; however, it is not clear how you get 10 rows (each row repeated twice) from the 5 rows of df.

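# One row per observation; "no info" stays wherever no run of three is found
my_matrix <- matrix("no info", nrow = 5, ncol = 2)
colnames(my_matrix) <- c("begin", "end")

for(i in 1:dim(dfm)[1]){
  # Stop two columns early so that j + 2 never exceeds the last column
  for(j in 1:(dim(dfm)[2]-2)){
    if(dfm[i, j] + dfm[i, j+1] + dfm[i, j+2] == 3){
      my_matrix[i, 1] <- colnames(dfm)[j]
      my_matrix[i, 2] <- colnames(dfm)[j+2]
      # Keep only the first run found in this row
      break
    }
  }
}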


output <- cbind(df, my_matrix)

Then result would be:

output

#   nom pseudo janvier fevrier mars avril mai juin   begin     end 
# 1   a name21       0       1    0     1   1    1   avril    juin 
# 2   b name22       1       1    0     1   0    1 no info no info 
# 3   c name23       1       1    0     1   1    0 no info no info 
# 4   d name24       1       1    1     0   1    1 janvier    mars 
# 5   e name25       0       1    1     1   1    0 fevrier   avril
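
For reference, the vectorized route mentioned at the top of this answer could look roughly like the sketch below. It is only one possible approach, not necessarily the one the answer had in mind: it assumes at least three month columns, rebuilds the question's data with the literal prefix "name", and the helper names hits and first_start are made up for illustration.

```r
# Sample data copied from the question (pseudo built from the literal "name")
df <- data.frame(
  nom     = letters[1:5],
  pseudo  = paste0("name", 21:25),
  janvier = c(0, 1, 1, 1, 0),
  fevrier = c(1, 1, 1, 1, 1),
  mars    = c(0, 0, 0, 1, 1),
  avril   = c(1, 1, 1, 0, 1),
  mai     = c(1, 0, 1, 1, 1),
  juin    = c(1, 1, 0, 1, 0)
)
dfm <- as.matrix(df[, -c(1, 2)])
n <- ncol(dfm)  # assumed to be at least 3

# hits[i, j] is TRUE when columns j, j + 1 and j + 2 of row i are all 1
hits <- dfm[, 1:(n - 2), drop = FALSE] +
  dfm[, 2:(n - 1), drop = FALSE] +
  dfm[, 3:n, drop = FALSE] == 3

# Index of the first window per row, or NA when the row has none
first_start <- unname(apply(hits, 1, function(h) {
  if (any(h)) which(h)[1] else NA_integer_
}))

# Translate window positions into month names
begin <- ifelse(is.na(first_start), "no info", colnames(dfm)[first_start])
end   <- ifelse(is.na(first_start), "no info", colnames(dfm)[first_start + 2])

output <- cbind(df, begin = begin, end = end)
```

Like the loop with break, this keeps only the first run of three in each row.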

Upvotes: 2

Rui Barradas

Reputation: 76565

First of all, constructs like 1:dim(dfm)[1] are dangerous: if dim(dfm)[1] is zero, you get the perfectly valid vector 1:0, that is c(1, 0), and the loop runs twice, trying to address rows 1 and 0 of a matrix that has no rows. That throws an error. The recommended solution is to use seq_len(...). Second, instead of dim(dfm)[.] I've used nrow and ncol.

Now for the error you've got. You are trying to address columns j + 1 and j + 2, so as soon as j reaches ncol(dfm) - 1 you are out of bounds. The code below removes the last two elements of the loop limit.

n <- ncol(dfm)
for(i in seq_len(nrow(dfm))){
  for(j in seq_len(n)[-c(n - 1, n)]){
    if(dfm[i, j] + dfm[i, j+1] + dfm[i, j+2] == 3){
      my_matrix[i, j] <- "periode_ok"
      my_matrix[i, j+1] <- "periode_ok"
      my_matrix[i, j+2] <- "periode_ok"
    } 
  }
}

my_matrix
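
To make the 1:0 pitfall concrete, here is a small sketch that can be run on its own. The zero-row matrix empty and the other variable names are made up for the illustration and are not part of the question's data.

```r
# A matrix with six columns but no rows (hypothetical, for illustration only)
empty <- matrix(numeric(0), nrow = 0, ncol = 6)

# 1:nrow(empty) counts down from 1 to 0, so a loop over it would run twice
bad_index <- 1:nrow(empty)

# seq_len(0) has length zero, so a loop over it never runs
good_index <- seq_len(nrow(empty))

visited <- 0
for (i in good_index) {
  visited <- visited + 1
}

# Reading row 1 of a zero-row matrix raises an error; capture its message
err <- tryCatch(empty[1, 1], error = function(e) conditionMessage(e))

# The inner loop limit used above: all column indices except the last two
n <- 6
window_starts <- seq_len(n)[-c(n - 1, n)]
```

Here bad_index holds the two values 1 and 0, good_index is empty, and window_starts holds 1 to 4, so j + 2 never goes past column 6.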

Upvotes: 4
