Reputation: 1706
I'm getting a "subscript out of bounds" error. What I want is, for each observation, the first and last month of a run of three consecutive "1" (or "True") values. I want to create two new columns, "begin" and "end", holding respectively the first month and the last month of that run. In my example, for the first observation begin is avril and end is juin; for the 5th observation begin is fevrier and end is avril; for the 9th observation begin is janvier and end is mars; and so on.
I tried this:
nom <- letters[1:5]
pseudo <- paste("name", 21:25, sep = "")
janvier <- c(0, 1, 1, 1, 0)
fevrier <- c(1, 1, 1, 1, 1)
mars <- c(0, 0, 0, 1, 1)
avril <- c(1, 1, 1, 0, 1)
mai <- c(1, 0, 1, 1, 1)
juin <- c(1, 1, 0, 1, 0)
df <- data.frame(nom = nom, pseudo = pseudo, janvier = janvier,
                 fevrier = fevrier, mars = mars, avril = avril,
                 mai = mai, juin = juin)
dfm <- as.matrix(df[, -c(1, 2)])
my_matrix <- matrix(nrow = 10, ncol = 6)
for(i in 1:dim(dfm)[1]){
  for(j in 1:dim(dfm)[2]){
    if(dfm[i, j] + dfm[i, j+1] + dfm[i, j+2] == 3){
      my_matrix[i, j] <- "periode_ok"
      my_matrix[i, j+1] <- "periode_ok"
      my_matrix[i, j+2] <- "periode_ok"
    }
  }
}
The output should be this:
begin <- c("avril", "no info", "no info",
           "janvier", "fevrier", "avril", "no info",
           "no info", "janvier", "fevrier")
end <- c("juin", "no info", "no info", "mars",
         "avril", "juin", "no info", "no info",
         "mars", "avril")
output <- data.frame(nom = nom, pseudo = pseudo, janvier = janvier,
                     fevrier = fevrier, mars = mars, avril = avril,
                     mai = mai, juin = juin, begin = begin, end = end)
Any help will be appreciated.
Upvotes: 3
Views: 331
Reputation: 29109
Of course there's a vectorized solution for this, but if you want to correct your for loop, you need to limit j to the number of columns of dfm minus 2, since you are checking two columns ahead. Based on what you provided, this would help; however, it is not clear how you get 10 rows (repeated twice) from the 5 rows of df.
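# One row per observation; "no info" stays where no run of three 1s is found
my_matrix <- matrix("no info", nrow = 5, ncol = 2)
colnames(my_matrix) <- c("begin", "end")
for(i in 1:dim(dfm)[1]){
  # stop two columns early because the test looks at j + 1 and j + 2
  for(j in 1:(dim(dfm)[2]-2)){
    if(dfm[i, j] + dfm[i, j+1] + dfm[i, j+2] == 3){
      my_matrix[i, 1] <- colnames(dfm)[j]
      my_matrix[i, 2] <- colnames(dfm)[j+2]
      break  # keep only the first run in each row
    }
  }
}
output <- cbind(df, my_matrix)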
The result would be:
output
#   nom pseudo janvier fevrier mars avril mai juin   begin     end
# 1   a name21       0       1    0     1   1    1   avril    juin
# 2   b name22       1       1    0     1   0    1 no info no info
# 3   c name23       1       1    0     1   1    0 no info no info
# 4   d name24       1       1    1     0   1    1 janvier    mars
# 5   e name25       0       1    1     1   1    0 fevrier   avril
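The vectorized route mentioned at the top of this answer could look roughly like the sketch below. It is not taken from the original post: the names starts, hit and first_start are made up for illustration, the data are rebuilt from the question with the literal "name" as the pseudo prefix (to match the output shown), and it assumes there are at least three month columns.

```r
# Rebuild the example data from the question.
df <- data.frame(
  nom     = letters[1:5],
  pseudo  = paste0("name", 21:25),
  janvier = c(0, 1, 1, 1, 0),
  fevrier = c(1, 1, 1, 1, 1),
  mars    = c(0, 0, 0, 1, 1),
  avril   = c(1, 1, 1, 0, 1),
  mai     = c(1, 0, 1, 1, 1),
  juin    = c(1, 1, 0, 1, 0)
)
dfm <- as.matrix(df[, -c(1, 2)])

n <- ncol(dfm)
starts <- seq_len(n - 2)  # assumption: n >= 3

# hit[i, j] is TRUE when columns j, j + 1 and j + 2 of row i are all 1.
hit <- (dfm[, starts, drop = FALSE] +
        dfm[, starts + 1, drop = FALSE] +
        dfm[, starts + 2, drop = FALSE]) == 3

# First qualifying start column per row; NA when the row has no such run.
first_start <- apply(hit, 1, function(r) unname(which(r)[1]))

begin <- ifelse(is.na(first_start), "no info", colnames(dfm)[first_start])
end   <- ifelse(is.na(first_start), "no info", colnames(dfm)[first_start + 2])

output_vec <- cbind(df, begin = begin, end = end)
```

Like the loop with break, this keeps only the first run of three in each row.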
Upvotes: 2
Reputation: 76565
First of all, constructs like 1:dim(dfm)[1] are dangerous: if dim(dfm)[1] is zero, you get the perfectly valid vector 1:0, that is c(1, 0), so the loop body runs for i = 1 and i = 0 instead of not running at all. It then tries to address rows that do not exist, which throws an error. The recommended solution is to use seq_len(...).
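A small illustration of the difference, with n standing in for a zero row count (n, count_colon and count_seq are illustrative names, not from the question):

```r
n <- 0

1:n         # counts down from 1 to 0, so it has two elements
seq_len(n)  # integer(0), so a for loop over it never runs

# Count how many times each loop body executes.
count_colon <- 0
for (i in 1:n) count_colon <- count_colon + 1

count_seq <- 0
for (i in seq_len(n)) count_seq <- count_seq + 1

count_colon  # 2
count_seq    # 0
```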
Second, instead of dim(dfm)[.] I've used nrow and ncol.
Now for the error you got. You are trying to address columns j + 1 and j + 2, so once j reaches ncol(dfm) - 1, column j + 2 is out of bounds. The code below drops the last two values from the loop range.
n <- ncol(dfm)
for(i in seq_len(nrow(dfm))){
  # drop the last two column indices so that j + 2 never exceeds n
  for(j in seq_len(n)[-c(n - 1, n)]){
    if(dfm[i, j] + dfm[i, j+1] + dfm[i, j+2] == 3){
      my_matrix[i, j] <- "periode_ok"
      my_matrix[i, j+1] <- "periode_ok"
      my_matrix[i, j+2] <- "periode_ok"
    }
  }
}
my_matrix
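For what it's worth, whenever there are at least two columns, the index expression used in the loop picks the same values as seq_len(n - 2). The sketch below checks that and shows one possible guard, seq_len(max(n - 2, 0)), for a matrix with no columns at all. The guard is only a suggestion, not part of the code above, and old_range / new_range are hypothetical helper names.

```r
# Loop range as written in the answer.
old_range <- function(n) seq_len(n)[-c(n - 1, n)]

# Possible guarded variant (an assumption, not from the answer).
new_range <- function(n) seq_len(max(n - 2, 0))

identical(old_range(6), new_range(6))  # TRUE: both give 1 2 3 4
identical(old_range(2), new_range(2))  # TRUE: both are empty

old_range(0)  # NA, because -c(-1, 0) turns into the positive index 1
new_range(0)  # integer(0), so the loop would simply not run
```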
Upvotes: 4