Reputation: 1667
I am running a for loop to dynamically fill a data frame (I know a baby seal dies somewhere because I am using a for loop).
I have something like this in mind (the 5 is a placeholder for a function that returns a scalar):
results <- data.frame(matrix(NA, nrow = length(seq(1:10)), ncol = length(seq(1:10))))
rows <- data.frame(matrix(NA, nrow = 1, ncol = 1))
for (j in seq(1:10)) {
  rows <- data.frame()
  for (i in seq(1:10)) {
    rows <- cbind(rows, 5)
  }
  results <- cbind(results, rows)
}
I get the following error message with my approach above.
Error in match.names(clabs, names(xi)) :
names do not match previous names
Is there an easier way?
Upvotes: 3
Views: 7744
Reputation: 20095
I'm not sure what your intention is. Keeping your intent and your way of implementing it, one way to fix the problem is to change the for loops so that rows is initialized with the first value. The inner for loop should then cover only the remaining values, 2:10.
The error occurs because you are attempting to cbind an empty data.frame with a valid value.
for (j in 1:10) {
  rows <- data.frame(5) # Initialize with the 1st value
  for (i in 2:10) { # Remaining values; note seq(2:10) would give 1:9
    rows <- cbind(rows, 5)
  }
  results <- cbind(results, rows)
}
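If the goal is one row of values per outer iteration, a variation on the same idea is to collect each row as a plain numeric vector and bind everything once at the end. This is only a sketch: compute_value is a hypothetical stand-in for the function that returns a scalar, and the 10 x 10 size is taken from the question.

```r
# Hypothetical placeholder for the function that returns a scalar
compute_value <- function(i, j) 5

n <- 10
row_list <- vector("list", n)  # one slot per row, allocated up front

for (j in seq_len(n)) {
  row_values <- numeric(n)     # preallocated row
  for (i in seq_len(n)) {
    row_values[i] <- compute_value(i, j)
  }
  row_list[[j]] <- row_values
}

# Bind all rows in a single call instead of once per iteration
results <- as.data.frame(do.call(rbind, row_list))
dim(results)
```

Because rbind is called only once, the data frame is not rebuilt on every pass through the loop.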
Upvotes: 1
Reputation: 10543
Dynamically filling an object using a for loop is fine. What causes problems is dynamically building an object in a for loop (e.g. by using cbind and rbind to append columns or rows).
When you build something dynamically, R has to request new memory for the object on each iteration and copy the existing contents over, because the object keeps increasing in size. This causes a for loop to slow down with every iteration as the object gets bigger.
When you create the object beforehand (e.g. a data.frame with the right number of rows and columns), and fill it in by index, the for loop doesn't have this problem.
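The difference between the two styles can be sketched on a plain vector. The function compute_value and the size n are assumptions made for illustration; wrapping either loop in system.time() should show how the timings compare on your machine.

```r
# Hypothetical stand-in for the function that returns a scalar
compute_value <- function(i) i^2

n <- 1000

# Building: the vector gets longer on every iteration
grown <- c()
for (i in seq_len(n)) {
  grown <- c(grown, compute_value(i))
}

# Filling: the vector is allocated once, then written by index
filled <- numeric(n)
for (i in seq_len(n)) {
  filled[i] <- compute_value(i)
}

# Both approaches produce the same values
identical(grown, filled)
```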
One final thing to keep in mind is that for data.frames each column is stored as a separate vector in memory (and matrices are laid out column by column), so it's usually more efficient to fill these in one column at a time.
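As a rough sketch of filling one column at a time, each column can be computed as a whole vector and assigned in a single step. Here compute_value is a hypothetical function of the row and column index, not something from the question.

```r
# Hypothetical scalar function of the row and column index
compute_value <- function(rowIdx, colIdx) rowIdx * 10 + colIdx

n_rows <- 10
n_cols <- 10
results <- data.frame(matrix(NA_real_, nrow = n_rows, ncol = n_cols))

for (colIdx in seq_len(n_cols)) {
  # Compute the whole column as one numeric vector, then assign it once
  results[[colIdx]] <- vapply(
    seq_len(n_rows),
    function(rowIdx) compute_value(rowIdx, colIdx),
    numeric(1)
  )
}
```

This replaces one assignment per cell with one assignment per column.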
With all that in mind we can revise your code as follows:
results <- data.frame(matrix(NA, nrow = 10, ncol = 10))
# Loop over columns on the outside so each column is filled in turn
for (colIdx in seq_len(ncol(results))) {
  for (rowIdx in seq_len(nrow(results))) {
    results[rowIdx, colIdx] <- 5 # or whatever value you want here
  }
}
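If the value in each cell depends only on its row and column index, the loops can possibly be dropped altogether. The sketch below assumes a hypothetical compute_value(rowIdx, colIdx); Vectorize() is used because outer() expects a function that accepts whole vectors.

```r
# Hypothetical scalar function of the row and column index
compute_value <- function(rowIdx, colIdx) rowIdx * 10 + colIdx

# outer() evaluates the function for every row/column combination
values <- outer(1:10, 1:10, Vectorize(compute_value))
results <- as.data.frame(values)

# For a constant placeholder such as 5, a filled matrix is enough
constant <- as.data.frame(matrix(5, nrow = 10, ncol = 10))
```

Whether this is easier than the nested loop depends on how the real scalar function is written, so treat it as one option rather than a drop-in replacement.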
Upvotes: 13