Reputation: 21
I have a very large dataset and I'm trying to get the column sums. The variables are binary (0s and 1s).
When I run this for loop:
for (i in 7:39) {
  agegroup1[53640, i] <- sum(agegroup1[, i])
}
The loop runs without error, but every column except the first ends up with NA in that row. I printed the values and saw only 0s and 1s, and I checked the class (it returns "integer"). But when I add them all up, R gives NA instead of a sum.
Any advice?
Upvotes: 0
Views: 121
Reputation: 2417
@Gavin Simpson provided a workable solution, but alternatively you could use apply. This function applies a function over the rows (margin 1) or the columns (margin 2) of a matrix or data frame.
x <- cbind(x1=1, x2=c(1:8), y=runif(8))
# Row sums across columns 2 and 3 (margin 1 = rows)
apply(x[, 2:3], 1, sum, na.rm = TRUE)
# Column sums of columns 2 and 3 (margin 2 = columns)
apply(x[, 2:3], 2, sum, na.rm = TRUE)
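As a quick check, the column-wise apply call should agree with colSums on the same data. This is only a sketch: the set.seed call and the names by_apply and by_colsums are added here for illustration.

```r
# Same toy matrix as above; the seed is only here to make the run repeatable
set.seed(1)
x <- cbind(x1 = 1, x2 = 1:8, y = runif(8))

# Column sums two ways: apply over margin 2, and the dedicated colSums()
by_apply   <- apply(x[, 2:3], 2, sum, na.rm = TRUE)
by_colsums <- colSums(x[, 2:3], na.rm = TRUE)

# Both are named numeric vectors; compare with a floating point tolerance
print(all.equal(by_apply, by_colsums))
```

For a plain sum, colSums is usually the simpler choice; apply is useful when the function is something other than a sum or mean.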
Upvotes: 1
Reputation: 174948
cs <- colSums(agegroup1[, 7:39])
will give you the vector of column sums without looping (at the R level).
If you have any missing values (NAs) in agegroup1[, 7:39] then you may want to add na.rm = TRUE to the colSums() call (or even your sum() call).
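To illustrate the effect of na.rm, here is a small sketch on made-up data (v and m are hypothetical, not taken from the question):

```r
# A binary vector with one missing value
v <- c(1L, 0L, NA, 1L)

s_default <- sum(v)                # a single NA makes the whole sum NA
s_narm    <- sum(v, na.rm = TRUE)  # NAs are dropped before summing

# The same applies column by column with colSums()
m <- cbind(a = v, b = c(0L, 1L, 1L, 1L))
cs_default <- colSums(m)
cs_narm    <- colSums(m, na.rm = TRUE)

print(s_default)
print(s_narm)
print(cs_default)
print(cs_narm)
```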
You don't say what agegroup1 is or how many rows it has etc., but to finalise what your loop is doing, you then need
agegroup1[53640, 7:39] <- cs
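What was in agegroup1[53640, ] before you started adding the column sums? NA? If so, that would explain some of the behaviour: sum() returns NA as soon as any element is NA, and row 53640 is part of every column being summed.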
We do really need more detail though...
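One possible explanation, sketched under an assumption the question does not confirm: agegroup1 is a data frame with fewer than 53640 rows. In that case the first assignment in the loop creates the new row and fills every other column of it with NA, so each later sum() includes an NA. The toy data frame df and the names total_row, looped and fixed are hypothetical.

```r
# Hypothetical stand-in for agegroup1: 5 rows, 3 binary integer columns
df <- data.frame(a = c(1L, 0L, 1L, 1L, 0L),
                 b = c(0L, 0L, 1L, 0L, 1L),
                 c = c(1L, 1L, 1L, 0L, 0L))
total_row <- nrow(df) + 1L  # plays the role of row 53640

# The loop from the question: the first assignment adds a row whose
# other cells are NA, so the later column sums pick up that NA
looped <- df
for (i in seq_along(looped)) {
  looped[total_row, i] <- sum(looped[, i])
}
print(looped[total_row, ])

# Computing all sums before writing anything avoids the problem
fixed <- df
cs <- colSums(fixed)
fixed[total_row, ] <- cs
print(fixed[total_row, ])
```

If the real data behaves the same way, either compute the sums first with colSums() as above, or restrict the sum to the original rows inside the loop.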
Upvotes: 3