Minh Mai

Reputation: 21

Issue summing columns

I have a very large dataset and I'm trying to get the sums of values. The variables are binary with 0s and 1s.

Somehow, when I run a for loop

for (i in 7:39) {
  agegroup1[53640, i] <- sum(agegroup1[, i])
}

The loop runs, but every column except the first ends up containing just NA. When I print the values I see 0s and 1s, and checking the class returns "integer". But when I add them all up, R does not give me a sum.

Any advice?

Upvotes: 0

Views: 121

Answers (2)

Jeffrey Evans

Reputation: 2417

@Gavin Simpson provided a workable solution, but alternatively you could use apply(). This function applies a function over the rows (margin 1) or the columns (margin 2) of a matrix or data frame.

x <- cbind(x1=1, x2=c(1:8), y=runif(8))

# Row sums: add columns 2 and 3 together within each row
apply(x[, 2:3], 1, sum, na.rm = TRUE)

# Column sums: add up each of columns 2 and 3 over all rows
apply(x[, 2:3], 2, sum, na.rm = TRUE)

Upvotes: 1

Gavin Simpson

Reputation: 174948

cs <- colSums(agegroup1[, 7:39])

will give you the vector of column sums without looping (at the R level).

If you have any missing values (NAs) in agegroup1[, 7:39] then you may want to add na.rm = TRUE to the colSums() call (or even your sum() call).
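As a small illustration of what na.rm changes, here is a sketch on a made-up matrix (m and its values are hypothetical, not taken from the question):

```r
# Hypothetical 3 x 2 matrix with one missing value in column x
m <- cbind(x = c(1, 0, NA), y = c(1, 1, 0))

# Without na.rm, a single NA makes the whole column sum NA
colSums(m)

# With na.rm = TRUE, missing values are dropped before summing
colSums(m, na.rm = TRUE)
```

The same argument works for sum() itself, e.g. sum(m[, "x"], na.rm = TRUE).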

You don't say what agegroup1 is or how many rows it has, but to finish what your loop is doing, you then need

agegroup1[53640, 7:39] <- cs

What was in agegroup1[53640, ] before you started adding the column sums? NA? If so, each sum() would include that NA and return NA, which would explain at least part of what you see.
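To illustrate how an NA in the target row can produce this pattern, here is a minimal sketch on a hypothetical three-row data frame (toy, looped, and fixed are made-up names, not from the question). It assumes agegroup1 is a data frame and that row 53640 did not exist before the loop ran, which may or may not match the real data.

```r
# Toy stand-in for agegroup1: 3 rows, 3 binary integer columns (hypothetical)
toy <- data.frame(a = c(1L, 0L, 1L), b = c(1L, 1L, 0L), c = c(0L, 0L, 1L))
n <- nrow(toy)

# Same pattern as the loop in the question: write each sum into row n + 1
looped <- toy
for (i in seq_along(looped)) {
  looped[n + 1, i] <- sum(looped[, i])
}
# The first assignment creates row n + 1 and pads the other columns with NA,
# so every later sum() includes that NA and returns NA.
print(looped)

# Computing all the sums before modifying the data avoids the problem
fixed <- toy
fixed[n + 1, ] <- colSums(toy)
print(fixed)
```

In this sketch only column a gets a real total in looped, while fixed gets a total in every column, because colSums() runs on the data before the extra row exists.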

We do really need more detail though...

Upvotes: 3
