Reputation: 147
I'm trying to write a function for three-stage cluster sampling. For now I am working with dummy data so I can understand what goes into the function.
I am working on for loops and have a data frame with grouped values. The data frame, data, looks like this:
  Cluster group value value.K.bar value.M.bar N.bar
1       1     A     1         1.5         2.5     4
2       1     A     2         1.5         2.5     4
3       1     B     3         4.0         2.5     4
4       1     B     4         4.0         2.5     4
5       2     B     5         4.0         6.0     4
6       2     C     6         6.5         6.0     4
7       2     C     7         6.5         6.0     4
and I am trying to run this for loop:
n <- dim(data)[1]
e <- 0
total <- 0
for (i in 1:n) {
  e = data.y$value.M.bar[i] - data$N.bar[i]
  total = total + e^2
}
My question is: is there a way to run the same loop separately for each unique value in the group column, i.e. for groups 'A', 'B', and 'C'?
Any help would be greatly appreciated!
Edit: for correct language
Upvotes: 2
Views: 6242
Reputation: 121568
You can use by, for example, to apply your code to each group. First I wrap your code in a function that takes the data as input.
get.total <- function(data) {
  n <- dim(data)[1]
  e <- 0
  total <- 0
  for (i in 1:n) {
    e <- data$value.M.bar[i] - data$N.bar[i]  ## corrected: data.y -> data
    total <- total + e^2
  }
  total
}
Then, to compute the total separately for each group (A, B, and C), you do this:
by(data,data$group,FUN=get.total)
data$group: A
[1] 4.5
----------------------------------------------------------------------------------------------------
data$group: B
[1] 8.5
----------------------------------------------------------------------------------------------------
data$group: C
[1] 8
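If you want to stay closer to the original loop, one option is to iterate over unique(data$group) and subset the rows for each group. The following is only a sketch: the data frame is typed in from the printout in the question, and the names groups, totals, and sub are made up for illustration.

```r
# Dummy data copied from the question's printout (an assumption about how it was built).
data <- data.frame(
  Cluster     = c(1, 1, 1, 1, 2, 2, 2),
  group       = c("A", "A", "B", "B", "B", "C", "C"),
  value       = 1:7,
  value.K.bar = c(1.5, 1.5, 4.0, 4.0, 4.0, 6.5, 6.5),
  value.M.bar = c(2.5, 2.5, 2.5, 2.5, 6.0, 6.0, 6.0),
  N.bar       = 4
)

# One accumulator per group, named by group label.
groups <- unique(as.character(data$group))
totals <- setNames(numeric(length(groups)), groups)

for (g in groups) {
  sub <- data[data$group == g, ]        # rows belonging to this group only
  for (i in seq_len(nrow(sub))) {       # seq_len() is safe even for zero rows
    e <- sub$value.M.bar[i] - sub$N.bar[i]
    totals[g] <- totals[g] + e^2
  }
}

totals
```

This should give the same three totals as the by call (4.5 for A, 8.5 for B, 8 for C), but as a plain named numeric vector that is easier to reuse in later calculations.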
Better still, here is a vectorized version of your function:
by(data, data$group,
   function(dat) with(dat, sum((value.M.bar - N.bar)^2)))
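As a possible alternative, since the quantity is just a sum of squared differences per group, base R's tapply can compute it without splitting the data frame at all. This is a minimal sketch that rebuilds only the columns it needs from the question's printout; sq.dev and totals are illustrative names.

```r
# Only the columns used in the calculation, copied from the question's printout.
data <- data.frame(
  group       = c("A", "A", "B", "B", "B", "C", "C"),
  value.M.bar = c(2.5, 2.5, 2.5, 2.5, 6.0, 6.0, 6.0),
  N.bar       = 4
)

# Squared deviation for every row, computed in one vectorized step.
sq.dev <- (data$value.M.bar - data$N.bar)^2

# Sum the squared deviations within each group.
totals <- tapply(sq.dev, data$group, sum)
totals
```

The result is a one-dimensional array named by group, so totals["B"] picks out a single group's total.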
Upvotes: 4