Reputation: 1823
I need to write a function that calculates means according to a specific rule, without using the apply or aggregate functions. I have three variables, and within a single function I would like to calculate the mean of var3, first for each change in var2 and then for each change in var1. Is this possible? My code is:
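var1 <- sort(rep(LETTERS[1:3], 10))
var2 <- rep(1:5, 6)
var3 <- rnorm(30)
# data.frame keeps var3 numeric; cbind would coerce every column to character
DB <- data.frame(var1, var2, var3)
head(DB)
mymean <- function(x, db = DB) {
  # x is the name of the grouping column ("var1" or "var2")
  if (!x %in% c("var1", "var2")) {
    stop("invalid rule")
  }
  groups <- unique(db[[x]])
  out <- numeric(length(groups))
  names(out) <- groups
  # Loop over the groups and average var3 within each one
  for (i in seq_along(groups)) {
    out[i] <- mean(db$var3[db[[x]] == groups[i]])
  }
  out
}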
Thanks, Alexandre
Upvotes: 0
Views: 67
Reputation: 4513
It appears that you want to obtain means by groups.
To do this, I would use the dplyr package:
library(dplyr)
db <- data.frame(var1 = sort(rep(LETTERS[1:3], 10)), var2 = rep(1:5, 6), var3 = rnorm(30))
db %>%
  group_by(var1) %>%
  summarise(mean_over_var1 = mean(var3))
  var1 mean_over_var1
1    A     0.07314416
2    B    -0.05983557
3    C    -0.03592565
db %>%
  group_by(var2) %>%
  summarise(mean_over_var2 = mean(var3))
  var2 mean_over_var2
1    1  -0.4512942044
2    2  -0.1331316802
3    3   0.0821958902
4    4  -0.0001081054
5    5   0.4646429921
From your comments, however, it appears that you don't want to use base R commands like apply and aggregate, so I assume you may not like the above solution.
If I had to do this by brute force, I would do something like this:
db <- data.frame(var1 = sort(rep(LETTERS[1:3], 10)), var2 = rep(1:5, 6), var3 = rnorm(30), stringsAsFactors = FALSE)
# Obtain the distinct values of each grouping variable
group1 <- unique(db$var1)
group2 <- unique(db$var2)
# Store the number of groups so I don't have to keep calling length()
N1 <- length(group1)
N2 <- length(group2)
# Preallocate the results; not necessary, but a good habit
res1 <- data.frame(group = group1, mean = rep(NA_real_, N1))
res2 <- data.frame(group = group2, mean = rep(NA_real_, N2))
# Loop over the groups rather than over each row of data. I like this approach
# because it relies on subsetting more than on iteration, which is generally a good idea in R.
for (i in seq_len(N1)) {
  res1[i, "mean"] <- mean(db[db$var1 %in% group1[i], "var3"])
}
for (i in seq_len(N2)) {
  res2[i, "mean"] <- mean(db[db$var2 %in% group2[i], "var3"])
}
res <- list(res1, res2)
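Since the question asks for both sets of means from a single function, the two loops could be wrapped up roughly as follows. This is only a sketch: the name group_means and its arguments group_cols and value_col are made up for illustration, and the seed is arbitrary, set only so the example is reproducible.

```r
# Hypothetical helper (not part of the original answer): means of one value
# column for several grouping columns, using only loops and subsetting.
group_means <- function(db, group_cols, value_col = "var3") {
  out <- vector("list", length(group_cols))
  names(out) <- group_cols
  for (g in group_cols) {
    groups <- unique(db[[g]])
    # One row per distinct group value; the mean is filled in below
    res <- data.frame(group = groups, mean = NA_real_)
    for (i in seq_along(groups)) {
      res$mean[i] <- mean(db[[value_col]][db[[g]] == groups[i]])
    }
    out[[g]] <- res
  }
  out
}

set.seed(1)  # arbitrary seed, only for reproducibility
db <- data.frame(var1 = sort(rep(LETTERS[1:3], 10)),
                 var2 = rep(1:5, 6),
                 var3 = rnorm(30),
                 stringsAsFactors = FALSE)
res <- group_means(db, c("var1", "var2"))
res$var1  # means of var3 for A, B and C
res$var2  # means of var3 for 1 to 5
```

The result is a named list with one data frame per grouping column, so each set of means can be pulled out by name instead of by position.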
Upvotes: 1