Reputation: 111
I have a data frame df with 50 columns (X1, X2, X3, etc.); each column contains the estimates for a different question.
A small version of it:
   X1 X2 X3 X4
1  10 15 17 12
2   5  2  5  5
3  15 20 10 16
4  10 15 15 20
I want to take log10() of each estimate, subtract the median of the logs for that question, and average the result.
E.g. for question 1: sum(log10(X1) - median1)/4, where median1 is the median of log10(X1).
I used the following syntax to loop over the columns:
s <- vector("integer", ncol(df))
for(i in seq_along(df)) {
  s[[i]] <- sum(log10(df[[i]]) - median(log10(df[[i]])/4))
}
But if I check the result, e.g. for X1, I get a different value:
s1 <- df %>%
  mutate(log_e = log10(X1)) %>%
  summarize(s = sum(log_e - median(log_e))/4)
The first command gives 2.88 and the second gives -0.0312.
Where is the mistake?
Upvotes: 1
Views: 33
Reputation: 8880
Your parentheses are misplaced: the /4 sits inside median(), so you take the median of log10(x)/4 instead of dividing the sum by 4.
Using apply():
apply(df, 2, function(x) mean(log10(x) - median(log10(x))))
Or your loop, with the division moved outside sum():
s <- vector("numeric", ncol(df))
for (i in seq_along(df)) {
  s[[i]] <- sum(log10(df[[i]]) - median(log10(df[[i]]))) / 4
}
Or with mean(), which does not hard-code the number of rows:
s <- vector("numeric", ncol(df))
for (i in seq_along(df)) {
  s[[i]] <- mean(log10(df[[i]]) - median(log10(df[[i]])))
}
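To check where the two numbers in the question come from, here is a minimal sketch in base R. It assumes the four-row sample shown in the question is the whole data frame; the names x, wrong and right are only for illustration, and the vapply() call is one possible alternative to the loop, not the only way to write it.

```r
# Toy data copied from the question (4 rows, 4 questions)
df <- data.frame(
  X1 = c(10, 5, 15, 10),
  X2 = c(15, 2, 20, 15),
  X3 = c(17, 5, 10, 15),
  X4 = c(12, 5, 16, 20)
)

x <- log10(df$X1)

# Original loop body: /4 is inside median(), so the median is taken
# over log10(X1) / 4 and the sum is never divided by 4
wrong <- sum(x - median(x / 4))

# Intended calculation: subtract the median of the logs, then average
right <- sum(x - median(x)) / 4

round(wrong, 2)  # 2.88
round(right, 4)  # -0.0312

# Same calculation for every column; returns a named numeric vector
s <- vapply(df, function(col) mean(log10(col) - median(log10(col))), numeric(1))
```

With this sample, s[["X1"]] matches the dplyr result from the question, which suggests the only difference between the two commands was the position of the closing parenthesis.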
Upvotes: 2