Julia

Reputation: 111

Why do I get different results for two functions with the same logic behind them? Where is the mistake?

I have a df with 50 columns (X1, X2, X3, etc.), each of which contains estimates for a different question.

A small version of it:

    X1  X2  X3  X4
1   10  15  17  12
2   5   2   5   5
3   15  20  10  16
4   10  15  15  20

I want to calculate the log10() of each estimate and then subtract the median of the logs for each question.

E.g. for question 1: sum(log10(X1) - median1) / 4, where median1 is the median of log10(X1).

I used the following syntax to loop over the columns:

s <- vector("integer", ncol(df))
for(i in seq_along(df)) {
  s[[i]] <- sum(log10(df[[i]]) - median(log10(df[[i]])/4))
}

But if I check the result, e.g. for X1, I get a different value:

s1 <- df %>% mutate(log_e = log10(X1)) %>%
    summarize(s = sum(log_e - median(log_e))/4)

With the first command the result is 2.88, and with the second it's -0.0312.

Where is the mistake?

Upvotes: 1

Views: 33

Answers (1)

Yuriy Saraykin

Reputation: 8880

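You did not have enough brackets. In your loop the /4 sits inside median(), so you subtract median(log10(x)/4) from every value and never divide the sum by 4. Close median() before dividing, or use mean():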

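# Column-wise, without an explicit loop
apply(df, 2, function(x) mean(log10(x) - median(log10(x))))

# Your loop, with median() closed before the division by 4
s <- vector("numeric", ncol(df))
for(i in seq_along(df)){
  s[[i]] <- (sum(log10(df[[i]]) - median(log10(df[[i]])))/ 4)
}

# The same loop, using mean() instead of a hard-coded 4
s <- vector("numeric", ncol(df))
for(i in seq_along(df)){
  s[[i]] <- mean(log10(df[[i]]) - median(log10(df[[i]])))
}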
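
To see where the two numbers come from, here is a small check on the four-row sample from the question. It is only a sketch: the names x, wrong and right are mine, and the vapply() line is one possible alternative to the loop, not something the question requires.

```r
# Sample data copied from the question
df <- data.frame(
  X1 = c(10, 5, 15, 10),
  X2 = c(15, 2, 20, 15),
  X3 = c(17, 5, 10, 15),
  X4 = c(12, 5, 16, 20)
)

x <- log10(df$X1)

# Original loop body: /4 is applied inside median(), and the sum is never divided
wrong <- sum(x - median(x / 4))

# Intended calculation: subtract the median, sum, then divide by the number of rows
right <- sum(x - median(x)) / nrow(df)

round(wrong, 2)  # 2.88
round(right, 4)  # -0.0312

# All columns at once; vapply() returns a named numeric vector
s <- vapply(df, function(col) mean(log10(col) - median(log10(col))), numeric(1))
```

Dividing by nrow(df), or using mean(), also avoids hard-coding the 4 if the real data has a different number of rows.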

Upvotes: 2
