Marcin Sajdak

Reputation: 55

t-test in R fails with an error

When I try to run a t-test, the console returns: Error in var(x) : Calling var(x) on a factor x is defunct. Use something like 'all(duplicated(x)[-1L])' to test for a constant vector. In addition: Warning message: In mean.default(x) : argument is not numeric or logical: returning NA

What does it mean?

Upvotes: 0

Views: 199

Answers (1)

Ian Campbell

Reputation: 24848

In R, there is a type of data called factor.

Consider the following two sets of data:

set1 <- round(rnorm(10,5,2))
set1
 [1] 6 3 4 5 7 3 5 7 5 7
set2 <- round(rnorm(10,10,2))
set2
 [1] 11  9  5 11 11 10  9  7  8  9

You can perform a t-test as follows:

t.test(set1,set2)
    Welch Two Sample t-test
data:  set1 and set2
t = -4.8347, df = 17.147, p-value = 0.0001515

Now see what happens if we convert both sets to factors:

set1 <- as.factor(set1)
set2 <- as.factor(set2)
set1
[1] 6 3 4 5 7 3 5 7 5 7
Levels: 3 4 5 6 7

You can see that set1 still prints the same values, but it is now a factor with a set of levels.

levels(set1)
[1] "3" "4" "5" "6" "7"

Factors can save a lot of space when long character values repeat many times, and they can help clarify meaning in statistical analysis.

However, when you try to convert between factors and numeric representations, potentially surprising things can happen:

as.integer(set1)
 [1] 4 1 2 3 5 1 3 5 3 5

In this case, we got the integer index of each element's level, not the original value.
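To recover the original values instead of the level indices, one common approach is to go through the character labels first. The sketch below uses a hypothetical factor f (not the random data above) so that it is reproducible:

```r
# Hypothetical factor for illustration (not the random data above)
f <- factor(c(6, 3, 4, 5, 7, 3, 5, 7, 5, 7))

# as.integer() returns the level indices, not the values
codes <- as.integer(f)

# Convert the labels to character first, then to numeric
values <- as.numeric(as.character(f))

# Equivalent alternative that converts each level only once
values2 <- as.numeric(levels(f))[f]
```

Both values and values2 should hold the original numbers, while codes holds only positions within levels(f).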

Because of this risk of silently wrong results, t.test() refuses to work on factors:

t.test(set1,set2)
Error in var(x) : Calling var(x) on a factor x is defunct.
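A possible fix, assuming your data really are numbers that ended up stored as factors, is to convert them back before calling t.test(). The names x and y below are hypothetical stand-ins for your own variables:

```r
# Hypothetical data stored as factors, standing in for your own variables
x <- factor(c(6, 3, 4, 5, 7, 3, 5, 7, 5, 7))
y <- factor(c(11, 9, 5, 11, 11, 10, 9, 7, 8, 9))

# Confirm the type before testing
is.factor(x)

# Convert via the character labels so the original values are recovered
x_num <- as.numeric(as.character(x))
y_num <- as.numeric(as.character(y))

# The t-test now runs because both inputs are numeric
result <- t.test(x_num, y_num)
result$p.value
```

If as.numeric() produces NA values with a warning, some labels are not valid numbers, and it is worth inspecting levels(x) to find them.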

Upvotes: 2
