Reputation: 3491
I have an integer vector that I expected I could treat as a numeric vector:
> class(pf$age)
[1] "integer"
> is.numeric(pf$age)
[1] TRUE
However, when I try to use it to calculate a correlation, I get an error:
> cor.test(x = "age", y = "friend_count", data = pf)
Error in cor.test.default(x = "age", y = "friend_count", data = pf) :
'x' must be a numeric vector
None of my best guesses at alternate syntax work either: http://pastie.org/9595290
What's going on?
Edit:
The following syntax works:
> x = pf$age
> y = pf$friend_count
> cor.test(x, y, data = pf, method="pearson", alternative="greater")
However, I don't understand why I can't specify x and y in the function (as you can with other R functions like ggplot). What is the difference between ggplot and cor.test?
Upvotes: 1
Views: 1476
Reputation: 24535
You can use get() with strings to look up variables by name. First make the columns available as variables:
age = pf$age
friend_count = pf$friend_count
or:
attach(pf)
Then the following should work:
cor.test(x = get("age"), y = get("friend_count"))
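If the column names really do arrive as strings, one alternative to attach() and get() is to index the data frame with [[ ]]. This is only a sketch: the pf below is a small made-up stand-in for the asker's data, and x_name and y_name are hypothetical helper variables.

```r
# Hypothetical stand-in for the asker's data frame (values are made up)
pf <- data.frame(
  age          = c(13L, 21L, 34L, 45L, 52L, 60L),
  friend_count = c(120L, 340L, 90L, 60L, 75L, 20L)
)

# Column names held as strings
x_name <- "age"
y_name <- "friend_count"

# [[ ]] extracts the column named by a string, so cor.test()
# receives numeric vectors rather than the strings themselves
res <- cor.test(x = pf[[x_name]], y = pf[[y_name]])
res$estimate
```

This keeps the lookup tied to pf and does not put extra variables on the search path.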
Upvotes: 0
Reputation: 174813
You don't refer to variables using character strings like that in a function call. The x and y arguments expect numeric vectors, but you passed length-1 character vectors:
> is.numeric("age")
[1] FALSE
> is.character("age")
[1] TRUE
Hence you were asking cor.test() to compute the correlation between the strings "age" and "friend_count".
You also mixed up the formula method of cor.test() with the default one. You either supply a formula and a data object, or you supply arguments x and y. You can't mix and match.
Two solutions are:
with(pf, cor.test(x = age, y = friend_count))
cor.test( ~ age + friend_count, data = pf)
The first uses the default method, but we allow ourselves to refer to the variables in pf directly by using with(). The second uses the formula method.
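As a quick check that the two call styles are equivalent, here is a minimal sketch on a made-up data frame (the values in pf are invented for illustration and are not the asker's data):

```r
# Hypothetical stand-in for the asker's data frame (values are made up)
pf <- data.frame(
  age          = c(13L, 21L, 34L, 45L, 52L, 60L),
  friend_count = c(120L, 340L, 90L, 60L, 75L, 20L)
)

# Default method: x and y are numeric vectors, found inside pf via with()
res_default <- with(pf, cor.test(x = age, y = friend_count))

# Formula method: variables are named in a one-sided formula and
# looked up in the data argument
res_formula <- cor.test(~ age + friend_count, data = pf)

# Both routes should report the same correlation estimate
all.equal(unname(res_default$estimate), unname(res_formula$estimate))
```

Only the way the variables are located differs between the two calls; the test itself is the same.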
As to your question in the title: yes, integer vectors are considered numeric in R:
> int <- c(1L, 2L)
> is.integer(int)
[1] TRUE
> is.numeric(int)
[1] TRUE
Do note @Joshua Ulrich's point in the comment below. Technically, integers are slightly different from numerics in R, as Joshua shows. However, this difference need not concern users most of the time, as R converts between them as needed. It does matter in some places, such as .C() calls.
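To make the integer/double distinction concrete, here is a small sketch using only base R. It is an illustration of the storage-type difference, not a reproduction of the comment referred to above; the variable names are arbitrary.

```r
int <- c(1L, 2L)   # stored as integer
dbl <- c(1, 2)     # stored as double

typeof(int)          # "integer"
typeof(dbl)          # "double"

is.numeric(int)      # TRUE
is.numeric(dbl)      # TRUE

int == dbl           # TRUE TRUE: the values compare equal
identical(int, dbl)  # FALSE: the storage types differ

typeof(int + 0.5)    # "double": integers are promoted in arithmetic

# Explicit conversion, for interfaces that need a specific storage type
int_as_dbl <- as.double(int)
typeof(int_as_dbl)   # "double"
```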
Upvotes: 2